Multi-task Pairwise Neural Ranking for Hashtag Segmentation

Maddela, Mounica; Xu, Wei; Preoţiuc-Pietro, Daniel

arXiv:1906.00790 (cs)

[Submitted on 3 Jun 2019 (v1), last revised 13 Jun 2019 (this version, v2)]

Abstract: Hashtags are often employed on social media and beyond to add metadata to a textual utterance with the goal of increasing discoverability, aiding search, or providing additional semantics. However, the semantic content of hashtags is not straightforward to infer as these represent ad-hoc conventions which frequently include multiple words joined together and can include abbreviations and unorthodox spellings. We build a dataset of 12,594 hashtags split into individual segments and propose a set of approaches for hashtag segmentation by framing it as a pairwise ranking problem between candidate segmentations. Our novel neural approaches demonstrate 24.6% error reduction in hashtag segmentation accuracy compared to the current state-of-the-art method. Finally, we demonstrate that a deeper understanding of hashtag semantics obtained through segmentation is useful for downstream applications such as sentiment analysis, for which we achieved a 2.6% increase in average recall on the SemEval 2017 sentiment analysis dataset.
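
To make the pairwise ranking framing in the abstract concrete, the following is a minimal Python sketch and not the authors' model. It enumerates candidate segmentations of a hashtag, compares every pair of candidates, and returns the candidate that wins the most comparisons. The vocabulary `TOY_VOCAB`, the scoring rule `toy_score`, and the function names are all assumptions made up for illustration; the paper uses neural rankers, which this toy comparator only stands in for.

```python
from itertools import combinations

# Hypothetical toy vocabulary (an assumption for this sketch, not from the paper).
TOY_VOCAB = {"now", "no", "play", "playing", "ing"}


def candidate_segmentations(text, max_segments=3):
    """Enumerate every split of `text` into 1..max_segments contiguous pieces."""
    n = len(text)
    candidates = []
    for num_cuts in range(max_segments):
        for cuts in combinations(range(1, n), num_cuts):
            bounds = (0, *cuts, n)
            candidates.append(
                tuple(text[a:b] for a, b in zip(bounds, bounds[1:]))
            )
    return candidates


def toy_score(segments):
    """Reward known words and penalize unknown ones, weighted by squared length."""
    return sum(
        len(s) ** 2 if s in TOY_VOCAB else -(len(s) ** 2) for s in segments
    )


def prefers(a, b):
    """Toy pairwise comparator: True if candidate `a` should rank above `b`."""
    return toy_score(a) > toy_score(b)


def segment(hashtag):
    """Pick the candidate that wins the most pairwise comparisons."""
    text = hashtag.lstrip("#").lower()
    candidates = candidate_segmentations(text)
    wins = {c: 0 for c in candidates}
    for a, b in combinations(candidates, 2):
        if prefers(a, b):
            wins[a] += 1
        elif prefers(b, a):
            wins[b] += 1
    return max(candidates, key=wins.get)


print(segment("#NowPlaying"))
```

Because the toy comparator is derived from a single per-candidate score, the round-robin here reduces to picking the highest-scoring candidate. A learned pairwise model need not have that property, which is why the sketch keeps the comparison step explicit.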

Comments:	12 pages, ACL 2019
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:1906.00790 [cs.CL]
	(or arXiv:1906.00790v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.1906.00790

Submission history

From: Mounica Maddela
[v1] Mon, 3 Jun 2019 13:28:33 UTC (409 KB)
[v2] Thu, 13 Jun 2019 20:07:28 UTC (421 KB)
