skweak: Weak Supervision Made Easy for NLP

Lison, Pierre; Barnes, Jeremy; Hubin, Aliaksandr

doi:10.18653/v1/2021.acl-demo.40

Computer Science > Computation and Language

arXiv:2104.09683 (cs)

[Submitted on 19 Apr 2021]

Abstract: We present skweak, a versatile, Python-based software toolkit enabling NLP developers to apply weak supervision to a wide range of NLP tasks. Weak supervision is an emerging machine learning paradigm based on a simple idea: instead of labelling data points by hand, we use labelling functions derived from domain knowledge to automatically obtain annotations for a given dataset. The resulting labels are then aggregated with a generative model that estimates the accuracy (and possible confusions) of each labelling function. The skweak toolkit makes it easy to implement a large spectrum of labelling functions (such as heuristics, gazetteers, neural models or linguistic constraints) on text data, apply them to a corpus, and aggregate their results in a fully unsupervised fashion. skweak is specifically designed to facilitate the use of weak supervision for NLP tasks such as text classification and sequence labelling. We illustrate the use of skweak for named entity recognition (NER) and sentiment analysis. skweak is released under an open-source license and is publicly available.
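The weak supervision workflow described in the abstract (write labelling functions, apply them to a corpus, aggregate their votes) can be sketched in plain Python. This is a minimal illustration under stated assumptions and does not use the skweak API: the function names, word lists and example sentences are hypothetical, and the aggregation step is a simple majority vote standing in for the generative model, which would additionally estimate how accurate each labelling function is.

```python
from collections import Counter
from typing import Callable, List, Optional

# A labelling function maps a text to a label, or returns None to abstain.
LabellingFunction = Callable[[str], Optional[str]]

# Hypothetical sentiment lexicons, chosen only for this illustration.
POSITIVE_WORDS = {"great", "excellent", "love"}
NEGATIVE_WORDS = {"terrible", "awful", "hate"}


def lf_positive_lexicon(text: str) -> Optional[str]:
    """Gazetteer-style function: vote POS if a positive word occurs."""
    tokens = set(text.lower().split())
    return "POS" if tokens & POSITIVE_WORDS else None


def lf_negative_lexicon(text: str) -> Optional[str]:
    """Gazetteer-style function: vote NEG if a negative word occurs."""
    tokens = set(text.lower().split())
    return "NEG" if tokens & NEGATIVE_WORDS else None


def lf_negated_good(text: str) -> Optional[str]:
    """Heuristic function: vote NEG if the phrase 'not good' occurs."""
    return "NEG" if "not good" in text.lower() else None


def apply_lfs(lfs: List[LabellingFunction],
              corpus: List[str]) -> List[List[Optional[str]]]:
    """Apply every labelling function to every text (one row per text)."""
    return [[lf(text) for lf in lfs] for text in corpus]


def majority_vote(votes: List[Optional[str]]) -> Optional[str]:
    """Aggregate one row of votes; abstain when there are no votes or a tie.

    Stand-in for a generative aggregation model: every labelling function
    is weighted equally here, whereas a generative model would learn
    per-function accuracies from the unlabelled data.
    """
    counts = Counter(v for v in votes if v is not None)
    if not counts:
        return None
    ranked = counts.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


corpus = [
    "I love this phone and it is great",
    "The battery is terrible",
    "The screen is not good",
    "It arrived on Tuesday",
    "I love the design but hate the price",
]
lfs = [lf_positive_lexicon, lf_negative_lexicon, lf_negated_good]

votes = apply_lfs(lfs, corpus)
aggregated = [majority_vote(row) for row in votes]

for text, label in zip(corpus, aggregated):
    print(f"{label!s:>4}  {text}")
```

In this sketch the fourth sentence receives no label because every function abstains, and the fifth receives none because one POS vote and one NEG vote tie. Cases like these are where a model that weights labelling functions by estimated accuracy can do better than equal-weight voting.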

Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2104.09683 [cs.CL]
	(or arXiv:2104.09683v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2104.09683
Related DOI:	https://doi.org/10.18653/v1/2021.acl-demo.40

Submission history

From: Pierre Lison
[v1] Mon, 19 Apr 2021 23:26:51 UTC (5,327 KB)
