LabelUX! Guidelines to support software engineers to design data labeling systems

Authors:

Leticia Carvalho Passos,

Lucas Viana,

Edson Oliveira,

Tayana Conte

SBQS '21: Proceedings of the XX Brazilian Symposium on Software Quality

Article No.: 9, Pages 1 - 10

https://doi.org/10.1145/3493244.3493252

Published: 14 December 2021

Abstract

The demand for systems that use artificial intelligence has grown substantially in recent years, especially for those based on Machine Learning (ML) techniques. Systems that use supervised ML techniques need representative and correctly categorized data to ensure their quality. In this context, the data labeling step plays a fundamental role in the development of such systems. Labeling is performed by users who are specialists in the data domain, and it aims to produce a dataset for training a supervised ML model. However, labeling is exhausting for users, which can compromise the quality of the ML system, especially when the labeling is done in systems that were not designed to assist the user in this activity. At the same time, designing such systems can be difficult for a software engineer: depending on the type of data to be labeled, the interface needs different graphical elements and strategies to present the data and request user feedback. To help software engineers develop these kinds of systems, this work proposes the LabelUX guidelines. These guidelines support software engineers in designing data labeling systems with a quality design that provides a better user experience during the labeling task. We developed the guidelines from studies of the literature and of industry practice. We selected software engineers working on ML projects to participate in a feasibility study evaluating the use of the guidelines. The qualitative results obtained through the interviews indicated that the LabelUX guidelines supported a better design of systems for labeling textual data.
