MLinter: Learning Coding Practices from Examples-Dream or Reality?

Latappy, Corentin; Perez, Quentin; Degueule, Thomas; Falleri, Jean-Rémy; Urtado, Christelle; Vauttier, Sylvain; Blanc, Xavier; Teyton, Cédric

arXiv:2301.10082 (cs)

[Submitted on 24 Jan 2023]

Title: MLinter: Learning Coding Practices from Examples-Dream or Reality?

Authors: Corentin Latappy, Quentin Perez (Euromov DHM), Thomas Degueule, Jean-Rémy Falleri (IUF), Christelle Urtado (Euromov DHM), Sylvain Vauttier (Euromov DHM), Xavier Blanc, Cédric Teyton

Abstract: Coding practices are increasingly used by software companies. Their use promotes consistency, readability, and maintainability, which contribute to software quality. Coding practices were initially enforced by general-purpose linters, but companies now tend to design and adopt their own company-specific practices. However, these company-specific practices are often not automated, making it challenging to ensure they are shared and used by developers. Converting these practices into linter rules is a complex task that requires extensive static analysis and language engineering expertise. In this paper, we seek to answer the following question: can coding practices be learned automatically from examples manually tagged by developers? We conduct a feasibility study using CodeBERT, a state-of-the-art machine learning approach, to learn linter rules. Our results show that, although the resulting classifiers reach high precision and recall scores when evaluated on balanced synthetic datasets, their application on real-world, unbalanced codebases, while maintaining excellent recall, suffers from a severe drop in precision that hinders their usability.
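
The gap between balanced and unbalanced evaluation described in the abstract can be illustrated with a little arithmetic. The sketch below is not taken from the paper: the recall, false-positive rate, and violation rates are hypothetical values chosen only to show how precision falls as violations become rare, even when the classifier itself is unchanged.

```python
def precision_at_prevalence(recall: float, false_positive_rate: float,
                            prevalence: float) -> float:
    """Expected precision of a binary classifier when a fraction
    `prevalence` of the inspected code samples truly violate the practice."""
    true_positives = recall * prevalence
    false_positives = false_positive_rate * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


# Hypothetical classifier (illustrative numbers, not results from the paper):
# it flags 95% of real violations and wrongly flags 5% of compliant code.
RECALL = 0.95
FALSE_POSITIVE_RATE = 0.05

# Balanced synthetic dataset: half of the samples are violations.
balanced = precision_at_prevalence(RECALL, FALSE_POSITIVE_RATE, 0.5)

# Unbalanced codebase: assume only 1% of the samples are violations.
unbalanced = precision_at_prevalence(RECALL, FALSE_POSITIVE_RATE, 0.01)

print(f"precision at 50% violations: {balanced:.2f}")
print(f"precision at  1% violations: {unbalanced:.2f}")
```

Under these assumed rates, recall stays at 95% in both settings, but most of the warnings raised on the unbalanced codebase are false alarms, which is the kind of usability problem the abstract points to.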

Subjects:	Software Engineering (cs.SE)
Cite as:	arXiv:2301.10082 [cs.SE]
	(or arXiv:2301.10082v1 [cs.SE] for this version)
	https://doi.org/10.48550/arXiv.2301.10082
Journal reference:	30th IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), Mar 2023, Macao SAR, Macau SAR China

Submission history

From: Thomas Degueule [via CCSD proxy]
[v1] Tue, 24 Jan 2023 15:40:24 UTC (457 KB)
