Detecting API documentation errors

H Zhong, Z Su - Proceedings of the 2013 ACM SIGPLAN international …, 2013 - dl.acm.org
When programmers encounter an unfamiliar API library, they often need to refer to its documentation, tutorials, or discussions on development forums to learn its proper usage. These API documents contain valuable information, but they may also mislead programmers because they can contain errors (e.g., broken code names and obsolete code samples). Although most API documents are actively maintained and updated, studies show that many new and latent errors exist. Finding such errors manually is tedious and error-prone, as API documents can be enormous, running to thousands of pages. Existing tools are ineffective at locating documentation errors because traditional natural language (NL) tools do not understand code names and code samples, and traditional code analysis tools do not understand NL sentences. In this paper, we propose DOCREF, the first approach specifically designed and developed to detect API documentation errors. We formulate a class of inconsistencies that indicate potential documentation errors, and combine NL and code analysis techniques to detect and report such inconsistencies. We have implemented DOCREF and evaluated its effectiveness on the latest documentation of five widely used API libraries. DOCREF has detected more than 1,000 new documentation errors, which we have reported to the authors. Many of these errors have already been confirmed and fixed.
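
To make the notion of a "broken code name" concrete, the following is a minimal sketch of one kind of inconsistency check: it pulls code-like names out of a documentation sentence and flags those that the library does not declare. This is not DOCREF's implementation. The regular expression, the function `find_broken_code_names`, and the `KNOWN_API_NAMES` set are all assumptions made for illustration, and a real tool would derive the declared names from the library itself rather than from a hand-written set.

```python
import re

# Hypothetical set of names the library declares (an assumption for this sketch;
# a real tool would extract these from the library's code).
KNOWN_API_NAMES = {
    "ArrayList", "ArrayList.add", "ArrayList.size",
    "HashMap", "HashMap.put",
}

# Heuristic (an assumption): a code name is a CamelCase identifier with at
# least two capitalized parts, optionally followed by ".member".
CODE_NAME = re.compile(
    r"\b[A-Z][a-z0-9]+(?:[A-Z][a-z0-9]+)+(?:\.[a-z][A-Za-z0-9]*)?"
)


def find_broken_code_names(doc_text, known_names):
    """Return code-like names in doc_text that are absent from known_names.

    Names are reported once each, in order of first appearance.
    """
    broken = []
    for name in CODE_NAME.findall(doc_text):
        if name not in known_names and name not in broken:
            broken.append(name)
    return broken


doc = (
    "Use ArrayList.add to append an element, then call "
    "ArrayList.length to count them. See also HashMap."
)
print(find_broken_code_names(doc, KNOWN_API_NAMES))
```

In this example the sentence mentions `ArrayList.length`, which is not in the declared set, so it is the only name reported. A purely lexical heuristic like this one would miss names that are not CamelCase and cannot check code samples, which is part of why the abstract argues for combining NL and code analysis.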