skip to main content
article
Free access

An empirical study of static call graph extractors

Published: 01 April 1998 Publication History

Abstract

Informally, a call graph represents calls between entities in a given program. The call graphs that compilers compute to determine the applicability of an optimization must typically be conservative: a call may be omitted only if it can never occur in any execution of the program. Numerous software engineering tools also extract call graphs with the expectation that they will help software engineers increase their understanding of a program. The requirements placed on software engineering tools that compute call graphs are typically more relaxed than for compilers. For example, some false negatives—calls that can in fact take place in some execution of the program, but which are omitted from the call graph—may be acceptable, depending on the understanding task at hand. In this article, we empirically show a consequence of this spectrum of requirements by comparing the C call graphs extracted from three software systems (mapmaker, mosaic, and gcc) by nine tools (cflow, cawk, CIA, Field, GCT, Imagix, LSME, Mawk, and Rigiparse). A quantitative analysis of the call graphs extracted for each system shows considerable variation, a result that is counterintuitive to many experienced software engineers. A qualitative analysis of these results reveals a number of reasons for this variation: differing treatments of macros, function pointers, input formats, etc. The fundamental problem is not that variances among the graphs extracted by different tools exist, but that software engineers have little sense of the dimensions of approximation in any particular call graph. In this article, we describe and discuss the study, sketch a design space for static call graph extractors, and discuss the impact of our study on practitioners, tool developers, and researchers. Although this article considers only one kind of information, call graphs, many of the observations also apply to static extractors of other kinds of information, such as inheritance structures, file dependences, and references to global variables.

References

[1]
AHO, A. V., KERNIGHAN, B. W., AND WEINBERGER, P.J. 1979. Awk: A pattern scanning and processing language. Softw. Pract. Exper. 9, 4, 267-280.
[2]
ALLEN, F. 1974. Interprocedural data flow anlaysis. In Proceedings of Information Processing 74 (Software). North-Holland Publishing Co., Amsterdam, The Netherlands, 398-402.
[3]
BANNING, J. P. 1979. An efficient way to find the side effects of procedure calls and the aliases of variables. In Conference Record of the 6th ACM Symposium on Principles of Programming Languages. ACM, New York, NY, 29-41.
[4]
BARTH, J. M. 1978. A practical interprocedural data flow analysis algorithm. Commun. ACM 21, 9 (Sept.), 29-41.
[5]
CALLAHAN, D., eARLE, A., HALL, M. W., AND KENNEDY, K. 1990. Constructing the procedure call multigraph. IEEE Trans. Softw. Eng. 16, 4 (Apr.), 483-487.
[6]
CHEN, Y.-F., NISHIMOTO, M. Y., AND RAMAMOORTHY, C.V. 1990. The C information abstraction system. IEEE Trans. Softw. Eng. 16, 3 (Mar.), 325-334.
[7]
COOPER, K. AND KENNEDY, K. 1984. Efficient computation of flow insensitive interprocedural summary information. In Proceedings of the ACM SIGPLAN '84 Symposium on Compiler Construction. ACM Press, New York, NY, 247-258.
[8]
COOPER, K. D., KENNEDY, K., AND TORCZON, L. 1986. Interprocedural optimization: Eliminating unnecessary recompilation. SIGPLAN Not. 21, 7 (July), 58-67.
[9]
FELDMAN, S. I. 1978. Make: A program for maintaining computer programs. Tech. Rep. 57. AT&T Bell Laboratories, Inc., Murray Hill, NJ.
[10]
GRAHAM, S. L., KESSLER, P. B., AND McKuSICK, M.K. 1982. Gprof: A call graph execution profiler. In Proceedings of the SIGPLAN '82 Symposium on Compiler Construction. ACM, New York, NY, 120-126.
[11]
GRISWOLD, W. G., ATKINSON, D., AND MCCURDY, C. 1996. Fast, flexible syntactic pattern matching and processing. In Proceedings of the IEEE 1996 4th Workshop on Program Comprehension (WPC '96) (Berlin, Germany). IEEE Press, Piscataway, NJ, 144-153.
[12]
HALL, M. W. AND KENNEDY, K. 1992. Efficient call graph analysis. ACM Lett. Program. Lang. Syst. 1, 3 (Sept.), 227-242.
[13]
HATTON, L. AND ROBERTS, A. 1994. How accurate is scientific software?. IEEE Trans. Softw. Eng. 20, 10 (Oct.), 785-797.
[14]
JOHNSON, S. C. 1975. Yacc: Yet another compiler compiler. Tech. Rep. 32. AT&T Bell Laboratories, Inc., Murray Hill, NJ.
[15]
LAKHOTIA, A. 1993. Constructing call multigraphs using dependence graphs. In Conference Record of the 20th ACM Symposium on Principles of Programming Languages (Charleston, SC, Jan. 10-13, 1993). ACM Press, New York, NY, 273-284.
[16]
LESK, M. 1975. Lex--A lexical analyzer generator. Tech. Rep. 39. AT&T Bell Laboratories, Inc., Murray Hill, NJ.
[17]
MARICK, B. 1994. Craft of Software Testing. Prentice-Hall, Inc., Upper Saddle River, NJ.
[18]
MOLLER, H. A. AND KLASHINSKY, K. 1988. Rigi--A system for programming-in-the-large. In Proceedings of the lOth International Conference on Software Engineering (Singapore, April 11-15, 1988). IEEE Computer Society Press, Los Alamitos, CA, 80-86.
[19]
MURPHY, G. C. AND NOTKIN, D. 1996. Lightweight lexical source model extraction. ACM Trans. Softw. Eng. Methodol. 5, 3 (July), 262-292.
[20]
MURPHY, G., NOTKIN, D., AND LAN, E.-C. 1996. An empirical study of static call graph extractors. In Proceedings of the 18th International Conference on Software Engineering (Berlin, Germany, Mar. 25-29). IEEE Computer Society Press, Los Alamitos, CA, 90-99.
[21]
MYERS, E. 1981. A precise inter-procedural data flow algorithm. In Conference Record of the 8th ACM Symposium on Principles of Programming Languages. ACM, New York, NY, 219-230.
[22]
REISS, S. 1995. The Field Programming Environment: A Friendly Integrated Environment for Learning and Development. Kluwer Academic Publishers, Hingham, MA.
[23]
RYDER, B. G. 1979. Constructing the call graph of a program. IEEE Trans. Softw. Eng. SE-5, 3 (May), 216-226.

Cited By

View all

Index Terms

  1. An empirical study of static call graph extractors

    Recommendations

    Comments

    Information & Contributors

    Information

    Published In

    cover image ACM Transactions on Software Engineering and Methodology
    ACM Transactions on Software Engineering and Methodology  Volume 7, Issue 2
    April 1998
    106 pages
    ISSN:1049-331X
    EISSN:1557-7392
    DOI:10.1145/279310
    Issue’s Table of Contents

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 01 April 1998
    Published in TOSEM Volume 7, Issue 2

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. call graphs
    2. design space
    3. empirical study
    4. software system analysis
    5. static analysis

    Qualifiers

    • Article

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)155
    • Downloads (Last 6 weeks)23
    Reflects downloads up to 22 Dec 2024

    Other Metrics

    Citations

    Cited By

    View all

    View Options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Login options

    Full Access

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media