In compiler theory, dead-code elimination (DCE, dead-code removal, dead-code stripping, or dead-code strip) is a compiler optimization to remove dead code (code that does not affect the program results). Removing such code has several benefits: it shrinks program size, an important consideration in some contexts, it reduces resource usage such as the number of bytes to be transferred [1] and it allows the running program to avoid executing irrelevant operations, which reduces its running time. It can also enable further optimizations by simplifying program structure. Dead code includes code that can never be executed (unreachable code), and code that only affects dead variables (written to, but never read again), that is, irrelevant to the program.
Consider the following example written in C.
intfoo(void){inta=24;intb=25;/* Assignment to dead variable */intc;c=a*4;returnc;b=24;/* Unreachable code */return0;}
Simple analysis of the uses of values would show that the value of b
after the first assignment is not used inside foo
. Furthermore, b
is declared as a local variable inside foo
, so its value cannot be used outside foo
. Thus, the variable b
is dead and an optimizer can reclaim its storage space and eliminate its initialization.
Furthermore, because the first return statement is executed unconditionally and there is no label after it which a "goto" could reach, no feasible execution path reaches the second assignment to b
. Thus, the assignment is unreachable and can be removed. If the procedure had a more complex control flow, such as a label after the return statement and a goto
elsewhere in the procedure, then a feasible execution path might exist to the assignment to b
.
Also, even though some calculations are performed in the function, their values are not stored in locations accessible outside the scope of this function. Furthermore, given the function returns a static value (96), it may be simplified to the value it returns (this simplification is called constant folding).
Most advanced compilers have options to activate dead-code elimination, sometimes at varying levels. A lower level might only remove instructions that cannot be executed. A higher level might also not reserve space for unused variables. A yet higher level might determine instructions or functions that serve no purpose and eliminate them.
A common use of dead-code elimination is as an alternative to optional code inclusion via a preprocessor. Consider the following code.
intmain(void){inta=5;intb=6;intc;c=a*(b/2);if(0){/* DEBUG */printf("%d\n",c);}returnc;}
Because the expression 0 will always evaluate to false, the code inside the if statement can never be executed, and dead-code elimination would remove it entirely from the optimized program. This technique is common in debugging to optionally activate blocks of code; using an optimizer with dead-code elimination eliminates the need for using a preprocessor to perform the same task.
In practice, much of the dead code that an optimizer finds is created by other transformations in the optimizer. For example, the classic techniques for operator strength reduction insert new computations into the code and render the older, more expensive computations dead. [2] Subsequent dead-code elimination removes those calculations and completes the effect (without complicating the strength-reduction algorithm).
Historically, dead-code elimination was performed using information derived from data-flow analysis. [3] An algorithm based on static single-assignment form (SSA) appears in the original journal article on SSA form by Ron Cytron et al. [4] Robert Shillingsburg (aka Shillner) improved on the algorithm and developed a companion algorithm for removing useless control-flow operations. [5]
Dead code is normally considered dead unconditionally. Therefore, it is reasonable attempting to remove dead code through dead-code elimination at compile time.
However, in practice it is also common for code sections to represent dead or unreachable code only under certain conditions, which may not be known at the time of compilation or assembly. Such conditions may be imposed by different runtime environments (for example different versions of an operating system, or different sets and combinations of drivers or services loaded in a particular target environment), which may require different sets of special cases in the code, but at the same time become conditionally dead code for the other cases. [6] [7] Also, the software (for example, a driver or resident service) may be configurable to include or exclude certain features depending on user preferences, rendering unused code portions useless in a particular scenario. [6] [7] While modular software may be developed to dynamically load libraries on demand only, in most cases, it is not possible to load only the relevant routines from a particular library, and even if this would be supported, a routine may still include code sections which can be considered dead code in a given scenario, but could not be ruled out at compile time, already.
The techniques used to dynamically detect demand, identify and resolve dependencies, remove such conditionally dead code, and to recombine the remaining code at load or runtime are called dynamic dead-code elimination [6] [7] [8] [9] [10] [11] [12] [13] [14] [15] [16] [17] [18] or dynamic dead-instruction elimination. [19]
Most programming languages, compilers and operating systems offer no or little more support than dynamic loading of libraries and late linking, therefore software utilizing dynamic dead-code elimination is very rare in conjunction with languages compiled ahead-of-time or written in assembly language. [8] [12] [9] However, language implementations doing just-in-time compilation may dynamically optimize for dead-code elimination. [18] [20] [21]
Although with a rather different focus, similar approaches are sometimes also utilized for dynamic software updating and hot patching.
Scroll Lock is a lock key on most IBM-compatible computer keyboards. Depending on the operating system, it may be used for different purposes, and applications may assign functions to the key or change their behavior depending on its toggling state. The key is not frequently used, and therefore some reduced or specialized keyboards lack Scroll Lock altogether.
Bytecode is a form of instruction set designed for efficient execution by a software interpreter. Unlike human-readable source code, bytecodes are compact numeric codes, constants, and references that encode the result of compiler parsing and performing semantic analysis of things like type, scope, and nesting depths of program objects.
In computer science, dynamic recompilation is a feature of some emulators and virtual machines, where the system may recompile some part of a program during execution. By compiling during execution, the system can tailor the generated code to reflect the program's run-time environment, and potentially produce more efficient code by exploiting information that is not available to a traditional static compiler.
In computer science, self-modifying code is code that alters its own instructions while it is executing – usually to reduce the instruction path length and improve performance or simply to reduce otherwise repetitively similar code, thus simplifying maintenance. The term is usually only applied to code where the self-modification is intentional, not in situations where code accidentally modifies itself due to an error such as a buffer overflow.
The A20, or address line 20, is one of the electrical lines that make up the system bus of an x86-based computer system. The A20 line in particular is used to transmit the 21st bit on the address bus.
In DOS memory management, the high memory area (HMA) is the RAM area consisting of the first 65520 bytes above the one megabyte in an IBM AT or compatible computer.
A fat binary is a computer executable program or library which has been expanded with code native to multiple instruction sets which can consequently be run on multiple processor types. This results in a file larger than a normal one-architecture binary file, thus the name.
In computer science, dynamic dispatch is the process of selecting which implementation of a polymorphic operation to call at run time. It is commonly employed in, and considered a prime characteristic of, object-oriented programming (OOP) languages and systems.
Relocation is the process of assigning load addresses for position-dependent code and data of a program and adjusting the code and data to reflect the assigned addresses. Prior to the advent of multiprocess systems, and still in many embedded systems, the addresses for objects are absolute starting at a known location, often zero. Since multiprocessing systems dynamically link and switch between programs it became necessary to be able to relocate objects using position-independent code. A linker usually performs relocation in conjunction with symbol resolution, the process of searching files and libraries to replace symbolic references or names of libraries with actual usable addresses in memory before running a program.
In computer programming, a runtime system or runtime environment is a sub-system that exists in the computer where a program is created, as well as in the computers where the program is intended to be run. The name comes from the compile time and runtime division from compiled languages, which similarly distinguishes the computer processes involved in the creation of a program (compilation) and its execution in the target machine.
In computer programming, the term hooking covers a range of techniques used to alter or augment the behaviour of an operating system, of applications, or of other software components by intercepting function calls or messages or events passed between software components. Code that handles such intercepted function calls, events or messages is called a hook.
Remote Initial Program Load is a protocol for starting a computer and loading its operating system from a server via a network. Such a server runs a network operating system such as LAN Manager, LAN Server, Windows NT Server, Novell NetWare, LANtastic, Solaris or Linux.
In a general computing sense, overlaying means "the process of transferring a block of program code or other data into main memory, replacing what is already stored". Overlaying is a programming method that allows programs to be larger than the computer's main memory. An embedded system would normally use overlays because of the limitation of physical memory, which is internal memory for a system-on-chip, and the lack of virtual memory facilities.
.sys is a filename extension used in MS-DOS applications and Microsoft Windows operating systems. They are system files that contain device drivers or hardware configurations for the system.
Computer hardware or software is said to be bug compatible if it exactly replicates an undesirable feature of a previous version. The phrase is found in the Jargon File.
Ralf Brown's Interrupt List is a comprehensive list of interrupts, calls, hooks, interfaces, data structures, CMOS settings, memory and port addresses, as well as processor opcodes and special function registers for x86 machines from the 1981 IBM PC up to 2000, most of it still applying to IBM PC compatibles today.
In computer programming, a self-relocating program is a program that relocates its own address-dependent instructions and data when run, and is therefore capable of being loaded into memory at any address. In many cases, self-relocating code is also a form of self-modifying code.
In computer programming, profile-guided optimization, also known as profile-directed feedback (PDF) or feedback-directed optimization (FDO), is the compiler optimization technique of using prior analyses of software artifacts or behaviors ("profiling") to improve the expected runtime performance of the program.
An object code optimizer, sometimes also known as a post pass optimizer or, for small sections of code, peephole optimizer, forms part of a software compiler. It takes the output from the source language compile step - the object code or binary file - and tries to replace identifiable sections of the code with replacement code that is more algorithmically efficient.
In computer science, dynamic software updating (DSU) is a field of research pertaining to upgrading programs while they are running. DSU is not currently widely used in industry. However, researchers have developed a wide variety of systems and techniques for implementing DSU. These systems are commonly tested on real-world programs.
[…] any of the […] options can be "permanently" excluded at installation time (will also save the memory for the corresponding code excerpts due to our Dynamic Dead Code Elimination), or it can be disabled or enabled at any later time via API functions in case someone wants to keep a user from being able to reboot the machine. […] we are considering to add more synchronous cache flush calls […] Due to our Dynamic Dead Code Elimination method this would not cause any kind of bloating when not needed in a particular target configuration as a particular cache flush call would be included in FreeKEYB's runtime image only if the corresponding disk cache is loaded as well or FreeKEYB was told by command line switches to load the corresponding support.
[…] FreeKEYB builds the driver's runtime image at initialization time depending on the type of machine it is being loaded on, the type of keyboard, layout, country and code page used, the type of mouse and video adapter(s) installed, the other drivers loaded on that system, the operating system and the load and relocation method(s) used, the individual features included, and the configuration options specified in the command line. Due to the large number of command line switches and options supported […] (around fifty switches […] with multiple possible settings) there is a high number of feature combinations with uncountable dependencies […] resulting in […] endless number of […] different target images. FreeKEYB's Dynamic Dead Code Elimination technique manages to resolve […] these […] dependencies and […] remove dead code and data […] is not restricted to […] include or exclude a somewhat limited number of modules or whole sub-routines and fix up some dispatch tables as in classical TSR programming, but […] works […] at […] byte level […] able to remove […] individual instructions in the middle of larger routines […] distributed all over the code to handle a particular case or support a specific feature […] special tools are used to analyze the code […] and create […] fixup tables […] automated […] using conditional defines […] to declare the various cases […] not only optional at assembly time but at initialization time […] without the […] overhead of having at least some amount of dead code left in the runtime image […] to keep track of all the dependencies between […] these conditionals, dynamically build and relocate the runtime image, fix up all the references between these small, changing, and moving binary parts […] still allowing to use the tiny .COM/.SYS style […] model […] is done at initialization time […] API to import and export object structures between FreeKEYB and the calling application […] to transparently resize and move them internally […] at runtime […]
[…] brandneue[s] Feature, der dynamischen Dead-Code-Elimination, die die jeweils notwendigen Bestandteile des Treibers erst zum Installationszeitpunkt zusammenbastelt und reloziert, so daß keine ungenutzten Code- oder Datenbereiche mehr resident bleiben (z.B. wenn jemand ein bestimmtes FreeKEYB-Feature nicht benötigt). […](NB. This represents the first known implementation of byte-level granular dynamic dead-code elimination for software assembled or compiled ahead-of-time.)
[…] a […] unique feature […] we call dynamic dead code elimination, so you can at installation time […] specify which components of the driver you want and which you don't. This goes to an extent of dynamic loadable modularization and late linkage I have not seen under DOS so far. If you do not like the screen saver, macros, the calculator, or mouse support, or <almost anything else>, you can specify this at the command line, and FreeKEYB, while taking all the dependencies between the routines into account, will completely remove all the code fragments, which deal with that feature and are not necessary to provide the requested functionality, before the driver relocates the image into the target location and makes itself resident. Removing some smaller features just saves a couple of bytes but excluding more complex components can save you half a Kb and more. You can also specify the size of the data areas […]
[…] the loader will dynamically optimize the resident data areas and code sections at byte level to remove any redundancy from the driver depending on the given hardware/software/driver configuration and locale. […]
[…] Since FreeKEYB uses our dynamic dead-code-elimination to optimize its memory image for the target environment at load time, I would certainly like to add special support into FreeKEYB for GEOS which could be controlled by a command line option, so the extra code is only loaded when GEOS is used as well. […]
[…] I would be willing to add special hooks into FreeKEYB, our much advanced DOS keyboard driver, would it prove to improve the usability under GEOS […] Due to our sophisticated new Dynamic Dead Code Elimination technology which removes at byte level any code snippets unused in a particular given driver, user, or system configuration and hardware environment when the driver's installer builds and relocates the load image of itself, this would have no memory impact for non-GEOS users at all, so there's not much to worry about (memory footprint etc.) as in traditionally coded DOS drivers. […]
[…] Lazy evaluation is basically dynamic dead code elimination. […](NB. Possibly the first public use of the term dynamic dead-code elimination, though only conceptually and with a focus on lazy evaluation in functional languages.)