A debug symbol is a special kind of symbol that attaches additional information to the symbol table of an object file, such as a shared library or an executable. This information allows a symbolic debugger to gain access to information from the source code of the binary, such as the names of identifiers, including variables and routines.

The symbolic information may be compiled together with the module's binary file, distributed in a separate file, or simply discarded during compilation and/or linking.

This information can be helpful while trying to investigate and fix a crashing application or any other fault.[1]

Debugging information

Debug symbols typically include not only the name of a function or global variable, but also the name of the source code file in which the symbol occurs, as well as the line number at which it is defined. Other information includes the type of the symbol (integer, float, function, exception, etc.), the scope (block scope or global scope), the size, and, for classes, the name of the class, and the methods and members in it.

The debug information also records the line of code in the source file that defines each symbol (a function or global variable), as well as symbols associated with exception frames.
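As an illustration of the fields listed above, the following is a minimal sketch of a single debug record. The class `DebugSymbol`, its field names, and the helpers `index_by_name` and `describe` are hypothetical and do not correspond to any real debug format; the sample file names and sizes are invented for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DebugSymbol:
    """One illustrative debug record; all field names are hypothetical."""
    name: str          # identifier as written in the source code
    kind: str          # e.g. "integer", "float", "function"
    scope: str         # "global" or "block"
    source_file: str   # file in which the symbol is defined
    line: int          # line number of the definition
    size: int          # size in bytes
    members: List[str] = field(default_factory=list)  # used for classes only


def index_by_name(symbols: List[DebugSymbol]) -> Dict[str, DebugSymbol]:
    """Build a name -> record lookup of the kind a debugger might keep."""
    return {s.name: s for s in symbols}


def describe(sym: DebugSymbol) -> str:
    """Format a record roughly the way a debugger could present it."""
    return f"{sym.name} ({sym.kind}, {sym.scope}) at {sym.source_file}:{sym.line}"


# Invented sample data for the sketch.
symbols = [
    DebugSymbol("counter", "integer", "global", "main.c", 12, 4),
    DebugSymbol("parse_args", "function", "global", "args.c", 40, 96),
]
table = index_by_name(symbols)
print(describe(table["parse_args"]))
```

Real formats encode this information far more compactly, but the lookup from a name to its source location is the core of what a symbolic debugger needs.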

This information may be stored in the symbol table of an object file, executable file, or shared library, or may be in a separate file.

On some systems, e.g., z/OS, the debug information contains more than just the symbol table; for example, the ADATA file described under External debug files contains source code.

Debugging information can take up quite a bit of space, especially the filenames and line numbers. Thus, binaries with debug symbols can become quite large, often several times the stripped file size.[2] To avoid this extra size, most operating system distributions ship binaries that are stripped, i.e. from which all of the debugging symbols have been removed. This is accomplished, for example, with the strip command in Unix. If the debugging information is in separate files, those files are usually not shipped with the distribution.
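The effect of stripping on file size can be sketched with a toy model. This does not parse any real object-file format and is not how the `strip` command is implemented: the `Record` layout, the convention that debug data lives in sections named `debug_*`, and the byte counts are all assumptions made for the illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Record:
    """One entry in a toy object file (hypothetical layout)."""
    section: str    # e.g. "code" or "debug_line"
    payload: bytes


def is_debug(record: Record) -> bool:
    # Assumption for this sketch: debug data lives in sections named "debug_*".
    return record.section.startswith("debug_")


def strip_debug(records: List[Record]) -> List[Record]:
    """Return a copy of the file with every debug record removed."""
    return [r for r in records if not is_debug(r)]


def total_size(records: List[Record]) -> int:
    """Sum of all payload sizes, in bytes."""
    return sum(len(r.payload) for r in records)


# Invented sizes: the debug records outweigh the code itself.
binary = [
    Record("code", b"\x90" * 400),
    Record("debug_names", b"n" * 300),
    Record("debug_line", b"l" * 900),
]
stripped = strip_debug(binary)
print(total_size(binary), total_size(stripped))
```

In this model the executable code is untouched by stripping; only the records a debugger would consult are dropped, which is why a stripped binary still runs but is harder to diagnose.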

Embedded symbols

Unix-like systems

stabs was an early format for debugging symbols on Unix-like systems. The newer DWARF format, for which formal specifications exist, has largely supplanted it. The specification allows any compatible compiler or assembler to create debug symbols in a standardized format, and any debugger, such as the GNU Debugger (GDB), to access and display these symbols.

The compilers for the IBM mainframe line descended from the System/360 have a TEST option that causes the compiler to include debugging information[3][4][5] in the object file. Similarly, the Binder and linkage editors have a TEST option that causes the debug information to be retained[6] in the load module. Various debug tools, e.g., OS/360 TESTRAN, TSO TEST, have the ability to use the embedded symbol definitions.

External debug files

The IBM High Level Assembler (HLASM) and other compilers running on, e.g., z/OS, have an ADATA option that produces an Associated data (ADATA) file[7] containing more information than that produced by the old TEST option. In particular, the ADATA file includes lines of source code and their metadata.

Microsoft debug symbols

Microsoft compilers generate a program database (PDB) file containing debug symbols. Some companies ship the PDB on their CD/DVD to enable troubleshooting, while others (such as Microsoft and the Mozilla Corporation) allow debug symbols to be downloaded from the Internet. The WinDbg debugger and the Visual Studio IDE can be configured to automatically download debug symbols for Windows dynamic-link libraries (DLLs) on demand. The PDB debug symbols that Microsoft distributes include only public functions, global variables and their data types. The Mozilla Corporation has similar infrastructure but distributes full debug information.

Apple

On Apple platforms, debug symbols are optionally emitted during the build process as dSYM files. Apple uses the term "symbolicate" to refer to the replacement of addresses in diagnostic files with human-readable values.[8]
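The idea behind symbolication can be sketched as a lookup from an address to the nearest preceding symbol. This is a simplified illustration, not Apple's tooling: the table `SYMBOLS`, its addresses and function names, and the function `symbolicate` are hypothetical, and the sketch assumes each symbol extends up to the start of the next one.

```python
import bisect
from typing import List, Tuple

# Hypothetical symbol table: (start address, function name), sorted by address.
SYMBOLS: List[Tuple[int, str]] = [
    (0x1000, "main"),
    (0x1040, "parse_args"),
    (0x10A0, "render_frame"),
]


def symbolicate(address: int, symbols: List[Tuple[int, str]] = SYMBOLS) -> str:
    """Return 'name + offset' for the symbol starting at or before address."""
    starts = [start for start, _ in symbols]
    # Index of the last symbol whose start address is <= address.
    i = bisect.bisect_right(starts, address) - 1
    if i < 0:
        return hex(address)  # below the first symbol: keep the raw address
    start, name = symbols[i]
    return f"{name} + {address - start}"


# Addresses as they might appear in a crash report (invented values).
for frame in (0x1044, 0x10A0, 0x0FF0):
    print(symbolicate(frame))
```

Because the sketch stores no end addresses, an address past the last function is still attributed to that function; real symbol data records sizes or ranges to avoid this.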

History

Symbolic debuggers have existed since the mainframe era, almost since the first introduction of suitable computer displays on which to show the symbolic debugging information (and even earlier, with symbolic dumps on paper). They were not restricted to high-level compiled languages and were also available for assembly language programs. On the IBM System/360, compilers and assemblers could produce object code (on request) that included "SYM cards". These were usually ignored by the program loader but were useful to a symbolic debugger, as they were kept in the same program library as the executable code.

References

  1. "Debugging with Symbols". Windows Dev Center. Microsoft. Archived from the original on 2020-01-11. Retrieved 2020-01-11.
  2. "What are Symbols For?". TechNet. Microsoft. 2008-07-15. Archived from the original on 2014-12-26. Retrieved 2015-01-04.
  3. "Appendix D: TESTRAN Editor Input Record Formats". IBM System/360 Operating System - TESTRAN - Program Logic Manual - Program Number 3605-PT-516 (PDF). TNL GN26-8016. IBM. 1971-04-01. pp. 119–120. GY28-6611-0. Retrieved 2024-07-11.
  4. "Appendix. Input conventions and Record Formats". MVS/370 - Linkage Editor Logic - Data Facility Product 5665-295 - Release 1.0 (PDF) (First ed.). IBM. April 1983. pp. 195–206. LY26-3921-0. Retrieved 2024-07-11.
  5. LY26-3921-0, p. 195, Figure 69. SYM Input Record (Card Image).
  6. LY26-3921-0, p. 199, Figure 76. SYM Record (Load Module).
  7. "Appendix C. Associated data file output". High Level Assembler for z/OS & z/VM & z/VSE - Programmer's Guide - Version 1 Release 6 (PDF). IBM. 2015. pp. 227–275. SC26-4941-07. Retrieved 2024-07-11.
  8. "Understanding and Analyzing iOS Application Crash Reports". iOS Developer Library. Apple, Inc. 2018-01-08 [2009-01-29]. Technical Note TN2151. Archived from the original on 2019-12-19. Retrieved 2020-01-11.