   1 # Some Information for Contributors
   2 Thank you for considering making a contribution to tcpdump! Please use the
   3 guidelines below to achieve the best results and experience for everyone.
   4
   5 ## How to report bugs and other problems
   6 **To report a security issue (segfault, buffer overflow, infinite loop, arbitrary
   7 code execution etc.), please send an e-mail to security@tcpdump.org; do not use
   8 the bug tracker!**
   9
  10 To report a non-security problem (failure to compile, incorrect output in the
  11 protocol printout, missing support for a particular protocol etc.), please check
  12 first that it reproduces with the latest stable release of tcpdump and the latest
  13 stable release of libpcap. If it does, please check that the problem reproduces
  14 with the current git master branch of tcpdump and the current git master branch of
  15 libpcap. If it does (and it is not a security-related problem, otherwise see
  16 above), please navigate to the
  17 [bug tracker](https://github.com/the-tcpdump-group/tcpdump/issues)
  18 and check if the problem has already been reported. If it has not, please open
  19 a new issue and provide the following details:
  20
  21 * tcpdump and libpcap version (`tcpdump --version`)
  22 * operating system name and version and any other details that may be relevant
  23   (`uname -a`, compiler name and version, CPU type etc.)
  24 * custom `configure`/`cmake` flags, if any
  25 * statement of the problem
  26 * steps to reproduce
  27
  28 Please note that if you know exactly how to solve the problem and the solution
  29 would not be too intrusive, it would be best to contribute some development time
  30 and to open a pull request instead as discussed below.
  31
  32 Still not sure what to do? Feel free to
  33 [subscribe to the mailing list](https://www.tcpdump.org/#mailing-lists)
  34 and ask!
  35
  36
  37 ## How to add new code and to update existing code
  38
  39 1) Check that there isn't a pull request already opened for the changes you
  40    intend to make.
  41
  42 2) [Fork](https://help.github.com/articles/fork-a-repo/) the tcpdump
  43    [repository](https://github.com/the-tcpdump-group/tcpdump).
  44
  45 3) The easiest way to test your changes on multiple operating systems and
  46    architectures is to let the upstream CI test your pull request (more on
  47    this below).
  48
  49 4) Set up your git working copy
  50    ```
  51    git clone https://github.com/<username>/tcpdump.git
  52    cd tcpdump
  53    git remote add upstream https://github.com/the-tcpdump-group/tcpdump
  54    git fetch upstream
  55    ```
  56
  57 5) Run `touch .devel` in your working directory.
  58    Currently, this has the following effects:
  59    * it adds (via `configure`, in `Makefile`) some warning options (`-Wall`,
  60      `-Wmissing-prototypes`, `-Wstrict-prototypes`, ...) to the compiler if it
  61      supports these options,
  62    * it makes the `Makefile` support `make depend` and the `configure` script run it.
  63
  64 6) Configure and build
  65    ```
  66    ./configure && make -s && make check
  67    ```
  68
  69 7) Add/update tests
  70    The `tests` directory contains regression tests of the dissection of captured
  71    packets.  Those captured packets were saved by running tcpdump with the option
  72    `-w sample.pcap`.  Additional options, such as `-n`, are used to create relevant
  73    and reproducible output; `-#` is used to indicate which particular packets
  74    have output that differs.  The tests are run with the `TZ` environment
  75    variable set to `GMT0`, so that UTC, rather than the local time where the
  76    tests are being run, is used when "local time" values are printed.  The
  77    actual test compares the current text output with the expected result
  78    (`sample.out`) saved from a previous version.
  79
  80    Any new/updated fields in a dissector must be present in a `sample.pcap` file
  81    and the corresponding output file.
  82
  83    Configuration is set in `tests/TESTLIST`.
  84    Each line in this file has the following format:
  85    ```
  86    test-name   sample.pcap   sample.out   tcpdump-options
  87    ```
  88
  89    The `sample.out` file can be produced as follows:
  90    ```
  91    (cd tests && TZ=GMT0 ../tcpdump -# -n -r sample.pcap tcpdump-options > sample.out)
  92    ```
  93
  94    Or, for convenience, use `./update-test.sh test-name`.
  95
  96    It is often useful to have test outputs with different verbosity levels
  97    (none, `-v`, `-vv`, `-vvv`, etc.) depending on the code.
  98
  99 8) Test using `make check` (current build options) and `./build_matrix.sh`
 100    (a multitude of build options, build systems and compilers). If you can,
 101    test on more than one operating system. Don't send a pull request until
 102    all tests pass.
 103
 104 9) Try to rebase your commits to keep the history simple.
 105    ```
 106    git fetch upstream
 107    git rebase upstream/master
 108    ```
 109    (If the rebase fails and you cannot resolve the conflicts, run
 110    `git rebase --abort` and ask for help in a pull request comment.)
 111
 112 10) Once 100% happy, put your work into your forked repository using `git push`.
 113
 114 11) [Initiate and send](https://help.github.com/articles/using-pull-requests/)
 115     a pull request.
 116     This will trigger the upstream repository CI tests.
 117
 118
 119 ## Code style and generic remarks
 120 1) A thorough reading of the code of some other printers is useful.
 121
 122 2) To help you learn how tcpdump works, or to help with debugging,
 123    you can configure and build tcpdump with function instrumentation:
 124    ```
 125    $ ./configure --enable-instrument-functions
 126    $ make -s clean all
 127    ```
 128
 129    This generates instrumentation calls for entry and exit to functions.
 130    Just after function entry and just before function exit, these
 131    profiling functions are called and print the function names with
 132    indentation and call level.
 133
 134    On function entry, the calling function name is also printed, with its
 135    file name and line number. The line number may be slightly off.
 136
 137    In some cases, with Clang 11, the file name is unknown (printed '??')
 138    or the line number is unknown (printed '?'). In this case, use GCC.
 139
 140    If the environment variable INSTRUMENT is
 141    - unset or set to an empty string, print nothing, like with no
 142      instrumentation
 143    - set to "all" or "a", print all the function names
 144    - set to "global" or "g", print only the global function names
 145
 146    This allows you to run:
 147    ```
 148    $ INSTRUMENT=a ./tcpdump ...
 149    $ INSTRUMENT=g ./tcpdump ...
 150    $ INSTRUMENT= ./tcpdump ...
 151    ```
 152    or
 153    ```
 154    $ export INSTRUMENT=global
 155    $ ./tcpdump ...
 156    ```
 157
 158    The library libbfd is used, therefore the binutils-dev package is required.
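
The `INSTRUMENT` rules above amount to a small decision table. The sketch below is an illustration only, not tcpdump's actual code: the enum and the function name are hypothetical, and treating an unrecognized value like an unset one is an assumption that the text above does not confirm.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names, for illustration only. */
enum ex_instrument_mode {
    EX_INSTRUMENT_OFF,      /* print nothing */
    EX_INSTRUMENT_ALL,      /* print all function names */
    EX_INSTRUMENT_GLOBAL    /* print only global function names */
};

/*
 * Map the value of the INSTRUMENT environment variable, as returned by
 * getenv() (so NULL when unset), to a mode.
 */
enum ex_instrument_mode
ex_instrument_mode_from_env(const char *value)
{
    if (value == NULL || value[0] == '\0')
        return EX_INSTRUMENT_OFF;
    if (strcmp(value, "all") == 0 || strcmp(value, "a") == 0)
        return EX_INSTRUMENT_ALL;
    if (strcmp(value, "global") == 0 || strcmp(value, "g") == 0)
        return EX_INSTRUMENT_GLOBAL;
    /* Assumption: any other value behaves like an unset variable. */
    return EX_INSTRUMENT_OFF;
}
```

A caller would typically pass `getenv("INSTRUMENT")` once at startup and keep the result.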
 159
 160 3) Put the normative reference, if any, in comments (RFC, etc.).
 161
 162 4) Put the format of packets/headers/options as comments if there is no
 163    published normative reference.
 164
 165 5) The printer may receive an incomplete packet in the buffer, truncated at an
 166    arbitrary position, for example when capturing with the `-s size` option.
 167    This means that an attempt to fetch packet data based on the expected
 168    format of the packet may run the risk of overrunning the buffer.
 169
 170    Furthermore, if the packet is complete, but is not correctly formed,
 171    that can also cause a printer to overrun the buffer, as it will be
 172    fetching packet data based on the expected format of the packet.
 173
 174    Therefore, integral, IPv4 address, and octet sequence values should
 175    be fetched using the `GET_*()` macros, which are defined in
 176    `extract.h`.
 177
 178    If your code reads and decodes every byte of the protocol packet, then to
 179    ensure proper and complete bounds checks it would be sufficient to read all
 180    packet data using the `GET_*()` macros.
 181
 182    If your code uses the macros above only on some packet data, then the gaps
 183    would have to be bounds-checked using the `ND_TCHECK_*()` macros:
 184    ```
 185    ND_TCHECK_n(p), n in { 1, 2, 3, 4, 5, 6, 7, 8, 16 }
 186    ND_TCHECK_SIZE(p)
 187    ND_TCHECK_LEN(p, l)
 188    ```
 189
 190    where *p* points to the data not being decoded.  For `ND_TCHECK_n()`,
 191    *n* is the length of the gap, in bytes.  For `ND_TCHECK_SIZE()`, the
 192    length of the gap, in bytes, is the size of an item of the data type
 193    to which *p* points.  For `ND_TCHECK_LEN()`, *l* is the length of the
 194    gap, in bytes.
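
The idea behind such a gap check can be shown outside tcpdump. The following is a minimal standalone sketch, not the real macro implementation: `ex_bytes_available()` and its parameters are hypothetical, with `caplen` standing for the number of bytes actually captured and `off` for the offset of the gap within them.

```c
#include <stddef.h>

/*
 * Return 1 if the len bytes starting at offset off lie entirely within
 * a buffer holding caplen captured bytes, and 0 otherwise.
 * The comparison is arranged so that off + len is never computed,
 * which avoids integer overflow for very large len.
 */
int
ex_bytes_available(size_t caplen, size_t off, size_t len)
{
    if (off > caplen)
        return 0;
    return len <= caplen - off;
}
```

A printer skipping a 6-byte reserved field at offset 4 of a 10-byte capture would pass `(10, 4, 6)` and get 1; with a 7-byte field it would get 0 and should stop decoding.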
 195
 196    For the `GET_*()` and `ND_TCHECK_*()` macros (if not already done):
 197    * Assign: `ndo->ndo_protocol = "protocol";`
 198    * Define: `ND_LONGJMP_FROM_TCHECK` before including `netdissect.h`
 199    * Make sure that the intersection of `GET_*()` and `ND_TCHECK_*()` is minimal,
 200      but at the same time their union covers all packet data in all cases.
 201
 202    You can test the code via:
 203    ```
 204    sudo ./tcpdump -s snaplen [-v][v][...] -i lo # in a terminal
 205    sudo tcpreplay -i lo sample.pcap             # in another terminal
 206    ```
 207    Try several values for snaplen to exercise truncation at different positions.
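
Truncation at every possible position can also be simulated in-process. The sketch below assumes a made-up 4-byte header (1-byte type, 1-byte flags, 2-byte big-endian length); all names are hypothetical. It imitates the jump-out-on-truncation idea with plain `setjmp()`/`longjmp()` and is not tcpdump's implementation.

```c
#include <setjmp.h>
#include <stddef.h>

static jmp_buf ex_truncated;    /* where to resume when the data runs out */

/* Fetch one byte, or jump out if it lies beyond the captured length. */
static unsigned
ex_get_u_1(const unsigned char *buf, size_t caplen, size_t off)
{
    if (off >= caplen)
        longjmp(ex_truncated, 1);
    return buf[off];
}

/*
 * Decode the made-up header.  Returns the number of fields decoded
 * before the data ran out; 3 means the header was complete.
 */
int
ex_dissect(const unsigned char *buf, size_t caplen)
{
    /* volatile: modified after setjmp() and read after longjmp() */
    volatile int fields = 0;
    unsigned len;

    if (setjmp(ex_truncated) != 0)
        return fields;                      /* truncated packet */
    (void)ex_get_u_1(buf, caplen, 0);       /* type */
    fields = 1;
    (void)ex_get_u_1(buf, caplen, 1);       /* flags */
    fields = 2;
    len = ex_get_u_1(buf, caplen, 2) << 8;  /* length, high byte */
    len |= ex_get_u_1(buf, caplen, 3);      /* length, low byte */
    (void)len;
    fields = 3;
    return fields;
}
```

Calling `ex_dissect()` in a loop with `caplen` going from 0 to the full packet length plays the role of trying several snaplen values: no call may read past `caplen`, whatever the cut-off point.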
 208
 209 *  The `GET_*()` macros that fetch integral values are:
 210    ```
 211    GET_U_1(p)
 212    GET_S_1(p)
 213    GET_BE_U_n(p), n in { 2, 3, 4, 5, 6, 7, 8 }
 214    GET_BE_S_n(p), n in { 2, 3, 4, 5, 6, 7, 8 }
 215    GET_LE_U_n(p), n in { 2, 3, 4, 5, 6, 7, 8 }
 216    GET_LE_S_n(p), n in { 2, 3, 4, 5, 6, 7, 8 }
 217    ```
 218
 219    where *p* points to the integral value in the packet buffer. The
 220    macro returns the integral value at that location.
 221
 222    `U` indicates that an unsigned value is fetched; `S` indicates that a
 223    signed value is fetched.  For multi-byte values, `BE` indicates that
 224    a big-endian value ("network byte order") is fetched, and `LE`
 225    indicates that a little-endian value is fetched. *n* is the length,
 226    in bytes, of the multi-byte integral value to be fetched.
 227
 228    In addition to the bounds checking the `GET_*()` macros perform,
 229    using those macros has other advantages:
 230
 231    * tcpdump runs on both big-endian and little-endian systems, so
 232      fetches of multi-byte integral values must be done in a fashion
 233      that works regardless of the byte order of the machine running
 234      tcpdump.  The `GET_BE_*()` macros will fetch a big-endian value and
 235      return a host-byte-order value on both big-endian and little-endian
 236      machines, and the `GET_LE_*()` macros will fetch a little-endian
 237      value and return a host-byte-order value on both big-endian and
 238      little-endian machines.
 239
 240    * tcpdump runs on machines that do not support unaligned access to
 241      multi-byte values, and packet values are not guaranteed to be
 242      aligned on the proper boundary.  The `GET_BE_*()` and `GET_LE_*()`
 243      macros will fetch values even if they are not aligned on the proper
 244      boundary.
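
Both advantages come from assembling the value one byte at a time. The helpers below are a standalone sketch with hypothetical names; unlike the real `GET_*()` macros they perform no bounds checking, and they are not claimed to match tcpdump's implementation.

```c
#include <stdint.h>

/* Big-endian ("network byte order") 16-bit unsigned value. */
uint16_t
ex_be_u_2(const unsigned char *p)
{
    return (uint16_t)(((unsigned)p[0] << 8) | p[1]);
}

/* Big-endian 32-bit unsigned value. */
uint32_t
ex_be_u_4(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Little-endian 32-bit unsigned value. */
uint32_t
ex_le_u_4(const unsigned char *p)
{
    return ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) |
           ((uint32_t)p[1] << 8) | (uint32_t)p[0];
}

/* Big-endian 16-bit signed (two's complement) value. */
int16_t
ex_be_s_2(const unsigned char *p)
{
    long v = ex_be_u_2(p);

    /* Map 0x8000..0xFFFF to -32768..-1 without relying on casts. */
    return (int16_t)(v >= 0x8000L ? v - 0x10000L : v);
}
```

Because only single bytes are read and the shifts operate on values rather than on memory, the result is the same on big-endian and little-endian hosts, and `p` may point to any address, aligned or not.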
 245
 246 *  The `GET_*()` macros that fetch IPv4 address values are:
 247    ```
 248    GET_IPV4_TO_HOST_ORDER(p)
 249    GET_IPV4_TO_NETWORK_ORDER(p)
 250    ```
 251
 252    where *p* points to the address in the packet buffer.
 253    `GET_IPV4_TO_HOST_ORDER()` returns the address in the byte order of
 254    the host that is running tcpdump; `GET_IPV4_TO_NETWORK_ORDER()`
 255    returns it in network byte order.
 256
 257    Like the integral `GET_*()` macros, these macros work correctly on
 258    both big-endian and little-endian machines and will fetch values even
 259    if they are not aligned on the proper boundary.
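
The difference between the two byte orders can be sketched as follows. The function names are hypothetical, bounds checking is omitted, and the code reflects one reading of the description above rather than tcpdump's implementation.

```c
#include <stdint.h>
#include <string.h>

/*
 * Address as a number in the host's byte order:
 * 192.0.2.1 always yields 0xC0000201, on any host.
 */
uint32_t
ex_ipv4_to_host_order(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Address in network byte order: the four bytes are copied unchanged,
 * so the in-memory layout of the result matches the packet.
 */
uint32_t
ex_ipv4_to_network_order(const unsigned char *p)
{
    uint32_t addr;

    memcpy(&addr, p, sizeof(addr));     /* safe for unaligned p */
    return addr;
}
```

The host-order form suits arithmetic and comparisons such as netmask tests; the network-order form suits code that expects the address exactly as it appears on the wire.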
 260
 261 *  The `GET_*()` macro that fetches an arbitrary sequence of bytes is:
 262    ```
 263    GET_CPY_BYTES(dst, p, len)
 264    ```
 265
 266    where *dst* is the destination to which the sequence of bytes should
 267    be copied, *p* points to the first byte of the sequence of bytes, and
 268    *len* is the number of bytes to be copied.  The bytes are copied in
 269    the order in which they appear in the packet.
 270
 271 *  To fetch a network address and convert it to a printable string, use
 272    the following `GET_*()` macros, defined in `addrtoname.h`, to
 273    perform bounds checks to make sure the entire address is within the
 274    buffer and to translate the address to a string to print:
 275    ```
 276    GET_IPADDR_STRING(p)
 277    GET_IP6ADDR_STRING(p)
 278    GET_MAC48_STRING(p)
 279    GET_EUI64_STRING(p)
 280    GET_EUI64LE_STRING(p)
 281    GET_LINKADDR_STRING(p, type, len)
 282    GET_ISONSAP_STRING(nsap, nsap_length)
 283    ```
 284
 285    `GET_IPADDR_STRING()` fetches an IPv4 address pointed to by *p* and
 286    returns a string that is either a host name, if the `-n` flag wasn't
 287    specified and a host name could be found for the address, or the
 288    standard XXX.XXX.XXX.XXX-style representation of the address.
 289
 290    `GET_IP6ADDR_STRING()` fetches an IPv6 address pointed to by *p* and
 291    returns a string that is either a host name, if the `-n` flag wasn't
 292    specified and a host name could be found for the address, or the
 293    standard XXXX::XXXX-style representation of the address.
 294
 295    `GET_MAC48_STRING()` fetches a 48-bit MAC address (Ethernet, 802.11,
 296    etc.) pointed to by *p* and returns a string that is either a host
 297    name, if the `-n` flag wasn't specified and a host name could be
 298    found in the ethers file for the address, or the standard
 299    XX:XX:XX:XX:XX:XX-style representation of the address.
 300
 301    `GET_EUI64_STRING()` fetches a 64-bit EUI pointed to by *p* and
 302    returns a string that is the standard XX:XX:XX:XX:XX:XX:XX:XX-style
 303    representation of the address.
 304
 305    `GET_EUI64LE_STRING()` fetches a 64-bit EUI, in reverse byte order,
 306    pointed to by *p* and returns a string that is the standard
 307    XX:XX:XX:XX:XX:XX:XX:XX-style representation of the address.
 308
 309    `GET_LINKADDR_STRING()` fetches an octet string, of length *len*
 310    and type *type*, pointed to by *p* and returns a string whose format
 311    depends on the value of *type*:
 312
 313    * `LINKADDR_MAC48` - if the length is 6, the string has the same
 314    value as `GET_MAC48_STRING()` would return for that address,
 315    otherwise, the string is a sequence of XX:XX:... values for the bytes
 316    of the address;
 317
 318    * `LINKADDR_FRELAY` - the string is "DLCI XXX", where XXX is the
 319    DLCI, if the address is a valid Q.922 header, and an error indication
 320    otherwise;
 321
 322    * `LINKADDR_EUI64`, `LINKADDR_ATM`, `LINKADDR_OTHER` -
 323    the string is a sequence of XX:XX:... values for the bytes
 324    of the address.
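
The XX:XX:... representation used by several of these macros can be produced as sketched below. `ex_hex_colon_string()` is a hypothetical standalone helper: it does no name lookup and no bounds checking of the packet buffer, and the lowercase hex digits are an assumption.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format len bytes as colon-separated hex ("xx:xx:...") into out.
 * If reverse is nonzero, the bytes are emitted last to first, as for
 * an address stored in reverse byte order.
 * Returns out, or NULL if len is 0 or out is too small.
 */
char *
ex_hex_colon_string(const unsigned char *p, size_t len, int reverse,
                    char *out, size_t outlen)
{
    size_t i;

    /* Each byte needs 2 hex digits plus a ':' or the final NUL. */
    if (len == 0 || outlen < len * 3)
        return NULL;
    for (i = 0; i < len; i++) {
        unsigned char b = reverse ? p[len - 1 - i] : p[i];

        snprintf(out + i * 3, outlen - i * 3, "%02x%s", b,
                 i + 1 < len ? ":" : "");
    }
    return out;
}
```

For the 6 bytes `00 1b 21 aa bb cc` this yields `00:1b:21:aa:bb:cc`, and `cc:bb:aa:21:1b:00` when `reverse` is set.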
 325
 326 6) When defining a structure corresponding to a packet or part of a
 327    packet, so that a pointer to packet data can be cast to a pointer to
 328    that structure and that structure pointer used to refer to fields in
 329    the packet, use the `nd_*` types for the structure members.
 330
 331    Those types are all aligned only on a 1-byte boundary, so a
 332    compiler will not assume that the structure is aligned on a boundary
 333    stricter than one byte; there is no guarantee that fields in packets
 334    are aligned on any particular boundary.
 335
 336    This means that all padding in the structure must be explicitly
 337    declared as fields in the structure.
 338
 339    The `nd_*` types for integral values are:
 340
 341    * `nd_uintN_t`, for unsigned integral values, where *N* is the number
 342       of bytes in the value.
 343    * `nd_intN_t`, for signed integral values, where *N* is the number
 344       of bytes in the value.
 345
 346    The `nd_*` types for IP addresses are:
 347
 348    * `nd_ipv4`, for IPv4 addresses;
 349    * `nd_ipv6`, for IPv6 addresses.
 350
 351    The `nd_*` types for link-layer addresses are:
 352
 353    * `nd_mac48`, for MAC-48 (Ethernet, 802.11, etc.) addresses;
 354    * `nd_eui64`, for EUI-64 values.
 355
 356    The `nd_*` type for a byte in a sequence of bytes is `nd_byte`; an
 357    *N*-byte sequence should be declared as `nd_byte[N]`.
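
One way to obtain types with 1-byte alignment is to declare them as arrays of `unsigned char`. The sketch below uses hypothetical `ex_*` stand-ins rather than the real `nd_*` types and a made-up header layout; the sizes and offsets noted in the comments hold on common ABIs.

```c
#include <stddef.h>

/* Hypothetical stand-ins for nd_*-style types: plain byte arrays. */
typedef unsigned char ex_uint8_t[1];
typedef unsigned char ex_uint16_t[2];
typedef unsigned char ex_ipv4[4];

/* A made-up header with a misaligned field and explicit padding. */
struct ex_hdr {
    ex_uint8_t  type;       /* offset 0 */
    ex_uint16_t length;     /* offset 1: not 2-byte aligned */
    ex_uint8_t  pad;        /* offset 3: padding declared explicitly */
    ex_ipv4     addr;       /* offset 4 */
};

/*
 * Read the big-endian length field through a structure pointer.
 * pkt may point to any address; no alignment is assumed.
 * (A real printer would also bounds-check the access.)
 */
unsigned
ex_hdr_length(const unsigned char *pkt)
{
    const struct ex_hdr *h = (const struct ex_hdr *)pkt;

    return ((unsigned)h->length[0] << 8) | h->length[1];
}
```

Since every member is a byte array, the compiler inserts no padding of its own, so the structure layout mirrors the packet layout byte for byte.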
 358
 359 7) Check for invalid packets in code: assume that your code can receive as input
 360    not only a valid packet but any arbitrary sequence of octets, for example a packet that
 361    * was built malformed by the sender or by a fuzz tester,
 362    * became corrupted in transit or for some other reason.
 363
 364    Print with: `nd_print_invalid(ndo);  /* to print " (invalid)" */`
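
As an illustration of such checks, the sketch below walks a made-up option list in which each option is a 1-byte type, a 1-byte length that counts the 2-byte header, and a value. The function name and the encoding are hypothetical; where a real printer would report the packet as invalid, the sketch returns -1.

```c
#include <stddef.h>

/*
 * Count the options in the len bytes at p.
 * Returns the number of options, or -1 if the data is malformed.
 */
int
ex_count_options(const unsigned char *p, size_t len)
{
    int count = 0;

    while (len > 0) {
        size_t optlen;

        if (len < 2)
            return -1;      /* no room for the type and length bytes */
        optlen = p[1];
        if (optlen < 2)
            return -1;      /* length does not cover its own header;
                               a length of 0 would also loop forever */
        if (optlen > len)
            return -1;      /* option claims more data than exists */
        p += optlen;
        len -= optlen;
        count++;
    }
    return count;
}
```

Each of the three checks rejects input that a well-behaved sender would never produce but that a fuzzer or a corrupted packet easily can.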
 365
 366 8) Use `struct tok` for indexed strings and print them with
 367    `tok2str()` or `bittok2str()` (for flags).
 368    All `struct tok` must end with `{ 0, NULL }`.
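
The table-plus-lookup pattern can be sketched in isolation as follows. `struct ex_tok`, `ex_tok2str()` and `ex_bittok2str()` are simplified hypothetical stand-ins, not the real tcpdump functions: they write into a caller-supplied buffer, and the "Unknown (%u)" text and ", " separator are assumptions.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for struct tok: a value and its name. */
struct ex_tok {
    unsigned    v;
    const char *s;
};

/* Example tables; each ends with the { 0, NULL } sentinel. */
const struct ex_tok ex_msg_types[] = {
    { 1, "Request" },
    { 2, "Reply" },
    { 0, NULL }
};

const struct ex_tok ex_flags[] = {
    { 0x01, "SYN" },
    { 0x02, "ACK" },
    { 0, NULL }
};

/* Return the name for v, or "Unknown (v)" formatted into buf. */
const char *
ex_tok2str(const struct ex_tok *t, unsigned v, char *buf, size_t buflen)
{
    for (; t->s != NULL; t++) {
        if (t->v == v)
            return t->s;
    }
    snprintf(buf, buflen, "Unknown (%u)", v);
    return buf;
}

/* Join the names of all flags set in v with ", " into buf. */
const char *
ex_bittok2str(const struct ex_tok *t, unsigned v, char *buf, size_t buflen)
{
    size_t used = 0;

    if (buflen == 0)
        return buf;
    buf[0] = '\0';
    for (; t->s != NULL; t++) {
        int n;

        if ((v & t->v) == 0)
            continue;
        n = snprintf(buf + used, buflen - used, "%s%s",
                     used > 0 ? ", " : "", t->s);
        if (n < 0 || (size_t)n >= buflen - used)
            break;          /* output truncated; stop here */
        used += (size_t)n;
    }
    return buf;
}
```

The sentinel is what stops both loops, which is why a table without it would be read past its end.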
 369
 370 9) Avoid empty lines in output of printers.
 371
 372 10) A commit message must have:
 373    ```
 374    First line: Capitalized short summary in the imperative (50 chars or less)
 375
 376    If the commit concerns a protocol, the summary line must start with
 377    "protocol: ".
 378
 379    Body: Detailed explanatory text, if necessary. Fold it to approximately
 380    72 characters. There must be an empty line separating the summary from
 381    the body.
 382    ```
 383
 384 11) Avoid non-ASCII characters in code and commit messages.
 385
 386 12) Use the style of the modified sources.
 387
 388 13) Don't mix declarations and code.
 389
 390 14) tcpdump requires a compiler that supports C99 or later, so C99
 391    features may be used in code, but C11 or later features should not be
 392    used.
 393
 394 15) Avoid trailing tabs/spaces.