churchilldu edited this page Dec 16, 2025 · 2 revisions

Welcome to the ClassTrim wiki!

Rules for computing metrics

Metrics are computed strictly as in CKjm, the tool used in: Shatnawi, R. The application of ROC analysis in threshold identification, data imbalance and metrics selection for software fault prediction. Innovations Syst Softw Eng 13, 201–217 (2017). https://doi.org/10.1007/s11334-017-0295-0

Where can the parsed classes and methods come from?

  1. JDK
  2. Third-party libraries
  3. Local project

WMC (Weighted Methods per Class)

Number of methods declared in the class.
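
As a rough illustration of this count, the sketch below uses reflection instead of bytecode parsing, which is an assumption made to keep the example self-contained. `WmcSketch` and `Sample` are hypothetical names. Reflection reports constructors and static initializers separately, whereas at the bytecode level they appear as methods named `<init>` and `<clinit>`, so a bytecode-based count may be higher than the one shown here.

```java
import java.lang.reflect.Method;
import java.util.Arrays;

final class WmcSketch {
    /** Counts the methods declared directly in {@code type}; inherited methods are ignored. */
    static int wmc(Class<?> type) {
        return (int) Arrays.stream(type.getDeclaredMethods())
                .filter(m -> !m.isSynthetic()) // assumption: skip compiler-generated methods
                .count();
    }

    public static void main(String[] args) {
        System.out.println(wmc(Sample.class));
    }
}

/** Hypothetical class used only for illustration. */
class Sample {
    void a() {}

    int b(int x) { return x; }

    // An override is declared in this class, so it counts.
    @Override
    public String toString() { return "Sample"; }
}
```

Every method contributes a weight of 1, so the metric reduces to a plain count.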

CBO (Coupling Between Objects)

Cardinality of the set of referenced classes (excluding JDK classes), including:

  1. Superclass
  2. Implemented interfaces
  3. Field types
  4. Declared method exception types
  5. Declared method argument types
  6. Declared method return types
  7. Other classes referenced via fields
  8. Classes of invoked methods
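
Items 1 to 6 can be approximated with reflection, as in the sketch below. Items 7 and 8 need the method bodies and therefore a bytecode reader, so they are left out. `CboSketch` and the small example types are hypothetical, and two choices are assumptions: JDK classes are recognized by the `java.` and `javax.` name prefixes, and the class itself is not counted as one of its own couplings.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.util.Set;
import java.util.TreeSet;

final class CboSketch {
    /** Collects the names of non-JDK classes referenced by declarations (items 1 to 6). */
    static Set<String> declaredCouplings(Class<?> type) {
        Set<String> refs = new TreeSet<>();
        add(refs, type.getSuperclass());                                 // 1. superclass
        for (Class<?> i : type.getInterfaces()) add(refs, i);            // 2. interfaces
        for (Field f : type.getDeclaredFields()) add(refs, f.getType()); // 3. field types
        for (Method m : type.getDeclaredMethods()) {
            for (Class<?> e : m.getExceptionTypes()) add(refs, e);       // 4. exception types
            for (Class<?> p : m.getParameterTypes()) add(refs, p);       // 5. argument types
            add(refs, m.getReturnType());                                // 6. return types
        }
        refs.remove(type.getName()); // assumption: no self-coupling
        return refs;
    }

    private static void add(Set<String> refs, Class<?> c) {
        if (c == null) return;                             // Object and interfaces have no superclass
        while (c.isArray()) c = c.getComponentType();      // Canvas[] couples to Canvas
        if (c.isPrimitive()) return;                       // int, boolean, void, ...
        String name = c.getName();
        if (name.startsWith("java.") || name.startsWith("javax.")) return; // assumption: JDK prefixes
        refs.add(name);
    }
}

// Hypothetical types used only for illustration.
interface Shape {}
class Point {}
class Canvas {}
class DrawException extends Exception {}

class Circle implements Shape {
    Point center;
    int radius;
    java.util.List<String> tags;

    Circle scale(double factor) { return this; }

    void draw(Canvas[] layers) throws DrawException {}
}
```

For `Circle`, the set contains `Shape`, `Point`, `Canvas`, and `DrawException`, so this declaration-only part of CBO is 4. Using a set means a class referenced in several places is still counted once.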

Note: When a class uses global constants of primitive type (e.g., int, boolean) defined in another class, the reference cannot be traced back to that class, because the compiler copies the constant value into the using class and no field reference remains in the bytecode. See the discussion: https://stackoverflow.com/questions/75954598/how-to-record-visited-constants-by-methodvisitor-in-asm
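
The limitation can be observed without a bytecode reader. Under the Java language rules, reading a compile-time constant does not initialize the class that declares it, because the value is copied into the reading class at compile time. The sketch below relies on that behavior. `ConstantInliningDemo` and `Limits` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

final class ConstantInliningDemo {
    /** Records which classes have run their static initializer. */
    static final List<String> initialized = new ArrayList<>();

    // The constant value is copied into this method; no reference to Limits remains.
    static int readConstant() { return Limits.MAX_RETRIES; }

    // A non-constant static field is read through a real field reference.
    static int readVariable() { return Limits.currentRetries; }

    public static void main(String[] args) {
        readConstant();
        System.out.println(initialized); // prints []
        readVariable();
        System.out.println(initialized); // prints [Limits]
    }
}

class Limits {
    static final int MAX_RETRIES = 3; // compile-time constant
    static int currentRetries = 0;    // ordinary static field

    static {
        ConstantInliningDemo.initialized.add("Limits");
    }
}
```

Only the second read leaves a field reference that a bytecode visitor could attribute to `Limits`, which is why constant uses are missing from the coupling set.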

RFC (Response For a Class)

Number of methods that can execute in response to a message to the class: sum of the class's own methods and the distinct external methods they directly invoke.
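
A minimal sketch of this computation, assuming the call information has already been extracted into a map from each of the class's own methods to the methods it directly invokes. `RfcSketch`, the map layout, and the method names in the example are hypothetical.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class RfcSketch {
    /**
     * @param calls maps each method declared in the class to the methods it directly invokes
     * @return the size of the response set: own methods plus distinct invoked methods
     */
    static int rfc(Map<String, List<String>> calls) {
        Set<String> response = new LinkedHashSet<>(calls.keySet()); // the class's own methods
        for (List<String> invoked : calls.values()) {
            response.addAll(invoked); // the set drops repeated and own-method entries
        }
        return response.size();
    }

    public static void main(String[] args) {
        Map<String, List<String>> calls = Map.of(
                "Order.total()", List.of("Item.price()", "Order.tax()"),
                "Order.tax()", List.of("TaxTable.rate()", "Item.price()"));
        System.out.println(rfc(calls)); // prints 4
    }
}
```

In the example, the two own methods plus `Item.price()` and `TaxTable.rate()` give 4. `Item.price()` is invoked twice but counted once, and the call to `Order.tax()` adds nothing because it is already in the set as an own method.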

QA

  1. Why are class files visited multiple times?

    • Different traversals collect different facts (e.g., signatures, calls, fields) to compute metrics accurately and cache results.
  2. Why not use a database instead of files?

    • The dataset is modest in size, so TSV files plus object serialization keep the footprint and complexity low. TSV is used rather than CSV because method names may contain commas. A database could make querying easier, but it is intentionally avoided to reduce operational overhead.
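
To illustrate the TSV choice, the sketch below writes and reads one row whose method field contains a comma. `TsvSketch` and the column layout (class name, method, metric value) are hypothetical and not the project's actual file format.

```java
import java.util.List;

final class TsvSketch {
    /** Joins fields with tabs; rejects fields that would break the row structure. */
    static String toRow(List<String> fields) {
        for (String f : fields) {
            if (f.indexOf('\t') >= 0 || f.indexOf('\n') >= 0) {
                throw new IllegalArgumentException("field contains tab or newline: " + f);
            }
        }
        return String.join("\t", fields);
    }

    /** Splits a row on tabs; the -1 limit keeps trailing empty fields. */
    static List<String> fromRow(String row) {
        return List.of(row.split("\t", -1));
    }

    public static void main(String[] args) {
        List<String> fields = List.of("com.example.Parser", "parse(java.lang.String,int)", "7");
        String row = toRow(fields);
        System.out.println(fromRow(row).equals(fields)); // prints true
    }
}
```

The comma inside `parse(java.lang.String,int)` needs no quoting or escaping, which a CSV file would require.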