Code statistics

CodeDiffs.Stats.extract_stats - Function
extract_stats(::Val{code_type}, code, stats_opts=(;))

Analyses code and extracts high-level information about it (instruction count, register usage, function calls...).

stats_opts is supplied by the user, and its meaning depends on the code_type.
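
The Val{code_type} argument lets Julia pick one method per code type through multiple dispatch. As a minimal sketch of that pattern (demo_stats, the :lines and :words code types and the min_length option are hypothetical names made up for illustration, not part of CodeDiffs):

```julia
# Hypothetical stand-in for extract_stats; NOT the CodeDiffs implementation.
# Generic fallback: unknown code types raise an error.
demo_stats(::Val{T}, code, opts=(;)) where {T} =
    error("no statistics available for code type $T")

# One method per code type, selected through the Val type parameter.
demo_stats(::Val{:lines}, code, opts=(;)) =
    (; line_count = count(==('\n'), code) + 1)

function demo_stats(::Val{:words}, code, opts=(;))
    # `min_length` is an illustrative option read from opts, with a default.
    min_length = get(opts, :min_length, 1)
    return (; word_count = count(w -> length(w) >= min_length, split(code)))
end

code = "mov r1, r2\nadd r3, r1, r2"
lines_stats = demo_stats(Val(:lines), code)
words_stats = demo_stats(Val(:words), code, (; min_length = 3))
```

Options are passed as a NamedTuple, so each method can read only the keys it understands and fall back to defaults for the rest.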

source
extract_stats(::Val{:ptx}, ptx_source, stats_opts)

Extracts various statistics about the ptx_source.

The PTX source isn't parsed; instead, regular expressions are used to analyze all instructions and declarations within the source.

Declaration and memory (loads and stores) statistics are grouped by address space and type. All address spaces and all fundamental PTX types are supported. Usage of dynamic shared memory is detected. A vector variable counts as 2, 4 or 8 variables of its base type.
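
To illustrate the idea of grouping declarations by address space and type, here is a rough sketch. The regular expression and the count_declarations helper are assumptions written for this example, not the patterns CodeDiffs actually uses; the sketch ignores parameterized names such as %r<5>, array sizes and .extern declarations.

```julia
# Illustrative only: counts PTX variable declarations per (address space, type).
# A vector declaration (.v2, .v4, ...) counts as that many base-type variables.
function count_declarations(ptx::AbstractString)
    decl = r"^\s*\.(reg|global|shared|local|param|const)\s+(?:\.align\s+\d+\s+)?(?:\.v(\d)\s+)?\.([a-z]+\d+|pred)\b"m
    counts = Dict{Tuple{String,String},Int}()
    for m in eachmatch(decl, ptx)
        # m[2] is `nothing` when the declaration has no vector qualifier.
        width = m[2] === nothing ? 1 : parse(Int, m[2])
        key = (String(m[1]), String(m[3]))
        counts[key] = get(counts, key, 0) + width
    end
    return counts
end

ptx = """
.reg .b32 %r<5>;
.reg .v4 .f32 %vec;
.shared .align 4 .b8 smem[64];
.global .v2 .f32 gvec;
.reg .b32 %tmp;
"""

decl_counts = count_declarations(ptx)
```

With this sample input, the two .reg .b32 declarations are counted once each, while the .v4 and .v2 declarations contribute 4 and 2 .f32 variables to their address spaces.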

Register statistics (mainly .param) exclude external functions defined in ptx_source, but still include the calls made from the main function to those external functions.

Info

PTX uses virtual registers, so the compiler can declare thousands of register variables; registers are only allocated physically in the SASS code. Use register information only to monitor address space usage, not to estimate performance.

source
extract_stats(::Val{:sass}, sass_source, stats_opts)

Extracts various statistics about the sass_source.

The SASS source isn't parsed; instead, regular expressions are used to analyze all instructions and declarations within the source.

The following instructions are searched for:

  • BSYNC counts as a workgroup synchronization
  • WARPSYNC counts as a warp synchronization
  • CALL counts as a function call
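
The search described above can be approximated with a few word-bounded regular expressions. This is only a sketch: count_sass_instructions, its field names and the sample SASS listing are assumptions for illustration, and the patterns used by CodeDiffs may differ.

```julia
# Illustrative only: counts the three instruction mnemonics listed above.
# \b keeps BSYNC from matching inside longer mnemonics, and lets CALL match
# forms with modifiers such as CALL.REL.NOINC.
function count_sass_instructions(sass::AbstractString)
    return (
        workgroup_syncs = count(r"\bBSYNC\b", sass),
        warp_syncs = count(r"\bWARPSYNC\b", sass),
        calls = count(r"\bCALL\b", sass),
    )
end

sass = """
/*0000*/  MOV R1, c[0x0][0x28] ;
/*0010*/  WARPSYNC 0xffffffff ;
/*0020*/  CALL.REL.NOINC 0x100 ;
/*0030*/  BSYNC B0 ;
/*0040*/  CALL.REL.NOINC 0x200 ;
"""

sass_counts = count_sass_instructions(sass)
```
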
source
extract_stats(::Val{:cuda_stats}, (ptx_source, sass_source), stats_opts)

Combines the statistics of ptx_source and sass_source into one. See extract_stats(::Val{:ptx}, ptx_source, stats_opts) and extract_stats(::Val{:sass}, sass_source, stats_opts).

source
extract_stats(::Val{:gcn}, gcn_source::AbstractString, stats_opts)

Extracts various statistics about the gcn_source.

gcn_source should retain all the metadata sections generated by the compiler (see the LLVM docs); otherwise some fields default to 0.

Instruction statistics are computed using regular expressions.
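
As a sketch of why missing metadata leads to fields of 0, the helper below looks up an integer metadata entry and falls back to 0 when it is absent. metadata_field is a hypothetical helper, and the keys shown (.sgpr_count, .vgpr_count, .group_segment_fixed_size) are taken from the AMDGPU code object metadata format as an assumption about which fields matter; CodeDiffs may read different ones.

```julia
# Illustrative only: reads `key: <integer>` from the metadata text,
# returning 0 when the entry (or the whole metadata section) is missing.
function metadata_field(gcn::AbstractString, key::AbstractString)
    # \Q...\E quotes the key so that its leading '.' is matched literally.
    m = match(Regex("^\\s*\\Q$key\\E:\\s*(\\d+)", "m"), gcn)
    return m === nothing ? 0 : parse(Int, m[1])
end

gcn = """
    s_load_dwordx2 s[0:1], s[4:5], 0x0
    v_mov_b32_e32 v0, 0
    s_endpgm
    .amdgpu_metadata
---
amdhsa.kernels:
  - .name:           kernel
    .sgpr_count:     14
    .vgpr_count:     8
...
    .end_amdgpu_metadata
"""

sgprs = metadata_field(gcn, ".sgpr_count")
vgprs = metadata_field(gcn, ".vgpr_count")
lds = metadata_field(gcn, ".group_segment_fixed_size")  # absent in this sample
```
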

source