Cleanup

CodeDiffs.Cleanup.cleanup_code - Function
cleanup_code(::Val{code_type}, code, dbinfo=true, cleanup_opts=(;))

Perform minor changes to code to improve readability and the quality of the differences.

dbinfo is a superset of debuginfo. It is compatible with all code types, but it may have no effect.

cleanup_opts are passed by the user, and their meaning depends on the code_type.
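The dispatch on Val{code_type} and the per-type options can be pictured with a small self-contained sketch. Everything here is hypothetical and only mimics the calling convention: my_cleanup, the :demo code type and the strip_comments option are not part of CodeDiffs.

```julia
# Hypothetical stand-in for the Val-based dispatch; this is not CodeDiffs' code.
# Fallback: code types without a dedicated method are returned unchanged.
my_cleanup(::Val, code, dbinfo=true, cleanup_opts=(;)) = code

# Method for a made-up :demo code type with a single option, `strip_comments`.
function my_cleanup(::Val{:demo}, code, dbinfo=true, cleanup_opts=(;))
    # Options given by the user override the defaults declared here.
    opts = merge((; strip_comments=true), cleanup_opts)
    lines = split(code, '\n')
    if opts.strip_comments
        # Drop the lines whose first non-blank character is ';'.
        lines = filter(l -> !startswith(lstrip(l), ";"), lines)
    end
    return join(lines, '\n')
end

demo_code = "; a comment\nadd r1, r2"
my_cleanup(Val(:demo), demo_code)                                  # default options
my_cleanup(Val(:demo), demo_code, true, (; strip_comments=false))  # user override
```

Merging a NamedTuple of defaults with the user's NamedTuple is one simple way to get the "accepted options and their default values" behaviour described for each code type below.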

source
cleanup_code(::Val{:typed}, code, dbinfo, cleanup_opts)

Clean up Julia typed IR code.

Accepted cleanup_opts and their default values:

  • expand_llvmcall=true: replace raw inline LLVM IR with multiline blocks with syntax highlighting.
source
cleanup_code(::Val{:ptx}, code, dbinfo, cleanup_opts)

Clean up PTX code.

Accepted cleanup_opts and their default values:

  • demangle=true: demangle names within the code
  • keep_loop_comments=false: keep loop comments generated by the LLVM backend
  • keep_block_comments=false: keep code block comments generated by the LLVM backend
  • keep_demoting_comments=false: keep variable demoting comments generated by the LLVM backend
  • indent_calls=true: re-indent call instructions to be more recognizable and readable
  • align_preds=8: align instruction guards (e.g. @p1) so that the instruction itself starts at the n-th column.
source
cleanup_code(::Val{:sass}, code, dbinfo, cleanup_opts)

Clean up SASS code.

If dbinfo is true, location comments are kept.

Accepted cleanup_opts and their default values:

  • demangle=true: demangle names within the code
source
cleanup_code(::Val{:gcn}, code, dbinfo, cleanup_opts)

Clean up GCN code.

Accepted cleanup_opts and their default values:

  • metadata=false: keep the metadata after the kernel's code
  • kernel_metadata=false: keep the YAML amdhsa.kernels section. It is removed if metadata=false.
  • demangle=true: demangle names within the code
  • keep_loop_comments=false: keep loop comments generated by the LLVM backend
  • keep_block_comments=false: keep code block comments generated by the LLVM backend
  • keep_misc_comments=false: keep other comments generated by the LLVM backend
  • align_operands=24: align the first operand of each instruction to the n-th column
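As an illustration of what an option such as align_operands might do to a single line, here is a minimal sketch. It assumes that columns are counted from 1 and that an instruction is "mnemonic, whitespace, operands"; align_first_operand is a hypothetical helper, not the function CodeDiffs uses.

```julia
# Hypothetical helper: pad the mnemonic so that the first operand starts at
# column `n` (1-based). Lines without operands are returned unchanged.
function align_first_operand(line::AbstractString, n::Integer=24)
    m = match(r"^(\s*\S+)\s+(\S.*)$", line)
    m === nothing && return String(line)
    # Pad to n-1 characters so the operand begins at column n,
    # always keeping at least one space after a long mnemonic.
    head = rpad(m[1], max(n - 1, length(m[1]) + 1))
    return head * m[2]
end

aligned = align_first_operand("  s_mov_b32 s0, s1", 16)
```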
source
cleanup_code(::Val{:spirv}, code, dbinfo, cleanup_opts)

Clean up SPIRV code.

Accepted cleanup_opts and their default values:

  • metadata=false: keep the meta operations around the main function's body
  • demangle=true: demangle names within the code
source

LLVM and mangling

CodeDiffs.Cleanup.replace_llvm_module_name - Function
replace_llvm_module_name(code::AbstractString)

Remove the trailing numbers from the LLVM module names in code, e.g. "julia_f_2007" => "f". This removes false differences when comparing raw code: each call to code_native (or code_llvm) triggers a new compilation with a unique LLVM module name, so two consecutive calls produce different output even though the actual code does not change.

In Julia 1.11+, global variable names are also replaced, using global_var_unique_gen_name_regex.

julia> f() = 1
f (generic function with 1 method)

julia> buf = IOBuffer();

julia> code_native(buf, f, Tuple{})  # Equivalent to `@code_native f()`

julia> code₁ = String(take!(buf));

julia> code_native(buf, f, Tuple{})

julia> code₂ = String(take!(buf));

julia> code₁ == code₂  # Different LLVM module names...
false

julia> replace_llvm_module_name(code₁) == replace_llvm_module_name(code₂)  # ...but same code
true
source
CodeDiffs.Cleanup.function_unique_gen_name_regex - Function
function_unique_gen_name_regex()
function_unique_gen_name_regex(function_name)

Regex matching all LLVM function names which might change from one compilation to another. As an example, in the outputs of @code_llvm below:

julia> f() = 1
f (generic function with 1 method)

julia> @code_llvm f()
...
define i64 @julia_f_855() #0 {
...

julia> @code_llvm f()
...
define i64 @julia_f_857() #0 {
...

the regex will match julia_f_855 and julia_f_857.

function_unique_gen_name_regex() should work for any function whose name contains neither spaces nor any of the characters in '",;- (single quote, double quote, comma, semicolon, hyphen). The function name is in capture group 1 or 2.

function_unique_gen_name_regex(function_name) should work with any generated name for the given function name.
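To make the idea concrete, here is a deliberately simplified regex for the two @code_llvm outputs above. It is a hypothetical stand-in, not the regex returned by function_unique_gen_name_regex: it only recognises names of the form julia_<name>_<counter> and uses a single capture group.

```julia
# Simplified, hypothetical pattern: `julia_`, then a name free of '",;- and
# whitespace (capture group 1), then `_` and the per-compilation counter.
simple_gen_name_regex() = r"""\bjulia_([^'",;\-\s]+)_\d+\b"""

llvm₁ = "define i64 @julia_f_855() #0 {"
llvm₂ = "define i64 @julia_f_857() #0 {"

m₁ = match(simple_gen_name_regex(), llvm₁)
m₂ = match(simple_gen_name_regex(), llvm₂)

m₁.match, m₂.match   # the two generated names, which differ by their counter
m₁[1] == m₂[1]       # the captured function name is the same in both
```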

The unique number in the generated names comes from the 'globalUniqueGeneratedNames' counter in 'julia/src/codegen.cpp'. The regex matches most usages of this counter.

source
CodeDiffs.Cleanup.global_var_unique_gen_name_regex - Function
global_var_unique_gen_name_regex()
global_var_unique_gen_name_regex(global_name)

Regex matching all global variable names which might change from one compilation to another.

Julia 1.11

These global variable names appear only from Julia 1.11 onward.

Julia 1.12

These global variables may now start with jl_ instead of +.

In LLVM IR, these variables appear as @"+Core.GenericMemory#14067.jit". In native code, they look like ".L+Core.GenericMemory#13985.jit", possibly with some .set and .size sections at the end of the code (in x86 ASM).

global_var_unique_gen_name_regex() should work for any variable whose name contains neither spaces nor any of the characters in '",;- (single quote, double quote, comma, semicolon, hyphen).

global_var_unique_gen_name_regex(global_name) should work with any generated name for the given variable name.
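The two spellings quoted above can be matched with a simplified pattern such as the one below. This is only a sketch and not the library's regex. In particular, it assumes that the Julia 1.12 form is the same name with jl_ in place of the leading +, which the text above suggests but does not spell out.

```julia
# Simplified, hypothetical pattern: a `+` or `jl_` prefix, the variable name
# (capture group 1), then `#<counter>` and the `.jit` suffix.
simple_global_regex() = r"""(?:\+|jl_)([^'",;\-\s#]+)#\d+\.jit"""

llvm_ir = "@\"+Core.GenericMemory#14067.jit\""   # as written in LLVM IR
native  = ".L+Core.GenericMemory#13985.jit"      # as written in native code

m_ir     = match(simple_global_regex(), llvm_ir)
m_native = match(simple_global_regex(), native)

m_ir[1] == m_native[1]   # same variable name once the counter is ignored
```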

The unique number in the generated names comes from the 'globalUniqueGeneratedNames' counter in 'julia/src/codegen.cpp'. The regex matches only a single usage of this counter: in julia_pgv(ctx, cname, addr) at 'src/cgutils.cpp#L358', whose result is then given a ".jit" suffix in 'src/aotcompile.cpp#L2064' when doing code introspection.

source
CodeDiffs.Cleanup.clean_function_name - Function
clean_function_name(name_regex, code, replacement=nothing)

Replace occurrences of name_regex in the code with replacement. replacement defaults to the demangled function name.
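The behaviour described here can be approximated with Base.replace, as in the sketch below. sketch_clean_name and the regex are hypothetical, and the default simply reuses capture group 1 as the plain name, whereas the real function is documented to demangle it.

```julia
# Hypothetical re-implementation for illustration only; the real function may differ.
function sketch_clean_name(name_regex::Regex, code::AbstractString, replacement=nothing)
    # Without an explicit replacement, fall back to capture group 1,
    # assumed here to hold the plain function name.
    sub = replacement === nothing ? s"\1" : replacement
    return replace(code, name_regex => sub)
end

re = r"julia_(\w+?)_\d+"   # simplified pattern, not the library's regex

sketch_clean_name(re, "call i64 @julia_f_855()")        # uses the captured name
sketch_clean_name(re, "call i64 @julia_f_855()", "g")   # uses an explicit replacement
```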

source
CodeDiffs.Cleanup.cleanup_inline_llvmcall_modules - Function
cleanup_inline_llvmcall_modules(c::Core.CodeInfo)
cleanup_inline_llvmcall_modules(c::Vector{Any})

Replace the LLVM IR body of Base.llvmcall expressions in c with something more readable: an unescaped string, which allows the IR to be displayed over multiple lines, with highlighting.

Only the LLVM function declaration is kept; other code (annotations, etc.) is stripped.

source