Cleanup
CodeDiffs.Cleanup.cleanup_code — Function
cleanup_code(::Val{code_type}, code, dbinfo=true, cleanup_opts=(;))
Perform minor changes to code to improve readability and the quality of the differences.
dbinfo is a superset of debuginfo. It is compatible with all code types, but it may have no effect.
cleanup_opts are passed by the user, and have specific meaning depending on the code_type.
cleanup_code(::Val{:ast}, expr::Expr, dbinfo, cleanup_opts)
cleanup_code(::Val{:ast}, expr::AbstractString, dbinfo, cleanup_opts)
Clean up the AST in expr. If expr isa Expr, it is first converted to a String with Base.show.
As the cleanup step is supposed to operate only on strings, MacroTools.prettify isn't applied here but by CodeDiffs.code_ast.
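As a rough illustration of the Expr-to-String step mentioned above, Base alone can render an expression as text. This sketch assumes sprint(show, ex) as the conversion, which may differ in detail from what CodeDiffs does internally; ex and str are names chosen for the example.

```julia
# Illustrative only: `ex` and `str` are example names, not part of CodeDiffs.
ex = :(x + 1)              # a quoted expression (an Expr)
str = sprint(show, ex)     # render it to a String through Base.show
println(str)
```

Once the expression is a String, the string-based cleanup options listed below can apply to it.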
Accepted cleanup_opts and their default values:
compact_if=true: transforms small if blocks into one-liner ternary statements
line_length=120: line length above which compact_if keeps the whole if statement, to prevent very long lines
fix_indents=true: removes unnecessary indents (e.g. @threads for ... is over-indented by default)
add_newlines=true: attempts to unclutter the code by adding newlines in-between blocks at the same indentation level
cleanup_code(::Val{:typed}, code, dbinfo, cleanup_opts)
Clean up Julia typed IR code.
Accepted cleanup_opts and their default values:
expand_llvmcall=true: replace raw inline LLVM IR with multiline blocks with syntax highlighting.
cleanup_code(::Val{:ptx}, code, dbinfo, cleanup_opts)
Clean up PTX code.
Accepted cleanup_opts and their default values:
demangle=true: demangle names within the code
keep_loop_comments=false: keep loop comments generated by the LLVM backend
keep_block_comments=false: keep code block comments generated by the LLVM backend
keep_demoting_comments=false: keep variable demoting comments generated by the LLVM backend
indent_calls=true: re-indent call instructions to be more recognizable and readable
align_preds=8: align instruction guards (e.g. @p1) such that the instruction is placed at the n-th column
cleanup_code(::Val{:sass}, code, dbinfo, cleanup_opts)
Clean up SASS code.
If dbinfo is true, location comments are kept.
Accepted cleanup_opts and their default values:
demangle=true: demangle names within the code
cleanup_code(::Val{:gcn}, code, dbinfo, cleanup_opts)
Clean up GCN code.
Accepted cleanup_opts and their default values:
metadata=false: keep the metadata after the kernel's code
kernel_metadata=false: keep the YAML amdhsa.kernels section. It is removed if metadata=false.
demangle=true: demangle names within the code
keep_loop_comments=false: keep loop comments generated by the LLVM backend
keep_block_comments=false: keep code block comments generated by the LLVM backend
keep_misc_comments=false: keep other comments generated by the LLVM backend
align_operands=24: align the first operand of each instruction to the n-th column
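To make the idea behind align_operands concrete, here is a minimal sketch of padding a mnemonic so that its first operand starts at a given column. The helper name align_first_operand, the 1-based column convention and the handling of comment lines are all assumptions made for illustration; the actual CodeDiffs implementation may behave differently.

```julia
# Hypothetical helper, not part of CodeDiffs: pads the mnemonic so that the
# first operand starts at the 1-based column `col`.
function align_first_operand(line::AbstractString, col::Integer=24)
    startswith(lstrip(line), ';') && return line   # leave comment lines untouched
    m = match(r"^(\s*)(\S+)\s+(\S.*)$", line)
    m === nothing && return line                   # no operand (e.g. a label): unchanged
    head = m[1] * m[2]                             # indentation + mnemonic
    # Keep at least one space when the mnemonic is already longer than `col`.
    width = max(col - 1, length(head) + 1)
    return rpad(head, width) * m[3]
end

aligned = align_first_operand("  v_add_f32 v0, v1, v2", 24)
println(aligned)
```

Aligning operands this way keeps a one-token change in a mnemonic from shifting the rest of the line, which tends to make line-based diffs easier to read.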
cleanup_code(::Val{:spirv}, code, dbinfo, cleanup_opts)
Clean up SPIRV code.
Accepted cleanup_opts and their default values:
metadata=false: keep the meta operations around the main function's body
demangle=true: demangle names within the code
LLVM and mangling
CodeDiffs.Cleanup.replace_llvm_module_name — Function
replace_llvm_module_name(code::AbstractString)
Remove the trailing numbers from the LLVM module names in code, e.g. "julia_f_2007" => "f". This removes false differences when comparing raw code: each call to code_native (or code_llvm) triggers a new compilation using a unique LLVM module name, so consecutive calls produce different output even though the actual code does not change.
In Julia 1.11+, global variable names are also replaced, using global_var_unique_gen_name_regex.
julia> f() = 1
f (generic function with 1 method)
julia> buf = IOBuffer();
julia> code_native(buf, f, Tuple{}) # Equivalent to `@code_native f()`
julia> code₁ = String(take!(buf));
julia> code_native(buf, f, Tuple{})
julia> code₂ = String(take!(buf));
julia> code₁ == code₂ # Different LLVM module names...
false
julia> replace_llvm_module_name(code₁) == replace_llvm_module_name(code₂) # ...but same code
true
CodeDiffs.Cleanup.function_unique_gen_name_regex — Function
function_unique_gen_name_regex()
function_unique_gen_name_regex(function_name)
Regex matching all LLVM function names which might change from one compilation to another. As an example, in the outputs of @code_llvm below:
julia> f() = 1
f (generic function with 1 method)
julia> @code_llvm f()
...
define i64 @julia_f_855() #0 {
...
julia> @code_llvm f()
...
define i64 @julia_f_857() #0 {
...
the regex will match julia_f_855 and julia_f_857.
function_unique_gen_name_regex() should work for any function which does not have any characters in '",;- or spaces in its name. The function name is in either capture group 1 or capture group 2.
function_unique_gen_name_regex(function_name) should work with any generated name for the given function name.
The unique number in generated names comes from 'globalUniqueGeneratedNames' in 'julia/src/codegen.cpp'. The regex matches most usages of this counter:
from get_function_name:
    julia_<function_name>_<unique_num>
    japi3_<function_name>_<unique_num>
    japi1_<function_name>_<unique_num>
j_<function_name>_<unique_num>
j1_<function_name>_<unique_num>
jlcapi_<function_name>_<unique_num>
jfptr_<function_name>_<unique_num>
tojlinvoke<unique_num>
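The naming patterns listed above can be approximated with a much simpler regex. The sketch below is a hypothetical stand-in, not the pattern built by function_unique_gen_name_regex: it assumes function names made only of word characters, ignores the tojlinvoke form, and uses a single capture group. The names simple_gen_name and strip_unique_suffix are invented for the example.

```julia
# Hypothetical, simplified pattern: NOT the regex built by function_unique_gen_name_regex.
# It covers the `<prefix>_<function_name>_<unique_num>` forms only.
simple_gen_name = r"\b(?:julia|japi3|japi1|jlcapi|jfptr|j1|j)_(\w+?)_\d+\b"

# Replace every generated name with the bare function name (capture group 1).
strip_unique_suffix(code::AbstractString) = replace(code, simple_gen_name => s"\1")

ir_a = "define i64 @julia_f_855() #0 {"
ir_b = "define i64 @julia_f_857() #0 {"

println(strip_unique_suffix(ir_a))
println(strip_unique_suffix(ir_a) == strip_unique_suffix(ir_b))
```

With the counter removed, the two definitions compare equal, which is the effect needed to avoid spurious differences between compilations.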
CodeDiffs.Cleanup.global_var_unique_gen_name_regex — Function
global_var_unique_gen_name_regex()
global_var_unique_gen_name_regex(global_name)
Regex matching all global variable names which might change from one compilation to another.
In LLVM IR, those variables are mentioned as such: @"+Core.GenericMemory#14067.jit". In native code, they look like this: ".L+Core.GenericMemory#13985.jit", with maybe some .set and .size sections at the end of the code (in x86 ASM).
global_var_unique_gen_name_regex() should work for any variable which does not have any characters in '",;- or spaces in its name.
global_var_unique_gen_name_regex(global_name) should work with any generated name for the given variable name.
The unique number in generated names comes from 'globalUniqueGeneratedNames' in 'julia/src/codegen.cpp'. The regex matches only a single usage of this counter: in julia_pgv(ctx, cname, addr) at 'src/cgutils.cpp#L358', whose result is then given a ".jit" suffix in 'src/aotcompile.cpp#L2064' when doing code introspection.
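As a sketch of why these names need normalizing, the snippet below replaces the counter in the "#<number>.jit" tail with a fixed placeholder. It is a hypothetical simplification, not the regex returned by global_var_unique_gen_name_regex; the helper name normalize_jit_globals and the "#N.jit" placeholder are assumptions made for the example.

```julia
# Hypothetical simplification: NOT the regex returned by global_var_unique_gen_name_regex.
# It only targets the `#<number>.jit` tail shown in the examples above.
normalize_jit_globals(code::AbstractString) = replace(code, r"#\d+\.jit" => "#N.jit")

llvm_name   = "@\"+Core.GenericMemory#14067.jit\""    # as it appears in LLVM IR
native_name = "\".L+Core.GenericMemory#13985.jit\""   # as it appears in native code

println(normalize_jit_globals(llvm_name))
println(normalize_jit_globals(native_name))
```

After normalization, the same global compiled twice yields the same text regardless of the counter value.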
CodeDiffs.Cleanup.demangle — Function
demangle(name::AbstractString)
Demangle name into a C/C++ name using the demumble utility.
CodeDiffs.Cleanup.demangle_all — Function
demangle_all(code::AbstractString)
Find and replace all mangled names in code with their demangled counterparts.
CodeDiffs.Cleanup.mangled_base_name — Function
mangled_base_name(name::AbstractString)
Attempt to return the base name in the mangled function name. If it fails, nothing is returned.
CodeDiffs.Cleanup.clean_function_name — Function
clean_function_name(name_regex, code, replacement=nothing)
Replace occurrences of name_regex in the code with replacement. replacement defaults to the demangled function name.
CodeDiffs.Cleanup.cleanup_inline_llvmcall_modules — Function
cleanup_inline_llvmcall_modules(c::Core.CodeInfo)
cleanup_inline_llvmcall_modules(c::Vector{Any})
Replace the LLVM-IR body of Base.llvmcall expressions in c with something more readable, using an unescaped string so that the IR can be displayed over multiple lines, with highlighting.
Only the LLVM function declaration is kept; other code (annotations, etc.) is stripped.
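To show why an unescaped string helps, the snippet below takes a made-up IR body stored on one line with literal "\n" sequences and turns it into multi-line text with Base.unescape_string. This only illustrates the idea; how CodeDiffs extracts and rewrites the llvmcall body is not covered here, and the IR text is invented for the example.

```julia
# Illustrative only: the IR text below is made up for the example.
escaped = raw"define i64 @entry() {\nentry:\n  ret i64 1\n}"   # one line, literal "\n" sequences
readable = unescape_string(escaped)                            # real newlines, spans several lines
println(readable)
```

Printed this way, each IR instruction sits on its own line, so a line-based diff can point at the instruction that changed.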