Framework analysis
The principle of anti-debugging is not covered in detail here; the focus is on the concrete implementation of the pass, which is designed to make the compiled program harder to debug. We first outline its overall logic to get a first impression and explain the details later. It implements anti-debugging in two main ways:
1. Link precompiled anti-debugging IR code
The pass tries to load a precompiled anti-debugging IR file from the path given by the adbextirpath option. If the file loads successfully, it is linked into the current module with Linker::linkModules. This precompiled IR may contain functions (ADBCallBack and InitADB) and structures used for anti-debugging, for example:
- Code that detects an attached debugger.
- Code that alters its own execution path so that a debugger cannot trace it normally.
- Code that detects unusual behavior that may occur in a debugging environment.
2. Platform-specific inline assembly injection
For AArch64 targets on Darwin, if the ADBCallBack and InitADB functions are not found, the pass falls back to injecting inline assembly directly. The variant to inject is picked at random with cryptoutils->get_range(2):
- The generated inline assembly issues a system call to trigger anti-debugging behavior, for example a ptrace call aimed at a debugger attached to the process.
- InlineAsm::get creates the inline assembly object, which is then inserted before the last instruction of the function, usually its return instruction.
Code analysis
0. config
Let’s first briefly interpret the configuration. Two static global command line options are defined at the beginning of the code:
1. PreCompiledIRPath
Command line options:
static cl::opt<std::string> PreCompiledIRPath(
"adbextirpath",
cl::desc("External Path Pointing To Pre-compiled AntiDebugging IR"),
cl::value_desc("filename"), cl::init(""));
- cl::opt<std::string> defines a command line option of type std::string.
- "adbextirpath" is the name of the option, i.e. the flag used to set it on the command line.
- cl::desc provides the description of the option, telling the user that it specifies the external path of the precompiled anti-debugging IR file.
- cl::value_desc describes the option's value, telling the user that it should be a file name.
- cl::init("") sets the default value to an empty string, meaning that no path is specified by default.
2. ProbRate
Command line options:
static cl::opt<uint32_t> ProbRate(
"adb_prob",
cl::desc("Choose the probability [%] For Each Function To Be "
"Obfuscated By AntiDebugging"),
cl::value_desc("Probability Rate"), cl::init(40), cl::Optional);
- cl::opt<uint32_t> defines a command line option of type uint32_t (unsigned 32-bit integer).
- "adb_prob" is the name of the option.
- cl::desc describes the option: a percentage that sets the probability of each function being obfuscated by anti-debugging.
- cl::value_desc describes the kind of value the option expects, here a "probability rate".
- cl::init(40) sets the default value to 40: if the user does not pass the option, the probability is 40%.
- cl::Optional marks the option as optional, so the user may omit it.
In short, users can specify the path of the precompiled anti-debugging IR file with -adbextirpath and the probability of each function being obfuscated with -adb_prob.
1. initialize
Next, we analyze the initialize function in detail and sort out its overall logic.
1. Check the precompiled IR path:
The function first checks whether PreCompiledIRPath is empty. If so, it tries to build a default path: it assumes there is a folder named "Hikari" in the user's home directory (obtained with sys::path::home_directory) and builds the file name from the target architecture and operating system of the current module.
if (PreCompiledIRPath == "") {
SmallString<32> Path;
if (sys::path::home_directory(Path)) {
sys::path::append(Path, "Hikari");
Triple tri(M.getTargetTriple());
sys::path::append(Path, "PrecompiledAntiDebugging-" +
Triple::getArchTypeName(tri.getArch()) +
"-" + Triple::getOSTypeName(tri.getOS()) +
".bc");
PreCompiledIRPath = Path.c_str();
}
}
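The default file name produced by this step can be illustrated without LLVM. The following is a minimal sketch, assuming plain strings in place of SmallString and Triple; the function name defaultAdbIrPath and its parameters are hypothetical and not part of Hikari.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the pass's path logic: `home`, `arch` and `os`
// are plain strings here instead of LLVM's SmallString and Triple.
std::string defaultAdbIrPath(const std::string &home, const std::string &arch,
                             const std::string &os) {
  assert(!home.empty() && "home directory must be known");
  // Layout: <home>/Hikari/PrecompiledAntiDebugging-<arch>-<os>.bc
  return home + "/Hikari/PrecompiledAntiDebugging-" + arch + "-" + os + ".bc";
}
```

For a home directory of /Users/alice and an aarch64 iOS target, this yields a path ending in PrecompiledAntiDebugging-aarch64-ios.bc, which is the naming scheme of the precompiled files discussed later.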
2. Link the precompiled IR:
This step first uses an ifstream object f to check whether the file exists. If it does, the pass tries to link the precompiled IR file. If the file does not exist or is unreadable, an error message is printed.
std::ifstream f(PreCompiledIRPath);
if (f.good()) {
errs() << "Linking PreCompiled AntiDebugging IR From:" << PreCompiledIRPath << "\n";
SMDiagnostic SMD;
std::unique_ptr<Module> ADBM(
parseIRFile(StringRef(PreCompiledIRPath), SMD, M.getContext()));
Linker::linkModules(M, std::move(ADBM), Linker::Flags::LinkOnlyNeeded);
// ...
} else {
errs() << "Failed To Link PreCompiled AntiDebugging IR From:" << PreCompiledIRPath << "\n";
}
3. Modify the attributes of the ADBCallBack and InitADB functions:
If the ADBCallBack function is found, the code asserts that it is not a mere declaration (i.e. it has a body), then adjusts how it behaves during optimization and linking: visibility becomes hidden, linkage becomes private, the noinline and optnone attributes are removed, and alwaysinline is added.
// ... ()
Function *ADBCallBack = M.getFunction("ADBCallBack");
if (ADBCallBack) {
assert(!ADBCallBack->isDeclaration() && "AntiDebuggingCallback is not concrete!");
ADBCallBack->setVisibility(GlobalValue::VisibilityTypes::HiddenVisibility);
ADBCallBack->setLinkage(GlobalValue::LinkageTypes::PrivateLinkage);
ADBCallBack->removeFnAttr(Attribute::AttrKind::NoInline);
ADBCallBack->removeFnAttr(Attribute::AttrKind::OptimizeNone);
ADBCallBack->addFnAttr(Attribute::AttrKind::AlwaysInline);
}
// ...
4. Set the initialization flag and the target triple:
After the precompiled IR has been linked, the initialized flag is set to true and the module's target triple is stored in triple.
this->initialized = true;
this->triple = Triple(M.getTargetTriple());
Finally, the initialize method returns true once its work is done. When this LLVM pass is part of the compilation, it therefore initializes each module and links the precompiled IR into it, which injects the anti-debugging code. If initialization fails, an error is printed and the pass may not proceed further.
2. runOnModule & runOnFunction
2.1 runOnModule
The runOnModule function is relatively simple and its logic is clear. A configured probability decides whether anti-debugging obfuscation is applied to each function in the module. It first checks that the probability supplied by the user is within range (0 to 100), then iterates over all functions of the module. For every function other than ADBCallBack and InitADB, it uses toObfuscate and a random draw against the probability to decide whether to obfuscate. If so, the obfuscation is applied, and the required data structures are initialized on first use.
bool runOnModule(Module &M) override {
if (ProbRate > 100) {
errs() << "AntiDebugging application function percentage "
"-adb_prob=x must be 0 < x <= 100";
return false;
}
for (Function &F : M) {
if (toObfuscate(flag, &F, "adb") && F.getName() != "ADBCallBack" &&
F.getName() != "InitADB") {
errs() << "Running AntiDebugging On " << F.getName() << "\n";
if (!this->initialized)
initialize(M);
if (cryptoutils->get_range(100) <= ProbRate)
runOnFunction(F);
}
}
return true;
}
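The probability gate can be tried in isolation. The sketch below assumes that get_range(n) returns a value in [0, n) and models it with std::mt19937 instead of Hikari's cryptoutils; getRange, selected and countSelected are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Assumption: get_range(n) yields a value in [0, n). It is modelled here
// with std::mt19937 rather than Hikari's cryptoutils.
uint32_t getRange(std::mt19937 &rng, uint32_t n) {
  assert(n > 0);
  return std::uniform_int_distribution<uint32_t>(0, n - 1)(rng);
}

// Mirrors the `get_range(100) <= ProbRate` comparison from runOnModule.
bool selected(std::mt19937 &rng, uint32_t probRate) {
  return getRange(rng, 100) <= probRate;
}

// Counts how many of `trials` draws pass the gate for a given rate.
uint32_t countSelected(uint32_t probRate, uint32_t trials, uint32_t seed) {
  std::mt19937 rng(seed);
  uint32_t hits = 0;
  for (uint32_t i = 0; i < trials; ++i) {
    if (selected(rng, probRate))
      ++hits;
  }
  return hits;
}
```

Under that assumption, the inclusive comparison means a rate of 99 already selects every function, and the default of 40 corresponds to 41 of the 100 possible draws rather than exactly 40.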
2.2 runOnFunction
This function is the core of the pass, so we analyze it in detail.
1. Get the entry basic block of function F
EntryBlock is the first basic block of the function, typically used to insert initialization code or other preliminary logic.
BasicBlock *EntryBlock = &(F.getEntryBlock());
2. Try to get references to the ADBCallBack and InitADB functions
The code looks up functions named ADBCallBack and InitADB in the module that contains the current function (F.getParent()).
Function *ADBCallBack = F.getParent()->getFunction("ADBCallBack");
Function *ADBInit = F.getParent()->getFunction("InitADB");
3. Handle the ADBCallBack and InitADB functions
If both functions are found, a call to InitADB is created at the first insertion point of the entry basic block.
If ADBCallBack or InitADB is not found, an error message is printed. In that case, if the return type of F is not void, the function returns false, because inline assembly inserted at the terminator would contaminate the registers that carry the return value.
if (ADBCallBack && ADBInit) {
CallInst::Create(ADBInit, "",
cast<Instruction>(EntryBlock->getFirstInsertionPt()));
} else {
errs() << "The ADBCallBack and ADBInit functions were not found\n";
if (!F.getReturnType()
->isVoidTy()) // We insert InlineAsm in the Terminator, which
// causes register contamination if the return type
// is not Void.
return false;
4. Check the target operating system and architecture and build an inline assembly code string
If the target system is Darwin (such as macOS or iOS) and the architecture is AArch64 (ARM64), execution continues and initializes an empty string for subsequent construction of inline assembly code.
if (triple.isOSDarwin() && triple.isAArch64()) {
errs() << "Injecting Inline Assembly AntiDebugging For:"
<< F.getParent()->getTargetTriple() << "\n";
std::string antidebugasm = "";
5. Use a random number to decide which instruction sequence fills antidebugasm
cryptoutils->get_range(2) selects one of the code paths at random.
switch (cryptoutils->get_range(2)) {
6. Randomly select instruction fragments and append them to antidebugasm
A loop with random selection keeps running until all five flags in c are set, which ensures that every group of instructions is used at least once. The selected fragments are then appended to the antidebugasm string.
case 0: {
std::string s[] = {"mov x0, #31\n", "mov w0, #31\n", "mov x1, #0\n",
"mov w1, #0\n", "mov x2, #0\n", "mov w2, #0\n",
"mov x3, #0\n", "mov w3, #0\n", "mov x16, #26\n",
"mov w16, #26\n"}; // svc ptrace
bool c[5] = {false, false, false, false, false};
while (c[0] != true || c[1] != true || c[2] != true || c[3] != true ||
c[4] != true) {
// ...
}
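The loop body is elided in the excerpt above. The sketch below is an assumed reconstruction rather than Hikari's actual code: it draws one of the five register groups at random and, if that group has not been emitted yet, appends either its x or its w form. std::mt19937 stands in for cryptoutils->get_range, buildPtraceSetup and countLines are hypothetical names, and the trap instruction is left out because it is not shown in the excerpt.

```cpp
#include <cassert>
#include <random>
#include <string>

// Assumed reconstruction of the elided loop: emit each register group once,
// in random order, picking the 64-bit (x) or 32-bit (w) spelling at random.
std::string buildPtraceSetup(unsigned seed) {
  const std::string s[] = {"mov x0, #31\n", "mov w0, #31\n", "mov x1, #0\n",
                           "mov w1, #0\n",  "mov x2, #0\n",  "mov w2, #0\n",
                           "mov x3, #0\n",  "mov w3, #0\n",  "mov x16, #26\n",
                           "mov w16, #26\n"};
  std::mt19937 rng(seed);
  std::uniform_int_distribution<int> group(0, 4), variant(0, 1);
  bool c[5] = {false, false, false, false, false};
  int emitted = 0;
  std::string out;
  while (emitted < 5) {
    int g = group(rng);
    if (c[g])
      continue; // this group is already in the string, draw again
    out += s[2 * g + variant(rng)];
    c[g] = true;
    ++emitted;
  }
  return out;
}

// Counts newline-terminated instructions in the generated string.
int countLines(const std::string &text) {
  int n = 0;
  for (char ch : text) {
    if (ch == '\n')
      ++n;
  }
  return n;
}
```

Both spellings of a group load the same value, since a write to a w register on AArch64 zero-extends into the corresponding x register. The random order and spelling only change how the sequence looks, not which arguments the system call receives.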
7. Create the InlineAsm object IA
InlineAsm::get creates an inline assembly object that contains the assembly code held in the antidebugasm string.
InlineAsm *IA = InlineAsm::get(FunctionType::get(Type::getVoidTy(EntryBlock->getContext()), false), antidebugasm, "", true, false);
8. Insert the inline assembly before the terminator of the last basic block
The loop walks over all basic blocks of the function, but its body only records each block's terminator, so after the loop I points at the terminator of the last basic block. The inline assembly call is inserted before that instruction, and the #if handles the API difference between LLVM versions.
Instruction *I = nullptr;
// The loop body is only the assignment below, so I ends up as the
// terminator of the last basic block.
for (BasicBlock &BB : F)
I = BB.getTerminator();
#if LLVM_VERSION_MAJOR >= 16
CallInst::Create(IA, std::nullopt, "", I);
#else
CallInst::Create(IA, None, "", I);
#endif
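Because the for loop has no braces, only the assignment belongs to its body. The toy model below illustrates the effect with hypothetical stand-in types (Block and lastTerminator are not LLVM names): after the loop, the pointer refers to the terminator of the final block only.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for llvm::BasicBlock and its terminator (hypothetical type).
struct Block {
  std::string terminator;
};

// Same shape as the loop in the pass: the body is a single assignment, so
// after the loop `last` refers to the terminator of the final block only.
const std::string *lastTerminator(const std::vector<Block> &fn) {
  const std::string *last = nullptr;
  for (const Block &bb : fn)
    last = &bb.terminator;
  return last;
}
```

In this model a function with three blocks ending in a branch, a branch and a return yields the return. In real IR the last block in layout order is usually, but not necessarily, the one that returns.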
9. Report unsupported targets
If the target operating system or architecture is not the expected one, an error message is printed.
} else {
errs() << "Unsupported Inline Assembly AntiDebugging Target: " << F.getParent()->getTargetTriple() << "\n";
}
To summarize the code above: the pass first looks up the ADBCallBack and InitADB functions and inserts a call to InitADB if both exist. Otherwise, for the ARM64 architecture on Darwin, it inserts inline assembly that invokes ptrace through svc. Along the way, randomization is used to vary the emitted instructions.
3. Precompiled anti-debugging IR file
From the analysis above, the PreCompiledIRPath option points to an IR file that contains the ADBCallBack and InitADB functions, and the actual anti-debugging logic lives in that file. So next we analyze this file. The original author of Hikari provides the IR files at: github.com/HikariObfus… . The file structure is as follows:
PrecompiledAntiDebugging-aarch64-ios.bc
PrecompiledAntiDebugging-thumb-ios.bc
PrecompiledAntiDebugging-x86_64-macosx.bc
SymbolConfig.json
We only analyze PrecompiledAntiDebugging-aarch64-ios.bc. A .bc file is an LLVM bitcode file, the binary encoding of LLVM's intermediate representation. To view its contents, it must be converted to textual LLVM IR with llvm-dis from the LLVM toolchain. The converted file usually has the .ll extension and is human-readable.
llvm-dis <input.bc> -o <output.ll>
Readers can convert it themselves. Due to the large amount of code, the corresponding code is not provided here. We will analyze the IR file next.
1. Structure definition:
The file begins by defining several structure types: %struct.kinfo_proc, %struct.extern_proc, %union.anon, %struct.itimerval, %struct.timeval, %struct.eproc, %struct._pcred, %struct._ucred, %struct.vmspace and %struct.ios_execp_info.
2. Global declaration:
- @.str is a global constant holding the string "ptrace": @.str = private unnamed_addr constant [7 x i8] c"ptrace\00", align 1
- @mach_task_self_ is an external global variable declaration: @mach_task_self_ = external global i32, align 4
3. Function ADBCallBack:
The ADBCallBack function is simple: it calls abort() to terminate the program, followed by an unreachable instruction.
define void @ADBCallBack() #0 {
call void @abort() #4
unreachable
}
4. Function InitADB:
This function contains multiple system calls and checks. The main logic is as follows:
- Use sysctl to query process information: %18 = call i32 @sysctl(ptr %16, i32 4, ptr %17, ptr %3, ptr null, i64 0)
- Check a process status flag with the bitwise and instruction and the icmp comparison: %22 = and i32 %21, 2048 and %23 = icmp ne i32 %22, 0
- If a debugged state is detected, call the ADBCallBack function: call void @ADBCallBack()
- Load libraries dynamically with dlopen and dlsym, possibly to detect whether a debugger is interfering with the dynamic linking process: %26 = call ptr @dlopen(ptr null, i32 10)
- Use syscall to perform a lower-level check: %34 = call i32 (i32, ...) @syscall(i32 26, i32 31, i32 0, i32 0)
- Allocate memory dynamically and call task_get_exception_ports to inspect the exception ports, which may reveal whether a debugger is attached: %52 = call i32 @task_get_exception_ports(i32 %37, i32 7166, ptr %40, ptr %42, ptr %45, ptr %48, ptr %51)
- Call isatty and ioctl to look for unusual behavior. These are normally used to check whether the program is running on a terminal and to query the terminal's state: %81 = call i32 @isatty(i32 1) and %85 = call i32 (i32, i64, ...) @ioctl(i32 1, i64 1074295912)
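Several of the numeric literals in these calls match well-known Darwin constants. The sketch below reproduces those values as plain constants so that it compiles on any platform; they are taken from my reading of Darwin's public headers and should be checked against the SDK in use. The names kPTraced, kSysPtrace, kPtDenyAttach, ior and isTraced are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Values as found in Darwin's public headers, copied here as plain constants
// (an assumption to verify against your SDK).
constexpr uint32_t kPTraced = 0x00000800; // P_TRACED in <sys/proc.h>
constexpr int kSysPtrace = 26;            // SYS_ptrace in <sys/syscall.h>
constexpr int kPtDenyAttach = 31;         // PT_DENY_ATTACH in <sys/ptrace.h>

// BSD-style _IOR(group, num, type) encoding for a parameter of `size` bytes.
constexpr uint64_t ior(char group, unsigned num, unsigned size) {
  return 0x40000000ULL | (static_cast<uint64_t>(size & 0x1fff) << 16) |
         (static_cast<uint64_t>(static_cast<unsigned char>(group)) << 8) | num;
}

// TIOCGWINSZ is _IOR('t', 104, struct winsize); winsize has four 16-bit fields.
constexpr uint64_t kTiocgwinsz = ior('t', 104, 8);

// The `and i32 %21, 2048` / `icmp ne i32 %22, 0` pair, written in C++.
constexpr bool isTraced(uint32_t pFlag) { return (pFlag & kPTraced) != 0; }

static_assert(kPTraced == 2048, "mask used by the and instruction");
static_assert(kTiocgwinsz == 1074295912ULL, "request passed to ioctl");
```

Read this way, the and/icmp pair tests the traced flag of the process, syscall(26, 31, 0, 0) is a direct ptrace call with the deny-attach request (the same numbers the inline assembly loads into x16 and x0), and the ioctl asks for the terminal window size on file descriptor 1.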
5. System calls and declarations:
The function declaration section contains multiple system calls, such as:
declare void @abort() #1
declare i32 @getpid() #2
declare ptr @malloc(i64) #3
declare i32 @task_get_exception_ports(i32, i32, ptr, ptr, ptr, ptr, ptr) #2
declare i32 @isatty(i32) #2
declare i32 @ioctl(i32, i64, ...) #2
6. Attributes:
Function attributes are defined at the end of the code using the attributes keyword:
attributes #0 = { noinline nounwind optnone ssp uwtable ... }
attributes #1 = { noreturn "correctly-rounded-divide-sqrt-fp-math"="false" ...}
attributes #2 = ...
7. Module flags and identification:
The module’s compiler flags and identifying information are given at the end of the code:
!llvm.module.flags = !{!0, !1}
!llvm.ident = !{!2}
The IR code above is designed to detect and prevent debugging. Once it detects conditions that are consistent with a debugger or inconsistent with a normally running program, it terminates the program by calling ADBCallBack. An overall analysis of the code:
- Structure definitions: the code starts with the definition of several structures that may be used for interaction with the iOS operating system and for organizing data in memory.
- Global declarations: @.str is a private unnamed_addr constant that stores the string "ptrace". @mach_task_self_ is an external global variable, which may represent the identity of the current task.
- Function ADBCallBack: this function is very simple. It calls abort() to terminate the program, followed by an unreachable instruction. It is the reaction part of the anti-debugging logic.
- Function InitADB: this function is the core of the anti-debugging logic. It makes a series of system calls and checks:
  - Use sysctl to query process information.
  - Check a process status flag with the bitwise and instruction and the icmp comparison.
  - If a debugged state is detected, call ADBCallBack.
  - Load libraries dynamically, possibly to detect whether a debugger is interfering with the dynamic linking process.
  - Use syscall to make system calls, possibly to perform lower-level checks.
  - Allocate memory dynamically and call task_get_exception_ports to inspect the exception ports, which may reveal whether a debugger is attached.
  - Loop through some checks, calling ADBCallBack whenever an anomaly is found.
  - Finally, call isatty and ioctl to look for unusual behavior. These are normally used to check whether the program is running on a terminal and to query the terminal's state.
- System calls and declarations: the code declares a series of system functions, such as getpid, sysctl, dlopen, dlsym, dlclose, syscall, malloc, task_get_exception_ports, isatty and ioctl. They perform various system-level operations, many of them related to preventing debugging.
- Attributes: these define how the compiler may optimize each function, such as no inlining (noinline) and no exception unwinding (nounwind).
- Module flags and identification: these declare compiler-related metadata, such as wchar_size and the PIC (position-independent code) level.
Summary
In this article, detailed code analysis and a reading of the IR file showed how AntiDebug is implemented as an LLVM pass. To finish, we summarize how implementing AntiDebug as a pass differs from implementing it in source code.
Implementing AntiDebug directly in a project usually means adding debugger-detection logic at the source code level, whereas an LLVM pass inserts that logic during compilation. The two approaches can be compared on the following aspects:
- Concealment:
  - Source code implementation: anti-debugging is implemented in the source code. The logic is visible to experienced developers or attackers and may be discovered and bypassed by reading the source code.
  - LLVM Pass implementation: the anti-debugging logic is inserted by the pass and exists only in the compiled binary, which makes detection and reverse engineering more difficult and increases the concealment of the anti-debugging measures.
- Portability:
  - Source code implementation: source-level anti-debugging needs to be adapted and modified for different platforms and compilers.
  - LLVM Pass implementation: LLVM is a cross-platform compiler that supports multiple target architectures, so a pass can help keep the anti-debugging logic consistent and portable across platforms.
- Flexibility and reusability:
  - Source code implementation: anti-debugging code has to be added by hand. For large projects, this may mean repeating similar code in many places.
  - LLVM Pass implementation: as part of the compilation process, anti-debugging code can be inserted automatically into multiple parts of the target program, making it easier to reuse across projects.
- Maintainability:
  - Source code implementation: as a project grows, maintaining and updating anti-debugging logic embedded in the source code can become complex.
  - LLVM Pass implementation: anti-debugging logic is separated from application logic, making maintenance easier. If a new anti-debugging technique emerges, only the LLVM pass needs to be updated.
- Performance:
  - Source code implementation: program performance may be affected by the additional checks.
  - LLVM Pass implementation: the pass can choose when and where to insert anti-debugging code during compilation, which may leave more room for performance optimization.
- Degree of obfuscation:
  - Source code implementation: usually straightforward and easy to reverse.
  - LLVM Pass implementation: it can be combined with the compiler's optimization and obfuscation strategies to generate more complex binary code that is difficult to analyze.
All in all, implementing AntiDebug as an LLVM pass can provide better concealment, portability, flexibility and maintainability, and may also bring advantages in performance and obfuscation strength. However, this approach requires a deep understanding of the LLVM framework and may require a more complex build and debugging process.