Framework analysis

The general principles of anti-debugging are not covered in detail here; we focus on how this pass implements them. The pass is designed to make the compiled program harder to debug. Let’s first outline its overall logic to get a first impression, and go through the details afterwards. It implements anti-debugging in two main ways:

1. Link precompiled anti-debugging IR code

The code attempts to load a precompiled anti-debugging IR file from the path given by the adbextirpath option. If the file loads successfully, it is linked into the current module with Linker::linkModules. This precompiled IR may contain a set of functions (ADBCallBack and InitADB) and structures used for anti-debugging, for example:

  • Code that detects the presence of a debugger.
  • Code that modifies the program’s own execution path so that a debugger cannot trace it normally.
  • Code that detects unusual behavior that may occur in a debugging environment.

2. Platform-specific inline assembly injection

For Darwin targets on the AArch64 architecture, if the ADBCallBack and InitADB functions are not found, the pass tries to inject inline assembly directly. The variant is chosen at random: cryptoutils->get_range(2) selects which inline assembly sequence to inject:

  • The generated inline assembly uses a system call to trigger anti-debugging behavior, for example by invoking ptrace.
  • InlineAsm::get creates the inline assembly object, which is then inserted before the terminator of the function’s last basic block, usually the return instruction.

Code analysis

0. config

Let’s first briefly go through the configuration. Two static global command line options are defined at the beginning of the code:

1. The PreCompiledIRPath command line option:
 static cl::opt<std::string> PreCompiledIRPath(
    "adbextirpath",
    cl::desc("External Path Pointing To Pre-compiled AntiDebugging IR"),
    cl::value_desc("filename"), cl::init(""));
  • cl::opt<std::string> defines a command line option of type std::string.
  • "adbextirpath" is the name of the option, i.e. the flag used to set it on the command line.
  • cl::desc provides the option’s description, telling the user that it specifies the external path of the precompiled anti-debugging IR file.
  • cl::value_desc describes the option’s value, telling the user that it should be a file name.
  • cl::init("") sets the default value to an empty string, meaning that no path is specified by default.
2. The ProbRate command line option:
 static cl::opt<uint32_t> ProbRate(
    "adb_prob",
    cl::desc("Choose the probability [%] For Each Function To Be "
             "Obfuscated By AntiDebugging"),
    cl::value_desc("Probability Rate"), cl::init(40), cl::Optional);
  • cl::opt<uint32_t> defines a command line option of type uint32_t (unsigned 32-bit integer).
  • "adb_prob" is the name of this command line option.
  • cl::desc describes the option: a percentage that sets the probability of each function being obfuscated by anti-debugging.
  • cl::value_desc describes the kind of value expected, here a “probability rate”.
  • cl::init(40) sets the default value to 40: if the user does not pass the option, each function is selected with a probability of 40%.
  • cl::Optional marks the option as optional, so the user may omit it.

In short, users can pass -adbextirpath on the command line to specify the path of the precompiled anti-debugging IR file, and -adb_prob to set the probability of each function being obfuscated.

1. initialize

Next, we analyze the initialize function in detail and sort out its overall logic.

1. Check the precompiled IR path:

First, the code checks whether PreCompiledIRPath is empty. If so, it tries to build a default path: it assumes that a folder named “Hikari” exists in the user’s home directory, and builds the file name from the current module’s target architecture and operating system type.

if (PreCompiledIRPath == "") {
 SmallString<32> Path;
 if (sys::path::home_directory(Path)) {
   sys::path::append(Path, "Hikari");
   Triple tri(M.getTargetTriple());
   sys::path::append(Path, "PrecompiledAntiDebugging-" +
                             Triple::getArchTypeName(tri.getArch()) +
                             "-" + Triple::getOSTypeName(tri.getOS()) +
                             ".bc");
   PreCompiledIRPath = Path.c_str();
}
}
2. Link the precompiled IR:

This part first uses a std::ifstream object f to check whether the file exists. If it does, the pass tries to link the precompiled IR file. If the file does not exist or cannot be read, an error message is printed.

std::ifstream f(PreCompiledIRPath);
if (f.good()) {
 errs() << "Linking PreCompiled AntiDebugging IR From:" << PreCompiledIRPath << "\n";
 SMDiagnostic SMD;
 std::unique_ptr<Module> ADBM(
     parseIRFile(StringRef(PreCompiledIRPath), SMD, M.getContext()));
 Linker::linkModules(M, std::move(ADBM), Linker::Flags::LinkOnlyNeeded);
 // ...
} else {
 errs() << "Failed To Link PreCompiled AntiDebugging IR From:" << PreCompiledIRPath << "\n";
}
3. Modify the properties of the ADBCallBack and InitADB functions:

If ADBCallBack is found, the code asserts that it is not a mere declaration (i.e. it has a body), then changes its visibility, linkage, and function attributes to control how it is treated during optimization and linking: it becomes hidden and private, noinline and optnone are removed, and alwaysinline is added so that the callback is inlined into its callers.

// ... ()
Function *ADBCallBack = M.getFunction("ADBCallBack");
if (ADBCallBack) {
 assert(!ADBCallBack->isDeclaration() && "AntiDebuggingCallback is not concrete!");
 ADBCallBack->setVisibility(GlobalValue::VisibilityTypes::HiddenVisibility);
 ADBCallBack->setLinkage(GlobalValue::LinkageTypes::PrivateLinkage);
 ADBCallBack->removeFnAttr(Attribute::AttrKind::NoInline);
 ADBCallBack->removeFnAttr(Attribute::AttrKind::OptimizeNone);
 ADBCallBack->addFnAttr(Attribute::AttrKind::AlwaysInline);
}
// ... 
4. Set the initialization flag and target triple information:

After the precompiled IR has been linked, the initialized flag is set to true and the module’s target triple is stored.

this->initialized = true;
this->triple = Triple(M.getTargetTriple());

Finally, initialize returns true. When this pass is part of the compilation, each module therefore goes through a step that locates and links the precompiled IR, which is how the anti-debugging code gets injected. If the IR file cannot be read, an error message is printed; runOnModule does not check the return value, so processing continues.
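The default file name built in step 1 can be reproduced with plain strings. The sketch below is only an illustration under assumptions: defaultIRFileName and defaultIRPath are hypothetical helpers (the pass itself uses sys::path::append), and the architecture and OS names are passed in as strings instead of being read from a Triple.

```cpp
#include <cassert>
#include <string>

// Builds the default bitcode file name from the architecture and OS names.
// Hypothetical helper: in the pass these names come from
// Triple::getArchTypeName and Triple::getOSTypeName.
std::string defaultIRFileName(const std::string &arch, const std::string &os) {
  assert(!arch.empty() && !os.empty() && "both triple components are required");
  return "PrecompiledAntiDebugging-" + arch + "-" + os + ".bc";
}

// Joins the home directory, the "Hikari" folder, and the file name.
// Hypothetical helper using '/' as the separator.
std::string defaultIRPath(const std::string &home, const std::string &arch,
                          const std::string &os) {
  std::string path = home;
  if (!path.empty() && path.back() != '/')
    path += '/';
  return path + "Hikari/" + defaultIRFileName(arch, os);
}
```

For an aarch64 iOS module this naming scheme gives PrecompiledAntiDebugging-aarch64-ios.bc, which matches one of the precompiled files analyzed later in this article.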

2. runOnModule & runOnFunction

2.1 runOnModule

runOnModule is relatively simple and its logic is clear. It uses the configured probability to decide whether to apply anti-debugging obfuscation to each function in the module. It first checks that the probability supplied by the user is within range (0 to 100) and then iterates over all functions in the module. For every function other than ADBCallBack and InitADB, toObfuscate and the probability check decide whether obfuscation is applied. If so, runOnFunction performs it, and the pass is initialized on first use.

bool runOnModule(Module &M) override {
   if (ProbRate > 100) {
     errs() << "AntiDebugging application function percentage "
               "-adb_prob=x must be 0 < x <= 100";
     return false;
  }
   for (Function &F : M) {
     if (toObfuscate(flag, &F, "adb") && F.getName() != "ADBCallBack" &&
         F.getName() != "InitADB") {
       errs() << "Running AntiDebugging On " << F.getName() << "\n";
       if (!this->initialized)
         initialize(M);
       if (cryptoutils->get_range(100) <= ProbRate)
         runOnFunction(F);
    }
  }
   return true;
}
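The probability check uses an inclusive comparison (<=). The sketch below counts how many rolls pass the gate for a given rate. It rests on an assumption that the excerpt does not confirm: that get_range(100) returns each of the values 0..99 with equal probability. passesGate and passingRolls are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Mirrors the comparison in runOnModule: a roll passes when roll <= probRate.
bool passesGate(std::uint32_t roll, std::uint32_t probRate) {
  return roll <= probRate;
}

// Counts how many of the rolls 0..99 pass the gate for a given rate.
// ASSUMPTION: get_range(100) yields each of 0..99 with equal probability.
std::uint32_t passingRolls(std::uint32_t probRate) {
  assert(probRate <= 100 && "the pass rejects rates above 100");
  std::uint32_t count = 0;
  for (std::uint32_t roll = 0; roll < 100; ++roll)
    if (passesGate(roll, probRate))
      ++count;
  return count;
}
```

Under that assumption, the default rate of 40 lets 41 of 100 rolls through, and a rate of 0 still lets the roll 0 through, so the effective probability would be slightly higher than the configured percentage.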

2.2 runOnFunction

This function is the core of the entire pass, so we analyze it in detail.

1. Get the entry basic block EntryBlock of the function F

This obtains the function’s first basic block, which is where initialization code or other preamble logic is usually inserted.

BasicBlock *EntryBlock = &(F.getEntryBlock());
2. Try to get references to the ADBCallBack and InitADB functions

Look up the functions named ADBCallBack and InitADB in the module that contains the current function (F.getParent()).

Function *ADBCallBack = F.getParent()->getFunction("ADBCallBack");
Function *ADBInit = F.getParent()->getFunction("InitADB");
3. Handle the ADBCallBack and InitADB functions

If both functions are found, a call to InitADB is created in the entry basic block.

If ADBCallBack or InitADB is not found, an error message is printed, and if the return type of F is not void, the function returns false.

if (ADBCallBack && ADBInit) {
   CallInst::Create(ADBInit, "",
                    cast<Instruction>(EntryBlock->getFirstInsertionPt()));
} else {
   errs() << "The ADBCallBack and ADBInit functions were not found\n";
   if (!F.getReturnType()
            ->isVoidTy()) // We insert InlineAsm in the Terminator, which
                          // causes register contamination if the return type
                          // is not Void.
     return false;
4. Check the target operating system and architecture and build an inline assembly code string

If the target system is Darwin (such as macOS or iOS) and the architecture is AArch64 (ARM64), execution continues and an empty string is initialized, to be filled with the inline assembly code.

if (triple.isOSDarwin() && triple.isAArch64()) {
       errs() << "Injecting Inline Assembly AntiDebugging For:"
              << F.getParent()->getTargetTriple() << "\n";
       std::string antidebugasm = "";
5. Use a random number to decide which set of instructions fills antidebugasm

A random value from get_range(2) selects between the different code paths.

switch (cryptoutils->get_range(2)) {
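6. Randomly select instruction fragments and append them to antidebugasm

The ten strings form five pairs, each an x-register and a w-register variant of the same move, and the five flags in c record which pairs have already been emitted. The loop keeps picking at random until every pair has been used, appending each chosen fragment to antidebugasm. The values set up a ptrace call: 26 in x16 is the Darwin system call number for ptrace, and 31 in x0 is the PT_DENY_ATTACH request.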

case 0: {
   std::string s[] = {"mov x0, #31\n", "mov w0, #31\n", "mov x1, #0\n",
                            "mov w1, #0\n",  "mov x2, #0\n",  "mov w2, #0\n",
                            "mov x3, #0\n",  "mov w3, #0\n",  "mov x16, #26\n",
                            "mov w16, #26\n"}; // svc ptrace
         bool c[5] = {false, false, false, false, false};
         while (c[0] != true || c[1] != true || c[2] != true || c[3] != true ||
                c[4] != true) {
       // ...
  }
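7. Create the InlineAsm object IA for insertion before the function’s terminator

This creates an inline assembly object containing the assembly code held in the antidebugasm string. The function type is void with no parameters, the constraint string is empty, and the hasSideEffects argument is true, so the optimizer does not remove the call.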

InlineAsm *IA = InlineAsm::get(FunctionType::get(Type::getVoidTy(EntryBlock->getContext()), false), antidebugasm, "", true, false);
8. Insert the inline assembly before the terminator of the last basic block

The loop visits every basic block of the function but keeps only the last terminator it sees, so a single inline assembly call is inserted, before the terminator of the function’s last basic block. The call is wrapped in a version check because newer LLVM versions take std::nullopt where older ones take None.

Instruction *I = nullptr;
for (BasicBlock &BB : F)
   I = BB.getTerminator();
#if LLVM_VERSION_MAJOR >= 16
       CallInst::Create(IA, std::nullopt, "", I);
#else
       CallInst::Create(IA, None, "", I);
#endif
9. Report unsupported targets

If the target is not Darwin on AArch64, an error message is printed instead.

else {
   errs() << "Unsupported Inline Assembly AntiDebugging Target: " << F.getParent()->getTargetTriple() << "\n";
}

To summarize the code above: the function first looks up ADBCallBack and InitADB and, if both exist, inserts a call to InitADB. Otherwise, for Darwin on ARM64, it falls back to inserting inline assembly that invokes ptrace through svc. Along the way, random selection is used to vary the generated instructions.
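The body of the selection loop in step 6 is elided in the excerpt, so the following is only a sketch of one way such a loop could work, not the pass’s actual code. buildPtraceAsm is a hypothetical name, std::mt19937 stands in for cryptoutils, and the trailing svc #0x80 line is an assumption based on the “svc ptrace” comment in the source.

```cpp
#include <array>
#include <cassert>
#include <random>
#include <string>

// Builds an AArch64 assembly string that sets up a ptrace(PT_DENY_ATTACH)
// system call. Each register group is emitted exactly once, in random order,
// using either its x or its w variant.
std::string buildPtraceAsm(std::mt19937 &rng) {
  static const std::array<std::string, 10> fragments = {
      "mov x0, #31\n",  "mov w0, #31\n",  "mov x1, #0\n", "mov w1, #0\n",
      "mov x2, #0\n",   "mov w2, #0\n",   "mov x3, #0\n", "mov w3, #0\n",
      "mov x16, #26\n", "mov w16, #26\n"};
  std::array<bool, 5> emitted{}; // one flag per register group, all false
  std::uniform_int_distribution<int> pickGroup(0, 4);
  std::uniform_int_distribution<int> pickVariant(0, 1);

  std::string out;
  int remaining = 5;
  while (remaining > 0) {
    int group = pickGroup(rng);
    if (emitted[group])
      continue; // this register is already set up, roll again
    out += fragments[2 * group + pickVariant(rng)];
    emitted[group] = true;
    --remaining;
  }
  assert(remaining == 0);
  out += "svc #0x80\n"; // ASSUMPTION: trap instruction not shown in the excerpt
  return out;
}
```

Every call produces five mov lines followed by the trap, but the order of the moves and the register widths differ from run to run, which makes the injected sequence harder to match with a fixed byte pattern.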

3. Precompiled anti-debugging IR file

From the analysis above, the PreCompiledIRPath option points the pass at an IR file containing the ADBCallBack and InitADB functions, and the actual anti-debugging logic lives in that file. So next we analyze this file. The original author of Hikari provides the IR files at: github.com/HikariObfus… . The file structure is as follows:

PrecompiledAntiDebugging-aarch64-ios.bc
PrecompiledAntiDebugging-thumb-ios.bc
PrecompiledAntiDebugging-x86_64-macosx.bc
SymbolConfig.json

We only analyze PrecompiledAntiDebugging-aarch64-ios.bc. A .bc file is in LLVM bitcode format, the binary encoding of LLVM’s intermediate representation. To view its contents, it has to be converted into textual LLVM IR with llvm-dis from the LLVM toolchain. The converted file usually has the .ll extension and is human-readable.

llvm-dis <input.bc> -o <output.ll>

Readers can convert it themselves. Because the output is long, it is not reproduced here in full. We analyze the IR file next.

1. Structure definition:

The code begins by defining several structures, including %struct.kinfo_proc, %struct.extern_proc, %union.anon, %struct.itimerval, %struct.timeval, %struct.eproc, %struct._pcred, %struct._ucred, %struct.vmspace and %struct.ios_execp_info.

2. Global declarations:
  • @.str is a global constant holding the string “ptrace”: @.str = private unnamed_addr constant [7 x i8] c"ptrace\00", align 1
  • @mach_task_self_ is an external global variable declaration: @mach_task_self_ = external global i32, align 4
3. Function ADBCallBack:

ADBCallBack is simple. It calls abort() to terminate the program, followed by an unreachable instruction.

define void @ADBCallBack() #0 {
call void @abort() #4
unreachable
}
4. Function InitADB:

This function performs several system calls and checks. The main logic is as follows:

  • Query process information with sysctl: %18 = call i32 @sysctl(ptr %16, i32 4, ptr %17, ptr %3, ptr null, i64 0)
  • Check a process status flag with a bitwise and followed by an icmp comparison: %22 = and i32 %21, 2048 and %23 = icmp ne i32 %22, 0. The mask 2048 (0x800) is the value of P_TRACED on Darwin, which is set while the process is being traced.
  • If a debugged state is detected, call ADBCallBack: call void @ADBCallBack()
  • Open the current process image with dlopen and look up a symbol with dlsym, presumably the “ptrace” string declared in @.str, so that the function is resolved at run time: %26 = call ptr @dlopen(ptr null, i32 10)
  • Issue a raw system call for a lower-level check: %34 = call i32 (i32, ...) @syscall(i32 26, i32 31, i32 0, i32 0). On Darwin, system call 26 is ptrace and request 31 is PT_DENY_ATTACH.
  • Allocate memory dynamically and call task_get_exception_ports to inspect the exception ports, which may reveal an attached debugger: %52 = call i32 @task_get_exception_ports(i32 %37, i32 7166, ptr %40, ptr %42, ptr %45, ptr %48, ptr %51)
  • Call isatty and ioctl, which are normally used to check whether the program is running on a terminal and to query the terminal’s state: %81 = call i32 @isatty(i32 1) and %85 = call i32 (i32, i64, ...) @ioctl(i32 1, i64 1074295912)
5. External function declarations:

The declaration section lists the external functions that the code calls, such as:

  • declare void @abort() #1
  • declare i32 @getpid() #2
  • declare ptr @malloc(i64) #3
  • declare i32 @task_get_exception_ports(i32, i32, ptr, ptr, ptr, ptr, ptr) #2
  • declare i32 @isatty(i32) #2
  • declare i32 @ioctl(i32, i64, ...) #2
6. Attributes:

Function attributes are defined at the end of the code using the attributes keyword:

attributes #0 = { noinline nounwind optnone ssp uwtable ... }
attributes #1 = { noreturn "correctly-rounded-divide-sqrt-fp-math"="false" ...}
attributes #2 = ...
7. Module flags and identification:

The module’s compiler flags and identifying information are given at the end of the code:

!llvm.module.flags = !{!0, !1}
!llvm.ident = !{!2}

The IR code above is designed to detect and prevent debugging. Once it detects conditions that are consistent with a debugger being present, or inconsistent with what is expected of a normally running program, it terminates the program by calling ADBCallBack. Let’s go over the code as a whole:

  1. Structure definitions: The code starts with the definitions of several structures used to interact with the iOS operating system and to lay out data in memory.
  2. Global declarations: @.str is a private unnamed_addr constant that stores the string “ptrace”. @mach_task_self_ is an external global variable, which may represent the identity of the current task.
  3. Function ADBCallBack: This function is very simple. It calls abort() to terminate the program, followed by an unreachable instruction. It is the reaction part of the anti-debugging logic.
  4. Function InitADB: This function is the core of the anti-debugging logic. It makes a series of system calls and checks:
    • Query process information with sysctl.
    • Check a process status flag with a bitwise and followed by an icmp comparison.
    • If a debugged state is detected, call ADBCallBack.
    • Use dlopen and dlsym to resolve a function at run time, presumably ptrace.
    • Issue a raw system call through syscall, possibly to perform a lower-level check.
    • Allocate memory dynamically and call task_get_exception_ports to inspect the exception ports, which may reveal an attached debugger.
    • Run some checks in a loop, calling ADBCallBack whenever an anomaly is found.
    • Finally, call isatty and ioctl, which are normally used to check whether the program is running on a terminal and to query the terminal’s state.
  5. External function declarations: A series of system functions are declared, such as getpid, sysctl, dlopen, dlsym, dlclose, syscall, malloc, task_get_exception_ports, isatty and ioctl. They perform various system-level operations, many of them related to preventing debugging.
  6. Attributes: These define function attributes that affect optimization, such as noinline (do not inline) and nounwind (the function does not throw exceptions).
  7. Module flags and identification: These declare compiler-related metadata, such as wchar_size and the PIC (position-independent code) level.
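The sysctl check in InitADB matches a common source-level idiom on Darwin: ask the kernel for the current process’s kinfo_proc and test the P_TRACED bit (0x800, i.e. 2048) in p_flag. The following is a minimal sketch, assuming this is roughly what the IR was compiled from; hasTracedFlag and isDebuggerAttached are hypothetical names, and on non-Apple platforms the sketch simply reports false.

```cpp
#include <cassert>

#if defined(__APPLE__)
#include <sys/types.h>
#include <sys/sysctl.h>
#include <unistd.h>
#endif

// 2048 (0x800) is the mask tested by `and i32 %21, 2048` in the IR;
// on Darwin it is the value of P_TRACED.
constexpr int kTracedFlag = 0x800;

// Returns true when the traced bit is set in a process flag word.
bool hasTracedFlag(int processFlags) {
  return (processFlags & kTracedFlag) != 0;
}

// Asks the kernel about the current process and tests the traced bit.
bool isDebuggerAttached() {
#if defined(__APPLE__)
  int mib[4] = {CTL_KERN, KERN_PROC, KERN_PROC_PID, getpid()};
  struct kinfo_proc info = {};
  size_t size = sizeof(info);
  if (sysctl(mib, 4, &info, &size, nullptr, 0) != 0)
    return false; // query failed; treated as "not traced" in this sketch
  assert(size <= sizeof(info));
  return hasTracedFlag(info.kp_proc.p_flag);
#else
  return false; // no Darwin sysctl interface on this platform
#endif
}
```

A caller would react to a true result the way InitADB does, by terminating the process; in the IR that reaction is ADBCallBack, which calls abort().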

Summary

In this article, we examined how AntiDebug is implemented as an LLVM pass, through detailed code analysis and a reading of the IR file. Finally, we summarize how implementing AntiDebug as a pass differs from implementing it in source code.

Implementing AntiDebug directly in a project usually means adding logic to detect the debugger at the source code level, while implementing AntiDebug based on LLVM Pass inserts such logic during the compiler optimization phase. The advantages of the two can be compared from the following aspects:

  1. Concealment:
    • Source code implementation: The anti-debugging logic lives in the source code, where it is visible to experienced developers or attackers and may be discovered and bypassed by reading the source.
    • LLVM Pass implementation: The anti-debugging logic is inserted by the pass and only appears in the compiled binary, which makes detection and reverse engineering harder and increases the concealment of the anti-debugging measures.
  2. Portability:
    • Source code implementation: Source-level anti-debugging needs to be adapted and modified for different platforms and compilers.
    • LLVM Pass implementation: LLVM is a cross-platform compiler that supports multiple target architectures, so a pass helps keep the anti-debugging logic consistent and portable across platforms.
  3. Flexibility and reusability:
    • Source code implementation: Anti-debugging code has to be added by hand. For large projects, this may mean adding similar code repeatedly in multiple places.
    • LLVM Pass implementation: As part of the compilation process, anti-debugging code can be inserted automatically into many parts of the target program, and is easier to reuse across projects.
  4. Maintainability:
    • Source code implementation: As a project grows, maintaining and updating anti-debugging logic embedded in the source code can become complex.
    • LLVM Pass implementation: The anti-debugging logic is separated from the application logic, which makes maintenance easier. When a new anti-debugging technique emerges, only the pass needs to be updated.
  5. Performance:
    • Source code implementation: The additional checks may affect program performance.
    • LLVM Pass implementation: The pass can choose more selectively when and where to insert anti-debugging code during compilation, which may leave more room for performance optimization.
  6. Degree of obfuscation:
    • Source code implementation: Usually straightforward and easy to reverse.
    • LLVM Pass implementation: It can be combined with the compiler’s optimization and obfuscation strategies to generate more complex binary code that is harder to analyze.

All in all, implementing AntiDebug as an LLVM pass can provide better concealment, portability, flexibility, and maintainability, and may also bring advantages in performance and obfuscation. However, this approach requires a deep understanding of the LLVM framework and may require a more complex build and debug process.