Detecting the xz-utils Backdoor with Automation

In this ObjectSecurity blog post, we discuss how automated binary vulnerability analysis helps detect advanced attacks such as the recently discovered “xz-utils backdoor”, which was publicly disclosed on March 29, 2024. The backdoor was committed to xz-utils, a ubiquitous library in the Linux ecosystem, through its GitHub repository, which has since been disabled by GitHub. The malware was disguised as a binary file meant to act as input for an automated test that runs along with new public build versions of the xz-utils library. Had this exploit not been detected by Andres Freund, a developer at Microsoft, countless Linux/Unix systems would have become vulnerable to what is suspected to be a nation-state attack.


Only after a multi-stage parsing process completes is the backdoor injected into release versions of xz-utils. The payload injected at the end of this process results in a malicious shared object (.so) file in versions 5.6.0 and 5.6.1 of liblzma, the compression library that ships as part of xz-utils.
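
As a first triage step, the two affected release numbers can be checked mechanically. The following is a minimal sketch, assuming the version is available as a string such as the first line printed by `xz --version`; the helper names `parse_version` and `is_affected` are our own for illustration. A version match only indicates that a build deserves inspection, since distribution-specific version suffixes and rebuilt packages can make the string misleading in either direction.

```python
import re

# Release numbers named in this post as carrying the backdoored liblzma.
AFFECTED_VERSIONS = {(5, 6, 0), (5, 6, 1)}


def parse_version(text):
    """Return the first X.Y.Z version found in text as a tuple, or None."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def is_affected(text):
    """True if the version string names one of the affected releases."""
    return parse_version(text) in AFFECTED_VERSIONS


if __name__ == "__main__":
    for banner in ("xz (XZ Utils) 5.6.1", "liblzma 5.4.6"):
        print(banner, "->", "affected" if is_affected(banner) else "not listed")
```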

xz-utils provides a shared library called liblzma that supplies compression and decompression capabilities to numerous downstream dependents. One of these dependents is the SSH daemon, sshd, which on several distributions is patched to link against libsystemd, which in turn depends on liblzma. This daemon is the core program of any SSH server. Thus, had this attack succeeded, it would have effectively opened Linux servers and computers to unauthorized access. Thankfully, this exploit was detected by Andres Freund, a developer at Microsoft; his original oss-security post is listed under Resources. Due to the complexity of the exploit, it is suspected that a nation state is behind this attack.

The backdoor is tracked as CVE-2024-3094 in the NIST National Vulnerability Database.

This is the first attack of its kind because the malware itself is not fully present in the source code alone. Only by examining the binary can we detect this backdoor.

Finding the Abused `system()` Call

The infected liblzma.so.5.6.X allows remote code execution (RCE) prior to authentication: it extracts a command from the authenticating client’s SSH certificate and passes it to system(), in the place where RSA_public_decrypt would normally execute.

If we perform a backtrace in GDB, we can see the obfuscated path taken before reaching system() (see screenshot).

The analysis of the call shows a recursive structure at address 0x132C8 in the liblzma binary file. The process likely depends on interactions with other binaries, such as standard libraries, and this is likely a dead end for many less experienced reverse engineers. What stands out, though, is the obfuscated control flow.

We now demonstrate the automated findings of ObjectSecurity’s BinLens™ (formerly ObjectSecurity OT.AI Platform) below, where this control flow is further analyzed and characterized as a particularly malevolent finding.

ROP Exploitation

Our analysis showed that the liblzma binary exhibits the characteristics of a Return-Oriented Programming (ROP) attack. The output consistently showed an overly complex control flow and a large number of ROP weaknesses. This type of attack leverages existing code snippets within a program, known as “gadgets,” to execute arbitrary operations. Each gadget typically ends with a return instruction and is meticulously chosen to perform a step towards executing an attacker’s payload without the need to inject new code, thus evading typical security measures that scan for code injections.

Understanding Return-Oriented Programming (ROP)

At its core, ROP is a sophisticated exploitation technique that abuses the way software handles subroutine returns. It manipulates the stack, which is a crucial data structure used to store return addresses and local variables for functions. In a typical ROP attack, an adversary carefully crafts a stack that includes a sequence of return addresses, each pointing to a chosen gadget. By manipulating the program’s execution flow, these gadgets are executed in sequence to perform arbitrary actions.

The technique hinges on finding and using sequences of machine instructions that are already present in the running program’s memory, the “gadgets”. These gadgets are pieced together to construct a payload that achieves the attacker’s objectives, such as compromising a system or stealing data.
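
To make the notion of a gadget concrete, the sketch below scans a buffer of x86-64 machine code for the single-byte near-return opcode (0xC3) and collects the short byte windows that end at each return. This is a deliberately naive illustration, not how BinLens or any production gadget finder works: it performs no disassembly, so many of the windows it returns will not decode to useful instructions. The function name `find_gadget_candidates` and the `max_len` parameter are our own for illustration.

```python
RET_OPCODE = 0xC3  # x86-64 near return ("ret")


def find_gadget_candidates(code, max_len=5):
    """Return (start_offset, window_bytes) pairs for windows ending in ret.

    For each ret byte at index i, every window code[i-k : i+1] with
    k in 1..max_len is reported. No instruction decoding is done.
    """
    candidates = []
    for i, byte in enumerate(code):
        if byte != RET_OPCODE:
            continue
        for k in range(1, max_len + 1):
            start = i - k
            if start < 0:
                break  # window would begin before the buffer
            candidates.append((start, bytes(code[start:i + 1])))
    return candidates


if __name__ == "__main__":
    # nop; pop rdi; ret; nop
    sample = bytes([0x90, 0x5F, 0xC3, 0x90])
    for offset, window in find_gadget_candidates(sample, max_len=2):
        print(hex(offset), window.hex())
```

A real tool would disassemble each window backwards from the return and keep only the sequences that decode cleanly, which is why raw counts from a byte scan like this one overestimate the usable gadget set.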

Correlating Weird Control Flow with Increased ROP Attack Potential

The complexity of control flow in an application can significantly impact its vulnerability to ROP attacks. When control flow is irregular or convoluted, it often indicates numerous branches and potential execution paths. This environment can be ripe for ROP for several reasons:

Increased Gadget Availability: Complex control flows imply a higher diversity of code snippets and function epilogues. This variety provides a richer set of gadgets for attackers to exploit, enabling them to find the necessary components to string together their desired malicious functionality. A prominent feature of the liblzma backdoor is its complex control flow and increased number of ROP gadgets.
Obfuscation and Detection Evasion: Irregular control flows can make it more challenging for static analysis tools to accurately map out potential execution paths and detect anomalous sequences that might signify an exploit. This obfuscation naturally aids attackers in hiding their exploit chains within the legitimate complexity of the software. The xz-utils attack employs a sophisticated linker manipulation that can evade detection.
Compromised Flow Integrity: The very nature of ROP exploits involves diverting the intended control flow of a program. A binary that inherently contains complex or non-linear execution paths may be more susceptible to further manipulations without these anomalies being readily apparent to monitoring tools or even during manual review. Once triggered, the liblzma attack hijacks the control flow to access the SSH daemon for backdoor access.
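
One common way to put a number on "complex control flow" is McCabe's cyclomatic complexity, M = E - N + 2 for a single connected control-flow graph with E edges and N basic blocks. The sketch below computes it for two small hand-written graphs. We use this metric only as an illustration; this post does not state which complexity measure BinLens applies, and the graphs and block labels are hypothetical.

```python
def cyclomatic_complexity(cfg):
    """McCabe complexity M = E - N + 2 for one connected control-flow graph.

    cfg maps each basic-block label to the list of its successor labels.
    """
    nodes = set(cfg)
    for successors in cfg.values():
        nodes.update(successors)
    edges = sum(len(successors) for successors in cfg.values())
    return edges - len(nodes) + 2


# Hypothetical graphs: a straight-line function and a branchy one with a loop.
straight_line = {"entry": ["exit"], "exit": []}
branchy = {
    "entry": ["a", "b"],
    "a": ["c", "exit"],
    "b": ["c"],
    "c": ["entry", "exit"],  # edge back to entry forms a loop
    "exit": [],
}

if __name__ == "__main__":
    print("straight_line:", cyclomatic_complexity(straight_line))
    print("branchy:", cyclomatic_complexity(branchy))
```

Higher values mean more independent paths through a function, which is the property the three points above tie to gadget availability and to the difficulty of static analysis.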

Unveiling the Veil: The Critical Role of Linker Operations in Cybersecurity

The Procedure Linkage Table (PLT) and the Global Offset Table (GOT) are fundamental components in the dynamic linking process that enables programs to utilize shared libraries for common functionality. This system is both programmable and dynamic, which conserves memory and supports modular programming. However, this flexibility also introduces significant risks. When manipulated, the PLT and GOT can alter the intended control flow of an application, potentially turning into conduits for ROP attacks.

When malware, like that found in the xz-utils case, consistently interacts with the PLT and GOT at specific addresses, it reveals a methodical approach to hijack these mechanisms. This consistency is a critical signal, a pattern that, once recognized, can be monitored and mitigated against.
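
The effect of tampering with a GOT entry can be shown with a toy model. In the sketch below, a dictionary stands in for the GOT and the `call` method stands in for a PLT stub that resolves a symbol on first use and then jumps through the table. Everything here (the `ToyProcess` class and the two stand-in functions) is hypothetical and written only for illustration; it models the general redirection idea and does not reproduce the mechanics of the actual liblzma backdoor.

```python
class ToyProcess:
    """Toy model of lazy binding: a GOT dict filled on first call."""

    def __init__(self, library):
        self.library = library  # symbol name -> callable (the "shared library")
        self.got = {}           # symbol name -> resolved callable

    def call(self, symbol, *args):
        # PLT-stub behavior: resolve on first use, then jump through the GOT.
        if symbol not in self.got:
            self.got[symbol] = self.library[symbol]
        return self.got[symbol](*args)


def real_decrypt(data):
    """Stand-in for the legitimate library routine."""
    return f"decrypted:{data}"


def hijacked_decrypt(data):
    """Stand-in for attacker-controlled code."""
    return f"attacker-handled:{data}"


proc = ToyProcess({"RSA_public_decrypt": real_decrypt})
before = proc.call("RSA_public_decrypt", "sig")
proc.got["RSA_public_decrypt"] = hijacked_decrypt  # the table entry is overwritten
after = proc.call("RSA_public_decrypt", "sig")

if __name__ == "__main__":
    print(before)
    print(after)
```

The caller's code never changes between the two calls; only the table entry does, which is why repeated interactions with the same PLT and GOT addresses are worth monitoring.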

Correlation Between ROP Gadgets and Linker Addresses

ObjectSecurity’s BinLens™ (formerly ObjectSecurity OT.AI Platform) reports consistent interactions at a common address, spanning the key sections that our platform uses to measure linker manipulations in the PLT and GOT. This uniformity in address usage across four critical linker sections demonstrates our platform’s monitoring capabilities. These findings are also associated with a function, __cxa_finalize, whose region contains multiple ROP gadgets:

Analyzing section 1
__cxa_finalize at 17824 interacts with cat 1 linker ops
Analyzing section 2
__cxa_finalize at 17824 interacts with cat 2 linker ops
Analyzing section 3
__cxa_finalize at 17824 interacts with cat 3 linker ops
Analyzing section 4
__cxa_finalize at 17824 interacts with cat 4 linker ops

This dual identification, several categories of linker manipulation at one address together with ROP gadgets in the same region, forms the basis of a detection mechanism for the class of attacks that leverage complex execution flows and ROP gadgets. With this level of insight, red, purple, and blue teams are equipped to preemptively address and neutralize sophisticated attack vectors, securing their systems against the most cunning of cyber adversaries.
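
The correlation described above can be sketched as a small post-processing step over two lists: the linker operations observed (address and category) and the addresses of detected ROP gadgets. This is a minimal sketch under our own assumptions, not the BinLens implementation. The function names, the 64-byte window, and the gadget addresses are hypothetical; only the address 17824 and the four categories are taken from the log above, with the address treated as the integer printed there.

```python
from collections import defaultdict


def categories_by_address(linker_ops):
    """Map each address to the set of linker-operation categories seen there."""
    seen = defaultdict(set)
    for address, category in linker_ops:
        seen[address].add(category)
    return dict(seen)


def gadgets_near(gadget_addresses, address, window):
    """Return the gadget addresses within +/- window bytes of address."""
    return [g for g in gadget_addresses if abs(g - address) <= window]


def flag_hotspots(gadget_addresses, linker_ops, min_categories=2, window=64):
    """Addresses with several linker categories and at least one nearby gadget.

    Returns {address: (category_count, nearby_gadget_count)}.
    """
    hotspots = {}
    for address, cats in categories_by_address(linker_ops).items():
        nearby = gadgets_near(gadget_addresses, address, window)
        if len(cats) >= min_categories and nearby:
            hotspots[address] = (len(cats), len(nearby))
    return hotspots


# Inputs shaped like the log above; the gadget addresses are made up.
linker_ops = [(17824, 1), (17824, 2), (17824, 3), (17824, 4), (20000, 1)]
gadgets = [17800, 17830, 17880, 40000]

if __name__ == "__main__":
    print(flag_hotspots(gadgets, linker_ops))
```

Requiring both conditions at once is the point of the design: neither a busy linker address nor a cluster of gadgets is flagged on its own.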

Validation of Findings

Using the unpatched and patched liblzma samples, we wanted to know whether the detection of ROP gadgets alone is sufficient to determine that this backdoor exists. We chose a logistic regression model for this analysis because it allows for the integration of multiple factors and provides a probabilistic framework for assessing the impact of these factors on the likelihood of our platform’s final determinations.

We modeled our findings across three populations:

liblzma 5.6.0 is an infected population
liblzma 5.6.1 is an infected population
liblzma 5.6.2 is the patched healthy population

The predictors are the total number of ROP gadgets (ROP Gadgets), the number of ROP gadgets within the same region as an address that shows multiple categories of linker manipulation (ROP Path), and the number of linker manipulation categories observed at a common address, out of four possible categories (Linker Manipulations).

We regularized the logistic regression model to handle perfect separation, a common challenge in datasets where the groups are cleanly distinguishable. Without regularization, the coefficient estimates for perfectly separated data can grow without bound. Regularization stabilized the estimates and provided more reliable insights into the relationships between our predictors and the outcome.
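
The effect of regularization under perfect separation can be reproduced on a toy problem. The sketch below fits a one-feature logistic regression by plain gradient descent, once without a penalty and once with an L2 penalty. The data are hypothetical and unrelated to our measurements, and the choice of an L2 penalty, the learning rate, and the step count are assumptions made for illustration; this post does not specify which regularizer or solver was used.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, l2, steps=5000, lr=0.1):
    """Gradient descent on mean log-loss + (l2 / 2) * w**2 (one feature plus bias)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w = l2 * w  # gradient of the L2 penalty term
        grad_b = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(w * x + b) - y
            grad_w += error * x / n
            grad_b += error / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy, perfectly separable data (hypothetical): negatives left of zero, positives right.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]

w_plain, _ = fit_logistic(xs, ys, l2=0.0)
w_ridge, _ = fit_logistic(xs, ys, l2=0.1)

if __name__ == "__main__":
    print("weight without penalty:", w_plain)
    print("weight with L2 penalty:", w_ridge)
```

Without the penalty the weight keeps creeping upward the longer the optimizer runs, because a steeper curve always fits separable data slightly better. With the penalty it settles at a finite value.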

The results are as follows:

Coefficients: ([[-0.00069303, 0.30343857, 0.5774888]])
Intercept: ([-0.27405023])

Coefficients:

ROP Gadgets: (-0.00069303). Slightly decreases the likelihood of infection with increasing number. Minimal effect.
ROP Path: (0.30343857). Increases the likelihood of infection significantly with more detections.
Linker Manipulations: (0.5774888). Strongly increases the likelihood of infection with higher values; most influential factor.

Intercept (-0.27405023): Baseline log odds of infection when all predictors are zero. The negative value means the baseline leans toward non-infected.
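
Logistic regression coefficients are easier to read as odds ratios: exponentiating a coefficient gives the factor by which the odds of infection change for a one-unit increase in that predictor. The short sketch below does this for the reported values, assuming that infected samples are coded as the positive class and that the predictors are used unscaled; neither detail is stated explicitly above. The function names are our own.

```python
import math

# Values copied from the reported results above.
COEFFICIENTS = {
    "ROP Gadgets": -0.00069303,
    "ROP Path": 0.30343857,
    "Linker Manipulations": 0.5774888,
}
INTERCEPT = -0.27405023


def odds_ratios(coefficients):
    """exp(coefficient): multiplicative change in odds per one-unit increase."""
    return {name: math.exp(value) for name, value in coefficients.items()}


def predicted_probability(features):
    """Model probability of infection for a dict of predictor values."""
    z = INTERCEPT + sum(COEFFICIENTS[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    for name, ratio in odds_ratios(COEFFICIENTS).items():
        print(f"{name}: odds ratio {ratio:.4f}")
    print(f"baseline probability: {predicted_probability({}):.4f}")
```

Under these assumptions, each additional linker manipulation category multiplies the odds of infection by roughly 1.78 and each additional ROP Path detection by roughly 1.35, while each additional ROP gadget leaves the odds essentially unchanged (a factor of about 0.999). With all predictors at zero, the model's baseline probability of infection is about 0.43.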

This analysis shows that while ROP gadgets are prevalent in the infected populations, their sheer numbers alone do not significantly increase the likelihood of a backdoor being present. Instead, the context in which these gadgets appear, particularly their association with linker manipulations, is necessary for backdoor detection.

Therefore, the results from this analysis show that the Linker Manipulations analysis of ObjectSecurity’s BinLens™ (formerly ObjectSecurity OT.AI Platform) is the strongest predictor for detecting a ROP backdoor, followed by the ROP Path, which depends on the findings from Linker Manipulations. The analysis of ROP gadgets alone has a negligible inverse effect on infection likelihood, indicating that their presence must be contextualized with more specific manipulative actions to accurately detect backdoors.

A Call to Arms in Cybersecurity: Detecting `liblzma` Attack in BinLens

The significance of finding ROP gadgets corresponding to addresses associated with the PLT and GOT cannot be overstated. ROP is a cunning exploitation technique that avoids direct code injection by reusing existing code snippets in the program. By chaining these snippets, attackers can execute arbitrary malicious functionality. The presence of ROP gadgets at PLT and GOT addresses is particularly alarming — it suggests that these critical sections are being targeted as vector points for executing attack sequences.

In light of the recent security breach through the xz-utils library, ObjectSecurity’s BinLens™ (formerly ObjectSecurity OT.AI Platform) stands ready as a fully automated and effective solution designed to detect such backdoors and other exploits. It provides comprehensive tools that benefit red, purple, and blue teams, ensuring greater visibility within the binary samples of critical infrastructure.

This exploration into the depths of dynamic linking and ROP exploits is a testament to the power of modern cybersecurity defense strategies, such as offered by ObjectSecurity’s BinLens™ (formerly ObjectSecurity OT.AI Platform). As we share these insights, we invite the global community to engage, learn, and fortify.

The fight against cyber threats is relentless, but so are we. Let’s turn our knowledge into our most potent weapon in the cybersecurity arsenal.

Resources

Freund, Andres. oss-security mailing list. “backdoor in upstream xz/liblzma leading to ssh server compromise”. 3/29/24. https://openwall.com/lists/oss-security/2024/03/29/4
NIST. “CVE-2024-3094 Detail”. 3/29/24. https://nvd.nist.gov/vuln/detail/CVE-2024-3094
Weems, Anthony. xzbot Github page. https://github.com/amlweems/xzbot/tree/main?tab=readme-ov-file
smx. XZ Backdoor Analysis Github page. https://gist.github.com/smx-smx/a6112d54777845d389bd7126d6e9f504
ObjectSecurity’s BinLens™ (formerly ObjectSecurity OT.AI Platform) website. https://objectsecurity.com/otai

by Dr. Reza Fatahi, Principal Research Scientist, ObjectSecurity and Trevor Thomas, Project Lead, ObjectSecurity (published 4/8/2024)
