You Should be Automating Binary Reverse Engineering: Here’s Why.

Binary reverse engineering is a luxury not many can afford. Up until now, there haven’t been options to automate and scale the skills and experiences of reverse engineers. With recent advancements in AI/ML, automated binary reverse engineering is now possible. New AI/ML-based approaches to binary vulnerability detection preserve the advantages of manual processes and transcend limitations imposed by the scarcity of skilled human reverse engineers.

In the complex world of OT/ICS security, traditional cyber management methods often fall short, which makes binary reverse engineering the first, and most critical, line of defense.

Network monitoring is a common technique said to ensure the security of an OT/ICS environment. However, this approach has caveats. Many devices are not connected to the network but may be exploited through other means. Additionally, network monitoring is reactive: it points out malicious activity after it has already happened. By that point, the malicious actor has most likely already achieved their goal.

Cybersecurity vendors who offer software composition analysis (SCA) provide one way to address these concerns. Using SCA, an organization can analyze the components, such as libraries and modules, that the software under analysis depends upon. These components are cross-referenced against a vulnerability database to determine which known vulnerabilities affect a given firmware or software image. Most often, these vendors deliver the results as a software bill of materials (SBOM): an inventory of the detected components, annotated with the known vulnerabilities that match each one.
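As a rough illustration of that cross-referencing step, the Python sketch below matches a component inventory against a small in-memory advisory table. The component names, versions, and advisory identifiers are all invented for this example; a real SCA tool would query a maintained vulnerability database and reason about version ranges rather than exact matches.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    version: str


# Hypothetical advisory table: (component name, affected version) -> advisory IDs.
# These identifiers are placeholders, not real advisories.
KNOWN_VULNERABILITIES = {
    ("examplessl", "1.0.2"): ["EXAMPLE-0001", "EXAMPLE-0002"],
    ("examplehttpd", "2.4.1"): ["EXAMPLE-0003"],
}


def cross_reference(components):
    """Map each component to the advisories that match its name and version exactly."""
    report = {}
    for component in components:
        key = (component.name.lower(), component.version)
        report[component] = KNOWN_VULNERABILITIES.get(key, [])
    return report


if __name__ == "__main__":
    inventory = [
        Component("ExampleSSL", "1.0.2"),
        Component("vendor-proprietary-rtos", "7.3"),  # absent from the table
    ]
    for component, advisories in cross_reference(inventory).items():
        status = ", ".join(advisories) if advisories else "no published advisories"
        print(f"{component.name} {component.version}: {status}")
```

An empty result for a component means only that nothing has been published for it, not that the component is free of vulnerabilities.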

However, the SBOM generation approach leaves a lot to be desired: what about unpublished vulnerabilities? If a vulnerability isn’t found in a public database, either because it hasn’t yet been discovered or because it has not been published, then SBOM services are incapable of detecting it. This is especially concerning for proprietary software systems with few known and familiar dependencies. Conventional SBOM/SCA techniques do not address the nuances of proprietary code, so these systems go without adequate analysis and are ripe for exploitation.

Around 20-70% of OT/ICS assets are legacy or end-of-life (EOL). These assets no longer receive patches or updates and cannot be easily replaced, and their users almost always lack access to the source code. For these reasons, EOL assets are particularly challenging to assess for security compliance using traditional cyber management methods.

This leaves organizations with one choice for their defense: binary reverse engineering.

What is binary reverse engineering?

Binary reverse engineering is the process of analyzing a binary program to understand its functionality, behavior, weaknesses, and vulnerabilities. Binary reverse engineering has a variety of applications, including malware analysis, exploit development, vulnerability research, and legacy system maintenance.

Although binary reverse engineering encompasses a wide range of techniques and practices, the domain can be roughly divided into two sub-categories: static analysis and dynamic analysis. With static analysis, a reverse engineer typically analyzes the disassembled or decompiled output of a binary without running it. Static analysis can identify how the program is organized and whether particular functions are used, and it can recover other information such as embedded text, copyright notices, and various other properties of a binary file.
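One of the simplest static techniques, recovering embedded text without running the binary, can be sketched in a few lines of Python. This is a minimal approximation of what string-extraction utilities do; the function name, the four-character minimum, and the sample bytes are assumptions made for the example.

```python
import re


def extract_strings(data: bytes, min_length: int = 4):
    """Return runs of printable ASCII that are at least min_length bytes long."""
    # 0x20-0x7e covers the printable ASCII range, from space through tilde.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.decode("ascii") for match in pattern.findall(data)]


if __name__ == "__main__":
    # Hypothetical bytes standing in for the contents of a firmware image.
    sample = b"\x7fELF\x02\x01\x00\x00Copyright (c) Example Corp\x00\x01\x02fw_update\x00ab\x00"
    for text in extract_strings(sample):
        print(text)
```

To inspect a real file, the same function can be applied to the bytes read from it, for example extract_strings(open(path, "rb").read()), where path is whatever binary is under analysis.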

In contrast, dynamic analysis involves the execution of a binary file, either in whole or in part, to examine runtime anomalies and behavior. Dynamic analysis leverages the execution flow of a software program, often known as the control flow, where possible execution paths are made distinct by branching patterns and behaviors within the code.
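To make the idea of distinct execution paths concrete, the sketch below models a tiny control-flow graph as a Python dictionary and enumerates the paths through it. The block names are hypothetical stand-ins for basic blocks in a real binary. Each run of the program follows exactly one such path, so the list shows what a dynamic analysis would need to exercise to cover every branch.

```python
# Toy control-flow graph: each basic block maps to the blocks it can branch to.
# Block names are hypothetical and stand in for addresses in a real binary.
CFG = {
    "entry": ["check_auth"],
    "check_auth": ["grant", "deny"],  # conditional branch with two outcomes
    "grant": ["exit"],
    "deny": ["exit"],
    "exit": [],
}


def enumerate_paths(cfg, start, end, path=None):
    """Yield every acyclic path from start to end using depth-first search."""
    path = (path or []) + [start]
    if start == end:
        yield path
        return
    for successor in cfg.get(start, []):
        if successor not in path:  # skip blocks already on this path to avoid loops
            yield from enumerate_paths(cfg, successor, end, path)


if __name__ == "__main__":
    for found in enumerate_paths(CFG, "entry", "exit"):
        print(" -> ".join(found))
```

Real programs contain loops and far more branches, so the number of paths grows quickly; this sketch sidesteps that by ignoring any path that revisits a block.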

Reverse engineering requires domain expertise.

Reverse engineering is typically done in the security labs of large organizations, or by lone, highly motivated individuals with their own personal goals. The latter group includes black hat hackers who find exploits to extort their victims, but it also includes students learning the art of reverse engineering. The learning curve is steep: reverse engineering requires a high degree of domain expertise and years of experience in writing production software. In a broader context, this means binary reverse engineering is inaccessible to most organizations that care about the cybersecurity of the products they make or use. Expert reverse engineers are hard to come by, leaving most organizations with just the SBOM generation and network monitoring tools that make up the status quo of the modern cybersecurity landscape.

AI/ML: A Solution to Automated Binary Reverse Engineering

If not for its complexity, binary reverse engineering would serve as the first line of defense in cybersecurity for any operation aiming to detect vulnerabilities. Indeed, when time and money are not constraints, it is undeniably the best approach. However, until recently, it could not be automated and was impossible to scale. Human intuition has been the limiting factor, until now.

Recent advancements in AI have made it a close technological approximation of human intuition. AI/ML is the best candidate for replacing the human factor required in traditional manual reverse engineering processes.

AI/ML can be used to synthesize the output from existing binary reverse engineering techniques. Many analyses that reverse engineers currently run by hand, such as concolic execution, static application security testing (SAST), and dynamic application security testing (DAST), are unwieldy to use. These analyses are not only difficult to configure initially, but also generate complicated output. AI/ML can be used to make sense of this output, generating a unified report that is easy to understand. The ObjectSecurity OT.AI Platform takes a similar approach, unifying traditionally disparate analysis results in a digestible form.
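The unification step on its own, without any AI/ML, can be pictured with a small Python sketch that groups findings from different analyzers by the function they refer to. The record layout, tool names, and severity scale are assumptions made for this illustration and do not describe how the ObjectSecurity OT.AI Platform or any particular tool works.

```python
from collections import defaultdict


def unify_findings(findings):
    """Group findings by function and rank functions by their highest severity."""
    grouped = defaultdict(lambda: {"severity": 0, "sources": [], "issues": []})
    for finding in findings:
        entry = grouped[finding["function"]]
        entry["severity"] = max(entry["severity"], finding["severity"])
        if finding["tool"] not in entry["sources"]:
            entry["sources"].append(finding["tool"])
        entry["issues"].append(finding["issue"])
    # Most severe functions first, so a reader sees the riskiest code at the top.
    return sorted(grouped.items(), key=lambda item: -item[1]["severity"])


if __name__ == "__main__":
    # Hypothetical output records from two separate analyzers.
    raw = [
        {"tool": "static", "function": "parse_packet", "issue": "unchecked length", "severity": 3},
        {"tool": "dynamic", "function": "parse_packet", "issue": "crash on long input", "severity": 5},
        {"tool": "static", "function": "init_uart", "issue": "unused return value", "severity": 1},
    ]
    for function, entry in unify_findings(raw):
        print(function, entry["severity"], entry["sources"], entry["issues"])
```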

In the past, the inability to program human intuition has impeded the reliability of many common reverse engineering techniques. This is especially true for decompilation: the process of converting binary machine code back into a source code representation. As code complexity increases, the abstractions present in the original source code become more difficult to reconstruct. Due to the loss of semantic information post-compilation, determining the original names and purposes of most variables is nearly impossible. A human reverse engineer must make a great effort to piece together this ‘lost’ information. AI/ML presents a method for automating decompilation in a more effective way. The ObjectSecurity OT.AI Platform uses AI/ML to fill this human intuition gap present in many reverse engineering techniques, such as decompilation.

New vulnerabilities are often impossible to detect without reverse engineering the affected product. Lone, highly motivated malicious actors use these techniques to develop novel exploits, which puts organizations under increasing pressure to defend themselves. However, the tools organizations have today are insufficient to protect against such zero-days: most cybersecurity software relies on published vulnerabilities, human reverse engineers are scarce, and most organizations cannot afford to employ a vulnerability research lab.

Automated binary reverse engineering is set to become one of the most effective cybersecurity techniques to combat zero-day exploits. Automation makes the benefits of traditional manual reverse engineering more accessible.

The ObjectSecurity OT.AI Platform automates binary reverse engineering and vulnerability analysis.