In Exploring Memory Safety in Critical Open Source Projects, CISA, the FBI, and several international cybersecurity agencies report that approximately 52% of critical open-source projects contain code written in a memory-unsafe programming language. This article is the latest in a series of statements from the government arguing in favor of a transition to memory-safe programming languages (e.g., Rust, Go, and Java) in critical infrastructure.
In this blog post, we examine why the government is placing such emphasis on memory safety, and we explore both the solution presented by CISA et al. (a transition to memory-safe programming languages) and its alternatives.
Who Cares?
The recent CISA article is the latest in a series of government statements regarding memory safety, including The Case for Memory Safe Roadmaps, The Urgent Need for Memory Safety in Software Products, and Back to the Building Blocks: A Path Toward Secure and Measurable Software, the last of which was published by the White House to considerable news coverage.
Evidently, memory safety is of critical concern to the US Government; but why? Three of the four articles mention that around 70% of published common vulnerabilities and exposures (CVEs) assigned by Microsoft are related to memory safety, citing a blog post by the Microsoft Security Response Center (MSRC). This number is quite alarming. Many news outlets have reported the statistic in isolation, claiming that memory unsafety is the root cause of most known software vulnerabilities.
However, it is important to understand that the 70% figure applies only to CVEs assigned by Microsoft, not to all CVEs generally. Because Microsoft's products are largely written in memory-unsafe languages (C, C++, etc.), memory-safety vulnerabilities are overrepresented among their CVEs. Looking at all CVEs in the National Vulnerability Database (NVD), memory-safety-related vulnerabilities actually comprise just ~21.3% of the total.
[Figure: Percentage of memory-safety-related CVEs added to the NVD over time]
Please note that this 21.3% figure is an approximation. Out of 255,373 total CVEs, only 175,971 had enough information to determine whether they were related to memory safety. Of these, 37,479 were memory-safety-related: 37,479 / 175,971 ≈ 21.3%.
The concept of memory safety is sometimes ill-defined, with some individuals arguing for or against including certain vulnerability classes in the memory-safety category. We have opted for the definition used by MITRE: CWE category 1399. This category includes various memory-safety-related weaknesses (e.g., stack-based buffer overflow, use-after-free, and double-free) but excludes some weaknesses that others may deem memory-safety-related (notably, NULL pointer dereference). So, the 21.3% figure may be slightly larger or smaller than presented, though not by much.
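For readers unfamiliar with these weakness classes, here is a minimal, deliberately broken C sketch of the three examples named above (illustrative only; no real codebase is being quoted):

```c
#include <stdlib.h>
#include <string.h>

void stack_overflow(const char *input) {
    char buf[8];
    strcpy(buf, input);      /* stack-based buffer overflow: writes past
                                buf whenever input exceeds 7 characters  */
}

void use_after_free(void) {
    char *p = malloc(16);
    free(p);
    p[0] = 'x';              /* use-after-free: write through a dangling pointer */
}

void double_free(void) {
    char *p = malloc(16);
    free(p);
    free(p);                 /* double-free: corrupts allocator metadata */
}
```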
None of this is to say that memory safety is a non-issue: quite the contrary. This data shows (unsurprisingly) that using a memory-unsafe language is likely to dramatically inflate the number of memory-safety vulnerabilities discovered in the resulting product, relative to other types of vulnerabilities. For embedded systems, this is of great concern, as memory-unsafe programming languages are often the primary method of development.
A survey conducted by JetBrains in 2021 found that the programming languages most strongly associated with embedded system development are Assembly, MATLAB, C, and C++. In embedded environments, memory-unsafe languages such as these can introduce catastrophic bugs and vulnerabilities, because most memory faults go unhandled in the absence of an operating system. What would be a program crash in an IT environment becomes a whole-system failure in an OT environment, as the sketch below illustrates.
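As a hedged illustration (the buffer, variable names, and memory layout here are invented), consider an off-by-one bug on a bare-metal microcontroller. With no MMU or operating system to trap the stray write, it silently corrupts whatever the linker happened to place next to the buffer, and the device misbehaves rather than crashing cleanly:

```c
#include <stdint.h>

/* Hypothetical bare-metal firmware state; names and layout are invented. */
static uint8_t rx_buf[8];
static volatile uint16_t motor_setpoint = 1000;  /* may sit next to rx_buf */

void on_uart_byte(uint8_t idx, uint8_t byte) {
    if (idx <= 8) {          /* off-by-one: idx == 8 is out of bounds    */
        rx_buf[idx] = byte;  /* no OS, no segfault: the write may land on
                                motor_setpoint and the motor runs away   */
    }
}
```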
The Great Rewrite
The solution presented by most of the government articles is a transition (albeit a gradual one) to memory-safe programming languages. Just how viable is this solution? How long would such a transition take to complete? Is it even possible?
GitHub contains over 420 million code repositories. Assuming the 52% figure presented by CISA et al. extends to all GitHub repositories (which is, admittedly, quite the assumption), we are looking at a minimum of 218.4 million repositories needing at least some level of rewrite. Assuming the average repository contains ~6k lines of code in a memory-unsafe language (again, a big assumption), just over 1.3 trillion total lines of code need to be rewritten. Some estimates suggest that over the past 20 years, 2.8 trillion lines of code have been written; that is roughly 140 billion lines per year, so 1.3 trillion lines amounts to about nine years of the entire world's output. In other words, if all software developers in the world stop whatever they are doing, learn Rust, and start working on The Great Rewrite, we will have eliminated memory unsafety by 2034!
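For the skeptical, here is the back-of-envelope math as a runnable sketch; every input is an assumption carried over from the paragraph above:

```c
#include <stdio.h>

int main(void) {
    double repos        = 420e6;         /* GitHub repositories            */
    double unsafe_share = 0.52;          /* CISA et al.'s 52% figure       */
    double loc_per_repo = 6e3;           /* assumed memory-unsafe LOC/repo */
    double loc_per_year = 2.8e12 / 20.0; /* ~140B lines written per year   */

    double unsafe_repos = repos * unsafe_share;
    double unsafe_loc   = unsafe_repos * loc_per_repo;

    printf("repos to rewrite:   %.1f million\n", unsafe_repos / 1e6);
    printf("lines to rewrite:   %.2f trillion\n", unsafe_loc / 1e12);
    printf("years at full tilt: %.1f\n", unsafe_loc / loc_per_year);
    return 0;
}
/* ~218.4 million repos, ~1.31 trillion lines, ~9.4 years -> ~2034. */
```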
Obviously, this is all a bit contrived. In their articles, CISA and the White House advocate for a gradual transition, where critical and well-adopted open-source libraries/packages are targeted first, rather than how I have presented things in my silly hypothetical. While this more targeted approach has potential, a Great Rewrite of a more global scope does not seem feasible.
One potential solution to this issue is automation. Efforts like DARPA's Translating All C to Rust (TRACTOR) program aim to eliminate memory-safety issues stemming from C/C++ programs by developing a system that automates the translation of legacy C code to Rust. Such a system would hasten the translation efforts advocated for by CISA and the White House.
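To appreciate why this is hard, consider the kind of ordinary C idiom such a translator must handle (the function below is invented for illustration). C leaves ownership and lifetime implicit; a translator targeting safe Rust must somehow infer them:

```c
#include <stdlib.h>
#include <string.h>

/* Who owns the returned pointer? It is sometimes an alias of `scratch`
   (a borrow, in Rust terms) and sometimes a heap allocation the caller
   must free (an owned value). A C-to-Rust translator has to recover
   this distinction, which the C type system never records. */
char *make_label(const char *name, char *scratch, size_t scratch_len) {
    if (strlen(name) + 1 <= scratch_len) {
        return strcpy(scratch, name);   /* alias of caller-owned storage */
    }
    return strdup(name);                /* heap copy; caller must free() */
}
```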
But even with the highest-quality code translation automation available, I imagine there would still be some (most likely a significant amount of) manual effort. All code, especially the code in critical open-source repositories, must be maintained, and usually extended. I doubt many open-source maintainers would allow an automated tool to rewrite large parts of their repository without at least some degree of manual verification.
Code translation of this nature also presents significant challenges in the case of legacy systems. The developers who understood the logic and intent behind the code in end-of-life systems have likely moved on to greener pastures, which makes verifying that a translation between programming languages has been successful quite difficult. In the OT/ICS/embedded space, anywhere from 20% to 70% of assets are end-of-life and no longer receiving patches or developer support.
None of this is to say that a rewrite of some degree is not warranted, nor that we should not be using memory-safe programming languages more than we are now. I am an avid user of Rust, both in my personal and professional life. I believe the guarantees it makes regarding memory safety are incredibly valuable, and that the language should be adopted at a much higher rate than it is today. But I also believe that there are different tools for different jobs. C and C++ are typically the best choice for projects where low-level memory manipulation and portability are prioritized (i.e., the qualities of embedded projects), and in many such cases choosing C/C++ over Rust saves time, keeps the code clean, and lets teams implement business rules more effectively. C/C++ will be used for many more decades to come, and as long as C/C++ exists, so too will memory-safety issues.
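A canonical example of the low-level manipulation in question is memory-mapped I/O, where C's model maps almost directly onto the hardware. (The register address below is invented; real addresses come from the part's datasheet. Rust can express this too, but only through unsafe code or vendor crates.)

```c
#include <stdint.h>

/* Hypothetical GPIO output data register; the address is made up. */
#define GPIO_ODR (*(volatile uint32_t *)0x48000014u)

void led_on(void) {
    GPIO_ODR |= (1u << 5);   /* drive pin 5 high by writing to hardware */
}
```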
An Argument for Symbolic Execution
The creator of C++, Bjarne Stroustrup, touched on this same topic a few years ago in an article titled A call to action: Think seriously about "safety"; then do something sensible about it. In it, Stroustrup argues, among other things, that static analysis could be used by the compiler to enforce coding conventions that prevent or alleviate memory-safety issues, as well as other types of safety issues more generally. To me, this approach feels more feasible than the nuclear option of rewriting entire libraries in another programming language, albeit with its own set of unique drawbacks.
For one, static analysis can only go so far: some program behavior cannot be determined until runtime. Languages like Rust get around this caveat through the ownership system, which restricts you to a subset of all the possible ways a program could be written, one the compiler can reason about statically. Languages like C and C++ would have to be redesigned to disallow certain capabilities in order to facilitate memory safety, changing the way C/C++ programs are allowed to be structured. These changes would likely be too fundamental to be toggleable as a compiler option.
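As a hedged illustration of the runtime-dependence problem (the program is invented), consider the fragment below. Whether the write is in bounds depends entirely on a value that exists only at runtime, so a purely static checker must either reject correct programs or admit incorrect ones:

```c
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    int table[16] = {0};
    if (argc < 2) return 1;

    /* `idx` comes from the user at runtime. Static analysis alone cannot
       decide whether table[idx] is in bounds; it must either assume the
       worst (flagging correct inputs) or the best (missing the bug). */
    long idx = strtol(argv[1], NULL, 10);
    table[idx] = 42;   /* out-of-bounds write when idx < 0 or idx > 15 */

    printf("%d\n", table[0]);
    return 0;
}
```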
Ultimately, the goal of both Stroustrup's approach and Rust's ownership system is the same: to prevent software vendors from releasing software with memory-safety vulnerabilities. Both achieve this by disallowing code from being written in particular ways. In both cases, legacy code would need drastic restructuring or translation to fit the newly enforced conventions, albeit with less manual effort required in Stroustrup's approach.
The approach I advocate for is symbolic execution. Symbolic execution is a method for executing a program abstractly, exploring the possible states a program can be in. In this way, symbolic execution enables a program’s runtime behavior to be analyzed and reported upon. Should the program encounter a state wherein memory safety is violated, symbolic execution lets us find and report this state, including the conditions necessary to reproduce it.
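To make this concrete, here is a hedged sketch of the kind of C function a symbolic executor analyzes (the function is invented). The engine treats `input` and `len` as symbolic values, forks at each branch, and asks a constraint solver for concrete inputs that reach the violation:

```c
#include <stddef.h>
#include <string.h>

void parse_packet(const unsigned char *input, size_t len) {
    char header[16];

    /* Path 1: len <= 4             -> returns, no violation.
       Path 2: len > 4, tag != 0x7f -> returns, no violation.
       Path 3: len > 4, tag == 0x7f -> copies len bytes into 16. A symbolic
       executor reports path 3 with the constraint set {len > 4,
       input[0] == 0x7f, len > 16} and a concrete witness, e.g. len = 17. */
    if (len <= 4) return;
    unsigned char tag = input[0];
    if (tag != 0x7f) return;

    memcpy(header, input, len);   /* stack-based overflow when len > 16 */
}
```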
Symbolic execution is fundamentally different from both Stroustrup's and Rust's approaches to the problem of memory safety because it does not require strict coding conventions to be followed; code only needs to change once a vulnerability has been discovered and is in need of remediation. In this way, symbolic execution better lends itself to the verification and analysis of legacy systems, where a Great Rewrite would be unsuitable.
The ObjectSecurity OT.AI Platform utilizes symbolic execution to detect memory-safety violations such as stack-based buffer overflows, heap-based buffer overflows, double-frees, and others. The platform performs post-build binary analysis, inspecting a program's runtime behavior as it would occur on the CPU. Its vulnerability analysis capabilities can be integrated directly into your CI/CD pipeline to ensure that a memory-safety vulnerability never makes its way into your product.