Why is software so vulnerable, and what can be done?
CAMBRIDGE, MA—The annual global tab of $73.7 billion for security-related hardware, software and service expenses still leaves many enterprises at risk and exposed to sophisticated and stealthy cybercrime attacks for the simple reason that those enterprises lack adequate visibility into their software vulnerabilities. By some estimates even small applications have hundreds of vulnerabilities. Lowering those numbers would bring many advantages, such as reducing the number of backdoors that can be hacked by cybercriminals.
To address this challenge, Draper has developed a new approach to testing software for vulnerabilities that takes advantage of the latest developments in machine learning and neural networks. Inadequate testing is one of the main reasons software is typically delivered with approximately two to seven defects per thousand lines of code, according to the Software Engineering Institute. While tests cannot catch every vulnerability, closing the gap is critical and is the foundation of Draper’s approach.
Called DeepCode, Draper’s neural network-based machine-learning system can comb through computer programs and learn their general properties. By using detection models that can learn from patterns of vulnerabilities that exist across different codebases, Draper’s system can recognize similar patterns in untested or newly written programs and produce repairs to eliminate those vulnerabilities. The goal is to promote the use (and reuse) of well-tested, well-analyzed code, and thus to reduce the incidence of exploitable vulnerabilities.
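To make the idea of learning vulnerability patterns from labeled code concrete, here is a minimal sketch in Python. It is not Draper’s system: it swaps the neural network for a simple naive Bayes model over identifier tokens, and the training snippets, labels and function names (`tokenize`, `train`, `score`) are hypothetical, chosen only for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical, hand-labeled toy corpus of C snippets (1 = risky, 0 = safer).
# This is illustrative data only, not drawn from any real vulnerability set.
TRAINING = [
    ("strcpy(buf, input);", 1),
    ("gets(line);", 1),
    ("sprintf(out, fmt, name);", 1),
    ("strncpy(buf, input, sizeof(buf) - 1);", 0),
    ("fgets(line, sizeof(line), stdin);", 0),
    ("snprintf(out, sizeof(out), fmt, name);", 0),
]


def tokenize(code):
    """Split C source text into identifier and keyword tokens."""
    return re.findall(r"[A-Za-z_]\w*", code)


def train(samples):
    """Count how often each token appears in risky and safer snippets."""
    counts = {0: Counter(), 1: Counter()}
    docs = Counter()
    for code, label in samples:
        counts[label].update(tokenize(code))
        docs[label] += 1
    vocab = set(counts[0]) | set(counts[1])
    totals = {label: sum(counts[label].values()) for label in counts}
    return counts, totals, docs, vocab


def score(code, model):
    """Return the log-odds that a snippet is risky (positive = flag it)."""
    counts, totals, docs, vocab = model
    log_odds = math.log(docs[1] / docs[0])
    for tok in tokenize(code):
        if tok not in vocab:
            continue  # tokens never seen in training carry no evidence
        # Laplace smoothing keeps unseen class/token pairs from zeroing out.
        p_risky = (counts[1][tok] + 1) / (totals[1] + len(vocab))
        p_safe = (counts[0][tok] + 1) / (totals[0] + len(vocab))
        log_odds += math.log(p_risky / p_safe)
    return log_odds


model = train(TRAINING)
for snippet in ("strcpy(dest, src);", "fgets(dest, sizeof(dest), stdin);"):
    verdict = "flag" if score(snippet, model) > 0 else "ok"
    print(f"{verdict:4}  {snippet}")
```

The two snippets scored at the end use variable names that never appear in the training data, so the verdict rests only on the learned function-call patterns. A production system of the kind described above would learn much richer representations from far larger codebases.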
“Draper’s system has the potential to make a dramatic difference in reducing vulnerabilities—by stopping them before they occur, by finding them before they are exploited or by reducing their impact,” said Jeff Opper, Program Manager, Special Programs at Draper. “These techniques have the potential to detect vulnerabilities and even detect new classes of vulnerabilities.”
At the invitation of DARPA under its Mining and Understanding Software Enclaves (MUSE) program, Draper engineers recently demonstrated their machine-learning system’s ability to identify and repair software vulnerabilities in two widely deployed programming languages, C and C++, which are used in televisions, airplanes, e-commerce websites and thousands of software applications.
Draper, which has amassed a large body of vulnerability, exploit, malware, rootkit and backdoor information, provides cyber security capabilities to commercial, government and non-profit customers who are increasingly concerned about the next cyber threat. “Our machine-learning system can serve customers as a vulnerability detection and repair engine,” Opper said.
Draper has previously applied its multidisciplinary engineering capabilities to a variety of related programs including inherently secure processors; machine learning to combat online extremism, cyberbullying and other abuse of social media applications; and cryptographically encoded, high-bandwidth communications for UAVs.
Over the past 10 years, Draper has drawn on its design knowledge of miniature systems and real-time embedded systems to develop capabilities for assessing software vulnerabilities and securing electronic systems. Additionally, Draper has demonstrated secure networks featuring over-the-air keying to realize cryptographically encoded, high-bandwidth communications for UAVs and other applications. These complementary capabilities and technologies provide robust security solutions that guard critical embedded systems against cyber, reverse engineering and other attacks, and ensure that critical information can be protected and delivered in a timely and accurate manner.
Draper combines domain expertise with knowledge of how to apply the latest analytics techniques, extracting meaningful information from raw data to better understand complex, dynamic processes. Our system design approach encompasses effective organization and processing of large data sets, automated analysis using algorithms, and exploitation of results. To facilitate user interaction with these processed data sets, Draper applies advanced techniques to automate understanding and correlation of patterns in the data. Draper’s expertise encompasses machine learning (including deep learning), information fusion from diverse and heterogeneous data sources, optimized coupling of data acquisition and analysis, and novel methods for analysis of imagery and video data.