Written by: Hamid Karimi
Application proliferation means exponential growth in attacks
Rapid adoption of cloud computing and web services has produced myriad use cases in every market segment. These use cases are served by purpose-built applications running on many operating platforms. Application development is growing rapidly, and this creates a perfect security storm: as the number and complexity of applications grow, security practices struggle to keep pace. While static code analysis offers some relief, it is far from perfect and requires access to the application’s source code. Enter fuzzing, which can blunt the rising peak of application vulnerabilities.
Is fuzzing a process or a tool?
It is both.
Fuzzing is analogous to identifying gene mutations in the human body that can cause cancer or rare diseases before there is any sign of ailment. In computing, it is a form of dynamic analysis that aims to uncover unknown vulnerabilities in an application’s input-parsing code long before an exploit can take advantage of them.
In other words, fuzzing deliberately departs from well-formed input in order to provoke exceptions in the application’s response. But which data permutations can a hacker use to discover blind spots in the shortest possible time? Given infinite time, every possible input combination could be generated and applied at an application’s ingress point to elicit a response and discover weaknesses in the code. However, time is a luxury that a security tester cannot afford. Attackers increasingly use automated scripts to probe the soft edges of cloud and web applications, and that is precisely the approach an ethical hacker must use to conduct targeted fuzzing.
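To make the time constraint concrete, here is a minimal sketch of a mutation-based fuzzing loop that runs against a fixed time budget instead of enumerating every input. Everything in it is an assumption made for illustration: `parse_record` is a hypothetical toy target with a planted bug, and `mutate` and `fuzz` are made-up helper names, not part of any real fuzzer.

```python
import random
import time


def parse_record(data: bytes) -> int:
    """Hypothetical target: a toy length-prefixed parser with a planted bug."""
    if len(data) < 2:
        raise ValueError("record too short")  # clean rejection of bad input
    length = data[0]
    payload = data[1:1 + length]
    # Planted bug: trusts the length byte instead of checking len(payload).
    return payload[length - 1]


def mutate(seed: bytes, rng: random.Random) -> bytes:
    """Apply one random edit to a non-empty seed: bit flip, insert, or delete."""
    buf = bytearray(seed)
    pos = rng.randrange(len(buf))
    edit = rng.randrange(3)
    if edit == 0:
        buf[pos] ^= 1 << rng.randrange(8)
    elif edit == 1:
        buf.insert(pos, rng.randrange(256))
    else:
        del buf[pos]
    return bytes(buf)


def fuzz(target, seed: bytes, budget_seconds: float, rng_seed: int = 0) -> dict:
    """Mutate `seed` until the time budget runs out; return unique crashes."""
    rng = random.Random(rng_seed)
    deadline = time.monotonic() + budget_seconds
    crashes = {}  # crashing input -> exception type name
    while time.monotonic() < deadline:
        candidate = mutate(seed, rng)
        try:
            target(candidate)
        except ValueError:
            pass  # the parser rejected the input cleanly; not a finding
        except Exception as exc:
            crashes[candidate] = type(exc).__name__
    return crashes
```

Calling `fuzz(parse_record, b"\x03abc", budget_seconds=0.2)` returns a dictionary of inputs that triggered an unhandled exception. The well-formed seed itself parses without error, so every finding comes from a mutation.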
The first weapon in a hacker’s arsenal is a simple scanning tool that identifies unpatched vulnerabilities. Fuzzing as an attack technique, on the other hand, requires far more patience and know-how, because it goes beyond discovering common vulnerabilities.
On a more detailed level, fuzzing is like a debugging sandbox with its own set of requirements and features. The goal of the fuzzer goes beyond identifying points of failure: it entails actually manipulating the application into performing tasks for which it was not designed, a sort of undocumented feature discovery. Case in point: in a recent connected car project, a commercial fuzzer discovered that the central power could be shut down by anomalous data injected into the USB port. This is the kind of bug that an attacker could exploit with deadly consequences.
Lastly, the fuzzing report must be detailed and actionable. At a minimum, the fuzzer must list the causes of each buffer overflow, format string error, or memory exception it finds. Furthermore, if the report claims the presence of a bug, the fuzzer must show evidence of how to reproduce the error in a predictable way.
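As a rough illustration of what "reproducible and actionable" could mean in practice, the sketch below re-runs a suspect input several times and records the exact bytes, the exception type, and where it was raised. The `Finding` fields, the `triage` and `to_report` helpers, and the `read_header` target are all hypothetical names chosen for this example. A real fuzzer's report format will differ.

```python
import json
import traceback
from dataclasses import asdict, dataclass


@dataclass
class Finding:
    input_hex: str    # exact bytes that trigger the failure, hex-encoded
    exception: str    # exception type name
    message: str      # exception message
    location: str     # function and line where the exception was raised
    reproduced: int   # how many re-runs failed
    attempts: int     # how many re-runs were made


def read_header(data: bytes) -> int:
    """Hypothetical target with a planted bug: no bounds check."""
    return data[4]


def triage(target, data: bytes, attempts: int = 5):
    """Re-run `data` against `target`; return a Finding if it ever fails."""
    hits, last = 0, None
    for _ in range(attempts):
        try:
            target(data)
        except Exception as exc:
            hits += 1
            last = exc
    if last is None:
        return None  # could not reproduce, so nothing to report
    frame = traceback.extract_tb(last.__traceback__)[-1]
    return Finding(
        input_hex=data.hex(),
        exception=type(last).__name__,
        message=str(last),
        location=f"{frame.name}:{frame.lineno}",
        reproduced=hits,
        attempts=attempts,
    )


def to_report(findings) -> str:
    """Serialize findings as JSON so they can be filed or replayed later."""
    return json.dumps([asdict(f) for f in findings], indent=2)
```

A finding whose `reproduced` count equals `attempts` is deterministic. Anyone can replay it by decoding `input_hex` with `bytes.fromhex` and feeding the result back to the target.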
Machine learning’s role
Successful fuzzing requires identifying the protocols used by the application and then defining the test cases to execute.
Test cases can be flawed as well, and only statistically sound use of them will tell the user which ones are likely to find relevant exceptions. Here one can go further and deploy leading-edge techniques such as machine learning to perform meta-analysis of the data through complex correlations. With machine learning, test cases are built so that their execution time and likely response can be predicted, which matters when results are needed under time constraints and fuzzing must run with minimal human input.
The next step is to decide which test cases to fuzz after “learning” from their feature sets. Once they are identified, the first order of business is to mutate different input files to provoke random and unexpected responses, leading to the discovery of unique bugs. Machine learning then provides data for constructing subsequent test cases, reducing the need for user input. In that sense, automation and machine learning are synonymous.
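The idea of letting earlier results steer later test cases can be sketched with a much simpler stand-in than a full machine-learning model: an epsilon-greedy rule that favors whichever seed input has produced the most failures so far. This is an illustrative substitute, not the technique any particular product uses. `toy_parser`, `flip_bit`, `pick_seed`, and `campaign` are hypothetical names.

```python
import random


def toy_parser(data: bytes) -> int:
    """Hypothetical target: 'K' records carry an index into their value."""
    if not data.startswith(b"K"):
        raise ValueError("unknown record tag")  # clean rejection
    index = data[1]
    # Planted bug: index is never checked against the value length.
    return data[2:][index]


def flip_bit(seed: bytes, rng: random.Random) -> bytes:
    """Flip one random bit of a non-empty seed."""
    buf = bytearray(seed)
    pos = rng.randrange(len(buf))
    buf[pos] ^= 1 << rng.randrange(8)
    return bytes(buf)


def pick_seed(stats: dict, rng: random.Random, epsilon: float = 0.1) -> bytes:
    """Mostly exploit the seed with the best smoothed hit rate; sometimes explore."""
    if rng.random() < epsilon:
        return rng.choice(list(stats))
    return max(stats, key=lambda s: (stats[s]["hits"] + 1) / (stats[s]["runs"] + 2))


def campaign(target, seeds, rounds: int, rng_seed: int = 0) -> dict:
    """Run `rounds` mutations, learning which seeds are productive."""
    rng = random.Random(rng_seed)
    stats = {s: {"runs": 0, "hits": 0} for s in seeds}
    for _ in range(rounds):
        seed = pick_seed(stats, rng)
        stats[seed]["runs"] += 1
        try:
            target(flip_bit(seed, rng))
        except ValueError:
            pass  # rejected cleanly; tells us nothing new
        except Exception:
            stats[seed]["hits"] += 1  # unexpected failure: reward this seed
    return stats
```

With the seeds `b"K\x00abcd"` and `b"Zzzzzz"`, the second can never reach the vulnerable branch through a single bit flip, so the loop gradually spends most of its budget on the first without anyone telling it to.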
What fuzzer to acquire
There are two types of fuzzers on the market: open source and commercial.
Open source fuzzers are readily accessible to potential hackers, whereas commercial tools track the user’s identity. Commercial fuzzers also offer far more automation and easy-to-deploy toolkits, in addition to actionable reporting.
In general, the key attributes of an effective fuzzer are:
- Protocol inclusivity encompassing both open and proprietary protocols
- Stateful inspection that keeps track of the last state of each test case and logs it
- A grammar suitable for input fuzzing, using both simple data and network-based machine-learning techniques
- Ability to learn during execution and self-adjust
- Capability to conduct comprehensive fuzzing campaigns in the most time-efficient manner
- Built-in automation to minimize user input
- History of use cases and deployments that indicate broad adoption by multiple market segments
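The stateful inspection attribute in the list above can be illustrated with a small sketch. It assumes a hypothetical three-step protocol (`HELLO`, `AUTH`, `DATA`) implemented by a made-up `ToySession` class. The harness logs the state the target was in before each message, so a failure can be tied to the state in which it occurred.

```python
class ToySession:
    """Hypothetical stateful target: HELLO, then AUTH, then DATA."""

    def __init__(self):
        self.state = "INIT"

    def handle(self, msg: bytes) -> str:
        if self.state == "INIT" and msg == b"HELLO":
            self.state = "GREETED"
        elif self.state == "GREETED" and msg.startswith(b"AUTH "):
            self.state = "AUTHED"
        elif self.state == "AUTHED" and msg.startswith(b"DATA "):
            # Planted bug: assumes at least one payload byte after "DATA ".
            return chr(msg[5])
        else:
            raise ValueError(f"unexpected message in state {self.state}")
        return self.state


def run_sequence(messages):
    """Send messages in order; log (step, state before, message) for each.

    Returns the log and the name of any unexpected exception (or None).
    """
    session = ToySession()
    log = []
    for step, msg in enumerate(messages):
        log.append((step, session.state, msg))
        try:
            session.handle(msg)
        except ValueError:
            return log, None  # protocol rejected the message cleanly
        except Exception as exc:
            return log, type(exc).__name__
    return log, None
```

Sending `b"DATA "` on its own is rejected cleanly in the `INIT` state. The same message only crashes the target after a valid `HELLO` and `AUTH`, and the last log entry records that the session was in `AUTHED` when it happened.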
The security business is an arms race. Fuzzing is a tool available to hackers for developing zero-day attacks. It is imperative that businesses get ahead of the curve and adopt fuzzing as a standard process before applications are released to the market.
About the Author – Hamid Karimi has extensive knowledge of cyber security. For the past 15 years his focus has been exclusively on the security space, covering diverse areas including cryptography, strong authentication, vulnerability management, malware threats, and cloud and network protection. Hamid holds a Bachelor of Science degree in electrical and computer engineering from San Francisco State University. He is the VP of Business Development at Beyond Security, a provider of automated security testing solutions including vulnerability management, based in Cupertino, CA.