Google Award to make widely used software testing technique more effective

Baris Kasikci plans to improve software fuzzers by learning how deployed software is most commonly run by users.

Prof. Baris Kasikci has received a Google Award for a project that could make the most widely used technique in software testing more effective. Called software fuzzing, the technique builds on the core aim of all software testing – covering as many different inputs as possible – by feeding the program a wide range of randomized inputs.

“Fuzzers have evolved a lot – this is the hottest area of research in terms of bug discovery,” says Kasikci. In fact, fuzzing is the most widely employed software quality assurance technique in the world. Virtually all major software companies rely heavily on fuzzers for bug detection, particularly a modern iteration of the technique called coverage-driven fuzzing.

Coverage-driven fuzzing, rather than choosing a truly random array of inputs, strategically directs the inputs to ensure that as much of the program’s actual code as possible is executed in the tests. The goal is to make sure that no bug stays hidden in a block of code that a purely randomized test happened to miss.

“The idea is that the more code you expose to different inputs,” Kasikci explains, “the more likely it is that you’ll encounter bugs.”
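
As a rough illustration of the coverage-driven loop described above, here is a minimal Python sketch. It is not Kasikci's tool or any production fuzzer: the `target` program, its branch labels, and the one-byte `mutate` step are all hypothetical stand-ins. The fuzzer keeps a mutated input only if it makes the program execute a branch that no earlier input reached.

```python
import random

def target(data: bytes) -> set:
    """Hypothetical program under test; returns the IDs of the branches it executed."""
    covered = {"entry"}
    if len(data) > 0 and data[0] == ord("F"):
        covered.add("F")
        if len(data) > 1 and data[1] == ord("U"):
            covered.add("FU")
            if len(data) > 2 and data[2] == ord("Z"):
                covered.add("FUZ")
    return covered

def mutate(data: bytes, rng: random.Random) -> bytes:
    """Replace one randomly chosen byte with a random value (assumes non-empty input)."""
    buf = bytearray(data)
    buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)

def fuzz(seed_input: bytes, iterations: int, rng: random.Random):
    """Keep only the inputs that reach branches not seen before."""
    corpus = [seed_input]
    seen = set(target(seed_input))
    for _ in range(iterations):
        candidate = mutate(rng.choice(corpus), rng)
        new_branches = target(candidate) - seen
        if new_branches:
            corpus.append(candidate)  # this input is worth mutating further
            seen |= new_branches
    return corpus, seen

corpus, seen = fuzz(b"\x00\x00\x00", 20000, random.Random(0))
print(len(corpus), sorted(seen))
```

Because every saved input is a stepping stone toward deeper branches, the loop tends to reach nested code far sooner than purely random inputs would.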

Kasikci’s proposal explores a shortcoming of this technique, namely that covering a large share of a program’s code doesn’t guarantee that you’ve found all of its bugs.

“Because code structure is very complex, just because you cover a line of code doesn’t mean it’s not buggy. The state in a program doesn’t necessarily depend on what line of code gets executed, but what sequence of lines get executed in the entire life of the program.”

He concludes that an effective fuzzer should instead focus on path coverage, which seeks to use diverse inputs to expose the program to the largest number of execution paths possible. But path coverage is harder to measure than naive code coverage, because the number of possible paths grows rapidly with every additional branch.
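
The gap between the two measures can be seen in a toy Python example. The `run` function below is a hypothetical program with two independent branches, written only for illustration: one branch sets a bad value, and the other branch is the only place that value causes a failure.

```python
def run(a: bool, b: bool):
    """Hypothetical program; returns (path taken, whether it crashed)."""
    trace = []
    divisor = 1
    if a:
        trace.append("A1")
        divisor = 0              # bad state is set here...
    else:
        trace.append("A0")
    crashed = False
    if b:
        trace.append("B1")
        try:
            10 // divisor        # ...but only causes a failure here
        except ZeroDivisionError:
            crashed = True
    else:
        trace.append("B0")
    return tuple(trace), crashed

def summarize(inputs):
    """Collect line coverage, path coverage, and crashing inputs for a test set."""
    lines, paths, crashes = set(), set(), []
    for a, b in inputs:
        trace, crashed = run(a, b)
        lines.update(trace)
        paths.add(trace)
        if crashed:
            crashes.append((a, b))
    return lines, paths, crashes

lines, paths, crashes = summarize([(True, False), (False, True)])
print(len(lines), len(paths), crashes)
```

Those two inputs execute all four labeled lines yet exercise only two of the four possible paths, and neither one crashes. The bug appears only on the path where both branches are taken, which is the kind of case line coverage alone cannot flag.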

To achieve this, Kasikci will use data from executions of actual programs deployed in production systems, from phones to cloud servers. By monitoring these systems as they run, he’ll be able to identify which paths in common programs are used the most frequently by end users.

“You ideally want to find bugs in code that users actually use,” Kasikci says. “You don’t want to blindly increase path coverage.”
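
The article does not describe how the production data would feed into the fuzzer, so the following Python sketch is only one plausible reading, under stated assumptions: paths observed in production are counted, and candidate inputs are ranked by how often their path shows up in that data. The function names, the path tuples, and the sample counts are all invented for illustration.

```python
from collections import Counter

def path_weights(production_traces):
    """Turn observed production paths into relative frequencies."""
    counts = Counter(production_traces)
    total = sum(counts.values())
    return {path: n / total for path, n in counts.items()}

def prioritize(candidates, weights):
    """Order (input, path) pairs so inputs on commonly used paths come first.

    Paths never seen in production get weight 0 and sort last.
    """
    return sorted(candidates, key=lambda c: weights.get(c[1], 0.0), reverse=True)

# Hypothetical production data: 8 runs took one path, 2 took another.
traces = [("login", "view")] * 8 + [("login", "export")] * 2
weights = path_weights(traces)

candidates = [
    ("input_a", ("login", "export")),
    ("input_b", ("login", "view")),
    ("input_c", ("admin", "purge")),   # never observed in production
]
for name, path in prioritize(candidates, weights):
    print(name, path, weights.get(path, 0.0))
```

Under this scheme a fuzzer would spend its mutation budget first on inputs that follow the paths real users take, rather than treating every new path as equally valuable.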

“Collaborating with Google for this project is ideal,” says Kasikci. The company readily collects a number of hardware and software events on its servers for performance optimization. Kasikci hopes to make this data dual-purpose, enabling both performance and software quality improvements.

“It is important from a practical perspective to focus your resources on finding bugs that matter to people,” Kasikci says. “You can use other techniques to find obscure bugs, but they don’t impact actual users as much. The goal is to take this fuzzing tool and use it in a way that’s more productive.”