Techniques to facilitate probabilistic software analysis in real-world programs

Main Author: BORGES, Mateus Araújo
Other Authors: D'AMORIM, Marcelo Bezerra
Format: Master's thesis
Language: Portuguese
Published: Universidade Federal de Pernambuco, 2016
Subjects:
Online Access: https://repositorio.ufpe.br/handle/123456789/14932
Summary: Probabilistic software analysis aims at quantifying how likely a target event is to occur, given a probabilistic characterization of the behavior of a program or of its execution environment. Examples of target events include an uncaught exception, the invocation of a certain method, or the access to confidential information. The technique collects constraints on the inputs that lead to the target events and analyzes them to quantify how likely an input is to satisfy those constraints. Current techniques either handle only linear constraints or support continuous distributions only through a “discretization” of the input domain, leading to imprecise and costly results. This work proposes an iterative distribution-aware sampling approach to support probabilistic symbolic execution for arbitrarily complex mathematical constraints and continuous input distributions. We follow a compositional approach, in which the symbolic constraints are decomposed into sub-problems that can be solved independently. At each iteration, the convergence rate of the computation is increased by automatically refocusing the analysis on estimating the sub-problems that most affect the accuracy of the results, as guided by three different ranking strategies. Experiments on publicly available benchmarks show that the proposed technique improves on previous approaches in both scalability and accuracy.
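The abstract describes, at a high level, an estimation loop: decompose the path constraints into independent sub-problems, estimate each one by sampling from the input distributions, and iteratively redirect the sampling budget to the sub-problems that most limit accuracy. The Python sketch below illustrates only that general idea under stated assumptions; the toy constraint predicates, the standard-normal input distribution, and the variance-based ranking are illustrative stand-ins, not the specific algorithm or ranking strategies developed in the thesis.

import math
import random

# Toy, mutually exclusive "path conditions" with nonlinear constraints over a
# single standard-normal input. In probabilistic symbolic execution these would
# come from the symbolic constraints collected along program paths.
SUBPROBLEMS = [
    lambda x: x > 1.0 and math.sin(x) > 0.9,            # path 1: positive range
    lambda x: -1.0 <= x <= 1.0 and x * x < 0.25,        # path 2: near zero
    lambda x: x < -1.0 and math.exp(x) * x * x > 0.5,   # path 3: negative range
]

def estimate(constraint, n):
    """Estimate P(constraint) and the estimator's variance from n samples."""
    hits = sum(constraint(random.gauss(0.0, 1.0)) for _ in range(n))
    p = hits / n
    return p, p * (1 - p) / n

def iterative_estimation(subproblems, batch=1000, rounds=10):
    # Seed each sub-problem with one batch of samples, then keep refocusing the
    # sampling budget on the sub-problem whose estimate is currently the least
    # accurate (largest estimator variance) -- one simple ranking heuristic,
    # standing in for the ranking strategies the abstract alludes to.
    stats = []  # per sub-problem: [accumulated hits, number of samples]
    for c in subproblems:
        p, _ = estimate(c, batch)
        stats.append([p * batch, batch])
    for _ in range(rounds):
        variances = [(h / n) * (1 - h / n) / n for h, n in stats]
        worst = max(range(len(subproblems)), key=lambda i: variances[i])
        p, _ = estimate(subproblems[worst], batch)
        stats[worst][0] += p * batch
        stats[worst][1] += batch
    # Because the toy path conditions are disjoint, the target-event
    # probability is the sum of the per-path estimates.
    return sum(h / n for h, n in stats)

if __name__ == "__main__":
    print(f"Estimated target-event probability: {iterative_estimation(SUBPROBLEMS):.4f}")

The refocusing step is where different ranking strategies could plug in; the disjointness of path conditions, assumed here so that per-path estimates simply sum, mirrors the way symbolic execution partitions the input space by path.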