Modern Cyber-Physical Systems (CPSs) that need to perform complex control tasks (e.g., autonomous driving) are increasingly using AI-enabled controllers, mainly based on deep neural networks (DNNs). The quality assurance of such types of systems is of vital importance. However, their verification can be extremely challenging, due to their complexity and uninterpretable decision logic. Falsification is an established approach for CPS quality assurance, which, instead of attempting to prove the system correctness, aims at finding a time-variant input signal violating a formal specification describing the desired behavior; it often employs a search-based testing approach that tries to minimize the robustness of the specification, given by its quantitative semantics. However, guidance provided by robustness is mostly black-box and only related to the system output, but does not allow to understand whether the temporal internal behavior determined by multiple consecutive executions of the neural network controller has been explored sufficiently. To bridge this gap, in this paper, we make an early attempt at exploring the temporal behavior determined by the repeated executions of the neural network controllers in hybrid control systems and first propose eight time-aware coverage criteria specifically designed for neural network controllers in the context of CPS, which consider different features by design: the simple temporal activation of a neuron, the continuous activation of a neuron for a given duration, and the differential neuron activation behavior over time. Second, we introduce a falsification framework, named FalsifAI FalsifAI, that exploits the coverage information for better falsification guidance. Namely, inputs of the controller that increase the coverage (so improving the exploration of the DNN behaviors), are prioritized in the exploitation phase of robustness minimization. Our large-scale evaluation over a total of 3 typical CPS tasks, 6 system specifications, 18 DNN models and more than 12,000 experiment runs, demonstrates 1) the advantage of our proposed technique in outperforming two state-of-the-art falsification approaches, and 2) the usefulness of our proposed time-aware coverage criteria for effective falsification guidance.
All Science Journal Classification (ASJC) codes