Low-Overhead Timely Progress Assessment

Low-overhead Online Assessment of Timely Progress as a Commodity

Abstract

The correctness of safety-critical systems depends on both their logical and temporal behavior. Control-flow integrity (CFI) is a well-established and well-understood technique to safeguard the logical flow of safety-critical applications. Unfortunately, no established methodologies exist for the complementary problem of detecting violations of control-flow timeliness. Worse yet, this latter dimension, which we term Timely Progress Integrity (TPI), is increasingly jeopardized as the complexity of embedded systems continues to soar. As key resources of the memory hierarchy become shared by several CPUs and accelerators, they turn into hard-to-analyze performance bottlenecks, and the precise interplay between software and hardware components becomes hard to predict and reason about. How can control over timely progress integrity be restored? We postulate that the first stepping stone toward TPI is to develop methodologies for Timely Progress Assessment (TPA). TPA refers to the ability of a system to live-monitor the positive or negative slack, with respect to a known reference, at key milestones throughout an application’s lifespan. In this paper, we propose one such methodology, called Milestone-Based Timely Progress Assessment, or MB-TPA for short. Among the key design principles of MB-TPA are the ability to operate on black-box binary executables with near-zero overhead and the ability to be implemented on commercial platforms. To prove its feasibility and effectiveness, we propose and evaluate a full-stack implementation called Timely Progress Assessment with 0 Overhead (TPAw0v). We demonstrate its capability to provide live TPA for complex vision applications while introducing less than 0.6% overhead. Finally, we demonstrate one use case where TPA information is used to restore TPI in the presence of temporal interference over shared memory resources.

Milestones

Mar, 2023
Paper submitted

Aug, 2022
Code Milestone

Real-time process tracing working

Jan, 2022
Project Milestone

Research Started

Overview

This project proposes Timely Progress Assessment (TPA) as a first step toward guaranteeing Timely Progress Integrity (TPI) in complex computing systems. TPA is a system-level solution that operates on black-box binaries and informs the operating system of the expected and detected progress of its applications. The proposed Milestone-Based Timely Progress Assessment (MB-TPA) relies on binary analysis and on-chip tracing subsystems to detect the timely completion of intermediate progress milestones within an application. MB-TPA introduces negligible overhead to the monitored application and can provide live progress assessment even when a low-power CPU is used to monitor a high-performance CPU. The paper demonstrates the feasibility of online progress assessment for black-box applications on commercial platforms, without source code instrumentation, and showcases three use cases. Its contributions include proposing the concept of TPI, formulating a theoretical approach, providing a full-stack proof-of-concept implementation, and demonstrating the feasibility and usefulness of TPA in complex computing systems.
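To make the notion of slack at a milestone concrete, the following is a minimal sketch and not the TPAw0v implementation. It assumes a hypothetical trace format of (address, timestamp) events and a hypothetical `Milestone` record holding a reference time; all names and units are illustrative only.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Sequence, Tuple


@dataclass(frozen=True)
class Milestone:
    """Hypothetical milestone: a code address and its reference arrival time."""
    address: int        # address in the binary that marks the milestone (assumed)
    reference_us: int   # expected time since application start, in microseconds


def slack_us(milestone: Milestone, observed_us: int) -> int:
    """Positive slack means ahead of the reference; negative means behind."""
    return milestone.reference_us - observed_us


def assess(
    milestones: Sequence[Milestone],
    trace: Iterable[Tuple[int, int]],
) -> Iterator[Tuple[int, int]]:
    """Yield (address, slack) each time the next expected milestone is reached.

    `trace` is an assumed stream of (address, timestamp_us) events, standing in
    for whatever the on-chip tracing subsystem would actually deliver.
    """
    expected = iter(milestones)
    current = next(expected, None)
    for address, timestamp_us in trace:
        if current is None:
            break  # all milestones have been observed
        if address == current.address:
            yield address, slack_us(current, timestamp_us)
            current = next(expected, None)
```

For example, with milestones at addresses 0x1000 (reference 100 µs) and 0x2000 (reference 250 µs), a trace that reaches them at 90 µs and 300 µs yields a slack of +10 µs followed by -50 µs. The negative value signals that the application has fallen behind its reference at the second milestone.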

Results

Prompted by the demand for high-performance embedded platforms, modern systems-on-chip have grown in complexity at the expense of software predictability and timeliness. While the vast majority of previous research has concentrated on managing the worst case, we argue that reasoning about the progress of live applications must be a key requirement for achieving Timely Progress Integrity.

In our work, we proposed a theoretical formulation of our approach, called MB-TPA, and presented a prototype, TPAw0v, that is feasible on widely available commercial platforms featuring tracing capabilities. Our experiments showed that the prototype successfully tracks the progress of applications with near-zero overhead, even while running on a lower-performance core. Moreover, through the prototype implementation, we demonstrated the capability of our model to detect execution anomalies and to enforce corrective measures that preserve TPI. We envision the contributions of this work as the first building blocks toward more elaborate real-time policies with TPI at their core.

References

The complete paper is currently under review for a conference. Once the review concludes, we will publish the paper and all related materials on this page. Stand by!