Failure Assertion Programming
What is it?
Failure assertion programming is the disciplined use of executable checks—called assertions—to verify that specified pre-conditions hold before a block of code runs and that post-conditions hold after it completes. If a condition is violated, the software detects the error immediately and follows a defined safe reaction rather than continuing with potentially corrupted state. The technique is simple to apply, highly reviewable, and focuses developer attention on the assumptions that keep the system safe.
How it supports functional safety
Assertions directly target systematic failures that stem from requirements, design, or implementation defects: they stop the program from silently proceeding when assumptions are broken. They also surface manifestations of random or common-cause hardware faults as invalid data (e.g., out-of-range sensors, stale timestamps), ensuring the safety function does not act on corrupted information. Because the reaction is explicit and deterministic, the system can log evidence and transition to a safe state or a controlled degraded mode.
When to use
- Interfaces and data hand-offs where contracts (valid ranges, freshness, units) must be enforced.
- Safety-critical control paths whose outputs must stay within a documented safe envelope.
- State machines and mode management that rely on invariants (e.g., mutually exclusive states).
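As an illustration of the state-machine case, the following minimal sketch checks a "mutually exclusive modes" invariant and forces a safe mode on violation. The mode names, the `mode_flags_t` type, and both function names are hypothetical and chosen only for this example; a real project would take them from its own mode-management design.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mode flags, for illustration only. */
typedef struct {
    bool idle;
    bool running;
    bool safe_stop;
} mode_flags_t;

/* Invariant: exactly one mode flag is set at any time. */
bool mode_invariant_holds(const mode_flags_t *m)
{
    if (m == NULL) {
        return false;
    }
    uint8_t active = (uint8_t)((m->idle ? 1u : 0u) +
                               (m->running ? 1u : 0u) +
                               (m->safe_stop ? 1u : 0u));
    return active == 1u;
}

/* Checks the invariant. On violation, forces the safe-stop mode
 * (the assumed safe reaction) and returns false. */
bool check_mode_invariant(mode_flags_t *m)
{
    if (mode_invariant_holds(m)) {
        return true;
    }
    if (m != NULL) {
        m->idle = false;
        m->running = false;
        m->safe_stop = true;
    }
    return false;
}
```

A periodic monitor could call `check_mode_invariant()` once per cycle and log a violation whenever it returns false.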
Inputs & Outputs
Inputs
- Defined contracts: pre-conditions, post-conditions, invariants linked to hazards/requirements.
- A documented safe-reaction strategy (safe state or controlled degradation) with timing bounds.
Outputs
- Deterministic error reports (logs, diagnostic counters, DTCs) with trace to the violated contract.
- Safe reaction executed: inhibit actuation, hold last safe output, or enter safe state.
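One possible way to keep reports traceable is a fixed table that links each contract to a requirement tag and a reaction. The sketch below assumes three contracts, placeholder requirement tags (`SR-EXAMPLE-n`), and hypothetical function names; none of these come from a specific standard or library.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical contract identifiers, for illustration only. */
typedef enum {
    CONTRACT_SENSOR_FRESH = 0,
    CONTRACT_FLOW_RANGE,
    CONTRACT_CMD_ENVELOPE,
    CONTRACT_COUNT
} contract_id_t;

typedef enum {
    REACT_HOLD_LAST_OUTPUT,
    REACT_INHIBIT_ACTUATION,
    REACT_SAFE_STATE
} reaction_t;

typedef struct {
    const char *requirement; /* placeholder tag used for traceability */
    reaction_t  reaction;    /* reaction documented for this contract */
} contract_entry_t;

/* Fixed mapping: the same violation always yields the same reaction. */
static const contract_entry_t CONTRACTS[CONTRACT_COUNT] = {
    { "SR-EXAMPLE-1", REACT_HOLD_LAST_OUTPUT  },
    { "SR-EXAMPLE-2", REACT_INHIBIT_ACTUATION },
    { "SR-EXAMPLE-3", REACT_SAFE_STATE        },
};

static uint16_t violation_count[CONTRACT_COUNT];

/* Records a violation in a saturating counter and returns the reaction
 * to execute. An unknown identifier falls back to the safe state. */
reaction_t report_violation(contract_id_t id)
{
    if ((unsigned)id >= (unsigned)CONTRACT_COUNT) {
        return REACT_SAFE_STATE;
    }
    if (violation_count[id] < UINT16_MAX) {
        violation_count[id]++;
    }
    return CONTRACTS[id].reaction;
}

uint16_t get_violation_count(contract_id_t id)
{
    return ((unsigned)id < (unsigned)CONTRACT_COUNT) ? violation_count[id] : 0u;
}

/* Returns the requirement tag for a contract, or NULL if unknown. */
const char *contract_requirement(contract_id_t id)
{
    return ((unsigned)id < (unsigned)CONTRACT_COUNT) ? CONTRACTS[id].requirement : NULL;
}
```

Keeping the mapping in a constant table makes the contract-to-reaction relationship easy to review against the safety requirements.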
Procedure
- Identify contracts. For each critical function, specify pre-conditions (true before), post-conditions (true after), and invariants (always true) and trace them to hazards and safety requirements.
- Select the mechanism. In C/C++, wrap checks in project-specific macros/functions so they remain active in safety builds and map to safe reactions (not just abort()).
- Place checks. Put pre-conditions at safety-critical entry points and interfaces; post-conditions at outputs and state transitions; invariants in state machines or periodic monitors.
- Define safe reaction. For each assertion, document severity, logging, and deterministic transition to safe/degraded operation.
- Classify severity. Use levels (e.g., ASSERT_WARN, ASSERT_FATAL) to balance nuisance trips vs. hazard prevention.
- Test negatively. Inject violations to prove logging, reaction, and timing (response within the safety deadline).
- Analyze overhead. Measure latency and memory. Keep fatal checks in hot paths; tune warning checks only if needed, and document the justification.
- Maintain traceability. Link each assertion to requirements and keep evidence (reviews, tests) in the safety case.
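The "test negatively" step can be sketched as a host-side test that injects a stale timestamp and checks that the safe reaction fires exactly once. Everything here is an assumption made for illustration: the 50 ms limit, the fake timer, the reaction counter, and all function names. The sketch covers the reaction only; proving logging and response time would need additional instrumentation on the target.

```c
#include <stdbool.h>
#include <stdint.h>

#define FRESH_LIMIT_MS 50u /* assumed freshness limit */

/* Test doubles: stand-ins for the platform timer and the safe-reaction hook. */
static uint32_t fake_now_ms;
static uint32_t safe_state_entries;

static uint32_t now_ms(void) { return fake_now_ms; }
static void enter_safe_state(void) { safe_state_entries++; }

/* Function under test: accepts a sample only if it is fresh enough. */
bool accept_sample(uint32_t sample_time_ms)
{
    /* Unsigned subtraction stays correct across timer wrap-around. */
    uint32_t age = now_ms() - sample_time_ms;
    if (age > FRESH_LIMIT_MS) {
        enter_safe_state();
        return false;
    }
    return true;
}

/* Negative test: a sample 1 ms past the limit must be rejected and
 * must trigger exactly one safe reaction. Returns true on pass. */
bool test_stale_sample_is_rejected(void)
{
    fake_now_ms = 1000u;
    safe_state_entries = 0u;
    bool accepted = accept_sample(1000u - (FRESH_LIMIT_MS + 1u));
    return !accepted && (safe_state_entries == 1u);
}

/* Boundary test: a sample exactly at the limit must be accepted
 * without any safe reaction. Returns true on pass. */
bool test_boundary_sample_is_accepted(void)
{
    fake_now_ms = 1000u;
    safe_state_entries = 0u;
    bool accepted = accept_sample(1000u - FRESH_LIMIT_MS);
    return accepted && (safe_state_entries == 0u);
}
```

Pairing each negative test with a boundary test helps show that the check neither misses violations nor causes nuisance trips at the limit.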
Worked Example
High-level
A safety controller regulates pump speed from a flow sensor. The loop assumes (1) fresh data ≤ 50 ms old, (2) flow within the calibrated range [0, 120 L/min], and (3) commanded duty ≤ 0.8. Failure assertion programming enforces these assumptions; any violation leads to a safe, inhibited output rather than an unsafe actuation.
Code-level
#include <stdint.h>
#include <stdbool.h>
#include <stddef.h> // NULL
// Assertion levels and safe reactions
typedef enum { ASSERT_WARN, ASSERT_FATAL } assert_level_t;
void fs_log_violation(const char* cond, const char* func, int line, assert_level_t lvl);
void fs_enter_safe_state(void); // SAFE REACTION: inhibit pump, set output to 0
void fs_hold_last_safe_output(void); // SAFE REACTION: hold last known safe output
uint32_t monotonic_ms(void); // platform timer
// Note: on a fatal violation the macro executes `return false`, so it may
// only be used inside functions that return bool.
#define FS_ASSERT(cond, lvl) do { \
    if (!(cond)) { \
        fs_log_violation(#cond, __func__, __LINE__, (lvl)); \
        if ((lvl) == ASSERT_FATAL) { \
            fs_enter_safe_state(); /* SAFE REACTION: enter safe state and inhibit actuation */ \
            return false; \
        } else { \
            fs_hold_last_safe_output(); /* SAFE REACTION: hold last known safe output */ \
        } \
    } \
} while (0)
// Contracts (calibration)
static const float FLOW_MIN = 0.0f; // L/min
static const float FLOW_MAX = 120.0f; // L/min
static const uint32_t SENSOR_FRESH_MS = 50;
static const float CMD_MAX = 0.8f; // normalized duty (safety envelope)
typedef struct {
    float flow_lpm;
    uint32_t timestamp_ms;
    bool valid;
} flow_sample_t;
// Pre: sample.valid == true, age <= SENSOR_FRESH_MS, FLOW_MIN <= flow <= FLOW_MAX
// Post: 0.0 <= cmd <= CMD_MAX
bool control_step(const flow_sample_t* sample, float* out_cmd)
{
    // --- PRE-CONDITIONS ---
    FS_ASSERT(sample != NULL, ASSERT_FATAL);
    FS_ASSERT(out_cmd != NULL, ASSERT_FATAL);
    FS_ASSERT(sample->valid == true, ASSERT_FATAL);
    uint32_t age = monotonic_ms() - sample->timestamp_ms;
    FS_ASSERT(age <= SENSOR_FRESH_MS, ASSERT_FATAL);
    FS_ASSERT(sample->flow_lpm >= FLOW_MIN && sample->flow_lpm <= FLOW_MAX, ASSERT_FATAL);
    // --- Control law (simplified) ---
    float cmd = sample->flow_lpm / FLOW_MAX; // normalize
    // --- POST-CONDITIONS ---
    FS_ASSERT(cmd >= 0.0f, ASSERT_FATAL);
    FS_ASSERT(cmd <= CMD_MAX, ASSERT_FATAL);
    *out_cmd = cmd;
    return true;
}
Result: Violations (stale data, out-of-range flow, or command exceeding envelope) are logged and trigger a defined SAFE REACTION, preventing unsafe actuation.
Quality criteria
- Completeness of contracts: All safety-relevant pre/post/invariant conditions are identified, justified, and reviewed.
- Deterministic handling: Each assertion maps to a time-bounded reaction with proven test evidence.
- Runtime coverage: Assertions remain enabled in safety builds and are exercised by negative tests and fault injection.
- Performance control: Overhead measured; any optimization or thinning is justified and documented.
Common pitfalls
- Compiling out assertions in release builds — Mitigation: Use project macros that call safe-reaction hooks; justify any removals against SIL goals.
- Vague conditions (e.g., “should be okay”) — Mitigation: Quantify ranges, tolerances, and freshness limits based on calibration and hazard analysis.
- Unsafe failure handlers (e.g., blind reset) — Mitigation: Replace with controlled transitions to safe/degraded states and persistent logging.
- Assertion storm hides root cause — Mitigation: De-bounce repeated violations, capture first-failure data, and escalate after N counts.
- Unbounded overhead — Mitigation: Classify severity; keep fatal checks in hot paths and move non-critical diagnostics to slower monitors.
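The "assertion storm" mitigation can be sketched as a small monitor that de-bounces violations, latches first-failure data, and escalates after N consecutive violating cycles. The threshold of 3, the type names, and the choice to reset the counter on a healthy cycle are assumptions made for this example, not requirements from any standard.

```c
#include <stdbool.h>
#include <stdint.h>

#define ESCALATE_AFTER 3u /* hypothetical threshold N */

typedef enum { VIOL_NONE, VIOL_WARN, VIOL_ESCALATED } viol_status_t;

typedef struct {
    uint32_t consecutive;    /* consecutive violating cycles, saturates at N */
    bool     first_captured; /* true once first-failure data is latched */
    uint32_t first_time_ms;  /* timestamp of the first violation */
    float    first_value;    /* offending value at the first violation */
} viol_monitor_t;

/* Call once per cycle with the result of the check.
 * m must point to a zero-initialized monitor before first use. */
viol_status_t viol_monitor_update(viol_monitor_t *m, bool violated,
                                  uint32_t now_ms, float value)
{
    if (!violated) {
        m->consecutive = 0u; /* a healthy cycle resets the de-bounce counter */
        return VIOL_NONE;
    }
    if (!m->first_captured) {
        /* Latch first-failure data; later violations do not overwrite it. */
        m->first_captured = true;
        m->first_time_ms = now_ms;
        m->first_value = value;
    }
    if (m->consecutive < ESCALATE_AFTER) {
        m->consecutive++;
    }
    return (m->consecutive >= ESCALATE_AFTER) ? VIOL_ESCALATED : VIOL_WARN;
}
```

A caller might log only on the first `VIOL_WARN` and enter the safe state on `VIOL_ESCALATED`, which keeps the log readable while still bounding the time to reaction.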
FAQ
Should assertions be disabled in production?
For safety-relevant code, no. Keep fatal checks enabled and justified. If you thin checks for timing margins, document the rationale, classify by severity, and show test evidence that hazardous violations are still caught.
How are assertions different from standard input validation?
Input validation guards external data at boundaries. Assertions also protect internal invariants and post-conditions, catching deeper logic defects indicative of systematic failures.
Will assertions hurt performance?
They add overhead, but you can classify checks and optimize placement. Measure latency, retain fatal checks in hot paths, and justify any reductions in the safety case.