Event-driven, with guaranteed maximum response time
What is it?
Event-driven with guaranteed maximum response time is a real-time approach where software reacts to interrupts, messages, or state changes, yet still offers a proven worst-case response time (WCRT). The guarantee is achieved by bounding execution and blocking (via worst-case execution time, resource-access protocols, and queue limits) and by verifying schedule feasibility (e.g., fixed-priority or EDF analysis). The technique preserves responsiveness to sporadic events while retaining deterministic timing required by safety standards.
How it supports functional safety
It addresses systematic failures in timing: hidden dependencies, unbounded blocking, and priority inversion. By deriving and justifying an upper latency bound for each safety function, engineers prevent “silent” misses under load. Although primarily a software measure, it can also expose random or common-cause hardware faults (e.g., bus errors causing retries, bursty sensor traffic), because such faults show up as increased delay. The design detects and handles them through admission control and safe fallbacks, so the safety function does not act on stale or corrupted data.
When to use
- Hazard response is triggered by sporadic/aperiodic events with strict deadlines (e.g., stop within 50 ms).
- Polling would either miss fast events or waste CPU; interrupts or messages are preferred.
- Mixed-criticality systems combining periodic control loops with asynchronous alarms or diagnostics.
Inputs & Outputs
Inputs
- List of safety-relevant events with deadlines and safe reactions.
- WCET estimates for handlers and dependent tasks; context switch and interrupt latencies.
- Resource map (shared data, mutexes, ISRs) and message/queue characteristics (arrival rates, bursts).
- Scheduler model (fixed priority or EDF) and resource protocol (e.g., priority inheritance/ceiling).
Outputs
- Verified worst-case response time (WCRT) per event with margin to deadline.
- Configured priorities, queue bounds, and admission-control rules.
- Test evidence: overload/fault-injection results demonstrating the bound is respected.
- Compliance traceability from hazard → requirement → analysis → test.
Procedure
- Enumerate events & deadlines: From hazard analysis, define per-event deadline and required safe reaction.
- Establish WCETs & blocking: Derive conservative WCETs (measurement + analysis). Identify all shared resources; apply a resource protocol to bound blocking.
- Choose scheduling model: Use fixed-priority (with response-time analysis) or EDF (with density tests), including interrupt latency and release jitter.
- Bound queues & admission: Set maximum queue sizes and overload rules (drop, merge, or rate-limit) that trigger a safe reaction rather than unbounded delay.
- Compute WCRT: Perform schedulability analysis per event path (ISR → handler → actuator) and verify WCRT ≤ deadline − sensing/actuation margins.
- Implement instrumentation: Timestamp event release/actuation; add watchdogs for deadline miss detection and logging.
- Validate under stress: Run burst tests, fault injection (e.g., bus error storms), and temperature/voltage corners to show the bound holds.
- Document & maintain: Keep analysis, assumptions, and test evidence under configuration control; re-verify on change.
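For a fixed-priority scheduler, the "Compute WCRT" step is commonly carried out with the classic response-time recurrence: w = C + B + sum over higher-priority tasks j of ceil((w + J_j) / T_j) * C_j, iterated to a fixed point, with R = w + J. The following is a minimal sketch of that iteration, not a replacement for a qualified analysis tool. It assumes integer microseconds, tasks stored in decreasing priority order, deadlines no larger than periods, and non-zero WCETs; the type and function names are illustrative only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical task model; all times in microseconds. */
typedef struct {
    uint32_t wcet;      /* C: worst-case execution time */
    uint32_t period;    /* T: minimum inter-arrival time */
    uint32_t jitter;    /* J: release jitter */
    uint32_t blocking;  /* B: worst-case blocking by lower-priority tasks */
    uint32_t deadline;  /* D: relative deadline, assumed <= period */
} task_t;

/* Ceiling division for b > 0. */
static uint32_t ceil_div(uint32_t a, uint32_t b) { return (a + b - 1u) / b; }

/*
 * Response-time analysis for tasks[i]; tasks[0..i-1] have higher priority.
 * Returns the WCRT including the task's own jitter, or 0 if the
 * iteration exceeds the deadline (task not schedulable).
 */
uint32_t wcrt_us(const task_t *tasks, size_t i)
{
    assert(tasks != NULL);
    const task_t *t = &tasks[i];
    uint32_t w = t->wcet + t->blocking;
    for (;;) {
        uint32_t next = t->wcet + t->blocking;
        for (size_t j = 0; j < i; ++j) {
            assert(tasks[j].period > 0u);
            /* Interference: number of releases of task j inside the window. */
            next += ceil_div(w + tasks[j].jitter, tasks[j].period) * tasks[j].wcet;
        }
        if (next + t->jitter > t->deadline) return 0u;  /* bound exceeded */
        if (next == w) return w + t->jitter;            /* fixed point reached */
        w = next;
    }
}
```

The early exit on a deadline overrun also guarantees termination, because the window grows monotonically between iterations.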
Worked Example
High-level
A packaging line must stop within 40 ms when a light curtain is broken. The system is event-driven: an interrupt signals the hazard, and a high-priority handler commands a safe torque-off. Reserving 10 ms for actuator delay leaves a 30 ms budget for the software path. Analysis with the measured bounds (ISR 6 µs, handler 120 µs, maximum blocking 300 µs, worst-case interference 2.4 ms) yields WCRT ≈ 2.83 ms, well inside that 30 ms budget.
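Spelled out, the arithmetic behind these figures is shown below. It assumes the four measured terms simply add and that the actuator delay is subtracted from the stop requirement to obtain the software budget; the symbols are introduced here for illustration.

```latex
\begin{align*}
R &= C_{\mathrm{ISR}} + C_{\mathrm{handler}} + B + I \\
  &= 6\,\mu\mathrm{s} + 120\,\mu\mathrm{s} + 300\,\mu\mathrm{s} + 2400\,\mu\mathrm{s}
   = 2826\,\mu\mathrm{s} \approx 2.83\,\mathrm{ms} \\
D_{\mathrm{sw}} &= 40\,\mathrm{ms} - 10\,\mathrm{ms} = 30\,\mathrm{ms}, \qquad
D_{\mathrm{sw}} - R \approx 27.17\,\mathrm{ms}
\end{align*}
```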
Code-level
// Fixed-priority kernel; STOP_TASK has highest application priority.
#define STOP_DEADLINE_MS 40u                 // end-to-end stop requirement

volatile bool lightcurtain_tripped = false;
volatile timestamp_t lightcurtain_release;   // written only by the ISR; assumed atomic to read

ISR(LIGHTCURTAIN_ISR) {
    lightcurtain_release = monotonic_now();  // event release time for the deadline monitor
    lightcurtain_tripped = true;
    os_signal(STOP_TASK);                    // release jitter bounded by kernel config
}

TASK(STOP_TASK) {
    if (queue_len(EVENT_Q) >= EVENT_Q_MAX) {
        // SAFE REACTION: drop non-critical events and enter safe state deterministically
        flush_noncritical(EVENT_Q);
        command_safe_torque_off();
        log_deadline_guard("Overload-safe stop");
        return;
    }
    if (lightcurtain_tripped) {
        lightcurtain_tripped = false;        // clear first so a trip during the stop is not lost
        command_safe_torque_off();           // SAFE REACTION: immediate stop to safe state
        // Flags any release-to-command miss for diagnostics (limit taken to be in ms)
        assert_deadline(lightcurtain_release, monotonic_now(), STOP_DEADLINE_MS);
    }
}
Result: Even under bursts or competing activity, the stop action is commanded within the verified WCRT; overload triggers a deterministic safe state instead of unbounded delay.
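A deadline monitor is only meaningful end to end if it compares the event release time (taken in the ISR) with the time the safe reaction is commanded. The sketch below shows one possible shape for such a monitor. It assumes a free-running 32-bit microsecond counter as the time source and relies on unsigned wraparound arithmetic; deadline_monitor_t and its functions are hypothetical names, not part of any kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical monitor state; timestamps come from a free-running
 * 32-bit microsecond counter (an assumption of this sketch). */
typedef struct {
    uint32_t budget_us;  /* software budget, e.g. deadline minus actuator delay */
    uint32_t worst_us;   /* largest latency observed so far */
    uint32_t misses;     /* number of budget overruns */
} deadline_monitor_t;

/* Elapsed time that stays correct across one counter wraparound,
 * because unsigned subtraction is performed modulo 2^32. */
static uint32_t elapsed_us(uint32_t start, uint32_t end)
{
    return (uint32_t)(end - start);
}

/* Record one event; returns true if the latency stayed within the budget. */
bool deadline_monitor_record(deadline_monitor_t *m,
                             uint32_t release_us, uint32_t actuation_us)
{
    assert(m != NULL);
    uint32_t latency = elapsed_us(release_us, actuation_us);
    if (latency > m->worst_us) {
        m->worst_us = latency;   /* keep the high-water mark as test evidence */
    }
    if (latency > m->budget_us) {
        m->misses++;             /* caller decides on the safe reaction */
        return false;
    }
    return true;
}
```

Tracking the high-water mark alongside the miss count gives stress tests a measured figure to compare against the analytical WCRT.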
Quality criteria
- Sound timing model: WCETs include ISR and kernel overhead; blocking is bounded by a defined resource protocol.
- Schedulability proof: Documented analysis (e.g., response-time equations) shows WCRT ≤ deadline with explicit margins.
- Overload safety: Bounded queues and admission rules lead to a defined safe reaction; deadline monitors are in place.
- Evidence: Stress/fault-injection tests reproduce worst-case arrival patterns and confirm measured latencies within bounds.
Common pitfalls
- Underestimated WCET → misses in the field. Mitigation: combine static analysis with measurement at worst-case process/voltage/temperature (PVT) corners; add margin.
- Priority inversion via shared resources. Mitigation: apply priority inheritance/ceiling; minimize critical sections.
- Unbounded queues causing delay growth. Mitigation: cap queue length and define safe overload behavior.
- Ignoring actuator/sensor latencies. Mitigation: include sensing and actuation time in the end-to-end deadline.
- Assumption drift after changes. Mitigation: change control triggers re-analysis and re-test.
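To illustrate how a resource protocol bounds blocking: under a priority ceiling protocol, a task can be blocked at most once, by the longest critical section of any lower-priority task whose resource ceiling is at least the task's own priority. The sketch below computes that bound from a table of critical sections. It assumes that a larger number means a higher priority and that section lengths are known worst cases; the type and function names are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one critical section. */
typedef struct {
    uint8_t  owner_prio;    /* priority of the task that executes the section */
    uint8_t  ceiling_prio;  /* ceiling of the guarded resource: highest user priority */
    uint32_t length_us;     /* worst-case duration of the section */
} critical_section_t;

/*
 * Worst-case blocking for a task of priority `prio` under a priority
 * ceiling protocol: the longest section owned by a lower-priority task
 * on a resource whose ceiling is at least `prio`.
 */
uint32_t pcp_blocking_us(const critical_section_t *cs, size_t n, uint8_t prio)
{
    assert(cs != NULL || n == 0u);
    uint32_t worst = 0u;
    for (size_t k = 0; k < n; ++k) {
        if (cs[k].owner_prio < prio &&
            cs[k].ceiling_prio >= prio &&
            cs[k].length_us > worst) {
            worst = cs[k].length_us;
        }
    }
    return worst;
}
```

Because only the single longest qualifying section counts, shortening critical sections directly reduces the blocking term in the response-time analysis.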
FAQ
How is this different from a time-triggered schedule?
A time-triggered design activates work at fixed times; an event-driven design reacts to unpredictable arrivals. Here, we still prove a maximum response time using WCET and schedulability analysis, giving determinism comparable to cyclic schedules while remaining responsive to sporadic hazards.
What if multiple events arrive simultaneously and overload the CPU?
The design must bound queues and define admission and degradation rules. When assumptions are violated, the system performs a defined safe reaction (e.g., immediate stop) rather than accumulating unbounded delay.
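One way to express such an admission rule is a fixed-capacity queue whose enqueue operation reports overload instead of growing. The sketch below is a minimal illustration under stated assumptions: the capacity of 4 is arbitrary, the names are hypothetical, and the queue is not interrupt-safe on its own (the caller would have to serialize access, for example by masking interrupts).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EVENT_Q_MAX 4u  /* assumed bound; in practice derived from the arrival model */

typedef struct { uint16_t id; bool critical; } event_t;

typedef struct {
    event_t  slots[EVENT_Q_MAX];
    size_t   head;      /* index of the oldest event */
    size_t   count;     /* number of queued events */
    uint32_t dropped;   /* non-critical events rejected while full */
} event_queue_t;

typedef enum { Q_ACCEPTED, Q_DROPPED, Q_OVERLOAD } q_status_t;

/* Admission control: never grows beyond EVENT_Q_MAX entries. */
q_status_t event_queue_offer(event_queue_t *q, event_t ev)
{
    assert(q != NULL);
    if (q->count < EVENT_Q_MAX) {
        q->slots[(q->head + q->count) % EVENT_Q_MAX] = ev;
        q->count++;
        return Q_ACCEPTED;
    }
    if (!ev.critical) {
        q->dropped++;        /* shed load, keep a count for diagnostics */
        return Q_DROPPED;
    }
    return Q_OVERLOAD;       /* caller must take the safe reaction now */
}

/* Remove the oldest event; returns false if the queue is empty. */
bool event_queue_take(event_queue_t *q, event_t *out)
{
    assert(q != NULL && out != NULL);
    if (q->count == 0u) return false;
    *out = q->slots[q->head];
    q->head = (q->head + 1u) % EVENT_Q_MAX;
    q->count--;
    return true;
}
```

A caller that receives Q_OVERLOAD for a critical event would command the safe state directly rather than wait for queue space, which keeps the worst-case delay bounded by the queue length.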
Do I need exact WCETs?
You need conservative upper bounds. Combine analysis and measurement, include OS/ISR overheads and blocking, and keep documented margins aligned with the SIL claim.