Error Detecting Codes

What is it?

Error detecting codes add redundant check bits to digital information so that a system can tell whether the data was corrupted during transmission or storage. Common examples include parity bits, checksums, and cyclic redundancy checks (CRC). Hamming codes, although designed for error correction, can also be used purely for detection. In functional safety, the purpose of these codes is not to correct errors but to prevent faulty data from being used in safety-critical decisions.
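As a minimal sketch of how redundancy makes corruption visible, the snippet below appends one even-parity bit to a data word and checks it on receipt. The function names `add_parity` and `parity_ok` are illustrative assumptions, not taken from any standard or library.

```python
def add_parity(bits):
    """Return the data bits followed by one even-parity bit."""
    parity = sum(bits) % 2          # 1 if the number of ones is odd
    return bits + [parity]

def parity_ok(codeword):
    """A codeword is consistent when it contains an even number of ones."""
    return sum(codeword) % 2 == 0

codeword = add_parity([1, 0, 1, 1])   # three ones, so the parity bit is 1
corrupted = codeword.copy()
corrupted[2] ^= 1                     # flip one bit "in transit"
print(parity_ok(codeword), parity_ok(corrupted))  # True False
```

One check bit is the weakest useful code: it reveals any odd number of flipped bits and nothing else.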

How it supports functional safety

Error detecting codes ensure that corrupted or incomplete data does not silently propagate through the system. They reveal the effects of random hardware faults and electromagnetic interference that alter transmitted or stored data, and they help control systematic failures that would otherwise pass on damaged data unnoticed. By discarding or safely reacting to erroneous data, the system avoids hazardous control actions.

When to use

Safety-related communication between sensors, controllers, and actuators.
Protecting memory contents in embedded controllers or safety PLCs.
Serial communication links exposed to noise or interference.
Any case where corrupted data could cause a dangerous or unintended actuation.

Inputs & Outputs

Inputs

Raw data to be stored or transmitted
Coding scheme (e.g., CRC polynomial, Hamming parameters)

Outputs

Encoded data with redundancy bits
Detection status (valid / corrupted)

Procedure

Select an appropriate error detection code (parity, CRC, Hamming, etc.) based on the required safety integrity level and the expected error patterns.
Encode outgoing data by adding redundancy bits.
Transmit or store the data with the code attached.
At the receiver (or during retrieval), recompute and verify the code.
If valid → accept data.
If invalid → apply a safe reaction (discard, hold last safe value, or enter safe state).
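The procedure above can be sketched in Python as follows. This is an illustration under stated assumptions, not a qualified implementation: CRC-32 from the standard library stands in for whatever code the safety analysis selects, and the frame layout (payload followed by a 4-byte big-endian CRC), the `Receiver` class, and the limit of three consecutive errors are all hypothetical choices.

```python
import struct
import zlib
from enum import Enum

class Status(Enum):
    VALID = "valid"
    CORRUPTED = "corrupted"

def encode(payload: bytes) -> bytes:
    """Encode: append a 4-byte big-endian CRC-32 to the payload."""
    return payload + struct.pack(">I", zlib.crc32(payload))

def verify(frame: bytes):
    """Verify: recompute the CRC and compare it with the stored one."""
    if len(frame) < 4:
        return Status.CORRUPTED, None
    payload, stored = frame[:-4], struct.unpack(">I", frame[-4:])[0]
    if zlib.crc32(payload) == stored:
        return Status.VALID, payload
    return Status.CORRUPTED, None

class Receiver:
    """Accept valid data; otherwise hold the last safe value or escalate."""

    def __init__(self, max_consecutive_errors=3):   # assumed limit
        self.max_consecutive_errors = max_consecutive_errors
        self.errors = 0
        self.last_safe = None
        self.safe_state = False   # latched; leaving it is outside this sketch

    def on_frame(self, frame: bytes):
        status, payload = verify(frame)
        if status is Status.VALID:
            self.errors = 0
            self.last_safe = payload
            return payload
        self.errors += 1
        if self.errors >= self.max_consecutive_errors or self.last_safe is None:
            self.safe_state = True    # safe reaction: enter safe state
            return None
        return self.last_safe         # safe reaction: hold last safe value
```

Bounding how long the last safe value may be held is a design choice in this sketch: a value that is held forever can itself become hazardous once it is stale.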

Worked Example

High-level

A temperature sensor sends data to a safety controller over a noisy bus. A CRC is appended to each data frame. If corruption occurs, the controller rejects the frame and keeps the last safe reading, avoiding a spurious shutdown command.

Code-level

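import zlib

last_safe_value = None  # most recent payload that passed the CRC check

def compute_crc(data: bytes) -> int:
    # CRC-32 from the standard library, used here for illustration only
    return zlib.crc32(data)

def transmit(data: bytes):
    # Sender: attach the CRC to the payload
    return data, compute_crc(data)

def receive(data: bytes, crc: int):
    global last_safe_value
    if compute_crc(data) == crc:
        last_safe_value = data  # remember the latest verified value
        return data
    # SAFE REACTION: discard frame, hold last safe value
    return last_safe_value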

Result: The controller only acts on sensor data that has passed the CRC check.

Quality criteria

The selected coding scheme must match the SIL/ASIL target.
Coverage against single-bit and burst errors must be justified.
Safe reaction on error must be specified, tested, and documented.
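Part of a coverage argument can be backed by exhaustive error injection on short messages. The sketch below applies every single-bit error and every burst error up to a chosen length to a payload and counts how many leave the CRC-32 unchanged. It assumes the check value itself arrives intact and uses `zlib.crc32` only as a stand-in for the selected code. The helper names are hypothetical.

```python
import zlib

def burst_patterns(length):
    """Yield every error pattern of exactly `length` bits (first and last bit set)."""
    if length == 1:
        yield 1
        return
    for middle in range(1 << (length - 2)):
        yield (1 << (length - 1)) | (middle << 1) | 1

def undetected_errors(message: bytes, max_burst: int) -> int:
    """Count injected bursts of 1..max_burst bits that the CRC fails to flag."""
    n_bits = len(message) * 8
    good_crc = zlib.crc32(message)
    as_int = int.from_bytes(message, "big")
    missed = 0
    for length in range(1, max_burst + 1):
        for pattern in burst_patterns(length):
            for shift in range(n_bits - length + 1):
                corrupted = as_int ^ (pattern << shift)
                data = corrupted.to_bytes(len(message), "big")
                if zlib.crc32(data) == good_crc:
                    missed += 1
    return missed

print(undetected_errors(b"\x10\x20\x30\x40", max_burst=8))  # 0
```

Such a test supports the justification but does not replace it: it covers only the message lengths and error patterns that were actually injected.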

Common pitfalls

Using error correction instead of detection → unsafe mis-corrections. Mitigation: always discard or enter safe state on detection.
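Weak codes miss multi-bit errors; a single parity bit, for example, misses every error that flips an even number of bits. Mitigation: use a CRC or Hamming code whose detection capability covers the expected error patterns.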
Not testing safe reaction paths. Mitigation: include error injection in verification.
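The weak-code pitfall is easy to reproduce with a small error injection script. The sketch below injects every possible double-bit error into a two-byte payload and counts how many go unnoticed by a single parity bit and by CRC-32. The helpers `parity_bit`, `flip_bits` and `count_undetected` are illustrative names, and the check value is assumed to arrive uncorrupted.

```python
import zlib
from itertools import combinations

def parity_bit(data: bytes) -> int:
    """Even-parity bit over all bits of the payload."""
    return bin(int.from_bytes(data, "big")).count("1") % 2

def flip_bits(data: bytes, positions) -> bytes:
    """Return a copy of data with the given bit positions inverted."""
    value = int.from_bytes(data, "big")
    for pos in positions:
        value ^= 1 << pos
    return value.to_bytes(len(data), "big")

def count_undetected(data: bytes, check, n_flips: int) -> int:
    """Inject every n_flips-bit error into the payload and count the misses."""
    reference = check(data)
    n_bits = len(data) * 8
    return sum(
        check(flip_bits(data, positions)) == reference
        for positions in combinations(range(n_bits), n_flips)
    )

payload = b"\x12\x34"
pairs = 16 * 15 // 2   # number of distinct double-bit errors in 16 bits
print(count_undetected(payload, parity_bit, 2), "of", pairs, "missed by parity")
print(count_undetected(payload, zlib.crc32, 2), "of", pairs, "missed by CRC-32")
```

Parity misses all of the injected double-bit errors, while the CRC flags every one of them for this payload size.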

References

IEC 61508-3:2010, Annex C
Huffman, W., Pless, V. Fundamentals of Error-Correcting Codes, Cambridge University Press, 2003
Koopman, P. CRC and Error Detection Tutorial

FAQ

Why not correct errors instead of just detecting them?

When more bits are corrupted than the code can correct, correction may produce an incorrect but valid-looking value, which is unsafe. Functional safety favors discarding over guessing.
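The effect can be shown with a Hamming(7,4) code, sketched below under the usual textbook layout (parity bits at positions 1, 2 and 4). The function names are illustrative. With two flipped bits, the correcting decoder "repairs" a third, innocent bit and returns wrong data without any warning, whereas the detection-only decoder rejects the word.

```python
def hamming74_encode(d):
    """Encode 4 data bits into 7 bits (positions 1..7, parity at 1, 2, 4)."""
    d3, d5, d6, d7 = d
    p1 = d3 ^ d5 ^ d7
    p2 = d3 ^ d6 ^ d7
    p4 = d5 ^ d6 ^ d7
    return [p1, p2, d3, p4, d5, d6, d7]

def syndrome(code):
    """XOR of the 1-based positions of all set bits; 0 for a valid codeword."""
    s = 0
    for pos, bit in enumerate(code, start=1):
        if bit:
            s ^= pos
    return s

def extract_data(code):
    return [code[2], code[4], code[5], code[6]]

def decode_with_correction(code):
    """Flip the bit the syndrome points at, then return the data bits."""
    code = list(code)
    s = syndrome(code)
    if s:
        code[s - 1] ^= 1
    return extract_data(code)

def decode_detect_only(code):
    """Return the data bits, or None if the syndrome signals an error."""
    return None if syndrome(code) else extract_data(code)

data = [1, 0, 1, 1]
received = hamming74_encode(data)
received[0] ^= 1                      # first corrupted bit
received[1] ^= 1                      # second corrupted bit
print(decode_with_correction(received))   # [0, 0, 1, 1], wrong and unflagged
print(decode_detect_only(received))       # None, frame rejected
```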

Are CRCs enough for SIL 3/4?

CRCs with sufficient length and carefully chosen polynomials can provide very high diagnostic coverage, but the residual (undetected) error rate must be justified for the actual frame length, frame rate, and channel.
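A first-order estimate of the residual error rate can serve as a starting point for that justification. The sketch below assumes a corrupted frame is indistinguishable from uniformly random bits, in which case a fraction 2^-n of such frames passes an n-bit check. Real channels and polynomials can behave better or worse than this model. The rate of 100 corrupted frames per hour is a hypothetical input, not a recommended figure.

```python
def undetected_fraction(crc_bits: int) -> float:
    """Chance that a uniformly random frame passes an n-bit check."""
    return 2.0 ** -crc_bits

def residual_errors_per_hour(crc_bits: int, corrupted_frames_per_hour: float) -> float:
    """Expected undetected corrupted frames per hour under the random-frame model."""
    return corrupted_frames_per_hour * undetected_fraction(crc_bits)

for bits in (8, 16, 32):
    # 100.0 is an assumed corruption rate; take the real one from the channel analysis
    print(bits, residual_errors_per_hour(bits, corrupted_frames_per_hour=100.0))
```

The result is then compared with the failure rate budget allocated to the communication or storage function in the safety analysis.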
