Boeing 787 Dreamliner Integer Overflow Bug Could Cause Total Electrical Failure After 248 Days of Continuous Operation

Critical

The FAA issued a directive after Boeing found a 787 software bug in which an integer overflow after 248 days of continuous generator control unit operation could cause total electrical failure. The directive required operators to power cycle the aircraft at least every 120 days as an interim fix.

Category: Safety Failure
Industry: Other
Status: Resolved
Date Occurred:
Date Reported: May 1, 2015
Jurisdiction: US
AI Provider: Other/Unknown
Application Type: embedded
Harm Type: physical
Human Review in Place: No
Litigation Filed: No
Regulatory Body: Federal Aviation Administration
aviation_safety, embedded_systems, integer_overflow, boeing_787, electrical_systems, software_bug, faa_directive, safety_critical

Full Description

On May 1, 2015, the Federal Aviation Administration issued Emergency Airworthiness Directive 2015-10-51 addressing a critical software defect in the Boeing 787 Dreamliner's electrical power generation system. The directive was prompted by Boeing's discovery during laboratory testing that the aircraft's four main generator control units (GCUs) contained a software bug that could cause all four to fail simultaneously after 248 days of continuous operation.
The technical root cause was an integer counter overflow in the GCU software. The system used a signed 32-bit integer to track time in centiseconds (hundredths of a second). Such a counter overflows after 2^31 centiseconds, approximately 248.55 days of continuous operation. When the overflow occurred, all four GCUs would enter fail-safe mode at the same time, cutting electrical power to the aircraft. This represented a catastrophic common-mode failure: a single software defect could disable all four redundant units at once and leave an aircraft without electrical power during flight.
The vulnerability affected all Boeing 787-8 and 787-9 aircraft delivered to airlines worldwide. No incidents had occurred in service, but laboratory testing confirmed the failure scenario. The FAA classified this as an unsafe condition requiring immediate action, because total electrical failure would result in loss of flight-critical systems including flight controls, navigation, communication, and cabin pressurization.
Boeing's initial mitigation, mandated by the FAA directive, required operators to power cycle the aircraft at least once every 120 days, resetting the counters well before the 248-day overflow threshold. This operational workaround remained in place while Boeing developed a permanent software fix. Airlines had to modify maintenance schedules to ensure compliance, adding operational complexity and cost to 787 operations.
The directive applied to approximately 200 aircraft in service at the time, operated by airlines including United, American, JAL, ANA, and others globally.
The incident highlighted broader concerns about software quality assurance in modern fly-by-wire aircraft systems. Unlike mechanical systems with well-understood failure modes, embedded software can exhibit unexpected behaviors that emerge only under specific timing or operational conditions. The 787's extensive use of software-controlled systems, while enabling advanced capabilities and fuel efficiency, also introduced new categories of potential failure modes that traditional aerospace testing methodologies had not fully anticipated.
Boeing subsequently developed and deployed a permanent software fix that properly handled the counter overflow condition. However, the incident raised questions about software verification processes in aviation, the adequacy of testing for long-duration edge cases, and the challenges of ensuring reliability in increasingly complex avionics systems. The FAA's rapid response demonstrated the aviation industry's commitment to safety, but also revealed how software vulnerabilities could create systemic risks across entire aircraft fleets.
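The overflow mechanism described above can be illustrated with a small simulation. This is only a sketch: the actual generator control unit code is not public, so the helper names `to_int32` and `tick` are hypothetical, and the only assumption carried over from the description is a signed 32-bit counter incremented once per centisecond.

```python
INT32_MAX = 2**31 - 1  # largest value a signed 32-bit integer can hold


def to_int32(value: int) -> int:
    """Interpret the low 32 bits of value as a two's-complement signed integer."""
    value &= 0xFFFFFFFF
    return value - 2**32 if value >= 2**31 else value


def tick(counter: int) -> int:
    """Advance a hypothetical centisecond counter by one, wrapping like int32."""
    return to_int32(counter + 1)


# Start just below the limit instead of simulating 248 days of ticks.
counter = INT32_MAX - 2
history = []
for _ in range(4):
    counter = tick(counter)
    history.append(counter)

print(history)
```

The third tick takes the counter from its maximum to the most negative 32-bit value. Any logic that assumes time only moves forward would then see time jump backward by roughly 497 days.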

Root Cause

An integer overflow in the generator control unit software of the Boeing 787's electrical power generation system. A software counter would overflow after 2^31 centiseconds (approximately 248 days) of continuous generator operation, potentially causing all four main generator control units to switch to fail-safe mode simultaneously and cut electrical power.
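The 248-day figure follows directly from the counter width and tick rate stated above. The short calculation below reproduces it and shows the margin left by the 120-day power-cycle interval; the variable names are illustrative only.

```python
CENTISECONDS_PER_SECOND = 100
SECONDS_PER_DAY = 86_400

# A signed 32-bit counter overflows once it has counted 2**31 ticks from zero.
overflow_ticks = 2**31
days_to_overflow = overflow_ticks / CENTISECONDS_PER_SECOND / SECONDS_PER_DAY

# Interval mandated by the directive as the interim workaround.
power_cycle_interval_days = 120
margin_days = days_to_overflow - power_cycle_interval_days

print(f"Overflow after {days_to_overflow:.2f} days")
print(f"Margin with a 120-day power cycle: {margin_days:.2f} days")
```

Under these assumptions, the 120-day interval resets the counter at a little less than half of the overflow threshold.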

Mitigation Analysis

This incident demonstrates critical gaps in embedded systems testing for edge cases and long-duration operation. Stress testing with extended runtime simulations, including tests that preset counters close to their limits, and formal verification of counter overflow handling could have identified this defect before delivery. The interim workaround required cycling power at least every 120 days until the permanent software fix was deployed, highlighting the need for proactive maintenance protocols in safety-critical systems.
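One common way to make timing logic tolerate counter wraparound, and to test it without a 248-day soak, is sketched below. It is not Boeing's fix, whose details are not public. The function names are hypothetical, and the approach assumes a free-running 32-bit tick counter with measured intervals shorter than 2^32 ticks.

```python
MASK32 = 0xFFFFFFFF
INT32_MAX = 2**31 - 1


def to_int32(value: int) -> int:
    """Interpret the low 32 bits of value as a two's-complement signed integer."""
    value &= MASK32
    return value - 2**32 if value >= 2**31 else value


def elapsed_ticks(now: int, start: int) -> int:
    """Ticks between two counter readings, correct across one wraparound."""
    return (now - start) & MASK32  # modular (unsigned) subtraction


def naive_timeout_expired(now: int, start: int, timeout: int) -> bool:
    """Plain subtraction: breaks when 'now' has wrapped to a negative value."""
    return now - start >= timeout


def safe_timeout_expired(now: int, start: int, timeout: int) -> bool:
    """Same check using wrap-tolerant elapsed time."""
    return elapsed_ticks(now, start) >= timeout


# Long-duration edge case reproduced instantly: seed the start reading just
# below the limit so the next 100 ticks cross the overflow boundary.
start = INT32_MAX - 50
now = to_int32(start + 100)

print(naive_timeout_expired(now, start, 100))
print(safe_timeout_expired(now, start, 100))
```

Seeding the counter near its limit is the key testing idea here: it turns a failure that would take months to reach in real time into a unit test that runs in microseconds.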

Lessons Learned

This incident demonstrated that even well-established aerospace manufacturers can overlook critical edge cases in software testing, particularly those involving long-duration operations and integer overflow conditions. It highlighted the need for more comprehensive formal verification methods and extended-duration testing protocols for safety-critical embedded systems.

Sources

FAA Emergency Airworthiness Directive 2015-10-51
Federal Aviation Administration · May 1, 2015 · regulatory action
Boeing 787 Software Bug Prompts FAA Emergency Order
Aviation Week · May 1, 2015 · news