AI Finds 57-Year-Old Bug in Apollo Guidance Computer Code

Finding a bug in one of history's most scrutinized codebases

The Apollo Guidance Computer (AGC) code has been publicly available since 2003, transcribed from printed listings at the MIT Instrumentation Laboratory. In 2016, Chris Garry's GitHub repository went viral, and thousands of developers have since examined this assembly code, which was written for hardware with about 2K words of erasable memory and a roughly 1 MHz clock. Despite this scrutiny, no formal verification, model checking, or static analysis had been published against the flight code until now.

The bug: A resource lock leak in gyro control

The bug is in the Inertial Measurement Unit (IMU) subsystem, which manages the gyroscope-based platform that tells the spacecraft which way it's pointing. The AGC manages the IMU through a shared resource lock called LGYRO. When the computer needs to torque the gyroscopes (to correct platform drift or perform star alignment), it acquires LGYRO at the start and releases it when all three axes have been torqued.

The problem occurs during 'caging', an emergency measure in which a physical clamp locks the IMU's gimbals in place to protect the gyroscopes from damage. When a torque completes normally, the routine exits via STRTGYR2 and the LGYRO lock is cleared. But when the IMU is caged while a torque is in progress, the code exits via a routine called BADEND, which does not clear the lock.

Two instructions are missing from the BADEND path: CAF ZERO followed by TS LGYRO (load zero into the accumulator, then store it to LGYRO), just four bytes. Once LGYRO is stuck, every subsequent attempt to torque the gyros finds the lock held, sleeps waiting for a wake signal that will never come, and hangs. A stuck lock would disable fine alignment, drift compensation, and manual gyro torque.
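To make the failure mode concrete, here is a minimal Python sketch of the lock protocol described above. It is a toy model, not AGC code: the class name GyroTorquer, the release_on_cage flag, and the return strings are hypothetical names chosen for illustration, and the real routine sleeps on a held lock rather than returning.

```python
class GyroTorquer:
    """Toy model of the LGYRO lock protocol (illustrative, not AGC code)."""

    def __init__(self, release_on_cage=False):
        self.lgyro = 0  # 0 = lock free, nonzero = lock held
        # Hypothetical switch: True models a BADEND path that clears LGYRO.
        self.release_on_cage = release_on_cage

    def torque(self, caged_during_torque=False):
        if self.lgyro != 0:
            # The real job would sleep waiting for a wake signal; here we
            # simply report that the request could not proceed.
            return "blocked"
        self.lgyro = 1  # acquire the lock before torquing
        if caged_during_torque:
            # Abnormal exit (the BADEND path). The leak is that the lock
            # is left held unless it is explicitly cleared here.
            if self.release_on_cage:
                self.lgyro = 0
            return "bad_end"
        self.lgyro = 0  # normal exit clears the lock
        return "ok"


buggy = GyroTorquer()
buggy_results = [buggy.torque(caged_during_torque=True), buggy.torque(), buggy.torque()]
print(buggy_results)  # ['bad_end', 'blocked', 'blocked']

fixed = GyroTorquer(release_on_cage=True)
fixed_results = [fixed.torque(caged_during_torque=True), fixed.torque(), fixed.torque()]
print(fixed_results)  # ['bad_end', 'ok', 'ok']
```

In this model, a single caged torque leaves the lock held, so every later request is blocked; clearing the lock on the abnormal exit, the analogue of the two missing instructions, restores normal behavior.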

How it was found: AI and behavioral specifications

The researchers used Claude and Allium, their open-source behavioral specification language, to distill 130,000 lines of AGC assembly into 12,500 lines of specs. The specification models the lifecycle of every shared resource: when it is acquired, when it must be released, and on which paths. This approach surfaced a flaw that reading and emulation had missed.

The specifications were derived from the code itself, and the process pointed directly to the defect. This is a different approach from previous scrutiny, which focused on reading the code, emulating it, and verifying the transcription.
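The lifecycle check described above (a resource that is acquired must be released on every path) can be sketched in Python as a walk over a small control-flow graph. This is not Allium syntax and not the researchers' tooling; the function leaking_paths, the graph layout, and the node names are assumptions made for illustration, loosely mirroring the normal and caged exits.

```python
def leaking_paths(graph, effects, entry, lock="LGYRO"):
    """Return every entry-to-exit path that finishes with `lock` still held."""
    leaks = []
    stack = [(entry, [entry], False)]  # (node, path so far, lock held?)
    while stack:
        node, path, held = stack.pop()
        effect = effects.get(node)
        if effect == ("acquire", lock):
            held = True
        elif effect == ("release", lock):
            held = False
        successors = graph.get(node, [])
        if not successors:
            # An exit node: the lock must not be held here.
            if held:
                leaks.append(path)
            continue
        for nxt in successors:
            if nxt not in path:  # skip loops to keep the walk finite
                stack.append((nxt, path + [nxt], held))
    return leaks


# Hypothetical control-flow graph for a torque request.
GRAPH = {
    "start_torque": ["acquire_lock"],
    "acquire_lock": ["torque_axes"],
    "torque_axes": ["normal_exit", "caged_exit"],
    "normal_exit": ["release_lock"],  # stands in for the STRTGYR2 path
    "release_lock": [],
    "caged_exit": [],                 # stands in for the BADEND path
}
EFFECTS = {
    "acquire_lock": ("acquire", "LGYRO"),
    "release_lock": ("release", "LGYRO"),
}

leaks = leaking_paths(GRAPH, EFFECTS, "start_torque")
print(leaks)
# [['start_torque', 'acquire_lock', 'torque_axes', 'caged_exit']]
```

The point of the sketch is that a leak of this kind can be invisible when reading one routine at a time, yet falls out mechanically once acquire and release events are stated per path.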

Historical context and potential impact

On 21 July 1969, while Neil Armstrong and Buzz Aldrin walked on the lunar surface, Michael Collins orbited alone in the Command Module Columbia. Every two hours he disappeared behind the Moon, out of radio contact with Earth. During each pass he ran Program 52, a star-sighting alignment that kept the guidance platform pointed in the right direction. If the platform drifted, the engine burn to bring him home would point the wrong way.

The bug might have manifested if Collins accidentally triggered the cage switch while the computer was torquing the gyroscopes. The code would appear to handle this gracefully: it would detect the cage, abandon the torque, and exit. But that exit goes through BADEND, so LGYRO would stay held. The P52 alignment would fail, later attempts to torque the gyros would hang, and the guidance platform could lose its reference.

📖 Read the full source: HN AI Agents