Air-traffic control bank holiday chaos: On-call engineer took 90 minutes to reach HQ and reboot computer


The engineer who was rostered to oversee the UK’s air-traffic control system on the August bank holiday last year was at home when both the main and back-up computers failed.

A report has revealed it took 90 minutes for the staff member to reach work and start to fix the failed National Air Traffic Services (Nats) flight system.

More than 700,000 passengers were hit by the failure of the UK’s air-traffic control system on one of the busiest days of the decade.

At 8.32am on Monday 28 August, the UK’s air-traffic control computer system, and its back-up, failed for several hours.

While no aircraft were ever in danger, the dual failure cut the capacity of UK airspace by 92.5 per cent: from a maximum of 800 flights per hour to just 60.
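The 92.5 per cent figure follows directly from the two capacity numbers quoted above. A minimal check, in which the function name is purely illustrative:

```python
def capacity_cut_percent(normal_per_hour, reduced_per_hour):
    """Percentage reduction in hourly capacity."""
    return (normal_per_hour - reduced_per_hour) / normal_per_hour * 100


# Figures quoted in the article: 800 flights per hour cut to 60.
cut = capacity_cut_percent(800, 60)
print(f"{cut:.1f} per cent")
```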

The problem was eventually resolved seven hours after it began, once the engineer had arrived and rebooted the system.

The outage caused the cancellation of 1,600 flights on the day. Four hundred more followed over the next couple of days, due to planes and pilots being stranded out of position by the shutdown.

The Civil Aviation Authority (CAA) set up an independent review, which has now published its interim report.

The panel says that 300,000 people had their flights cancelled, while a further 400,000 suffered delays.

“This had considerable financial and emotional consequences for them,” the report says.

The failure collectively cost airlines tens of millions of pounds. Tim Alderslade, chief executive of Airlines UK, said: “This report contains damning evidence that Nats’ basic resilience planning and procedures were wholly inadequate and fell well below the standard that should be expected for national infrastructure of this importance.”

The cause of the shutdown was a flight plan for a French Bee service from Los Angeles to Paris Orly, which contained two “waypoints” sharing the same identifier: DVL, the code for both Devils Lake in North Dakota and Deauville in northern France.

The report sets out the sequence of events that shut the system down within 20 seconds:

  • The Nats system “identified a flight whose exit point from UK airspace, referring back to the original flight plan, is considerably earlier than its entry point”
  • “Recognising this as being not credible, a critical exception error was generated”
  • The system, as it is designed to do, “placed itself into maintenance mode to prevent the transfer of apparently corrupt flight data to the air traffic controllers”
  • “The same flight plan details were presented to the secondary system which went through the same process as the first with the same result: a second critical exception error and disconnection”
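The sequence above can be illustrated with a short sketch. It is not the Nats or Frequentis code, which the report does not reproduce. It assumes, for illustration only, that the exit point is found by searching the route for a waypoint name and taking the first match. The route, the `UKIN` and `UKMID` waypoints and all function names are hypothetical.

```python
class CriticalExceptionError(Exception):
    """Raised when a flight plan yields a route that is not credible."""


def uk_segment(route, entry_code, exit_code):
    """Return the part of the route between the UK entry and exit points."""
    entry_index = route.index(entry_code)
    # Assumption: a name-only search, which returns the FIRST waypoint
    # carrying that code, even if it lies thousands of miles away.
    exit_index = route.index(exit_code)
    if exit_index <= entry_index:
        raise CriticalExceptionError(
            f"exit {exit_code!r} is not after entry {entry_code!r}"
        )
    return route[entry_index:exit_index + 1]


def process_with_failover(route, entry_code, exit_code):
    """Try the primary system, then hand the same plan to the secondary."""
    status = {"primary": "standby", "secondary": "standby"}
    for system in ("primary", "secondary"):
        try:
            uk_segment(route, entry_code, exit_code)
        except CriticalExceptionError:
            # Fail safe: stop passing suspect data to controllers.
            status[system] = "maintenance"
            continue
        status[system] = "online"
        break
    return status


# Hypothetical route: DVL appears twice, once early and once late.
route = ["KLAX", "DVL", "UKIN", "UKMID", "DVL", "LFPO"]
print(process_with_failover(route, "UKIN", "DVL"))
```

Because the secondary system receives identical input and applies identical logic, it fails in the same way as the primary, which is why the back-up offered no protection in this case.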

The flight plan was “filed in accordance with standard procedures”, the report says.

“At that point, further automated processing of flight plan data was no longer possible and the remaining processing capacity was entirely manual,” the investigators add.

Fixing the problem took longer than it might have done, the report finds. The engineer responsible for overseeing the system “was rostered on-call and therefore was not available on site at the time of the failure”.

It adds: “Having exhausted remote intervention options, it took 1.5 hours for the individual to arrive on-site in order to perform the necessary full system restart which was not permitted remotely.”

A more senior engineer “was unfamiliar with the fault message” recorded in the log, the report says.

The company that built the system, Frequentis, was not asked for assistance “for more than four hours after the initial failure despite their having a unique level of knowledge”.

The CAA panel aims “to draw lessons from the incident which may help the prevention of future incidents, or at least to reduce the scale of the impact on consumers, airlines and others”.

A spokesperson for the air-traffic control provider said: “Nats has cooperated fully with the independent panel appointed by the CAA to review the events of 28 August and its repercussions.

“We will continue to respond constructively to any further requests to support the panel’s ongoing work. We have not waited for the panel’s report to make improvements for handling future events based on learning from the experience of last year.

“These include a review of our engagement with our airline customers, our wider crisis response and our engineering support processes.

“We will study the panel’s interim report and look forward to their recommendations when they publish their final report.”

Michael O’Leary, chief executive of Europe’s biggest budget airline, Ryanair, said: “The CAA report confirms, unbelievably, that Nats engineers were sitting at home in their pyjamas on the UK’s August bank holiday weekend, which is one of the busiest travel weekends of the year for air travel.

“In any properly managed ATC [air-traffic control] service, engineers would be on site to cover system breakdowns instead of sitting at home unable to log into the system.”

Nats says that its engineering protocols were followed.

The Ryanair CEO repeated his call for his opposite number at Nats, Martin Rolfe, to go.

The transport secretary, Mark Harper, posted on X: “I’m glad to see steps have already been taken to ensure an incident like this doesn’t happen again.”
