One week on: AT&T's nationwide network outage

By Lauren Davis
Thursday, 29 February, 2024

Any Australians who found themselves in the United States last week may have experienced a sense of déjà vu after major telco AT&T suffered a massive network outage that affected mobile phone users nationwide.

From 3 am CT on 22 February, online outage monitor DownDetector started receiving reports of AT&T service issues, with the outage affecting more than 74,000 users in over 40 states when it peaked mid-morning (as this figure only represents self-reported outages, the true number of affected customers was almost certainly higher). Reportedly, the issue was so widespread that customers of other carriers thought their own networks were experiencing issues, because their calls and texts to AT&T users weren’t going through.

As with Australia’s Optus outage, the disruption prevented customers of AT&T and its discount carrier, Cricket Wireless, from making calls, texting or using the internet on their mobile phones, unless they were connected to a separate Wi-Fi network; for this reason, AT&T encouraged people to switch on the Wi-Fi calling feature on their phones until service was restored. The ability to call emergency services was even disrupted across several states — including California, North Carolina, Virginia and Texas — prompting 911 centres to urge AT&T customers to use a landline, find a mobile phone that uses a different carrier or use Wi-Fi calling. And for those without active service who were able to contact 911, their location information was not delivered to the 911 call centre, leaving responders to check this manually.

Perhaps most worryingly, the outage also affected the country’s public-safety broadband network, FirstNet, which AT&T has been building and maintaining since March 2017. This had a direct impact on some emergency services, according to Matt Zavadsky, Chief Transformation Officer at MedStar Mobile Healthcare.

“FirstNet is what most public safety agencies across the country use, because it was built to be more robust and to have more coverage and to have some features that public safety folks needed,” Zavadsky said.

“When the system went down ... area law enforcement, fire agencies, first responders all over lost connectivity with their field units. So we had to revert to radio dispatching and actually using maps and ambulances to get to calls because the mapping systems weren’t working because the cell system was down.”

In an open letter issued on 25 February, AT&T CEO John Stankey confirmed that the company’s restoration efforts had prioritised service on FirstNet, before moving on to consumers. He also claimed that “about three-quarters of our customers were able to access our network as they started their days around 5 a.m. CT” — with the remaining customers reconnected throughout the morning and the network apparently normalised by noon CT. This would, however, contradict the mid-morning peak in complaints — with AT&T’s own website only making the three-quarter claim at 10.15 am on the day in question — as well as an update confirming full service restoration at 2.10 pm, around 11 hours after the outage began. At this point, 911 lines were reportedly flooded with people making ‘test calls’, as many had been doing all day, despite emergency officials’ best efforts to discourage this.

It was speculated that the outage may have been caused by a cyber attack, with the Federal Communications Commission (FCC), the US Cybersecurity and Infrastructure Security Agency (CISA), the Department of Homeland Security (DHS) and even the FBI saying they wished to investigate the matter further. But AT&T’s final update on the day claimed the outage was in fact “due to the application and execution of an incorrect process used while working to expand our network”, according to an initial review, and Stankey’s letter several days afterwards did not elaborate further on this rather vague explanation. Other updates from the company have stated that it is “taking steps to prevent this from happening again in the future” and issuing a portion of its customers with compensation in the form of US$5 credit — the average cost of a full day of service.

Looking at the outage from an Australian perspective, it is rather extraordinary how many parallels it has to November’s Optus outage, which raises the question: how many times must we watch this scenario play out? Interestingly, the Australian Government has just this week agreed to extend the deadline for the Optus Outage Post-Incident Review report (which was due by today) following the receipt of new information from Optus relating to its activation of network wilting, where signals from mobile towers are powered down in order for Triple Zero calls to be carried by other networks. With the government now expected to receive the report by 21 March, we can but wonder how many of the report’s findings will be applicable to AT&T and other telcos around the world.

Image credit: iStock.com/Ivan Pantic