Can AI save accident-prone telcos from themselves?

A cardiac patient has died in Australia during a Telstra network failure – the latest in a series of network outages.

Robert Clark, Contributing Editor, Special to Light Reading

March 1, 2024


Have telcos become accident prone, or does it just seem that way? Telstra CEO Vicki Brady had the wretched task today of apologizing for the death of a man in Australia during a failure of its emergency call service.

The system, which is required by law to support all first-responders, was out of action for one and a half hours early Friday morning. In that period Telstra was unable to transfer 148 of 494 calls received, including one involving a man who died of a cardiac arrest, the Guardian reported.

Brady expressed her "deepest apology to the family of that person and in fact anyone who was impacted in those 90 minutes." Regulator ACMA is investigating.

While this outage was thankfully on a small scale, there's been a series of large-scale network blowouts in the last 20 months involving KDDI, Optus and AT&T.

As far as we know, all of these appear to stem from faulty data or software configuration.

They are also expensive. The KDDI event in July 2022 cost it $49 million. Optus took a $40 million hit from its outage last November.

'Configuration error'

AT&T has yet to reckon the damage from its mobile crash last week, but it will almost certainly dwarf the other two. It has already offered $5 in compensation to an unknown number of customers.

KDDI said its failure, which took out 31 million services, was the result of "a configuration error" that caused network traffic to become congested, prompting KDDI to limit user access.

The Optus outage, which took 10 million customers off air, was caused by changes to routing information following a routine software upgrade.

AT&T said vaguely it was due to "execution of an incorrect process."

The Telstra outage also seems to be a candidate for some kind of misconfiguration, taking place at 3:30am, typically a time when upgrades are carried out.

Of course, there's a good number of people who suspect these failures are the result of chronic understaffing by telcos as they shift responsibilities to their vendor partners.

That may be the case. But isn't this the kind of problem that AI is here to solve?

Whether these are just random incidents, or a sign of growing complexity in modern networks, AI seems the ideal solution. 

As McKinsey expressed it so confidently in an article earlier this week: "AI can also enable a self-healing network, which automatically fixes faults—for example, auto-switching customers from one carrier frequency to another because the former was expected to become clogged."

Can the AI-powered core router network recognize a faulty routing table or a software configuration error, or plain old human error?

You'd hope so. It certainly sounds like a good test to set an AI model. 

These network crashes are becoming an expensive and reputation-shredding habit. Some superior network smarts are called for to avoid future tragedies. 


About the Author(s)

Robert Clark is an independent technology editor and researcher based in Hong Kong. In addition to contributing to Light Reading, he also has his own blog, Electric Speech (http://www.electricspeech.com).
