HLR Headaches

7:00 AM -- Are subscriber database issues going to become the focal point for mobile operator headaches?

Recent evidence suggests that updates, upgrades and new deployments of HLRs (home location registers), HSSs (home subscriber servers -- like an HLR on steroids) and unifying subscriber databases can cause major service availability problems that cost money, damage credibility and potentially act as a catalyst for churn.

And these aren't problems that can be solved by adding capacity to access, backhaul or core networks -- this isn't about having too many customers on the network or trying to manage too much data traffic: These are Service Provider Information Technology (SPIT) failures often caused by software (rather than network infrastructure hardware) issues that can creep up on a network operations team.

Ironically, the software problems are often linked to the introduction of new systems or software version upgrades that are designed to help the service provider improve their customer experience management capabilities. But as we all know, IT systems, whether large or small, are delicate.

Here are a few of the most notable SPIT system mishaps that led to major mobile service outages, most of which were related to subscriber data system problems:

  • Telefónica UK Ltd., better known as O2, July 2012: A major service outage affected more than 7 million customers. Neither the operator nor its principal supplier Ericsson AB (Nasdaq: ERIC) has provided any detailed insight -- both have said they have been jointly looking into the cause of the disruption -- but the problems appear to have lain somewhere within its subscriber database infrastructure, either with its HLR servers or with a supporting shared database (one that is then used by multiple platforms such as the HLR, policy control server, etc). The operator has instigated a charm offensive to appease its customers but isn't providing any further details about the outage, other than to say that it experienced an "unprecedented software fault on the register that works to connect mobile devices to our network. The network is now stable and we are already implementing changes to mitigate the risk of this happening again." O2's services are going to be under specific scrutiny in the coming weeks as it is the official mobile services provider within the London Olympic Village. (See Outage Strikes O2 UK, Now What, O2? and O2's Payback.)

  • Orange France, July 2012: A major Friday evening service outage that affected about 26 million customers followed an upgrade to Alcatel-Lucent (NYSE: ALU) HLR software carried out by staff from the vendor and the carrier. The upgrade introduced data inconsistencies that led to a rising tide of inconsistent messages being exchanged by Orange France's subscriber systems, which eventually became overloaded and unable to process new requests. The operator provided details of why and how the outage happened in testimony to the country's National Assembly: France takes these issues seriously.

  • Telenor Group (Nasdaq: TELN), June 2011: An upgrade to a mobile broadband-related server resulted in the "most extensive breakdown that Telenor has experienced since the mobile network was established in 1993." Signaling traffic between various servers escalated and then "created disturbances" in other network elements, affecting the availability of voice and text message services to 3 million users. (See Telenor Explains Mobile Outage.)

  • Verizon Wireless, April 2011: The U.S. operator suffered problems with its 4G/LTE service that were linked to a Nokia Networks HSS. Verizon has subsequently suffered a number of interruptions to its 4G services, but the problems appear to have been caused by a different network issue each time. (See Verizon: LTE Is Back and Euronews: April 21.)

  • T-Mobile Deutschland GmbH, April 2009: The German giant's 40 million mobile customers were cut off for about four hours following an HLR system crash. The vendor, NSN, held up its hands and issued an apology, a move not only to be applauded but one that would have helped T-Mobile get to grips with the fallout. (See Server Glitch Crashes T-Mobile Network.)

The good news is that there isn't a major danger of a service outage every time subscriber database software needs to be updated. "Regular HLR software upgrades happen all the time and they shouldn't be the cause of any problems," says Heavy Reading Senior Analyst Jim Hodges. "The major issues tend to come when there is a hardware upgrade or an operator is moving from a HLR to a HSS, or sometimes when there are a lot of new features in a software upgrade."

Those sorts of major upgrades are likely to become more commonplace as operators upgrade their packet core and subscriber data systems capabilities to support LTE services, so it's a safe bet that further reports of major mobile service outages will reach us during the next few years.

— Ray Le Maistre, International Managing Editor, Light Reading

  • Kevin Mitchell 12/5/2012 | 5:26:27 PM
    re: HLR Headaches

    Protecting the Diameter servers, including the HSS, in an LTE or IMS network is the prime job of the Diameter signaling controller (DSC). The DSC addresses network overload conditions, protecting the servers that can't protect themselves. It can also load balance across HSS farms and provide subscriber profile routing (matching the subscriber to the right HSS).

    Using DSCs, service providers can also create a geographically redundant network with dispersed HSS servers to provide uptime in the event of data center failure.
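
    The three DSC roles described in this comment (subscriber profile routing, load balancing across an HSS farm, and overload protection) can be illustrated with a toy model. This is only a sketch under stated assumptions, not how any real DSC product or the Diameter protocol itself works: the names HssNode and SignalingController, the IMSI-prefix routing table and the fixed in-flight limit are all hypothetical and chosen purely for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class HssNode:
    """A hypothetical HSS server with a fixed in-flight request limit."""
    name: str
    max_in_flight: int
    in_flight: int = 0

    def has_capacity(self) -> bool:
        return self.in_flight < self.max_in_flight


class SignalingController:
    """Toy DSC: profile routing, load balancing and overload shedding."""

    def __init__(self, farms: dict[str, list[HssNode]]):
        # Assumption: each key is an IMSI prefix, each value the HSS farm
        # that holds the profiles of subscribers with that prefix.
        self.farms = farms
        self._next_index = {prefix: 0 for prefix in farms}

    def _farm_for(self, imsi: str) -> tuple[str, list[HssNode]]:
        # Subscriber profile routing: the longest matching prefix wins.
        matches = [p for p in self.farms if imsi.startswith(p)]
        if not matches:
            raise LookupError(f"no HSS farm configured for IMSI {imsi}")
        prefix = max(matches, key=len)
        return prefix, self.farms[prefix]

    def route(self, imsi: str) -> HssNode | None:
        """Return the HSS that accepts the request, or None if it is shed."""
        prefix, farm = self._farm_for(imsi)
        start = self._next_index[prefix]
        # Load balancing: round-robin, skipping nodes that are already full.
        for offset in range(len(farm)):
            idx = (start + offset) % len(farm)
            node = farm[idx]
            if node.has_capacity():
                node.in_flight += 1
                self._next_index[prefix] = (idx + 1) % len(farm)
                return node
        # Overload protection: every node in the farm is full, so the
        # request is rejected here rather than forwarded to the servers.
        return None

    def complete(self, node: HssNode) -> None:
        """Mark one request on the node as finished."""
        node.in_flight = max(0, node.in_flight - 1)
```

    In this sketch, a farm of two nodes that each allow two in-flight requests accepts four requests in alternating order and sheds the fifth until complete() frees a slot. The point of the design is that excess signaling is dropped at the controller, so a surge does not pile onto servers that are already saturated.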
