Deep learning is one of the most exciting elements of artificial intelligence, but it's also one of the slowest moving. IBM Fellow Hillery Hunter calls deep learning, which enables computers to extract meaning from images and sounds with no human intervention, a "rarified thing, off in an ivory tower," but a recent breakthrough by the company promises to make it more accessible -- and a lot faster too.
Last week IBM Corp. (NYSE: IBM) announced software that cuts the time needed to train deep neural networks from weeks to hours, or from hours to minutes, depending on the use case, while also improving accuracy. It accomplished this by scaling its training applications across 256 Nvidia Corp. (Nasdaq: NVDA) GPUs in 64 IBM Power systems.
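The scaling approach IBM describes distributes training across many GPUs, with each device computing gradients on its own slice of the data before the results are averaged. The following is a minimal illustrative sketch of that data-parallel idea in plain Python, not IBM's actual software; the toy linear model, learning rate, data and worker count are all invented for the example.

```python
# Sketch of data-parallel training: each "worker" (standing in for a GPU)
# computes the gradient of a squared-error loss for a toy linear model
# y = w * x on its shard of the batch; the gradients are then averaged,
# mimicking an all-reduce step. All values here are made up.

def gradient(w, shard):
    # d/dw of the mean squared error 0.5 * (w*x - y)**2 over one shard
    return sum((w * x - y) * x for x, y in shard) / len(shard)

def data_parallel_step(w, batch, num_workers, lr=0.01):
    shard_size = len(batch) // num_workers
    shards = [batch[i * shard_size:(i + 1) * shard_size]
              for i in range(num_workers)]
    grads = [gradient(w, s) for s in shards]  # run in parallel in practice
    avg_grad = sum(grads) / num_workers       # the "all-reduce" averaging
    return w - lr * avg_grad

# Toy data generated from y = 2x; with equally sized shards, the averaged
# update matches what one worker would compute on the full batch.
batch = [(x, 2.0 * x) for x in range(1, 9)]
w = 0.0
for _ in range(100):
    w = data_parallel_step(w, batch, num_workers=4)
print(round(w, 3))  # → 2.0, the true slope
```

The point of the sketch is that splitting the batch changes where the arithmetic happens, not the answer; the engineering challenge IBM tackled is keeping that averaging step efficient across 256 real GPUs.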
IBM focused specifically on image recognition and trained its model in 50 minutes; the previous record, held by Facebook, was one hour. IBM says it achieved accuracy of 33.8% for a neural network trained on 7.5 million images. The previous record here was held by Microsoft Corp. (Nasdaq: MSFT) at 29.8% -- a gain of only 4 percentage points, but Hunter says typical past improvements have been less than 1%.
"We need this kind of breakthrough where we've integrated this capability for speed into a number of deep learning frameworks and packages and provided it out to customers," she says. "It's the kind of thing that changes the rate and pace of an artificial intelligence capability like this."
In a recent interview, Hunter walked Light Reading through IBM's breakthrough, why it matters, timelines for deployment and what it means specifically for the telecom industry. (We even threw in one Women in Comms-centric question.) Read on for a lightly edited transcript of our discussion.
Light Reading: IBM's big breakthrough here centered on the speed to analyze and interpret data. How much have you improved it, and what's the ultimate goal?
Hillery Hunter: It is very interesting because deep learning works by feeding the neural networks many pieces of data where that data has been labeled. Pictures, for example, are labeled with the type of object, animal or place that they are, and the neural network ingests millions of those pre-labeled pictures. In some sense it's like reading a book -- it's given information that's known to be safe and correct, so it learns, and then you test it against a set of pictures or data it's never seen before. That is how you come up with validation that it learned correctly from what it was given. It's one of the few areas in modern computing where people wait weeks for full results. There are not a whole lot of others we would tolerate that in.
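The train-then-validate workflow Hunter describes -- learn only from labeled data, then score the model on examples it has never seen -- can be sketched in a few lines. This is a deliberately trivial illustration, not real deep learning: the "model" below just memorizes the most common training label, and the fake dataset and label names are invented for the example; a neural network replaces the model, but the validation workflow has the same shape.

```python
import random
from collections import Counter

random.seed(0)
labels = ["house", "car", "tree"]
# Fake labeled dataset: (features, label). The features stand in for
# image pixels and are ignored by this toy model.
data = [({"pixels": i}, random.choice(labels)) for i in range(1000)]

random.shuffle(data)
split = int(0.8 * len(data))
train, validation = data[:split], data[split:]  # validation stays unseen

# "Training": find the most common label in the training set.
majority_label, _ = Counter(label for _, label in train).most_common(1)[0]

# "Validation": accuracy on held-out examples the model never saw,
# which is what tells you whether it actually learned.
correct = sum(1 for _, label in validation if label == majority_label)
accuracy = correct / len(validation)
print(f"majority baseline accuracy: {accuracy:.2f}")
```

Because the labels here are random, this baseline hovers around one in three; the key structural point is that accuracy is only ever measured on data the model was not trained on.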
Deep learning has shown itself to be really effective, especially at speech. When you're talking to a phone, it drives speech recognition today. When you are using social media and images are auto labeled with the person or place it was taken -- all those things are deep learning, and it works well. It's now being applied to other things in the enterprise like credit card fraud and risk and things like that because it works well and you can get to even better than human accuracy with the tasks you are doing with deep learning. Because of that, people have tolerated really long learning and model training times. We are aware of many, many cases where deep learning researchers are waiting weeks to get to the results they need. We see that as unacceptable. We want to get it into hours and then for jobs that are smaller, take it from hours to seconds. That could be transformative, to get to seconds. It depends on the use case and how much data you are feeding the model. It takes so long because you want the model to work well so you feed it lots of data to learn the task.
LR: How much human involvement is required in deep learning processes?
HH: With deep learning, it doesn't require feature engineering. In many other types of machine learning that are used for artificial intelligence, the human has to specify the features that help the computer identify the task it's trying to do. This is not mathematically how it works, but conceptually, in order for a computer outside of deep learning to learn what houses look like, you'd have to pick out that a house has two sloping lines for a roof and two sets of walls; it has "n" windows and a door. You would have to pick those things out and say, "These make up a house." Every time the computer would see all those features in one picture, it'd say, "I think this is a house." In deep learning, you just throw data at the computer with the right answer, and it sees enough house pictures that it figures out the features itself. You've shown it many houses, so it figures the features out. It does require you to show it many houses, because it has to find those common features, but it saves the user a lot of time because they don't have to identify all the features that comprise a specific object.
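The contrast Hunter draws -- a human hand-listing features versus the program discovering them from labeled examples -- can be caricatured in a short sketch. The feature names and the "learning" rule below are invented for illustration; a real deep network finds features numerically through learned weights, not by set intersection.

```python
# Hand-engineered approach: a human lists the features that define a house,
# and the classifier just checks that they are all present.
HOUSE_FEATURES = {"sloped roof", "walls", "windows", "door"}

def is_house_by_rules(features):
    return HOUSE_FEATURES <= features

# Deep-learning-style caricature: show the program many labeled examples
# and let it discover the common "house" features itself, here by
# intersecting the feature sets of every house example.
def learn_common_features(labeled_examples, target_label):
    feature_sets = [f for f, label in labeled_examples if label == target_label]
    return set.intersection(*feature_sets)

examples = [
    ({"sloped roof", "walls", "windows", "door", "chimney"}, "house"),
    ({"sloped roof", "walls", "windows", "door", "garage"}, "house"),
    ({"wheels", "windows", "door", "engine"}, "car"),
]
learned = learn_common_features(examples, "house")
print(sorted(learned))  # → ['door', 'sloped roof', 'walls', 'windows']
```

Note that the "learner" needed more than one house example to separate what is essential from what is incidental (the chimney, the garage) -- the toy analogue of Hunter's point that you must show it many houses.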
LR: What are the use cases for deep learning that are most exciting in the world of telecom and network operations?
HH: Deep learning is definitely looking like it could play a role in cybersecurity. It has also been shown to improve data center efficiency by managing power and cooling. I would go with security, data center management and also understanding customer characteristics and demand forecasting. There are lots of potential use cases.