Deep Learning Is Great, But Use Cases Remain Narrow
by Alex Woodie
Deep learning is all the rage these days, and is driving a surge in interest around artificial intelligence. However, despite the advantages that deep neural networks can bring for certain applications, the actual use cases for deep learning in the real world remain narrow, as traditional machine learning methods continue to lead the way.
The rapid ascent of deep learning is arguably one of the least expected technological phenomena of the past five years. While neural networks have been around for decades, it wasn’t until the University of Toronto’s Geoff Hinton paired those techniques with a new computational paradigm (the GPU) and huge amounts of training data that what we now know as deep learning emerged.
The result is that deep neural network technology has evolved “at a lightning rate,” Nvidia CEO Jensen Huang said earlier this year. “What started out just five years ago with AlexNet…five years later, thousands of species of AI have emerged.”
To be sure, novel deep learning techniques have driven big breakthroughs that have caught the imagination of the public. We wouldn’t be as close to putting self-driving cars on the road without the computer vision breakthroughs delivered courtesy of convolutional neural networks (CNNs). We wouldn’t have seen Google develop a self-learning system that emerged victorious in the game of “Go.” The list goes on.
But for all the excitement that deep learning has generated around the creation of new AI systems, these techniques are still very much the rarity when it comes to what data scientists are building in the real world today.
“I maintain a list of existential questions that impact our roadmap and vision for the future, and one of them is what percentage of production jobs from our customer base will be deep learning, and on what time horizon,” says Hilary Mason, the general manager of machine learning for Cloudera.
The percentage of production deep learning jobs on Cloudera clusters currently is between 5% and 10%, Mason estimates. “It would have to be somewhere near 40% before we build a lot of that special capability [into the product],” she says. “…[W]e have no concrete plans to go fully deep learning yet.”
Deep learning is being used in pockets, Mason says, but it’s far from the dominant data science approach. “We fully support it and we believe that support will have to increase,” she says. “But I don’t think it’s a six-month time horizon for things being fully [baked]. I think it’s in years.”
Deep learning using neural network architectures does bring advantages over traditional machine learning approaches, such as K-means, linear regression, and random forest algorithms. For starters, it doesn’t require data scientists to identify the features in the data they want to model and then do the feature engineering to highlight those features in the model, a step that is still often done by hand. Instead, the pertinent features are identified by the deep learning model itself, eliminating that step; the data scientist’s work shifts to architectural engineering, as the sketch below illustrates.
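To make that contrast concrete, here is a minimal sketch in Python, assuming scikit-learn and Keras are installed; the data, hand-picked features, and layer sizes are hypothetical illustrations, not anything drawn from Cloudera or this article:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from tensorflow import keras

# --- Classical approach: the data scientist hand-crafts the features ---
def engineer_features(raw_signals):
    """Hand-picked summary statistics stand in for domain feature engineering."""
    return np.column_stack([
        raw_signals.mean(axis=1),   # feature 1: chosen by a human
        raw_signals.std(axis=1),    # feature 2: chosen by a human
        raw_signals.max(axis=1),    # feature 3: chosen by a human
    ])

raw = np.random.rand(200, 64)                  # 200 hypothetical raw samples
labels = (raw.mean(axis=1) > 0.5).astype(int)  # toy labels for illustration

clf = RandomForestClassifier(n_estimators=100)
clf.fit(engineer_features(raw), labels)        # model sees only engineered features

# --- Deep learning approach: the network learns its own features ---
# The human effort moves to the architecture: layer types, widths, depth.
model = keras.Sequential([
    keras.layers.Input(shape=(64,)),
    keras.layers.Dense(32, activation="relu"),  # learned feature extractor
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit(raw, labels, epochs=5, verbose=0)     # model sees the raw input directly
```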
Deep learning can also be more accurate, provided the data set is big enough, clean, and well-labeled. Traditional machine learning approaches, by comparison, can work even with smaller data sets – although the sample size would still need to be large enough to be statistically significant. Deep learning can be useful for eking out the last few percentage points of accuracy, which can be critical in some use cases.
The easy availability of open source deep learning frameworks, such as Keras, PyTorch, and TensorFlow, has also contributed to the sudden rise of this statistical approach. Many of the tools treading the machine learning automation waters, including Cloudera’s Data Science Workbench (CDSW), support these frameworks.
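As an illustration of how little code these frameworks demand, here is a minimal PyTorch sketch of a comparable small classifier; the shapes, hyperparameters, and toy data are arbitrary choices for illustration only:

```python
import torch
import torch.nn as nn

# A tiny binary classifier; the layer sizes are arbitrary illustration values.
model = nn.Sequential(
    nn.Linear(64, 32),
    nn.ReLU(),
    nn.Linear(32, 1),
    nn.Sigmoid(),
)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCELoss()

x = torch.rand(200, 64)                         # hypothetical raw inputs
y = (x.mean(dim=1) > 0.5).float().unsqueeze(1)  # toy labels

for _ in range(5):                              # a few gradient steps
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
```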
But there are big challenges that come with deep learning, too. Thanks to the computational demands of training on large data sets, GPUs are the favored processor type for deep learning models. This has been a boon for Nvidia, which has ridden the surge in demand for GPUs spurred by the AI wave to an enormous increase in revenues and stock value. But it also puts deep learning out of reach for the average company that doesn’t have a spare GPU cluster handy.
The analytics giant SAS also supports deep learning approaches across its various product suites. According to SAS text analytics expert Mary Beth Moore, deep learning is not as widely used as some may believe.
“We offer it. It’s on the market. We have customers using it,” Moore told Datanami at SAS’s recent Analytics Experience conference. “But in the market, deep learning is still so heavily tied to CNN and image use cases.”
CNNs have exploded in popularity, thanks to Hinton and his team’s work on the original AlexNet architecture, which spurred hundreds of variants. A different deep learning architecture, called a recurrent neural network (RNN), is most often used for language use cases. However, while RNNs have found success in the language translation and chatbot space, they have not been as widely used in other text analytics use cases, where traditional machine learning techniques prevail, Moore says.
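The architectural split is easy to see in code. Below is a minimal Keras sketch of each network type, with hypothetical input shapes and layer sizes chosen purely for illustration:

```python
from tensorflow import keras

# CNN: convolution and pooling layers exploit the 2-D structure of images.
cnn = keras.Sequential([
    keras.layers.Input(shape=(28, 28, 1)),          # hypothetical grayscale images
    keras.layers.Conv2D(16, 3, activation="relu"),
    keras.layers.MaxPooling2D(),
    keras.layers.Flatten(),
    keras.layers.Dense(10, activation="softmax"),   # e.g., ten image classes
])

# RNN: recurrent layers process text as an ordered sequence of tokens.
rnn = keras.Sequential([
    keras.layers.Input(shape=(50,)),                # hypothetical 50-token sentences
    keras.layers.Embedding(input_dim=10_000, output_dim=32),
    keras.layers.LSTM(32),
    keras.layers.Dense(1, activation="sigmoid"),    # e.g., a sentiment score
])
```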
“We have embedded deep learning into our text analytics solutions, so it is available,” she says. “We’re certainly helping customers move in that direction as well. But the ask in the market is still heavily around computer vision.”
That sentiment was backed by Oliver Schabenberger, SAS executive vice president, CTO, and COO. “A very bright light is shining on deep learning, and it’s moving us forward in ways that we had not anticipated, and that’s great,” he says. “But it cannot be the answer to every problem.”
SAS advocates a blended approach, what a data scientist might call an ensemble, that leverages newer deep learning techniques alongside well-understood machine learning techniques, or what might be called “classical” machine learning. This can help give SAS clients the maximum insight out of raw materials, such as a block of unstructured human text.
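SAS’s own implementation is not public, so the sketch below shows only the generic idea in scikit-learn, with a small neural network (MLPClassifier) standing in for the deep learning member of the ensemble; all data and names are hypothetical:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

X = np.random.rand(300, 20)                   # hypothetical feature matrix
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)     # toy labels

# Soft voting averages the predicted probabilities of classical and neural members.
blended = VotingClassifier(
    estimators=[
        ("classical_lr", LogisticRegression(max_iter=1000)),
        ("classical_rf", RandomForestClassifier(n_estimators=100)),
        ("neural_net", MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500)),
    ],
    voting="soft",
)
blended.fit(X, y)
probabilities = blended.predict_proba(X[:5])  # blended predictions for five samples
```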
“We have classical systems for natural language interaction. It provides logic, reason, and contextual understanding,” Schabenberger says. “Those two things are very difficult to imbue in a model just [by] training it on a bunch of data. That’s why some of these AI systems are really good at translating language, but they can’t tell you what they are translating. It’s a very mechanical thing.”
Deep learning systems can translate from English to French and back better than the average interpreter. “But they’re not good at actually telling you what the gist is of the text, what the author was meaning,” Schabenberger says. “That still needs our understanding.”