Applications of Deep Learning

Big Data IoT Forum | July 22, 2016

This post highlights a number of important applications found for Deep Learning so far. It is often estimated that around 80% of data is unstructured. Unstructured data is the messy stuff that quantitative analysts have traditionally tried to stay away from. It can include images of accidents, loss adjusters’ text notes, social media comments, claim documents, medical doctors’ reviews and so on. Unstructured data has massive potential but has not traditionally been considered a source of insight. Deep Learning is becoming the method of choice for such data because of its exceptional accuracy and its capacity to capture information from unstructured data.

Text mining utilizes a number of algorithms to make linguistic and contextual sense of the data. The usual techniques are text parsing, tagging, flagging and natural language processing [1]. Unstructured data and text mining are closely linked, because much unstructured data is qualitative free text: loss adjusters’ notes, notes in medical claims, underwriters’ notes, critical remarks by claim administration on particular claims and so on. For instance, a sudden surge in homeowners’ claims in a particular area might remain a mystery, but text analytics can reveal that the claims are due to rapid growth of mold in that area. Text analytics is also useful when lines have little data or are newly introduced, which is becoming more and more necessary in our fast-moving society, where new emerging risks arrive before the previous ones have even ossified [2].
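To make the mold example concrete, here is a minimal sketch of how a surge in a term might be surfaced from free-text claim notes. It is a simple term-frequency comparison, not a Deep Learning model, and everything in it is an assumption for illustration: the sample notes, the tiny stop-word list, and the function names `term_counts` and `surging_terms` are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical stop-word list, kept tiny for illustration.
STOP_WORDS = {"the", "a", "an", "in", "of", "and", "to", "on", "from", "with", "was", "is", "at"}


def term_counts(notes):
    """Count in how many notes each non-stop-word term appears."""
    counts = Counter()
    for note in notes:
        words = set(re.findall(r"[a-z]+", note.lower()))
        counts.update(words - STOP_WORDS)
    return counts


def surging_terms(baseline_notes, recent_notes, min_ratio=2.0, min_count=2):
    """Return (term, count) pairs whose share of notes rose by at least min_ratio."""
    base = term_counts(baseline_notes)
    recent = term_counts(recent_notes)
    n_base = max(len(baseline_notes), 1)
    n_recent = max(len(recent_notes), 1)
    flagged = []
    for term, count in recent.items():
        if count < min_count:
            continue  # ignore terms seen in too few recent notes
        recent_share = count / n_recent
        # Add-one smoothing so terms unseen in the baseline do not divide by zero.
        base_share = (base.get(term, 0) + 1) / (n_base + 1)
        if recent_share / base_share >= min_ratio:
            flagged.append((term, count))
    # Most frequent first, ties broken alphabetically.
    return sorted(flagged, key=lambda item: (-item[1], item[0]))


# Made-up adjuster notes: an earlier period and a recent period.
baseline = [
    "Water damage in kitchen from burst pipe",
    "Roof damage after storm",
    "Burst pipe flooded basement",
    "Storm damage to fence",
]
recent = [
    "Mold found behind basement wall",
    "Mold growth in bathroom ceiling",
    "Roof damage after storm",
    "Extensive mold in attic",
]

flagged = surging_terms(baseline, recent)
print(flagged)
```

On these made-up notes, "mold" is the only term that appears in at least two recent notes and has grown sharply relative to the earlier period. A production system would work on far larger volumes and would likely use proper NLP tooling rather than a regular expression.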

Sentiment analysis (opinion mining) of expert judgment on the level of uncertainty in actuarial models such as pricing, capital modeling and reserving can also prove fruitful. Natural Language Processing software, such as Stanford’s ‘CoreNLP’, which is available as a free download [3], is a powerful tool for making sense of text.

Aside from exposures, the other side of ratemaking is losses and loss trends. By building Deep Learning models we can analyse images to estimate repair costs. Deep Learning techniques can also be applied to automatically categorize the severity of damage to vehicles involved in accidents. This will give us more accurate severity data, updated more quickly, for modelling pure premiums [4].

Claim professionals often have more difficulty in assessing loss values associated with claims that are commonly referred to as “creeping Cats” [5].

These losses typically involve minor soft tissue injuries, which are reserved and handled as such. Initially, these soft tissue claims are viewed as routine. Over time, however, they develop negatively. For example, return-to-work dates get pushed back, stronger pain medication is prescribed, and surgery may take place down the road. Losses initially reserved at $8,000–$10,000 then become claims costing $200,000–$300,000 or more. Since these claims may develop over an extended time period, they can be difficult to identify. Creeping Cats are a big problem for emerging liabilities because, for the most part, we do not fully know what we are dealing with. Emerging risks such as cyber-attacks and terrorism have been shown to have huge creeping Cat potential.

As discussed, Deep Learning models can review claim data, whether real or (where real data is unavailable) simulated through agent-based modeling and network theory, for similarities and other factors shared by such losses, thereby alerting the claims professional to emerging risks that may have creeping Cat potential. With this information, strategies and resources can be applied at the point in time where they can be most effective, in an effort to achieve the best possible outcome and control cost escalation. Additional premium loadings can also be applied to areas with higher creeping Cat potential.
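The reserve development described above (a claim opened at around $8,000 that ends up costing $250,000) can be illustrated with a very simple screen. The sketch below is a rule-based check and not a Deep Learning model; the `Claim` class, the threshold of 3.0 and the sample figures are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Claim:
    claim_id: str
    # Incurred value (paid plus outstanding reserve) at successive reviews.
    incurred_history: List[float]


def development_factor(claim):
    """Latest incurred value divided by the initial reserve."""
    initial = claim.incurred_history[0]
    latest = claim.incurred_history[-1]
    return latest / initial


def flag_creeping(claims, threshold=3.0):
    """Return ids of claims whose incurred value has grown to at least
    `threshold` times the initial reserve."""
    flagged = []
    for claim in claims:
        history = claim.incurred_history
        # Need at least two reviews and a positive opening reserve.
        if len(history) < 2 or history[0] <= 0:
            continue
        if development_factor(claim) >= threshold:
            flagged.append(claim.claim_id)
    return flagged


# Made-up claims: one routine, two that develop adversely.
claims = [
    Claim("A-001", [9_000, 9_500, 10_000]),
    Claim("A-002", [8_000, 30_000, 120_000, 250_000]),
    Claim("A-003", [10_000, 12_000, 45_000]),
]

flagged_ids = flag_creeping(claims)
print(flagged_ids)
```

A screen like this only catches a claim after it has already developed. The value of a predictive model would lie in flagging such claims earlier, from notes and other features, before the reserve has moved.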

Another powerful application is fraud analytics, especially since the advent of cyber insurance. With big data, insurers are collecting more and more sensitive and personal data on customers. They have also started selling cyber insurance, under which the insurance company covers losses arising from data theft, hacking, internal sabotage and similar attacks on the insured company’s IT systems. With the advent of cyber insurance, fraud and intrusion have been explicitly commoditized as a risk factor. Fraud analytics is therefore more crucial for insurance companies now than ever before. An insurer that covers other companies for cyber risk must itself have top-notch fraud analytics software, so that breaches of its own privacy and data security are kept to the bare minimum, given that insurers store more and more data on their clients. The same software is needed to detect fraud in claims and other internally caused fraud.

It would be most embarrassing and costly indeed if an insurance company that insures other companies against cyber risk were itself hacked or otherwise compromised. Aside from the reputational damage, the leak of extremely sensitive and personal data on policyholders could also spark class-action liability claims in court.

Doctors can for the first time use the predictive power of Deep Learning to directly improve patients’ medical outcomes. Deep Learning can readily handle a broad spectrum of diseases across the entire body and all imaging modalities (x-rays, CT scans, etc.). It contextualizes the imaging data by comparing it to large datasets of past images and by analyzing ancillary clinical data, including clinical reports and laboratory studies.

In an initial benchmarking test against the publicly available LIDC dataset, technology from Enlitic, a startup using Deep Learning to improve healthcare, detected lung cancer nodules in chest CT images 50% more accurately than an expert panel of radiologists. In initial benchmarking tests, Enlitic’s Deep Learning tool also regularly detected tiny fractures as small as 0.01% of the total x-ray image. The tool is designed to support hundreds of diseases simultaneously, rather than a single disease or a limited specialization [6].

Another key combination is Deep Learning’s integration and synergy with Big Data. Actuaries will have to understand and appreciate the growing use of big data and its potentially disruptive impact on the insurance industry. Actuaries will also need to become more proficient with the underlying technology and tools required to use big data in business processes [7]. This is the focus of the next section.

Deep Learning, in collaboration with other machine learning tools, is making headway in possible applications. All the major giants such as Google, IBM and Baidu are aggressively expanding in this direction, but startups are providing the most vivid applications so far. Kensho is a startup that aims to use software to perform in minutes tasks that would take analysts weeks or months. Just as when searching with Google, analysts can type their questions into Kensho’s search engine. The cloud-based software, according to Forbes reporter Steven Bertoni, can find targeted answers to more than 65 million combinations in an instant by scanning over 90,000 actions, as varied as political events, new laws, economic reports and drug approvals, and their impact on nearly any financial instrument in the world. Another startup, Ufora, is set to automate a large part of the quantitative finance work undertaken by quants, especially on the stochastic modeling front. Even some hedge funds, such as Renaissance Technologies, are proactively working on machine learning and Deep Learning algorithms to better see patterns in financial data and exploit opportunities (which stocks are overrated or underrated, whether the market is going strong on fundamentals or approaching the bubble stage, and so on) to guide their investment strategies [8].

Meanwhile, firms like Narrative Science and Automated Insights, which work on text analytics, are utilizing Deep Learning to create lively and interactive narrative reports out of data and numbers. This essentially means a report written by a machine that reads almost as if it were written by a human author. To elaborate, Narrative Science’s Quill platform undertakes statistical analysis, applying time series, regression and similar methods, and its semantic engine then separates the important signal in the data from the unimportant noise according to the needs of the particular audience in question; the reasoning differs, for example, depending on whether the report is for a quant or an investment trader. Patterns are spotted and made sense of in a holistic manner. Particular attention is given to anomalies and to results that deviate from the main body of the results, in order to ascertain their impact and proper interpretation. The platform remembers previous reports so that it does not become repetitive. Natural Language Generation is applied with a surgeon’s precision and expertise in forming such a dynamic semantic engine.

This is indeed a leap forward, as report writing consumes a lot of human time and effort and machine-written reports of this kind were practically unheard of before. Deep Learning allows us not just to explore and understand the data better, but also to forecast better. On the predictive analytics side, the startup MetaMind is working to help financial firms assess the chances of stocks being sold by going through corporate financial disclosures [9]. It identifies from previous experience when a particular combination of actions led to a particular result, in order to assess the chances of the same result happening in the future.

Extrapolating this trend into the future, it is my opinion that such analytics might soon find their way into Mergers & Acquisitions (M&A), where they will be able to estimate the probability of some key event happening, and its consequences, in a high-stakes deal. Another application would be to apply Deep Learning to one of our most vexing problems, namely financial crises. Economists, financial experts and social scientists have elaborated on many of the key issues that lead to financial crises in general, as well as those specific to a particular meltdown. These can form the modeling methodology for the Deep Learning machine to analyze the vast scale of data available on any and every platform from which it can be gathered. Such evaluation can perhaps help us to see patterns that we would otherwise have missed, and to understand more accurately the sequential movements and mechanisms involved in a particular financial contagion and crisis. There is no guarantee that this will work. But perhaps it can shed some light inside the ‘quantum black box’ of financial crises. This seems to be the need of the hour, with recurring financial hemorrhages such as the EU crisis over Greek debt and Brexit, as well as the recent massive and escalating falls in Chinese stock exchanges, reminding us of the bitter past we faced in the Wall Street crisis of 2008-09.

Given all these developments, there is still a myriad of issues that need clarification, not just with Deep Learning specifically but also with big data generally. Automation of such unprecedented scale and intensity raises the possibility of mass redundancies in the labor force across the economy. Are we comfortable with giving up control to such applications without knowing the full implications of such a move? Not every innovation brings positive results or proves sustainable in the long run. Technology is progressing at a rapid, seemingly unstoppable pace, but can we manage the social consequences and make it sustainable in the long term? Human effort is seemingly being diverted from other fields into IT, which could imply a concentration of power in one overlord field to the potential detriment of others. Are we ready for this? From a consumer’s point of view, how ethical is it that marketing personnel know you so well that rational optimization becomes very difficult for you as a consumer?

These are all good questions, and they should be tackled jointly by all the stakeholders involved, such as data scientists, governments, the professions and consumers, so as to reach a common policy that can alleviate such concerns. The core aim of the policy has to be to sustain technology for the benefit of our societies: to create value, to reduce scarcity and the fragility of our systems, and to generate more resources for our prosperity, instead of creating Frankenstein’s monster, as Terminator and other doomsday movies would have us believe.

[1] Stanford Natural Language Processing Group; available at: http://nlp.stanford.edu/software/

[2] CAS Ellingsworth and Balakrishnan: 2008. Practical text mining in insurance

[3] Stanford Natural Language Processing Group; available at: http://nlp.stanford.edu/software/

[4] PwC; March 2016; Top Issues: AI in Insurance; hype or reality?

[5] Lentz,W. GenRe Research (Nov 2013); Predictive Modeling—An Overview of Analytics in Claims Management

[6] Enlitic website; science and solutions

[7] SOA; The Actuary Magazine December 2013/January 2014 – Volume 10, Issue 6, Ferris et al “Big Data”.

[8] Dugas et al; Statistical Learning Algorithms Applied to Automobile Insurance Ratemaking

[9] MetaMind company’s website
