AI in Banking and Insurance
TL;DR
Regulatory requirements for banks and insurance companies are kept particularly strict by BaFin and the GDPR. Artificial intelligence (AI) can therefore rarely deliver the desired benefits here, or is struck from the agenda altogether. Nevertheless, many use cases are feasible. This publication highlights three such use cases for AI in banks and insurance companies along with associated out-of-the-box solutions, identifies in-house data sources for training AI, and offers advice based on my own experience.
Dynamic Malware Analysis
Thanks to virus developers at state-sponsored companies like the NSO Group sending their malware out into the world, the number of infected networks is rising steadily.
While the cat-and-mouse game between virus and antivirus software has been going on since Fred Cohen's 1986 dissertation - the world's first on computer viruses - the game is taking a new spin with the rise of AI: virus developers now use AI for code obfuscation, model inversion attacks on anomaly detection systems, and polymorphic viruses at new levels of scalability.
The banking trojan Emotet, for example, first observed in 2014, was almost impossible to track down without automated methods due to its fluid behavior.
Two further reasons justify the recommendation to use AI here:
- The shortage of experts in the cybersecurity sector
- The need, especially in understaffed situations, to make all the more intelligent decisions about incoming traffic
And where human intelligence is lacking, AI can come to the rescue - for example, in detecting malware.
Tools like VMware's NSX Advanced Threat Analyzer use machine learning and an orchestra of expert systems to put Emotet in its place. The SIEM system Splunk Enterprise Security, from the company of the same name, scans the IT landscape for malware using pre-trained ML models.
Anomaly Detection

Internal anomaly detection usually reports a certain percentage of false positives. Each one has to be checked manually, which takes time. We therefore look for ways to reduce the effort of manual checking.
This is where LSTM networks come in. Long Short-Term Memory networks, from the family of recurrent neural networks, make decisions based on feedback loops that pass the output back to the input, thus developing a memory that spans a longer or shorter period of time, depending on the configuration. In anomaly detection, LSTMs can take the IT staff's feedback ("That's not an anomaly, that's a false positive. Remember that.") into account in future decisions, so that the number of false positives decreases.
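How such a detector can work is sketched below, in the spirit of next-event prediction on log data (the window size, layer sizes, and the random placeholder events are illustrative assumptions, not a tuned production setup):

```python
# Minimal LSTM anomaly detection sketch: learn to predict the next log-event ID
# from a window of past events; unlikely continuations are flagged as anomalies.
import numpy as np
import tensorflow as tf

WINDOW = 10      # number of past events the LSTM sees (illustrative)
NUM_EVENTS = 50  # size of the event-ID vocabulary (assumed)

# Placeholder for real parsed logs: a sequence of event IDs from normal operation.
events = np.random.randint(0, NUM_EVENTS, size=5000)
X = np.stack([events[i:i + WINDOW] for i in range(len(events) - WINDOW)])
y = events[WINDOW:]

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(NUM_EVENTS, 32),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(NUM_EVENTS, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.fit(X, y, epochs=3, batch_size=64, verbose=0)

def is_anomalous(window, actual_event, k=5):
    """Flag an event if it is not among the top-k predicted next events."""
    probs = model.predict(window[np.newaxis, :], verbose=0)[0]
    return actual_event not in np.argsort(probs)[-k:]
```

The analyst's "false positive" feedback can later be folded back in as additional training examples, as sketched in the Training and Data section below.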
Tangible implementations of this LSTM approach can be found, e.g., in:
- Avora, by the company of the same name
- DeepLog, by Min Du et al.
- NSX Advanced Threat Analyzer, by VMware
Myriads of other possibilities would be conceivable if the GDPR were to relax its dictates. Credit card fraud, for example, can be detected more reliably by an AI than by conventional solutions, but this requires user analytics, which collides with Article 22 of the GDPR and its insistence on human decision-making (insofar as an AI would be acting autonomously here). Therefore, IT staff must continue to process the reported anomalies manually, which is often a Sisyphean task.
However, in line with the GDPR, we can give the IT specialist neural assistance that merely recommends decisions: an LSTM network that reins in the inflationary false positives. The advantage here is twofold:
- More time for the IT person while remaining compliant with the GDPR
- More exposure to AI for IT staff, helping them build proficiency in working with it
Minimal Principle and Segregation of Duties

BaFin requires compliance with the minimal principle and segregation of duties (SoD). This has consequences for companies: SoD matrices need to be defined, including categories, authorization concepts, and control processes. This not only introduces transparency, but also the pressure to establish and maintain SoD.
This creates a conflict: BaFin demands transparency, whereas companies must pursue their entrepreneurial goals. A compromise is needed.
Back to the minimal principle: the principle of minimal authorizations is a classic optimization problem and can therefore be tackled by AI - for example, by integrating an AI that observes user access and reduces permissions to a minimal level. The PAM tool CyberArk, from the company of the same name, already uses neural networks for exactly this purpose - and simplifies the minimum principle in the company's favor.
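The underlying idea can be illustrated with a simplified sketch (this is not CyberArk's actual implementation; the users, permissions, and access log are invented):

```python
# Least-privilege sketch: reduce each user's granted permissions to the set
# actually exercised during an observation window, and propose the rest for
# revocation. A human reviews the proposal - no autonomous decision (GDPR).
from collections import defaultdict

granted = {
    "alice": {"read_accounts", "write_accounts", "export_reports"},
    "bob":   {"read_accounts", "approve_loans"},
}

# Observed (user, permission) pairs, e.g. from 90 days of access logs.
access_log = [
    ("alice", "read_accounts"), ("alice", "export_reports"),
    ("bob", "read_accounts"),
]

used = defaultdict(set)
for user, perm in access_log:
    used[user].add(perm)

for user, perms in granted.items():
    unused = perms - used[user]
    if unused:
        print(f"{user}: candidate permissions to revoke: {sorted(unused)}")
```

In practice, a learned model would replace the simple set difference - for example, to predict which rarely used permissions are still legitimately needed - but the output remains a recommendation for a human reviewer.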
Regarding the SoD: here we face not an optimization problem but an inference problem. The separations are not to be minimized or maximized, but designed so that they keep safety-critical functions apart in a way that meets the security requirements. A neural network cannot help here; an ontology is more appropriate.
With an expressive ontology of operational processes, the separation needs of operational functions can be determined by inference. This is done by an inference engine, which already natively underlies most ontology software. With the help of a rule set, the inference engine detects possible conflicts in the assignment of roles and functions in the organization and suggests solutions. In this way, the SoD is simplified.
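As a toy illustration of such rule-based inference (using rdflib and a SPARQL query instead of a full OWL reasoner; the organization data and the conflict rule are invented):

```python
# SoD conflict detection over a small RDF graph: anyone holding two roles
# that are declared as conflicting violates segregation of duties.
from rdflib import Graph, Namespace

EX = Namespace("http://example.org/org#")
g = Graph()

# Role assignments plus a declared conflict between two functions.
g.add((EX.alice, EX.hasRole, EX.PaymentEntry))
g.add((EX.alice, EX.hasRole, EX.PaymentApproval))
g.add((EX.PaymentEntry, EX.conflictsWith, EX.PaymentApproval))

violations = g.query("""
    PREFIX ex: <http://example.org/org#>
    SELECT ?person ?r1 ?r2 WHERE {
        ?person ex:hasRole ?r1, ?r2 .
        ?r1 ex:conflictsWith ?r2 .
    }
""")
for person, r1, r2 in violations:
    print(f"SoD conflict: {person} holds both {r1} and {r2}")
```

A production setup would express such rules as OWL axioms or SWRL rules and let the reasoner derive the conflicts, but the principle is the same.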
Ontologies are costly to produce, but they are also richer in semantics than conventional databases and better suited to storing, mapping, and understanding the complex webs of relationships in organizations. Often, the roles and functions of the operational processes are already related to each other there, which provides the inference engine with an initial, albeit vague, rule set for inference. All that remains is to refine that rule set.
Synergy effects between ontologies and AI are possible, but difficult to achieve. Some administrative effort will also remain after the introduction of neural networks and ontologies, because many control processes still require manual execution. And unfortunately, AI cannot predict what BaFin will require next.
Training and Data

To make a company fit for AI, a data mining project is recommended as a starting point. An internationally recognized and GDPR-compliant process model is CRISP-DM (Cross-Industry Standard Process for Data Mining). Experience shows that data mining projects (whether CRISP-DM or not) spend about 70% of the project time on pre-processing and quality assurance of the data. In these phases it is therefore all the more important to rely on tools that accelerate the workflow. Particularly helpful here is SMART (full title: SMART - An Open Source Data Labeling Platform for Supervised Learning), an annotation tool for training data that speeds up labeling and thus the categorization of your data. The tool supports collaborative work, includes user permissions as well as an admin dashboard, and is MIT-licensed, thus open source, free, and commercially usable.
During labeling, noisy labels often occur: inconsistencies where several employees label one and the same data record differently, i.e. contradict each other. The metric here is inter-rater reliability. We want the highest possible inter-rater reliability, i.e. the minimum amount of noise (a sketch for measuring it follows after the list). This is possible in two ways:
- Make employees commit to uniform labeling with the help of label policies
- Implement robust learning
Robust learning has the useful side effect of greater resistance to deception attempts such as adversarial examples or label poisoning.
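Inter-rater reliability can be quantified, for example, with Cohen's kappa; the annotator labels below are made up:

```python
# Cohen's kappa: chance-corrected agreement between two annotators.
# 1.0 means perfect agreement; values near 0 mean noisy, contradictory labels.
from sklearn.metrics import cohen_kappa_score

annotator_a = ["fraud", "ok", "ok", "fraud", "ok"]
annotator_b = ["fraud", "ok", "fraud", "fraud", "ok"]

kappa = cohen_kappa_score(annotator_a, annotator_b)
print(f"inter-rater reliability (kappa): {kappa:.2f}")
```

On the robust-learning side, a simple first measure is label smoothing - e.g. tf.keras.losses.CategoricalCrossentropy(label_smoothing=0.1) with one-hot labels - which keeps the model from fitting contradictory labels too confidently.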
Up to this point, we have mentioned out-of-the-box AI solutions from various providers. However, individual solutions tailored to a company's specifics are conceivable as well. For this, we need to onboard the necessary expertise, which usually costs a lot of money. And we also need data in sufficient quantity and quality.
Free and open datasets are numerous. NSL-KDD, for example, was published with the goal of providing representative data for the development of intrusion detection systems. It comes pre-partitioned into subsets for training (KDDTrain+, KDDTrain+_20Percent) and testing (KDDTest+, KDDTest-21). Each record carries 41 features plus a label naming the attack type - grouped into categories such as denial-of-service (DoS), user-to-root (U2R), or remote-to-local (R2L) - along with typical attacker traces: number of bytes transmitted, host server rates, number of active shells, etc.
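A sketch for loading the training subset with pandas (the generic feature names abbreviate the 41 documented ones, and the coarse attack mapping below is only an excerpt):

```python
# Load NSL-KDD's KDDTrain+.txt: 41 features, the attack label, and a
# difficulty score, comma-separated without a header row.
import pandas as pd

cols = [f"f{i}" for i in range(41)] + ["label", "difficulty"]
train = pd.read_csv("KDDTrain+.txt", names=cols)

# Collapse the fine-grained attack labels into coarse classes (excerpt only).
attack_class = {
    "neptune": "DoS", "smurf": "DoS",
    "buffer_overflow": "U2R", "guess_passwd": "R2L",
    "normal": "normal",
}
train["class"] = train["label"].map(attack_class).fillna("other")
print(train["class"].value_counts())
```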
In the case of anomaly detection based on log files, the game is easier. Log files exist en masse on every computer in our network and can be fed to an LSTM via unsupervised learning - but only in the first training run. For the second training run, the IT staff must be actively involved in the feedback loop of the LSTM to help train the recognition of false positives, as sketched below. Further training data can be found on DNS servers, in the config files of domain controllers… Data is ubiquitous.
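This second training run might look as follows, reusing the model from the LSTM sketch above (the feedback examples are placeholders):

```python
# Fold analyst feedback back into the model: windows that were flagged but
# judged to be false positives become additional training examples.
import numpy as np

# Flagged context windows the analyst marked "false positive", together with
# the event that actually followed ("this is normal - remember that").
feedback_windows = np.array([[3, 7, 7, 2, 9, 1, 4, 4, 0, 5]])  # placeholder
feedback_next = np.array([8])                                   # placeholder

# Brief fine-tuning so that similar sequences are no longer reported.
model.fit(feedback_windows, feedback_next, epochs=2, verbose=0)
```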
Takeaways
- Out-of-the-box AI solutions require less expertise to apply than custom solutions, at the expense of flexibility. If you're looking for AI expertise, look for people with prior knowledge of data-based disciplines such as statistics, trend and opinion research, or analytics. Many skills required in these areas transfer well to AI operations.
- Practice almost always wins out over theory. If your AI throws false positives galore, you need IT people who know how to fix such excesses at the code level (changing the activation function, inverting labels of training data, …).
- If AI is too sensitive a matter for your information security, it can also be used in the bureaucratic back office, such as validating second approvals or creating reports. The GDPR tolerates AI only as long as it assists without making decisions of its own. Ask yourself: where in the company can we find employees with 'assistant' in their job title? There you will also find application areas for AI.