Enterprise Europe Network

High-performance machine learning engine for designing cost-efficient solutions for making risk-aware decisions

Country of origin:
Country: 
UNITED KINGDOM
Opportunity:
External Id: 
TOUK20201123001
Published
24/11/2020
Last update
02/12/2020
Expiration date
03/12/2021

Keywords

Partner keyword: 
Data Processing / Data Interchange, Middleware
Information Technology/Informatics
Simulation
Molecular design
Process control equipment and systems
Airlines
Insurance related
Banking

Summary

Summary: 
A UK university group has designed a cost-efficient solution to evaluate uncertainty that is not limited in application scale or accuracy. The proposed high performance computing (HPC) solution has been verified on real problems existing in drug design, survival prediction, air traffic control, biometric identification and fraud detection. Partners in listed industries are sought for technical or research cooperation and licensing agreements.

Description

Description: 

In many machine learning applications the data consist of a small set of abnormal cases alongside a large set of normal events. Experts analyse the patterns in these samples to decide whether a given event is abnormal or normal. In simple cases experts can describe a model that makes reliable and accurate decisions. An ideal model lets users make decisions with minimal losses caused by errors.
For example, abnormal samples can represent fraud in payment transactions, active chemical compounds, pathological cases, or events of critical aircraft proximity. These are typically defined as “positive” samples, as opposed to the “negative” normal instances.
Accurate decisions are defined as true positive and true negative outcomes, whilst errors are defined as false positive (FP) and false negative (FN) outcomes.
False decisions have different costs. For example, in fraud detection FN outcomes (the loss from undetected fraud) are more expensive than FP outcomes (the loss from blocking authorised payments). A cost-efficient solution is therefore one that minimises the overall loss caused by false decisions. Such solutions remain applicable when the patterns of detected events vary over a wide range.
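The cost asymmetry described above can be illustrated with a minimal sketch. It assumes a calibrated probability that an event is positive, plus two hypothetical cost parameters, `cost_fp` and `cost_fn`, which are illustrative names and not part of the offered engine. Under these assumptions, flagging an event has the lower expected cost whenever its probability exceeds `cost_fp / (cost_fp + cost_fn)`.

```python
def cost_threshold(cost_fp: float, cost_fn: float) -> float:
    """Probability above which flagging an event has the lower expected cost."""
    if cost_fp <= 0 or cost_fn <= 0:
        raise ValueError("costs must be positive")
    return cost_fp / (cost_fp + cost_fn)


def expected_cost(p_positive: float, flag: bool, cost_fp: float, cost_fn: float) -> float:
    """Expected loss of a single decision, given the probability the event is positive."""
    if flag:
        # Flagging is wrong only if the event is actually negative (FP).
        return (1.0 - p_positive) * cost_fp
    # Not flagging is wrong only if the event is actually positive (FN).
    return p_positive * cost_fn


def decide(p_positive: float, cost_fp: float, cost_fn: float) -> bool:
    """Return True when the event should be flagged as positive."""
    return p_positive > cost_threshold(cost_fp, cost_fn)
```

With equal costs the threshold is 0.5. If a missed fraud is assumed to cost four times as much as a blocked payment, the threshold drops to 0.2, so a transaction scored at 0.3 would be flagged.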
In practice the uncertainty of decisions is evaluated within a probabilistic framework. The most common framework provides a single point estimate, without information about the uncertainty in the predicted probabilities. For example, two fraudulent transactions with different behaviour patterns could be assigned probabilities of 0.3 and 0.8. Given a threshold probability of 0.5, the first case falls below the threshold and is classified as a normal transaction, whilst only the second case is detected as fraudulent. The first case is therefore an FN, which could have been avoided if the user had known how uncertain the model’s outcome was. Users thus need estimates of the uncertainty in the model’s outcomes in order to avoid losses caused by wrong decisions.
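The point made by the 0.3 versus 0.8 example can be sketched in code. The sketch assumes that, instead of one number, a list of sampled probabilities is available for each transaction. The function names, the 5% to 95% interval and the "uncertain" label are illustrative choices, not details of the offered engine.

```python
import statistics


def summarise(prob_samples, lower_q=0.05, upper_q=0.95):
    """Return the mean and a simple empirical interval of sampled probabilities."""
    if not prob_samples:
        raise ValueError("at least one sample is required")
    ordered = sorted(prob_samples)
    last = len(ordered) - 1
    low = ordered[int(lower_q * last)]
    high = ordered[int(upper_q * last)]
    return statistics.fmean(ordered), low, high


def classify(prob_samples, threshold=0.5):
    """Label a case, deferring the decision when the interval straddles the threshold."""
    _, low, high = summarise(prob_samples)
    if low > threshold:
        return "positive"
    if high < threshold:
        return "negative"
    return "uncertain"
```

A case whose samples are spread widely around 0.3 and reach above 0.5 would be labelled "uncertain" and could be sent for review, rather than being silently passed as normal.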
This information about uncertainty can be obtained within a full probabilistic framework based on estimating a probability density function. The framework can be implemented with two strategies: (1) analysing a given model on randomised data, and (2) analysing randomised models on given data. In practice the second strategy provides the most reliable estimates of uncertainty (please see the Figure and the attached legend).
However, the second strategy requires large-scale computation on a High Performance Computing (HPC) engine in order to be applicable to real-scale problems. The proposed methodology has been implemented as a unique HPC engine which can be deployed on cloud platforms or used as a machine learning library.
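The two strategies can be contrasted with a toy sketch. This is not the offered HPC engine, whose internals are not described here. It substitutes a one-feature logistic model, bootstrap resampling for strategy (1), and random-walk Metropolis sampling under an assumed Gaussian prior for strategy (2). All function names, the synthetic data and the parameter values are assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def log_likelihood(w, b, xs, ys):
    # Bernoulli log-likelihood of labels ys under the model sigmoid(w*x + b).
    total = 0.0
    for x, y in zip(xs, ys):
        p = min(max(sigmoid(w * x + b), 1e-12), 1.0 - 1e-12)
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total


def fit(xs, ys, steps=200, lr=0.5):
    # Maximum-likelihood fit by plain gradient ascent.
    w, b, n = 0.0, 0.0, len(xs)
    for _ in range(steps):
        errs = [y - sigmoid(w * x + b) for x, y in zip(xs, ys)]
        w += lr * sum(e * x for e, x in zip(errs, xs)) / n
        b += lr * sum(errs) / n
    return w, b


def randomised_data(xs, ys, x_new, rounds, rng):
    # Strategy 1: one model type, refitted on bootstrap resamples of the data.
    n, preds = len(xs), []
    for _ in range(rounds):
        idx = [rng.randrange(n) for _ in range(n)]
        w, b = fit([xs[i] for i in idx], [ys[i] for i in idx])
        preds.append(sigmoid(w * x_new + b))
    return preds


def randomised_models(xs, ys, x_new, draws, rng, step=0.5, burn_in=200):
    # Strategy 2: fixed data, models drawn by random-walk Metropolis
    # under an assumed Gaussian prior (standard deviation 10) on w and b.
    def log_post(w, b):
        return log_likelihood(w, b, xs, ys) - (w * w + b * b) / 200.0

    w, b = 0.0, 0.0
    current = log_post(w, b)
    preds = []
    for i in range(burn_in + draws):
        w_new, b_new = w + rng.gauss(0.0, step), b + rng.gauss(0.0, step)
        proposed = log_post(w_new, b_new)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, proposed - current)):
            w, b, current = w_new, b_new, proposed
        if i >= burn_in:
            preds.append(sigmoid(w * x_new + b))
    return preds


def toy_data(rng, n_neg=40, n_pos=8):
    # Imbalanced synthetic data: many normal events, few abnormal ones.
    xs = [rng.gauss(0.0, 1.0) for _ in range(n_neg)]
    xs += [rng.gauss(2.0, 1.0) for _ in range(n_pos)]
    ys = [0] * n_neg + [1] * n_pos
    return xs, ys
```

Calling `toy_data(random.Random(0))` and passing the result to either function with a new point such as `x_new=1.0` returns a list of predicted probabilities. The spread of that list is a rough stand-in for the predictive density discussed above. Even in this toy form, strategy (2) evaluates the likelihood over the whole data set at every draw, which hints at why real-scale problems call for HPC resources.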
The advantages of the proposed methodology have been demonstrated on imbalanced problems such as drug design, survival prediction, air traffic control, biometric identification, and fraud detection. The advantages of the HPC solution for survival prediction and fraud detection are demonstrated online. The solution allows users to instantly and reliably evaluate the predictive density distributions required to estimate the FN and FP probabilities and so minimise losses caused by wrong decisions.
Besides the above application areas, the proposed technology will benefit businesses in fault diagnostics, predictive maintenance, biometric payments, and other domains characterised by imbalanced data where users need to minimise the losses caused by wrong decisions.

The University offers its solution to the listed industries under license agreements. If further work is needed to fine-tune the solution, the cooperation can be of the technical or research type (collaborative R&D funding bids).

Advantages & innovations

Cooperation plus value: 
The patented methodology allows users to make cost-efficient and reliable decisions. Its proven advantages are: (1) reliable estimation of uncertainty, represented by the probabilities of false positive and false negative outcomes; (2) the design of cost-efficient solutions to real-scale problems, based on the developed HPC engine.

Stage of development

Cooperation stage dev stage: 
Available for demonstration

Partner sought

Cooperation area: 
Type of partner sought: industry.
Specific area of partner sought: drug design, survival prediction, air traffic control, biometric identification, fraud detection, fault diagnostics, predictive maintenance, biometric payments, and other domains characterised by imbalanced data.
Role of partner sought: the University offers its solution to the listed business types under license agreements. If further work is needed to fine-tune the solution, the cooperation can be of the technical or research type (collaborative R&D funding bids).

Type and size

Cooperation task: 
SME <10, SME 11-50, SME 51-250, 251-500, >500, >500 MNE

Figure: capture_from_pdf.png (legend attached separately).