Deepfake AI voice cloning detection against impersonation fraud

Summary

Profile Type

Technology offer

POD Reference

TOES20240213003

Term of Validity

13 February 2024 - 12 February 2026

Company's Country

Spain

Type of partnership

Commercial agreement with technical assistance

Targeted Countries

All countries

General information

Short Summary

A Spanish company specialized in voice biometrics has developed a solution capable of detecting deepfake voice clones generated by the newest AI (Artificial Intelligence) technologies. It directly addresses the growing threat of identity impersonation using AI-cloned voices that a human listener cannot distinguish from the real person. The company is looking for commercial agreements with technical assistance.

Full Description

Bank heists and phishing attacks using voice impersonation are on the rise due to recent advances in sophisticated AI voice cloning technology. Fake news also poses an increasingly serious problem. Modern state-of-the-art machine learning makes artificial voices virtually indistinguishable, by listening alone, from the human voice being impersonated.

To address this threat, the Spanish company, which specializes in voice recognition technologies, has developed a state-of-the-art voice cloning detection engine. It detects, in real time, artificially generated voices used by cybercriminals and impostors, and thus helps prevent fraud. Tens of different voice synthesis technologies are detected, and the engine is effective even against advanced vocoders. New cloning technologies can be specifically modelled and added to the recognition engine as they appear, in order to provide maximum accuracy.

AI voice cloning technology can work in two operating modes: Text-To-Speech (TTS) and Voice Modulation (voice changer). In TTS, recordings of a human voice are used beforehand to train a synthesizer, which then takes text as input and generates a voice mimicking the cloned person. A Voice Modulator, on the other hand, converts a human input voice into the impersonated target voice on the fly, adjusting acoustic parameters based on the reference voice to be cloned, so there is no need to input text.

This technology can analyse both types of attack, detecting subtle audio signal features that are present in real human voices but not in synthetic voices, and vice versa.

The solution can address a wide range of threats such as the following examples:

• Fake news on social media, radio and TV: impersonation of celebrities and authorities.
• Bank heists by phone: impersonation of banks' major clients by cybercriminals.
• Phishing of citizens by phone: gathering confidential data and bank account details, or luring family members into making a bank transfer.
• Malicious AI: is the doctor I’m talking to remotely by phone or video call a human being or an AI entity?

Key features:

• SDK (Software Development Kit) that exposes its functionality through a powerful yet easy-to-use API (Application Programming Interface).
• Effective against advanced vocoders.
• Language independent: voice impersonation attacks are detected no matter the language.
• State-of-the-art AI recognition engine: based on advanced machine learning algorithms.
• Advanced audio feature extraction: the audio signal is processed and analysed before being fed to the AI recognition engine.
• Easy-to-integrate API for on-premises solutions.
• Highly optimized C++ recognition engine: can be integrated into embedded systems.
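
The company's actual feature set is secret know-how, so the following is only a generic sketch of what processing an audio signal before it reaches a recognition engine can look like. It splits PCM samples into short overlapping frames and computes two textbook features per frame: short-time energy and zero-crossing rate. The function names, the frame sizes and the choice of features are illustrative assumptions and do not describe the company's method or API.

```python
from typing import List, Sequence, Tuple


def frame_signal(samples: Sequence[int], frame_len: int, hop: int) -> List[Sequence[int]]:
    """Split samples into overlapping frames; a trailing partial frame is dropped."""
    if frame_len < 2 or hop < 1:
        raise ValueError("frame_len must be >= 2 and hop must be >= 1")
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def short_time_energy(frame: Sequence[int]) -> float:
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)


def zero_crossing_rate(frame: Sequence[int]) -> float:
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return crossings / (len(frame) - 1)


def extract_features(samples: Sequence[int],
                     frame_len: int = 400,
                     hop: int = 160) -> List[Tuple[float, float]]:
    """Return one (energy, zero-crossing rate) pair per frame.

    The defaults are an assumption: 25 ms frames with a 10 ms hop at 16 kHz.
    """
    return [(short_time_energy(f), zero_crossing_rate(f))
            for f in frame_signal(samples, frame_len, hop)]
```

In a real detector, per-frame vectors of this kind (usually far richer ones) would be passed to a trained classifier; that step is not shown here.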

The company is looking to develop its international business through partners in the cybersecurity sector, with whom it seeks commercial agreements with technical assistance.

Advantages and Innovations

This state-of-the-art voice cloning detection solution has been specially developed by the Spanish company to prevent the growing number of fraud attacks that use AI-generated impersonated voices. According to the company, no technology currently on the market provides the same level of accuracy against the same threats.

Technical Specification or Expertise Sought

Technical specifications:
• Audio required for synthetic speech detection: 5 s.
• Verification time: < 0.4 s.
• Supported audio formats: WAV (linear PCM, 16-bit, 8/16 kHz; recommended), MP3.
• Proprietary C++ API.
• Number of TTS and voice cloning technologies modelled: 21 (new ones will be added as they become available).
• EER (Equal Error Rate): < 1%, dependent on audio quality and channel noise.
• Minimum recommended CPU: Intel i5, 2.5 GHz or equivalent.
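
For readers unfamiliar with the EER figure quoted above: the equal error rate is the operating point at which the false acceptance rate (synthetic audio accepted as genuine) equals the false rejection rate (genuine audio rejected as synthetic). The sketch below shows one simple way to estimate it from detector scores. The score convention (a higher score means more likely genuine) and the function name are assumptions made for illustration; this is not the company's evaluation procedure.

```python
from typing import Sequence, Tuple


def equal_error_rate(genuine: Sequence[float],
                     spoof: Sequence[float]) -> Tuple[float, float]:
    """Estimate the EER from two score lists (higher score = more likely genuine).

    Returns (eer, threshold). A sample is accepted when score >= threshold.
    The EER is approximated at the observed score where |FAR - FRR| is smallest.
    """
    if not genuine or not spoof:
        raise ValueError("both score lists must be non-empty")
    best = None
    for t in sorted(set(genuine) | set(spoof)):
        far = sum(s >= t for s in spoof) / len(spoof)      # synthetic audio accepted
        frr = sum(g < t for g in genuine) / len(genuine)   # genuine audio rejected
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]
```

An EER below 1% would mean that, at this balanced threshold, fewer than one in a hundred genuine voices is rejected and fewer than one in a hundred synthetic voices is accepted.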

Supported platforms:
• Windows® 7, 8, 10, 11.
• Linux, several distributions.
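
An integrator may want to verify that a recording meets the audio requirements listed under Technical specifications (16-bit linear PCM WAV, 8 or 16 kHz, at least 5 s) before submitting it for analysis. The following is a minimal sketch using only Python's standard `wave` module. The function name `check_wav` is hypothetical and not part of the company's SDK, and treating the recommended WAV settings as hard requirements is an assumption. MP3 input, which the engine also accepts, is not covered.

```python
import wave

MIN_SECONDS = 5.0              # audio required for synthetic speech detection
ALLOWED_RATES = (8000, 16000)  # recommended sample rates in Hz
SAMPLE_WIDTH_BYTES = 2         # 16-bit linear PCM


def check_wav(source):
    """Return a list of problems found; an empty list means the file fits.

    `source` is a file path or a binary file-like object.
    """
    problems = []
    try:
        with wave.open(source, "rb") as wav:
            width = wav.getsampwidth()
            rate = wav.getframerate()
            frames = wav.getnframes()
    except wave.Error as exc:
        # Raised for files the wave module cannot parse as WAV.
        return [f"not a readable WAV file: {exc}"]

    if width != SAMPLE_WIDTH_BYTES:
        problems.append(f"sample width is {8 * width} bits, expected 16")
    if rate not in ALLOWED_RATES:
        problems.append(f"sample rate is {rate} Hz, expected 8000 or 16000")
    duration = frames / rate if rate else 0.0
    if duration < MIN_SECONDS:
        problems.append(f"duration is {duration:.2f} s, expected at least 5 s")
    return problems
```

Calling `check_wav("call.wav")` on a hypothetical recording returns an empty list when all three conditions hold, and otherwise a list of messages that can be logged or shown to the user.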

Stage of Development

Already on the market

Sustainable Development Goals

Not relevant

IPR status

Secret know-how

Partner Sought

Expected Role of a Partner

The company is looking for international ICT partners specialized in cybersecurity that have a wide network of clients in sectors needing enhanced security, such as banking, media, etc.

The partner's role is to commercialise this technology within a defined region. The kind of partnership sought is a commercial agreement with technical assistance.

Type and Size of Partner

Big company
SME 50 - 249
Other
SME <=10
SME 11-49

Type of partnership

Commercial agreement with technical assistance

Dissemination

Technology keywords

01003003 - Artificial Intelligence (AI)

Market keywords

01002004 - Other telephone related
09002004 - Security and commodity brokers and services

Sector Groups Involved

Digital

Targeted countries

All countries