This tool gives you access to all European technology offers and requests, and to partner searches for European R&D proposals, published on the Enterprise Europe Network. You can filter the results to find those that best match your needs.


Computer recognition of speech from lip movements

Technology Offer
A UK SME has developed technology that deciphers speech by analysing video of a speaker's lip movements. It works with standard device cameras and can be deployed in the cloud, in a customer's data centre or on a device. The technology uses machine learning and artificial intelligence and can decipher speech where audio analysis cannot, such as in noisy environments. It can also distinguish between different people saying the same words. The product requires no additional hardware and works on any device with a standard camera (smartphone, tablet, laptop, desktop, etc.). Partners are sought for commercial agreements with technical assistance, research or technical cooperation, or licensing.
This technology is a system that can understand what is being said from silent video of someone speaking. It can also distinguish between different people who are saying the same words. The SME is a spin-out from a leading UK university. It was established in 2015 and is now commercialising over 10 years of research in speech and image processing, with a particular focus on fusing speech and lip movements for robust speech recognition in real-world environments.
The solution can be used as a supporting technology for audio speech recognition systems, whose word accuracy universally degrades in noisy (real-world) environments. Where a camera can be trained on the head of the speaker, the combined audio-visual speech recognition system improves word accuracy as background noise levels increase.
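As a rough illustration of how an audio recogniser and a visual recogniser might be combined, the sketch below uses simple late fusion: each recogniser scores the candidate words, and the audio score is weighted less as the estimated signal-to-noise ratio drops. This is not the company's method. The function names, the logistic weighting, and the `midpoint_db` and `slope` parameters are all assumptions made for the example.

```python
import math

def audio_weight(snr_db, midpoint_db=10.0, slope=0.3):
    """Map an estimated SNR in dB to an audio weight in (0, 1).

    The logistic shape and both parameters are illustrative assumptions.
    """
    return 1.0 / (1.0 + math.exp(-slope * (snr_db - midpoint_db)))

def fuse(audio_scores, visual_scores, snr_db):
    """Return the word with the highest weighted sum of log-scores."""
    w = audio_weight(snr_db)
    fused = {
        word: w * audio_scores[word] + (1.0 - w) * visual_scores[word]
        for word in audio_scores
    }
    return max(fused, key=fused.get)

# Hypothetical log-probabilities from the two recognisers for one utterance.
audio = {"left": math.log(0.7), "right": math.log(0.3)}
visual = {"left": math.log(0.2), "right": math.log(0.8)}

quiet_choice = fuse(audio, visual, snr_db=30.0)   # audio dominates
noisy_choice = fuse(audio, visual, snr_db=-10.0)  # lip reading dominates
```

With these made-up scores, the fused decision follows the audio recogniser in quiet conditions and the visual recogniser in heavy noise, which is the behaviour the paragraph above describes.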
The technology enables a number of use cases including:
Improving audio speech recognition systems, for example:
- In-vehicle voice activation
- Personal assistants (Siri, Cortana, GoogleNow, Echo)
- Smart home voice control
Autonomous visual-only speech recognition, for example:
- Keyword/phrase recognition in video segments (e.g. surveillance)
- Improved subtitling for live broadcast
- Aid to the hard of hearing
Additionally, the system can be used as a 'liveness' check to validate that a real person is present during any online interaction. Here it can be deployed alongside a facial recognition system to eliminate the common problem of 'spoofing', in which a static image of the user is used to fool the system. The user is prompted to speak or mime a random sequence of digits into the camera. The combined solution validates that the user is who they purport to be and that they are actually present.
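The challenge-response flow described above could be sketched as follows. This is a minimal illustration, not the vendor's implementation: `new_challenge` and `liveness_passed` are hypothetical names, and the lip-reading output (`recognised_digits`) and the face-matching result (`face_match`) are assumed to come from separate components that are not shown.

```python
import secrets

def new_challenge(length=6):
    """Generate an unpredictable digit sequence for the user to speak or mime."""
    return "".join(secrets.choice("0123456789") for _ in range(length))

def liveness_passed(challenge, recognised_digits, face_match):
    """Accept only if the face matches AND the lip-read digits equal the challenge.

    recognised_digits: string produced by a (hypothetical) visual speech recogniser.
    face_match: boolean produced by a (hypothetical) facial recognition system.
    """
    return face_match and recognised_digits == challenge
```

Because the digits are chosen at random for each session, a static photograph or a previously recorded video of the user would not show the correct lip movements, so the check would fail even if the face itself matches.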
The product requires no additional hardware and will work on any device with a standard forward-facing camera (e.g. smartphone, tablet, laptop, desktop, in-vehicle dashboard, etc.). It will initially be used with standard colour cameras, but support for infra-red or time-of-flight sensors is a key deliverable on the near-term product roadmap.
Partners are sought for commercial agreements with technical assistance, research or technical cooperation, or licensing. Technical assistance would consist of help integrating the technology into other systems, advice on its scope and limits, and problem-solving support. Licensing is envisaged as a straightforward licence to use the technology in an agreed application and geographical area. Several partnering options are mentioned because the form of cooperation is very much open to discussion and negotiation.
Advantages and Innovations:
There are no commercial visual speech recognition platforms available today. The advantages of this system include:
- it will work with common device cameras
- it is very robust and can potentially work in 'real world' conditions (background noise, head movement, varying illumination)
- multiple deployment options offered
It is a first-to-market visual speech recogniser and thus highly innovative. It employs unique, state-of-the-art techniques to:
- efficiently capture video of the speaker's lip movements with very low data bandwidth requirements
- extract an optimal feature set from the captured video for input into the system's artificial intelligence engine
- accurately determine the words or phrase spoken, using the artificial intelligence engine (a sequence of trained deep neural networks)
Stage of Development:
Available for demonstration
Patent(s) applied for but not yet granted

Partner sought

Type and Role of Partner Sought:
The company would like to develop go-to-market partnerships with companies in the following markets:

- Online authentication/identity verification - to develop commercial go-to-market partnerships with companies that provide online authentication services, especially those that use facial recognition as an authentication modality.

- Audio speech recognition - the company would like to develop technical cooperation agreements (and/or research cooperation agreements) with companies who develop audio speech recognition technologies, with a view to developing commercial go-to-market partnerships.

- Automakers and suppliers of technology to the automotive industry - the company would like to develop commercial go-to-market partnership agreements with companies that supply technology (especially infotainment-type technology) to automakers.


Type and Size of Client:
Industry SME <= 10
Already Engaged in Trans-National Cooperation:
Languages Spoken:


Technology Keywords:
01003022 Smart Applications
01003003 Artificial Intelligence (AI)
01003009 Data Protection, Storage, Cryptography, Security
01003012 Imaging, Image Processing, Pattern Recognition
01003017 Speech Processing/Technology