In many industries, the main obstacle to creating value with AI is the lack of a suitable data basis. Media companies, for example, produce large amounts of content in complex data formats (especially video) that is unstructured and insufficiently documented with metadata. As a result, they often have no overview of their content and no transparency within their archives, and much of its potential remains unused. The current best practice is to have human staff (up to 50 archivists) annotate videos manually, which is slow, expensive, and not scalable. Given the overwhelming amount of content produced, they can annotate only a small fraction of their videos this way.
For this reason, the AI startup, based in the south of Germany, has developed a novel AI-based analysis platform for video assets. It uses state-of-the-art video captioning techniques to create meaningful, semantically deep annotations that describe in full sentences what is actually happening in the videos. The platform follows a human-in-the-loop approach: the AI annotates the videos, and a human refines only those annotations where the AI's output was not good enough. Based on this human input, the AI continuously learns and improves, and becomes specialized to the customer's individual needs.
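The human-in-the-loop workflow described above can be illustrated with a minimal sketch. It is not the startup's implementation: the names `Annotation`, `split_for_review`, and `apply_corrections` are hypothetical, and the use of a per-caption confidence score with a fixed threshold to decide which captions go to a human is an assumption made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    video_id: str
    caption: str
    confidence: float  # assumed: model's own estimate in [0, 1]


def split_for_review(annotations, threshold=0.8):
    """Separate captions accepted as-is from those sent to a human reviewer.

    The threshold value is an illustrative assumption.
    """
    accepted = [a for a in annotations if a.confidence >= threshold]
    needs_review = [a for a in annotations if a.confidence < threshold]
    return accepted, needs_review


def apply_corrections(needs_review, corrections):
    """Merge human corrections into the reviewed captions.

    `corrections` maps video_id -> corrected caption. Returns the final
    annotations plus (video_id, caption) pairs that could later be used
    as additional training data for the model.
    """
    final, training_pairs = [], []
    for a in needs_review:
        fixed = corrections.get(a.video_id, a.caption)
        # A human-reviewed caption is treated as fully trusted.
        final.append(Annotation(a.video_id, fixed, 1.0))
        if fixed != a.caption:
            training_pairs.append((a.video_id, fixed))
    return final, training_pairs
```

In such a setup, only the low-confidence captions reach the reviewer, and each corrected caption becomes a candidate training example, which is one plausible way the learning loop described above could be closed.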
Compared to other solutions, the Unique Selling Point (USP) is that the platform does not merely extract simple objects, faces, etc., but understands the multi-modal context of the video and summarizes it across multiple scenes. By taking multiple views (such as image and speech) into account, the AI can describe the actual gist of a video, and in doing so it also surpasses all current state-of-the-art video captioning models. Furthermore, instead of trying to replace the human with an out-of-the-box AI, the platform integrates the human into an innovative, AI-supported workflow that makes annotation tasks more natural and lets the AI continuously learn and adapt toward human-level intelligence.
The startup is interested in research and technical cooperation agreements, as well as in commercial agreements with technical assistance, with industry partners who want to apply, further develop, and leverage this technology for their industry-specific use cases. In particular, it is important that the cooperation partner provides a relevant video dataset, ideally one that is already annotated with relevant labels.