
Similarity in multimedia artifacts: extending monomodal methods to multimodality

Project Idea Metadata

Project Idea Description

Current approaches to similarity computation for multimedia documents are limited by the way they exploit the heterogeneous information contained in multimedia objects.

In many cases, a single modality is considered, or a loose coupling of monomodal solutions is adopted. The basic idea of this exploratory work is to design and propose a new integrated approach that can tackle multimodal data and provide an integrated similarity measure that goes beyond previous state-of-the-art solutions.
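The loose coupling mentioned above can be sketched as a weighted late fusion: each modality yields its own similarity score, and the scores are combined afterwards with fixed weights. The modalities, feature values, and weights below are purely illustrative assumptions, not part of the proposed approach.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def late_fusion_similarity(feats_a, feats_b, weights):
    """Loosely coupled baseline: one similarity per modality,
    combined a posteriori with fixed weights."""
    return sum(w * cosine(feats_a[m], feats_b[m])
               for m, w in weights.items())

# Toy example with two modalities (feature values are illustrative only).
doc_a = {"audio": np.array([0.9, 0.1, 0.3]), "text": np.array([0.2, 0.8])}
doc_b = {"audio": np.array([0.8, 0.2, 0.4]), "text": np.array([0.1, 0.9])}
weights = {"audio": 0.6, "text": 0.4}

sim = late_fusion_similarity(doc_a, doc_b, weights)
```

A document compared with itself yields a similarity of 1.0 when the weights sum to one; the integrated approach targeted by this project would replace this a posteriori combination with a single model over all modalities.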

Music recommendation is one example: the combination of features extracted from the objects constitutes a richer space for similarity computation than each monomodal aspect taken individually.
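One minimal sketch of such a combined space, assuming cosine similarity and two hypothetical modalities (audio descriptors and text descriptors), is to L2-normalise each modality's features and concatenate them into a single joint vector before comparing:

```python
import numpy as np

def joint_similarity(feats_a, feats_b, modalities=("audio", "text")):
    """Sketch of an integrated similarity: normalise each modality's
    features, concatenate them into one joint vector, and compare in
    the combined space rather than averaging per-modality scores."""
    def joint_vector(feats):
        parts = [feats[m] / np.linalg.norm(feats[m]) for m in modalities]
        return np.concatenate(parts)

    va, vb = joint_vector(feats_a), joint_vector(feats_b)
    return float(np.dot(va, vb) / (np.linalg.norm(va) * np.linalg.norm(vb)))

# Illustrative track descriptors (values are assumptions for the sketch).
track_a = {"audio": np.array([0.7, 0.3, 0.1]), "text": np.array([0.5, 0.5])}
track_b = {"audio": np.array([0.6, 0.4, 0.2]), "text": np.array([0.4, 0.6])}

score = joint_similarity(track_a, track_b)
```

This is only a baseline for experiments: a genuinely integrated model would learn how the modalities interact instead of concatenating fixed features.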

The final goal is to develop a demonstrator that can be used as the base for experiments and comparisons in this domain.


We suggest using the FMA (Free Music Archive) dataset for machine learning in the music and NLP domains: https://freemusicarchive.org and https://github.com/mdeff/fma


