Audio Retrieval With Natural Language Queries: A Benchmark Study

Uploaded: 2024-01-19T17:23:36+0530

admin


Abstract:

The objectives of this work are cross-modal text-audio and audio-text retrieval, in which the goal is to retrieve the audio content from a pool of candidates that best matches a given written description and vice versa. Text-audio retrieval enables users to search large databases through an intuitive interface: they simply issue free-form natural language descriptions of the sound they would like to hear. To study the tasks of text-audio and audio-text retrieval, which have received limited attention in the existing literature, we introduce three challenging new benchmarks. We first construct text-audio and audio-text retrieval benchmarks from the AudioCaps and Clotho audio captioning datasets. Additionally, we introduce the SoundDescs benchmark, which consists of paired audio and natural language descriptions for a diverse collection of sounds that are complementary to those found in AudioCaps and Clotho. We employ these three benchmarks to establish baselines for cross-modal text-audio and audio-text retrieval, where we demonstrate the benefits of pre-training on diverse audio tasks. We hope that our benchmarks will inspire further research into audio retrieval with free-form text queries.
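To make the retrieval task concrete, the following is a minimal sketch of ranking a pool of audio candidates against a text query. It assumes, purely for illustration, that text and audio have already been mapped into a shared embedding space and that matches are scored by cosine similarity; the abstract does not specify the model, the scoring function, or the evaluation metric, so the recall-at-k measure, the function names, and the toy vectors below are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_candidates(query_vec, candidate_vecs):
    """Return candidate indices ordered from best to worst match for the query."""
    scores = [cosine(query_vec, c) for c in candidate_vecs]
    return sorted(range(len(candidate_vecs)), key=lambda i: scores[i], reverse=True)


def recall_at_k(query_vecs, candidate_vecs, targets, k):
    """Fraction of queries whose correct candidate appears in the top k results."""
    hits = sum(
        1
        for q, t in zip(query_vecs, targets)
        if t in rank_candidates(q, candidate_vecs)[:k]
    )
    return hits / len(query_vecs)


# Toy, made-up embeddings (assumption): three audio clips and three captions,
# where caption i is meant to describe audio clip i.
audio_embeddings = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
text_embeddings = [[0.9, 0.1], [0.1, 0.9], [0.2, 1.0]]
targets = [0, 1, 2]

# Text-audio retrieval: rank all audio clips for the first caption.
ranking = rank_candidates(text_embeddings[0], audio_embeddings)

# Score the whole toy benchmark at two cut-offs.
r_at_1 = recall_at_k(text_embeddings, audio_embeddings, targets, k=1)
r_at_2 = recall_at_k(text_embeddings, audio_embeddings, targets, k=2)
```

Audio-text retrieval, the "vice versa" direction, uses the same functions with the roles swapped: an audio embedding is the query and the captions form the candidate pool.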
