

Download 736 740 Zip May 2026

Reference the original paper: Drossos, K., Lipping, S., & Virtanen, T. (2020). "Clotho: An Audio Captioning Dataset." Proc. IEEE ICASSP, pp. 736-740.

The full development set is approximately 6.5 GB.

The data is categorized into development, validation, and evaluation sets for training and testing machine learning models.

📥 How to Download
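As a rough illustration of handling a multi-gigabyte download, the sketch below hashes an archive in chunks and unpacks it into a folder named after its split. It assumes the archive is a ZIP file, as this page describes; the actual distribution may use a different archive format. The function names, file names, and directory layout are hypothetical and do not come from the dataset's documentation.

```python
import hashlib
import zipfile
from pathlib import Path

# Split names taken from the description above.
SPLITS = {"development", "validation", "evaluation"}


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so a large archive is never loaded whole into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def extract_split(archive: Path, dest_root: Path, split: str) -> list[str]:
    """Extract one archive into dest_root/<split> and return the member names.

    Hypothetical helper: assumes a ZIP archive and one archive per split.
    """
    if split not in SPLITS:
        raise ValueError(f"unknown split: {split}")
    target = dest_root / split
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        # testzip() returns the name of the first corrupt member, or None.
        bad = zf.testzip()
        if bad is not None:
            raise zipfile.BadZipFile(f"corrupt member: {bad}")
        zf.extractall(target)
        return zf.namelist()
```

A call might look like `extract_split(Path("development.zip"), Path("data"), "development")`, where `development.zip` is a placeholder file name. If the download source publishes a checksum, comparing it with the output of `sha256_of` before extracting helps catch a truncated download.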

If you are writing a technical report or paper using this data, ensure you include a reference to the original paper (Drossos, Lipping, & Virtanen, 2020).

Clotho is an audio dataset used for intermodal translation (audio-to-text) tasks. It is widely used in the DCASE (Detection and Classification of Acoustic Scenes and Events) challenges.

📂 Key Data Components
