Smarter data pipelines for audio

Process hundreds of thousands of audio files just as easily as any other kind of dataset.

Built on Apache Beam. Written for native Python speakers. Designed for the cloud. Klio is an open framework for building data pipelines optimized for media. It makes processing audio, video, and images as simple and seamless as processing text.

With Klio, engineers can speak the same language as ML and audio researchers to build data pipelines, together. Pipelines that are both efficient and reproducible, and that automatically come with managed resources and the other benefits of modern cloud infrastructure. All at enormous scale.

For the first time, media companies and academics alike can process hundreds of thousands of media files at once, simply and efficiently. How do we know? Because that’s how we process audio every day here at Spotify.