Audioflare is an all-in-one AI audio playground that utilizes Cloudflare AI Workers to transcribe, analyze, summarize, and translate any audio file!
This is an open-source project that showcases a practical use case: orchestrating a series of AI workers to process audio files up to 30 seconds in length.
Audioflare’s core functionality includes:
• Transcription: Using Cloudflare’s Speech to Text worker, which is built on OpenAI’s Whisper model.
• Summarization: Using Cloudflare’s LLM AI worker, based on Meta’s llama-2-7b-chat-int8 model.
• Sentiment Analysis: Using Cloudflare’s Text Classification AI worker, leveraging Hugging Face’s distilbert-sst-2-int8 model.
• Translation: Using Cloudflare’s Translation AI worker, which utilizes Meta’s m2m100-1.2b model.
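The four steps above can be sketched as a chain of calls to Cloudflare’s Workers AI REST API. This is a minimal, hypothetical illustration, not Audioflare’s actual implementation (which runs inside a Next.js app); the account ID, token, prompt shape, and target language here are placeholders, and the model identifiers are taken from Cloudflare’s public model catalog, so verify them against the current docs before use.

```typescript
// Hypothetical sketch of a multi-step Workers AI pipeline:
// transcribe -> summarize -> classify sentiment -> translate.
// accountId/token are placeholders; model names follow Cloudflare's catalog.

const API_BASE = "https://api.cloudflare.com/client/v4/accounts";

// Pure helper: build the run URL for a given account and model.
function buildRunUrl(accountId: string, model: string): string {
  return `${API_BASE}/${accountId}/ai/run/${model}`;
}

// Generic POST wrapper around a single Workers AI model invocation.
async function runModel(
  accountId: string,
  token: string,
  model: string,
  body: BodyInit
): Promise<any> {
  const res = await fetch(buildRunUrl(accountId, model), {
    method: "POST",
    headers: { Authorization: `Bearer ${token}` },
    body,
  });
  if (!res.ok) throw new Error(`Workers AI request failed: ${res.status}`);
  return res.json();
}

// Orchestrate the four AI workers over one audio clip (<= 30s).
async function processAudio(accountId: string, token: string, audio: ArrayBuffer) {
  const transcript: string = (
    await runModel(accountId, token, "@cf/openai/whisper", audio)
  ).result.text;

  const summary = await runModel(
    accountId, token, "@cf/meta/llama-2-7b-chat-int8",
    JSON.stringify({ prompt: `Summarize this transcript: ${transcript}` })
  );

  const sentiment = await runModel(
    accountId, token, "@cf/huggingface/distilbert-sst-2-int8",
    JSON.stringify({ text: transcript })
  );

  const translation = await runModel(
    accountId, token, "@cf/meta/m2m100-1.2b",
    JSON.stringify({ text: transcript, source_lang: "en", target_lang: "fr" })
  );

  return { transcript, summary, sentiment, translation };
}
```

Each step consumes the transcript produced by the first call, which is the “multi-step AI actions” pattern the project standardizes.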
Current constraints:
• Transcription is limited to audio files of 30 seconds or less
• The LLM’s summarization performance needs improvement
Despite these constraints, it shows the potential of Cloudflare AI workers by standardizing the AI API request framework and simplifying multi-step AI actions.
Audioflare serves as a valuable template for learning and working with Cloudflare AI workers, and a great step toward understanding the Cloudflare AI ecosystem.
Kudos to @SeanOliver for publishing it. Check it out here: https://github.com/seanoliver/audioflare
I talk about the latest in frontend development, along with my experience building various (indie) side projects.