speech-to-text

Speech-to-text, text-to-speech, and speaker recognition using next-gen Kaldi with onnxruntime, without an Internet connection. Supports embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift, Dart, and JavaScript.

android windows macos linux raspberry-pi ios text-to-speech csharp cpp dotnet speech-to-text aarch64 mfc risc-v asr arm32 onnx vits openkylin

Updated Jun 11, 2024
C++

occ-ai / obs-localvocal

OBS plugin for local speech recognition and captioning using AI

plugin translation ai livestream live-streaming speech-recognition speech-to-text obs transcription obs-studio whisper realtime-translator obs-studio-plugin realtime-transcribe openai-whisper whisper-cpp real-time-transcription

Updated Jun 11, 2024
C++

KevKibe / African-Whisper

🚀 Framework for seamless fine-tuning of Whisper model on a multi-lingual dataset and deployment to prod.

speech speech-recognition speech-to-text whisper asr speech-translation speech-transcription

Updated Jun 11, 2024
Python

modelscope / FunClip

Open-source, accurate, and easy-to-use video speech recognition and clipping tool with integrated LLM-based AI clipping.

speech-recognition speech-to-text gradio video-clip subtitles-generator video-subtitles llm gradio-python-llm

Updated Jun 11, 2024
Python

TheSoftDiamond / Kazushin

Customizable TTS Chat Bot using OpenAI & Google Cloud TTS/ElevenLabs

python text-to-speech twitch ai chatbot tts speech-recognition openai speech-to-text gpt googlecloud gemini-api twitchio elevenlabs

Updated Jun 11, 2024
Python

ictnlp / StreamSpeech

StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.

Updated Jun 11, 2024
Python

OpenVoiceOS / status

Open Voice OS Status Page

status text-to-speech translator monitoring alerting cuda sam nvidia tts uptime stats speech-to-text stt piper ovos upptime openvoiceos fasterwhisper mimic3

Updated Jun 11, 2024
Markdown

ErcinDedeoglu / WhisperDock

Dockerized Whisper C++ speech-to-text API for easy deployment and rapid integration. Offers the latest stable and nightly builds for efficient audio transcription.

api docker machine-learning speech-to-text audio-transcription whisper-cpp

Updated Jun 11, 2024
C++

Picovoice / web-voice-processor

A library for real-time voice processing in web browsers

javascript real-time browser worker realtime voice-commands microphone speech-recognition webaudio-api pcm web-browser speech-to-text audio-processing wake-word-detection downsampling voice-processing

Updated Jun 10, 2024
TypeScript

Macoron / whisper.unity

Runs a speech-to-text model (whisper.cpp) in Unity3d on your local machine.

unity3d speech-recognition openai speech-to-text stt whisper asr

Updated Jun 10, 2024
Metal

AssemblyAI / assemblyai-java-sdk

The AssemblyAI Java SDK provides an easy-to-use interface for interacting with the AssemblyAI API, which supports async and real-time transcription, audio intelligence models, as well as the latest LeMUR models.

java ai speech-to-text transcription stt asr assemblyai llm