Delphi LLM Pack is an AI component pack that supports audio transcription with Whisper.cpp and language translation with NLLB-200.
- New: Support Delphi 13 and add voice cloning based on ZipVoice.
- The Llama component now supports multimodal LLMs and accepts image and audio input in addition to text.
- Add voice activity detector (VAD) component based on Silero (can be used to remove silence from voice audio)
- Add Kokoro TTS support, which can output speech that mixes multiple languages
- Add OCR component based on PaddleOCR (Win64 support only)
- Add Llama component based on llama.cpp (Win64 support only)
- Add realtime TTS component based on Piper (Win64 support only)
- Add realtime translator component based on NLLB-200 (Win64 support only)
- Add a realtime BBC News transcription example (Note: to compile the new example you need our FFmpeg FireMonkey pack installed; FFmpeg is used to play back the audio stream and convert it to WAV)
- Upgrade Whisper runtime to 1.62
- Support Delphi 12 on Win64, Mac OSX64 (Intel and M1/M2), and Android 64-bit.
- Support both file and realtime audio buffer translation.

| Feature | Library | Win64 | OSX | Android |
| --- | --- | --- | --- | --- |
| Transcribe | Whisper.cpp | Yes | Yes | Yes |
| Translate | NLLB-200 | Yes | No | No |
| TTS | Piper/kokoro | Yes | No | No |
| LLM | llama.cpp | Yes | No | No |
| OCR | PaddleOCR | Yes | No | No |
| VAD | silero | Yes | No | No |
| Voice Clone | zipvoice | Yes | No | No |

Download Realtime Transcribe Demo: realtime BBC News transcription with the Whisper tiny.en model (90M)
Download Realtime Translate Demo: realtime translation with the NLLB-200 600m model (18M)