feat: Add realtime audio transcription #2

richiejp · 2025-05-16T07:09:53Z

Attempt to add real-time audio transcription. Initially it uses a very naive method: audio is sent for transcription in chunks, with each chunk joined to the previous one.

It sort of works, but words get repeated and there are lots of mistakes. Push-to-talk doesn't work because, when cancelling it, key presses from the transcription get in the way of the key presses meant to stop it.
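For illustration, the naive chunking described above can be sketched roughly as follows. This is not the code from this PR: the function name `overlapping_windows` and the byte-string chunks are hypothetical stand-ins, and the actual transcription call is left out entirely.

```python
from typing import Iterable, Iterator


def overlapping_windows(chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Yield each audio chunk joined onto the chunk that came before it.

    Hypothetical helper: each yielded window is what would be sent
    to the transcription backend in the naive approach.
    """
    previous = b""
    for chunk in chunks:
        # The previous chunk is prepended to give the model some context.
        yield previous + chunk
        previous = chunk


# Three stand-in "audio" chunks; real input would be raw PCM data.
windows = list(overlapping_windows([b"AA", b"BB", b"CC"]))
print(windows)  # [b'AA', b'AABB', b'BBCC']
```

Every chunk except the last appears in two consecutive windows, so any word that falls inside it is transcribed twice. That is one plausible source of the repeated words unless the overlapping text is de-duplicated afterwards.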

richiejp · 2025-05-25T13:47:51Z

Now just waiting on mudler/LocalAI#5392

rehno-lindeque · 2025-05-29T19:10:45Z

I guess this is still quite new and work-in-progress, but all the vendor cruft makes the code base a lot less appealing to me as a small command-line tool.

Is it just temporary for now?

richiejp · 2025-06-05T20:04:01Z

It's possible that I can remove it again. At present it simplifies using a forked version of the realtime API library, and it also simplifies creating a Nix package because I can remove the vendorHash.

So once the changes to the realtime API are resolved, I think that would be a good time to revisit it.

richiejp mentioned this pull request May 16, 2025

Realtime transcription API and VAD mudler/LocalAI#5377

Closed

richiejp mentioned this pull request May 24, 2025

feat: Realtime API support reboot mudler/LocalAI#5392

Merged

richiejp force-pushed the realtime branch from c6db078 to 1448bcf Compare May 25, 2025 13:46

richiejp mentioned this pull request May 25, 2025

Command-line argument for a longer timeout #3

Closed

feat: Add realtime audio transcription

b4281ba

richiejp force-pushed the realtime branch from 1448bcf to b4281ba Compare May 26, 2025 10:12

richiejp merged commit 68ac1a3 into main May 26, 2025

richiejp mentioned this pull request Jun 7, 2025

Sometimes voice input contains [BLANK_AUDIO] #1

Closed
