@richiejp
Owner

Attempt to add real-time audio transcription. Initially it uses a very naive method: each audio chunk is sent to be transcribed with the previous chunk prepended to it.

It sort of works, but words get repeated and there are lots of mistakes. Push to talk doesn't work: when cancelling it, the key presses typed out by the transcription interfere with the key presses meant to stop it.
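
For illustration only, here is a minimal sketch of the chunk-joining idea described above. It is not the code from this PR; `overlapWindows` is a hypothetical helper, and words stand in for audio samples so the overlap is visible. It assumes each request carries the previous chunk followed by the current one, which is why repeated words show up when the results are simply concatenated.

```go
package main

import (
	"fmt"
	"strings"
)

// overlapWindows (hypothetical name) pairs every chunk with the chunk
// before it, so each window is previous+current. The first window has
// no predecessor and holds only the first chunk.
func overlapWindows[T any](chunks [][]T) [][]T {
	windows := make([][]T, 0, len(chunks))
	var prev []T
	for _, cur := range chunks {
		w := make([]T, 0, len(prev)+len(cur))
		w = append(w, prev...)
		w = append(w, cur...)
		windows = append(windows, w)
		prev = cur
	}
	return windows
}

func main() {
	// Words stand in for audio samples; a real client would hold PCM data.
	chunks := [][]string{{"the", "quick"}, {"brown", "fox"}, {"jumps", "over"}}

	var transcript []string
	for _, w := range overlapWindows(chunks) {
		// Assumption: this is where a window would be sent for transcription.
		transcript = append(transcript, strings.Join(w, " "))
	}

	// Every chunk after the first is transcribed twice, hence the repeats.
	fmt.Println(strings.Join(transcript, " | "))
	// prints "the quick | the quick brown fox | brown fox jumps over"
}
```

Joining chunks gives the transcriber context across chunk boundaries, but without removing the overlapping text afterwards each chunk's words appear in two consecutive results.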

@richiejp
Owner Author

Now just waiting on mudler/LocalAI#5392

@richiejp richiejp merged commit 68ac1a3 into main May 26, 2025
@rehno-lindeque

I guess this is still quite new and work-in-progress, but all the vendor cruft makes the code base a lot less appealing to me as a small command-line tool.

Is it just temporary for now?

@richiejp
Owner Author

richiejp commented Jun 5, 2025

It's possible that I can remove it again. Presently it simplifies using a forked version of the realtime API library, but it also simplifies creating a Nix package because I can remove the vendorHash.

So once the changes to the realtime API are resolved, I think that would be a good time to revisit it.
