ai

Ovi Stoica 2024-12-30T07:38:59.561839Z

Looking for feedback on a library: Hello! I’m working on https://github.com/shipclojure/voice-fn, a library for creating real-time AI pipelines in a declarative way. It is at a very early stage. It is heavily inspired by https://github.com/pipecat-ai/pipecat but less object-oriented. A pipeline is a list of processors, each of which defines what types of frames it accepts and what types of frames it puts back on the pipeline. There is a working example in the example directory with live transcription of a Twilio phone call. It uses core.async and pub/sub so that processors can subscribe to their accepted-frames on the pipeline. Let me know if you have any questions. I plan to bring it to feature parity with Pipecat, or as close as possible.
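
To make the pub/sub idea concrete, here is a minimal sketch of a processor subscribing to its accepted frame types on a shared pipeline channel. It is not voice-fn's actual API: the `:frame/type` key, the `frame` and `start-processor!` helpers, the `:accepted-frames` and `:handler` keys, and the frame type names are all assumptions made up for illustration. Only the `clojure.core.async` calls (`chan`, `pub`, `sub`, `go-loop`, `alts!!`, `timeout`) are real.

```clojure
(ns pipeline-sketch
  (:require [clojure.core.async :as a]))

;; Hypothetical frame shape: a map tagged with :frame/type.
(defn frame [type data]
  {:frame/type type :frame/data data})

(defn start-processor!
  "Subscribes a processor to every frame type it accepts and puts whatever
  its handler returns back on the main pipeline channel.
  (Hypothetical helper, not part of voice-fn.)"
  [pipeline-ch publication {:keys [accepted-frames handler]}]
  (let [in (a/chan 16)]
    (doseq [t accepted-frames]
      (a/sub publication t in))
    (a/go-loop []
      (when-some [f (a/<! in)]
        ;; A handler may return nil when it has nothing to emit.
        (when-some [out (handler f)]
          (a/>! pipeline-ch out))
        (recur)))))

(defn run-demo
  "Pushes one raw audio frame through a single fake transcription processor
  and returns the frame it emits, or nil after a one-second timeout."
  []
  (let [pipeline-ch (a/chan 16)
        ;; Frames are routed to subscribers by their :frame/type.
        publication (a/pub pipeline-ch :frame/type)
        result      (a/chan 1)]
    (a/sub publication :text/transcript result)
    (start-processor! pipeline-ch publication
                      {:accepted-frames #{:audio/raw}
                       :handler (fn [f]
                                  (frame :text/transcript
                                         (str "transcribed " (:frame/data f))))})
    (a/>!! pipeline-ch (frame :audio/raw "chunk-1"))
    (let [[v _] (a/alts!! [result (a/timeout 1000)])]
      (a/close! pipeline-ch)
      v)))
```

Calling `(run-demo)` sends an `:audio/raw` frame in and reads a `:text/transcript` frame out. The processor never sees frame types it did not subscribe to, and ordering between two processors has to be expressed through the frame types they accept and emit.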

🆒 2
👀 2
Ovi Stoica 2024-12-30T07:47:41.384679Z

Am I going in a direction where I’ll shoot myself in the foot with this architecture? It’s pretty much an event-based architecture, and it differs from Pipecat in this regard. Pipecat has a bidirectional queue, and all frames have a direction (upstream, downstream). AFAICT, if a processor doesn’t need a frame, it sends it to its next neighbor in the frame’s direction. That approach might be better if you want direct synchronization between processors (for example, guaranteeing that processor1 runs on frameA before processor2 runs on frameA). With my approach, to obtain this guarantee, I need processor1 to accept frameA types and generate frameB types, and I need processor2 to accept only frameB, which might get out of hand. An example of this need for synchronization is when you need to do context aggregation (assemble streaming LLM token chunks) before the subsequent chat completion request (small and minor exa