# Advanced Topics
This page covers more advanced usage of Oxyde, including voice integration, streaming output, persistent storage, and standalone deployment strategies.
## 🎙️ Voice Integration (Coming Soon)
Oxyde is built to support real-time speech-to-text (STT) and text-to-speech (TTS) pipelines.

Plug in services like:

- 🗣️ OpenAI Whisper
- 🧠 Deepgram, AssemblyAI
- 🗣️ ElevenLabs (TTS)

Route audio input → transcript → LLM → audio output.
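The routing above can be sketched as a pair of traits with a small pipeline struct. This is an illustrative sketch, not Oxyde's actual API — all trait, type, and method names here are assumptions:

```rust
// Sketch of the audio -> transcript -> LLM -> audio routing described
// above. Trait and type names are illustrative, not Oxyde's real API.

trait SpeechToText {
    fn transcribe(&self, audio: &[u8]) -> String;
}

trait TextToSpeech {
    fn synthesize(&self, text: &str) -> Vec<u8>;
}

struct VoicePipeline<S: SpeechToText, T: TextToSpeech> {
    stt: S,
    tts: T,
}

impl<S: SpeechToText, T: TextToSpeech> VoicePipeline<S, T> {
    // Route audio through STT, an LLM callback, and TTS.
    fn handle(&self, audio: &[u8], llm: impl Fn(&str) -> String) -> Vec<u8> {
        let transcript = self.stt.transcribe(audio);
        let reply = llm(&transcript);
        self.tts.synthesize(&reply)
    }
}

// Dummy backends so the sketch runs without any external service;
// real implementations would call Whisper, Deepgram, ElevenLabs, etc.
struct FakeStt;
impl SpeechToText for FakeStt {
    fn transcribe(&self, audio: &[u8]) -> String {
        String::from_utf8_lossy(audio).into_owned()
    }
}

struct FakeTts;
impl TextToSpeech for FakeTts {
    fn synthesize(&self, text: &str) -> Vec<u8> {
        text.as_bytes().to_vec()
    }
}

fn main() {
    let pipeline = VoicePipeline { stt: FakeStt, tts: FakeTts };
    let out = pipeline.handle(b"hello", |t| format!("echo: {t}"));
    assert_eq!(out, b"echo: hello".to_vec());
}
```

Keeping STT and TTS behind traits is what lets the `voice_profile` config below swap providers without touching the routing code.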
Future agent config will include:

```json
"voice_profile": {
  "tts": "elevenlabs",
  "stt": "whisper"
}
```

## 🌐 Streaming Output
You can enable real-time LLM streaming to get token-by-token output, ideal for:

- NPCs talking while typing
- Dynamic cutscenes
- Interruptible generation
The async pipeline in `llm_service.rs` supports streaming via:

- `reqwest` and `futures`
- WebSocket-compatible forwarders (WIP)
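The consumer side of token streaming can be sketched with a std channel standing in for the `reqwest`/`futures` byte stream — each received message is one token, rendered as soon as it arrives. The token values are fabricated for the example:

```rust
use std::sync::mpsc;
use std::thread;

// Sketch of token-by-token consumption; a std channel stands in for
// the async stream produced by llm_service.rs.
fn main() {
    let (tx, rx) = mpsc::channel();

    // Producer: pretend each send is one token arriving from the LLM.
    let producer = thread::spawn(move || {
        for tok in ["Hel", "lo, ", "world", "!"] {
            tx.send(tok.to_string()).unwrap();
        }
        // Dropping tx closes the channel, ending the consumer loop.
    });

    let mut output = String::new();
    for tok in rx {
        // In a game loop you would render `tok` immediately here,
        // so the NPC appears to "type" its reply.
        output.push_str(&tok);
    }
    producer.join().unwrap();
    assert_eq!(output, "Hello, world!");
    println!("{output}");
}
```

The same shape makes interruption cheap: dropping the receiver (or breaking out of the loop) abandons the rest of the generation.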
## 🗃️ Persistent Storage (Memory DB)
Persistent long-term memory is being designed to store:

- ⌛ Session data
- 🧠 High-priority long-term memories
- 🔄 Recalled insights across playthroughs

Pluggable backends:

- 🪵 JSON (dev mode)
- 📄 SQLite (lightweight)
- 🛢️ Postgres (production)
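"Pluggable" here suggests a common trait that each backend implements. A minimal sketch, assuming a simple key-value shape (the trait, its methods, and the in-memory stand-in are all hypothetical, not Oxyde's actual storage API):

```rust
use std::collections::HashMap;

// Hypothetical pluggable storage trait; JSON, SQLite, and Postgres
// backends would each implement it.
trait MemoryStore {
    fn save(&mut self, key: &str, value: &str);
    fn load(&self, key: &str) -> Option<String>;
}

// In-memory stand-in used here so the sketch runs with no database.
struct InMemoryStore {
    map: HashMap<String, String>,
}

impl MemoryStore for InMemoryStore {
    fn save(&mut self, key: &str, value: &str) {
        self.map.insert(key.to_string(), value.to_string());
    }
    fn load(&self, key: &str) -> Option<String> {
        self.map.get(key).cloned()
    }
}

fn main() {
    let mut store = InMemoryStore { map: HashMap::new() };
    store.save("session.last_scene", "library");
    assert_eq!(
        store.load("session.last_scene").as_deref(),
        Some("library")
    );
}
```

With agents holding a `dyn MemoryStore` (or a generic parameter), the `storage` config block below only has to pick which implementation to construct.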
Future config block:

```json
"storage": {
  "type": "sqlite",
  "path": "./data/velma.db"
}
```

## 🖥️ CLI Deployment
Use Oxyde to power:

- NPC simulation servers
- Discord agents
- Networked agent brains via socket
- Inference daemons behind REST APIs
`main.rs` and `cli.rs` already support:

```shell
cargo run --example rpg_demo
```

Soon: `oxyde serve` for headless server deployment.
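A "networked agent brain via socket" can be sketched with std networking alone: one line of text in, one (fake) agent reply out. The line protocol, the `agent:` reply prefix, and the handler names are all invented for illustration:

```rust
use std::io::{BufRead, BufReader, Write};
use std::net::{TcpListener, TcpStream};
use std::thread;

// Per-connection handler: read lines, answer each one.
fn handle(stream: TcpStream) {
    let mut reader = BufReader::new(stream.try_clone().unwrap());
    let mut writer = stream;
    let mut line = String::new();
    while reader.read_line(&mut line).unwrap() > 0 {
        // A real daemon would route `line` through the LLM pipeline;
        // here we just echo with a prefix.
        writeln!(writer, "agent: {}", line.trim()).unwrap();
        line.clear();
    }
}

fn main() {
    // Bind to an OS-assigned port so the sketch runs anywhere.
    let listener = TcpListener::bind("127.0.0.1:0").unwrap();
    let addr = listener.local_addr().unwrap();
    thread::spawn(move || {
        for stream in listener.incoming() {
            thread::spawn(move || handle(stream.unwrap()));
        }
    });

    // Act as a client against our own server to show the round trip.
    let mut client = TcpStream::connect(addr).unwrap();
    writeln!(client, "hello").unwrap();
    let mut reply = String::new();
    BufReader::new(client).read_line(&mut reply).unwrap();
    assert_eq!(reply.trim(), "agent: hello");
    println!("{}", reply.trim());
}
```

A production daemon would layer a REST or WebSocket framework on top, but the thread-per-connection shape is enough to see where the agent pipeline plugs in.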
## 🧩 Feature Flags
For lighter builds, Oxyde supports compile-time toggles:

| Feature | Flag |
| --- | --- |
| LLM Streaming | `--features stream` |
| TTS/STT | `--features voice` |
| SQLite | `--features db` |
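In Cargo, each flag in the table maps to an entry under `[features]` in `Cargo.toml`, and code is gated with `#[cfg(feature = "…")]`. A minimal sketch of the pattern (the gated function is hypothetical, not from Oxyde's source):

```rust
// In Cargo.toml (sketch):
//
// [features]
// stream = []
// voice  = []
// db     = []

// Code compiled only when built with `--features stream`.
#[cfg(feature = "stream")]
fn describe_build() -> &'static str {
    "streaming enabled"
}

// Fallback compiled into default (lighter) builds.
#[cfg(not(feature = "stream"))]
fn describe_build() -> &'static str {
    "streaming disabled"
}

fn main() {
    println!("{}", describe_build());
}
```

Because the unselected branch is removed at compile time, a default build carries none of the streaming, voice, or database dependencies.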
## 🔗 Related Pages

- 8. API Reference
- 9. Examples & Demos
