Advanced Topics

This page covers more advanced usage of Oxyde, including voice integration, streaming output, persistent storage, and standalone deployment strategies.

πŸŽ™οΈ Voice Integration (Coming Soon)

Oxyde is built to support real-time speech-to-text and text-to-speech pipelines:

  • Plug in services like:

    • πŸ—£οΈ OpenAI Whisper

    • 🧠 Deepgram, AssemblyAI

    • πŸ—£οΈ ElevenLabs (TTS)

  • Route audio input β†’ transcript β†’ LLM β†’ audio output
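The planned round trip can be stubbed out as below. This is a hedged sketch only: all function names and bodies are illustrative placeholders, not Oxyde APIs.

```rust
// Sketch of the audio -> transcript -> LLM -> audio pipeline.
// Every function here is a stub standing in for a future integration.
fn transcribe(_audio: &[u8]) -> String {
    // Would call an STT backend such as Whisper or Deepgram.
    "hello there".to_string()
}

fn generate(prompt: &str) -> String {
    // Would call the configured LLM.
    format!("NPC reply to: {prompt}")
}

fn synthesize(text: &str) -> Vec<u8> {
    // Would call a TTS backend such as ElevenLabs.
    text.as_bytes().to_vec()
}

fn main() {
    let audio_in = vec![0u8; 4]; // stubbed microphone frame
    let transcript = transcribe(&audio_in);
    let reply = generate(&transcript);
    let audio_out = synthesize(&reply);
    assert!(!audio_out.is_empty());
    println!("{reply}");
}
```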

Future agent config will include:

"voice_profile": {
  "tts": "elevenlabs",
  "stt": "whisper"
}

🌐 Streaming Output

You can enable real-time LLM streaming to get token-by-token output, ideal for:

  • NPCs talking while typing

  • Dynamic cutscenes

  • Interruptible generation

The async pipeline in llm_service.rs supports streaming via:

  • reqwest and futures

  • WebSocket-compatible forwarders (WIP)
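The token-by-token consumption pattern can be sketched with std-only stand-ins. Here a channel substitutes for the async byte stream that reqwest and futures would yield, and `forward_tokens` is an illustrative name, not part of llm_service.rs:

```rust
use std::sync::mpsc;
use std::thread;

// Hedged sketch: forward each token the moment it arrives, rather than
// waiting for the full completion. The channel stands in for the stream.
fn forward_tokens(tokens: Vec<String>) -> String {
    let (tx, rx) = mpsc::channel();

    // Producer: the LLM backend emitting tokens as they are generated.
    thread::spawn(move || {
        for tok in tokens {
            tx.send(tok).unwrap();
        }
    });

    // Consumer: act on each token immediately, e.g. push it to a
    // WebSocket forwarder or drive an NPC "typing" effect.
    let mut full = String::new();
    for tok in rx {
        print!("{tok}");
        full.push_str(&tok);
    }
    println!();
    full
}

fn main() {
    let tokens = vec!["Hel".into(), "lo, ".into(), "world".into()];
    assert_eq!(forward_tokens(tokens), "Hello, world");
}
```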


πŸ—ƒοΈ Persistent Storage (Memory DB)

Persistent long-term memory is being designed to store:

  • βŒ› Session data

  • 🧠 High-priority long-term memories

  • πŸ”„ Recalled insights across playthroughs

Pluggable backends:

  • πŸͺ΅ JSON (dev mode)

  • πŸ“„ SQLite (lightweight)

  • πŸ›’οΈ Postgres (production)

Future config block:
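The block is not yet finalized; a plausible shape, with purely illustrative keys, might mirror the voice_profile snippet above:

```json
"memory_store": {
  "backend": "sqlite",
  "path": "saves/agent_memory.db"
}
```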


πŸ–₯️ CLI Deployment

Use Oxyde to power:

  • NPC simulation servers

  • Discord agents

  • Networked agent brains via socket

  • Inference daemons behind REST APIs

The entry points in main.rs and cli.rs already support command-line invocation.

Soon: oxyde serve for headless server deployment.
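Subcommand dispatch of that kind can be sketched as follows; the command names and messages here are illustrative, not Oxyde's actual CLI surface:

```rust
use std::env;

// Hedged sketch of how cli.rs might route subcommands, including a
// future `serve` mode for headless deployment.
fn dispatch(args: &[&str]) -> String {
    match args.first() {
        Some(&"serve") => "starting headless server".to_string(),
        Some(&"run") => "running agent loop".to_string(),
        _ => "usage: oxyde <run|serve>".to_string(),
    }
}

fn main() {
    let args: Vec<String> = env::args().skip(1).collect();
    let refs: Vec<&str> = args.iter().map(String::as_str).collect();
    println!("{}", dispatch(&refs));
}
```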


🧩 Feature Flags

For lighter builds, Oxyde supports compile-time toggles:

Feature         Flag
LLM Streaming   --features stream
TTS/STT         --features voice
SQLite          --features db
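Flags combine in a single cargo invocation (standard Cargo syntax; the feature names are the ones listed above):

```shell
# Streaming support only
cargo build --features stream

# Streaming plus voice in one build
cargo build --features "stream,voice"
```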

