# Inference System
Oxyde’s inference engine transforms internal agent state into high-quality LLM prompts — and routes them intelligently across multiple providers. This is the heart of Oxyde’s autonomy.
## 🔧 Prompt Construction
Every prompt is built dynamically based on:
- 💬 Current input (e.g. the player's message)
- 🧠 Recalled memories (ranked by importance)
- 😶‍🌫️ Emotional state (6D vector)
- 🎯 Active goals
- 🧍 Agent name/persona
Example (simplified):

```text
Agent: Velma
Emotional state: curious, calm
Top memory: "Marcus gave me the map"
Current goal: explore ruins

Prompt: You are Velma, a curious NPC currently exploring ruins. Remember that Marcus gave you the map. The player just asked: "What's down that tunnel?"
```
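In code, the assembly step might look like this minimal sketch; the names `AgentState`, `build_prompt`, and `describe_emotions` are illustrative, not Oxyde's actual API:

```rust
// Hypothetical sketch: `AgentState`, `build_prompt`, and `describe_emotions`
// are illustrative names, not Oxyde's actual API.
struct AgentState {
    name: String,
    emotions: [f32; 6],        // the 6D emotional state vector
    top_memories: Vec<String>, // recalled memories, ranked by importance
    active_goal: String,
}

fn build_prompt(agent: &AgentState, player_input: &str) -> String {
    // Reduce the 6D vector to a short human-readable descriptor.
    let mood = describe_emotions(&agent.emotions);
    format!(
        "You are {}, a {} NPC currently {}. Remember that {}. The player just asked: \"{}\"",
        agent.name,
        mood,
        agent.active_goal,
        agent
            .top_memories
            .first()
            .map(String::as_str)
            .unwrap_or("nothing in particular"),
        player_input,
    )
}

fn describe_emotions(_v: &[f32; 6]) -> String {
    // Placeholder: a real mapping would threshold each dimension.
    "curious, calm".to_string()
}

fn main() {
    let velma = AgentState {
        name: "Velma".into(),
        emotions: [0.8, 0.1, 0.0, 0.2, 0.6, 0.3],
        top_memories: vec!["Marcus gave me the map".into()],
        active_goal: "exploring ruins".into(),
    };
    println!("{}", build_prompt(&velma, "What's down that tunnel?"));
}
```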
## 🔀 Multi-LLM Routing

Oxyde supports dynamic routing to:
| Provider     | Status       |
| ------------ | ------------ |
| OpenAI       | ✅ Supported |
| Groq         | ✅ Supported |
| Anthropic    | ✅ Supported |
| xAI          | ✅ Supported |
| Local (GGUF) | ✅ Supported |
Routing logic lives in `llm_service.rs`. It selects a provider based on:

- ⚡ Latency benchmarks
- 💸 Cost settings
- 🎯 Prompt type or agent profile
- 🛠️ API key availability
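A simplified sketch of that selection logic follows; the types, field names, and cost unit are illustrative, and the real implementation in `llm_service.rs` may weigh these factors differently:

```rust
// Hypothetical sketch of provider selection; the real routing lives in
// llm_service.rs and its types may differ.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Provider {
    OpenAi,
    Groq,
    Anthropic,
    XAi,
    Local,
}

struct ProviderStats {
    avg_latency_ms: u32, // rolling latency benchmark
    cost_per_call: f64,  // from cost settings (illustrative unit)
    has_api_key: bool,   // providers without credentials are skipped
}

/// Pick the fastest provider that has credentials and fits the cost budget.
fn select_provider(candidates: &[(Provider, ProviderStats)], max_cost: f64) -> Option<Provider> {
    candidates
        .iter()
        .filter(|(_, s)| s.has_api_key && s.cost_per_call <= max_cost)
        .min_by_key(|(_, s)| s.avg_latency_ms)
        .map(|(p, _)| *p)
}

fn main() {
    let stats = [
        (Provider::Groq, ProviderStats { avg_latency_ms: 35, cost_per_call: 0.003, has_api_key: true }),
        (Provider::OpenAi, ProviderStats { avg_latency_ms: 250, cost_per_call: 0.006, has_api_key: true }),
        (Provider::Local, ProviderStats { avg_latency_ms: 100, cost_per_call: 0.0, has_api_key: true }),
    ];
    println!("{:?}", select_provider(&stats, 0.005)); // Some(Groq)
}
```

A weighted score over latency and cost (rather than the hard budget used here) would be an equally valid design; the hard filter just keeps the sketch short.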
## 🧩 Plug-and-Play Model Interface
To add a new model:

1. Implement the `LLMProvider` trait:

   ```rust
   trait LLMProvider {
       fn generate(&self, prompt: &str) -> String;
   }
   ```

2. Register it in the router dispatch map.
Oxyde does not hardcode OpenAI-specific logic; you can drop in any backend.
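For example, a minimal custom backend might look like this; `MyProvider`, the dispatch-map shape, and the registration call are illustrative, not Oxyde's actual API:

```rust
// Hypothetical example: MyProvider and the dispatch-map shape are
// illustrative; only the LLMProvider trait comes from the docs above.
use std::collections::HashMap;

trait LLMProvider {
    fn generate(&self, prompt: &str) -> String;
}

struct MyProvider {
    api_key: String,
}

impl LLMProvider for MyProvider {
    fn generate(&self, prompt: &str) -> String {
        // Call your backend here; echoed for the sketch.
        format!("[my-provider:{}] response to: {}", self.api_key.len(), prompt)
    }
}

fn main() {
    // Register in a dispatch map keyed by provider name (shape assumed).
    let mut router: HashMap<&str, Box<dyn LLMProvider>> = HashMap::new();
    router.insert("my-provider", Box::new(MyProvider { api_key: "sk-...".into() }));

    let reply = router["my-provider"].generate("Hello?");
    println!("{reply}");
}
```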
## 🔁 Async Inference Pipeline

- Batched, cancellable, and thread-safe
- Built on Rust's `tokio` + `reqwest`
- Future-proofed for streaming output and live voice
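A minimal sketch of that async shape, assuming the `tokio`, `reqwest`, and `futures` crates; the endpoint and payload are placeholders, not Oxyde's real wire format:

```rust
// Sketch only: the endpoint and payload below are placeholders.
use std::time::Duration;

async fn infer(client: &reqwest::Client, prompt: String) -> reqwest::Result<String> {
    client
        .post("https://example.invalid/v1/generate") // placeholder endpoint
        .body(prompt)
        .send()
        .await?
        .text()
        .await
}

#[tokio::main]
async fn main() {
    // Thread-safe: one Client is shared across all concurrent requests.
    let client = reqwest::Client::new();
    let prompts = vec!["Hi".to_string(), "What's down that tunnel?".to_string()];

    // Batched: all requests run concurrently.
    let batch = futures::future::join_all(prompts.into_iter().map(|p| infer(&client, p)));

    // Cancellable: dropping the future (here, via timeout) abandons the batch.
    match tokio::time::timeout(Duration::from_secs(10), batch).await {
        Ok(results) => println!("{} responses received", results.len()),
        Err(_) => eprintln!("inference batch timed out"),
    }
}
```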
## ⚙️ Configuring Inference

Set routing preferences in `config.json`:

```json
"llm_profile": {
  "preferred_provider": "groq",
  "fallbacks": ["openai", "local"]
}
```
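Loading that profile can be plain `serde` deserialization. This sketch assumes the struct shape below, which mirrors the snippet above but is not necessarily Oxyde's actual config schema:

```rust
// Hypothetical config structs; field names match the config.json snippet.
use serde::Deserialize;

#[derive(Deserialize, Debug)]
struct LlmProfile {
    preferred_provider: String,
    fallbacks: Vec<String>,
}

#[derive(Deserialize, Debug)]
struct Config {
    llm_profile: LlmProfile,
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let raw = std::fs::read_to_string("config.json")?;
    let config: Config = serde_json::from_str(&raw)?;

    // Try the preferred provider first, then each fallback in order.
    for provider in std::iter::once(&config.llm_profile.preferred_provider)
        .chain(config.llm_profile.fallbacks.iter())
    {
        println!("would try provider: {provider}");
    }
    Ok(())
}
```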
## 📊 Benchmarks (placeholder)

| Provider | Latency | Cost   |
| -------- | ------- | ------ |
| Groq     | 35 ms   | $0.003 |
| OpenAI   | 250 ms  | $0.006 |
| Local    | ~100 ms | free   |
(Benchmarks depend on prompt length and model. Replace with real data if needed.)
## 🔗 Related Pages

- 4. Configuring Agents → `llm_profile`
- 8. API Reference → `llm_service.rs`
