# Inference System

#### 🔧 Prompt Construction

Every prompt is built dynamically based on:

* 💬 Current input (e.g., the player's message)
* 🧠 Recalled memories (ranked by importance)
* 😶‍🌫️ Emotional state (6D vector)
* 🎯 Active goals
* 🧍 Agent name/persona

Example (simplified):

```
Agent: Velma
Emotional state: curious, calm
Top memory: "Marcus gave me the map"
Current goal: explore ruins

Prompt: You are Velma, a curious NPC currently exploring ruins. Remember that Marcus gave you the map. The player just asked: “What’s down that tunnel?”
```
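The assembly above can be sketched in a few lines of Rust. The struct and function names here (`PromptContext`, `build_prompt`) are illustrative, not Oxyde's actual API, and the emotion labels stand in for values derived from the 6D vector:

```rust
// Illustrative context bundle; real field types in Oxyde may differ.
struct PromptContext {
    agent_name: String,
    emotional_state: Vec<String>, // labels derived from the 6D emotion vector
    top_memory: String,           // highest-ranked recalled memory
    active_goal: String,
    player_input: String,
}

// Assemble the final prompt from the context pieces.
fn build_prompt(ctx: &PromptContext) -> String {
    format!(
        "You are {}, a {} NPC currently {}. Remember that {}. The player just asked: \"{}\"",
        ctx.agent_name,
        ctx.emotional_state.join(", "),
        ctx.active_goal,
        ctx.top_memory,
        ctx.player_input
    )
}
```

Feeding in Velma's context from the example yields a prompt of the same shape as the one shown above.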

***

#### 🔀 Multi-LLM Routing

Oxyde supports dynamic routing to:

| Provider     | Status      |
| ------------ | ----------- |
| OpenAI       | ✅ Supported |
| Groq         | ✅ Supported |
| Anthropic    | ✅ Supported |
| xAI          | ✅ Supported |
| Local (GGUF) | ✅ Supported |

Routing logic lives in `llm_service.rs`. It selects a provider based on:

* ⚡ Latency benchmarks
* 💸 Cost settings
* 🎯 Prompt type or agent profile
* 🛠️ API key availability
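A minimal sketch of this selection logic, assuming a latency budget and per-token cost as inputs (the `ProviderInfo` struct and `select_provider` function are hypothetical, not the real `llm_service.rs` interface):

```rust
// Hypothetical provider metadata used for routing decisions.
struct ProviderInfo {
    name: &'static str,
    avg_latency_ms: u32,
    cost_per_1k: f64, // USD per 1k tokens
    has_api_key: bool,
}

/// Pick the cheapest provider that has a configured API key
/// and fits within the given latency budget.
fn select_provider<'a>(
    providers: &'a [ProviderInfo],
    max_latency_ms: u32,
) -> Option<&'a ProviderInfo> {
    providers
        .iter()
        .filter(|p| p.has_api_key && p.avg_latency_ms <= max_latency_ms)
        .min_by(|a, b| a.cost_per_1k.partial_cmp(&b.cost_per_1k).unwrap())
}
```

A real router would also factor in prompt type and agent profile; this sketch covers only the latency, cost, and key-availability criteria.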

***

#### 🧩 Plug-and-Play Model Interface

To add a new model:

1. Implement the `LLMProvider` trait:

```rust
trait LLMProvider {
    fn generate(&self, prompt: &str) -> String;
}
```

2. Register it in the router dispatch map.

Oxyde does not hardcode OpenAI logic; you can drop in any provider that implements the trait.
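Both steps together look roughly like this. The trait is repeated so the block is self-contained; `EchoProvider` and `build_router` are illustrative stand-ins (a real provider would make an HTTP call in `generate`):

```rust
use std::collections::HashMap;

trait LLMProvider {
    fn generate(&self, prompt: &str) -> String;
}

// Stub provider for illustration; a real one would call its API here.
struct EchoProvider;

impl LLMProvider for EchoProvider {
    fn generate(&self, prompt: &str) -> String {
        format!("echo: {prompt}")
    }
}

// Step 2: register the provider under a name in the dispatch map.
fn build_router() -> HashMap<&'static str, Box<dyn LLMProvider>> {
    let mut router: HashMap<&'static str, Box<dyn LLMProvider>> = HashMap::new();
    router.insert("echo", Box::new(EchoProvider));
    router
}
```

Boxing the trait object lets the router hold any mix of providers behind one interface.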

***

#### 🔁 Async Inference Pipeline

* Batched, cancellable, and thread-safe
* Uses Rust `tokio` + `reqwest`
* Future-proofed for streaming output and live voice
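To illustrate the cancellable-batch idea without pulling in `tokio`, here is a std-thread stand-in: a batch of prompts is processed on a worker thread and can be aborted early via a shared flag (the real pipeline uses `tokio` tasks and `reqwest` calls instead):

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

// Process a batch of prompts on a worker thread, checking a shared
// cancellation flag before each item. The format! call is a placeholder
// for the actual LLM request.
fn run_batch(prompts: Vec<String>, cancel: Arc<AtomicBool>) -> Vec<String> {
    let worker = thread::spawn(move || {
        let mut replies = Vec::new();
        for prompt in prompts {
            if cancel.load(Ordering::Relaxed) {
                break; // stop early if cancellation was requested
            }
            replies.push(format!("reply to: {prompt}")); // placeholder for inference
        }
        replies
    });
    worker.join().expect("worker thread panicked")
}
```

With `tokio`, the same shape falls out of spawned tasks plus a `CancellationToken` or dropped `JoinHandle`, and the flag check becomes a `select!` against the token.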

***

#### ⚙️ Configuring Inference

Set routing preferences in `config.json`:

```json
{
  "llm_profile": {
    "preferred_provider": "groq",
    "fallbacks": ["openai", "local"]
  }
}
```
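The fallback chain resolves in order: the preferred provider first, then each fallback, skipping any provider without a usable API key. A small sketch of that resolution (the `resolve_provider` helper is hypothetical):

```rust
/// Walk the preferred provider and then the fallbacks in order,
/// returning the first one that is actually available (e.g., has an API key).
fn resolve_provider<'a>(
    preferred: &'a str,
    fallbacks: &'a [&'a str],
    available: &[&str],
) -> Option<&'a str> {
    std::iter::once(preferred)
        .chain(fallbacks.iter().copied())
        .find(|p| available.contains(p))
}
```

With the config above and no Groq key configured, this would fall through to `openai`, then `local`.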

***

#### 📊 Benchmarks (placeholder)

| Provider | Latency (avg) | Token cost ($/1k) |
| -------- | ------------- | ----------------- |
| Groq     | 35ms          | $0.003            |
| OpenAI   | 250ms         | $0.006            |
| Local    | \~100ms       | free              |

(*Benchmarks depend on prompt length and model. Replace with real data if needed.*)

***

#### 🔗 Related Pages

* 4\. Configuring Agents → `llm_profile`
* 8\. API Reference → `llm_service.rs`
