Structured Output

MagicAF makes it straightforward to get structured, typed responses from your RAG pipeline. Instead of working with raw text, define a struct and let the framework deserialize it automatically.

## Using JsonResultParser&lt;T&gt;

The fastest way to get structured output: define a struct that derives Deserialize, then use JsonResultParser.

### 1. Define your result type

```rust
use serde::{Deserialize, Serialize};

#[derive(Debug, Serialize, Deserialize)]
pub struct QAAnswer {
    pub answer: String,
    pub confidence: f32,
    pub sources: Vec<usize>,
}
```

### 2. Tell the LLM to output JSON

Use a PromptBuilder that instructs the LLM to respond in your expected format: ...
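LLM replies often wrap the JSON in prose or code fences, so a lenient parser's first job is to isolate the outermost JSON object before deserializing. A minimal std-only sketch of that extraction step (the helper name `extract_json` is illustrative and not part of the MagicAF API):

```rust
/// Return the substring spanning the first '{' to the last '}',
/// which is usually the JSON object an LLM embedded in its reply.
/// Illustrative helper; not part of the MagicAF API.
fn extract_json(raw: &str) -> Option<&str> {
    let start = raw.find('{')?;
    let end = raw.rfind('}')?;
    if end < start {
        return None;
    }
    Some(&raw[start..=end])
}
```

The extracted slice would then be fed to `serde_json::from_str::<QAAnswer>`, turning a noisy reply into a typed value or a parse error you can handle.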

VectorStore

## Trait Definition

```rust
#[async_trait]
pub trait VectorStore: Send + Sync {
    async fn index(
        &self,
        collection: &str,
        embeddings: Vec<Vec<f32>>,
        payloads: Vec<serde_json::Value>,
    ) -> Result<()>;

    async fn search(
        &self,
        collection: &str,
        query_vector: Vec<f32>,
        limit: usize,
        filter: Option<serde_json::Value>,
    ) -> Result<Vec<SearchResult>>;

    async fn delete_by_entity(
        &self,
        collection: &str,
        entity_id: Uuid,
    ) -> Result<()>;

    async fn ensure_collection(
        &self,
        collection: &str,
        vector_size: usize,
    ) -> Result<()>;
}
```

Module: magicaf_core::vector_store

## Methods

### index

```rust
async fn index(
    &self,
    collection: &str,
    embeddings: Vec<Vec<f32>>,
    payloads: Vec<serde_json::Value>,
) -> Result<()>
```

Index a batch of embeddings with their associated JSON payloads. ...
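Conceptually, `search` ranks stored vectors by similarity to the query vector and returns the `limit` best matches. Which distance metric MagicAF configures is not stated in this excerpt; the sketch below assumes cosine similarity (a common default for text embeddings) and shows the scoring plus top-k selection with only the standard library:

```rust
/// Cosine similarity between two equal-length vectors.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Return the indices of the `limit` stored vectors most similar
/// to the query, best first. Sketch of what a search does internally.
fn top_k(query: &[f32], stored: &[Vec<f32>], limit: usize) -> Vec<usize> {
    let mut scored: Vec<(usize, f32)> = stored
        .iter()
        .enumerate()
        .map(|(i, v)| (i, cosine(query, v)))
        .collect();
    // Sort descending by similarity score.
    scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    scored.into_iter().take(limit).map(|(i, _)| i).collect()
}
```

A real backend like Qdrant replaces the linear scan with an approximate index, but the contract seen through the trait is the same.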

Edge & Mobile

MagicAF supports deployment on resource-constrained devices where running Qdrant and a full LLM server is impractical.

## Architecture

*(Architecture diagram omitted: on-device embedding (ONNX/CoreML) feeds the RAG workflow and the in-memory vector store; the LLM is omitted or reached remotely when online.)*

## Component Availability

| Component | Server Deployment | Edge Deployment |
|---|---|---|
| Vector Store | Qdrant (magicaf-qdrant) | InMemoryVectorStore (in magicaf-core) |
| Embeddings | llama.cpp / TEI server | On-device (ONNX, CoreML, TFLite) |
| LLM | vLLM / llama.cpp | Omitted, or remote when online |
| Persistence | Qdrant handles it | InMemoryVectorStore::save() / load() |

## InMemoryVectorStore

The zero-dependency vector store that ships with magicaf-core: ...
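The table above says `InMemoryVectorStore::save()` / `load()` handle persistence on edge devices, but the on-disk format is not specified in this excerpt. The std-only sketch below illustrates the round-trip idea with a deliberately simple line-based text format (the format and function names are illustrative, not MagicAF's):

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Write each vector as one comma-separated line.
/// Illustrative format only; InMemoryVectorStore's real
/// save() format is not documented here.
fn save_vectors(path: &Path, vectors: &[Vec<f32>]) -> io::Result<()> {
    let body: String = vectors
        .iter()
        .map(|v| {
            v.iter()
                .map(|x| x.to_string())
                .collect::<Vec<_>>()
                .join(",")
        })
        .collect::<Vec<_>>()
        .join("\n");
    fs::write(path, body)
}

/// Read the same format back into memory.
fn load_vectors(path: &Path) -> io::Result<Vec<Vec<f32>>> {
    let body = fs::read_to_string(path)?;
    Ok(body
        .lines()
        .filter(|l| !l.is_empty())
        .map(|l| l.split(',').map(|x| x.parse().unwrap()).collect())
        .collect())
}
```

On mobile you would point the path at the app's documents directory and call save on backgrounding, load on launch.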

LlmService

## Trait Definition

```rust
#[async_trait]
pub trait LlmService: Send + Sync {
    async fn chat(&self, request: ChatRequest) -> Result<ChatResponse>;
    async fn generate(&self, prompt: &str, config: GenerationConfig) -> Result<String>;
    async fn health_check(&self) -> Result<()>;
}
```

Module: magicaf_core::llm

## Methods

### chat

```rust
async fn chat(&self, request: ChatRequest) -> Result<ChatResponse>
```

Send a structured chat completion request.

| Parameter | Type | Description |
|---|---|---|
| request | ChatRequest | The chat completion request payload |

Returns: ChatResponse containing the model's reply, usage statistics, and metadata.

Errors: MagicError::LlmError, MagicError::HttpError, MagicError::SerializationError

### generate

```rust
async fn generate(&self, prompt: &str, config: GenerationConfig) -> Result<String>
```

High-level convenience: turn a raw prompt into generated text. ...
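A plausible reading of `generate` (not confirmed by this excerpt) is a thin wrapper that places the raw prompt into a single user message before delegating to `chat`. The std-only sketch below shows that mapping with a simplified message type standing in for MagicAF's real `ChatRequest`:

```rust
/// Simplified stand-in for a chat message; MagicAF's real ChatRequest
/// carries more (model, sampling parameters, etc.).
#[derive(Debug, PartialEq)]
struct Message {
    role: String,
    content: String,
}

/// What a generate-style convenience likely does: wrap the prompt as a
/// single user turn, optionally preceded by a system message.
fn prompt_to_messages(system: Option<&str>, prompt: &str) -> Vec<Message> {
    let mut messages = Vec::new();
    if let Some(sys) = system {
        messages.push(Message { role: "system".into(), content: sys.into() });
    }
    messages.push(Message { role: "user".into(), content: prompt.into() });
    messages
}
```

This is why `generate` is the right entry point for one-shot prompting, while `chat` is for multi-turn or tool-using flows.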

Multi-Source Analysis

This example demonstrates a multi-source analysis pipeline with custom implementations of all three adapter traits and a complex output schema.

- Difficulty: ★★★ Advanced
- Custom adapters: EvidenceFormatter, PromptBuilder, ResultParser
- Output type: IntelSummary (complex nested JSON)

## What This Example Does

- Implements all three adapter traits with domain-specific logic
- Uses a complex nested output type
- Configures custom GenerationConfig for analytical precision
- Handles potentially malformed LLM output gracefully

## Domain Result Type

```rust
use serde::{Deserialize, Serialize};

#[derive(Debug, Serialize, Deserialize)]
struct IntelSummary {
    key_findings: Vec<String>,
    confidence_assessment: String,
    information_gaps: Vec<String>,
    recommended_actions: Vec<String>,
}
```

## Custom Evidence Formatter

Formats evidence with source metadata, dates, and classification levels: ...
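The formatter's job is to turn retrieved items into a block of text the prompt can embed. A std-only sketch of rendering evidence with source, date, and classification metadata (the `Evidence` field names are illustrative; the real EvidenceFormatter trait signature is not shown in this excerpt):

```rust
/// Illustrative evidence record; the real payload shape depends on
/// what you indexed into the vector store.
struct Evidence {
    source: String,
    date: String,
    classification: String,
    text: String,
}

/// Render numbered evidence items with their metadata, the way a
/// domain-specific EvidenceFormatter might.
fn format_evidence(items: &[Evidence]) -> String {
    items
        .iter()
        .enumerate()
        .map(|(i, e)| {
            format!(
                "[{}] ({} | {} | {})\n{}",
                i + 1, e.source, e.date, e.classification, e.text
            )
        })
        .collect::<Vec<_>>()
        .join("\n\n")
}
```

Numbering the items matters: it lets the prompt ask the LLM to cite findings back to `[1]`, `[2]`, and so on.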

Quickstart

This guide walks you through building a full Retrieval-Augmented Generation pipeline: embed documents, store them in a vector database, and answer questions using an LLM.

## The Complete Pipeline

```rust
use magicaf_core::prelude::*;
use magicaf_core::embeddings::LocalEmbeddingService;
use magicaf_local_llm::LocalLlmService;
use magicaf_qdrant::QdrantVectorStore;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // 1. Configure services
    let embedder = LocalEmbeddingService::new(EmbeddingConfig {
        base_url: "http://localhost:8080".into(),
        model_name: "bge-large-en-v1.5".into(),
        batch_size: 32,
        timeout_secs: 30,
        api_key: None,
    })?;

    let store = QdrantVectorStore::new(VectorStoreConfig {
        base_url: "http://localhost:6333".into(),
        api_key: None,
        timeout_secs: 30,
    }).await?;

    let llm = LocalLlmService::new(LlmConfig {
        base_url: "http://localhost:8000/v1".into(),
        model_name: "mistral-7b".into(),
        api_key: None,
        timeout_secs: 120,
    })?;

    // 2. Build the RAG workflow
    let workflow = RAGWorkflow::builder()
        .embedding_service(embedder)
        .vector_store(store)
        .llm_service(llm)
        .evidence_formatter(DefaultEvidenceFormatter)
        .prompt_builder(DefaultPromptBuilder::new())
        .result_parser(RawResultParser)
        .collection("my_docs")
        .top_k(5)
        .build()?;

    // 3. Run a query
    let result = workflow.run("What is MagicAF?", None).await?;
    println!("{}", result.result);
    println!("Evidence items: {}", result.evidence_count);

    Ok(())
}
```

## What Just Happened

The RAGWorkflow executes a six-step pipeline: ...
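The query above assumes the `my_docs` collection is already populated. Before indexing, documents are typically split into chunks sized for the embedding model; whether MagicAF ships its own chunking helper is not shown in this excerpt, so here is a std-only sketch of fixed-size word chunking with overlap:

```rust
/// Split text into word-based chunks of `size` words, repeating
/// `overlap` words between consecutive chunks so context isn't lost
/// at the boundaries. Illustrative only; not a MagicAF API.
fn chunk_words(text: &str, size: usize, overlap: usize) -> Vec<String> {
    assert!(overlap < size, "overlap must be smaller than chunk size");
    let words: Vec<&str> = text.split_whitespace().collect();
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < words.len() {
        let end = (start + size).min(words.len());
        chunks.push(words[start..end].join(" "));
        if end == words.len() {
            break;
        }
        start = end - overlap;
    }
    chunks
}
```

Each chunk would then go through `EmbeddingService::embed` and `VectorStore::index` with a payload carrying the original text and source.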

Testing

MagicAF’s trait-based design makes testing straightforward. You can test each adapter in isolation and test complete workflows using mock services — no live infrastructure required.

## Mock Services

The test crate provides mock implementations of all infrastructure traits:

```rust
use magicaf_tests::mocks::*;
```

| Mock | Trait | Behavior |
|---|---|---|
| MockEmbeddingService | EmbeddingService | Returns fixed-dimension zero vectors |
| MockVectorStore | VectorStore | Returns configurable search results |
| MockLlmService | LlmService | Returns a configurable text response |

## Testing a Custom ResultParser

The simplest unit test — verify that your parser handles valid and invalid LLM output: ...
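A parser test needs no infrastructure at all: call the parse function with good and bad input and assert on the `Result`. The std-only sketch below uses a deliberately trivial illustrative parser (your real ResultParser would deserialize the whole output struct), but the test shape is the same:

```rust
/// Trivial illustrative parser: pull a numeric confidence out of a
/// "confidence=<float>" line. Stands in for a real ResultParser.
fn parse_confidence(raw: &str) -> Result<f32, String> {
    raw.lines()
        .find_map(|l| l.strip_prefix("confidence="))
        .ok_or_else(|| "missing confidence field".to_string())?
        .trim()
        .parse()
        .map_err(|e| format!("bad number: {e}"))
}
```

In a real test module you would wrap these assertions in `#[test]` functions, one for the happy path and one per failure mode.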

Traits & Interfaces

MagicAF’s extensibility comes from its trait-based design. Every major component is accessed through an async trait, and every trait can be implemented by your application.

## Infrastructure Traits

These traits define the fundamental AI building blocks.

### EmbeddingService

Produces dense vector embeddings from text input.

```rust
#[async_trait]
pub trait EmbeddingService: Send + Sync {
    /// Embed a batch of input strings, returning one vector per input.
    async fn embed(&self, inputs: &[String]) -> Result<Vec<Vec<f32>>>;

    /// Embed a single string.
    async fn embed_single(&self, input: &str) -> Result<Vec<f32>>;

    /// Verify the upstream service is reachable.
    async fn health_check(&self) -> Result<()>;
}
```

Shipped implementation: LocalEmbeddingService — calls any OpenAI-compatible /v1/embeddings endpoint. ...
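Any type can implement EmbeddingService. A quick way to see the shape of an implementation without a model server is a deterministic hash-based pseudo-embedding; the sketch below shows only the embedding math (std-only and synchronous, since wiring it into the async trait needs the async_trait crate), and is useful for tests but useless for semantic search:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Deterministic pseudo-embedding: hash each word into one of `dim`
/// buckets and count hits. Same input always yields the same vector.
fn hash_embed(input: &str, dim: usize) -> Vec<f32> {
    let mut v = vec![0.0f32; dim];
    for word in input.split_whitespace() {
        let mut h = DefaultHasher::new();
        word.hash(&mut h);
        v[(h.finish() as usize) % dim] += 1.0;
    }
    v
}
```

An async implementation would wrap this in `embed` (mapping over the batch) and `embed_single`, with `health_check` returning `Ok(())` unconditionally.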

Observability

MagicAF provides built-in observability through structured logging, health checks, and tracing instrumentation.

## Structured Logging

MagicAF uses the tracing ecosystem. All public methods are annotated with #[instrument] and emit structured spans with contextual fields.

### Basic Setup

```rust
tracing_subscriber::fmt()
    .json()
    .with_env_filter("info,magicaf=debug")
    .init();
```

This produces JSON-formatted log lines with fields like:

```json
{
  "timestamp": "2025-01-15T10:30:00Z",
  "level": "INFO",
  "target": "magicaf_core::rag",
  "message": "Searching vector store",
  "span": { "collection": "my_docs" },
  "fields": { "top_k": 10 }
}
```

### Structured Fields

| Span | Fields |
|---|---|
| RAGWorkflow::run | collection |
| VectorStore::search | collection, limit |
| VectorStore::index | collection, count |
| EmbeddingService::embed | count (number of inputs) |
| LlmService::chat | model, messages (count) |

## Production Configuration

For production, attach a subscriber that ships spans to your observability stack: ...

RAGWorkflow

## Overview

RAGWorkflow is the central pipeline engine. It is fully generic over all service and adapter traits, with zero runtime overhead — all dispatch is static.

Module: magicaf_core::rag

## Type Signature

```rust
pub struct RAGWorkflow<S, V, L, EF, PB, RP, T>
where
    S: EmbeddingService,
    V: VectorStore,
    L: LlmService,
    EF: EvidenceFormatter,
    PB: PromptBuilder,
    RP: ResultParser<T>,
    T: Send,
```

In practice, you never write out the type parameters — Rust infers them from the builder. ...
