MagicAF Documentation

Defense-grade, HIPAA-compliant AI toolkit for secure, air-gapped deployments. Production-ready Rust framework for RAG pipelines, NLP analysis, embeddings, vector search, and LLM orchestration. Designed for SIPR/NIPR, classified, and healthcare environments with zero cloud dependencies.

Architecture

MagicAF follows a strict three-layer architecture. Each layer has a single responsibility, and layers communicate only through well-defined trait boundaries.

Layer Model

- Layer 3: Domain adapters (EvidenceFormatter, PromptBuilder, ResultParser<T>)
- Layer 2: Orchestration (RAGWorkflow)
- Layer 1: Infrastructure services (EmbeddingService, VectorStore, LlmService)

Layer 1 — Infrastructure

The infrastructure layer provides the fundamental AI primitives. Each primitive is defined as an async trait. ...
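The trait-boundary idea can be illustrated with a simplified, synchronous stand-in (hypothetical names; the real MagicAF primitives are async traits). Higher layers depend only on the trait, never on a concrete service:

```rust
// Hypothetical, simplified sketch of a Layer-1 primitive trait.
// The real framework defines these as async traits; async is omitted
// here so the example runs without a runtime.
trait Embedder {
    /// One vector per input, in input order.
    fn embed(&self, inputs: &[String]) -> Vec<Vec<f32>>;
}

/// A stub implementation: maps each text to a fixed-size dummy vector.
struct StubEmbedder {
    dims: usize,
}

impl Embedder for StubEmbedder {
    fn embed(&self, inputs: &[String]) -> Vec<Vec<f32>> {
        inputs
            .iter()
            // Deterministic toy "embedding" based on text length.
            .map(|s| vec![s.len() as f32; self.dims])
            .collect()
    }
}

/// Layer-2 code depends only on `dyn Embedder`, never on a concrete type.
fn embed_query(service: &dyn Embedder, query: &str) -> Vec<f32> {
    service.embed(&[query.to_string()]).remove(0)
}

fn main() {
    let svc = StubEmbedder { dims: 4 };
    println!("{:?}", embed_query(&svc, "hello"));
}
```

Because the boundary is a trait object, swapping a stub for a real HTTP-backed service requires no changes in the orchestration layer.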

Building Custom Adapters

Adapters are how your domain logic plugs into the MagicAF RAG pipeline. You implement three traits — each one controls a different stage of the pipeline — without modifying any framework code.

What You Need to Implement

- EvidenceFormatter (after vector search): turns search results into an LLM-ready text block
- PromptBuilder (before the LLM call): assembles the final prompt from query + evidence
- ResultParser<T> (after the LLM call): parses the LLM’s raw text into your domain type T

You can mix custom and default implementations — use defaults for rapid prototyping, then replace them one at a time. ...
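The division of labor above can be sketched with simplified, synchronous stand-ins (hypothetical shapes shown for illustration; the real traits are async and live in magicaf_core):

```rust
// Simplified stand-ins for the three adapter traits (sync for brevity).
trait EvidenceFormatter {
    fn format(&self, results: &[String]) -> String;
}
trait PromptBuilder {
    fn build(&self, query: &str, evidence: &str) -> String;
}
trait ResultParser<T> {
    fn parse(&self, raw: &str) -> Result<T, String>;
}

/// Stage 1: turn search hits into an evidence block.
struct BulletFormatter;
impl EvidenceFormatter for BulletFormatter {
    fn format(&self, results: &[String]) -> String {
        results.iter().map(|r| format!("- {r}\n")).collect()
    }
}

/// Stage 2: assemble the final prompt.
struct SimplePrompt;
impl PromptBuilder for SimplePrompt {
    fn build(&self, query: &str, evidence: &str) -> String {
        format!("Evidence:\n{evidence}\nQuestion: {query}")
    }
}

/// Stage 3: parse raw LLM text into a domain type (here, an integer).
struct IntParser;
impl ResultParser<i64> for IntParser {
    fn parse(&self, raw: &str) -> Result<i64, String> {
        raw.trim().parse().map_err(|e| format!("bad int: {e}"))
    }
}

fn main() {
    let evidence = BulletFormatter.format(&["2 + 2 = 4".to_string()]);
    let prompt = SimplePrompt.build("What is 2 + 2?", &evidence);
    println!("{prompt}");
    println!("parsed = {:?}", IntParser.parse("4"));
}
```

Each stage is independently replaceable, which is what makes the mix-and-match approach described above possible.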

Docker Compose

A MagicAF deployment requires three local services. Here’s a reference Docker Compose configuration that starts all of them.

Infrastructure Components

- Embedding server (default port 8080): dense vector embeddings. Example software: text-embeddings-inference, llama.cpp, vLLM
- Vector database (default port 6333): similarity search. Example software: Qdrant
- LLM server (default port 8000): chat completion / generation. Example software: vLLM, llama.cpp, TGI, Ollama

All three communicate via HTTP REST. No gRPC, no cloud SDK, no vendor library is required at the network level. ...
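As one possible starting point, a compose file for the three services might look like the sketch below. The image names, tags, and port mappings are illustrative assumptions, not part of MagicAF; substitute the software and models you actually deploy.

```yaml
# Illustrative sketch only — images and tags are assumptions.
services:
  embeddings:
    image: ghcr.io/huggingface/text-embeddings-inference:latest
    ports:
      - "8080:80"      # TEI listens on 80 inside the container
  qdrant:
    image: qdrant/qdrant:latest
    ports:
      - "6333:6333"
  llm:
    image: vllm/vllm-openai:latest
    ports:
      - "8000:8000"
```

In an air-gapped environment, these images would be pulled on a connected machine, saved with `docker save`, and loaded on the isolated host with `docker load`.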

EmbeddingService

Trait Definition

    #[async_trait]
    pub trait EmbeddingService: Send + Sync {
        async fn embed(&self, inputs: &[String]) -> Result<Vec<Vec<f32>>>;
        async fn embed_single(&self, input: &str) -> Result<Vec<f32>>;
        async fn health_check(&self) -> Result<()>;
    }

Module: magicaf_core::embeddings

Methods

embed

    async fn embed(&self, inputs: &[String]) -> Result<Vec<Vec<f32>>>

Embed a batch of input strings, returning one vector per input.

- inputs (&[String]): texts to embed. An empty slice returns an empty vec.

Returns: Vec<Vec<f32>> — one embedding vector per input, in the same order.

Errors: MagicError::EmbeddingError, MagicError::HttpError ...
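The documented contract (one vector per input, input order preserved, empty slice in means empty vec out) can be exercised with a stand-alone mock. This mock is hypothetical and not part of the crate; it exists only to make the contract concrete:

```rust
// A mock that honors the documented `embed` contract:
// one vector per input, same order, empty slice -> empty vec.
struct MockEmbedding {
    dims: usize, // must be >= 2 for the position/length encoding below
}

impl MockEmbedding {
    fn embed(&self, inputs: &[String]) -> Vec<Vec<f32>> {
        inputs
            .iter()
            .enumerate()
            .map(|(i, text)| {
                let mut v = vec![0.0f32; self.dims];
                // Encode position and length so ordering is observable.
                v[0] = i as f32;
                v[1] = text.len() as f32;
                v
            })
            .collect()
    }

    /// Convenience wrapper mirroring `embed_single`.
    fn embed_single(&self, input: &str) -> Vec<f32> {
        self.embed(&[input.to_string()]).remove(0)
    }
}

fn main() {
    let svc = MockEmbedding { dims: 4 };
    println!("{:?}", svc.embed(&["ab".into(), "xyz".into()]));
}
```

A mock like this is also useful in unit tests for code built on top of the embedding layer, since it removes the dependency on a running embedding server.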

Minimal RAG

This example demonstrates the bare minimum needed to build a working RAG pipeline. It uses all default adapters and returns raw text output.

Difficulty: ★☆☆ Beginner
Custom adapters: None
Output type: String

What This Example Does

- Connects to local embedding, vector store, and LLM services
- Indexes a handful of sample documents
- Runs a RAG query and prints the answer

Full Code

    use magicaf_core::adapters::{
        DefaultEvidenceFormatter, DefaultPromptBuilder, RawResultParser,
    };
    use magicaf_core::config::{EmbeddingConfig, LlmConfig, VectorStoreConfig};
    use magicaf_core::embeddings::LocalEmbeddingService;
    use magicaf_core::rag::RAGWorkflow;
    use magicaf_local_llm::LocalLlmService;
    use magicaf_qdrant::QdrantVectorStore;

    #[tokio::main]
    async fn main() -> anyhow::Result<()> {
        // ── Structured logging ─────────────────────────────────────────
        tracing_subscriber::fmt()
            .with_env_filter(
                tracing_subscriber::EnvFilter::try_from_default_env()
                    .unwrap_or_else(|_| "info,magicaf=debug".parse().unwrap()),
            )
            .json()
            .init();

        println!("=== MagicAF · Minimal RAG Example ===\n");

        // ── Configuration ──────────────────────────────────────────────
        let embedding_config = EmbeddingConfig {
            base_url: std::env::var("EMBEDDING_URL")
                .unwrap_or_else(|_| "http://localhost:8080".into()),
            model_name: std::env::var("EMBEDDING_MODEL")
                .unwrap_or_else(|_| "bge-large-en-v1.5".into()),
            batch_size: 32,
            timeout_secs: 30,
            api_key: None,
        };

        let vector_config = VectorStoreConfig {
            base_url: std::env::var("QDRANT_URL")
                .unwrap_or_else(|_| "http://localhost:6333".into()),
            api_key: None,
            timeout_secs: 30,
        };

        let llm_config = LlmConfig {
            base_url: std::env::var("LLM_URL")
                .unwrap_or_else(|_| "http://localhost:8000/v1".into()),
            model_name: std::env::var("LLM_MODEL")
                .unwrap_or_else(|_| "mistral-7b".into()),
            api_key: None,
            timeout_secs: 120,
        };

        // ── Service construction ───────────────────────────────────────
        let embedder = LocalEmbeddingService::new(embedding_config)?;
        let store = QdrantVectorStore::new(vector_config).await?;
        let llm = LocalLlmService::new(llm_config)?;

        // ── Index sample documents ─────────────────────────────────────
        let collection = "example_docs";
        store.ensure_collection(collection, 1024).await.ok();

        let docs = vec![
            "Rust is a systems programming language focused on safety.",
            "MagicAF provides embeddings, vector search, and LLM orchestration.",
            "RAG combines retrieval with language model generation.",
        ];
        let doc_texts: Vec<String> = docs.iter().map(|s| s.to_string()).collect();

        use magicaf_core::embeddings::EmbeddingService;
        let embeddings = embedder.embed(&doc_texts).await?;

        let payloads: Vec<serde_json::Value> = docs.iter().enumerate()
            .map(|(i, text)| serde_json::json!({
                "content": text,
                "doc_index": i,
            }))
            .collect();

        use magicaf_core::vector_store::VectorStore;
        store.index(collection, embeddings, payloads).await?;
        println!("Indexed {} documents into '{collection}'\n", docs.len());

        // ── Build and run RAG workflow ─────────────────────────────────
        let workflow = RAGWorkflow::builder()
            .embedding_service(embedder)
            .vector_store(store)
            .llm_service(llm)
            .evidence_formatter(DefaultEvidenceFormatter)
            .prompt_builder(DefaultPromptBuilder::new().with_system(
                "You are a helpful technical assistant. \
                 Answer using only the provided evidence.",
            ))
            .result_parser(RawResultParser)
            .collection(collection)
            .top_k(3)
            .build()?;

        let result = workflow.run("What is MagicAF?", None).await?;

        println!("── Answer ──────────────────────────");
        println!("{}", result.result);
        println!("────────────────────────────────────");
        println!(
            "Evidence items: {} | Tokens: {:?}",
            result.evidence_count,
            result.usage.map(|u| u.total_tokens)
        );

        Ok(())
    }

Key Points

- DefaultEvidenceFormatter: pretty-prints each search result’s JSON payload
- DefaultPromptBuilder::new().with_system(...): wraps evidence in <context> tags with an optional system instruction
- RawResultParser: returns the LLM’s output as a plain String
- top_k(3): retrieve the 3 most similar documents

Running

    cargo run -p example-minimal-rag

Next Steps

→ Document Q&A — add custom adapters and structured output ...
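Under the hood, `top_k(3)` amounts to nearest-neighbor ranking in the vector store. The store performs this server-side; the following self-contained sketch of cosine-similarity top-k is included only to make the semantics concrete:

```rust
/// Cosine similarity between two equal-length vectors.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Indices of the `k` stored vectors most similar to `query`.
fn top_k(query: &[f32], stored: &[Vec<f32>], k: usize) -> Vec<usize> {
    let mut scored: Vec<(usize, f32)> = stored
        .iter()
        .map(|v| cosine(query, v))
        .enumerate()
        .collect();
    // Highest similarity first.
    scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    scored.into_iter().take(k).map(|(i, _)| i).collect()
}

fn main() {
    let stored = vec![vec![1.0, 0.0], vec![0.0, 1.0], vec![0.7, 0.7]];
    println!("{:?}", top_k(&[1.0, 0.0], &stored, 2));
}
```

Raising `top_k` retrieves more evidence per query at the cost of a longer prompt, which is the main tuning trade-off in this step.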

Prerequisites

MagicAF requires three local services and a Rust toolchain. All services communicate over HTTP — no cloud accounts, no vendor SDKs.

Rust Toolchain

MagicAF requires Rust 1.75 or later. Install via rustup:

    curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

Verify your installation:

    rustc --version   # 1.75.0 or later
    cargo --version

Local Services

MagicAF connects to three local HTTP services. You can use any compatible software.

- Embedding server (default port 8080): produces dense vector embeddings. Recommended: llama.cpp (--embedding), text-embeddings-inference, vLLM
- Vector database (default port 6333): stores and searches embeddings. Recommended: Qdrant
- LLM server (default port 8000): chat completion / text generation. Recommended: vLLM, llama.cpp, TGI, Ollama

Quick Setup with Docker

The fastest way to get all services running: ...

Air-Gapped Setup

MagicAF is designed for air-gapped environments from the ground up. Every service runs locally, and all dependencies can be vendored for offline use. For more information about MagicAF’s security architecture and compliance features, see the About page.

Overview

An air-gapped deployment requires four preparation steps on an internet-connected machine, followed by transfer and build on the isolated host.

Step 1 — Vendor Rust Dependencies

On an internet-connected machine:

    # From the MagicAF workspace root
    cargo vendor > .cargo/config.toml

This downloads all crate dependencies into a vendor/ directory and generates a .cargo/config.toml that redirects Cargo to use local sources. ...
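The configuration that `cargo vendor` emits (and which the redirect above writes to `.cargo/config.toml`) typically looks like this, replacing crates.io with the local `vendor/` directory:

```toml
[source.crates-io]
replace-with = "vendored-sources"

[source.vendored-sources]
directory = "vendor"
```

On the isolated host, `cargo build --offline` then resolves every dependency from `vendor/` without touching the network.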

Document Q&A

This example builds a document Q&A system with custom adapters that produce structured JSON answers.

Difficulty: ★★☆ Intermediate
Custom adapters: EvidenceFormatter, PromptBuilder
Output type: QAAnswer (JSON struct)

What This Example Does

- Implements a custom EvidenceFormatter that includes source indices
- Implements a custom PromptBuilder that requests JSON output
- Uses JsonResultParser<QAAnswer> to get typed results

Domain Result Type

    use serde::{Deserialize, Serialize};

    #[derive(Debug, Serialize, Deserialize)]
    struct QAAnswer {
        answer: String,
        confidence: f32,
        source_indices: Vec<usize>,
    }

Custom Evidence Formatter

Formats results with source indices so the LLM can cite them: ...
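The source-indexing idea can be sketched as a free function (hypothetical helper shown for illustration; the real adapter implements the framework’s EvidenceFormatter trait over search results):

```rust
/// Prefix each evidence snippet with a bracketed source index so the
/// LLM can cite sources as [0], [1], ... in its JSON answer.
fn format_with_indices(snippets: &[&str]) -> String {
    snippets
        .iter()
        .enumerate()
        .map(|(i, s)| format!("[{i}] {s}"))
        .collect::<Vec<_>>()
        .join("\n")
}

fn main() {
    let out = format_with_indices(&[
        "Rust is memory-safe.",
        "RAG combines retrieval with generation.",
    ]);
    println!("{out}");
}
```

Because the indices in the evidence block line up with positions in the search results, the `source_indices` field in QAAnswer can be mapped straight back to the retrieved documents.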

Installation

Add Dependencies

Add the MagicAF crates to your Cargo.toml:

    [dependencies]
    magicaf-core = { path = "path/to/magicaf/magicaf-core" }
    magicaf-qdrant = { path = "path/to/magicaf/magicaf-qdrant" }
    magicaf-local-llm = { path = "path/to/magicaf/magicaf-local-llm" }
    tokio = { version = "1", features = ["full"] }

Note: MagicAF is distributed as source. Replace the path values with the actual location of the MagicAF workspace on your system.

Optional Dependencies

These are commonly used alongside MagicAF:

    [dependencies]
    anyhow = "1"       # Ergonomic error handling
    serde = { version = "1", features = ["derive"] }   # For custom result types
    serde_json = "1"   # JSON payloads
    tracing = "0.1"    # Structured logging
    tracing-subscriber = { version = "0.3", features = ["json", "env-filter"] }
    async-trait = "0.1"   # For implementing adapter traits

Environment Variables

Configure service endpoints via environment variables: ...
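Reading those endpoints with sensible localhost fallbacks follows the standard `std::env` pattern (the variable names below match those used in the Minimal RAG example):

```rust
use std::env;

/// Resolve a service URL from the environment, falling back to a default
/// when the variable is unset.
fn endpoint(var: &str, default: &str) -> String {
    env::var(var).unwrap_or_else(|_| default.to_string())
}

fn main() {
    let embedding_url = endpoint("EMBEDDING_URL", "http://localhost:8080");
    let qdrant_url = endpoint("QDRANT_URL", "http://localhost:6333");
    let llm_url = endpoint("LLM_URL", "http://localhost:8000/v1");
    println!("{embedding_url}\n{qdrant_url}\n{llm_url}");
}
```

Keeping the defaults aligned with the services’ default ports means a fresh local deployment works with no environment configuration at all.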

RAG Pipeline

The RAGWorkflow engine executes a deterministic six-step pipeline every time you call .run(). Each step is handled by a pluggable component.

Pipeline Overview

    Query → [Embed] → [Search] → [Format] → [Build Prompt] → [LLM] → [Parse] → Result<T>

Step-by-Step Breakdown

Step 1 — Embed the Query

    let query_vector = embedding_service.embed_single(query).await?;

The user’s question is converted into a dense vector using the configured EmbeddingService. This vector represents the semantic meaning of the query. ...
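End to end, the six steps compose like a function pipeline. The toy sketch below uses stub stages in place of real services (every name here is illustrative, not framework API) purely to show how data flows from one step to the next:

```rust
/// Toy end-to-end pipeline. Each numbered step stands in for one
/// pluggable component: embedder, vector store, evidence formatter,
/// prompt builder, LLM, and result parser.
fn run_pipeline(query: &str) -> String {
    let vector = vec![query.len() as f32];                      // 1. Embed
    let hits = if vector[0] > 0.0 {                             // 2. Search (stubbed)
        vec!["MagicAF is a Rust AI toolkit."]
    } else {
        vec![]
    };
    let evidence = hits.join("\n");                             // 3. Format
    let prompt = format!("Evidence:\n{evidence}\nQ: {query}");  // 4. Build prompt
    let raw = format!("ANSWER given: {prompt}");                // 5. LLM (stubbed)
    raw                                                         // 6. Parse (identity)
}

fn main() {
    println!("{}", run_pipeline("What is MagicAF?"));
}
```

The determinism claimed above comes from this fixed ordering: every call to .run() traverses the same six stages, and only the component behind each stage varies.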
