MagicAF requires three local services and a Rust toolchain. All services communicate over HTTP — no cloud accounts, no vendor SDKs.

Rust Toolchain

MagicAF requires Rust 1.75 or later. Install it via rustup:

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

Verify your installation:

rustc --version   # 1.75.0 or later
cargo --version

Local Services

MagicAF connects to three local HTTP services. You can use any compatible software.

Service            Default Port   Purpose                             Recommended Software
Embedding server   8080           Produce dense vector embeddings     llama.cpp (--embedding), text-embeddings-inference, vLLM
Vector database    6333           Store and search embeddings         Qdrant
LLM server         8000           Chat completion / text generation   vLLM, llama.cpp, TGI, Ollama

Quick Setup

The fastest way to get all services running (Qdrant in Docker, the embedding and LLM servers as native processes):

1. Qdrant — Vector database

docker run -d --name qdrant \
  -p 6333:6333 -p 6334:6334 \
  qdrant/qdrant:latest
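
Once Qdrant is up, you can sanity-check it by creating a collection over its REST API. This is an illustrative sketch: the collection name magicaf is a placeholder, and the vector size of 1024 assumes the bge-large-en-v1.5 embedding model used in step 2 (other models produce different dimensions).

# Create a test collection (name and vector size are placeholders; adjust for your model)
curl -X PUT http://localhost:6333/collections/magicaf \
  -H 'Content-Type: application/json' \
  -d '{"vectors": {"size": 1024, "distance": "Cosine"}}'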

2. Embedding Server — llama.cpp with an embedding model

# Download a quantized embedding model
wget https://huggingface.co/second-state/BGE-large-EN-v1.5-GGUF/resolve/main/bge-large-en-v1.5-Q4_K_M.gguf

# Start the embedding server
./llama-server \
  -m bge-large-en-v1.5-Q4_K_M.gguf \
  --embedding \
  --port 8080
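
To confirm the server is actually producing embeddings, send a test request. Recent llama.cpp builds expose an OpenAI-compatible /v1/embeddings route when started in embedding mode; this sketch assumes such a build.

# Request an embedding for a test string (assumes an OpenAI-compatible /v1/embeddings route)
curl http://localhost:8080/v1/embeddings \
  -H 'Content-Type: application/json' \
  -d '{"input": "hello world"}'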

3. LLM Server — vLLM with an instruction-tuned model

python -m vllm.entrypoints.openai.api_server \
  --model mistralai/Mistral-7B-Instruct-v0.2 \
  --port 8000

Tip: Any server that exposes an OpenAI-compatible /v1/chat/completions endpoint will work — Ollama, LocalAI, text-generation-inference, or a custom FastAPI server.
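
For example, a minimal chat-completion request against the vLLM server above looks like this (the model name must match whatever model the server loaded):

# Send a one-shot chat completion request
curl http://localhost:8000/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{
        "model": "mistralai/Mistral-7B-Instruct-v0.2",
        "messages": [{"role": "user", "content": "Say hello"}]
      }'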

Verify Services

Check that all services are responding:

# Qdrant
curl http://localhost:6333/healthz

# Embedding server
curl http://localhost:8080/health

# LLM server
curl http://localhost:8000/v1/models

All three should return successful HTTP responses.
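
As a convenience, a small shell loop can run the same three checks in one pass; this is just a sketch of the requests above:

# Check all three services and report pass/fail per endpoint
for url in http://localhost:6333/healthz \
           http://localhost:8080/health \
           http://localhost:8000/v1/models; do
  curl -fsS "$url" > /dev/null && echo "OK   $url" || echo "FAIL $url"
done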


Next: Installation →