Architecture
How Récif components fit together — two-layer architecture, Kubernetes-native, evaluation-driven lifecycle.
High-Level Overview
Récif separates governance (platform) from execution (agents). Agents are autonomous — they run independently with their own LLM, tools, and channels. Récif is the control tower that optionally watches, evaluates, and governs them.
Note
Key insight: Corail agents work standalone (direct REST/Slack/CLI access) or through the Récif platform (dashboard, governance, eval). Same agent, two access paths, no code changes. See "Introduction: Two Ways to Use Récif" for details.
Component Breakdown
| Component | Tech | Port | Role |
|---|---|---|---|
| Récif API | Go 1.22 | 8080 | 65+ REST endpoints — agent CRUD, releases, eval, governance, feedback |
| Operator | Go, kubebuilder | 8081 | Watches Agent CRDs → reconciles Deployment + Service + ConfigMap |
| Dashboard | Next.js 14 | 3000 | Chat, agent management, governance scorecards, AI Radar |
| Corail | Python 3.13 | 8000/8001 | Agent runtime — LLM, tools, memory, SSE streaming, evaluation |
| PostgreSQL | pgvector | 5432 | Agent metadata, knowledge base vectors |
| MLflow | Python | 5000 | Trace storage, evaluation metrics, experiment tracking |
| recif-state | Git repo | — | Immutable release artifacts (YAML), GitOps |
| Istio | Envoy | — | mTLS, traffic splitting for canary, observability |
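To make the Operator row concrete, here is a minimal sketch of what "reconciles Deployment + Service + ConfigMap" could mean for one Agent resource. It is written in Python for readability, although the real operator is Go/kubebuilder. The function name `desired_resources`, the `app` label, the ConfigMap keys, and the placeholder image are assumptions for illustration, not Récif's actual schema. Port 8000 is taken from the Corail row above.

```python
from typing import Any, Dict, List


def desired_resources(agent: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the ConfigMap, Deployment and Service implied by an Agent resource.

    Illustrative only: labels, ConfigMap keys and the image name are
    hypothetical and do not come from the Récif source.
    """
    meta, spec = agent["metadata"], agent["spec"]
    name, namespace = meta["name"], meta["namespace"]
    labels = {"app": name}  # hypothetical label key

    config_map = {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": f"{name}-config", "namespace": namespace},
        "data": {
            "modelId": spec["modelId"],
            "systemPrompt": spec["systemPrompt"],
        },
    }
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "replicas": spec.get("replicas", 1),
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": "corail",
                        "image": "corail:latest",  # placeholder image
                        "ports": [{"containerPort": 8000}],
                        "envFrom": [
                            {"configMapRef": {"name": f"{name}-config"}}
                        ],
                    }]
                },
            },
        },
    }
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "selector": labels,
            "ports": [{"port": 8000, "targetPort": 8000}],
        },
    }
    return [config_map, deployment, service]
```

A real reconciler would also compare these desired objects with what already exists in the cluster and apply only the difference. That step is omitted here.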
Request Flow
What happens when a user sends a message:
Deployment Flow
What happens when you deploy an agent:
Evaluation-Driven Lifecycle
The core differentiator: no agent ships without proof.
14 MLflow Scorers: Safety, Relevance, Correctness, Completeness, Fluency, Equivalence, Summarization, Guidelines, ExpectationsGuidelines, RetrievalRelevance, RetrievalGroundedness, RetrievalSufficiency, ToolCallCorrectness, ToolCallEfficiency.
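The "no agent ships without proof" rule can be pictured as a gate over scorer results. The sketch below is an assumption-heavy illustration: it treats each scorer result as a number in [0, 1], uses made-up thresholds, and introduces an `approved` status alongside the `rejected` and `pending_eval` statuses that appear in the release artifacts. The function name `gate_release` is hypothetical.

```python
from typing import Dict


def gate_release(scores: Dict[str, float], thresholds: Dict[str, float]) -> str:
    """Decide a release status from scorer results.

    Assumptions (not from the Récif source): scores are normalized to
    [0, 1], and a release passes only if every required scorer meets
    its threshold.
    """
    # A required scorer has not reported yet: evaluation is incomplete.
    if any(name not in scores for name in thresholds):
        return "pending_eval"
    failed = [name for name, minimum in thresholds.items()
              if scores[name] < minimum]
    return "rejected" if failed else "approved"


# Hypothetical thresholds for two of the 14 scorers listed above.
thresholds = {"Safety": 0.9, "Relevance": 0.7}
status = gate_release({"Safety": 0.95, "Relevance": 0.8}, thresholds)
```

With these inputs both scorers clear their thresholds, so `status` is `"approved"`. Dropping `Relevance` from the scores would leave the release in `pending_eval`.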
Namespace Layout
| Namespace | What Runs | Purpose |
|---|---|---|
| recif-system | API, Operator, Dashboard | Platform control plane |
| team-default | Agent pods (one per agent) | Default team namespace |
| team-{name} | Agent pods for team | Multi-tenant isolation |
| mlflow-system | MLflow tracking server | Evaluation & traces |
| istio-system | Istiod, gateways | Service mesh (optional) |
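The `team-{name}` convention in the table above can be expressed as a small helper. This is a sketch under stated assumptions: the function name `team_namespace` and the slug rules are hypothetical. The only hard constraint applied is Kubernetes' own rule that namespace names are DNS labels (lowercase alphanumerics and `-`, at most 63 characters).

```python
import re
from typing import Optional


def team_namespace(team: Optional[str] = None) -> str:
    """Map a team name to its namespace, falling back to team-default."""
    if not team:
        return "team-default"
    # Lowercase and collapse anything that is not a-z or 0-9 into '-'.
    slug = re.sub(r"[^a-z0-9]+", "-", team.lower()).strip("-")
    namespace = f"team-{slug}"
    if not slug or len(namespace) > 63:
        raise ValueError(f"cannot derive a valid namespace from {team!r}")
    return namespace
```

For example, `team_namespace("Data Science")` yields `team-data-science`, and calling it with no argument yields `team-default`.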
GitOps: Release Artifacts
Every agent version is an immutable YAML committed to Git:
```
recif-state/
├── agents/
│   ├── code-reviewer/
│   │   ├── current.yaml        ← active version
│   │   └── releases/
│   │       ├── v1.yaml         ← immutable
│   │       ├── v2.yaml
│   │       └── v3.yaml
│   └── hr-assistant/
│       ├── current.yaml
│       └── releases/
│           ├── v1.yaml
│           ├── v2.yaml         ← status: rejected
│           └── v3.yaml         ← status: pending_eval
```

Each release artifact contains: model config, system prompt, tools, skills, governance rules, checksum, changelog. Full audit trail via Git history.
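One way a checksum can back the immutability claim is to hash a canonical serialization of every field except the checksum itself. The sketch below assumes SHA-256 over sorted-key JSON. The source does not say which algorithm or serialization Récif uses, and the field names in the example artifact are illustrative.

```python
import hashlib
import json
from typing import Any, Dict


def artifact_checksum(artifact: Dict[str, Any]) -> str:
    """Hash every field except 'checksum' using a canonical JSON form.

    Assumption: SHA-256 over sorted-key JSON. The real algorithm is not
    documented in this section.
    """
    payload = {k: v for k, v in artifact.items() if k != "checksum"}
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify_artifact(artifact: Dict[str, Any]) -> bool:
    """Return True if the stored checksum matches the artifact content."""
    return artifact.get("checksum") == artifact_checksum(artifact)


# Hypothetical artifact with a subset of the fields listed above.
release = {
    "systemPrompt": "You are an HR assistant helping employees...",
    "tools": ["web-search", "hr-knowledge-lookup"],
    "changelog": "Initial release",
}
release["checksum"] = artifact_checksum(release)
```

Any later edit to `release`, such as adding a tool, makes `verify_artifact(release)` return `False`, which is the property an immutable release file needs.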
Example Agent CRD
```yaml
apiVersion: agents.recif.dev/v1
kind: Agent
metadata:
  name: hr-assistant
  namespace: team-default
spec:
  name: "HR Assistant"
  framework: corail
  strategy: agent-react
  channel: rest
  modelType: vertex-ai
  modelId: gemini-2.5-flash
  gcpServiceAccount: "hr-agent@my-project.iam.gserviceaccount.com"
  systemPrompt: "You are an HR assistant helping employees..."
  tools:
    - web-search
    - hr-knowledge-lookup
  knowledgeBases:
    - hr-policies
  skills:
    - agui-render
  storage: postgresql
  replicas: 2
  evalSampleRate: 10
  judgeModel: "openai:/gpt-4o-mini"
```

Technology Stack
| Layer | Component | Technology | Why |
|---|---|---|---|
| Platform | API Server | Go 1.22 + chi | Performance, K8s native |
| | Dashboard | Next.js 14 + React | Modern frontend, SSR |
| | Operator | kubebuilder (Go) | Native K8s operator pattern |
| Runtime | Corail | Python 3.13 | AI/ML ecosystem compatibility |
| | Models | 7 providers | Registry pattern, lazy loading |
| | Channels | REST, Slack, CLI, WS | Multi-channel by design |
| Infra | Database | PostgreSQL + pgvector | Reliable, vector search |
| | Eval | MLflow GenAI | Industry standard, 14 scorers |
| | Mesh | Istio + Envoy | mTLS, canary traffic splitting |
| | Packaging | Helm 3 | Standard K8s packaging |
| | Ingestion | Marée (Python) | Docling, pluggable pipeline |
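The "registry pattern, lazy loading" entry for model providers can be illustrated with a small sketch. This is not Corail's implementation: the class `ProviderRegistry` and its methods are hypothetical, and the stub factory stands in for a real provider client. The idea is that providers register a factory up front and the client is only built the first time it is requested.

```python
from typing import Callable, Dict


class ProviderRegistry:
    """Map provider names to factories and build each client on first use."""

    def __init__(self) -> None:
        self._factories: Dict[str, Callable[[], object]] = {}
        self._instances: Dict[str, object] = {}

    def register(self, name: str, factory: Callable[[], object]) -> None:
        # Registration is cheap: nothing is imported or constructed yet.
        self._factories[name] = factory

    def get(self, name: str) -> object:
        if name not in self._factories:
            raise KeyError(f"unknown model provider: {name}")
        # Lazy loading: construct once, then reuse the cached instance.
        if name not in self._instances:
            self._instances[name] = self._factories[name]()
        return self._instances[name]


class StubVertexClient:
    """Stand-in for a real provider client (hypothetical)."""


registry = ProviderRegistry()
registry.register("vertex-ai", StubVertexClient)
client = registry.get("vertex-ai")  # built here, on first access
```

In a real runtime the factory would typically import the provider's SDK inside the function body, so an agent that only uses one provider never pays the import cost of the others.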