09 · Architecture Patterns Reference

Quick reference for the architectural patterns that underpin production AI systems.

System Architecture Patterns

Pattern	Description	When to Use
Agentic Loop	LLM + tools in a perception-plan-act cycle	Any multi-step autonomous task
RAG Pipeline	Retrieval-augmented generation with vector search	Knowledge that changes frequently
Human-in-the-Loop (HITL)	Interrupt workflow for human approval	Irreversible actions
Plan-and-Execute	Separate planning from execution agents	Long, structured tasks
Reflection	Agent critiques and revises its own output	High-stakes generation
Supervisor-Worker	Orchestrator delegates to specialist agents	Multi-domain pipelines
Event-Driven Agent	Agent triggered by webhooks (CI, JIRA)	Automation pipelines
Offline RAG Indexing	Background process keeps vector index fresh	Large, changing codebases
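
As a minimal sketch of the Agentic Loop row above, the plan-act-perceive cycle can be written in plain Python. The names `AgentState`, `scripted_policy`, and `run_agent` are hypothetical, and the policy is a scripted stand-in for a real LLM call, so this only illustrates the control flow under those assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

# A tool takes a string argument and returns a string observation.
Tool = Callable[[str], str]


@dataclass
class AgentState:
    goal: str
    observations: List[str] = field(default_factory=list)
    answer: Optional[str] = None


def scripted_policy(state: AgentState) -> Tuple[str, str]:
    """Hypothetical stand-in for an LLM call: returns (action, argument)."""
    if not state.observations:
        return ("search", state.goal)
    return ("finish", f"Based on: {state.observations[-1]}")


def run_agent(
    goal: str,
    policy: Callable[[AgentState], Tuple[str, str]],
    tools: Dict[str, Tool],
    max_steps: int = 5,
) -> AgentState:
    state = AgentState(goal=goal)
    for _ in range(max_steps):  # a step budget keeps the loop from running forever
        action, arg = policy(state)  # plan: choose the next action
        if action == "finish":
            state.answer = arg
            break
        observation = tools[action](arg)  # act: call the chosen tool
        state.observations.append(observation)  # perceive: record the result
    return state


# Illustrative tool that pretends to search a codebase.
tools = {"search": lambda query: f"3 files mention '{query}'"}
result = run_agent("retry logic", scripted_policy, tools)
```

The `max_steps` cap is a design choice worth keeping in any real loop: without it, a policy that never emits `finish` would consume tokens indefinitely.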

Component Selection Matrix

Need	Recommended Component
LLM for code generation	Claude 3.5 Sonnet or GPT-4o
LLM for fast/cheap tasks	GPT-4o-mini or Claude Haiku
Self-hosted LLM	Llama 3.3 70B via Ollama or vLLM
Orchestration / graph	LangGraph
Java-native AI integration	Spring AI
Vector DB (simple)	pgvector (Postgres extension)
Vector DB (scale)	Weaviate or Qdrant
Code embedding	nomic-embed-code
Text embedding	text-embedding-3-large (OpenAI)
Reranking	Cohere Rerank 3
JIRA integration	JIRA MCP Server
GitHub integration	GitHub MCP Server
Tracing/observability	LangSmith or Langfuse
Guardrails	Guardrails AI
PII detection	Microsoft Presidio
Secret detection	gitleaks or TruffleHog

End-to-End Architecture: JIRA → PR

graph TD
    A[JIRA Webhook] --> B[Agent Service · FastAPI]
    B --> C[LangGraph Workflow]

    subgraph Orchestration
        C --> D[Ticket Reader Node]
        D --> E[Service Identifier Node]
        E --> F[Code Retrieval Node · RAG]
        F --> G[Analysis Node · ReAct]
        G --> H[Code Generator Node]
        H --> I[Test Writer Node]
        I --> J[Validator Node · build + lint]
        J --> K[INTERRUPT: Human Review]
        K --> L[PR Creator Node]
    end

    subgraph Tools
        M[JIRA MCP Server]
        N[GitHub MCP Server]
        O[Vector DB · pgvector]
        P[Build Sandbox · Docker]
    end

    D --> M
    F --> O
    J --> P
    L --> N
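
The node sequence and the human-review interrupt in the diagram can be sketched without any framework. This is plain Python, not the LangGraph API: the node bodies are placeholders that only record their names, and `PIPELINE`, `INTERRUPT_BEFORE`, `run`, and the ticket ID `PROJ-123` are hypothetical names chosen for illustration.

```python
from typing import Any, Callable, Dict, List, Tuple

State = Dict[str, Any]
Node = Callable[[State], State]


def step(name: str) -> Node:
    """Build a placeholder node that only records its name in the trace."""
    def node(state: State) -> State:
        return {**state, "trace": state["trace"] + [name]}
    return node


# Same order as the Orchestration subgraph in the diagram.
PIPELINE: List[Tuple[str, Node]] = [(n, step(n)) for n in [
    "ticket_reader", "service_identifier", "code_retrieval", "analysis",
    "code_generator", "test_writer", "validator", "pr_creator",
]]
INTERRUPT_BEFORE = "pr_creator"  # human review gates the irreversible step


def run(state: State, start: int = 0) -> Tuple[State, int]:
    """Run nodes from `start`; pause before INTERRUPT_BEFORE until approved."""
    for i in range(start, len(PIPELINE)):
        name, node = PIPELINE[i]
        if name == INTERRUPT_BEFORE and not state.get("approved"):
            return state, i  # paused: the caller resumes from index i
        state = node(state)
    return state, len(PIPELINE)


# First pass stops before the PR is created.
state, paused_at = run({"ticket_id": "PROJ-123", "trace": []})
# A reviewer approves, then the run resumes at the paused node.
state, done_at = run({**state, "approved": True}, start=paused_at)
```

In a production setup the paused state would be persisted (the data flow diagram stores checkpoints in PostgreSQL) so that the review can take hours without holding a process open.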

Data Flow Diagram

graph LR
    subgraph External
        A[JIRA API]
        B[GitHub API]
        C[LLM API · Anthropic]
    end

    subgraph Agent Platform
        D[Input Guardrail]
        E[LangGraph State Machine]
        F[Output Guardrail]
        G[PostgreSQL · state + checkpoints]
        H[pgvector · code embeddings]
    end

    A --> D --> E
    E --> C
    C --> E
    E --> H
    E --> G
    E --> F --> B
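
The two guardrail stages in this flow can be illustrated with a rough sketch. The regular expressions below are simplified assumptions for demonstration only; a real deployment would more likely delegate to the dedicated tools listed in the component matrix. The function names `input_guardrail` and `output_guardrail` are hypothetical.

```python
import re

# Illustrative patterns only, not a complete detector.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
SECRET_PATTERN = re.compile(r"AKIA[0-9A-Z]{16}")  # shape of an AWS access key ID


def input_guardrail(ticket_text: str) -> str:
    """Redact email addresses before the ticket text reaches the LLM."""
    return EMAIL_PATTERN.sub("[REDACTED_EMAIL]", ticket_text)


def output_guardrail(generated_code: str) -> str:
    """Refuse to forward generated code that appears to contain a secret."""
    if SECRET_PATTERN.search(generated_code):
        raise ValueError("possible secret in generated output")
    return generated_code


clean = input_guardrail("Reported by jane.doe@example.com: login fails")
safe = output_guardrail("def retry(n):\n    return n + 1\n")
```

The input stage redacts and continues, while the output stage raises and stops. That asymmetry is a deliberate choice in this sketch: a leaked secret in a pull request is harder to undo than a missing email address in a prompt.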

Capacity Planning

Component	Key Metric	Typical Values
LLM API	Tokens per run	20K–100K tokens per JIRA ticket
Vector DB	Documents indexed	100K–10M chunks for large codebases
RAG retrieval	Chunk size	500–2,000 tokens per retrieved chunk
Build validation	Docker execution time	1–5 minutes per service
Full run time	End-to-end latency	3–15 minutes per JIRA ticket
Cost per run	API costs	$0.05–$2.00 per ticket (model-dependent)
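
To relate the token and cost rows, a back-of-the-envelope estimate can be computed as below. The per-million-token prices, the 80K/20K input/output split, and the 500 tickets per month are illustrative assumptions, not figures from this reference; substitute the current rates for whichever model is in use.

```python
def cost_per_run(
    input_tokens: int,
    output_tokens: int,
    usd_per_m_input: float,
    usd_per_m_output: float,
) -> float:
    """Estimate the API cost of one run from token counts and per-million prices."""
    total = input_tokens * usd_per_m_input + output_tokens * usd_per_m_output
    return total / 1_000_000


# Assumed example: 100K tokens per ticket, split 80K input / 20K output,
# priced at an assumed $3 per million input and $15 per million output tokens.
per_ticket = cost_per_run(80_000, 20_000, usd_per_m_input=3.0, usd_per_m_output=15.0)

# Assumed volume of 500 tickets per month.
monthly = per_ticket * 500

print(f"per ticket: ${per_ticket:.2f}, per month: ${monthly:.2f}")
```

Under these assumptions a ticket at the top of the token range costs about $0.54, which sits inside the range given in the table. Output tokens dominate the bill despite being the smaller share, so capping generation length is usually the first lever to pull.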
