Building Blocks

At this stage, the goal is to connect components that already make sense in isolation.

You are not optimizing yet; you are making interactions predictable and testable.

The current codebase is intentionally small, which makes it a good learning lab: `src/api/server.py` exposes `/chat`, `src/core/engine.py` orchestrates the request, and `src/llm/router.py` currently routes any prompt containing the word `add` to the calculator tool with fixed arguments (`a=2`, `b=3`).

Integration map

flowchart LR
    subgraph Interface
      API[API Contract]
    end

    subgraph Logic
      ENG[Engine]
      RTR[Router]
      TOOL[Tool Adapters]
    end

    subgraph State
      MEM[Memory Layer]
      TRACE[Tracing Layer]
    end

    API --> ENG --> RTR --> TOOL
    ENG --> MEM
    ENG --> TRACE

    style API fill:#1976d2,color:#fff
    style ENG fill:#1976d2,color:#fff
    style RTR fill:#1976d2,color:#fff
    style TOOL fill:#ff9800,color:#fff
    style MEM fill:#ff9800,color:#fff
    style TRACE fill:#ff9800,color:#fff

Build checklist

| Block | Current status | Next implementation step |
| --- | --- | --- |
| Request contract | Present | Add stricter validation for tool-intent metadata |
| Routing policy | Basic keyword route | Add confidence scoring and fallback tiers |
| Tool adapter layer | Single tool | Add adapter interface for external service tools |
| Memory layer | In-memory list | Add persistence backend abstraction |
| Tracing layer | Print-based traces | Emit structured logs or OpenTelemetry spans |
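
As one illustration of the "Tracing layer" row, the move from print-based traces to structured logs could look roughly like the sketch below. It uses only the standard library; the function names `build_trace` and `emit_trace`, the logger name, and the record fields are assumptions for illustration, not the API of `src/observability/tracer.py`.

```python
import json
import logging
import time

logger = logging.getLogger("trace")  # hypothetical logger name


def build_trace(event: str, payload: dict) -> dict:
    """Assemble one trace record as a plain dict (field names are assumptions)."""
    return {"ts": time.time(), "event": event, "payload": payload}


def emit_trace(event: str, payload: dict) -> str:
    """Serialize the record as a single JSON line and pass it to the logging module."""
    line = json.dumps(build_trace(event, payload), sort_keys=True)
    logger.info(line)
    return line


if __name__ == "__main__":
    # Print bare JSON lines so a log collector can parse them directly.
    logging.basicConfig(level=logging.INFO, format="%(message)s")
    emit_trace("route_selected", {"tool": "add", "args": {"a": 2, "b": 3}})
```

Emitting one JSON object per line keeps the output readable by eye while letting a log collector filter on `event` or `payload` fields, which plain `print` output does not allow.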

Current implementation snapshot

| Step | Source | What happens today |
| --- | --- | --- |
| Receive prompt | `src/api/server.py` | FastAPI parses `ChatRequest(prompt: str)` and calls `handle_prompt` |
| Decide route | `src/llm/router.py` | Prompts containing `add` return `add, {'a': 2, 'b': 3}` |
| Execute tool | `src/tools/calculator.py` | The calculator adds two integers and returns the sum |
| Persist memory | `src/memory/store.py` | Prompt and response are appended to an in-memory list |
| Emit trace | `src/observability/tracer.py` | Trace data is printed with timestamp and payload |
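
The steps in the table above can be condensed into a self-contained stand-in. This is a sketch of the described behavior, not a copy of the project's modules: the names `route`, `add`, `trace`, and `MEMORY` are hypothetical, the FastAPI layer is omitted, and the handling of prompts that match no route is an assumption because the source does not describe it.

```python
from datetime import datetime, timezone
from typing import Optional

MEMORY: list = []  # stand-in for the in-memory list in src/memory/store.py


def route(prompt: str):
    """Keyword route: any prompt containing 'add' maps to the calculator with fixed arguments."""
    if "add" in prompt:
        return "add", {"a": 2, "b": 3}
    return None, {}


def add(a: int, b: int) -> int:
    """Calculator tool: add two integers."""
    return a + b


def trace(event: str, payload: dict) -> None:
    """Print-based trace with a timestamp and the payload."""
    print(datetime.now(timezone.utc).isoformat(), event, payload)


def handle_prompt(prompt: str) -> Optional[int]:
    """Route the prompt, run the tool, record the exchange, and return the result."""
    tool, args = route(prompt)
    trace("route", {"tool": tool, "args": args})
    # Assumption: prompts that match no route yield None.
    response = add(**args) if tool == "add" else None
    MEMORY.append({"prompt": prompt, "response": response})
    return response
```

Because the arguments are fixed, `handle_prompt("add 10 and 20")` returns 5 in this sketch just as `handle_prompt("add 2 and 3")` does.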

Example call flow in Python

from src.core.engine import handle_prompt

# The router matches the keyword "add" and passes fixed arguments to the calculator.
response = handle_prompt("add 2 and 3")
print(response)

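In the current implementation, that call returns 5 because the router recognizes the keyword `add` and supplies the calculator with the fixed arguments `a=2` and `b=3`. The numbers in the prompt are not parsed, so a prompt such as "add 10 and 20" would produce the same result.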

Suggested tests

| Test type | Purpose | Example assertion |
| --- | --- | --- |
| Contract test | Verify request schema behavior | Invalid payload returns validation error |
| Routing test | Verify correct tool mapping | Prompt with "add" selects calculator tool |
| Integration test | Verify full engine path | Prompt leads to tool output and memory write |
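
The routing and integration rows could be sketched as pytest-style test functions. To stay self-contained, the example tests small hypothetical stand-ins (`route`, `handle_prompt`, `memory`) rather than the real `src.llm.router` and `src.core.engine` modules, whose exact signatures are not shown here. The contract test is left out because it needs the FastAPI app; it would typically post an invalid payload through FastAPI's `TestClient` and assert a 422 response.

```python
memory = []  # stand-in for the project's in-memory store


def route(prompt):
    """Hypothetical stand-in for the keyword router described above."""
    return ("add", {"a": 2, "b": 3}) if "add" in prompt else (None, {})


def handle_prompt(prompt):
    """Hypothetical stand-in for the engine: route, run the tool, record the exchange."""
    tool, args = route(prompt)
    response = args["a"] + args["b"] if tool == "add" else None
    memory.append((prompt, response))
    return response


def test_routing_add_selects_calculator():
    # Routing test: a prompt containing "add" maps to the calculator tool.
    tool, args = route("please add these")
    assert tool == "add"
    assert args == {"a": 2, "b": 3}


def test_integration_output_and_memory_write():
    # Integration test: the full path yields the tool output and one memory entry.
    memory.clear()
    assert handle_prompt("add 2 and 3") == 5
    assert memory == [("add 2 and 3", 5)]
```

Saved as a file whose name starts with `test_`, these functions are collected and run by `pytest` without extra setup.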

When should we split engine responsibilities into multiple modules?

Split once responsibilities are independently testable and are frequently changed by different owners. Splitting earlier adds coordination overhead without a matching benefit.

What should be abstracted first for future growth?

Abstract memory and tool adapter boundaries first. Those interfaces change fastest as systems move from demo to production.
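
One way to draw those two boundaries is with `typing.Protocol`, so the engine depends on an interface rather than on a concrete list or a single tool. Everything named below (`MemoryStore`, `ToolAdapter`, `InMemoryStore`, `CalculatorAdapter`, `execute`) is hypothetical and only illustrates the shape such interfaces might take.

```python
from typing import Any, Protocol


class MemoryStore(Protocol):
    """Boundary for persistence backends (in-memory, file, database, ...)."""

    def append(self, prompt: str, response: Any) -> None: ...
    def history(self) -> list: ...


class ToolAdapter(Protocol):
    """Boundary for tools, whether local functions or external services."""

    name: str

    def run(self, **kwargs: Any) -> Any: ...


class InMemoryStore:
    """Mirrors today's behavior: exchanges are kept in a Python list."""

    def __init__(self) -> None:
        self._items: list = []

    def append(self, prompt: str, response: Any) -> None:
        self._items.append((prompt, response))

    def history(self) -> list:
        return list(self._items)  # return a copy so callers cannot mutate the store


class CalculatorAdapter:
    """Wraps the add operation behind the adapter interface."""

    name = "add"

    def run(self, **kwargs: Any) -> Any:
        return kwargs["a"] + kwargs["b"]


def execute(tool: ToolAdapter, store: MemoryStore, prompt: str, args: dict) -> Any:
    """Engine-side helper that only knows the two interfaces."""
    result = tool.run(**args)
    store.append(prompt, result)
    return result
```

With this shape, swapping in a database-backed store or an adapter for an external service would not require changes to `execute`.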

What is the biggest limitation in the current intermediate layer?

The router is still hard-coded and only understands one prompt pattern. That is fine for learning, but it needs policy, schemas, and evaluation before broad use.
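
As a rough sketch of what a first routing policy could look like, the example below attaches a confidence score to each decision and falls back through tiers. The scores, the threshold, the tier names `clarify` and `fallback`, and the argument parsing are all assumptions chosen for illustration; none of them come from the current `src/llm/router.py`.

```python
import re
from dataclasses import dataclass, field

EXECUTE_THRESHOLD = 0.8  # assumed cut-off for running a tool without asking


@dataclass
class RouteDecision:
    tool: str  # "add", "clarify", or "fallback" (tier names are assumptions)
    confidence: float
    args: dict = field(default_factory=dict)


def route(prompt: str) -> RouteDecision:
    """Score the prompt with simple heuristics and pick a tier."""
    # Whole-word match, so "address" no longer triggers the calculator.
    has_keyword = re.search(r"\badd\b", prompt, re.IGNORECASE) is not None
    numbers = [int(n) for n in re.findall(r"-?\d+", prompt)]

    if has_keyword and len(numbers) == 2:
        confidence = 0.9  # keyword plus exactly two operands
    elif has_keyword:
        confidence = 0.4  # keyword only: intent is likely but arguments are unclear
    else:
        confidence = 0.0

    # Tier 1: confident enough to call the tool with parsed arguments.
    if confidence >= EXECUTE_THRESHOLD and len(numbers) == 2:
        return RouteDecision("add", confidence, {"a": numbers[0], "b": numbers[1]})
    # Tier 2: some signal, so ask the user to clarify instead of guessing.
    if confidence > 0.0:
        return RouteDecision("clarify", confidence)
    # Tier 3: no signal at all.
    return RouteDecision("fallback", confidence)
```

Returning a structured decision instead of a bare tuple also gives evaluation something to measure, for example how often the `clarify` tier is reached on a set of sample prompts.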
