04.01 · LangGraph Deep Dive

Level: Advanced
Pre-reading: 04 · LangGraph & LangChain · 02 · Agentic Patterns


Graph Structure

A LangGraph graph is a StateGraph: a directed graph whose nodes are functions that take the current state and return a partial state update, and whose edges (which may be conditional) decide what runs next. The graph compiles to a runnable that manages execution order and state merging.

graph TD
    START --> read_ticket
    read_ticket --> identify_service
    identify_service --> retrieve_code
    retrieve_code --> analyze_code
    analyze_code --> check_confidence{High confidence?}
    check_confidence -->|Yes| generate_solution
    check_confidence -->|No| retrieve_more_code
    retrieve_more_code --> analyze_code
    generate_solution --> write_tests
    write_tests --> human_review[INTERRUPT: human_review]
    human_review -->|approved| create_pr
    human_review -->|rejected| generate_solution
    create_pr --> END

Node Design Patterns

Tool Node

A node that executes registered tools based on the last LLM message:

# Tools are bound to the LLM model
tools = [read_file, search_codebase, run_tests, create_pr]
llm_with_tools = llm.bind_tools(tools)

def agent_node(state: AgentState) -> dict:
    response = llm_with_tools.invoke(state["messages"])
    return {"messages": [response]}

RAG Node

A node that retrieves context and injects it:

def retrieve_code_context(state: AgentState) -> dict:
    query = f"{state['jira_ticket']['summary']} {state['service_name']}"
    docs = vector_store.similarity_search(query, k=5, filter={"service": state["service_name"]})
    context = "\n\n".join([d.page_content for d in docs])
    return {"retrieved_context": context}

Conditional Edge Function

Decides routing based on state:

def should_continue(state: AgentState) -> str:
    last_message = state["messages"][-1]
    if last_message.tool_calls:
        return "tools"          # route to tool execution node
    if state.get("confidence", 0) < 0.7:
        return "retrieve_more"  # route back to retrieval
    return "generate"           # route to solution generation

Interrupt and Human-in-the-Loop

LangGraph's interrupt() function pauses execution at the current node, persists state through the configured checkpointer, and waits for a human response delivered via Command(resume=...). This is how you implement approval gates.

sequenceDiagram
    participant Graph
    participant Checkpointer
    participant Developer

    Graph->>Checkpointer: Persist state before interrupt
    Graph->>Developer: Notify: "Review this diff and approve/reject"
    Developer->>Graph: .invoke(Command(resume="approved"))
    Graph->>Graph: Continue from persisted state
    Graph->>GitHub: Create PR

Checkpointing Backends

For production, use PostgreSQL or Redis as the checkpointer backend (not in-memory). This means agents can be interrupted and resumed across server restarts — critical for long-running JIRA ticket implementations.
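A configuration sketch of the PostgreSQL backend, assuming the langgraph-checkpoint-postgres package, a running Postgres instance, and a builder / initial state defined elsewhere (the connection string is a placeholder):

```python
from langgraph.checkpoint.postgres import PostgresSaver

DB_URI = "postgresql://user:password@localhost:5432/agent_checkpoints"

# from_conn_string returns a context manager that owns the connection.
with PostgresSaver.from_conn_string(DB_URI) as checkpointer:
    checkpointer.setup()  # creates the checkpoint tables on first run
    graph = builder.compile(checkpointer=checkpointer)
    graph.invoke(initial_state, {"configurable": {"thread_id": ticket_id}})
```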


Parallel Subgraphs

LangGraph's Send API enables fan-out (map-reduce) parallelism:

graph LR
    A[Identify affected files] --> B[Send to parallel processors]
    B --> C[Analyse file 1]
    B --> D[Analyse file 2]
    B --> E[Analyse file 3]
    C --> F[Aggregate results]
    D --> F
    E --> F

Use this to analyse multiple files concurrently, or to run the test writer and documentation updater in parallel with the code generator.


Error Handling Patterns

| Error | LangGraph pattern |
| --- | --- |
| Tool call fails | Conditional edge routes to a retry node (max 3 attempts) |
| LLM returns invalid structured output | Validation node; re-prompt with the error message |
| Max iterations reached | Fallback edge to an escalation node |
| External API timeout | retry_count flag in state; exponential backoff in the tool implementation |

graph LR
    A[Tool Call] --> B{Success?}
    B -->|Yes| C[Continue]
    B -->|No, retry < 3| D[Retry with error context]
    D --> A
    B -->|No, retry = 3| E[Escalate to human]
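The retry loop above can be sketched as a conditional edge function plus a retry node that maintains the retry_count flag (plain Python; node names and state fields are illustrative, not LangGraph APIs):

```python
MAX_RETRIES = 3

def route_after_tool(state: dict) -> str:
    """Conditional edge evaluated after the tool node."""
    if not state.get("tool_error"):
        return "continue"            # success: proceed down the happy path
    if state.get("retry_count", 0) < MAX_RETRIES:
        return "retry"               # re-run the tool with error context
    return "escalate"                # budget exhausted: hand off to a human

def retry_node(state: dict) -> dict:
    # Bump the counter and surface the error so the LLM sees what
    # went wrong on the next attempt.
    return {
        "retry_count": state.get("retry_count", 0) + 1,
        "messages": [f"Previous attempt failed: {state['tool_error']}"],
    }
```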

The ToolNode Pattern

LangGraph provides a pre-built ToolNode that executes all tool calls in the last AI message automatically:

graph LR
    A[LLM Agent Node] -->|has tool calls| B[ToolNode: execute all tool calls]
    B --> A
    A -->|no tool calls| C[END or next node]

Paired with the prebuilt tools_condition router, this eliminates the need to write tool execution and routing logic manually in most single-agent scenarios.


What is the difference between LangGraph and CrewAI?

LangGraph is a low-level graph execution framework — you define the graph explicitly. CrewAI is a higher-level framework that hides the graph and defines agents by role (e.g., "Researcher", "Writer"). LangGraph gives more control and is better for production systems with complex state; CrewAI is faster to prototype with.

How do you persist a LangGraph agent state across HTTP requests?

Use a PostgresSaver or RedisSaver checkpointer and a thread_id as the conversation identifier. Each HTTP request passes the thread_id to .invoke() — LangGraph loads the saved state and continues from where it left off. This is how you build a stateful JIRA ticket agent that can be interrupted and resumed.

How do you test a LangGraph agent without calling the LLM?

Mock the LLM node by injecting a deterministic function in its place. Pass a pre-populated state with known data directly to downstream nodes. LangGraph lets you invoke individual nodes in isolation — use this for unit tests of your state transformation logic.
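A sketch of both techniques, using the RAG node from earlier: the external dependency is replaced by a test double (FakeVectorStore and FakeDoc are stand-ins, not LangChain classes), and the node function is called directly with a pre-populated state:

```python
class FakeDoc:
    def __init__(self, page_content):
        self.page_content = page_content

class FakeVectorStore:
    """Test double for the real vector store -- returns canned documents."""
    def similarity_search(self, query, k=5, filter=None):
        return [FakeDoc("def charge(amount): ...")]

vector_store = FakeVectorStore()

def retrieve_code_context(state: dict) -> dict:
    # Same node as in the RAG example above, now exercised in isolation.
    query = f"{state['jira_ticket']['summary']} {state['service_name']}"
    docs = vector_store.similarity_search(
        query, k=5, filter={"service": state["service_name"]})
    return {"retrieved_context": "\n\n".join(d.page_content for d in docs)}

# Unit test: call the node directly with a known state --
# no graph, no LLM, no real vector store.
state = {"jira_ticket": {"summary": "Fix rounding bug"},
         "service_name": "payments"}
update = retrieve_code_context(state)
```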