01.03 · Prompt Engineering — Deep Dive

Level: Intermediate
Pre-reading: 01 · AI & LLM Foundations · 01.01 · How LLMs Work


Prompt Structure

Every LLM interaction has three roles:

Role Purpose Example
System Defines agent persona, constraints, output format "You are a senior Spring Boot developer. Return only valid Java code."
User The actual request or input "Implement a REST endpoint for creating an order."
Assistant The model's response (or few-shot examples) Previous turn's output, provided to establish pattern

Core Techniques

Zero-Shot Prompting

Ask the model directly without examples. Works for common tasks where the model has strong priors.

Summarise this JIRA ticket in one sentence and extract the acceptance criteria as a JSON array.

Ticket: [ticket text here]

Few-Shot Prompting

Provide 2–3 input/output examples before asking for the real task. Dramatically improves consistency.

Example 1:
Input: "Login button not responding on mobile"
Output: { "type": "bug", "component": "auth-service", "severity": "high" }

Example 2:
Input: "Add support for OAuth2 login"
Output: { "type": "feature", "component": "auth-service", "severity": "medium" }

Now classify:
Input: "Checkout throws 500 when cart is empty"

Chain-of-Thought (CoT)

Ask the model to reason step by step before giving the final answer. Essential for complex code analysis.

Think step by step:
1. What exception is being thrown?
2. What code path leads to it?
3. What is the root cause?
4. What is the minimal fix?

Stack trace: [...]

ReAct Prompting

Combines Reasoning and Acting. The model alternates between thinking and tool use. The foundation of agents.

Thought: I need to find which service handles the payment flow.
Action: search_codebase("PaymentService")
Observation: Found PaymentService.java in billing-service/
Thought: I should read this file to understand the bug.
Action: read_file("billing-service/src/PaymentService.java")
...
Final Answer: The bug is on line 42 — null check missing before .getAmount()

System Prompt Design for Dev Agents

A well-designed system prompt for a JIRA→PR agent:

You are an expert Java/Spring Boot developer working on a microservices platform.

Your capabilities:
- Read JIRA tickets and extract requirements
- Search and read source code files
- Make targeted, minimal code changes
- Write JUnit 5 test cases
- Create descriptive Git commit messages

Your constraints:
- Never modify files outside the identified service boundary
- Always write at least one unit test for every change
- Output code changes as unified diffs, not full file rewrites
- If unsure about intent, ask a clarifying question rather than guessing
- Never hardcode credentials or secrets

Output format for code changes:
{
  "service": "order-service",
  "files_changed": [...],
  "diff": "...",
  "tests_added": [...],
  "pr_description": "..."
}

System Prompt Injection Risk

If any user-controlled input (JIRA ticket content, code comments) reaches the system prompt, it can override your instructions. Always separate trusted system instructions from untrusted user data. See 08.01 · Prompt Injection.


Structured Output

Always request structured output for agentic pipelines. Parsing free text is fragile.

Method When to Use
JSON mode (OpenAI) Force valid JSON output, no schema validation
Structured outputs (OpenAI) Force output conforming to a JSON schema — best for production
Tool call response Model returns a function call structure instead of text
XML tags (Claude) Wrap sections in <code>, <analysis>, <summary> for reliable parsing

What is the difference between few-shot and fine-tuning?

Few-shot puts examples in the prompt at inference time — fast to iterate, no training cost, but consumes context window tokens. Fine-tuning bakes examples into model weights — no context overhead, consistent style, but requires data collection and retraining. For most dev automation tasks, few-shot is sufficient and preferable.

How do you prevent an agent from going off-task?

Use a tight system prompt with explicit constraints, structured output formats, and output validation before any action is taken. LangGraph allows you to add conditional edges that reject outputs not matching a schema and re-prompt the model. See 04 · LangGraph.