06.03 · CI/CD Integration
Level: Intermediate
Pre-reading: 06 · AI Tool Ecosystem, 05 · MCP Servers
AI in the CI/CD Pipeline
CI/CD pipelines are where agent automation delivers the most value: failures arrive as structured reports, outputs are machine-readable, and the feedback loop is tight.
```mermaid
graph LR
    A[Developer pushes code] --> B[CI: Build + Test]
    B -->|Tests pass| C[Human review + merge]
    B -->|Tests fail| D[AI Agent triggered]
    D --> E[Analyse failure]
    E --> F{Fixable automatically?}
    F -->|Yes| G[Push fix commit]
    F -->|No| H[Post RCA comment on PR]
    G --> B
```
GitHub Actions Integration Pattern
An AI agent can be triggered as a GitHub Actions workflow step:
```yaml
# .github/workflows/ai-fix-tests.yml
name: AI Test Failure Analysis
on:
  workflow_run:
    workflows: ["CI"]
    types: [completed]
jobs:
  analyze:
    if: ${{ github.event.workflow_run.conclusion == 'failure' }}
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Download test artifacts
        uses: actions/download-artifact@v4
        with:
          name: test-results
          # Artifacts belong to the triggering CI run, so the run id
          # and a token are needed to download across workflow runs.
          run-id: ${{ github.event.workflow_run.id }}
          github-token: ${{ secrets.GITHUB_TOKEN }}
      - name: Run AI analysis agent
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          python scripts/analyze_failures.py \
            --report test-results/playwright-report.json \
            --pr ${{ github.event.workflow_run.pull_requests[0].number }}
```
Test Report Parsing
The agent's first step is reading a structured test report:
| Report Format | Framework | Key Fields |
|---|---|---|
| JUnit XML | Maven Surefire, Spring Boot tests | `testcase[failure]`, `classname`, `name`, `message` |
| Playwright JSON | Playwright | `suites[].specs[].tests[].results[].status`, `error.message`, `error.stack` |
| Playwright HTML | Playwright | Human-readable; needs HTML parsing |
| Allure JSON | Multi-framework | Rich metadata including steps and attachments |
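As a sketch of the parsing step a script like the workflow's `analyze_failures.py` would perform, the function below walks the Playwright JSON reporter's nested layout (`suites[].specs[].tests[].results[]`, as listed in the table) and collects only failing results. The exact field set varies by Playwright version, so treat the keys as assumptions to verify against a real report.

```python
def extract_failures(report: dict) -> list[dict]:
    """Collect failing test results from a Playwright-style JSON report.

    Walks suites -> specs -> tests -> results and keeps anything that is
    neither passed nor skipped, along with its error message and stack.
    """
    failures: list[dict] = []

    def walk(suite: dict, path: list[str]) -> None:
        title = suite.get("title", "")
        for spec in suite.get("specs", []):
            for test in spec.get("tests", []):
                for result in test.get("results", []):
                    if result.get("status") not in ("passed", "skipped"):
                        error = result.get("error", {})
                        failures.append({
                            "test": " > ".join([*path, title, spec.get("title", "")]),
                            "status": result.get("status"),
                            "message": error.get("message", ""),
                            "stack": error.get("stack", ""),
                        })
        for child in suite.get("suites", []):
            walk(child, [*path, title])

    for suite in report.get("suites", []):
        walk(suite, [])
    return failures
```

The output is a flat list of failure dicts, which is a compact, token-cheap payload to hand to the model in the next step.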
Structure Over Screenshots
Prefer JSON/XML test reports over HTML screenshots for agent input. Structured data is cheaper to process (fewer tokens) and more reliably parseable than HTML.
Playwright-Specific Integration
| Failure Type | Agent Response |
|---|---|
| Selector not found | Check if UI component was renamed/restructured; update selector or report UI change |
| API call returned 4xx/5xx | Check service logs, identify if it's a test data issue or a real regression |
| Timeout | Check if page load regression, network issue, or flaky animation causing delay |
| Assertion failure | Compare expected vs actual, check if spec changed or feature regressed |
| Network error | Check if the service is up, inspect environment config |
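The triage in the table above can be bootstrapped with a keyword heuristic before any model call is spent. The patterns below are illustrative assumptions, not canonical Playwright error strings; first match wins, which is why the more specific selector rule precedes the generic timeout rule (a Playwright selector timeout usually mentions "waiting for locator").

```python
import re

# Ordered heuristics: first match wins. Patterns are assumptions to tune
# against the error strings your framework actually emits.
FAILURE_RULES: list[tuple[str, str]] = [
    (r"strict mode violation|waiting for (locator|selector)", "selector-not-found"),
    (r"\b[45]\d{2}\b|status code", "api-error"),
    (r"timeout.*exceeded|timed out", "timeout"),
    (r"expect\(|assert", "assertion-failure"),
    (r"net::|econnrefused|enotfound", "network-error"),
]

def classify_failure(message: str) -> str:
    """Map an error message to one of the failure types in the table."""
    for pattern, label in FAILURE_RULES:
        if re.search(pattern, message, re.IGNORECASE):
            return label
    return "unknown"
```

Anything the heuristic cannot classify falls through to "unknown" and gets the full model-driven analysis, so a wrong guess here only costs a mislabel, not a missed failure.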
The Playwright MCP server provides `get_network_log` and `take_screenshot` tools to gather evidence for each failure type.
Scope Constraints for CI Agents
CI agents should never auto-merge
A CI agent can push a fix commit to the PR branch, but the PR merge must remain a human action. Configure branch protection rules to require at least one human reviewer approval before any merge, even for "AI-assisted" PRs.
| Action | Agent | Human |
|---|---|---|
| Analyse test failure | ✓ | |
| Post RCA comment on PR | ✓ | |
| Push fix commit to feature branch | ✓ | |
| Request PR review | ✓ | |
| Approve PR | | ✓ |
| Merge to main | | ✓ |
| Hotfix to production | | ✓ |
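One way to enforce the human-merge rule mechanically is GitHub's branch protection REST endpoint (`PUT /repos/{owner}/{repo}/branches/{branch}/protection`). A sketch, assuming illustrative settings: one required approval plus `require_last_push_approval`, so an AI fix commit cannot ride into main on an approval that predates it.

```python
import json
import urllib.request

def protection_payload(required_reviews: int = 1) -> dict:
    """Branch protection settings that keep the merge decision human."""
    return {
        # All four top-level keys are required by the endpoint; null disables.
        "required_status_checks": {"strict": True, "contexts": ["CI"]},
        "enforce_admins": True,
        "required_pull_request_reviews": {
            "required_approving_review_count": required_reviews,
            "dismiss_stale_reviews": True,
            # The most recent push (e.g. an AI fix commit) must be
            # approved by someone other than whoever pushed it.
            "require_last_push_approval": True,
        },
        "restrictions": None,
    }

def apply_protection(owner: str, repo: str, branch: str, token: str) -> None:
    """PUT the settings to the branch protection endpoint (network call)."""
    req = urllib.request.Request(
        f"https://api.github.com/repos/{owner}/{repo}/branches/{branch}/protection",
        data=json.dumps(protection_payload()).encode(),
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )
    urllib.request.urlopen(req)
```

With this in place the agent's push simply re-triggers CI and review; it has no path to main that bypasses a person.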
How do you prevent the AI agent from creating an infinite fix loop?
Track the number of AI-generated fix commits per PR. After 3 attempts without human approval, automatically comment "Unable to auto-fix — needs human investigation" and remove the agent from the loop. Store the attempt count in PR labels or a database keyed on the PR ID.
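A minimal sketch of that counter, assuming the count lives in hypothetical `ai-fix-attempt-N` PR labels; the threshold and label scheme are illustrative.

```python
MAX_AI_FIX_ATTEMPTS = 3
ATTEMPT_PREFIX = "ai-fix-attempt-"  # hypothetical label scheme

def record_attempt(labels: list[str]) -> tuple[str, list[str]]:
    """Decide whether the agent may retry, based on the PR's labels.

    Returns ("retry", updated_labels) with the attempt label bumped, or
    ("escalate", labels) once the attempt budget is exhausted.
    """
    attempts = max(
        (int(l[len(ATTEMPT_PREFIX):]) for l in labels if l.startswith(ATTEMPT_PREFIX)),
        default=0,
    )
    if attempts >= MAX_AI_FIX_ATTEMPTS:
        return "escalate", labels
    updated = [l for l in labels if not l.startswith(ATTEMPT_PREFIX)]
    updated.append(f"{ATTEMPT_PREFIX}{attempts + 1}")
    return "retry", updated
```

The CI job calls this before invoking the agent; on "escalate" it posts the "needs human investigation" comment instead of another fix attempt.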