# Interview Preparation: Observability Questions

## Q1: How would you monitor an AI system in production?

### Answer: Multi-Level Observability
**Level 1: Metrics (Quantifiable)**

- Latency: P50, P95, P99
- Cost: per request, per hour, per month
- Quality: parsing success rate
- Availability: error rate, fallback rate

**Level 2: Logging (Detailed)**

- Trace ID: correlate a request across services
- Tokens used: cost attribution
- Error details: type, cause, resolution
- User context: for debugging

**Level 3: Tracing (End-to-End)**

- Request flow visualization
- Bottleneck identification
- Service dependencies

**Level 4: Analytics (Patterns)**

- Cost trends
- Latency percentile trends
- Quality regression detection
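A minimal sketch of how Levels 1 and 2 might be wired up, assuming Micrometer for metrics and SLF4J's MDC for trace correlation; the metric names and the `recordCall` signature are illustrative choices, not part of the original answer:

```java
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;
import io.micrometer.core.instrument.simple.SimpleMeterRegistry;
import org.slf4j.MDC;

import java.time.Duration;
import java.util.UUID;

public class AiObservability {

    private final MeterRegistry registry = new SimpleMeterRegistry();

    // Level 1: latency histogram publishing the P50/P95/P99 named above.
    private final Timer latency = Timer.builder("ai.request.latency")
            .publishPercentiles(0.50, 0.95, 0.99)
            .register(registry);

    public void recordCall(Duration elapsed, boolean parsedOk, boolean fellBack) {
        latency.record(elapsed);
        // Quality: parsing success rate derived from ok/fail counters.
        registry.counter("ai.parse", "outcome", parsedOk ? "ok" : "fail").increment();
        // Availability: how often we fell back to a non-AI code path.
        if (fellBack) {
            registry.counter("ai.fallback").increment();
        }
    }

    // Level 2: put a trace ID into the logging context so every log line
    // written while handling this request can be correlated across services.
    public String startTrace() {
        String traceId = UUID.randomUUID().toString();
        MDC.put("traceId", traceId);
        return traceId;
    }
}
```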
## Q2: Critical alerts you'd set?

### Answer
| Metric | Threshold | Impact |
|---|---|---|
| P95 Latency | > 5 s | Users experience slow responses or timeouts |
| Error Rate | > 5% | System degraded |
| Daily Cost | > 1.5x normal | Cost overrun |
| Cache Hit Rate | < 30% | Wasting money on redundant LLM calls |
| Fallback Rate | > 10% | AI provider issues |
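The same thresholds, expressed as Prometheus-style alerting rules: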
```yaml
alerts:
  - alert: HighLatency
    condition: histogram_quantile(0.95, latency) > 5000ms
    duration: 5m
    action: page_oncall
  - alert: HighErrorRate
    condition: error_rate > 0.05
    duration: 5m
    action: page_oncall
  - alert: CostOverrun
    condition: daily_cost > budget * 1.5
    duration: 1h
    action: notify_team
```
## Q3: How to track quality?

### Answer
**Automated Metrics:**

- Response parsing success rate
- Schema validation success rate (see the sketch below)
- Known pattern matching
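A sketch of the parsing and schema checks, assuming Jackson for JSON parsing and a hand-rolled field check; the expected field names (`answer`, `confidence`) are hypothetical:

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public class QualityChecker {

    private final ObjectMapper mapper = new ObjectMapper();

    /**
     * Returns true when the LLM response parses as JSON and carries the
     * fields we expect; callers record the result as a success-rate metric.
     */
    public boolean validateResponse(String rawResponse) {
        try {
            JsonNode root = mapper.readTree(rawResponse);
            // "Schema validation" here is a hand-rolled field check; a real
            // system might plug in a proper JSON Schema validator instead.
            return root.hasNonNull("answer") && root.hasNonNull("confidence");
        } catch (Exception e) {
            // Parsing failure counts against the parsing success rate.
            return false;
        }
    }
}
```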
**User Feedback:**

```java
// Collect explicit feedback
public void recordUserFeedback(Long resultId, int rating) {
    metrics.recordRating("result", resultId, rating);
}

// Implicit feedback signals worth tracking:
// - Click-through rate
// - Conversion rate
// - Return rate
```
**Cohort Analysis:**

- Did AI improve results for 80% of users?
- Which use cases benefit most?
- Where is quality poor?
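The implicit signals listed in the comment above can be recorded as tagged counters that later feed cohort analysis; a sketch assuming a Micrometer registry, with metric and method names that are illustrative:

```java
import io.micrometer.core.instrument.MeterRegistry;

public class ImplicitFeedback {

    private final MeterRegistry registry;

    public ImplicitFeedback(MeterRegistry registry) {
        this.registry = registry;
    }

    // Click-through: user clicked an AI-generated result.
    public void recordClick(String feature) {
        registry.counter("ai.feedback.click", "feature", feature).increment();
    }

    // Conversion: user completed the action the result suggested.
    public void recordConversion(String feature) {
        registry.counter("ai.feedback.conversion", "feature", feature).increment();
    }

    // Return rate: user came back to the AI-assisted flow.
    public void recordReturn(String feature) {
        registry.counter("ai.feedback.return", "feature", feature).increment();
    }
}
```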
## Q4: Cost tracking in detail?

### Answer
```java
public void trackLlmCall(AiCallEvent event) {
    // Per-request level: price out this call from its token usage.
    // pricePerInputToken / pricePerOutputToken are the per-model rates.
    double cost = event.inputTokens * pricePerInputToken
                + event.outputTokens * pricePerOutputToken;
    metrics.recordCost(cost,
            "feature", event.feature,
            "model", event.model);

    // Aggregate levels (featureCost is a Map<String, Double>)
    dailyCost += cost;
    monthlyCost += cost;
    featureCost.merge(event.feature, cost, Double::sum);
}
```
**Visibility at every level:**

- Per-request cost (granular)
- Per-feature cost (feature-level ROI)
- Hourly cost (trends)
- Daily cost (budget tracking)
- Monthly cost (financial planning)
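A sketch of the budget check behind the CostOverrun alert from Q2; the scheduler wiring and the `notifyTeam` hook are assumptions, and `addCost` would be called from `trackLlmCall` above:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.DoubleAdder;

public class BudgetWatcher {

    private final DoubleAdder dailyCost = new DoubleAdder(); // thread-safe accumulator
    private final double dailyBudget;

    public BudgetWatcher(double dailyBudget) {
        this.dailyBudget = dailyBudget;
        // Check hourly, matching the 1h duration on the CostOverrun alert.
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(this::checkBudget, 1, 1, TimeUnit.HOURS);
    }

    public void addCost(double cost) {
        dailyCost.add(cost);
    }

    private void checkBudget() {
        // A real implementation would also reset dailyCost at day boundaries.
        if (dailyCost.sum() > dailyBudget * 1.5) {
            notifyTeam("Daily AI spend is over 1.5x budget: $" + dailyCost.sum());
        }
    }

    private void notifyTeam(String message) {
        // Hypothetical hook: wire to Slack, PagerDuty, email, etc.
        System.err.println(message);
    }
}
```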
## Q5: How to optimize based on observability?

### Answer
**Optimize Based on Data:**

- Low cache hit rate → improve the caching strategy
- High latency → use a faster model or serve more requests from cache
- High cost → add caching or switch to a cheaper model
- Low quality → improve prompts or use a stronger model
- High error rate → improve error handling and fallbacks
**Closed Loop:**

```text
Observe metrics
      ↓
Identify problem
      ↓
Hypothesize solution
      ↓
Implement change
      ↓
Observe impact
      ↓
Validate improvement
```
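As one concrete turn of the loop, a sketch that observes the cache hit rate and flags the "improve caching" action when it drops below the 30% threshold from Q2; the class, counter fields, and log message are illustrative:

```java
import java.util.concurrent.atomic.LongAdder;

public class CacheHitMonitor {

    private final LongAdder hits = new LongAdder();
    private final LongAdder misses = new LongAdder();

    public void recordHit()  { hits.increment(); }
    public void recordMiss() { misses.increment(); }

    /** Observe: compute the hit rate from raw counters. */
    public double hitRate() {
        long total = hits.sum() + misses.sum();
        return total == 0 ? 1.0 : (double) hits.sum() / total;
    }

    /** Identify: surface the problem so someone can hypothesize and act. */
    public void check() {
        if (hitRate() < 0.30) {
            System.err.printf(
                "Cache hit rate %.0f%% is below the 30%% threshold: "
                + "review cache key normalization and TTLs.%n", hitRate() * 100);
        }
    }
}
```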
Next: Behavioral Questions