AI + SRE case study

Infrastructure Assist Agent

A reusable agent framework for platform and security support workflows, designed to reduce triage overhead and give teams faster self-service answers.

Google ADK: Agent framework
New Relic: Metrics
Splunk: Logs
Slack: User workflow

Problem

Platform teams spend too much time answering repeated support questions, gathering basic context, and jumping across observability tools before they can provide useful guidance.

What I built

I designed an AI-driven support workflow using a root orchestrator and specialized sub-agents for observability, logs, and knowledge retrieval. The agent gathers issue context, queries the right systems, and produces a clear troubleshooting summary.
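The orchestrator pattern can be sketched in plain Python. This is a minimal illustration, not the project's code and not the Google ADK API: the sub-agent functions, the `ROUTES` keyword table, and the `Finding` and `Summary` types are all hypothetical, and the sub-agents return placeholder text instead of querying New Relic, Splunk, or a knowledge base.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Finding:
    source: str  # which sub-agent produced this finding
    detail: str  # short, human-readable evidence


@dataclass
class Summary:
    issue: str
    findings: list[Finding] = field(default_factory=list)

    def render(self) -> str:
        lines = [f"Issue: {self.issue}"]
        lines += [f"- [{f.source}] {f.detail}" for f in self.findings]
        return "\n".join(lines)


# Hypothetical sub-agents. Real ones would call external systems;
# these return placeholder text so the sketch runs on its own.
def metrics_agent(issue: str) -> Finding:
    return Finding("metrics", f"placeholder metrics lookup for: {issue}")


def logs_agent(issue: str) -> Finding:
    return Finding("logs", f"placeholder log search for: {issue}")


def knowledge_agent(issue: str) -> Finding:
    return Finding("knowledge", f"placeholder runbook search for: {issue}")


# Assumed routing table: a keyword in the issue text selects a sub-agent.
ROUTES: dict[str, Callable[[str], Finding]] = {
    "latency": metrics_agent,
    "error": logs_agent,
    "how do i": knowledge_agent,
}


def orchestrate(issue: str) -> Summary:
    """Root orchestrator: pick sub-agents, run them, collect findings."""
    text = issue.lower()
    selected = [agent for keyword, agent in ROUTES.items() if keyword in text]
    if not selected:
        # Ambiguous request: fall back to knowledge retrieval.
        selected = [knowledge_agent]
    return Summary(issue, [agent(issue) for agent in selected])


if __name__ == "__main__":
    print(orchestrate("Checkout latency spiked and error rate is up").render())
```

Keyword matching stands in for whatever routing the real orchestrator uses; in an LLM-backed agent the model would typically make that choice, while the fan-out and collect shape stays the same.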

Design principles

  • Use static workflows where the correctness of a review matters.
  • Use dynamic routing where support context is ambiguous.
  • Keep tool calls scoped and auditable.
  • Summarize evidence, assumptions, and recommended next steps.
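The last two principles can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: `AuditedToolbox`, `search_logs`, and `summarize` are hypothetical names, the allowlist is one possible way to scope tool calls, and the in-memory list stands in for a real audit sink.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class AuditedToolbox:
    """Only allowlisted tools can run, and every attempt is recorded."""
    allowed: dict[str, Callable[..., Any]]
    audit_log: list[dict[str, Any]] = field(default_factory=list)

    def call(self, agent: str, tool: str, **kwargs: Any) -> Any:
        # Record the attempt before doing anything else.
        entry = {"agent": agent, "tool": tool, "args": kwargs, "status": "denied"}
        self.audit_log.append(entry)
        if tool not in self.allowed:
            raise PermissionError(f"{agent} may not call {tool}")
        entry["status"] = "started"
        result = self.allowed[tool](**kwargs)
        entry["status"] = "ok"
        return result


# Hypothetical tool; a real one would query a log backend.
def search_logs(query: str, minutes: int = 15) -> str:
    return f"placeholder results for {query!r} over the last {minutes} minutes"


def summarize(evidence: list[str], assumptions: list[str], next_steps: list[str]) -> str:
    """Render evidence, assumptions, and next steps as separate sections."""
    sections = [("Evidence", evidence), ("Assumptions", assumptions), ("Next steps", next_steps)]
    lines: list[str] = []
    for title, items in sections:
        lines.append(f"{title}:")
        lines += [f"  - {item}" for item in items] or ["  - (none)"]
    return "\n".join(lines)


if __name__ == "__main__":
    box = AuditedToolbox(allowed={"search_logs": search_logs})
    hit = box.call("logs_agent", "search_logs", query="timeout")
    print(summarize([hit], ["Time window is an assumed default"], ["Check upstream dependency"]))
    print(box.audit_log)
```

Keeping assumptions in their own section means a reader can tell what the agent observed from what it inferred, and the audit log shows which tool calls produced the evidence.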

Impact

The goal is to reduce support wait time, lower triage toil, and make day-2 platform support more self-service for engineering teams.
