Support triage at Mergify meant juggling Datadog, Sentry, PostgreSQL, Linear, and source code. We built a repo with MCP servers and Claude Code that investigates tickets in parallel — cutting triage from 15 minutes to under 5, with 75% first-pass accuracy.
Support in a B2B infrastructure company is distributed debugging. At Mergify, every engineer is involved in support. When a customer reports an issue, we need to answer fast, accurately, and with confidence. But the investigation surface is fragmented:
Logs in Datadog
Errors in Sentry
Production read-only PostgreSQL
Source code in multiple repositories
Existing tickets in Linear
Customer conversation in our support platform
Each system has context. None of them shares it.
The result used to look like this:
Open ticket
Identify customer org
Search logs
Search Sentry
Check DB state
Look at code
Check if there’s already a Linear issue
Build a timeline
Draft a response
That was 10 to 15 minutes of focused work per ticket, sometimes more. We reduced that to 2-5 minutes of background time using Claude Code + MCP.
This is the architecture.
The Idea: Turn Claude Into a Cross-System Investigator
Instead of building another internal dashboard, we built a GitHub repository.
That repository is an investigation surface for Claude Code.
Inside it:
All company repositories are linked as Git submodules under src/
A static .mcp.json declares external systems
A mcp/ directory hosts local MCP servers
Slash commands encapsulate production SQL workflows
A CLAUDE.md file encodes our full support runbook
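In practice the layout looks roughly like this (a sketch; exact file names and the placement of query-prod.sh are assumptions):

```text
support-investigations/        # hypothetical repo name
├── CLAUDE.md                  # support runbook + setup instructions
├── .mcp.json                  # static MCP server declarations
├── query-prod.sh              # read-only production query wrapper
├── .claude/
│   ├── settings.json          # tool permissions
│   └── commands/              # slash commands, e.g. pr-event-log
├── mcp/                       # local stdio MCP servers (Plain, Datadog)
└── src/                       # company repositories as Git submodules
```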
Engineers run claude, then paste a support ticket URL. Claude performs a first-pass investigation, generates a structured timeline, suggests a root cause hypothesis, and drafts a response. If relevant, it proposes or creates a Linear ticket with a plan.
Architecture
MCP Layer: Unified Tooling via Official Protocol
We use the MCP protocol via @modelcontextprotocol/sdk in TypeScript. There are two types of servers:
1. Remote HTTP MCP (Vendor Hosted)
Sentry
Linear
Configured in .mcp.json:
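A minimal sketch of the remote entries, assuming the hosted endpoints each vendor documents (the URLs shown are illustrative, not guaranteed to be current):

```json
{
  "mcpServers": {
    "sentry": {
      "type": "http",
      "url": "https://mcp.sentry.dev/mcp"
    },
    "linear": {
      "type": "http",
      "url": "https://mcp.linear.app/mcp"
    }
  }
}
```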
Claude connects directly. OAuth handled by vendor. No local code required.
2. Local stdio MCP Servers (Thin Wrappers)
We wrote two local MCP servers:
Plain (support platform)
Datadog
Each is a Node.js TypeScript file executed via tsx. Claude Code spawns them as subprocesses using stdio transport.
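The corresponding .mcp.json entries look roughly like this (the file paths under mcp/ are assumptions):

```json
{
  "mcpServers": {
    "plain": {
      "command": "npx",
      "args": ["tsx", "mcp/plain.ts"]
    },
    "datadog": {
      "command": "npx",
      "args": ["tsx", "mcp/datadog.ts"]
    }
  }
}
```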
They are stateless wrappers around vendor SDKs:
@team-plain/typescript-sdk
@datadog/datadog-api-client
Example tool definition:
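Here is a condensed sketch of the Datadog wrapper's log-search tool. The tool name, parameter shapes, and exact Datadog query are illustrative, not our production code:

```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { client, v2 } from "@datadog/datadog-api-client";
import { z } from "zod";

const server = new McpServer({ name: "datadog", version: "1.0.0" });
const logsApi = new v2.LogsApi(client.createConfiguration());

// One stateless tool: search logs in a time window, return one page of raw results.
server.tool(
  "search_logs",
  "Search Datadog logs for a query within a time range",
  {
    query: z.string().describe("Datadog log search query"),
    from: z.string().describe("Start time, e.g. now-1h or an ISO timestamp"),
    to: z.string().describe("End time"),
    cursor: z.string().optional().describe("Pagination cursor from a previous page"),
  },
  async ({ query, from, to, cursor }) => {
    const response = await logsApi.listLogs({
      body: {
        filter: { query, from, to },
        page: { limit: 50, cursor },
      },
    });
    // Hand the raw JSON back to Claude; synthesis happens in the conversation.
    return {
      content: [{ type: "text", text: JSON.stringify(response, null, 2) }],
    };
  },
);

await server.connect(new StdioServerTransport());
```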
Key characteristics:
Zod schema → auto-converted to JSON Schema
Stateless API calls
No in-memory persistence
Pagination via cursors returned to caller
Auth is environment-driven:
If env vars are missing, servers exit early. CLAUDE.md instructs operators how to fix it.
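The guard itself is only a few lines. A sketch, assuming the standard Datadog variable names:

```typescript
// Fail fast if credentials are missing; CLAUDE.md explains how to set them.
for (const name of ["DD_API_KEY", "DD_APP_KEY"]) {
  if (!process.env[name]) {
    console.error(`Missing environment variable ${name}. See CLAUDE.md setup section.`);
    process.exit(1);
  }
}
```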
No dynamic tool registration. Entire .mcp.json is static and versioned.
Claude Code Integration
We use Claude Code CLI.
The repo follows Claude Code conventions:
.mcp.json
.claude/settings.json
.claude/commands/
CLAUDE.md
When you run claude, it:
Loads MCP config
Spawns local servers
Connects to remote MCP endpoints
Loads CLAUDE.md
Enforces tool permissions
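Tool permissions come from .claude/settings.json. A hedged sketch; the rules shown are illustrative rather than our exact allowlist, and the Linear tool name is invented:

```json
{
  "permissions": {
    "allow": [
      "Bash(./query-prod.sh:*)",
      "mcp__plain",
      "mcp__datadog",
      "mcp__sentry"
    ],
    "ask": [
      "mcp__linear__create_issue"
    ]
  }
}
```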
CLAUDE.md: The Investigation Brain
This file is more than a system prompt. It is:
A purpose statement
A repository map
A self-healing setup procedure
A full 5-step support runbook
The investigation section enforces parallelism: Claude is instructed to maximize the number of independent background agents rather than working through sources one at a time. This matters: sequential thinking kills triage speed. Parallel exploration compresses time-to-signal.
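Illustratively, the relevant passage of CLAUDE.md reads something like this (paraphrased, not the verbatim file):

```markdown
## Step 2: Investigate in parallel

Launch independent background agents for each source. Do NOT investigate sequentially.

- Agent 1: reconstruct the event timeline from the production database
- Agent 2: search Sentry for errors scoped to the organization and time window
- Agent 3: search Datadog logs for the repository and PR in the same window
- Agent 4: search Linear for existing related issues

Wait for all agents, then synthesize a single timeline before drafting a response.
```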
Production Database: Read-Only but Powerful
Claude can query production. The connection path is: Claude → Bash → query-prod.sh → Cloud SQL Proxy → PostgreSQL
There is no persistent connection. The script:
Starts ephemeral Cloud SQL proxy
Runs psql -c "$QUERY"
Tears down the connection
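A sketch of the shape of query-prod.sh, assuming Cloud SQL Auth Proxy v2 and environment-provided instance and database names (the real script also owns the timeouts listed in the table below):

```bash
#!/usr/bin/env bash
# Sketch of query-prod.sh: ephemeral proxy, one query, teardown.
set -euo pipefail
QUERY="$1"

# Start an ephemeral Cloud SQL Auth Proxy using short-lived gcloud credentials.
cloud-sql-proxy --auto-iam-authn --port 5433 "$CLOUDSQL_INSTANCE" &
PROXY_PID=$!
trap 'kill "$PROXY_PID"' EXIT

# Wait briefly for the proxy, then run the query under a statement timeout.
sleep 2
PGOPTIONS="-c statement_timeout=30000" \
  psql "host=127.0.0.1 port=5433 dbname=$DB_NAME user=$DB_USER" \
  -c "$QUERY"
```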
Claude constructs free-form SQL. There is no run_sql() MCP tool. However, we set some guardrails:
| Layer | Mechanism |
|---|---|
| DB role | Strict read-only IAM role |
| Auth | Short-lived gcloud tokens |
| Script whitelist | Only query-prod.sh is permitted |
| Prompt guardrails | LIMIT required, no SELECT *, time filters |
| Timeouts | 30s proxy timeout + DB query timeout |
The hard boundary is infra-level read-only. Everything else reduces blast radius.
Slash commands like /pr-event-log embed templated SQL patterns for common investigations. This reconstructs:
Event timeline
Check runs
Queue state
Speculative merges
Claude merges that with logs and errors to produce a unified timeline.
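Each slash command is just a markdown file under .claude/commands/. A hedged sketch of what pr-event-log.md might look like; the table and column names are invented for illustration:

```markdown
Reconstruct the event history for a pull request.

Arguments: $ARGUMENTS (expected: organization slug, repository name, PR number)

Run queries via ./query-prod.sh, always with a LIMIT and a time filter:

SELECT event_type, payload, created_at
FROM pr_events
WHERE org_slug = '<org>' AND repo = '<repo>' AND pr_number = <pr>
  AND created_at > now() - interval '7 days'
ORDER BY created_at
LIMIT 200;

Then summarize check runs, queue state, and any speculative merges into a timeline.
```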
End-to-End Flow
When an engineer pastes a ticket URL, the following happens:
Claude fetches thread via Plain MCP
Extracts metadata:
Organization
Repository
PR number
Timestamps
Launches Wave 1 in parallel:
SQL timeline
Sentry searches
Datadog logs
Linear related tickets
Synthesizes findings
If needed:
Inspects engine code under src/
Checks GitHub outage status
Generates:
Structured investigation notes
Hypothesis
Suggested customer response
Linear issue draft with plan
Measurable Impact
Triage time: from 10–15 minutes focused → 2–5 minutes mostly background
First-pass diagnosis accuracy: ~75% (based on how often the engineer's final response matches Claude's initial hypothesis)
Team adoption: 100%
Build time: ~10 hours spread over a week
ROI: positive within weeks
The biggest gain is not speed. It is cognitive load reduction.
Instead of juggling five tabs, the engineer reviews a synthesized narrative.
The Hard Part: Confident Wrong Leads
Claude's main failure mode is assuming causality. An example pattern might be:
Sees error in Sentry
Sees a log spike
Correlates with the customer report
Concludes root cause
Sometimes it is correlation, not causation. To detect such cases, we use:
Human intuition
System knowledge
Cross-check inconsistencies
AI is very good at being convincingly wrong across multiple systems. This is why:
Write operations are not fully auto-approved
Linear issue creation may require operator approval
Engineers validate before responding
The system accelerates investigation. It does not replace judgment.
Why This Worked
Three major reasons made this system work:
1. We Encoded the Runbook
Support expertise used to live in engineers' heads; now it lives in CLAUDE.md.
Parallel search rules.
Specific Sentry query shapes.
Mandatory time-window scans.
GitHub outage checks.
This standardization alone would have improved quality. Claude just executes it faster.
2. MCP as a Clean Abstraction
MCP gave us:
Unified tool discovery
Schema validation
Clear permission control
Separation between remote and local systems
With MCP there are no custom protocols to invent, no glue microservices, and no persistent backend to manage. Everything runs out of a plain GitHub repository.
3. Low Build Cost
Building the entire repository took around 10 hours. Thanks to Claude Code, development ran in parallel with other work. The MCP servers were thin wrappers.
The key insight: you don't need a big AI platform to get leverage. You need integration depth.
What's Next: Autonomous Triage
Today triage is interactive: engineer runs Claude → gets investigation.
Our next step would be:
Background worker triggered on ticket creation
Automatically generates an investigation note
Stores:
Compressed reasoning trace
Structured investigation graph
Attaches the context to the support platform
When the engineer reads the ticket, the investigation is already there.
This moves from a human-driven AI assistant to an AI-prepared human decision.
However, we are cautious here: wrong-lead risk increases when no human seeds context. The structured investigation graph is key to making the reasoning auditable.
Conclusion
Support triage is distributed debugging. The hard part is not finding information — it is assembling context from systems that were never designed to share it.
Claude Code and MCP gave us a way to collapse that fragmentation without building new infrastructure. A repository, a few thin wrappers, and an explicit runbook. Total investment: about 10 hours.
What changed is not the speed (though that matters). It is the shift from active investigation to review. Engineers now spend their time validating hypotheses instead of constructing them.
We think the next step — pre-computed investigations attached to tickets before a human opens them — is where this gets genuinely interesting. But even without that, the current setup has already changed how our team thinks about support.





