Julian Maurin

Feb 20, 2026

9 min read

How We Turned Claude Into a Cross-System Support Investigator


Support triage at Mergify meant juggling Datadog, Sentry, PostgreSQL, Linear, and source code. We built a repo with MCP servers and Claude Code that investigates tickets in parallel — cutting triage from 15 minutes to under 5, with 75% first-pass accuracy.

Support in a B2B infrastructure company is distributed debugging. At Mergify, every engineer is involved in support. When a customer reports an issue, we need to answer fast, accurately, and with confidence. But the investigation surface is fragmented:

  • Logs in Datadog

  • Errors in Sentry

  • Production read-only PostgreSQL

  • Source code in multiple repositories

  • Existing tickets in Linear

  • Customer conversation in our support platform

Each system has context. None of them shares it.

The result used to look like this:

  1. Open ticket

  2. Identify customer org

  3. Search logs

  4. Search Sentry

  5. Check DB state

  6. Look at code

  7. Check if there’s already a Linear issue

  8. Build a timeline

  9. Draft a response

That was 10 to 15 minutes of focused work per ticket, sometimes more. We reduced that to 2–5 minutes of background time using Claude Code + MCP.

This is the architecture.

The Idea: Turn Claude Into a Cross-System Investigator

Instead of building another internal dashboard, we built a GitHub repository.

That repository is an investigation surface for Claude Code.

Inside it:

  • All company repositories are linked as Git submodules under src/

  • A static .mcp.json declares external systems

  • A mcp/ directory hosts local MCP servers

  • Slash commands encapsulate production SQL workflows

  • A CLAUDE.md file encodes our full support runbook

Engineers run claude, then paste a support ticket URL. Claude performs a first-pass investigation, generates a structured timeline, suggests a root cause hypothesis, and drafts a response. If relevant, it proposes or creates a Linear ticket with a plan.

Architecture

MCP Layer: Unified Tooling via Official Protocol

We use the MCP protocol via @modelcontextprotocol/sdk in TypeScript. There are two types of servers:

1. Remote HTTP MCP (Vendor Hosted)

  • Sentry

  • Linear

Configured in .mcp.json:

{
  "mcpServers": {
    "sentry": {
      "type": "http",
      "url": "https://mcp.sentry.dev/mcp"
    },
    "linear": {
      "type": "http",
      "url": "https://mcp.linear.app/mcp"
    }
  }
}

Claude connects directly; OAuth is handled by the vendor, and no local code is required.

2. Local stdio MCP Servers (Thin Wrappers)

We wrote two local MCP servers:

  • Plain (support platform)

  • Datadog

Each is a Node.js TypeScript file executed via tsx. Claude Code spawns them as subprocesses using stdio transport.

They are stateless wrappers around vendor SDKs:

  • @team-plain/typescript-sdk

  • @datadog/datadog-api-client

Example tool definition (the handler body below approximates the Datadog SDK's request shape):

import { z } from "zod";
import { client, v2 } from "@datadog/datadog-api-client";

// Reads DD_API_KEY / DD_APP_KEY from the environment.
const logsApi = new v2.LogsApi(client.createConfiguration());

mcp.tool(
  "search_logs",
  "Search Datadog logs by query",
  {
    query: z.string(),
    from: z.string().default("now-1h"),
    to: z.string().default("now"),
    limit: z.number().int().min(1).max(1000).default(25),
  },
  async ({ query, from, to, limit }) => {
    // Stateless search call; the pagination cursor is returned to the caller.
    const resp = await logsApi.listLogs({
      body: { filter: { query, from, to }, page: { limit } },
    });
    const text = JSON.stringify({
      logs: resp.data ?? [],
      nextCursor: resp.meta?.page?.after ?? null,
    });
    return { content: [{ type: "text", text }] };
  }
);

Key characteristics:

  • Zod schema → auto-converted to JSON Schema

  • Stateless API calls

  • No in-memory persistence

  • Pagination via cursors returned to caller

Auth is environment-driven:

"env": {
  "DD_API_KEY": "${DD_API_KEY}",
  "DD_APP_KEY": "${DD_APP_KEY}"
}

If env vars are missing, servers exit early. CLAUDE.md instructs operators how to fix it.

No dynamic tool registration. Entire .mcp.json is static and versioned.
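The fail-fast behavior needs nothing more than a pure check over the environment. A minimal sketch, assuming helper names of our own invention (only the DD_* variable names come from the config above):

```typescript
// Sketch of the fail-fast startup check each local server performs.
function missingEnvVars(
  required: string[],
  env: Record<string, string | undefined>,
): string[] {
  // Treat unset and empty-string variables as missing.
  return required.filter((name) => !env[name]);
}

function assertEnvOrExit(required: string[]): void {
  const missing = missingEnvVars(required, process.env);
  if (missing.length > 0) {
    // CLAUDE.md documents which variables to export and where to find them.
    console.error(`missing environment variables: ${missing.join(", ")}`);
    process.exit(1);
  }
}
```

Keeping the check pure (environment passed in as a parameter) also makes it trivially testable.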

Claude Code Integration

We use Claude Code CLI.

The repo follows Claude Code conventions:

  • .mcp.json

  • .claude/settings.json

  • .claude/commands/

  • CLAUDE.md

When you run claude, it:

  1. Loads MCP config

  2. Spawns local servers

  3. Connects to remote MCP endpoints

  4. Loads CLAUDE.md

  5. Enforces tool permissions
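Tool permissions live in .claude/settings.json. A permissions block in this spirit might look like the following sketch; the matcher entries are assumptions, only the allow/deny structure is Claude Code's:

```json
{
  "permissions": {
    "allow": [
      "Bash(./query-prod.sh:*)"
    ],
    "deny": [
      "Bash(psql:*)"
    ]
  }
}
```

This is how a script whitelist like "only query-prod.sh may touch production" gets enforced at the tool layer rather than by convention.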

CLAUDE.md: The Investigation Brain

This file is more than a system prompt. It is:

  • A purpose statement

  • A repository map

  • A self-healing setup procedure

  • A full 5-step support runbook

The investigation section enforces parallelism: Claude is instructed to maximize the number of independent background agents. This matters because sequential thinking kills triage speed, while parallel exploration compresses time-to-signal.
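That rule amounts to fanning out independent probes and synthesizing afterwards. A minimal sketch, with hypothetical probe functions standing in for the real MCP tool calls:

```typescript
// Wave 1: independent probes fired concurrently, never sequentially.
// Each probe stands in for an MCP tool call (Datadog, Sentry, SQL, Linear).
type Probe = () => Promise<string>;

async function runWave(
  probes: Record<string, Probe>,
): Promise<Record<string, string>> {
  const names = Object.keys(probes);
  const results = await Promise.allSettled(names.map((n) => probes[n]()));
  const findings: Record<string, string> = {};
  results.forEach((r, i) => {
    // A failed probe must not sink the wave; record the failure as a finding.
    findings[names[i]] =
      r.status === "fulfilled" ? r.value : `probe failed: ${r.reason}`;
  });
  return findings;
}
```

Promise.allSettled (rather than Promise.all) matters here: one timed-out system should degrade the investigation, not abort it.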

Production Database: Read-Only but Powerful

Claude can query production. The connection path is: Claude → Bash → query-prod.sh → Cloud SQL Proxy → PostgreSQL

There is no persistent connection. The script:

  1. Starts ephemeral Cloud SQL proxy

  2. Runs psql -c "$QUERY"

  3. Tears down the connection
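The three steps above could be sketched as follows; the proxy binary name, flags, and environment variable are assumptions, not our actual script:

```shell
#!/usr/bin/env bash
# Hypothetical sketch of query-prod.sh: ephemeral proxy, one query, teardown.
set -euo pipefail

run_query() {
  local query="$1"
  # 1. Start an ephemeral Cloud SQL proxy for this query only
  cloud-sql-proxy --port 5433 "${CLOUDSQL_INSTANCE:?set to your instance id}" &
  local proxy_pid=$!
  # 3. Tear the proxy down whether the query succeeds or fails
  trap 'kill "$proxy_pid" 2>/dev/null || true' EXIT
  sleep 2  # crude wait for the proxy to start accepting connections
  # 2. One query through psql with a hard 30s wall-clock timeout
  timeout 30 psql "host=127.0.0.1 port=5433" -c "$query"
}

if [ "$#" -gt 0 ]; then
  run_query "$1"
fi
```

The point of the ephemeral shape is that there is never a standing credentialed connection for an agent to reuse outside a single invocation.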

Claude constructs free-form SQL. There is no run_sql() MCP tool. However, we set some guardrails:

| Layer | Mechanism |
| --- | --- |
| DB role | Strict read-only IAM role |
| Auth | Short-lived gcloud tokens |
| Script whitelist | Only query-prod.sh allowed |
| Prompt guardrails | LIMIT required, no SELECT *, time filters |
| Timeouts | 30s proxy timeout + DB query timeout |
The hard boundary is infra-level read-only. Everything else reduces blast radius.
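Those prompt-level rules (LIMIT required, no SELECT *, time filters) are the kind of thing that can also be mirrored as a mechanical pre-flight lint. The checker below is our illustration, not part of the actual pipeline:

```typescript
// Heuristic pre-flight checks mirroring the prompt guardrails.
// String matching is deliberately naive: this reduces blast radius;
// the hard boundary remains the read-only DB role.
function lintQuery(sql: string): string[] {
  const problems: string[] = [];
  const q = sql.toLowerCase();
  if (!/\blimit\s+\d+/.test(q)) problems.push("missing LIMIT");
  if (/select\s+\*/.test(q)) problems.push("SELECT * is not allowed");
  if (!/\bwhere\b[\s\S]*(created_at|updated_at|timestamp)/.test(q)) {
    problems.push("missing time filter");
  }
  return problems;
}
```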

Slash commands like /pr-event-log embed templated SQL patterns for common investigations. A single command reconstructs:

  • Event timeline

  • Check runs

  • Queue state

  • Speculative merges

Claude merges that with logs and errors to produce a unified timeline.
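That merge step is conceptually just a sort over normalized events. A sketch, with an event shape we assume purely for illustration:

```typescript
// Normalized event shape assumed for illustration; each source
// (SQL, Datadog, Sentry) would be mapped into it before merging.
interface TimelineEvent {
  at: string; // ISO-8601 timestamp
  source: "db" | "logs" | "sentry";
  summary: string;
}

function mergeTimeline(...sources: TimelineEvent[][]): TimelineEvent[] {
  // Chronological order turns disjoint result sets into one narrative.
  return sources.flat().sort((a, b) => a.at.localeCompare(b.at));
}
```

Sorting ISO-8601 strings lexicographically is timestamp-safe, which keeps the merge dependency-free.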

End-to-End Flow

When an engineer pastes a ticket URL, the following happens:

  1. Claude fetches thread via Plain MCP

  2. Extracts metadata:

    • Organization

    • Repository

    • PR number

    • Timestamps

  3. Launches Wave 1 in parallel:

    • SQL timeline

    • Sentry searches

    • Datadog logs

    • Linear related tickets

  4. Synthesizes findings

  5. If needed:

    • Inspects engine code under src/

    • Checks GitHub outage status

  6. Generates:

    • Structured investigation notes

    • Hypothesis

    • Suggested customer response

    • Linear issue draft with plan

Measurable Impact

  • Triage time: from 10–15 minutes focused → 2–5 minutes mostly background

  • First-pass accurate diagnosis: ~75% (based on how often the engineer's final response matches Claude's initial hypothesis)

  • Team adoption: 100%

  • Build time: ~10 hours spread over a week

  • ROI: positive within weeks

The biggest gain is not speed. It is cognitive load reduction.

Instead of juggling five tabs, the engineer reviews a synthesized narrative.

The Hard Part: Confident Wrong Leads

Claude's main failure mode is assuming causality. An example pattern might be:

  • Sees error in Sentry

  • Sees a log spike

  • Correlates with the customer report

  • Concludes root cause

Sometimes it is correlation, not causation. To catch such cases, we rely on:

  • Human intuition

  • System knowledge

  • Cross-checking inconsistencies

AI is very good at being convincingly wrong across multiple systems. This is why:

  • Write operations are not fully auto-approved

  • Linear issue creation may require operator approval

  • Engineers validate before responding

The system accelerates investigation. It does not replace judgment.

Why This Worked

There are three major reasons that made this new system work:

1. We Encoded the Runbook

Support expertise used to live in engineers' heads; now it lives in CLAUDE.md.

  • Parallel search rules.

  • Specific Sentry query shapes.

  • Mandatory time-window scans.

  • GitHub outage checks.

This standardization alone would have improved quality. Claude just executes it faster.

2. MCP as a Clean Abstraction

MCP gave us:

  • Unified tool discovery

  • Schema validation

  • Clear permission control

  • Separation between remote and local systems

MCP spared us custom protocols, glued-together microservices, and a persistent backend to manage. Everything runs out of a plain GitHub repository.

3. Low Build Cost

Building the entire repository took around 10 hours. Thanks to Claude Code, development ran in parallel with other work. The MCP servers were thin wrappers.

The key insight: you don't need a big AI platform to get leverage. You need integration depth.

What's Next: Autonomous Triage

Today triage is interactive: engineer runs Claude → gets investigation.

Our next step would be:

  • Background worker triggered on ticket creation

  • Automatically generates an investigation note

  • Stores:

    • Compressed reasoning trace

    • Structured investigation graph

  • Attaches the context to the support platform

When the engineer reads the ticket, the investigation is already there.

This moves from a human-driven AI assistant to an AI-prepared human decision.

However, we are cautious here: wrong-lead risk increases when no human seeds context. The structured investigation graph is key to making the reasoning auditable.

Conclusion

Support triage is distributed debugging. The hard part is not finding information — it is assembling context from systems that were never designed to share it.

Claude Code and MCP gave us a way to collapse that fragmentation without building new infrastructure. A repository, a few thin wrappers, and an explicit runbook. Total investment: about 10 hours.

What changed is not the speed (though that matters). It is the shift from active investigation to review. Engineers now spend their time validating hypotheses instead of constructing them.

We think the next step — pre-computed investigations attached to tickets before a human opens them — is where this gets genuinely interesting. But even without that, the current setup has already changed how our team thinks about support.
