What is MCP tool deferred loading?

Instead of injecting all tool schemas at startup, the agent searches for tools it needs and loads only those schemas on demand. Claude Code 2.1.7 shipped this as opt-in, reducing init tokens from 75k to 8k in typical setups.
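The search-then-load pattern can be sketched in a few lines. This is a minimal illustration, not Claude Code's actual implementation: the `Tool` and `DeferredToolRegistry` classes, the keyword matching, and the three example tools are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Tool:
    name: str
    summary: str   # short description, cheap to keep in context
    schema: dict   # full JSON schema, only injected when needed


@dataclass
class DeferredToolRegistry:
    tools: dict[str, Tool]
    loaded: dict[str, dict] = field(default_factory=dict)

    def search(self, query: str) -> list[str]:
        """Return names of tools whose name or summary contains any query word."""
        words = query.lower().split()
        return [
            t.name
            for t in self.tools.values()
            if any(w in t.name.lower() or w in t.summary.lower() for w in words)
        ]

    def load(self, name: str) -> dict:
        """Put one full schema into the active context and return it."""
        self.loaded[name] = self.tools[name].schema
        return self.loaded[name]


# Hypothetical tools standing in for real MCP server tools.
registry = DeferredToolRegistry(tools={
    "create_issue": Tool(
        "create_issue", "Open a GitHub issue",
        {"type": "object", "properties": {"title": {"type": "string"}}},
    ),
    "post_message": Tool(
        "post_message", "Send a Slack message",
        {"type": "object", "properties": {"text": {"type": "string"}}},
    ),
    "list_errors": Tool(
        "list_errors", "List recent Sentry errors",
        {"type": "object", "properties": {"project": {"type": "string"}}},
    ),
})

matches = registry.search("slack message")
for name in matches:
    registry.load(name)

print(matches)                # only the matching tool name
print(list(registry.loaded))  # only that tool's schema is in context
```

Only the short summaries stay resident. The full schema for `post_message` is the only one loaded, and the other two never enter the context.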

Is MCP context bloat a cost problem or a capability problem?

Both. Capability: agents truncate, forget, and fail when the context fills. Cost: at enterprise scale, with thousands of agent calls per day, paying for 100k+ wasted tokens per request compounds fast.

Why does MCP context bloat block enterprise agents?

Every MCP server sends its full tool schema to the LLM on init. With 3 typical servers (~40 tools), this burns 143,000 of a 200,000-token context before the agent handles any user query.
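To put those numbers in proportion, here is a small arithmetic sketch using only the figures quoted above. The tool count is the approximate "~40" from the text.

```python
CONTEXT_WINDOW = 200_000   # total context, from the figure above
SCHEMA_TOKENS = 143_000    # consumed by tool schemas on init
TOOL_COUNT = 40            # approximate ("~40 tools")

remaining = CONTEXT_WINDOW - SCHEMA_TOKENS
used_fraction = SCHEMA_TOKENS / CONTEXT_WINDOW
per_tool = SCHEMA_TOKENS / TOOL_COUNT

print(f"{used_fraction:.1%} of the context used, {remaining:,} tokens left")
print(f"about {per_tool:,.0f} tokens per tool on average")
```

That leaves 57,000 tokens for the conversation itself. The implied average of about 3,575 tokens per tool is higher than the 500-1,500 per-tool range quoted elsewhere on this page, and the source does not explain the difference.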

Does MCP context bloat affect Claude Code specifically?

Yes. Before deferred loading shipped in late 2025, a typical Claude Code setup with 4 MCP servers consumed 51k tokens at startup. A Reddit user documented 67k tokens consumed before writing a single prompt.

How many tokens does a typical MCP server use?

500-1,500 tokens per tool. A server with 30 tools uses 15,000-45,000 tokens just for schema descriptions, injected on every request.
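The range above is simple multiplication, which makes it easy to estimate for any server. A minimal sketch (the function name is illustrative, and the per-tool bounds are the ones quoted above):

```python
def schema_token_range(tool_count: int, low: int = 500, high: int = 1_500) -> tuple[int, int]:
    """Estimate the (min, max) schema tokens for a server with tool_count tools."""
    return tool_count * low, tool_count * high


low, high = schema_token_range(30)
print(f"{low:,}-{high:,} tokens")  # 15,000-45,000 tokens
```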

What tools fix MCP context bloat?

Claude Code's deferred tool loading (85% reduction), mcp-compressor by Atlassian Labs (70-97% reduction), mcp2cli (96-99% fewer tokens), MCPlexor (95% reduction). None are universal across all clients.

clawsmith.com/signal/mcp-context-bloat-enterprise-agent-blocker

⚠ Issue · Underserved · ai_agent_mcp · Live

MCP Context Bloat Is Breaking Enterprise AI Agents

Every MCP server injects its full tool schema (500-1,500 tokens per tool) on initialization. In a 3-server enterprise setup (GitHub + Slack + Sentry, ~40 tools), 143,000 of 200,000 tokens are consumed before the agent processes a single query. Agents start forgetting, truncating, and failing. Claude Code's tool-search deferred loading cut this by 85%, but the problem remains for any stack not running the latest Claude Code version. Multiple HN threads, paid solutions, and open-source workarounds confirm it is a real deployment blocker.

Score Breakdown

923

Social Proof (2 sources)

MCP server that reduces Claude Code context consumption by 98%
mksglu · 3/1/2026 · 677 HN

Show HN: Mcp2cli – One CLI for every API, 96-99% fewer tokens than native MCP
knowsuchagency · 3/1/2026 · 246

Gap Assessment

Underserved: Existing solutions leave gaps

Several tools (mcp-compressor, mcp2cli, MCPlexor, Claude Code deferred loading) address the problem, but none are universal: each works only within specific clients or requires manual configuration. That leaves room for a protocol-level fix.

Virality Score: 923 (across 0 platforms)

Details

Signal: issue
Ecosystem: ai_agent_mcp
Sources: 2
Platforms: 0
Updated: 4h ago
Trend: → stable
