Caching
MCPlexer provides a multi-layer caching system that reduces latency and load on downstream MCP servers. Caching is configured per-server and automatically distinguishes between read-only and mutation operations.
Per-Server Configuration
Enable caching on any downstream server definition:

    downstream_servers:
      - id: github-mcp
        command: github-mcp-server
        cache:
          enabled: true
          ttl_seconds: 300
          max_entries: 1000

| Name | Type | Default | Description |
|---|---|---|---|
| enabled | boolean | false | Enable tool call caching for this server |
| ttl_seconds | integer | 300 | Time-to-live for cached responses in seconds |
| max_entries | integer | 1000 | Maximum number of cached responses before eviction |
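To make the interaction between ttl_seconds and max_entries concrete, here is a minimal Python sketch of a bounded cache with per-entry expiry. It is not MCPlexer's implementation: the class name `TTLCache` is hypothetical, and the least-recently-used eviction order is an assumption, since the configuration reference only says that entries are evicted once max_entries is exceeded.

```python
import time
from collections import OrderedDict


class TTLCache:
    """Bounded response cache with per-entry expiry (illustrative only)."""

    def __init__(self, ttl_seconds=300, max_entries=1000, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.max_entries = max_entries
        self._clock = clock
        self._entries = OrderedDict()  # key -> (expires_at, value)

    def get(self, key):
        item = self._entries.get(key)
        if item is None:
            return None
        expires_at, value = item
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: drop it and report a miss
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value):
        self._entries[key] = (self._clock() + self.ttl_seconds, value)
        self._entries.move_to_end(key)
        while len(self._entries) > self.max_entries:
            # Assumption: evict the least recently used entry first.
            self._entries.popitem(last=False)


# Usage with a fake clock so expiry is deterministic.
now = [0.0]
cache = TTLCache(ttl_seconds=300, max_entries=2, clock=lambda: now[0])
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # touch "a" so "b" becomes least recently used
cache.put("c", 3)  # over capacity: "b" is evicted
```

Passing the clock in as a parameter keeps the sketch testable without sleeping; a real deployment would use the default monotonic clock.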
Cacheable Patterns
MCPlexer automatically identifies read-only tools by their name prefix. Tools matching these patterns are eligible for caching:
- get_*: single resource lookups
- list_*: collection queries
- search_*: search operations
When a cacheable tool is called with the same arguments, MCPlexer returns the cached response instead of forwarding to the downstream server.
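The prefix check and the "same arguments" comparison can be sketched as follows. This is an illustration under assumptions, not MCPlexer's code: the helper names are hypothetical, the `<server>__<tool>` namespace split is inferred from tool names such as github__list_repos, and hashing canonical JSON is just one reasonable way to treat argument objects with different key order as equal.

```python
import hashlib
import json
from fnmatch import fnmatchcase

CACHEABLE_PATTERNS = ("get_*", "list_*", "search_*")


def bare_name(tool):
    # Assumption: namespaced tool names look like "<server>__<tool>".
    return tool.split("__", 1)[-1]


def is_cacheable(tool):
    """True if the un-namespaced tool name matches a read-only pattern."""
    return any(fnmatchcase(bare_name(tool), p) for p in CACHEABLE_PATTERNS)


def cache_key(server_id, tool, arguments):
    """Build a key that is stable regardless of argument key order."""
    canonical = json.dumps(arguments, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"{server_id}:{tool}:{digest}"


key_one = cache_key("github-mcp", "github__list_repos", {"owner": "acme", "page": 1})
key_two = cache_key("github-mcp", "github__list_repos", {"page": 1, "owner": "acme"})
key_other = cache_key("github-mcp", "github__list_repos", {"owner": "acme", "page": 2})
```

Here `key_one` and `key_two` are identical, so the second call would be served from the cache, while `key_other` refers to a separate entry.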
Mutation Patterns
Tools matching mutation patterns always bypass the cache and invalidate related cached entries:
- create_*: new resource creation
- update_*: resource modification
- delete_*: resource deletion
When a mutation tool is called, MCPlexer:
- Bypasses the cache and forwards directly to the downstream server
- Invalidates cached entries for related read tools on the same server
Automatic invalidation
For example, calling github__create_issue will invalidate cached responses for github__list_issues and github__get_issue on the same server.
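The documentation does not spell out how "related" read tools are identified. One plausible reading of the github__create_issue example is that tools are matched by resource noun; the Python sketch below encodes that guess. The function name, the naive "append s" pluralization, and the inclusion of search_* are all assumptions made for illustration.

```python
MUTATION_PREFIXES = ("create_", "update_", "delete_")


def related_read_tools(tool):
    """Return read-tool names presumed related to a mutation tool, or []."""
    namespace, sep, name = tool.rpartition("__")
    for prefix in MUTATION_PREFIXES:
        if name.startswith(prefix):
            resource = name[len(prefix):]
            # Assumption: collection tools use a plural formed by adding "s".
            return [
                f"{namespace}{sep}get_{resource}",
                f"{namespace}{sep}list_{resource}s",
                f"{namespace}{sep}search_{resource}s",
            ]
    return []  # not a mutation tool: nothing to invalidate


targets = related_read_tools("github__create_issue")
```

Under these assumptions `targets` contains github__get_issue and github__list_issues, matching the example above. Irregular plurals or differently named tools would not be caught by this heuristic, which is one reason explicit invalidation rules can be useful.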
Targeted Invalidation Rules
For more control, you can define explicit invalidation rules that specify which cached tools should be cleared when a specific tool is called:

    downstream_servers:
      - id: github-mcp
        command: github-mcp-server
        cache:
          enabled: true
          ttl_seconds: 300
          invalidation_rules:
            - trigger: "github__merge_pull_request"
              invalidate:
                - "github__list_pull_requests"
                - "github__get_pull_request"
                - "github__list_reviews"

This ensures that merging a PR clears any cached PR listings and review data.
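As a rough illustration of how such rules could be evaluated, the Python sketch below looks up the rules whose trigger matches the called tool and drops cached entries for every tool they list. The function names and the cache layout (a dict keyed by tool name plus an argument string) are assumptions for the example, not MCPlexer internals.

```python
def tools_to_invalidate(rules, called_tool):
    """Collect every tool listed under rules whose trigger equals called_tool."""
    targets = []
    for rule in rules:
        if rule["trigger"] == called_tool:
            for name in rule["invalidate"]:
                if name not in targets:
                    targets.append(name)
    return targets


def apply_rules(cache, rules, called_tool):
    """Delete matching entries from cache and return how many were removed."""
    targets = set(tools_to_invalidate(rules, called_tool))
    stale = [key for key in cache if key[0] in targets]
    for key in stale:
        del cache[key]
    return len(stale)


rules = [
    {
        "trigger": "github__merge_pull_request",
        "invalidate": [
            "github__list_pull_requests",
            "github__get_pull_request",
            "github__list_reviews",
        ],
    }
]
# Hypothetical cache contents keyed by (tool name, argument string).
cache = {
    ("github__list_pull_requests", "state=open"): "cached PR list",
    ("github__get_pull_request", "number=7"): "cached PR",
    ("github__list_issues", "state=open"): "cached issue list",
}
removed = apply_rules(cache, rules, "github__merge_pull_request")
```

After the call, the two pull request entries are gone and the unrelated issue listing remains cached.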
Cache Busting
Force a fresh response for any individual tool call by passing _cache_bust: true in the arguments:

    {
      "tool": "github__list_repos",
      "arguments": {
        "owner": "acme",
        "_cache_bust": true
      }
    }

The _cache_bust parameter is stripped before forwarding to the downstream server. The fresh response replaces the existing cache entry.
When to cache-bust
Use cache busting when you know data has changed outside of MCPlexer (e.g., a manual edit in the GitHub UI) and need a fresh response.
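The cache-bust flow can be sketched in a few lines of Python. This is an illustration only: `split_cache_bust`, `call_tool`, and the stand-in downstream function are hypothetical names, and the cache key here is deliberately simplistic (it assumes flat, hashable argument values).

```python
def split_cache_bust(arguments):
    """Return (arguments_to_forward, bust_requested) without mutating the input."""
    forwarded = dict(arguments)
    bust = forwarded.pop("_cache_bust", False) is True
    return forwarded, bust


def call_tool(cache, forward, tool, arguments):
    """Serve from cache unless busting; a fresh response overwrites the entry."""
    forwarded, bust = split_cache_bust(arguments)
    key = (tool, tuple(sorted(forwarded.items())))  # simplistic key for flat arguments
    if not bust and key in cache:
        return cache[key]
    response = forward(tool, forwarded)
    cache[key] = response
    return response


# Usage with a stand-in downstream that records what it receives.
received = []


def fake_downstream(tool, args):
    received.append(args)
    return f"response #{len(received)}"


cache = {}
first = call_tool(cache, fake_downstream, "github__list_repos", {"owner": "acme"})
second = call_tool(cache, fake_downstream, "github__list_repos", {"owner": "acme"})
busted = call_tool(cache, fake_downstream, "github__list_repos",
                   {"owner": "acme", "_cache_bust": True})
after = call_tool(cache, fake_downstream, "github__list_repos", {"owner": "acme"})
```

Because the flag is removed before the key is computed, the busted call refreshes the same entry that ordinary calls read, so the call in `after` sees the fresh response without contacting the downstream again.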
Flush API
Clear cached entries programmatically via the REST API:
/api/v1/cache/flush

Body: {"layer": "tool_call|route|all", "server_id": "optional"}

| Name | Type | Default | Description |
|---|---|---|---|
| layer | string | - | Cache layer to flush: tool_call (response cache), route (route resolution cache), or all |
| server_id | string | - | Optional. Only flush entries for this downstream server |
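A flush request could be assembled from a script roughly as shown below, using only the Python standard library. Several details are assumptions: the HTTP method is not stated above (POST is a guess), the base URL `http://localhost:8080` is a placeholder for wherever your MCPlexer instance listens, and any authentication your deployment requires is omitted.

```python
import json
import urllib.request


def build_flush_request(base_url, layer, server_id=None):
    """Build (but do not send) a request for the cache flush endpoint."""
    if layer not in ("tool_call", "route", "all"):
        raise ValueError(f"unknown cache layer: {layer!r}")
    body = {"layer": layer}
    if server_id is not None:
        body["server_id"] = server_id  # restrict the flush to one downstream server
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/cache/flush",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",  # assumption: the HTTP method is not documented here
    )


# Placeholder base URL; replace with your own instance.
req = build_flush_request("http://localhost:8080", "tool_call", server_id="github-mcp")
# To actually send it: urllib.request.urlopen(req)
```

Building the request separately from sending it makes it easy to inspect the URL and body before pointing the script at a live instance.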
MCP Flush Tool
MCPlexer also exposes a built-in MCP tool for flushing the cache from within an AI session:
Cache Stats
Monitor cache performance via the stats endpoint:
/api/v1/cache/stats

    {
      "tool_call": {
        "hits": 1847,
        "misses": 523,
        "hit_rate": 0.779
      },
      "route_resolution": {
        "hits": 9214,
        "misses": 312,
        "hit_rate": 0.967
      }
    }

The dashboard displays these stats in the cache performance panel.
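The hit_rate values in the sample response are consistent with hits divided by total lookups, rounded to three decimal places. The short Python check below reproduces them; the formula is inferred from the sample numbers rather than taken from a specification, and the function name is illustrative.

```python
def hit_rate(hits, misses):
    """Fraction of lookups served from cache, rounded to three decimals."""
    total = hits + misses
    return round(hits / total, 3) if total else 0.0


tool_call_rate = hit_rate(1847, 523)  # 1847 / 2370
route_rate = hit_rate(9214, 312)      # 9214 / 9526
```

A persistently low tool_call hit rate may suggest that ttl_seconds is too short for the workload or that most calls carry unique arguments.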
Route Resolution Cache
In addition to tool call caching, MCPlexer caches route resolution results. When a tool call arrives, the routing engine resolves which downstream server and auth scope to use. This resolution is cached with a 30-second TTL per workspace to avoid repeated route matching for identical tool patterns.
The route resolution cache is automatically invalidated when route rules are created, updated, or deleted.
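The behavior described in this section can be sketched as a small per-workspace cache in Python. The class and method names are hypothetical, the cache is keyed by tool name as a simplifying assumption, and the resolver stands in for the real routing engine.

```python
import time


class RouteResolutionCache:
    """Per-workspace cache of resolved routes with a short TTL (illustrative)."""

    def __init__(self, ttl_seconds=30, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self._clock = clock
        self._by_workspace = {}  # workspace_id -> {tool_name: (expires_at, route)}

    def resolve(self, workspace_id, tool_name, resolver):
        routes = self._by_workspace.setdefault(workspace_id, {})
        cached = routes.get(tool_name)
        if cached is not None and self._clock() < cached[0]:
            return cached[1]
        route = resolver(workspace_id, tool_name)  # full route matching
        routes[tool_name] = (self._clock() + self.ttl_seconds, route)
        return route

    def invalidate(self, workspace_id=None):
        """Call when route rules are created, updated, or deleted."""
        if workspace_id is None:
            self._by_workspace.clear()
        else:
            self._by_workspace.pop(workspace_id, None)


lookups = []


def resolver(workspace_id, tool_name):
    lookups.append((workspace_id, tool_name))
    return ("github-mcp", "example-auth-scope")  # hypothetical (server, auth scope)


now = [0.0]
routes = RouteResolutionCache(clock=lambda: now[0])
routes.resolve("ws-1", "github__list_repos", resolver)  # miss: runs the resolver
routes.resolve("ws-1", "github__list_repos", resolver)  # hit within 30 seconds
now[0] = 31.0
routes.resolve("ws-1", "github__list_repos", resolver)  # expired: resolver runs again
routes.invalidate("ws-1")                               # e.g. a route rule changed
routes.resolve("ws-1", "github__list_repos", resolver)  # miss after invalidation
```

In this run the resolver is invoked three times across four calls: once initially, once after the TTL elapses, and once after the explicit invalidation.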
Tools/list Cache
MCPlexer caches the aggregated tools list (the response to tools/list requests) with a configurable TTL (default: 15 seconds). Adjust this in Settings > Tools List Cache TTL or via the tools_cache_ttl_sec setting. This prevents repeated tool discovery calls from hitting all downstream servers on every request.
Automatic Cache Invalidation via Notifications
When a downstream MCP server sends a notifications/tools/list_changed notification, MCPlexer automatically flushes the tools/list cache and forwards the notification to the connected AI client. This ensures tool listings stay fresh without manual intervention or waiting for TTL expiry.
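Putting the TTL and the notification handling together, a tools/list cache might look like the following Python sketch. The class, its method names, and the callback signatures are hypothetical; only the 15-second default and the notifications/tools/list_changed method name come from the text above.

```python
import time


class ToolsListCache:
    """Caches the aggregated tools list; flushed on list_changed notifications."""

    LIST_CHANGED = "notifications/tools/list_changed"

    def __init__(self, fetch_all_tools, ttl_seconds=15, clock=time.monotonic):
        self._fetch_all_tools = fetch_all_tools
        self.ttl_seconds = ttl_seconds
        self._clock = clock
        self._tools = None
        self._expires_at = 0.0

    def list_tools(self):
        if self._tools is None or self._clock() >= self._expires_at:
            self._tools = self._fetch_all_tools()  # queries every downstream server
            self._expires_at = self._clock() + self.ttl_seconds
        return self._tools

    def on_notification(self, method, forward_to_client):
        if method != self.LIST_CHANGED:
            return False  # other notification types are outside this sketch
        self._tools = None         # flush so the next tools/list refetches
        forward_to_client(method)  # pass the notification on to the AI client
        return True


fetches = []
forwarded = []


def fetch_all_tools():
    fetches.append(1)
    return ["github__list_repos", "github__create_issue"]


now = [0.0]
tools_cache = ToolsListCache(fetch_all_tools, clock=lambda: now[0])
tools_cache.list_tools()
tools_cache.list_tools()  # served from cache
tools_cache.on_notification(ToolsListCache.LIST_CHANGED, forwarded.append)
tools_cache.list_tools()  # refetched after the flush
```

The downstream servers are queried twice in this run: once for the initial listing and once after the notification, even though the 15-second TTL had not yet expired.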