Field Agent uses a lightweight secondary AI model to handle data-heavy tasks — file summaries, log triage, search filtering, and prompt optimisation. Run it locally via Ollama or let our cloud handle it. Two layers of savings that work with any MCP client.
These are actual recordings from real coding sessions — not simulations, not mockups. Same developer, same task, same AI model. The only difference is whether Field Agent is running. Every token count shown was measured.
These are the everyday scenarios that silently burn through your context window and your budget. Field Agent targets each one.
You ask the AI to understand a project. It reads 30+ files at full length — each one consuming thousands of tokens. By the time it has enough context to help, your window is half gone.
File reads return compressed summaries. The AI gets the structure and key symbols without consuming the raw source.
You paste 200 lines of build output. The AI reads the full error log, then reads the source files mentioned in the errors. A single debug cycle can consume 50K+ tokens.
Build output is compressed to just the actual errors. Source files are summarised. The AI diagnoses faster with less noise.
You copy-paste server logs or a crash report into your prompt. 200 lines of timestamped output enters your context at full size — most of it irrelevant.
Large inline content is detected and compressed before it reaches the AI. 200 lines become a 2-line summary with the key finding.
Renaming a function across 15 files. The AI reads every file, understands the dependencies, makes changes, then reads them again to verify. Each round-trip multiplies token usage.
The AI gets symbol maps and structural summaries instead of reading every file in full. Only the files being edited are read verbatim.
You run the test suite. 500 lines of test output, most of it passing tests. The AI processes all of it to find the 3 failures buried in the noise.
Test and typecheck output is compressed to failures only. The AI immediately sees what broke and where.
You ask the AI to review a diff. It reads the full diff, then reads every modified file for context. A 20-file PR can consume 100K+ tokens before the review even starts.
Diffs are compressed to per-file change summaries. Context files are summarised. The AI focuses on what changed and why, not raw line-by-line content.
Savings ranges are from measured benchmarks on real TypeScript codebases using Claude Opus 4.6 + Field Agent. Each scenario was run multiple times with and without Field Agent to establish the range. Actual results vary with codebase size and complexity.
File reads are the single biggest context consumer in AI-assisted coding. A typical session reads 20-50 files — each one eating into your context window. Field Agent intercepts these reads and returns a structured summary instead. The bigger the file, the bigger the savings.
| File size | Example | Without FA | With FA | Tokens saved |
|---|---|---|---|---|
| 50 lines | Small utility | 4,200 | 3,400 | 19% |
| 100 lines | Module | 7,800 | 5,600 | 28% |
| 250 lines | Route handler | 16,500 | 10,900 | 34% |
| 500 lines | Service class | 31,000 | 18,200 | 41% |
| 1,000 lines | Large module | 58,400 | 28,700 | 51% |
| 2,000 lines | Monolith file | 112,000 | 48,300 | 57% |
| 5,000 lines | Generated / bundle | 274,000 | 92,500 | 66% |
The summary is typically 15-30 lines regardless of file size. For a 2,000-line file, that's a 98%+ reduction in context consumed. Claude can always request the full file (or a specific line range) if the summary isn't enough.
Figures shown are illustrative projections based on early benchmarks and will be replaced with verified benchmark data before launch. Actual savings will vary by file complexity and content type.
A failing build dumps hundreds of lines of output into your context. Most of it is noise — passing tests, dependency resolution, compilation progress. Your AI only needs the actual errors. Field Agent strips the noise and delivers just the signal.
| Scenario | Raw output | After compression | Tokens saved |
|---|---|---|---|
| TypeScript build (3 errors in 200 files) | 18,400 | 2,100 | 89% |
| Jest test suite (5 failures in 120 tests) | 24,600 | 4,800 | 80% |
| ESLint run (12 warnings, 2 errors) | 8,200 | 1,900 | 77% |
| Docker build (layer cache miss) | 15,300 | 3,200 | 79% |
| CI pipeline log (full run) | 42,000 | 5,500 | 87% |
Figures shown are illustrative projections and will be replaced with verified benchmark data before launch. Actual compression depends on output verbosity and error density.
Developers paste server logs, crash reports, and stack traces directly into their prompts every day. A single paste can add 5,000-50,000 tokens of raw text — most of which is timestamp noise and repetitive log lines. Field Agent finds the signal and discards the rest.
| Content pasted | Raw tokens | After compression | Tokens saved |
|---|---|---|---|
| 200 lines of server logs (OOM crash) | 5,200 | 300 | 94% |
| Node.js stack trace (unhandled rejection) | 1,800 | 350 | 81% |
| 500 lines of nginx access logs | 14,500 | 800 | 94% |
| Kubernetes pod crash loop logs | 8,900 | 1,200 | 87% |
| Python traceback with dependency chain | 3,200 | 450 | 86% |
| Mixed: stack trace + 100 log lines + JSON error | 12,400 | 1,100 | 91% |
Figures shown are illustrative projections and will be replaced with verified benchmark data before launch. Actual savings depend on log density and content structure.
Reviewing a pull request with AI means reading every modified file plus the diff itself. A 20-file PR can consume over 100K tokens before the review even begins. Field Agent compresses diffs and context files so your AI focuses on what changed.
| PR scope | Without FA | With FA | Tokens saved |
|---|---|---|---|
| Small PR (3 files, 50 lines changed) | 12,400 | 7,800 | 37% |
| Medium PR (10 files, 200 lines changed) | 48,000 | 24,500 | 49% |
| Large PR (25 files, 600 lines changed) | 118,000 | 52,000 | 56% |
| Refactor PR (40 files, renames + moves) | 185,000 | 72,000 | 61% |
| Dependency upgrade (15 files + lockfile) | 92,000 | 38,000 | 59% |
Figures shown are illustrative projections and will be replaced with verified benchmark data before launch. Actual savings depend on PR complexity and file sizes.
File reads, pasted logs, build output, stack traces — Field Agent compresses everything via a secondary model before it reaches your primary AI. Measured savings of 40%+ on tool output, and 50-94% on inline content.
By compressing file reads, tool output, and inline content via a secondary model, you can work on complex tasks for longer without hitting context limits or triggering compaction.
File reads, grep results, log analysis, build triage, and typechecking are handled by a lightweight secondary model — not your primary AI. Local or cloud, extraction is fast.
Claude Code, Cursor, Windsurf, Continue, or any tool that supports MCP. Layer 1 works everywhere. Layer 2 adds even deeper savings when using GooDex.
Select your LLM provider, model, team size, and expected token reduction to see how much Field Agent could save your team each month and year.
What do you typically use AI for? (select all that apply)
Current monthly spend
$788
155M in + 22M out
Monthly savings
$213 – $441
27–56% fewer tokens
With Field Agent
$346 – $575
per month
Claude Sonnet 4: $3/MTok input, $15/MTok output
power user estimate · 5 licences required
Token usage estimates are based on a power-user profile (~8 hours/day of AI-assisted coding). Savings ranges are from measured benchmarks and engineering estimates per task type. Actual results will vary with file sizes, codebase complexity, and usage patterns. API prices reflect standard rates as of April 2026.
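The calculator's arithmetic is simple to reproduce. Here is a minimal sketch using the Claude Sonnet 4 rates listed above ($3/MTok input, $15/MTok output); the function names and structure are illustrative, not the calculator's actual implementation, and list rates ignore provider-specific adjustments such as caching discounts.

```typescript
// Illustrative sketch of the savings-calculator arithmetic.
// Rates are the Claude Sonnet 4 list prices shown above; names are hypothetical.
const INPUT_RATE = 3;   // $ per million input tokens
const OUTPUT_RATE = 15; // $ per million output tokens

/** Monthly spend for a given token volume (in millions of tokens). */
function monthlyCost(inputMTok: number, outputMTok: number): number {
  return inputMTok * INPUT_RATE + outputMTok * OUTPUT_RATE;
}

/** Cost range after applying a token-reduction range (e.g. 0.27 to 0.56). */
function costWithSavings(
  inputMTok: number,
  outputMTok: number,
  minReduction: number,
  maxReduction: number,
): [number, number] {
  const base = monthlyCost(inputMTok, outputMTok);
  return [base * (1 - maxReduction), base * (1 - minReduction)];
}

// 155M input + 22M output at list rates: 155*3 + 22*15 = $795/month.
const base = monthlyCost(155, 22);
const [low, high] = costWithSavings(155, 22, 0.27, 0.56);
```

With a 27–56% token reduction, the $795 list-rate baseline drops to roughly $350–$580 a month, in line with the range shown above.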
Field Agent registers these tools automatically. Your AI assistant discovers and uses them when they'll save context. Three categories: core extraction, code intelligence, and CLI proxies that run commands and return compressed output.
Read and summarise files via a secondary model — Claude sees a 20-line summary instead of 500 lines.
Compress large text blocks (logs, docs, command output) to their essence.
Fetch a URL and extract only what's relevant to your question.
Parse verbose build or test output and return just the actual errors.
Find errors and warnings in large logs related to a specific issue.
Re-rank search results by relevance — no more scrolling through noise.
Check if a patch target exists before editing — prevents the fail-retry cycle.
Extract all exported functions, classes, types, and constants from a file.
Compress a git diff into a per-file change summary with impact analysis.
Scan a directory structure and produce an architecture overview.
Classify an error by type and severity, suggest likely cause and fix.
Parse test runner output — extract pass/fail counts and failing test names.
Run git diff and return a compressed summary of changes.
Run git log and return a compressed commit history.
Run the build command and return only errors and warnings.
Run tests and return a compressed pass/fail summary.
Run the TypeScript typechecker and return only the errors.
Field Agent operates at two levels. Layer 1 optimises tool output — works with any MCP client. Layer 2 optimises your prompt content before it reaches the cloud — maximum savings.
Layer 1: Tool output optimisation
1. Your AI requests a 500-line file
2. Field Agent compresses it to a 20-line summary
3. Your AI uses the summary to reason
4. Requests full content only when needed
Savings: 27-58% on tool output
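The Layer 1 steps above can be sketched as a tool handler that returns a compressed summary by default and serves the full content (or a line range) only when the AI escalates. This is an illustrative sketch, not Field Agent's actual code; `summarise` stands in for the secondary-model call and all names are hypothetical.

```typescript
// Illustrative Layer 1 sketch: compress by default, escalate on demand.

interface ReadRequest {
  path: string;
  full?: boolean;           // the AI sets this when the summary isn't enough
  range?: [number, number]; // or requests a specific line range (1-indexed)
}

function summarise(content: string): string {
  // Placeholder for the secondary model (e.g. a local Ollama call).
  const lines = content.split("\n");
  return `[summary] ${lines.length} lines; first line: ${lines[0] ?? ""}`;
}

function handleRead(req: ReadRequest, readFile: (p: string) => string): string {
  const content = readFile(req.path);
  if (req.full) return content; // step 4: full content only when needed
  if (req.range) {
    const [start, end] = req.range;
    return content.split("\n").slice(start - 1, end).join("\n");
  }
  return summarise(content); // steps 1-3: summary by default
}
```

The key property is that the default path never puts the raw file into the primary model's context; escalation is an explicit second call.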
Layer 2: Content optimisation
1. You paste 200 lines of logs into your prompt
2. Field Agent detects and compresses it
3. Your AI receives a 2-line summary instead
4. Full content available on demand
Savings: 50-94% on inline content
When you paste logs, build output, stack traces, or code into your prompt, it enters the AI's context at full size. Layer 2 catches this content before it reaches the cloud — compresses it via a secondary model and sends a compact version to your primary AI. The full content remains available on demand.
// Example: 200 lines of server logs pasted into prompt
1. Field Agent detects large content in your prompt
2. Compresses via secondary model
3. Your AI receives: "OOM at 92% memory, npm install timeout"
4. Full content available if the AI needs more detail
Result: 5,200 → 300 tokens (94% savings)
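A threshold-based detector for the flow above might look like the following sketch. The cutoff value, the blank-line splitting heuristic, and all names are assumptions for illustration, not Field Agent's actual detection logic.

```typescript
// Illustrative Layer 2 sketch: find large pasted blocks in a prompt and
// replace each one with a compressed stand-in.

const MAX_INLINE_LINES = 50; // assumed cutoff for "large" pasted content

function compressBlock(block: string): string {
  // Placeholder for the secondary-model compression call.
  const lines = block.split("\n");
  return `[compressed: ${lines.length} lines; full content available on request]`;
}

/** Split a prompt on blank lines and compress any oversized block. */
function optimisePrompt(prompt: string): string {
  return prompt
    .split("\n\n")
    .map((block) =>
      block.split("\n").length > MAX_INLINE_LINES ? compressBlock(block) : block,
    )
    .join("\n\n");
}
```

Short prompts pass through untouched; only blocks over the threshold pay the compression round-trip.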
Layer 1: Tool optimisation (any client)
Saves 27-58% on tool output
Layer 2: Content optimisation
Saves 50-94% on inline content
There are other ways to deal with context limits. Here's how they compare to Field Agent's compression-first approach.
| Approach | How it works | Limitations | Field Agent |
|---|---|---|---|
| Use a bigger context window | Models like Claude (200K) or Gemini (1M+) accept more tokens | More tokens = higher cost and slower responses. Attention quality degrades with length. Doesn't reduce spend. | Reduces what enters the window, so you get more useful work per token regardless of window size |
| Prompt caching | Cache repeated prefixes (system prompt, tool definitions) for 50-90% input discount | Only saves on repeated content. New file reads, search results, and inline content are never cached. | Compresses the non-cacheable parts — file reads, tool output, pasted content. Stacks with prompt caching. |
| Summarise manually | Copy-paste file contents into a separate tool, summarise, paste back | Breaks flow. Time-consuming. You still consume tokens to summarise. Doesn't scale to 20-50 file reads per session. | Automatic — intercepts tool calls and compresses inline. Zero manual effort, works on every file read. |
| Read less code | Only read the specific lines you need, not whole files | Requires knowing which lines matter before you read them. The AI often reads full files to understand context. | Compressed reads give the AI enough context to decide what to read in full — only escalates when needed. |
| Reference files instead of including them | Tell the AI "look at src/auth.ts" instead of pasting its content | The AI still reads the full file to follow the reference — every line enters the context window. Multiple references compound quickly. | References are intercepted automatically. The AI receives a compressed summary and only fetches full content for the specific sections it needs. |
| Use a cheaper model | Switch from Opus/GPT-4 to Haiku/GPT-4-mini for routine tasks | Lower quality output. Smaller models struggle with complex multi-file reasoning. Manual model switching. | Keep your best model for reasoning. Field Agent handles the data-heavy extraction on a lightweight secondary model — locally or in our cloud. |
Field Agent is complementary to these approaches — it stacks with prompt caching, works with any context window size, and runs alongside any model. The savings compound.
Field Agent compresses code, logs, and tool output regardless of language. These are the languages and frameworks we officially test and optimise for — but it works with any programming language your AI assistant supports.
This is not an exhaustive list. Field Agent's compression works at the text and structure level, so any language or framework supported by your AI assistant will benefit. We continuously add language-specific optimisations based on user demand.
One command or config change. Field Agent works with any MCP-compatible client — pick your platform below.
Run this command
claude mcp add field-agent -- npx -y @goodex/field-agent-mcp
Restart your session after adding. Field Agent is immediately available.
The MCP server handles all setup automatically. No manual model downloads or configuration required.
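For clients configured through a JSON file rather than a CLI, the equivalent entry typically looks like the sketch below. The `mcpServers` key is the common convention, but the exact file location and top-level key vary by client, so check your client's MCP documentation.

```json
{
  "mcpServers": {
    "field-agent": {
      "command": "npx",
      "args": ["-y", "@goodex/field-agent-mcp"]
    }
  }
}
```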
Field Agent typically saves 40-94% on cloud AI token costs. A team of 5 developers on Claude Code saves $400-800/month — the subscription pays for itself many times over. Start with a 14-day free trial.
40-94%
measured token savings
$80-120
saved/month per developer
10-20x
typical ROI on Basic plan
or $490/year (2 months free)
Core extraction tools powered by Ollama on your machine. Saves $80-120/month on a typical Claude or GPT workflow.
Start 14-day free trial
No credit card required
What's included
or $1,490/year (2 months free)
Full tool suite with cloud-hosted inference — no local Ollama required. Typically saves $400-1,600/month on team AI spend.
Start 14-day free trial
No credit card required
Everything in Basic, plus
volume discounts at 50+ and 100+ seats
Unlimited tokens, custom tasks, SLA, and dedicated support for teams saving $8,000+/month on AI spend. Built on patent-pending technology.
Contact sales
Everything in Pro, plus
| Feature | Basic | Pro | Enterprise |
|---|---|---|---|
| Price | $49/mo | $149/mo | From $49/user/mo |
| Annual price | $490/yr | $1,490/yr | Custom |
| Monthly token limit | 10M | 100M | Unlimited |
| Overage pricing | — | $10 / 10M tokens | Volume rates |
| Core extraction tasks (7) | ✓ | ✓ | ✓ |
| Code intelligence tasks (5) | — | ✓ | ✓ |
| CLI proxy tasks (5) | — | ✓ | ✓ |
| Custom task types | — | — | ✓ |
| Smart file reads | ✓ | ✓ | ✓ |
| Lifecycle hooks | 4 basic | All 8 | All + custom |
| MCP client support | ✓ | ✓ | ✓ |
| Local mode (Ollama) | ✓ | ✓ | ✓ |
| Cloud mode (no local setup) | — | ✓ | ✓ |
| Priority support | — | ✓ | ✓ |
| Team management | — | — | ✓ |
| SSO & audit logs | — | — | ✓ |
| On-premise deployment | — | — | ✓ |
| SLA | — | — | ✓ |
All plans include a 14-day free trial. Token limits reset monthly. Annual billing saves 2 months. Enterprise volume discounts available at 50+ and 100+ seats.
Free tier includes 5M tokens/month. No credit card required.
In local mode, inference adds 1-3 seconds per tool call on Apple Silicon. In cloud mode (Pro/Enterprise), latency is comparable. But because Field Agent reduces total tokens consumed, your overall session is often faster — fewer round-trips to the primary AI, fewer compaction events, and the AI reaches answers in fewer turns. Benchmarks show the net effect is neutral to positive on wall-clock time.
In local mode (Basic), nothing leaves your machine. Your files are read and summarised by Ollama on your hardware — only the compressed summary reaches your primary AI. In cloud mode (Pro/Enterprise), file content is sent to our secure API for processing via a lightweight secondary model. Content is processed in memory and never stored or logged.
Field Agent degrades gracefully. In local mode, if Ollama is unreachable or the model isn't available, it falls back to standard tool behaviour. In cloud mode, our infrastructure handles availability. Either way, your AI assistant reads files and processes output directly — just without the compression. No errors, no broken workflows.
Field Agent works with any tool that supports the Model Context Protocol (MCP): Claude Code, Claude Desktop, Cursor, VS Code (Copilot), Windsurf, Gemini CLI, and more. Layer 1 (tool optimisation) works everywhere. Layer 2 (content optimisation) provides additional savings and will expand to more platforms.
Summaries include the file's purpose, key exported symbols with type signatures, import dependencies, and line count. On our benchmark suite, qwen3:8b scores 86/100 on summary quality across 14 task types. Claude can always request the full file or a specific line range if the summary isn't sufficient — this happens in roughly 15-20% of cases.
Yes, on the Basic plan. Field Agent works with any Ollama model. We benchmark and recommend qwen3:8b for the best speed/quality balance (86/100, 25 tok/s on M4 Pro), but qwen3:14b (88/100), gemma3:12b (90/100), and even qwen3:1.7b (81/100, 75 tok/s) all work. On Pro/Enterprise, our cloud-hosted model is pre-configured — no local setup needed.
Basic ($49/mo) gives you the 7 core extraction tools running locally via Ollama, with 10M tokens/month. Pro ($149/mo) unlocks all 17 tools including code intelligence, CLI proxies, and cloud-hosted inference — no local model or GPU required. Pro includes 100M tokens/month and priority support. Both include a 14-day free trial.
If you're on a fixed subscription (Claude Pro, Max, Teams, Cursor, Copilot), you have a set token allowance. Field Agent makes each token go further — with 40% savings, your 1M token allowance effectively becomes 1.7M tokens of work. You get longer sessions, fewer compaction events, and more headroom before hitting limits. Use our savings calculator to see the exact multiplier for your plan.
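The multiplier in that claim is just allowance ÷ (1 − savings rate). A quick sketch (the function name is illustrative):

```typescript
// Effective work a fixed token allowance buys at a given savings rate.
function effectiveAllowance(allowanceTokens: number, savingsRate: number): number {
  return allowanceTokens / (1 - savingsRate);
}

// A 1M-token allowance at 40% savings buys ~1.67M tokens of work.
const effective = effectiveAllowance(1_000_000, 0.4);
```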
Field Agent is under active development. Here's what's new.
Cloud mode
Full-cloud inference — no local model or GPU required. Available on Pro and Enterprise plans.
Three operational modes
Choose local, partial-cloud, or full-cloud depending on your privacy and performance needs.
Tiered access control
Basic, Pro, and Enterprise tiers with different tool and token allowances.
Usage tracking
Per-user token consumption tracking with monthly quota management.
Consolidated MCP interface
Single dispatch tool replaces 10 individual tools — 68% less schema overhead, faster tool discovery.
Editor lifecycle integration
8 hooks for automatic optimisation of tool output, prompts, and context across your coding session.
Large file handling
Improved compression for files of any size, with automatic chunking for very large files.
Code intelligence tools
5 new tasks: symbol extraction, diff compression, directory summaries, error classification, test result parsing.
CLI proxy tools
5 new tasks: compressed output from git, build, test, typecheck, and lint commands.
Initial release
7 core extraction tasks: file summarisation, content compression, web analysis, build triage, log analysis, search filtering, patch validation.
MCP server
Works with Claude Code, Cursor, VS Code, Windsurf, Gemini CLI, and any MCP-compatible client.
Smart file reads
File reads return compressed summaries by default, with full content available on demand.
Intelligent escalation
AI receives compressed data first and seamlessly requests more detail when needed.
Let a secondary model handle the heavy lifting. Your primary AI does the thinking.