mrgoonie

context-engineering

@mrgoonie/context-engineering
mrgoonie
1,108
227 forks
Updated 1/6/2026
View on GitHub

Master context engineering for AI agent systems. Use when designing agent architectures, debugging context failures, optimizing token usage, implementing memory systems, building multi-agent coordination, evaluating agent performance, or developing LLM-powered pipelines. Covers context fundamentals, degradation patterns, optimization techniques (compaction, masking, caching), compression strategies, memory architectures, multi-agent patterns, LLM-as-Judge evaluation, tool design, and project development.

Installation

$skills install @mrgoonie/context-engineering
Claude Code
Cursor
Copilot
Codex
Antigravity

Details

Path.claude/skills/context-engineering/SKILL.md
Branchmain
Scoped Name@mrgoonie/context-engineering

Usage

After installing, this skill will be available to your AI coding assistant.

Verify installation:

skills list

Skill Instructions


name: context-engineering description: >- Master context engineering for AI agent systems. Use when designing agent architectures, debugging context failures, optimizing token usage, implementing memory systems, building multi-agent coordination, evaluating agent performance, or developing LLM-powered pipelines. Covers context fundamentals, degradation patterns, optimization techniques (compaction, masking, caching), compression strategies, memory architectures, multi-agent patterns, LLM-as-Judge evaluation, tool design, and project development. version: 1.0.0

Context Engineering

Context engineering curates the smallest high-signal token set for LLM tasks. The goal: maximize reasoning quality while minimizing token usage.

When to Activate

  • Designing/debugging agent systems
  • Context limits constrain performance
  • Optimizing cost/latency
  • Building multi-agent coordination
  • Implementing memory systems
  • Evaluating agent performance
  • Developing LLM-powered pipelines

Core Principles

  1. Context quality > quantity - High-signal tokens beat exhaustive content
  2. Attention is finite - U-shaped curve favors beginning/end positions
  3. Progressive disclosure - Load information just-in-time
  4. Isolation prevents degradation - Partition work across sub-agents
  5. Measure before optimizing - Know your baseline

Quick Reference

TopicWhen to UseReference
FundamentalsUnderstanding context anatomy, attention mechanicscontext-fundamentals.md
DegradationDebugging failures, lost-in-middle, poisoningcontext-degradation.md
OptimizationCompaction, masking, caching, partitioningcontext-optimization.md
CompressionLong sessions, summarization strategiescontext-compression.md
MemoryCross-session persistence, knowledge graphsmemory-systems.md
Multi-AgentCoordination patterns, context isolationmulti-agent-patterns.md
EvaluationTesting agents, LLM-as-Judge, metricsevaluation.md
Tool DesignTool consolidation, description engineeringtool-design.md
PipelinesProject development, batch processingproject-development.md

Key Metrics

  • Token utilization: Warning at 70%, trigger optimization at 80%
  • Token variance: Explains 80% of agent performance variance
  • Multi-agent cost: ~15x single agent baseline
  • Compaction target: 50-70% reduction, <5% quality loss
  • Cache hit target: 70%+ for stable workloads

Four-Bucket Strategy

  1. Write: Save context externally (scratchpads, files)
  2. Select: Pull only relevant context (retrieval, filtering)
  3. Compress: Reduce tokens while preserving info (summarization)
  4. Isolate: Split across sub-agents (partitioning)

Anti-Patterns

  • Exhaustive context over curated context
  • Critical info in middle positions
  • No compaction triggers before limits
  • Single agent for parallelizable tasks
  • Tools without clear descriptions

Guidelines

  1. Place critical info at beginning/end of context
  2. Implement compaction at 70-80% utilization
  3. Use sub-agents for context isolation, not role-play
  4. Design tools with 4-question framework (what, when, inputs, returns)
  5. Optimize for tokens-per-task, not tokens-per-request
  6. Validate with probe-based evaluation
  7. Monitor KV-cache hit rates in production
  8. Start minimal, add complexity only when proven necessary

Scripts

More by mrgoonie

View all
ai-multimodal
1,108

Process and generate multimedia content using Google Gemini API. Capabilities include analyze audio files (transcription with timestamps, summarization, speech understanding, music/sound analysis up to 9.5 hours), understand images (captioning, object detection, OCR, visual Q&A, segmentation), process videos (scene detection, Q&A, temporal analysis, YouTube URLs, up to 6 hours), extract from documents (PDF tables, forms, charts, diagrams, multi-page), generate images (text-to-image, editing, composition, refinement). Use when working with audio/video files, analyzing images or screenshots, processing PDF documents, extracting structured data from media, creating images from text prompts, or implementing multimodal AI features. Supports multiple models (Gemini 2.5/2.0) with context windows up to 2M tokens.

root-cause-tracing
1,108

Systematically trace bugs backward through call stack to find original trigger

databases
1,108

Work with MongoDB (document database, BSON documents, aggregation pipelines, Atlas cloud) and PostgreSQL (relational database, SQL queries, psql CLI, pgAdmin). Use when designing database schemas, writing queries and aggregations, optimizing indexes for performance, performing database migrations, configuring replication and sharding, implementing backup and restore strategies, managing database users and permissions, analyzing query performance, or administering production databases.

chrome-devtools
1,108

Browser automation, debugging, and performance analysis using Puppeteer CLI scripts. Use for automating browsers, taking screenshots, analyzing performance, monitoring network traffic, web scraping, form automation, and JavaScript debugging.