This skill should be used when working on Lean 4 formalization projects to maintain persistent memory of successful proof patterns, failed approaches, project conventions, and user preferences across sessions using MCP memory server integration
Installation
Details
Usage
After installing, this skill will be available to your AI coding assistant.
Verify installation:
skills listSkill Instructions
name: lean4-memories description: This skill should be used when working on Lean 4 formalization projects to maintain persistent memory of successful proof patterns, failed approaches, project conventions, and user preferences across sessions using MCP memory server integration
Lean 4 Memories
Overview
This skill enables persistent learning and knowledge accumulation across Lean 4 formalization sessions by leveraging MCP (Model Context Protocol) memory servers. It transforms stateless proof assistance into a learning system that remembers successful patterns, avoids known dead-ends, and adapts to project-specific conventions.
Core principle: Learn from each proof session and apply accumulated knowledge to accelerate future work.
When to Use This Skill
This skill applies when working on Lean 4 formalization projects, especially:
- Multi-session projects - Long-running formalizations spanning days/weeks/months
- Repeated proof patterns - Similar theorems requiring similar approaches
- Complex proofs - Theorems with multiple attempted approaches
- Team projects - Shared knowledge across multiple developers
- Learning workflows - Building up domain-specific proof expertise
Especially important when:
- Starting a new session on an existing project
- Encountering a proof pattern similar to previous work
- Trying an approach that previously failed
- Needing to recall project-specific conventions
- Building on successful proof strategies from earlier sessions
How Memory Integration Works
Memory Scoping
All memories are scoped by:
- Project path - Prevents cross-project contamination
- Skill context - Memories tagged with
lean4-memories - Entity type - Structured by pattern type (ProofPattern, FailedApproach, etc.)
Example scoping:
Project: /Users/freer/work/exch-repos/exchangeability-cursor
Skill: lean4-memories
Entity: ProofPattern:condExp_unique_pattern
Memory Types
1. ProofPattern - Successful proof strategies
Store when: Proof completes successfully after exploration
Retrieve when: Similar goal pattern detected
2. FailedApproach - Known dead-ends to avoid
Store when: Approach attempted but failed/looped/errored
Retrieve when: About to try similar approach
3. ProjectConvention - Code style and patterns
Store when: Consistent pattern observed (naming, structure, tactics)
Retrieve when: Creating new definitions/theorems
4. UserPreference - Workflow customization
Store when: User expresses preference (verbose output, specific tools, etc.)
Retrieve when: Choosing between options
5. TheoremDependency - Relationships between theorems
Store when: One theorem proves useful for proving another
Retrieve when: Looking for helper lemmas
Memory Workflows
Storing Memories
After successful proof:
-- Just proved: exchangeable_iff_fullyExchangeable
-- Store the successful pattern
Store:
- Goal pattern:
exchangeable X ↔ fullyExchangeable X - Successful tactics:
[apply measure_eq_of_fin_marginals_eq, intro, simp] - Helper lemmas used:
[prefixCylinder_measurable, isPiSystem_prefixCylinders] - Difficulty: medium (54 lines)
- Confidence: high (proof clean, no warnings)
After failed approach:
-- Attempted: simp only [condExp_indicator, mul_comm]
-- Result: infinite loop, build timeout
Store:
- Failed tactic:
simp only [condExp_indicator, mul_comm] - Error: "infinite simp loop"
- Context: conditional expectation with indicator
- Recommendation: "Use simp only [condExp_indicator] without mul_comm"
Project conventions observed:
-- Pattern: All measure theory proofs start with haveI
haveI : MeasurableSpace Ω := inferInstance
Store:
- Convention: "Measure theory proofs require explicit MeasurableSpace instance"
- Pattern:
haveI : MeasurableSpace Ω - Frequency: 15 occurrences
- Files: DeFinetti/ViaL2.lean, Core.lean, Contractability.lean
Retrieving Memories
Starting new proof session:
- Load project-specific conventions
- Retrieve similar proof patterns from past work
- Surface any known issues with current file/module
Encountering similar goal:
⊢ condExp μ m X =ᵐ[μ] condExp μ m Y
Memory retrieved: "Similar goals proved using condExp_unique"
Pattern: "Show ae_eq, verify measurability, apply condExp_unique"
Success rate: 3/3 in this project
Before trying a tactic:
About to: simp only [condExp_indicator, mul_comm]
Memory retrieved: ⚠️ WARNING - This combination causes infinite loop
Failed in: ViaL2.lean:2830 (2025-10-17)
Alternative: Use simp only [condExp_indicator], then ring
Integration with lean4-theorem-proving Skill
The lean4-memories skill complements (doesn't replace) lean4-theorem-proving:
lean4-theorem-proving provides:
- General Lean 4 workflows (4-Phase approach)
- mathlib search and tactics reference
- Automation scripts
- Domain-specific knowledge (measure theory, probability)
lean4-memories adds:
- Project-specific learned patterns
- History of what worked/failed in this project
- Accumulated domain expertise from your proofs
- Personalized workflow preferences
Use together:
- lean4-theorem-proving guides general workflow
- lean4-memories provides project-specific context
- Memories inform tactics choices from lean4-theorem-proving
Memory Operations
Storing a Successful Proof Pattern
After completing a proof, store the pattern using MCP memory:
What to capture:
- Goal pattern - Type/structure of goal (equality, exists, forall, etc.)
- Tactics sequence - Tactics that worked, in order
- Helper lemmas - Key lemmas applied
- Difficulty - Lines of proof, complexity estimate
- Confidence - Clean proof vs sorries/warnings
- Context - File, module, theorem name
When to store:
- Proof completed successfully (no sorries)
- Non-trivial (>10 lines or required exploration)
- Likely to be useful again (similar theorems expected)
Storage format:
Entity type: ProofPattern
Name: {descriptive_name}
Attributes:
- project: {absolute_path}
- goal_pattern: {pattern_description}
- tactics: [list, of, tactics]
- helper_lemmas: [lemma1, lemma2]
- difficulty: {small|medium|large}
- confidence: {0.0-1.0}
- file: {filename}
- timestamp: {date}
Storing a Failed Approach
When an approach fails (error, loop, timeout), store to avoid repeating:
What to capture:
- Failed tactic - Exact tactic/sequence that failed
- Error type - Loop, timeout, type error, etc.
- Context - What was being proved
- Alternative - What worked instead (if known)
When to store:
- Infinite simp loops
- Tactics causing build timeouts
- Type mismatches from subtle issues
- Approaches that seemed promising but didn't work
Storage format:
Entity type: FailedApproach
Name: {descriptive_name}
Attributes:
- project: {absolute_path}
- failed_tactic: {tactic_text}
- error: {error_description}
- context: {what_was_being_proved}
- alternative: {what_worked}
- timestamp: {date}
Storing Project Conventions
Track consistent patterns that emerge:
What to capture:
- Naming conventions - h_ for hypotheses, have_ for results
- Proof structure - Standard opening moves (haveI, intro patterns)
- Import patterns - Commonly used imports
- Tactic preferences - measurability vs explicit proofs
When to store:
- Pattern observed 3+ times consistently
- Convention affects multiple files
- Style guide established
Retrieving Memories
Before starting proof:
1. Query for similar goal patterns
2. Surface successful tactics for this pattern
3. Check for known issues with current context
4. Suggest helper lemmas from similar proofs
During proof:
1. Before each major tactic, check for known failures
2. When stuck, retrieve alternative approaches
3. Suggest next tactics based on past success
Query patterns:
# Find similar proofs
search_entities(
query="condExp equality goal",
filters={"project": current_project, "entity_type": "ProofPattern"}
)
# Check for failures
search_entities(
query="simp only condExp_indicator",
filters={"project": current_project, "entity_type": "FailedApproach"}
)
# Get conventions
search_entities(
query="naming conventions measure theory",
filters={"project": current_project, "entity_type": "ProjectConvention"}
)
Best Practices
Memory Quality
DO store:
- ✅ Successful non-trivial proofs (>10 lines)
- ✅ Failed approaches that wasted significant time
- ✅ Consistent patterns observed multiple times
- ✅ Project-specific insights
DON'T store:
- ❌ Trivial proofs (rfl, simp, exact)
- ❌ One-off tactics unlikely to recur
- ❌ General Lean knowledge (already in training/mathlib)
- ❌ Temporary workarounds
Memory Hygiene
Confidence scoring:
- High (0.8-1.0) - Clean proof, no warnings, well-tested
- Medium (0.5-0.8) - Works but has minor issues
- Low (0.0-0.5) - Hacky solution, needs refinement
Aging:
- Recent memories (same session) = higher relevance
- Older memories = verify still applicable
- Patterns from many sessions = high confidence
Pruning:
- Remove memories for deleted theorems
- Update when better approach found
- Mark as outdated if project evolves
User Control
Users can:
- Toggle lean4-memories skill on/off independently
- Clear project-specific memories
- Review stored memories
- Adjust confidence thresholds
- Export/import memories for sharing
Example Workflow
Session 1: First proof
-- Proving: measure_eq_of_fin_marginals_eq
-- No memories yet, explore from scratch
-- [After 30 minutes of exploration]
-- ✅ Success with π-system uniqueness approach
Store: ProofPattern "pi_system_uniqueness"
- Works for: measure equality via finite marginals
- Tactics: [isPiSystem, generateFrom_eq, measure_eq_on_piSystem]
- Confidence: 0.9
Session 2: Similar theorem (weeks later)
-- Proving: fullyExchangeable_via_pathLaw
-- Goal: Show two measures equal
-- System: "Similar to measure_eq_of_fin_marginals_eq"
-- Retrieve memory: pi_system_uniqueness pattern
-- Suggestion: "Try isPiSystem approach?"
-- ✅ Success in 5 minutes using remembered pattern
Session 3: Avoiding failure
-- Proving: condIndep_of_condExp_eq
-- About to: simp only [condExp_indicator, mul_comm]
-- ⚠️ Memory: This causes infinite loop (stored Session 1)
-- Alternative: simp only [condExp_indicator], then ring
-- Avoid 20-minute debugging session by using memory
Configuration
Memory Server Setup
Ensure MCP memory server is configured:
// In Claude Desktop config
{
"mcpServers": {
"memory": {
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-memory"]
}
}
}
Project-Specific Settings
Memories are automatically scoped by project path. To work across multiple projects:
Same formalization, different repos:
# Link memories using project aliases
# (Future enhancement - not yet implemented)
Sharing memories with team:
# Export/import functionality
# (Future enhancement - not yet implemented)
Integration with Automation Scripts
Memories enhance script usage:
proof_templates.sh:
- Retrieve project-specific template preferences
- Include common proof patterns in scaffolding
suggest_tactics.sh:
- Prioritize tactics that succeeded in this project
- Warn about tactics with known issues
sorry_analyzer.py:
- Link sorries to similar completed proofs
- Suggest approaches based on memory
Limitations and Caveats
What memories DON'T replace:
- Mathematical understanding
- Lean type system knowledge
- mathlib API documentation
- Formal verification principles
Potential issues:
- Stale memories if project evolves significantly
- Over-fitting to specific project patterns
- Memory bloat if not maintained
- Cross-project contamination if scoping fails
Mitigation:
- Regular review of stored memories
- Confidence scoring and aging
- Strict project-path scoping
- User control over memory operations
Future Enhancements
Planned features:
- Memory visualization dashboard
- Pattern mining across projects
- Collaborative memory sharing
- Automated memory pruning
- Integration with git history
- Cross-project pattern detection (with user consent)
See Also
- lean4-theorem-proving skill - Core workflows and automation
- MCP memory server docs - https://modelcontextprotocol.io/docs/getting-started/intro
- references/memory-patterns.md - Detailed memory operation examples
More by benchflow-ai
View allRepair an (often imperfect) Flexible Job Shop Scheduling baseline into a downtime-feasible, precedence-correct schedule while staying within policy budgets and matching the evaluator’s exact metrics and “local minimal right-shift” checks.
Test Temporal workflows with pytest, time-skipping, and mocking strategies. Covers unit testing, integration testing, replay testing, and local development setup. Use when implementing Temporal workflow tests or debugging test failures.
Extract locational marginal prices (LMPs) from DC-OPF solutions using dual values. Use when computing nodal electricity prices, reserve clearing prices, or performing price impact analysis.
This skill should be used when the user asks to "design package structure", "create managed package", "configure 2GP", "set up namespace", "version management", or mentions managed package topics like "LMA", "subscriber orgs", or "package versioning". Provides comprehensive guidance for second-generation managed package (2GP) architecture, ISV development patterns, and package lifecycle management.
