name: codebase-classification description: Classify codebases before modification to choose appropriate development approach

Codebase Classification Skill

Analyze and classify codebases before making changes to ensure appropriate development approach.

Overview

Before modifying any codebase, classify it to determine whether to:

Follow existing patterns (Disciplined)
Gradually improve while following conventions (Transitional)
Propose improvements to legacy patterns (Legacy)
Establish best practices from scratch (Greenfield)

Classification Types

1. Disciplined Codebase

Signals:

Consistent code style (formatting, naming conventions)
Comprehensive test coverage (>70%)
Clear module boundaries and interfaces
Type hints/annotations throughout
Up-to-date dependencies
Active CI/CD pipeline
Good documentation (README, docstrings)

Approach: Follow existing patterns strictly. Don't introduce new conventions.

2. Transitional Codebase

Signals:

Mixed code quality (some areas good, others not)
Partial test coverage (30-70%)
Some type hints, inconsistent usage
Active development with modernization efforts
Dependencies somewhat current

Approach: Follow existing conventions in touched areas. Propose improvements for new code.

3. Legacy Codebase

Signals:

Inconsistent patterns across the codebase
Minimal or no tests (<30% coverage)
No type hints
Outdated dependencies
Complex, undocumented logic
Possibly unmaintained

Approach: Be careful with changes. Add tests before modifying. Propose gradual improvements.

4. Greenfield Codebase

Signals:

New project (<6 months old)
Few files (<20 source files)
No established patterns yet
Minimal or no tests (but not legacy)
Active initial development

Approach: Establish best practices from the start. Set up proper structure, testing, CI.

Quick Classification Checklist

Run this analysis before making significant changes:

1. Check test coverage: Is there a test/ or tests/ directory? How comprehensive?
2. Check type hints: Are functions annotated? Is there py.typed or mypy config?
3. Check CI/CD: Is there .github/workflows/, .gitlab-ci.yml, or similar?
4. Check code style: Is there .pre-commit-config.yaml, ruff.toml, or similar?
5. Check dependencies: When was requirements.txt/pyproject.toml last updated?
6. Check documentation: Is there a comprehensive README? API docs?

Decision Matrix

Signal	Disciplined	Transitional	Legacy	Greenfield
Test coverage	>70%	30-70%	<30%	Varies (new)
Type hints	Comprehensive	Partial	None/minimal	Varies
CI/CD	Active	Present	None/broken	May be new
Code style	Consistent	Mixed	Inconsistent	Establishing
Dependencies	Current	Somewhat current	Outdated	Latest
Age	Any	Any	Usually old	<6 months

Behavior Guidelines

When Disciplined

Study existing patterns before writing new code
Match naming conventions exactly
Follow established module structure
Add tests matching existing test style
Don't propose architectural changes without strong justification

When Transitional

Follow patterns in the specific area you're modifying
Match quality of surrounding code or slightly better
Add tests for new functionality
Document rationale for any pattern deviations

When Legacy

Add tests BEFORE modifying code
Make minimal changes to achieve goal
Document assumptions and findings
Propose improvements as separate follow-up work
Be extra careful with untested code paths

When Greenfield

Establish best practices immediately
Set up proper project structure
Configure linting, formatting, type checking
Write tests for new functionality
Create comprehensive documentation

Examples

Identifying Disciplined Codebase

$ ls -la
pyproject.toml          # Modern packaging
.pre-commit-config.yaml # Style enforcement
mypy.ini                # Type checking
.github/workflows/      # CI/CD

$ wc -l tests/**/*.py
2500 total              # Substantial tests

→ Classification: DISCIPLINED
→ Approach: Follow existing patterns strictly

Identifying Legacy Codebase

$ ls -la
setup.py                # Old-style packaging
requirements.txt        # Pinned 3 years ago
# No tests directory
# No CI configuration

$ grep -r "def " src/ | head -5
def process_data(x):    # No type hints
def handle_input(data): # No docstrings

→ Classification: LEGACY
→ Approach: Careful changes, add tests first

Integration

This skill helps agents:

Avoid imposing new patterns on well-structured codebases
Avoid perpetuating bad patterns in legacy codebases
Make appropriate improvement suggestions
Set up proper structure for new projects

Tool: shell (for running analysis commands)
Tool: read (for examining codebase structure)

codebase-classification

Installation

Details

Usage

Skill Instructions

name: codebase-classification description: Classify codebases before modification to choose appropriate development approach

Codebase Classification Skill

Overview

Classification Types

1. Disciplined Codebase

2. Transitional Codebase

3. Legacy Codebase

4. Greenfield Codebase

Quick Classification Checklist

Decision Matrix

Behavior Guidelines

When Disciplined

When Transitional

When Legacy

When Greenfield

Examples

Identifying Disciplined Codebase

Identifying Legacy Codebase

Integration

Related

More by gptme