When given a project codebase, this skill identifies the important functions in the codebase for later action.
Installation
Details
Usage
After installing, this skill will be available to your AI coding assistant.
Verify installation:
skills list

Skill Instructions

name: discover-important-function
description: "When given a project codebase, this skill observes the important functions in the codebase for future action."
Fuzz Target Localizer
Purpose
This skill helps an agent quickly narrow a Python repository down to a small set of high-value fuzz targets. It produces:
- A ranked list of important files
- A ranked list of important functions and methods
- A summary of existing unit tests and inferred oracles
- A final shortlist of functions-under-test (FUTs) for fuzzing, each with a structured "note to self" for follow-up actions
When to use
Use this skill when the user asks to:
- Find the best functions to fuzz in a Python package/library
- Identify parsers, decoders, validators, or boundary code that is fuzz-worthy
- Decide what to fuzz based on existing test coverage and API surface
- Produce a structured shortlist of fuzz targets with harness guidance
- Automate testing pipeline setup for a new or existing Python project
Do not use this skill when the user primarily wants to fix a specific bug, refactor code, or implement a harness immediately (unless they explicitly ask for target selection first).
Inputs expected from the environment
The agent should assume access to:
- Repository filesystem
- Ability to read files
- Ability to run local commands (optional but recommended)
Preferred repository listing command:
- Use tree to get a fast, high-signal view of repository structure (limit depth if needed).
- If tree is not available, fall back to a recursive listing via other standard shell tooling (a minimal Python fallback is sketched below).
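If neither tree nor equivalent shell tooling is available, a pure-Python fallback such as the sketch below can produce a comparable depth-limited listing. This is illustrative only (Python 3.9+ assumed); the SKIP_DIRS set and the list_repo name are not part of the skill.

```python
# Illustrative fallback listing when `tree` is unavailable; the skip set is an assumption.
from pathlib import Path

SKIP_DIRS = {".git", ".venv", "venv", "node_modules", "__pycache__", "build", "dist"}

def list_repo(root: str, max_depth: int = 3) -> list[str]:
    """Return depth-limited relative paths, skipping low-value directories."""
    root_path = Path(root).resolve()
    entries = []
    for path in sorted(root_path.rglob("*")):
        rel = path.relative_to(root_path)
        if any(part in SKIP_DIRS for part in rel.parts):
            continue
        if len(rel.parts) <= max_depth:
            entries.append(str(rel) + ("/" if path.is_dir() else ""))
    return entries

if __name__ == "__main__":
    print("\n".join(list_repo(".")))
```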
Outputs
Produce a single report, in any format, containing:
- Localized important files
- Localized important functions
- A summary of existing unit tests
- The functions under test selected for fuzzing
Place this report in the root of the repository as APIs.txt; it serves as the guideline for future fuzzing harness implementation.
Guardrails
- Prefer analysis and reporting over code changes.
- Do not modify source code unless the user explicitly requests changes.
- If running commands could be disruptive, default to read-only analysis.
- Avoid assumptions about runtime behavior; base conclusions on code and tests.
- Explicitly consider existing tests in the repo when selecting targets and deciding harness shape.
- Keep the final FUT shortlist small (typically 1–5).
Localize important files
Goal
Produce a ranked list of files that are most likely to contain fuzz-worthy logic: parsing, decoding, deserialization, validation, protocol handling, file/network boundaries, or native bindings.
Procedure
1. Build a repository map
   - Start with a repository overview using tree to identify:
     - Root packages and layouts (src/ layout vs flat layout)
     - Test directories and configs
     - Bindings/native directories
     - Examples, docs, and tooling directories
   - Identify packaging and metadata files such as:
     - pyproject.toml
     - setup.cfg
     - setup.py
   - Identify test configuration and entry points:
     - tests/
     - conftest.py
     - pytest.ini
     - tox.ini
     - noxfile.py
2. Exclude low-value areas
   - Skip: virtual environments, build outputs, vendored code, documentation-only directories, examples-only directories, generated files.
3. Score files using explainable heuristics
   Assign a file a higher score when it matches more of these indicators (a minimal scoring sketch follows this list):
   - Public API exposure: __init__.py re-exports, __all__, api modules
   - Input boundary keywords in file path or symbols: parse, load, dump, decode, encode, deserialize, serialize, validate, normalize, schema, protocol, message
   - Format handlers: json, yaml, xml, csv, toml, protobuf, msgpack, pickle
   - Regex-heavy or templating-heavy code
   - Native boundaries: ctypes, cffi, cython, extension modules, bindings
   - Central modules: high import fan-in across the package
   - Test adjacency: directly imported or heavily referenced by tests
4. Rank and select
   - Produce a Top-N list (default 10–30) with a short rationale per file.
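As one way to keep the scoring explainable, the sketch below combines keyword hits on paths and source text into a score plus an indicators_hit list. The keyword sets and weights are assumptions to be tuned per repository (Python 3.9+ assumed); the skill does not prescribe them.

```python
# Illustrative file-scoring heuristic; keywords and weights are assumptions.
from pathlib import Path

BOUNDARY_KEYWORDS = ["parse", "load", "dump", "decode", "encode", "deserialize",
                     "serialize", "validate", "normalize", "schema", "protocol", "message"]
FORMAT_KEYWORDS = ["json", "yaml", "xml", "csv", "toml", "protobuf", "msgpack", "pickle"]
NATIVE_KEYWORDS = ["ctypes", "cffi", "cython"]

def score_file(path: Path) -> tuple[int, list[str]]:
    """Return (score, indicators_hit) for one Python file."""
    text = path.read_text(errors="ignore").lower()
    name = str(path).lower()
    score, hits = 0, []
    for kw in BOUNDARY_KEYWORDS:
        if kw in name or kw in text:
            score += 2
            hits.append(f"boundary:{kw}")
    for kw in FORMAT_KEYWORDS:
        if kw in name or kw in text:
            score += 2
            hits.append(f"format:{kw}")
    for kw in NATIVE_KEYWORDS:
        if kw in text:
            score += 3
            hits.append(f"native:{kw}")
    if path.name == "__init__.py" and "__all__" in text:
        score += 1
        hits.append("public-api:__all__")
    return score, hits
```

Keeping the indicators alongside the score makes the Top-N rationale easy to report verbatim.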
Output format
For each file:
- path
- score (relative, not necessarily normalized)
- rationale (2–5 bullets)
- indicators_hit (list)
Localize important functions
Goal
From the important files, identify and rank functions or methods that are strong fuzz targets while minimizing full-body reading until needed.
Approach 1: AST-based header and docstring scan
For each localized Python file, parse it using Python’s ast module and extract only:
- Module docstring
- Function and async function headers (name, args, defaults, annotations, decorators)
- Class headers and method headers
- Docstrings for modules, classes, and functions/methods
Do not read full function bodies during the initial pass unless needed for disambiguation or final selection.
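A minimal sketch of this header-and-docstring pass using only the standard library ast module is shown below (Python 3.9+ assumed for ast.unparse); the returned record shape is illustrative, not a required output format.

```python
# Header/docstring scan via ast; function bodies are not inspected.
import ast

def scan_headers(path: str) -> list[dict]:
    """Collect function/method headers and docstrings without reading bodies."""
    with open(path, encoding="utf-8") as f:
        tree = ast.parse(f.read(), filename=path)
    symbols = []

    def visit(node, prefix=""):
        for child in ast.iter_child_nodes(node):
            if isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
                symbols.append({
                    "qualname": prefix + child.name,
                    "args": [a.arg for a in child.args.args],
                    "decorators": [ast.unparse(d) for d in child.decorator_list],
                    "docstring": ast.get_docstring(child),
                    "lineno": child.lineno,
                })
            elif isinstance(child, ast.ClassDef):
                visit(child, prefix + child.name + ".")

    visit(tree)
    return symbols
```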
Procedure
1. Build per-file declarations via AST
   - Parse the file with ast.
   - Enumerate:
     - Top-level functions
     - Classes and their methods
     - Nested functions only if they are likely to be directly fuzzable via an exposed wrapper
   - For each symbol, collect:
     - Fully-qualified name
     - Signature details (as available from AST)
     - Decorators
     - Docstring (if present)
     - Location information (file, line range if available)
2. Generate an initial candidate set using headers and docstrings
   Prioritize functions/methods that:
   - Accept bytes, str, file-like objects, dicts, or user-controlled payloads
   - Convert between representations (raw ↔ structured)
   - Perform validation, normalization, parsing, decoding, deserialization
   - Touch filesystem/network/protocol boundaries
   - Call into native extensions or bindings
   - Clearly document strictness, schemas, formats, or error conditions
3. Use tests to refine candidate selection early
   - Before reading full bodies, check if tests reference these functions/modules:
     - Direct imports in tests
     - Fixtures that exercise particular entry points
     - Parameterizations over formats and inputs
   - Down-rank candidates that are already well-covered unless they are high-risk boundaries (parsers/native bindings).
4. Confirm with targeted reading only for top candidates
   For the top candidates (typically 10–20), read the full function bodies and capture:
   - Preconditions and assumptions
   - Internal helpers called
   - Error handling style and exception types
   - Any obvious invariants and postconditions
   - Statefulness and global dependencies
5. Rank and shortlist
   Rank candidates using an explainable rubric:
   - Input surface and reachability
   - Boundary risk (parsing/decoding/native)
   - Structural complexity (from targeted reading only)
   - Existing test coverage strength and breadth
   - Ease of harnessing
Output format
For each function/method:
- qualname
- file
- line_range (if available)
- score (relative)
- rationale (2–6 bullets)
- dependencies (key helpers, modules, external state)
- harnessability (low/medium/high)
Approach 2: Scanning all important files directly
For each localized Python file, read the file contents directly to extract:
- Module docstring and overall structure
- Function and async function definitions (name, args, defaults, annotations, decorators)
- Class definitions and their methods
- Full function bodies and implementation details
- Docstrings for modules, classes, and functions/methods
This approach reads complete file contents, allowing for deeper analysis at the cost of higher token usage.
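One way to capture full bodies alongside headers is to slice source segments from the parsed AST, as in the sketch below (Python 3.8+ assumed for ast.get_source_segment and end_lineno); the record shape is illustrative, and plain reading of the file text works just as well.

```python
# Full-body extraction sketch; complements the header-only scan in Approach 1.
import ast

def scan_full(path: str) -> list[dict]:
    """Collect functions/methods together with their full source bodies."""
    with open(path, encoding="utf-8") as f:
        source = f.read()
    tree = ast.parse(source, filename=path)
    symbols = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            symbols.append({
                "name": node.name,
                "docstring": ast.get_docstring(node),
                "body": ast.get_source_segment(source, node),
                "line_range": (node.lineno, node.end_lineno),
            })
    return symbols
```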
Procedure
1. Read important files sequentially
   - For each file from the important files list, read the full contents.
   - Extract by direct inspection:
     - Top-level functions and their complete implementations
     - Classes and their methods with full bodies
     - Nested functions if they are exposed or called by public APIs
   - For each symbol, collect:
     - Fully-qualified name
     - Complete signature (from source text)
     - Decorators
     - Docstring (if present)
     - Full function body
     - Location information (file, approximate line range)
2. Generate an initial candidate set using full source analysis
   Prioritize functions/methods that:
   - Accept bytes, str, file-like objects, dicts, or user-controlled payloads
   - Convert between representations (raw ↔ structured)
   - Perform validation, normalization, parsing, decoding, deserialization
   - Touch filesystem/network/protocol boundaries
   - Call into native extensions or bindings
   - Contain complex control flow, loops, or recursive calls
   - Handle exceptions or edge cases
   - Clearly document strictness, schemas, formats, or error conditions
3. Analyze implementation details from full bodies
   For each candidate function, inspect the body for:
   - Preconditions and assumptions (explicit checks, assertions, early returns)
   - Internal helpers called and their purposes
   - Error handling style and exception types raised
   - Invariants and postconditions (explicit or implicit)
   - Statefulness and global dependencies
   - Input transformations and data flow
   - Native calls or external process invocations
   - Resource allocation and cleanup patterns
4. Use tests to refine candidate selection
   - Check if tests reference these functions/modules:
     - Direct imports in tests
     - Fixtures that exercise particular entry points
     - Parameterizations over formats and inputs
   - Down-rank candidates that are already well-covered unless they are high-risk boundaries (parsers/native bindings).
   - Note which aspects of each function are tested vs untested.
5. Rank and shortlist
   Rank candidates using an explainable rubric:
   - Input surface and reachability
   - Boundary risk (parsing/decoding/native)
   - Structural complexity (from full body analysis)
   - Existing test coverage strength and breadth
   - Ease of harnessing
   - Observable implementation risks (unsafe operations, unchecked inputs, complex state)
Output format
For each function/method:
- qualname
- file
- line_range (if available)
- score (relative)
- rationale (2–6 bullets)
- dependencies (key helpers, modules, external state)
- harnessability (low/medium/high)
- implementation_notes (key observations from body analysis)
Summarize existing unit tests
Goal
Summarize what is already tested, infer test oracles, and identify gaps that fuzzing can complement.
Hard requirement
Always inspect and incorporate existing tests in the repository when:
- Ranking functions
- Selecting FUTs
- Designing input models and oracles
- Proposing seed corpus sources
Procedure
1. Inventory tests
   - Locate tests and their discovery configuration (pytest.ini, pyproject.toml, tox.ini, noxfile.py).
   - Enumerate test modules and map them to source modules via imports.
   - Identify shared fixtures and data factories (conftest.py, fixture files, test utilities).
2. Summarize test intent
   For each test module, capture:
   - What behaviors are asserted
   - What inputs are used
   - What exceptions are expected
   - What invariants are implied
3. Infer oracles and properties
   Common fuzz-friendly oracles include (a minimal round-trip sketch follows this list):
   - Round-trip properties
   - Idempotence of normalization
   - Parser consistency across equivalent inputs
   - Deterministic output given deterministic input
   - No-crash and no-hang for malformed inputs
4. Identify coverage gaps
   - FUT candidates with no direct tests
   - Input classes not covered by tests (size extremes, malformed encodings, deep nesting, edge unicode, invalid schemas)
   - Code paths guarded by complex conditionals or exception handlers with no tests
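As a concrete illustration of the round-trip oracle listed in step 3, a minimal property check is sketched below; loads/dumps stand in for a hypothetical parser/serializer pair, and the accepted exception types are an assumption the agent should replace with the library's documented errors.

```python
# Round-trip oracle sketch; `loads`/`dumps` and the accepted exceptions are hypothetical.
def check_round_trip(data: bytes, loads, dumps, accepted_exceptions=(ValueError,)) -> None:
    """Property: anything that parses must survive dumps(loads(x)) unchanged."""
    try:
        parsed = loads(data)
    except accepted_exceptions:
        return  # rejecting malformed input is acceptable behavior
    reparsed = loads(dumps(parsed))
    assert reparsed == parsed, "round-trip mismatch: parser and serializer disagree"
```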
Output format
- test_map: module_under_test → tests → asserted behaviors
- inferred_oracles: list of reusable invariants with the functions they apply to
- gaps: ranked list of untested or weakly-tested candidates
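The test_map above can usually be derived mechanically by scanning test files for imports of the package under test, as in the sketch below; the test_*.py filename pattern and the package-prefix check are assumptions about a typical pytest layout.

```python
# Map test modules to the source modules they import (illustrative sketch).
import ast
from pathlib import Path

def build_test_map(tests_dir: str, package: str) -> dict:
    """Return {test_file: [imported modules under `package`]}."""
    test_map = {}
    for test_file in Path(tests_dir).rglob("test_*.py"):
        tree = ast.parse(test_file.read_text(encoding="utf-8"))
        imported = []
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                imported += [a.name for a in node.names if a.name.startswith(package)]
            elif isinstance(node, ast.ImportFrom) and node.module and node.module.startswith(package):
                imported.append(node.module)
        if imported:
            test_map[str(test_file)] = sorted(set(imported))
    return test_map
```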
Decide function under test for fuzzing
Goal
Select a final set of FUTs (typically 1–5) and produce a structured "note to self" for each FUT so the agent can proceed to harness implementation, corpus seeding, and fuzz execution.
Selection criteria
Prefer FUTs that maximize:
- Security/robustness payoff (parsing, decoding, validation, native boundary)
- Reachability with minimal setup
- Low existing test assurance or narrow test input coverage
- High fuzzability (simple input channel, clear oracle or crash-only target)
Explicitly weigh:
- What tests already cover (and what they do not)
- What seeds can be extracted from tests, fixtures, and sample data
Required "note to self" template
For each selected FUT, produce exactly this structure:
Fuzzing Target Note
- Target:
- File / location:
- Why this target:
- Input surface:
- Boundary or native considerations:
- Complexity or path depth:
- Current test gaps:
- Callable contract:
- Required imports or initialization:
- Preconditions:
- Determinism concerns:
- External dependencies:
- Input model:
- Primary payload type:
- Decoding or parsing steps:
- Constraints to respect:
- Edge classes to emphasize:
- Oracles:
- Must-hold properties:
- Acceptable exceptions:
- Suspicious exceptions:
- Crash-only vs correctness-checking:
- Harness plan:
- Recommended approach:
- Minimal harness signature:
- Seed corpus ideas:
- Timeouts and resource limits:
- Risk flags:
- Native extension involved:
- Potential DoS paths:
- External I/O:
- Next actions:
  1.
  2.
  3.
  4.
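To make the harness plan concrete, a minimal crash-only harness is sketched below, assuming Atheris as the fuzzing engine and a hypothetical target_package.parse FUT that rejects malformed input with ValueError; the skill itself does not mandate a specific engine.

```python
# Minimal Atheris harness sketch; `target_package.parse` is a hypothetical FUT.
import sys

import atheris

with atheris.instrument_imports():
    from target_package import parse  # hypothetical function under test

def test_one_input(data: bytes) -> None:
    try:
        parse(data)
    except ValueError:
        pass  # documented rejection of malformed input is acceptable

def main() -> None:
    atheris.Setup(sys.argv, test_one_input)
    atheris.Fuzz()

if __name__ == "__main__":
    main()
```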
Output format
- selected_futs: list of chosen FUTs with brief justification
- notes_to_self: one "Fuzzing Target Note" per FUT
Final JSON block
At the end of the report, include a JSON object with:
- important_files: [{path, score, rationale, indicators_hit}]
- important_functions: [{qualname, file, line_range, score, rationale, dependencies, harnessability}]
- test_summary: {test_map, inferred_oracles, gaps}
- selected_futs: [{qualname, file, line_range, justification}]
- notes_to_self: [{target_qualname, note}]
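For illustration only, the final JSON block could be assembled and emitted as sketched below; every path, name, and score shown is a placeholder.

```python
# Assemble the final JSON block; all values here are placeholders.
import json

report = {
    "important_files": [
        {"path": "src/pkg/parser.py", "score": 12, "rationale": ["..."],
         "indicators_hit": ["boundary:parse"]},
    ],
    "important_functions": [
        {"qualname": "pkg.parser.parse", "file": "src/pkg/parser.py", "line_range": [40, 95],
         "score": 9, "rationale": ["..."], "dependencies": ["..."], "harnessability": "high"},
    ],
    "test_summary": {"test_map": {}, "inferred_oracles": [], "gaps": []},
    "selected_futs": [
        {"qualname": "pkg.parser.parse", "file": "src/pkg/parser.py", "line_range": [40, 95],
         "justification": "untested byte-level parser at the input boundary"},
    ],
    "notes_to_self": [{"target_qualname": "pkg.parser.parse", "note": "..."}],
}

print(json.dumps(report, indent=2))
```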