
discover-important-function

@benchflow-ai/discover-important-function
benchflow-ai
230 stars
165 forks
Updated 1/18/2026
View on GitHub

When given a project codebase, this skill identifies the important functions in the codebase for future action.

Installation

$ skills install @benchflow-ai/discover-important-function
Supported assistants: Claude Code, Cursor, Copilot, Codex, Antigravity

Details

Path: tasks/setup-fuzzing-py/environment/skills/discover-important-function/SKILL.md
Branch: main
Scoped Name: @benchflow-ai/discover-important-function

Usage

After installing, this skill will be available to your AI coding assistant.

Verify installation:

$ skills list

Skill Instructions


name: discover-important-function
description: "When given a project codebase, this skill identifies the important functions in the codebase for future action."

Fuzz Target Localizer

Purpose

This skill helps an agent quickly narrow a Python repository down to a small set of high-value fuzz targets. It produces:

  • A ranked list of important files
  • A ranked list of important functions and methods
  • A summary of existing unit tests and inferred oracles
  • A final shortlist of functions-under-test (FUTs) for fuzzing, each with a structured "note to self" for follow-up actions

When to use

Use this skill when the user asks to:

  • Find the best functions to fuzz in a Python package/library
  • Identify parsers, decoders, validators, or boundary code that is fuzz-worthy
  • Decide what to fuzz based on existing test coverage and API surface
  • Produce a structured shortlist of fuzz targets with harness guidance
  • Automate testing pipeline setup for a new or existing Python project

Do not use this skill when the user primarily wants to fix a specific bug, refactor code, or implement a harness immediately (unless they explicitly ask for target selection first).

Inputs expected from the environment

The agent should assume access to:

  • Repository filesystem
  • Ability to read files
  • Ability to run local commands (optional but recommended)

Preferred repository listing command:

  • Use tree to get a fast, high-signal view of repository structure (limit depth if needed).

If tree is not available, fall back to a recursive listing via other standard shell tooling.
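A minimal sketch of this listing step in Python, assuming tree is on PATH and falling back to os.walk otherwise (the pruned directory names and depth limit are illustrative, not requirements):

import os
import shutil
import subprocess

SKIP = {".git", ".venv", "venv", "__pycache__", "build", "dist"}

def repo_listing(root: str, max_depth: int = 3) -> str:
    """Directory overview: prefer tree, fall back to os.walk."""
    if shutil.which("tree"):
        # -L limits depth; -I skips low-value directories.
        cmd = ["tree", "-L", str(max_depth), "-I", "|".join(SKIP), root]
        return subprocess.run(cmd, capture_output=True, text=True).stdout
    lines = []
    root = os.path.abspath(root)
    for dirpath, dirnames, filenames in os.walk(root):
        depth = dirpath[len(root):].count(os.sep)
        if depth >= max_depth:
            dirnames[:] = []  # prune: do not descend further
            continue
        dirnames[:] = [d for d in dirnames if d not in SKIP]
        pad = "  " * depth
        lines.append(pad + os.path.basename(dirpath) + "/")
        lines.extend(pad + "  " + f for f in filenames)
    return "\n".join(lines)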

Outputs

Produce a single report, in any format, covering:

  • Localized important files
  • Localized important functions
  • A summary of existing unit tests
  • The functions under test (FUTs) selected for fuzzing

Place this report in the root of the repository as APIs.txt; it serves as the guideline for future fuzzing harness implementation.
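For illustration only (since any format is acceptable), one possible APIs.txt layout:

APIs.txt
  1. Important files: ranked list with scores and rationale
  2. Important functions: ranked list with signatures and harnessability
  3. Existing unit tests: test map, inferred oracles, coverage gaps
  4. Selected FUTs: one "Fuzzing Target Note" per function
  5. Final JSON block: machine-readable summary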

Guardrails

  • Prefer analysis and reporting over code changes.
  • Do not modify source code unless the user explicitly requests changes.
  • If running commands could be disruptive, default to read-only analysis.
  • Avoid assumptions about runtime behavior; base conclusions on code and tests.
  • Explicitly consider existing tests in the repo when selecting targets and deciding harness shape.
  • Keep the final FUT shortlist small (typically 1–5).

Localize important files

Goal

Produce a ranked list of files that are most likely to contain fuzz-worthy logic: parsing, decoding, deserialization, validation, protocol handling, file/network boundaries, or native bindings.

Procedure

  1. Build a repository map

    • Start with a repository overview using tree to identify:
      • Root packages and layouts (src/ layout vs flat layout)
      • Test directories and configs
      • Bindings/native directories
      • Examples, docs, and tooling directories
    • Identify packaging and metadata files such as:
      • pyproject.toml
      • setup.cfg
      • setup.py
    • Identify test configuration and entry points:
      • tests/
      • conftest.py
      • pytest.ini
      • tox.ini
      • noxfile.py
  2. Exclude low-value areas

    • Skip: virtual environments, build outputs, vendored code, documentation-only directories, examples-only directories, generated files.
  3. Score files using explainable heuristics. Assign a file a higher score when it matches more of these indicators (a scoring sketch follows this list):

    • Public API exposure: __init__.py re-exports, __all__, api modules
    • Input boundary keywords in file path or symbols: parse, load, dump, decode, encode, deserialize, serialize, validate, normalize, schema, protocol, message
    • Format handlers: json, yaml, xml, csv, toml, protobuf, msgpack, pickle
    • Regex-heavy or templating-heavy code
    • Native boundaries: ctypes, cffi, cython, extension modules, bindings
    • Central modules: high import fan-in across the package
    • Test adjacency: directly imported or heavily referenced by tests
  4. Rank and select

    • Produce a Top-N list (default 10–30) with a short rationale per file.
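A minimal sketch of such a scorer; the indicator lists, skip set, and one-point-per-group scoring are illustrative assumptions, not part of the skill:

from pathlib import Path

# Illustrative indicator groups; extend per repository.
INDICATORS = {
    "boundary_keyword": ("parse", "decode", "deserialize", "validate",
                         "normalize", "schema", "protocol"),
    "format_handler": ("json", "yaml", "xml", "csv", "toml", "msgpack", "pickle"),
    "native_boundary": ("ctypes", "cffi", "cython", "binding"),
}
SKIP_PARTS = {".git", "venv", "build", "dist", "docs", "examples", "__pycache__"}

def score_file(path: Path) -> tuple[int, list[str]]:
    """Return (score, indicators_hit) for one Python file."""
    text = str(path).lower()
    try:
        text += path.read_text(errors="ignore").lower()
    except OSError:
        pass
    hits = [label for label, words in INDICATORS.items()
            if any(w in text for w in words)]
    return len(hits), hits

def rank_files(root: str, top_n: int = 20):
    scored = []
    for p in Path(root).rglob("*.py"):
        if SKIP_PARTS.intersection(p.parts):
            continue
        score, hits = score_file(p)
        if score:
            scored.append((score, str(p), hits))
    scored.sort(key=lambda t: (-t[0], t[1]))  # score desc, then path asc
    return scored[:top_n]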

Output format

For each file:

  • path
  • score (relative, not necessarily normalized)
  • rationale (2–5 bullets)
  • indicators_hit (list)

Localize important functions

Goal

From the important files, identify and rank functions or methods that are strong fuzz targets while minimizing full-body reading until needed.

Approach 1: AST-based header and docstring scan

For each localized Python file, parse it using Python’s ast module and extract only:

  • Module docstring
  • Function and async function headers (name, args, defaults, annotations, decorators)
  • Class headers and method headers
  • Docstrings for modules, classes, and functions/methods

Do not read full function bodies during the initial pass unless needed for disambiguation or final selection.
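A minimal sketch of this header-only pass using the standard library ast module (Python 3.9+ for ast.unparse):

import ast

def extract_headers(path: str) -> list[dict]:
    """Collect signatures and docstrings; function bodies are never read."""
    with open(path, encoding="utf-8", errors="ignore") as fh:
        tree = ast.parse(fh.read(), filename=path)
    records = [{"qualname": "<module>", "doc": ast.get_docstring(tree)}]

    def visit(node: ast.AST, prefix: str = "") -> None:
        for child in ast.iter_child_nodes(node):
            if isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
                records.append({
                    "qualname": prefix + child.name,
                    "args": [a.arg for a in child.args.args],
                    "decorators": [ast.unparse(d) for d in child.decorator_list],
                    "doc": ast.get_docstring(child),
                    "lines": (child.lineno, child.end_lineno),
                })
            elif isinstance(child, ast.ClassDef):
                records.append({"qualname": prefix + child.name,
                                "doc": ast.get_docstring(child)})
                visit(child, prefix + child.name + ".")  # recurse into methods

    visit(tree)
    return records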

Procedure

  1. Build per-file declarations via AST

    • Parse the file with ast
    • Enumerate:
      • Top-level functions
      • Classes and their methods
      • Nested functions only if they are likely to be directly fuzzable via an exposed wrapper
    • For each symbol, collect:
      • Fully-qualified name
      • Signature details (as available from AST)
      • Decorators
      • Docstring (if present)
      • Location information (file, line range if available)
  2. Generate an initial candidate set using headers and docstrings. Prioritize functions/methods that:

    • Accept bytes, str, file-like objects, dicts, or user-controlled payloads
    • Convert between representations (raw ↔ structured)
    • Perform validation, normalization, parsing, decoding, deserialization
    • Touch filesystem/network/protocol boundaries
    • Call into native extensions or bindings
    • Clearly document strictness, schemas, formats, or error conditions
  3. Use tests to refine candidate selection early (a cross-referencing sketch follows this list)

    • Before reading full bodies, check if tests reference these functions/modules:
      • Direct imports in tests
      • Fixtures that exercise particular entry points
      • Parameterizations over formats and inputs
    • Down-rank candidates that are already well-covered unless they are high-risk boundaries (parsers/native bindings).
  4. Confirm with targeted reading only for top candidates. For the top candidates (typically 10–20), read the full function bodies and capture:

    • Preconditions and assumptions
    • Internal helpers called
    • Error handling style and exception types
    • Any obvious invariants and postconditions
    • Statefulness and global dependencies
  5. Rank and shortlist. Rank candidates using an explainable rubric:

    • Input surface and reachability
    • Boundary risk (parsing/decoding/native)
    • Structural complexity (from targeted reading only)
    • Existing test coverage strength and breadth
    • Ease of harnessing
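A crude but serviceable sketch of the test cross-reference from step 3; a stricter version would parse test imports with ast rather than matching substrings:

from pathlib import Path

def test_references(repo_root: str, candidates: set[str]) -> dict[str, list[str]]:
    """Map each candidate symbol name to the test files that mention it."""
    refs: dict[str, list[str]] = {name: [] for name in candidates}
    for test_file in Path(repo_root).rglob("test_*.py"):
        text = test_file.read_text(errors="ignore")
        for name in candidates:
            if name in text:  # substring match; adequate for triage
                refs[name].append(str(test_file))
    return refs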

Output format

For each function/method:

  • qualname
  • file
  • line_range (if available)
  • score (relative)
  • rationale (2–6 bullets)
  • dependencies (key helpers, modules, external state)
  • harnessability (low/medium/high)

Approach 2: Scanning all important files yourself

For each localized Python file, read the file contents directly to extract:

  • Module docstring and overall structure
  • Function and async function definitions (name, args, defaults, annotations, decorators)
  • Class definitions and their methods
  • Full function bodies and implementation details
  • Docstrings for modules, classes, and functions/methods

This approach reads complete file contents, allowing for deeper analysis at the cost of higher token usage.
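Where Approach 1 stops at headers, this pass also captures bodies. A minimal standard-library sketch using ast.get_source_segment to recover each function's full source text:

import ast

def extract_full(path: str) -> list[dict]:
    """Collect each function together with its complete source text."""
    with open(path, encoding="utf-8", errors="ignore") as fh:
        source = fh.read()
    out = []
    for node in ast.walk(ast.parse(source, filename=path)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            out.append({
                "name": node.name,
                "doc": ast.get_docstring(node),
                "body": ast.get_source_segment(source, node),  # full text
                "lines": (node.lineno, node.end_lineno),
            })
    return out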

Procedure

  1. Read important files sequentially

    • For each file from the important files list, read the full contents.
    • Extract by direct inspection:
      • Top-level functions and their complete implementations
      • Classes and their methods with full bodies
      • Nested functions if they are exposed or called by public APIs
    • For each symbol, collect:
      • Fully-qualified name
      • Complete signature (from source text)
      • Decorators
      • Docstring (if present)
      • Full function body
      • Location information (file, approximate line range)
  2. Generate an initial candidate set using full source analysis. Prioritize functions/methods that:

    • Accept bytes, str, file-like objects, dicts, or user-controlled payloads
    • Convert between representations (raw ↔ structured)
    • Perform validation, normalization, parsing, decoding, deserialization
    • Touch filesystem/network/protocol boundaries
    • Call into native extensions or bindings
    • Contain complex control flow, loops, or recursive calls
    • Handle exceptions or edge cases
    • Clearly document strictness, schemas, formats, or error conditions
  3. Analyze implementation details from full bodies. For each candidate function, inspect the body for:

    • Preconditions and assumptions (explicit checks, assertions, early returns)
    • Internal helpers called and their purposes
    • Error handling style and exception types raised
    • Invariants and postconditions (explicit or implicit)
    • Statefulness and global dependencies
    • Input transformations and data flow
    • Native calls or external process invocations
    • Resource allocation and cleanup patterns
  4. Use tests to refine candidate selection

    • Check if tests reference these functions/modules:
      • Direct imports in tests
      • Fixtures that exercise particular entry points
      • Parameterizations over formats and inputs
    • Down-rank candidates that are already well-covered unless they are high-risk boundaries (parsers/native bindings).
    • Note which aspects of each function are tested vs untested.
  5. Rank and shortlist. Rank candidates using an explainable rubric (a weighting sketch follows this list):

    • Input surface and reachability
    • Boundary risk (parsing/decoding/native)
    • Structural complexity (from full body analysis)
    • Existing test coverage strength and breadth
    • Ease of harnessing
    • Observable implementation risks (unsafe operations, unchecked inputs, complex state)
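A weighting sketch for the rubric above; the weights and the 0–2 rating scale are illustrative assumptions:

# Illustrative weights for the six rubric axes.
WEIGHTS = {
    "input_surface": 3,
    "boundary_risk": 3,
    "complexity": 2,
    "coverage_gap": 2,
    "harnessability": 2,
    "impl_risk": 1,
}

def rubric_score(ratings: dict[str, int]) -> int:
    """ratings maps each axis to 0 (weak) .. 2 (strong); higher total = better FUT."""
    return sum(weight * ratings.get(axis, 0) for axis, weight in WEIGHTS.items())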

Output format

For each function/method:

  • qualname
  • file
  • line_range (if available)
  • score (relative)
  • rationale (2–6 bullets)
  • dependencies (key helpers, modules, external state)
  • harnessability (low/medium/high)
  • implementation_notes (key observations from body analysis)

Summarize existing unit tests

Goal

Summarize what is already tested, infer test oracles, and identify gaps that fuzzing can complement.

Hard requirement

Always inspect and incorporate existing tests in the repository when:

  • Ranking functions
  • Selecting FUTs
  • Designing input models and oracles
  • Proposing seed corpus sources

Procedure

  1. Inventory tests

    • Locate tests and their discovery configuration (pytest.ini, pyproject.toml, tox.ini, noxfile.py).
    • Enumerate test modules and map them to source modules via imports.
    • Identify shared fixtures and data factories (conftest.py, fixture files, test utilities).
  2. Summarize test intent. For each test module:

    • What behaviors are asserted
    • What inputs are used
    • What exceptions are expected
    • What invariants are implied
  3. Infer oracles and properties. Common fuzz-friendly oracles include (an example follows this list):

    • Round-trip properties
    • Idempotence of normalization
    • Parser consistency across equivalent inputs
    • Deterministic output given deterministic input
    • No-crash and no-hang for malformed inputs
  4. Identify coverage gaps

    • FUT candidates with no direct tests
    • Input classes not covered by tests (size extremes, malformed encodings, deep nesting, edge unicode, invalid schemas)
    • Code paths guarded by complex conditionals or exception handlers with no tests
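For illustration, a round-trip oracle for a hypothetical loads/dumps pair; both names are placeholders for whatever the library under test exposes:

def roundtrip_oracle(data: bytes) -> None:
    """Must hold for every accepted input: loads(dumps(x)) == x."""
    try:
        obj = loads(data)            # hypothetical parser under test
    except ValueError:
        return                       # rejecting malformed input is acceptable
    assert loads(dumps(obj)) == obj  # re-parse of the dump must be stable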

Output format

  • test_map: module_under_test → tests → asserted behaviors
  • inferred_oracles: list of reusable invariants with the functions they apply to
  • gaps: ranked list of untested or weakly-tested candidates

Decide function under test for fuzzing

Goal

Select a final set of FUTs (typically 1–5) and produce a structured "note to self" for each FUT so the agent can proceed to harness implementation, corpus seeding, and fuzz execution.

Selection criteria

Prefer FUTs that maximize:

  • Security/robustness payoff (parsing, decoding, validation, native boundary)
  • Reachability with minimal setup
  • Low existing test assurance or narrow test input coverage
  • High fuzzability (simple input channel, clear oracle or crash-only target)

Explicitly weigh:

  • What tests already cover (and what they do not)
  • What seeds can be extracted from tests, fixtures, and sample data

Required "note to self" template

For each selected FUT, produce exactly this structure (a minimal harness sketch follows the template):

Fuzzing Target Note

  • Target:
  • File / location:
  • Why this target:
    • Input surface:
    • Boundary or native considerations:
    • Complexity or path depth:
    • Current test gaps:
  • Callable contract:
    • Required imports or initialization:
    • Preconditions:
    • Determinism concerns:
    • External dependencies:
  • Input model:
    • Primary payload type:
    • Decoding or parsing steps:
    • Constraints to respect:
    • Edge classes to emphasize:
  • Oracles:
    • Must-hold properties:
    • Acceptable exceptions:
    • Suspicious exceptions:
    • Crash-only vs correctness-checking:
  • Harness plan:
    • Recommended approach:
    • Minimal harness signature:
    • Seed corpus ideas:
    • Timeouts and resource limits:
  • Risk flags:
    • Native extension involved:
    • Potential DoS paths:
    • External I/O:
  • Next actions:
    1.
    2.
    3.
    4.
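As a concrete illustration of the Harness plan fields, a minimal crash-only harness sketch using Atheris; mypkg.parse_payload, the 4096-byte cap, and the ValueError allowlist are placeholders to fill from the note above:

import sys
import atheris

with atheris.instrument_imports():
    from mypkg import parse_payload  # placeholder import for the chosen FUT

def TestOneInput(data: bytes) -> None:
    fdp = atheris.FuzzedDataProvider(data)
    payload = fdp.ConsumeUnicodeNoSurrogates(4096)
    try:
        parse_payload(payload)
    except ValueError:
        pass  # documented, acceptable exception; anything else is a finding

if __name__ == "__main__":
    atheris.Setup(sys.argv, TestOneInput)
    atheris.Fuzz()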

Output format

  • selected_futs: list of chosen FUTs with brief justification
  • notes_to_self: one "Fuzzing Target Note" per FUT

Final JSON block

At the end of the report, include a JSON object with:

  • important_files: [{path, score, rationale, indicators_hit}]
  • important_functions: [{qualname, file, line_range, score, rationale, dependencies, harnessability}]
  • test_summary: {test_map, inferred_oracles, gaps}
  • selected_futs: [{qualname, file, line_range, justification}]
  • notes_to_self: [{target_qualname, note}]
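An abbreviated example of the expected shape (all values are placeholders):

{
  "important_files": [
    {"path": "src/pkg/parser.py", "score": 7,
     "rationale": ["public parse API", "regex-heavy"],
     "indicators_hit": ["boundary_keyword", "format_handler"]}
  ],
  "important_functions": [
    {"qualname": "pkg.parser.parse_payload", "file": "src/pkg/parser.py",
     "line_range": [42, 118], "score": 9,
     "rationale": ["accepts raw bytes", "no direct tests"],
     "dependencies": ["pkg.parser._tokenize"], "harnessability": "high"}
  ],
  "test_summary": {"test_map": {}, "inferred_oracles": [], "gaps": []},
  "selected_futs": [
    {"qualname": "pkg.parser.parse_payload", "file": "src/pkg/parser.py",
     "line_range": [42, 118], "justification": "untested input boundary"}
  ],
  "notes_to_self": [
    {"target_qualname": "pkg.parser.parse_payload", "note": "Fuzzing Target Note ..."}
  ]
}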