Atemndobs

rfp-ingest

@Atemndobs/rfp-ingest
Updated 3/31/2026

Ingest RFP opportunities from multiple data sources (SAM.gov, eMMA, RFPMart). Use when adding new data sources, modifying ingestion logic, or debugging data fetching issues.

Installation

$ npx agent-skills-cli install @Atemndobs/rfp-ingest
Claude Code
Cursor
Copilot
Codex
Antigravity

Details

Path: .claude/skills/rfp-ingest/SKILL.md
Branch: main
Scoped Name: @Atemndobs/rfp-ingest

Usage

After installing, this skill will be available to your AI coding assistant.

Verify installation:

npx agent-skills-cli list

Skill Instructions


name: rfp-ingest
description: Ingest RFP opportunities from multiple data sources (SAM.gov, eMMA, RFPMart). Use when adding new data sources, modifying ingestion logic, or debugging data fetching issues.
allowed-tools: Read, Grep, Glob, Bash(npm:), Bash(npx:)

RFP Ingestion Skill

Overview

This skill helps implement multi-source RFP data ingestion with canonical schema normalization and deduplication.

Supported Data Sources

| Source | Priority | API Type | Rate Limits |
|---|---|---|---|
| SAM.gov | P1 | REST API | 10 req/sec, 10k/day |
| Maryland eMMA | P1 | Web scraping | Respectful crawling |
| RFPMart API | Current | REST API | As documented |
| RFPMart CSV | Current | Manual upload | N/A |
| GovTribe | P2 | REST API (paid) | Per subscription |

CSV Upload (RFPMart Email Alerts)

RFPMart sends periodic email alerts with CSV attachments. These can be manually uploaded through the Admin UI.

CSV Format (No Header Row)

| Column | Index | Content | Example |
|---|---|---|---|
| ID | 0 | RFP identifier | SW-82097 |
| Country | 1 | Country code | USA |
| State | 2 | State name | Idaho |
| Title | 3 | Full title with location | SW-82097 - USA (Idaho) - Data Concealment... |
| Deadline | 4 | Due date | March 25,2026 |
| URL | 5 | RFPMart link | https://www.rfpmart.com/... |
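
A minimal sketch of turning one already-split row into named fields, assuming the six-column layout above. `RfpMartRow`, `parseDeadline`, and `rowToRecord` are illustrative names, not part of the skill. The deadline parser assumes the `Month D,YYYY` form shown in the example column and returns a UTC-midnight timestamp in milliseconds.

```typescript
// Hypothetical shape for one RFPMart CSV row (column order from the table above).
interface RfpMartRow {
  id: string;
  country: string;
  state: string;
  title: string;
  deadline: number | null; // Unix ms at UTC midnight, or null if unparseable
  url: string;
}

const MONTHS = [
  "january", "february", "march", "april", "may", "june",
  "july", "august", "september", "october", "november", "december",
];

// Parses dates such as "March 25,2026" or "March 25, 2026".
function parseDeadline(text: string): number | null {
  const match = /^([A-Za-z]+)\s+(\d{1,2}),\s*(\d{4})$/.exec(text.trim());
  if (!match) return null;
  const month = MONTHS.indexOf(match[1].toLowerCase());
  if (month === -1) return null;
  const day = Number(match[2]);
  const ts = Date.UTC(Number(match[3]), month, day);
  // Reject calendar rollovers such as "February 31".
  return new Date(ts).getUTCDate() === day ? ts : null;
}

// Maps the split fields of one line to a record; returns null for short rows.
function rowToRecord(fields: string[]): RfpMartRow | null {
  if (fields.length < 6) return null;
  const [id, country, state, title, deadline, url] = fields.map((f) => f.trim());
  return { id, country, state, title, deadline: parseDeadline(deadline), url };
}
```

Keeping an unparseable deadline as `null` rather than throwing lets the caller count the row under "errors" or "skipped" in the results summary instead of aborting the whole upload.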

ID Prefix → Category Mapping

const categoryMap: Record<string, string> = {
  SW: "Software Development",
  ITES: "IT Services",
  NET: "Networking",
  TELCOM: "Telecommunications",
  DRA: "Data & Research",
  CSE: "Security Services",
  HR: "Human Resources",
  PM: "Project Management",
  MRB: "Marketing & Branding",
  // ... other prefixes default to "Other"
};

IT-Relevant Prefixes

When filtering for IT-relevant RFPs only, these prefixes are included:

  • SW - Software Development
  • ITES - IT Services
  • NET - Networking
  • TELCOM - Telecommunications
  • DRA - Data & Research
  • CSE - Security Services
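
The prefix lookup and the IT-relevance filter can be sketched as below. This assumes IDs always take the `PREFIX-NUMBER` form seen in `SW-82097`; `CATEGORY_BY_PREFIX`, `prefixOf`, `categoryFor`, and `isItRelevant` are hypothetical names chosen for illustration.

```typescript
// Copy of the prefix map above; any prefix not listed falls back to "Other".
const CATEGORY_BY_PREFIX: Record<string, string> = {
  SW: "Software Development",
  ITES: "IT Services",
  NET: "Networking",
  TELCOM: "Telecommunications",
  DRA: "Data & Research",
  CSE: "Security Services",
  HR: "Human Resources",
  PM: "Project Management",
  MRB: "Marketing & Branding",
};

// Prefixes treated as IT-relevant, per the list above.
const IT_PREFIXES = new Set(["SW", "ITES", "NET", "TELCOM", "DRA", "CSE"]);

// "SW-82097" -> "SW". Assumes the prefix is everything before the first hyphen.
function prefixOf(id: string): string {
  return id.trim().split("-")[0].toUpperCase();
}

function categoryFor(id: string): string {
  return CATEGORY_BY_PREFIX[prefixOf(id)] ?? "Other";
}

function isItRelevant(id: string): boolean {
  return IT_PREFIXES.has(prefixOf(id));
}
```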

Key Files

| File | Purpose |
|---|---|
| convex/ingestion/rfpmartCsv.ts | CSV parser and Convex action |
| components/admin/CsvUpload.tsx | Drag-and-drop upload UI |

Usage

  1. Navigate to the Admin → Data Sources tab
  2. Scroll to RFPMart CSV Upload section
  3. Drop a CSV file or click to browse
  4. Toggle "Only import IT-relevant RFPs" if desired
  5. View results summary (new/updated/skipped/errors)

Implementation Example

// Parsing CSV with quoted fields
function parseCSVLine(line: string): string[] {
  const fields: string[] = [];
  let current = "";
  let inQuotes = false;

  for (let i = 0; i < line.length; i++) {
    const char = line[i];
    if (char === '"') {
      if (inQuotes && line[i + 1] === '"') {
        current += '"';
        i++;
      } else {
        inQuotes = !inQuotes;
      }
    } else if (char === "," && !inQuotes) {
      fields.push(current);
      current = "";
    } else {
      current += char;
    }
  }
  fields.push(current);
  return fields;
}

Canonical Schema

All sources must normalize to this schema:

interface Opportunity {
  externalId: string;         // Source-specific ID
  source: "sam.gov" | "emma" | "rfpmart" | "govtribe";
  title: string;
  description: string;
  summary?: string;
  location: string;
  category: string;
  naicsCode?: string;
  setAside?: string;          // "Small Business", "8(a)", etc.
  postedDate: number;         // Unix timestamp in milliseconds (Date.getTime())
  expiryDate: number;         // Unix timestamp in milliseconds (Date.getTime())
  url: string;
  attachments?: Attachment[];
  eligibilityFlags?: string[];
  rawData: Record<string, unknown>;
  ingestedAt: number;
}
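
A sketch of a pre-insert check for the required part of this schema. The `OpportunityCore` subset, the `validateOpportunity` name, and the specific rules (non-empty strings, http(s) URL, finite timestamps) are assumptions for illustration, not rules stated by the skill.

```typescript
// Subset of the canonical fields checked here (hypothetical helper type).
interface OpportunityCore {
  externalId: string;
  source: string;
  title: string;
  url: string;
  postedDate: number;
  expiryDate: number;
}

const KNOWN_SOURCES = ["sam.gov", "emma", "rfpmart", "govtribe"];

// Returns human-readable problems; an empty array means the record passed.
function validateOpportunity(o: OpportunityCore): string[] {
  const problems: string[] = [];
  if (!o.externalId.trim()) problems.push("externalId is empty");
  if (!KNOWN_SOURCES.includes(o.source)) problems.push(`unknown source: ${o.source}`);
  if (!o.title.trim()) problems.push("title is empty");
  if (!/^https?:\/\//.test(o.url)) problems.push("url is not http(s)");
  // new Date("garbage").getTime() yields NaN, which this catches.
  if (!Number.isFinite(o.postedDate)) problems.push("postedDate is not a finite number");
  if (!Number.isFinite(o.expiryDate)) problems.push("expiryDate is not a finite number");
  return problems;
}
```

Returning a list instead of throwing on the first failure makes it easier to log every problem with a record before skipping it.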

SAM.gov Integration

API Endpoint

https://api.sam.gov/opportunities/v2/search

Required Headers

{
  "Accept": "application/json",
  "X-Api-Key": process.env.SAM_GOV_API_KEY
}

Example Query

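const params = new URLSearchParams({
  postedFrom: "01/01/2024",  // SAM.gov documents MM/dd/yyyy for date filters
  postedTo: "12/31/2024",    // Sent together with postedFrom; range of at most one year
  limit: "100",
  offset: "0",
  ptype: "o",  // "o" = Solicitation notices
});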

Field Mapping

SAM.gov FieldCanonical Field
noticeIdexternalId
titletitle
descriptiondescription
postedDatepostedDate (parse to timestamp)
responseDeadLineexpiryDate (parse to timestamp)
placeOfPerformance.statelocation
naicsCodenaicsCode
setAsideDescriptionsetAside
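
The mapping above can be sketched as a pure function, which keeps it testable without calling the API. `SamOpportunity`, `toTimestamp`, and `mapSamOpportunity` are illustrative names. The sketch assumes date fields arrive as strings that `Date` can parse, and it tolerates `placeOfPerformance.state` being either a plain string or an object with `name`/`code`; check a real response before relying on either shape.

```typescript
// Loose, assumed shape of one search result; only fields from the table above.
interface SamOpportunity {
  noticeId: string;
  title?: string;
  description?: string;
  postedDate?: string;
  responseDeadLine?: string | null;
  placeOfPerformance?: { state?: string | { code?: string; name?: string } };
  naicsCode?: string;
  setAsideDescription?: string | null;
}

// Parses a date string to Unix ms; undefined when missing or unparseable.
function toTimestamp(value?: string | null): number | undefined {
  if (!value) return undefined;
  const ms = new Date(value).getTime();
  return Number.isNaN(ms) ? undefined : ms;
}

function mapSamOpportunity(raw: SamOpportunity) {
  const state = raw.placeOfPerformance?.state;
  const location = typeof state === "string" ? state : state?.name ?? state?.code;
  return {
    externalId: raw.noticeId,
    source: "sam.gov" as const,
    title: raw.title ?? "Untitled",
    description: raw.description ?? "",
    postedDate: toTimestamp(raw.postedDate),
    expiryDate: toTimestamp(raw.responseDeadLine),
    location: location ?? "USA",
    naicsCode: raw.naicsCode,
    setAside: raw.setAsideDescription ?? undefined,
  };
}
```

Leaving an unparseable date as `undefined` forces the caller to decide whether to skip the record or substitute a default, rather than silently storing `NaN`.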

Convex Implementation

Ingestion Action

// convex/ingestion.ts
// internalMutation is needed by logIngestion, which lives in this file (not shown).
import { action, internalMutation } from "./_generated/server";
import { v } from "convex/values";
import { internal } from "./_generated/api";

// SAM.gov documents MM/dd/yyyy for the postedFrom/postedTo filters.
const formatSamDate = (d: Date) =>
  `${String(d.getUTCMonth() + 1).padStart(2, "0")}/` +
  `${String(d.getUTCDate()).padStart(2, "0")}/${d.getUTCFullYear()}`;

export const ingestFromSam = action({
  args: { daysBack: v.optional(v.number()) },
  handler: async (ctx, args) => {
    const apiKey = process.env.SAM_GOV_API_KEY;
    if (!apiKey) throw new Error("SAM_GOV_API_KEY not configured");

    // Query window: the last `daysBack` days (default 7), ending today.
    const toDate = new Date();
    const fromDate = new Date(toDate);
    fromDate.setUTCDate(fromDate.getUTCDate() - (args.daysBack ?? 7));

    // URLSearchParams percent-encodes the key and the slashes in the dates.
    const params = new URLSearchParams({
      api_key: apiKey,
      postedFrom: formatSamDate(fromDate),
      postedTo: formatSamDate(toDate),
      limit: "100",
    });
    const response = await fetch(
      `https://api.sam.gov/opportunities/v2/search?${params}`,
      { headers: { Accept: "application/json" } }
    );

    if (!response.ok) {
      throw new Error(`SAM.gov API error: ${response.status}`);
    }

    const data = await response.json();
    let ingested = 0;
    let updated = 0;

    // Only the first page (up to 100 records) is processed here.
    for (const opp of data.opportunitiesData ?? []) {
      const result = await ctx.runMutation(internal.rfps.upsert, {
        externalId: opp.noticeId,
        source: "sam.gov",
        title: opp.title ?? "Untitled",
        description: opp.description ?? "",
        location: opp.placeOfPerformance?.state ?? "USA",
        category: opp.naicsCode ?? "Unknown",
        postedDate: new Date(opp.postedDate).getTime(),
        expiryDate: new Date(opp.responseDeadLine).getTime(),
        url: `https://sam.gov/opp/${opp.noticeId}/view`,
        rawData: opp,
      });

      if (result.action === "inserted") ingested++;
      else updated++;
    }

    // Record the outcome of this run.
    await ctx.runMutation(internal.ingestion.logIngestion, {
      source: "sam.gov",
      status: "completed",
      recordsProcessed: data.opportunitiesData?.length ?? 0,
      recordsInserted: ingested,
      recordsUpdated: updated,
    });

    return { ingested, updated, source: "sam.gov" };
  },
});

Upsert Mutation

// convex/rfps.ts (internal mutation)
import { internalMutation } from "./_generated/server";
import { v } from "convex/values";

export const upsert = internalMutation({
  args: {
    externalId: v.string(),
    source: v.string(),
    title: v.string(),
    description: v.string(),
    location: v.string(),
    category: v.string(),
    postedDate: v.number(),
    expiryDate: v.number(),
    url: v.string(),
    rawData: v.optional(v.any()),
  },
  handler: async (ctx, args) => {
    // Requires an index named "by_external_id" on ["externalId", "source"].
    const existing = await ctx.db
      .query("rfps")
      .withIndex("by_external_id", (q) =>
        q.eq("externalId", args.externalId).eq("source", args.source)
      )
      .first();

    const now = Date.now();

    // Existing record: overwrite the supplied fields and keep the original ingestedAt.
    if (existing) {
      await ctx.db.patch(existing._id, { ...args, updatedAt: now });
      return { id: existing._id, action: "updated" as const };
    }

    // New record: stamp both ingestedAt and updatedAt.
    const id = await ctx.db.insert("rfps", {
      ...args,
      ingestedAt: now,
      updatedAt: now,
    });
    return { id, action: "inserted" as const };
  },
});
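
The upsert looks records up through an index named `by_external_id`, so the `rfps` table needs a matching definition. The fragment below is a sketch of what that could look like, assuming the schema lives in `convex/schema.ts` and that the table stores exactly the upsert arguments plus the two timestamps; the project's real schema may carry more fields.

```typescript
// convex/schema.ts (hypothetical; field list mirrors the upsert arguments)
import { defineSchema, defineTable } from "convex/server";
import { v } from "convex/values";

export default defineSchema({
  rfps: defineTable({
    externalId: v.string(),
    source: v.string(),
    title: v.string(),
    description: v.string(),
    location: v.string(),
    category: v.string(),
    postedDate: v.number(),
    expiryDate: v.number(),
    url: v.string(),
    rawData: v.optional(v.any()),
    ingestedAt: v.number(),
    updatedAt: v.number(),
  })
    // Field order matters: the upsert filters on externalId first, then source.
    .index("by_external_id", ["externalId", "source"]),
});
```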

Deduplication Strategy

  1. Exact match: externalId + source combination
  2. Title similarity: Fuzzy match titles within same deadline window
  3. URL canonicalization: Normalize URLs before comparison
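
Steps 2 and 3 can be sketched as follows. Token-set Jaccard similarity stands in for "fuzzy match" here; the skill does not say which measure it uses. The 0.8 threshold, the three-day deadline window, the tracking-parameter list, and all function names are assumptions.

```typescript
// Query parameters treated as tracking noise (an assumption; extend as needed).
const TRACKING_PARAMS = /^(utm_|fbclid$|gclid$)/i;

// Normalizes a URL so that trivially different links compare equal.
function canonicalizeUrl(input: string): string {
  const url = new URL(input);
  url.hash = "";
  url.hostname = url.hostname.replace(/^www\./, "");
  for (const key of Array.from(url.searchParams.keys())) {
    if (TRACKING_PARAMS.test(key)) url.searchParams.delete(key);
  }
  url.searchParams.sort();
  url.pathname = url.pathname.replace(/\/+$/, "") || "/";
  return url.toString();
}

// Lowercased alphanumeric tokens of length > 1.
function titleTokens(title: string): Set<string> {
  return new Set(title.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 1));
}

// Jaccard similarity of the two token sets, in [0, 1].
function titleSimilarity(a: string, b: string): number {
  const ta = titleTokens(a);
  const tb = titleTokens(b);
  if (ta.size === 0 && tb.size === 0) return 1;
  let shared = 0;
  ta.forEach((t) => {
    if (tb.has(t)) shared++;
  });
  return shared / (ta.size + tb.size - shared);
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Flags two records as likely duplicates when titles are close and deadlines nearby.
function isLikelyDuplicate(
  a: { title: string; expiryDate: number },
  b: { title: string; expiryDate: number },
  threshold = 0.8,
  windowMs = 3 * DAY_MS
): boolean {
  return (
    Math.abs(a.expiryDate - b.expiryDate) <= windowMs &&
    titleSimilarity(a.title, b.title) >= threshold
  );
}
```

Running the cheap exact-match check first and reserving the fuzzy comparison for records from different sources keeps the quadratic title comparison off the common path.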

Eligibility Pre-Filtering

Detect disqualifiers during ingestion:

const DISQUALIFIER_PATTERNS = [
  { pattern: /u\.?s\.?\s*(citizen|company|organization)\s*only/i, flag: "us-org-only" },
  { pattern: /onshore\s*(only|required)/i, flag: "onshore-required" },
  { pattern: /on-?site\s*(required|mandatory)/i, flag: "onsite-required" },
  { pattern: /security\s*clearance\s*required/i, flag: "clearance-required" },
  { pattern: /small\s*business\s*set[- ]aside/i, flag: "small-business-set-aside" },
];

function detectEligibilityFlags(text: string): string[] {
  return DISQUALIFIER_PATTERNS
    .filter(({ pattern }) => pattern.test(text))
    .map(({ flag }) => flag);
}

Scheduled Ingestion

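// convex/crons.ts
import { cronJobs } from "convex/server";
import { api } from "./_generated/api";

const crons = cronJobs();

// ingestFromSam is registered with `action` (public), so it is exposed on
// `api`, not `internal`. Switch to `internal` only if it becomes an internalAction.
crons.interval(
  "ingest-sam-gov",
  { hours: 6 },
  api.ingestion.ingestFromSam,
  { daysBack: 3 }
);

export default crons;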

Error Handling

| Error Type | Action |
|---|---|
| Rate limit (429) | Exponential backoff, retry after delay |
| Auth error (401/403) | Log error, alert admin |
| Server error (5xx) | Retry up to 3 times |
| Parse error | Log raw data, skip record |
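
One way to encode the HTTP rows of this table is a small status classifier plus a retry loop. This is a sketch, not the skill's implementation: the delays (1 s base, 30 s cap, 500 ms pause for 5xx), the `FetchLike` injection point, and every name below are assumptions. Injecting the fetch function keeps the loop testable without network access.

```typescript
type RetryDecision = "ok" | "backoff" | "retry" | "alert" | "fail";

// Maps an HTTP status to the action in the table above.
function classifyStatus(status: number): RetryDecision {
  if (status >= 200 && status < 300) return "ok";
  if (status === 429) return "backoff";
  if (status === 401 || status === 403) return "alert";
  if (status >= 500 && status < 600) return "retry";
  return "fail";
}

// Exponential backoff: 1s, 2s, 4s, ... capped at 30s (values are assumptions).
function backoffDelayMs(attempt: number, baseMs = 1000, capMs = 30000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Minimal fetch-like signature so a stub can be passed in tests.
type FetchLike<T extends { status: number }> = (url: string) => Promise<T>;

async function fetchWithRetry<T extends { status: number }>(
  doFetch: FetchLike<T>,
  url: string,
  maxRetries = 3
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    const response = await doFetch(url);
    const decision = classifyStatus(response.status);
    if (decision === "ok") return response;
    // Auth errors and other 4xx are not retried; the caller logs and alerts.
    if (decision === "alert" || decision === "fail" || attempt >= maxRetries) {
      throw new Error(`Request failed with status ${response.status} after ${attempt + 1} attempt(s)`);
    }
    // 429 waits with exponential backoff; 5xx retries after a short fixed pause.
    await sleep(decision === "backoff" ? backoffDelayMs(attempt) : 500);
  }
}
```

Parse errors are handled per record rather than per request, so they belong in the loop that maps each result, not in this wrapper.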

Testing Approach

  1. Mock API responses for unit tests
  2. Use sandbox/test endpoints when available
  3. Validate schema transformation
  4. Test deduplication logic
  5. Verify eligibility flag detection