co-labs-co

ios-vision-ocr

@co-labs-co/ios-vision-ocr
Updated 4/13/2026


Installation

npx agent-skills-cli install @co-labs-co/ios-vision-ocr
Claude Code
Cursor
Copilot
Codex
Antigravity

Details

Path: skill/ios-vision-ocr/SKILL.md
Branch: main
Scoped Name: @co-labs-co/ios-vision-ocr

Usage

After installing, this skill will be available to your AI coding assistant.

Verify installation:

npx agent-skills-cli list

Skill Instructions


name: ios-vision-ocr
version: 0.1.0
author: cmtzco
description: iOS Vision Framework OCR for text recognition from images. When to use: when extracting text from images, implementing document scanning, building photo analysis features, working with VNRecognizeTextRequest, or processing captured photos. What problems it solves: provides accurate on-device text recognition, supports multiple languages, handles image preprocessing, and processes images efficiently.

iOS Vision OCR

Overview

The Vision Framework's text recognition capabilities enable iOS apps to extract text from images and photos. This skill covers setting up VNRecognizeTextRequest, configuring image handlers, processing recognition results, and handling different image sources.

Identified Patterns

This skill addresses the following patterns identified in the codebase:

Pattern 1: Text Recognition Request Setup

  • Location: DadBrain/DadBrain/ViewModels/PhotoScanner.swift
  • Description: VNRecognizeTextRequest initialization with configuration
  • Frequency: 1 complete implementation

Pattern 2: Image Handler Processing

  • Location: DadBrain/DadBrain/ViewModels/PhotoScanner.swift
  • Description: VNImageRequestHandler for processing captured images
  • Frequency: 1 complete implementation

Pattern 3: Result Extraction

  • Location: DadBrain/DadBrain/ViewModels/PhotoScanner.swift
  • Description: Processing VNRecognizedTextObservation results
  • Frequency: 1 complete implementation
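
PhotoScanner.swift is not reproduced here, so the following is only a minimal sketch of what Pattern 3 (result extraction) might look like. The function name `recognizedLines` and the `minimumConfidence` default of 0.3 are hypothetical and not taken from the codebase.

```swift
import Vision

/// Hypothetical helper (not from PhotoScanner.swift): collects the best
/// candidate string from each text observation, skipping weak results.
func recognizedLines(from request: VNRequest,
                     minimumConfidence: Float = 0.3) -> [String] {
    // `results` is nil until the request has been performed, and it is
    // declared on the base class, so downcast to the text observation type.
    guard let observations = request.results as? [VNRecognizedTextObservation] else {
        return []
    }
    return observations.compactMap { observation in
        // topCandidates(1) returns at most one candidate, best first.
        guard let candidate = observation.topCandidates(1).first,
              candidate.confidence >= minimumConfidence else {
            return nil
        }
        return candidate.string
    }
}
```

The confidence threshold is an illustrative choice; a real implementation would tune it against sample images.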

TODO: Implementation

This skeleton skill needs Phase 2 refinement. Areas to develop:

Workflow Instructions

<!-- TODO: Add step-by-step workflow for Vision OCR integration -->
  1. Import Vision framework
  2. Configure VNRecognizeTextRequest with recognition level
  3. Create VNImageRequestHandler from image data
  4. Perform recognition request
  5. Extract and process text results
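
The five steps above can be sketched roughly as follows. This is an assumption-laden outline, not the codebase's implementation: `recognizeText(in:orientation:)` is a hypothetical name, and the choice of `.accurate` is only one reasonable default.

```swift
// Step 1: import the frameworks.
import Vision
import CoreGraphics
import ImageIO

/// Hypothetical wrapper illustrating steps 2-5; names are illustrative.
func recognizeText(in cgImage: CGImage,
                   orientation: CGImagePropertyOrientation = .up) throws -> [String] {
    // Step 2: configure the request with a recognition level.
    let request = VNRecognizeTextRequest()
    request.recognitionLevel = .accurate

    // Step 3: create the handler for the image, passing its orientation.
    let handler = VNImageRequestHandler(cgImage: cgImage,
                                        orientation: orientation,
                                        options: [:])

    // Step 4: perform the request. This call is synchronous and can throw.
    try handler.perform([request])

    // Step 5: take the top candidate string from each observation.
    // Assumes an SDK where `results` is typed as
    // [VNRecognizedTextObservation]?; older SDKs need an explicit cast.
    let observations = request.results ?? []
    return observations.compactMap { $0.topCandidates(1).first?.string }
}
```

Because `perform(_:)` blocks until recognition finishes, a caller would typically invoke this from a background queue and hop back to the main queue before updating UI.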

Best Practices

<!-- TODO: Research and document Vision Framework best practices -->
  • Choosing recognition accuracy levels
  • Handling multiple text regions
  • Processing images efficiently
  • Supporting multiple languages

Common Pitfalls

<!-- TODO: Document error-prone areas and how to avoid them -->
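  • Leaving the recognition level at its default (.accurate) when speed matters, or using .fast when accuracy matters
  • Not passing the image orientation to the request handler
  • Processing very large images without downscaling (performance)
  • Missing camera or photo library usage descriptions in Info.plist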
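
As a starting point for this section, the sketch below touches three of the pitfalls: orientation, oversized images, and camera permission. The helper names `fittedSize` and `ensureCameraAccess` are hypothetical, and any maximum dimension passed in is an assumption to be tuned, not a documented limit.

```swift
import UIKit
import ImageIO
import AVFoundation

extension CGImagePropertyOrientation {
    /// UIImage.Orientation and CGImagePropertyOrientation use different raw
    /// values, so the cases are mapped one by one rather than by raw value.
    init(_ orientation: UIImage.Orientation) {
        switch orientation {
        case .up: self = .up
        case .upMirrored: self = .upMirrored
        case .down: self = .down
        case .downMirrored: self = .downMirrored
        case .left: self = .left
        case .leftMirrored: self = .leftMirrored
        case .right: self = .right
        case .rightMirrored: self = .rightMirrored
        @unknown default: self = .up
        }
    }
}

/// Hypothetical helper: returns a size whose longer side is at most
/// `maxDimension`, preserving aspect ratio. Smaller images are unchanged.
func fittedSize(_ size: CGSize, maxDimension: CGFloat) -> CGSize {
    let longest = max(size.width, size.height)
    guard longest > maxDimension, longest > 0 else { return size }
    let scale = maxDimension / longest
    return CGSize(width: (size.width * scale).rounded(),
                  height: (size.height * scale).rounded())
}

/// Hypothetical helper: resolves camera access before starting capture.
/// The completion may be called on an arbitrary queue.
func ensureCameraAccess(_ completion: @escaping @Sendable (Bool) -> Void) {
    switch AVCaptureDevice.authorizationStatus(for: .video) {
    case .authorized:
        completion(true)
    case .notDetermined:
        // Prompts the user; requires NSCameraUsageDescription in Info.plist.
        AVCaptureDevice.requestAccess(for: .video, completionHandler: completion)
    default:
        // .denied or .restricted: the app cannot prompt again.
        completion(false)
    }
}
```

A caller would pass `CGImagePropertyOrientation(image.imageOrientation)` to the request handler, so that rotated photos are recognized upright.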

Suggested Resources

Based on codebase analysis, consider adding:

Scripts (scripts/)

  • vision-text-extractor.swift - Standalone text extraction utility
  • ocr-test-generator.swift - Test image generator for OCR
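
Neither script exists yet. As a rough idea of what a test image generator could do, the sketch below renders a known string into an image, so that OCR output can be compared against it. The function name `makeTestImage`, the canvas size, and the font size are all hypothetical choices.

```swift
import UIKit

/// Hypothetical generator: draws black text on a white canvas so that the
/// expected OCR output is known in advance.
func makeTestImage(text: String,
                   size: CGSize = CGSize(width: 600, height: 200)) -> UIImage {
    let renderer = UIGraphicsImageRenderer(size: size)
    return renderer.image { context in
        // Fill the background so the text has high contrast.
        UIColor.white.setFill()
        context.fill(CGRect(origin: .zero, size: size))

        let attributes: [NSAttributedString.Key: Any] = [
            .font: UIFont.systemFont(ofSize: 48),
            .foregroundColor: UIColor.black
        ]
        // Draw the string with a small left margin.
        (text as NSString).draw(at: CGPoint(x: 20, y: 60),
                                withAttributes: attributes)
    }
}
```

The returned image's `cgImage` can be fed into a recognition request to check that the recognized string matches the input.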

References (references/)

  • vision-framework-guide.md - Apple's Vision Framework documentation
  • text-recognition-config.md - VNRecognizeTextRequest configuration options
  • image-preprocessing.md - Preparing images for optimal OCR results

Assets (assets/)

  • recognition-templates/ - Different recognition scenarios
  • test-images/ - Sample images for testing OCR

Evidence from Codebase

| File | Pattern | Relevance |
| --- | --- | --- |
| DadBrain/DadBrain/ViewModels/PhotoScanner.swift | Complete Vision OCR integration | Full implementation with result processing |
| DadBrain/DadBrain/Resources/Info.plist | Camera permissions | NSCameraUsageDescription, NSPhotoLibraryUsageDescription |

Refinement Priority

Score: 6.6/10
Priority: Medium

Refinement Tasks

  1. Research Vision Framework documentation
  2. Create vision-text-extractor script
  3. Document recognition level options
  4. Add examples for different image sources
  5. Create optimization guide for performance

Skeleton generated by ContextHarness /baseline. Run Phase 2 skill refinement to complete the implementation.