PDF OCR API for Developers


    A PDF OCR API built for teams that need reliable document automation at scale. Extract text from scanned PDFs with simple REST requests, predictable output quality, and production-grade uptime. Use it for invoice extraction, archive digitization, and search indexing pipelines. Includes clear docs, SDK-ready endpoints, and quick testing in your browser.

    What it does

    Runs OCR on scanned PDFs, optionally limited to a selected page range.

    Configure lang, dpi, psm, and oem to trade accuracy against speed.

    Well suited to invoice parsing, archive digitization, and search indexing pipelines.

    Endpoint & Example

    POST /v1/pdf/ocr/parse

    url / file
    required

    The scanned PDF to process, supplied either as a public URL (url) or as a multipart file upload (file).

    pages
    optional

    Page selection string like all, 1-3, or 1,3,5-7.

    lang, dpi, psm, oem
    optional

    OCR language (lang), resolution (dpi), and the Tesseract page segmentation mode (psm) and OCR engine mode (oem), used to control quality and speed.
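
    The page selection format described above can be validated on the client before a request is sent. The following is a minimal Python sketch, not the API's own parser: the function name parse_pages, the page_count argument, and the assumed semantics (1-based page numbers, inclusive ranges, duplicates merged) are illustrative assumptions, since the page only gives the examples all, 1-3, and 1,3,5-7.

```python
def parse_pages(spec: str, page_count: int) -> list[int]:
    """Expand a page selection string into sorted 1-based page numbers.

    Assumed semantics (not confirmed by the API docs): pages are 1-based,
    ranges such as "5-7" are inclusive, and "all" selects every page.
    """
    spec = spec.strip().lower()
    if spec == "all":
        return list(range(1, page_count + 1))

    pages: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            raise ValueError(f"empty item in page selection: {spec!r}")
        if "-" in part:
            # A range such as "5-7"; int() raises ValueError on bad input.
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            # A single page such as "3".
            start = end = int(part)
        if start < 1 or end < start or end > page_count:
            raise ValueError(f"page range out of bounds: {part!r}")
        pages.update(range(start, end + 1))
    return sorted(pages)
```

    Checking the string locally catches malformed selections such as 3-1 or 0 before they reach the service. How the service itself treats such input is not documented on this page.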

    curl -X POST https://pdfapihub.com/api/v1/pdf/ocr/parse \
      -H "CLIENT-API-KEY: your_api_key_here" \
      -H "Content-Type: application/json" \
      -d '{
        "url": "https://example.com/scanned.pdf",
        "pages": "1-3",
        "lang": "eng",
        "dpi": 220,
        "psm": 3,
        "oem": 3
      }'
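
    The same request can be issued from Python. The sketch below mirrors the curl call using only the standard library. The helper names build_ocr_request and run_ocr, the default argument values, and the assumption that the response body is JSON are illustrative and not taken from the API documentation.

```python
import json
import urllib.request

# Endpoint URL as used in the curl example.
API_URL = "https://pdfapihub.com/api/v1/pdf/ocr/parse"


def build_ocr_request(api_key, pdf_url, pages="all", lang="eng",
                      dpi=220, psm=3, oem=3):
    """Build (but do not send) the POST request for the OCR endpoint.

    Defaults here are illustrative, copied from the curl example where
    possible; the API's real defaults are not stated on this page.
    """
    payload = {
        "url": pdf_url,
        "pages": pages,
        "lang": lang,
        "dpi": dpi,
        "psm": psm,
        "oem": oem,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # urllib normalizes header capitalization; HTTP header
            # names are case-insensitive, so this is harmless.
            "CLIENT-API-KEY": api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def run_ocr(api_key, pdf_url, **options):
    """Send the request and decode the body, assuming a JSON response."""
    request = build_ocr_request(api_key, pdf_url, **options)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))
```

    Calling run_ocr("your_api_key_here", "https://example.com/scanned.pdf", pages="1-3") would send the same payload as the curl example. Building the request separately from sending it makes the payload easy to inspect or test without network access.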

    Sandbox

    Run OCR on your first document

    Get an API key and validate OCR quality in the playground before wiring up your workflow.