What is Documind API?
Documind provides a powerful document extraction API designed for automation developers and integration engineers. Upload documents, define extraction schemas, and receive structured JSON data with confidence scores and automatic review flagging.
Key Capabilities
Multi-Model Extraction
Choose between Basic, VLM-based, or Advanced (multi-model ensemble) extraction modes with different accuracy and speed trade-offs.
Schema Flexibility
Use predefined schemas, generate from samples, or create custom schemas for any document type.
Review Workflow
Automatic flagging of low-confidence fields enables human-in-the-loop validation for critical data.
Credit-Based Usage
Transparent per-page pricing with different credit costs for Basic (2-6), VLM (10), and Advanced (15) extraction.
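To make the per-page pricing concrete, here is a minimal sketch that estimates the credit cost of a job from its page count. The per-page figures are the ones quoted above; the mode labels, the dictionary, and the function name are illustrative and not part of the Documind API.

```python
# Per-page credit costs as (min, max), taken from the pricing quoted above.
# Basic varies with the chosen model; VLM and Advanced are flat rates.
# The mode labels used as keys are illustrative, not API values.
CREDITS_PER_PAGE = {
    "basic": (2, 6),
    "vlm": (10, 10),
    "advanced": (15, 15),
}


def estimate_credits(pages: int, mode: str) -> tuple[int, int]:
    """Return the (lowest, highest) credit cost for `pages` pages in `mode`."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    low, high = CREDITS_PER_PAGE[mode]
    return pages * low, pages * high


if __name__ == "__main__":
    for mode in CREDITS_PER_PAGE:
        print(mode, estimate_credits(12, mode))
```

For a 12-page document this gives a range of 24 to 72 credits in Basic mode, depending on the model chosen.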
How It Works
1. Upload Documents
Submit PDF, Word, or image files via the /upload endpoint. Returns document IDs for processing.
2. Define Schema
Specify what data to extract using JSON Schema format. Generate schemas automatically or use predefined templates.
3. Extract Data
Process documents with /extract/{document_id}. Choose extraction mode and review threshold.
4. Handle Reviews
Poll /data/extractions to detect when is_reviewed=true for documents that needed review. Use corrected data in your automation.
Extraction Modes
Basic Extraction (2-6 credits/page)
Single-model extraction with your choice of:
- GPT-4o (6 credits) - Most accurate
- GPT-4.1 (4 credits) - Balanced
- Gemini 2.0 Flash (2 credits) - Fastest
VLM Based Extraction (10 credits/page)
Uses native visual data to process content through multiple Vision-Language Models.
Best for low-text, high-visual content like scanned documents, images, or forms where layout is crucial.
Advanced Extraction (15 credits/page)
Multi-model ensemble extraction utilizing document layout, reading order, and OCR’d text.
Includes confidence scores and automatic review flagging. Best for structured documents like invoices, forms, and tables requiring high accuracy. Activated by leaving the model and extraction_mode parameters unset.
Use Cases
Common Use Cases
- Invoice Processing: Extract line items, totals, vendor details
- Form Data Entry: Digitize paper forms into structured data
- Document Classification: Identify document types and route accordingly
- Compliance Checks: Extract specific fields for validation
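As an illustration of the invoice-processing case, a custom schema in JSON Schema format might look like the sketch below. The field names (vendor_name, invoice_total, line_items) are hypothetical, and any Documind-specific schema conventions are not covered here.

```python
import json

# Hypothetical invoice schema in JSON Schema format.
# Field names are illustrative; adapt them to your own documents.
INVOICE_SCHEMA = {
    "type": "object",
    "properties": {
        "vendor_name": {"type": "string"},
        "invoice_total": {"type": "number"},
        "line_items": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "description": {"type": "string"},
                    "quantity": {"type": "number"},
                    "amount": {"type": "number"},
                },
                "required": ["description", "amount"],
            },
        },
    },
    "required": ["vendor_name", "invoice_total"],
}

if __name__ == "__main__":
    # Serialize to the JSON text that would be sent along with a request.
    print(json.dumps(INVOICE_SCHEMA, indent=2))
```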
Integration Examples
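The four steps under How It Works can be sketched end to end as follows. This is a rough outline under several assumptions, not the official client: `send` is a caller-supplied function that performs one authenticated HTTP call and returns the decoded JSON body, and the response keys `document_ids` and `document_id`, the request body shape, and the list returned by /data/extractions are all guesses for illustration. Only the endpoint paths and the is_reviewed flag come from the text above.

```python
import time
from typing import Any, Callable, Optional

# Assumed transport signature: send(method, path, **kwargs) -> decoded JSON.
Send = Callable[..., Any]


def upload_and_extract(send: Send, file_path: str, schema: dict) -> str:
    """Upload one file, start extraction, and return its document ID."""
    uploaded = send("POST", "/upload", file_path=file_path)
    document_id = uploaded["document_ids"][0]  # assumed response key
    # No model or extraction_mode is passed, which selects Advanced extraction.
    # The {"schema": ...} body shape is an assumption.
    send("POST", f"/extract/{document_id}", json={"schema": schema})
    return document_id


def wait_for_review(send: Send, document_id: str,
                    interval: float = 5.0, max_polls: int = 60) -> Optional[dict]:
    """Poll /data/extractions until the document's record has is_reviewed set.

    Returns the reviewed record, or None if max_polls is exhausted.
    """
    for _ in range(max_polls):
        records = send("GET", "/data/extractions")  # assumed: a list of dicts
        for record in records:
            if record.get("document_id") == document_id and record.get("is_reviewed"):
                return record
        time.sleep(interval)
    return None
```

Injecting `send` keeps the flow independent of any particular HTTP library and lets the polling logic be exercised with a stub before it touches real credits.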
Authentication
All API requests require authentication using API keys passed in the X-API-Key header.
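A minimal sketch of attaching the key with Python's standard library is shown below. The base URL and the DOCUMIND_API_KEY environment variable name are placeholders chosen for this example; only the X-API-Key header name comes from the text above. The function builds the request without sending it.

```python
import os
import urllib.request

# Hypothetical base URL; substitute the one for your Documind account.
BASE_URL = "https://api.example.com"


def authenticated_request(path: str, api_key: str,
                          method: str = "GET") -> urllib.request.Request:
    """Build (but do not send) a request carrying the X-API-Key header."""
    return urllib.request.Request(
        BASE_URL + path,
        headers={"X-API-Key": api_key},
        method=method,
    )


if __name__ == "__main__":
    # DOCUMIND_API_KEY is an assumed environment variable name.
    key = os.environ.get("DOCUMIND_API_KEY", "")
    req = authenticated_request("/usage/current", key)
    print(req.get_method(), req.full_url)
```

Reading the key from the environment rather than hard-coding it keeps credentials out of source control.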
Rate Limits & Credits
- API Calls: Track usage via /usage/current endpoint
- Credits: Deducted per page/image processed
- Daily Refresh: Credits refresh based on your subscription tier
- Insufficient Credits: Returns 402 Payment Required status
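One way an automation might treat the 402 response is to raise a dedicated error so the job can pause rather than retry blindly. The sketch below assumes only the status code itself; the exception and function names are illustrative.

```python
class InsufficientCreditsError(Exception):
    """Raised when the API answers 402 Payment Required."""


def check_status(status_code: int) -> None:
    """Raise on error statuses, singling out 402 so callers can pause work."""
    if status_code == 402:
        raise InsufficientCreditsError("insufficient credits for this request")
    if status_code >= 400:
        raise RuntimeError(f"request failed with HTTP {status_code}")
```

A caller can catch InsufficientCreditsError separately from other failures, for example to queue remaining documents until credits refresh.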