
Endpoint

POST https://api.documind.cloud/api/v1/extract/{document_id}

Authentication

Requires the write:extractions scope. The request example on this page sends the API key in the X-API-Key header.

Path Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| document_id | string (UUID) | Yes | ID of the uploaded document |

Request Body

| Field | Type | Required | Description |
|---|---|---|---|
| schema | object | Yes | JSON Schema defining fields to extract |
| prompt | string | No | Custom extraction instructions |
| model | string | No | Model for Basic mode: google-gemini-2.0-flash, openai-gpt-4.1, or openai-gpt-4o |
| extraction_mode | string | No | Set to vlm for VLM mode |
| review_threshold | number | No | Confidence threshold for review (0-100, default: 80) |

Extraction Modes:
  • Basic: Set the model parameter (2-6 credits/page)
  • VLM: Set extraction_mode to vlm (10 credits/page)
  • Advanced: Omit both model and extraction_mode (15 credits/page, includes confidence scoring)
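
The mode selection above can be sketched as a small helper that builds the request body. This is a minimal sketch, not part of the API: the function name `build_extraction_body` and its `mode` argument are hypothetical, and it assumes that `prompt` and `review_threshold` may be combined with any mode, which the field table does not state explicitly.

```python
# Models listed for Basic mode in the request body table.
BASIC_MODELS = {"google-gemini-2.0-flash", "openai-gpt-4.1", "openai-gpt-4o"}


def build_extraction_body(schema, mode="advanced", model=None,
                          prompt=None, review_threshold=None):
    """Build the JSON body for one extraction mode (hypothetical helper)."""
    body = {"schema": schema}

    if mode == "basic":
        # Basic mode is selected by setting the model field.
        if model not in BASIC_MODELS:
            raise ValueError(f"Basic mode needs one of: {sorted(BASIC_MODELS)}")
        body["model"] = model
    elif mode == "vlm":
        # VLM mode is selected by setting extraction_mode to "vlm".
        body["extraction_mode"] = "vlm"
    elif mode != "advanced":
        # Advanced mode sends neither model nor extraction_mode.
        raise ValueError(f"Unknown mode: {mode!r}")

    if prompt is not None:
        body["prompt"] = prompt
    if review_threshold is not None:
        # The documented range is 0-100 (default: 80).
        if not 0 <= review_threshold <= 100:
            raise ValueError("review_threshold must be between 0 and 100")
        body["review_threshold"] = review_threshold

    return body
```

The returned dictionary can be passed as the `json=` argument of `requests.post`, for example `build_extraction_body(schema, "vlm")` for VLM mode.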

Response

Success (200)

{
  "document_id": "123e4567-e89b-12d3-a456-426614174000",
  "results": {
    "invoice_number": "INV-2024-001",
    "total": 1250.00,
    "vendor": {
      "name": "Acme Corp"
    }
  },
  "needs_review": false,
  "needs_review_metadata": {
    "confidence_scores": {},
    "review_flags": {}
  }
}
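
One way to consume this response is to branch on `needs_review` before using the extracted values. The sketch below is illustrative only: `split_by_review` is a hypothetical helper, and because the sample shows `confidence_scores` and `review_flags` as empty objects, it passes them through unchanged without assuming anything about their internal structure.

```python
def split_by_review(result):
    """Return (results, review_info); review_info is None when no review is needed."""
    results = result.get("results", {})

    if not result.get("needs_review", False):
        return results, None

    # Pass the review metadata through untouched for a human reviewer.
    metadata = result.get("needs_review_metadata") or {}
    review_info = {
        "confidence_scores": metadata.get("confidence_scores", {}),
        "review_flags": metadata.get("review_flags", {}),
    }
    return results, review_info
```

With the sample response above, `needs_review` is false, so the helper returns the `results` object and `None`.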

Examples

import requests

API_KEY = "your_api_key"
document_id = "123e4567-e89b-12d3-a456-426614174000"

# JSON Schema describing the fields to extract
schema = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "total": {"type": "number"}
    },
    "required": ["invoice_number", "total"]
}

# Basic mode
response = requests.post(
    f"https://api.documind.cloud/api/v1/extract/{document_id}",
    headers={"X-API-Key": API_KEY},
    json={
        "schema": schema,
        "model": "google-gemini-2.0-flash"
    }
)

# Raise an exception on 4xx/5xx responses before reading the body
response.raise_for_status()

result = response.json()
print(f"Invoice: {result['results']['invoice_number']}")
print(f"Total: ${result['results']['total']}")

Error Responses

| Code | Description |
|---|---|
| 400 | Invalid schema or parameters |
| 402 | Insufficient credits |
| 404 | Document not found |
| 500 | Extraction failed |
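
The status codes above can be turned into descriptive exceptions on the client side. This is a minimal sketch under stated assumptions: `ExtractionError` and `check_status` are hypothetical names, and the format of the error response body is not documented here, so the sketch relies on the status code alone.

```python
# Descriptions taken from the error table above.
ERROR_MEANINGS = {
    400: "Invalid schema or parameters",
    402: "Insufficient credits",
    404: "Document not found",
    500: "Extraction failed",
}


class ExtractionError(Exception):
    """Raised when the extract endpoint returns a non-200 status (hypothetical)."""

    def __init__(self, status_code, message):
        super().__init__(f"{status_code}: {message}")
        self.status_code = status_code
        self.message = message


def check_status(status_code):
    """Return None on 200; otherwise raise ExtractionError with the documented meaning."""
    if status_code == 200:
        return
    message = ERROR_MEANINGS.get(status_code, "Unexpected status code")
    raise ExtractionError(status_code, message)
```

After a `requests.post` call, `check_status(response.status_code)` can be called before `response.json()` so that failures carry the documented description.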

Next Steps