Endpoint

POST /upload
Upload one or more documents to receive document IDs for extraction. Upload stores the files; schema generation and extraction perform document conversion later.

Authentication

X-API-Key
string
required
Your unique API key, sent in the X-API-Key request header.

Request Parameters

files
file[]
required
One or more document files to upload. Extraction-compatible formats:
  • application/pdf (.pdf)
  • image/jpeg (.jpg, .jpeg)
  • image/png (.png)
  • image/tiff (.tiff)
  • image/bmp (.bmp)
Other file types may upload, but schema generation and extraction can fail if the backend cannot convert the file.
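The format list above pairs each MIME type with its file extensions. As a minimal sketch, that pairing can be used to label each multipart part with an explicit content type. The helper names `content_type_for` and `multipart_part` are hypothetical, and whether the backend reads the part's content type is not stated here; the tuple shape is the one the `requests` library accepts in its `files` argument.

```python
import os
from typing import BinaryIO, Optional, Tuple

# Extension -> MIME type, copied from the extraction-compatible format list.
CONTENT_TYPES = {
    ".pdf": "application/pdf",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".tiff": "image/tiff",
    ".bmp": "image/bmp",
}


def content_type_for(path: str) -> Optional[str]:
    """Return the MIME type for a listed extension, or None if it is not listed."""
    ext = os.path.splitext(path)[1].lower()
    return CONTENT_TYPES.get(ext)


def multipart_part(path: str, handle: BinaryIO) -> Tuple[str, Tuple[str, BinaryIO, str]]:
    """Build one ("files", (filename, fileobj, content_type)) entry (hypothetical helper)."""
    mime = content_type_for(path)
    if mime is None:
        raise ValueError(f"Not an extraction-compatible format: {path}")
    return ("files", (os.path.basename(path), handle, mime))
```

A list of such entries can be passed as `files=` to `requests.post` in place of bare file handles.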

Response

document_ids
string[]
Array of UUID v4 document identifiers. Use these IDs for extraction requests.
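Since the response is described as an array of UUID v4 strings, a client may want to sanity-check it before storing the IDs. The following is a sketch only: `is_uuid4` and `parse_document_ids` are hypothetical helper names, and the check that the number of IDs equals the number of uploaded files is an assumption, not documented behavior.

```python
import uuid
from typing import Any, List


def is_uuid4(value: Any) -> bool:
    """Return True if value is a canonical, hyphenated UUID version 4 string."""
    if not isinstance(value, str):
        return False
    try:
        parsed = uuid.UUID(value)
    except ValueError:
        return False
    # Reject alternative spellings (braces, urn: prefix, missing hyphens).
    return parsed.version == 4 and str(parsed) == value.lower()


def parse_document_ids(payload: Any, expected_count: int) -> List[str]:
    """Validate a decoded upload response body (hypothetical helper)."""
    if not isinstance(payload, list):
        raise ValueError("Expected a JSON array of document IDs")
    bad = [item for item in payload if not is_uuid4(item)]
    if bad:
        raise ValueError(f"Not UUID v4 values: {bad}")
    # Assumption: the API returns one ID per uploaded file.
    if len(payload) != expected_count:
        raise ValueError(f"Expected {expected_count} IDs, got {len(payload)}")
    return payload
```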

Examples

curl -X POST https://api.documind.cloud/api/v1/upload \
  -H 'X-API-Key: YOUR_API_KEY' \
  -F 'files=@invoice1.pdf' \
  -F 'files=@invoice2.pdf' \
  -F 'files=@receipt.jpg'
[
  "550e8400-e29b-41d4-a716-446655440000"
]

File Size & Limits

The backend application code does not enforce a per-file limit or a per-request file count limit, and none is documented. Infrastructure in front of the backend may still reject oversized requests before they reach it. For larger batches, split the files across multiple upload requests and retry any request that fails.
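The split-and-retry advice can be sketched as follows. Everything here is illustrative: `chunked` and `upload_in_batches` are hypothetical helpers, `upload_batch` stands for any function you supply that uploads a list of paths and returns their document IDs, and the batch size, attempt count, and delay are assumed values rather than documented limits.

```python
import time
from typing import Callable, Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of items with at most size elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def upload_in_batches(
    file_paths: Sequence[str],
    upload_batch: Callable[[List[str]], List[str]],
    batch_size: int = 10,     # assumed value, not a documented limit
    max_attempts: int = 3,    # assumed value
    base_delay: float = 1.0,  # seconds before the first retry; assumed value
) -> List[str]:
    """Upload batch by batch, retrying a failed batch with exponential backoff."""
    document_ids: List[str] = []
    for batch in chunked(file_paths, batch_size):
        for attempt in range(1, max_attempts + 1):
            try:
                document_ids.extend(upload_batch(batch))
                break
            except Exception:
                # Broad catch keeps the sketch short; narrow it in real code.
                if attempt == max_attempts:
                    raise
                time.sleep(base_delay * 2 ** (attempt - 1))
    return document_ids
```

Retrying only the failed batch, rather than the whole job, avoids re-uploading files that already received document IDs.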

Storage Duration

No public retention SLA is exposed by the backend. Keep your own source documents if you need long-term archival.

Processing Time

The API returns document IDs after the files are stored. OCR, layout analysis, and model processing happen later during schema generation or extraction; the backend does not expose a public upload timing SLA.

Error Responses

400 Bad Request

Malformed multipart request:
{
  "detail": "There was an error parsing the body"
}
Common causes:
  • Missing files form field
  • Invalid multipart body
  • Corrupt upload stream

413 Payload Too Large

Request exceeds size limits:
{
  "detail": "Request Entity Too Large"
}
Solution: Split your batch into smaller requests.

500 Internal Server Error

Server-side processing error:
{
  "detail": "Failed to upload documents. Please try again later."
}
Solution: Retry the request. If the error persists, contact support.
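The three documented error responses each suggest a different client reaction. A minimal sketch of that mapping is shown below; `classify_upload_response` and the action labels are hypothetical, and the handling of a non-JSON body is an assumption (for example, a 413 produced by infrastructure in front of the backend may not carry a JSON `detail` field).

```python
from typing import Any, Tuple

# Documented status codes mapped to the suggested client reaction.
ACTIONS = {
    400: "fix_request",  # malformed multipart body or missing files field
    413: "split_batch",  # request too large: send fewer or smaller files
    500: "retry",        # server-side error: retry, then contact support
}


def classify_upload_response(status_code: int, body: Any) -> Tuple[str, str]:
    """Return (action, detail) for an upload response (hypothetical helper).

    body is the decoded JSON body, or None if the body was not JSON.
    """
    if 200 <= status_code < 300:
        return ("ok", "")
    detail = body.get("detail", "") if isinstance(body, dict) else ""
    return (ACTIONS.get(status_code, "unexpected"), str(detail))
```

Status codes that are not listed in this section fall through to `"unexpected"`, so the caller can log them instead of guessing at a reaction.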

Best Practices

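Upload multiple documents in a single request to reduce API calls:
import requests
from contextlib import ExitStack

url = "https://api.documind.cloud/api/v1/upload"
headers = {"X-API-Key": "YOUR_API_KEY"}

# ✓ Good: Batch upload (one request; ExitStack closes every file afterwards)
with ExitStack() as stack:
    files = [("files", stack.enter_context(open(f, "rb"))) for f in file_paths]
    response = requests.post(url, headers=headers, files=files)

# ✗ Avoid: One request per file
for file_path in file_paths:
    with open(file_path, "rb") as f:
        response = requests.post(url, headers=headers, files={"files": f})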
Always close file handles after upload:
# Using context manager (recommended)
with open("document.pdf", "rb") as f:
    response = requests.post(url, headers=headers, files={"files": f})

# Or explicitly close
files = [("files", open(f, "rb")) for f in file_paths]
try:
    response = requests.post(url, headers=headers, files=files)
finally:
    for _, fh in files:
        fh.close()
Check file format before uploading:
import os

SUPPORTED = {'.pdf', '.jpg', '.jpeg', '.png', '.tiff', '.bmp'}

def validate_file(file_path):
    """Return True if the extension is an extraction-compatible format."""
    ext = os.path.splitext(file_path)[1].lower()
    return ext in SUPPORTED

# Validate before upload: keep supported files, collect the rest for review
valid_files = [f for f in file_paths if validate_file(f)]
skipped_files = [f for f in file_paths if not validate_file(f)]
Map document IDs to original filenames for tracking:
import json
import os

# Create mapping (one file per request, so index 0 is this file's ID)
filename_to_id = {}

for file_path in file_paths:
    with open(file_path, "rb") as f:
        response = requests.post(url, headers=headers, files={"files": f})
    response.raise_for_status()  # stop on 4xx/5xx instead of parsing an error body
    doc_id = response.json()[0]
    filename_to_id[os.path.basename(file_path)] = doc_id

# Save mapping for later reference
with open("document_mapping.json", "w") as f:
    json.dump(filename_to_id, f, indent=2)

Next Steps

After uploading documents, you can:

Generate Schema

Auto-generate extraction schema from a sample document

Extract Data

Extract structured data using a schema

List Documents

View documents with extraction records