/ocr
| Feature | Supported | 
|---|---|
| Cost Tracking | ✅ | 
| Logging | ✅ (Basic Logging not supported) | 
| Load Balancing | ✅ | 
| Supported Providers | mistral, azure_ai | 
Tip: LiteLLM follows the Mistral API request/response format for the OCR API.
LiteLLM Python SDK Usage
Quick Start
from litellm import ocr
import os
os.environ["MISTRAL_API_KEY"] = "sk-.."
response = ocr(
    model="mistral/mistral-ocr-latest",
    document={
        "type": "document_url",
        "document_url": "https://arxiv.org/pdf/2201.04234"
    }
)
# Access extracted text
for page in response.pages:
    print(f"Page {page.index}:")
    print(page.markdown)
Async Usage
from litellm import aocr
import os, asyncio
os.environ["MISTRAL_API_KEY"] = "sk-.."
async def test_async_ocr(): 
    response = await aocr(
        model="mistral/mistral-ocr-latest",
        document={
            "type": "document_url",
            "document_url": "https://arxiv.org/pdf/2201.04234"
        }
    )
    
    # Access extracted text
    for page in response.pages:
        print(f"Page {page.index}:")
        print(page.markdown)
asyncio.run(test_async_ocr())
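When several documents need OCR, the async entry point can be fanned out with a concurrency cap. The sketch below is one possible pattern, not part of LiteLLM: `ocr_many`, its `ocr_fn` parameter, and the `max_concurrency` default are hypothetical names chosen for illustration. It assumes `ocr_fn` is an awaitable that accepts `model` and `document` keyword arguments, as `aocr` does in the example above.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List


async def ocr_many(
    urls: List[str],
    ocr_fn: Callable[..., Awaitable[Any]],
    model: str = "mistral/mistral-ocr-latest",
    max_concurrency: int = 4,  # hypothetical default; tune for your rate limits
) -> List[Any]:
    """Run one OCR call per URL, with at most `max_concurrency` in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run_one(url: str) -> Any:
        document: Dict[str, str] = {"type": "document_url", "document_url": url}
        async with semaphore:
            return await ocr_fn(model=model, document=document)

    # gather() returns results in the same order as `urls`
    return await asyncio.gather(*(run_one(url) for url in urls))
```

With LiteLLM installed, this could be called as `asyncio.run(ocr_many(urls, ocr_fn=aocr))`. Passing the OCR function in as an argument keeps the helper easy to test with a stub.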
Using Base64 Encoded Documents
import base64
from litellm import ocr
# Encode PDF to base64
with open("document.pdf", "rb") as f:
    base64_pdf = base64.b64encode(f.read()).decode('utf-8')
response = ocr(
    model="mistral/mistral-ocr-latest",
    document={
        "type": "document_url",
        "document_url": f"data:application/pdf;base64,{base64_pdf}"
    }
)
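If documents arrive as bytes rather than files on disk (for example from an upload handler), the same data URI can be built by a small helper. This is a minimal sketch; `bytes_to_document` and its `mime_type` default are hypothetical names, and it only covers the `document_url` form shown above.

```python
import base64


def bytes_to_document(data: bytes, mime_type: str = "application/pdf") -> dict:
    """Wrap raw file bytes in a base64 data URI inside a document payload."""
    encoded = base64.b64encode(data).decode("utf-8")
    return {
        "type": "document_url",
        "document_url": f"data:{mime_type};base64,{encoded}",
    }
```

The returned dict can be passed directly as the `document` argument of `ocr(...)`.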
Optional Parameters
from litellm import ocr

response = ocr(
    model="mistral/mistral-ocr-latest",
    document={
        "type": "document_url",
        "document_url": "https://example.com/doc.pdf"
    },
    # Optional Mistral parameters
    pages=[0, 1, 2],              # Only process specific pages (0-indexed)
    include_image_base64=True,     # Include extracted images
    image_limit=10,                # Max images to return
    image_min_size=100             # Min image size to include
)
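Because `pages` takes 0-indexed values while people usually quote 1-based page numbers, a small conversion step can prevent off-by-one mistakes. The helper below is an illustrative sketch; `parse_page_spec` and the `"1-3,5"` spec syntax are assumptions made for this example, not something LiteLLM or Mistral defines.

```python
from typing import List


def parse_page_spec(spec: str) -> List[int]:
    """Turn a 1-based spec like "1-3,5" into sorted 0-indexed page indices."""
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start, end = (int(x) for x in part.split("-", 1))
        else:
            start = end = int(part)
        if start < 1 or end < start:
            raise ValueError(f"invalid page range: {part!r}")
        pages.update(range(start - 1, end))  # shift to 0-indexed
    return sorted(pages)
```

The result can be passed as `pages=parse_page_spec("1-3,5")`.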
LiteLLM Proxy Usage
LiteLLM provides a Mistral API compatible /ocr endpoint for OCR calls.
Setup
Add this to your litellm proxy config.yaml
model_list:
  - model_name: mistral-ocr
    litellm_params:
      model: mistral/mistral-ocr-latest
      api_key: os.environ/MISTRAL_API_KEY
Start litellm
litellm --config /path/to/config.yaml
# RUNNING on http://0.0.0.0:4000
Test request
curl http://0.0.0.0:4000/v1/ocr \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistral-ocr",
    "document": {
        "type": "document_url",
        "document_url": "https://arxiv.org/pdf/2201.04234"
    }
  }'
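The same request can be prepared from Python with only the standard library. This sketch builds the request object without sending it; `build_ocr_request` is a hypothetical helper, and its defaults simply mirror the address, placeholder key, and model name used in the curl example.

```python
import json
import urllib.request


def build_ocr_request(
    document_url: str,
    base_url: str = "http://0.0.0.0:4000",  # proxy address from the curl example
    api_key: str = "sk-1234",               # placeholder key from the curl example
    model: str = "mistral-ocr",             # model_name from config.yaml
) -> urllib.request.Request:
    """Build a POST request for the proxy's /v1/ocr endpoint."""
    body = {
        "model": model,
        "document": {"type": "document_url", "document_url": document_url},
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/ocr",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# To send it against a running proxy:
# with urllib.request.urlopen(build_ocr_request("https://arxiv.org/pdf/2201.04234")) as resp:
#     result = json.load(resp)
```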
Request/Response Format
Info: LiteLLM follows the Mistral OCR API specification. See the official Mistral OCR documentation for complete details.
Example Request
Only model and document are required; the remaining fields are optional.
{
    "model": "mistral/mistral-ocr-latest",
    "document": {
        "type": "document_url",
        "document_url": "https://arxiv.org/pdf/2201.04234"
    },
    "pages": [0, 1, 2],
    "include_image_base64": true,
    "image_limit": 10,
    "image_min_size": 100
}
Request Parameters
| Parameter | Type | Required | Description | 
|---|---|---|---|
| model | string | Yes | The OCR model to use (e.g., "mistral/mistral-ocr-latest") | 
| document | object | Yes | Document to process. Must contain a type field and the matching URL field | 
| document.type | string | Yes | Either "document_url" for PDFs/docs or "image_url" for images | 
| document.document_url | string | Conditional | URL to the document (required if type is "document_url") | 
| document.image_url | string | Conditional | URL to the image (required if type is "image_url") | 
| pages | array | No | List of specific page indices to process (0-indexed) | 
| include_image_base64 | boolean | No | Whether to include extracted images as base64 strings | 
| image_limit | integer | No | Maximum number of images to return | 
| image_min_size | integer | No | Minimum size (in pixels) for images to include | 
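The conditional rules in this table (the URL field must match `document.type`) are easy to get wrong, so a client-side check before sending can save a round trip. The following is a rough sketch based only on the table above; `validate_ocr_request` and `URL_FIELD_BY_TYPE` are hypothetical names, and the real API may enforce additional constraints.

```python
from typing import Any, Dict, List

# Each document type and the URL field it requires, per the parameter table.
URL_FIELD_BY_TYPE = {"document_url": "document_url", "image_url": "image_url"}


def validate_ocr_request(payload: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the payload looks valid."""
    errors: List[str] = []
    if not isinstance(payload.get("model"), str):
        errors.append("model must be a string")

    document = payload.get("document")
    if not isinstance(document, dict):
        errors.append("document must be an object")
        return errors

    doc_type = document.get("type")
    url_field = URL_FIELD_BY_TYPE.get(doc_type) if isinstance(doc_type, str) else None
    if url_field is None:
        errors.append('document.type must be "document_url" or "image_url"')
    elif not isinstance(document.get(url_field), str):
        errors.append(f"document.{url_field} is required when type is {doc_type!r}")

    pages = payload.get("pages")
    if pages is not None and not (
        isinstance(pages, list) and all(isinstance(p, int) and p >= 0 for p in pages)
    ):
        errors.append("pages must be a list of non-negative integers")
    return errors
```

Returning a list of messages instead of raising on the first problem lets a caller report every issue at once.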
Document Format Examples
For PDFs and documents:
{
  "type": "document_url",
  "document_url": "https://example.com/document.pdf"
}
For images:
{
  "type": "image_url",
  "image_url": "https://example.com/image.png"
}
For base64-encoded content:
{
  "type": "document_url",
  "document_url": "data:application/pdf;base64,JVBERi0xLjQKJ..."
}
Response Format
The response follows Mistral's OCR format with the following structure:
{
  "pages": [
    {
      "index": 0,
      "markdown": "# Document Title\n\nExtracted text content...",
      "dimensions": {
        "dpi": 200,
        "height": 2200,
        "width": 1700
      },
      "images": [
        {
          "image_base64": "base64string...",
          "bbox": {
            "x": 100,
            "y": 200,
            "width": 300,
            "height": 400
          }
        }
      ]
    }
  ],
  "model": "mistral-ocr-2505-completion",
  "usage_info": {
    "pages_processed": 29,
    "doc_size_bytes": 3002783
  },
  "document_annotation": null,
  "object": "ocr"
}
Response Fields
| Field | Type | Description | 
|---|---|---|
| pages | array | List of processed pages with extracted content | 
| pages[].index | integer | Page number (0-indexed) | 
| pages[].markdown | string | Extracted text in Markdown format | 
| pages[].dimensions | object | Page dimensions (dpi, height, width in pixels) | 
| pages[].images | array | Extracted images from the page (if include_image_base64=true) | 
| model | string | The model used for OCR processing | 
| usage_info | object | Processing statistics (pages processed, document size) | 
| document_annotation | object | Optional document-level annotations | 
| object | string | Always "ocr" for OCR responses | 
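A common next step is to stitch the per-page Markdown back into one document. The sketch below works on the JSON form of the response (a plain dict, as returned by the proxy endpoint); `combine_markdown` and `summarize_usage` are hypothetical helpers. The SDK response in the examples above exposes the same fields as attributes (`response.pages`, `page.markdown`), so the attribute access would need adjusting there.

```python
from typing import Any, Dict


def combine_markdown(response: Dict[str, Any], separator: str = "\n\n") -> str:
    """Join page Markdown in page order, sorted by the 0-indexed `index` field."""
    pages = sorted(response.get("pages", []), key=lambda page: page["index"])
    return separator.join(page["markdown"] for page in pages)


def summarize_usage(response: Dict[str, Any]) -> str:
    """Format the usage_info block as a short human-readable string."""
    usage = response.get("usage_info") or {}
    pages_processed = usage.get("pages_processed", 0)
    doc_size_bytes = usage.get("doc_size_bytes", 0)
    return f"{pages_processed} pages, {doc_size_bytes} bytes"
```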
Supported Providers
| Provider | Link to Usage | 
|---|---|
| Mistral AI | Usage | 
| Azure AI | Usage |