120 changes: 120 additions & 0 deletions pages/generative-apis/how-to/query-ocr-models.mdx
@@ -0,0 +1,120 @@
---
title: How to query OCR models
description: Learn how to interact with powerful OCR models using Scaleway's Generative APIs service.
tags: generative-apis ai-data ocr-models ocr-api
dates:
validation: 2026-04-14
posted: 2026-04-14
---
import Requirements from '@macros/iam/requirements.mdx'

Scaleway's Generative APIs service allows users to interact with powerful OCR (Optical Character Recognition) models hosted on the platform.

OCR models can extract structured text from documents such as PDFs and images, preserving formatting and layout in the output.

<Message type="note">
OCR models are currently available via the [OCR API](https://www.scaleway.com/en/developers/api/generative-apis/#path-ocr-beta-create-a-text-extraction) only and are not yet integrated into the Scaleway console playground.
</Message>

<Requirements />

- A Scaleway account logged in to the [console](https://console.scaleway.com)
- [Owner](/iam/concepts/#owner) status or [IAM permissions](/iam/concepts/#permission) allowing you to perform actions in the intended Organization
- A valid [API key](/iam/how-to/create-api-keys/) for API authentication
- Python 3.7+ installed on your system

## Query OCR models via API

You can query the models programmatically using your favorite tools or languages.
In the example that follows, we will use the MistralAI Python client.

### Install the MistralAI SDK

Install the MistralAI SDK using pip:

```bash
pip install mistralai
```

### Initialize the client

Initialize the MistralAI client with your base URL and API key:

```python
from mistralai.client import Mistral

# Initialize the client with your server URL and API key
mistral = Mistral(
    server_url="https://api.scaleway.ai",  # Scaleway's Generative APIs service URL
    api_key="<SCW_SECRET_KEY>",  # Your unique API secret key from Scaleway
)
```

<Message type="important">
This code sample requires `mistralai >= 2.0.0`. For `mistralai <= 1.12.4` (also named `v1`), replace `from mistralai.client import Mistral` with `from mistralai import Mistral`.
</Message>
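
To avoid hardcoding the secret key in a script, you may prefer to read it from an environment variable. The snippet below is a minimal sketch: the `load_api_key` helper and the `SCW_SECRET_KEY` variable name are assumptions made for illustration, not part of the SDK.

```python
import os


def load_api_key(var_name: str = "SCW_SECRET_KEY") -> str:
    """Return the secret key stored in an environment variable.

    The variable name is a hypothetical convention; use any name you like.
    """
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {var_name} environment variable to your Scaleway secret key."
        )
    return key
```

You can then pass `api_key=load_api_key()` when creating the client instead of a literal string.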

### Generate an OCR text extraction

You can now generate a text extraction.
In the example below, the sample PDF file, [scaleway-impact-report-10-pages.pdf](https://genapi-documentation-assets.s3.fr-par.scw.cloud/scaleway-impact-report-10-pages.pdf), is sent to the OCR model via a public URL. The extracted text from each page is written to a local Markdown file.

```python
# Generate a text extraction using the 'mistral-ocr-2512' model
FILE_URL = "https://genapi-documentation-assets.s3.fr-par.scw.cloud/scaleway-impact-report-10-pages.pdf"
MODEL = "mistral-ocr-2512"

res = mistral.ocr.process(
    model=MODEL,
    document={
        "document_url": FILE_URL,
        "type": "document_url",
    },
)

filename = FILE_URL.split("/")[-1].split(".")[0]
# Write the Markdown of every page to a single local file
with open(f"{filename}.md", "w", encoding="utf-8") as f:
    for page in res.pages:
        f.write(page.markdown)

# Confirm where the result was stored
print(f"File processed. Result markdown file stored in: {filename}.md")
```

Once the script completes, a Markdown file named `scaleway-impact-report-10-pages.md` is created in the current directory, containing the extracted and formatted text from each page of the PDF.
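
The script above writes each page back to back, so page boundaries are not visible in the output file. If you want to keep them, one possible approach is to join the pages with a separator. This is only a sketch: `pages_to_markdown` is a hypothetical helper, and the `SimpleNamespace` objects stand in for the page objects of a real response, of which only the `markdown` attribute is used.

```python
from types import SimpleNamespace


def pages_to_markdown(pages, separator="\n\n---\n\n"):
    """Join the Markdown of each page, separated by a horizontal rule."""
    return separator.join(page.markdown.strip() for page in pages)


# Stand-in pages for demonstration; with a real response, pass `res.pages`.
sample_pages = [
    SimpleNamespace(markdown="# Page one\n"),
    SimpleNamespace(markdown="Page two text"),
]
combined = pages_to_markdown(sample_pages)
print(combined)
```

With a real response, you would write `pages_to_markdown(res.pages)` to the output file instead of looping over the pages.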

<Message type="tip">
You can replace `FILE_URL` with the URL of any publicly accessible PDF or image file.
For example, you can provide a file from Object Storage using an [Object Storage pre-signed URL](https://www.scaleway.com/en/docs/object-storage/how-to/access-objects-via-https/).
</Message>
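
Pre-signed URLs usually carry a query string, and file names may contain several dots, so deriving the output name by splitting on `/` and `.` can give unexpected results. The sketch below shows one way to extract the name with the standard library only; `output_stem` is a hypothetical helper, and the second URL in the usage lines is a made-up example.

```python
from pathlib import PurePosixPath
from urllib.parse import unquote, urlparse


def output_stem(file_url: str) -> str:
    """Return the file name of a URL without its final extension.

    The query string and fragment are ignored, and percent-encoded
    characters in the path are decoded.
    """
    path = unquote(urlparse(file_url).path)
    return PurePosixPath(path).stem


print(output_stem("https://genapi-documentation-assets.s3.fr-par.scw.cloud/scaleway-impact-report-10-pages.pdf"))
print(output_stem("https://example.com/docs/report.v2.pdf?X-Amz-Signature=abc"))
```

You could then use `filename = output_stem(FILE_URL)` in place of the `split`-based expression.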

Alternatively, you can provide a local PDF file encoded in Base64.

```python
import base64

FILE_PATH = "path/to/your/file.pdf"
MODEL = "mistral-ocr-2512"

# Read the local file and encode its content in Base64
with open(FILE_PATH, "rb") as file:
    file_content = file.read()
    encoded_file = base64.b64encode(file_content).decode("utf-8")

res = mistral.ocr.process(
    model=MODEL,
    document={
        "document_url": f"data:application/pdf;base64,{encoded_file}",
        "type": "document_url",
    },
)

filename = FILE_PATH.split("/")[-1].split(".")[0]
# Write the Markdown of every page to a single local file
with open(f"{filename}.md", "w", encoding="utf-8") as f:
    for page in res.pages:
        f.write(page.markdown)

# Confirm where the result was stored
print(f"File processed. Result markdown file stored in: {filename}.md")
```
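
If you process several local files, the encoding step can be wrapped in a small function. The following is a minimal sketch using only the standard library: `to_data_url` is a hypothetical helper that guesses the MIME type from the file extension. Whether data URLs for file types other than PDF are accepted is an assumption to check against the OCR API documentation.

```python
import base64
import mimetypes
from pathlib import Path


def to_data_url(file_path: str) -> str:
    """Encode a local file as a Base64 data URL.

    The MIME type is guessed from the file extension, for example
    `application/pdf` for a `.pdf` file.
    """
    mime_type, _ = mimetypes.guess_type(file_path)
    if mime_type is None:
        raise ValueError(f"Cannot guess the MIME type of {file_path!r}")
    encoded = base64.b64encode(Path(file_path).read_bytes()).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"
```

The returned string can be passed as the `document_url` value, as in the example above.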

Refer to the dedicated [OCR API documentation](https://www.scaleway.com/en/developers/api/generative-apis/#path-ocr-beta-create-a-text-extraction) for the full list of available parameters.
6 changes: 5 additions & 1 deletion pages/generative-apis/menu.ts
@@ -42,7 +42,11 @@ export const generativeApisMenu = {
label: 'Query audio models',
slug: 'query-audio-models'
},
{
label: 'Query OCR models',
slug: 'query-ocr-models'
},
{
label: 'Query reranking models',
slug: 'query-reranking-models'
},