# OCR API
Extract text from image and document files programmatically. The OCR API uses an asynchronous task flow: create a task, read `data.task_id`, then poll the query endpoint until the task finishes.
> Note: The result file URL is valid for 1 hour. Download and store it promptly.
## Base URL
```
https://techhk.aoscdn.com
```
## Authentication
Every request must include your API key in the `X-API-KEY` request header:
```http
X-API-KEY: YOUR_API_KEY
```
Get or manage your API key from [API Key](https://picwish.com/my-account?subRoute=api-key).
## Request mode
OCR currently uses an asynchronous task flow:
1. Create an OCR task with a source file.
2. Read `data.task_id` from the create response.
3. Poll the query endpoint until `data.state=1` or `data.state<0`.
## Source file
Exactly one source file is required:
- `image_url` - a publicly reachable source file URL. Do not use ports other than 80 or 443.
- `image_file` - the source file uploaded as multipart binary data.
Do not send both at the same time. If one file source parameter is present, the other file source parameter must be empty.
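The "exactly one source" rule can be checked on the client before any request is sent. The following is a minimal sketch, not part of the official API: the helper name `build_source_field` and the `(filename, content)` tuple used for uploads are assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple, Union
from urllib.parse import urlsplit

# A form value is either plain text or a (filename, content) pair for uploads.
# This representation is an assumption of the sketch, not defined by the API.
FormValue = Union[str, Tuple[str, bytes]]
ALLOWED_PORTS = (None, 80, 443)  # None means the URL relies on the scheme's default port


def build_source_field(
    image_url: Optional[str] = None,
    image_file: Optional[Tuple[str, bytes]] = None,
) -> Dict[str, FormValue]:
    """Return the single source field to send; raise if zero or both are given."""
    has_url = bool(image_url)
    has_file = image_file is not None
    if has_url == has_file:
        raise ValueError("Provide exactly one of image_url or image_file")
    if has_url:
        # The documentation forbids ports other than 80 or 443.
        if urlsplit(image_url).port not in ALLOWED_PORTS:
            raise ValueError("image_url must use port 80 or 443")
        return {"image_url": image_url}
    return {"image_file": image_file}
```

Failing early on the client avoids a request that would likely be rejected with a 400 status.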
## Endpoints
| Purpose | Method | Path |
| --- | --- | --- |
| Create an OCR task | POST | /api/tasks/document/ocr |
| Query OCR result | GET | /api/tasks/document/ocr/{task_id} |
## Create an OCR task
`POST /api/tasks/document/ocr`
Content-Type: `multipart/form-data`
Each successful call consumes 1 credit.
### Body parameters
| Name | Type | Required | Description |
| --- | --- | --- | --- |
| image_url | string | one of image_url / image_file | Source file URL. Mutually exclusive with image_file. Do not use ports other than 80 or 443. |
| image_file | file | one of image_url / image_file | Source file as binary multipart data. Mutually exclusive with image_url. For upload requirements, see Guidelines and Limits. |
| language | string | optional | Source file language. Default: ChinesePRC, English, and Digits. A file can contain up to 10 languages. Use comma-separated, case-sensitive names, for example `English,ChinesePRC,Digits`. |
| password | string | optional | Password of the source file if it is password protected. Maximum length: 32 characters. |
| format | string | optional | Result file format. Supported values: `txt`, `pdf`, `docx`, `xlsx`, `pptx`. |
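The limits in the table above (up to 10 languages, a password of at most 32 characters, five supported formats) can be enforced before sending. This is a sketch under assumptions: the function name and its keyword arguments are hypothetical, and language names are not validated because the full list of accepted names is not given here.

```python
from typing import Dict, Optional, Sequence

SUPPORTED_FORMATS = ("txt", "pdf", "docx", "xlsx", "pptx")
MAX_LANGUAGES = 10
MAX_PASSWORD_LENGTH = 32


def build_optional_fields(
    languages: Optional[Sequence[str]] = None,
    password: Optional[str] = None,
    output_format: Optional[str] = None,
) -> Dict[str, str]:
    """Build the optional form fields, checking the documented limits."""
    fields: Dict[str, str] = {}
    if languages:
        if isinstance(languages, str):
            raise TypeError("languages must be a list of names, not a single string")
        if len(languages) > MAX_LANGUAGES:
            raise ValueError(f"At most {MAX_LANGUAGES} languages are allowed")
        # Names are case-sensitive, so they are joined exactly as given.
        fields["language"] = ",".join(languages)
    if password is not None:
        if len(password) > MAX_PASSWORD_LENGTH:
            raise ValueError(f"Password is limited to {MAX_PASSWORD_LENGTH} characters")
        fields["password"] = password
    if output_format is not None:
        if output_format not in SUPPORTED_FORMATS:
            raise ValueError(f"Unsupported format: {output_format}")
        fields["format"] = output_format
    return fields
```

For example, `build_optional_fields(languages=["English", "Digits"], output_format="txt")` yields the `language` and `format` fields ready to be added to the multipart body.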
### Return parameters
| Name | Type | Description |
| --- | --- | --- |
| status | number | HTTP-style status code. 200 = success, non-200 = failure. See Status codes. |
| message | string | Response message. If the task fails, refer to this message or contact support with it. |
| data.task_id | string | OCR task ID. Use it to query the result later. |
### Examples
Source file URL:
```bash
curl -k 'https://techhk.aoscdn.com/api/tasks/document/ocr' \
-H 'X-API-KEY: YOUR_API_KEY' \
-F 'image_url=YOUR_FILE_URL' \
-F 'format=txt'
```
Local file upload:
```bash
curl -k 'https://techhk.aoscdn.com/api/tasks/document/ocr' \
-H 'X-API-KEY: YOUR_API_KEY' \
-F 'image_file=@/path/to/file.pdf' \
-F 'format=txt'
```
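For callers who prefer Python over curl, the same create request can be sketched with the standard library alone. This is an illustration rather than an official client: `encode_multipart` and `create_ocr_task` are hypothetical names, and the code assumes the response body is JSON with the `status`, `message`, and `data.task_id` fields described above.

```python
import json
import urllib.request
import uuid
from typing import Dict, Tuple, Union

BASE_URL = "https://techhk.aoscdn.com"
# Plain text, or a (filename, content) pair for a file upload (sketch convention).
FormValue = Union[str, Tuple[str, bytes]]


def encode_multipart(fields: Dict[str, FormValue]) -> Tuple[bytes, str]:
    """Encode fields as multipart/form-data; return (body, content_type)."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        if isinstance(value, tuple):
            filename, content = value
            header = (
                f'Content-Disposition: form-data; name="{name}"; filename="{filename}"\r\n'
                "Content-Type: application/octet-stream\r\n\r\n"
            )
        else:
            content = value.encode("utf-8")
            header = f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
        parts.append(f"--{boundary}\r\n".encode() + header.encode("utf-8") + content + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def create_ocr_task(api_key: str, fields: Dict[str, FormValue]) -> str:
    """POST the form and return data.task_id (assumes a JSON response body)."""
    body, content_type = encode_multipart(fields)
    request = urllib.request.Request(
        f"{BASE_URL}/api/tasks/document/ocr",
        data=body,
        headers={"X-API-KEY": api_key, "Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        payload = json.load(response)
    if payload.get("status") != 200:
        raise RuntimeError(f"Create failed: {payload.get('message')}")
    return payload["data"]["task_id"]
```

A call such as `create_ocr_task("YOUR_API_KEY", {"image_url": "YOUR_FILE_URL", "format": "txt"})` mirrors the first curl example. For an upload, pass `{"image_file": ("file.pdf", data)}` where `data` holds the file bytes.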
## Query OCR result
`GET /api/tasks/document/ocr/{task_id}`
Poll this endpoint after creating a task. We recommend polling every 1 second and keeping the total polling duration within 120 seconds. Stop polling when `data.state=1` or `data.state<0`.
### Path parameters
| Name | Type | Required | Description |
| --- | --- | --- | --- |
| task_id | string | required | The OCR task ID returned by the create request. |
### Return parameters
| Name | Type | Description |
| --- | --- | --- |
| status | number | HTTP-style status code. 200 = success, non-200 = failure. See Status codes. |
| message | string | Response message. If the task fails, refer to this message or contact support with it. |
| data.task_id | string | OCR task ID. If the task fails, contact support with this task_id. |
| data.created_at | string | Task creation time as a Unix timestamp string. |
| data.processed_at | string | Task processing start time as a Unix timestamp string. |
| data.completed_at | string | Task completion time as a Unix timestamp string. |
| data.file | string | OCR result file URL. The URL is valid for 1 hour after processing succeeds. |
| data.progress | number | Task progress. 100 means processing is complete. |
| data.state | number | 1 = succeeded; > 1 = still processing; < 0 = failed. See Status codes. |
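The return parameters can be mapped onto a small typed object so that the `data.state` rules are applied in one place. The sketch below is illustrative only: `TaskSnapshot` and `parse_query_response` are hypothetical names, and the payload is assumed to be the decoded JSON body with a nested `data` object.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class TaskSnapshot:
    task_id: Optional[str]
    state: int
    progress: Optional[float]
    file_url: Optional[str]

    @property
    def succeeded(self) -> bool:
        return self.state == 1

    @property
    def failed(self) -> bool:
        return self.state < 0

    @property
    def finished(self) -> bool:
        # State 0 (queued) and states above 1 are still in progress.
        return self.succeeded or self.failed


def parse_query_response(payload: Dict[str, Any]) -> TaskSnapshot:
    """Turn a decoded query response into a TaskSnapshot."""
    if payload.get("status") != 200:
        raise RuntimeError(f"Query failed: {payload.get('message')}")
    data = payload.get("data") or {}
    return TaskSnapshot(
        task_id=data.get("task_id"),
        state=int(data["state"]),
        progress=data.get("progress"),
        file_url=data.get("file"),
    )
```

A caller can then keep polling while `snapshot.finished` is false and read `snapshot.file_url` once `snapshot.succeeded` is true.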
### Example
```bash
curl -k 'https://techhk.aoscdn.com/api/tasks/document/ocr/YOUR_TASK_ID' \
-H 'X-API-KEY: YOUR_API_KEY'
```
## Recommended asynchronous flow
1. POST to /api/tasks/document/ocr with exactly one source file.
2. Read `data.task_id` from the create response.
3. GET /api/tasks/document/ocr/{task_id} every 1 second.
4. Stop polling when `data.state=1` or `data.state<0`, or when polling reaches about 120 seconds.
5. Download `data.file` within 1 hour after processing succeeds.
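The polling steps above can be sketched as a loop with a 1-second interval and a 120-second budget. This is one possible shape, not an official client: `query_ocr_task` and `wait_for_result` are hypothetical names, the response is assumed to be JSON, and `sleep` and `clock` are injectable only so the loop can be exercised without real waiting.

```python
import json
import time
import urllib.request
from typing import Any, Callable, Dict

BASE_URL = "https://techhk.aoscdn.com"


def query_ocr_task(api_key: str, task_id: str) -> Dict[str, Any]:
    """GET the task status; assumes the response body is JSON."""
    request = urllib.request.Request(
        f"{BASE_URL}/api/tasks/document/ocr/{task_id}",
        headers={"X-API-KEY": api_key},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def wait_for_result(
    fetch: Callable[[], Dict[str, Any]],
    interval: float = 1.0,
    timeout: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll until data.state is 1 (return data.file) or negative (raise)."""
    deadline = clock() + timeout
    while True:
        payload = fetch()
        if payload.get("status") != 200:
            raise RuntimeError(f"Query failed: {payload.get('message')}")
        data = payload.get("data") or {}
        state = data.get("state")
        if state == 1:
            return data["file"]
        if state is not None and state < 0:
            raise RuntimeError(f"OCR task failed with state {state}")
        if clock() >= deadline:
            raise TimeoutError("OCR task did not finish within the polling budget")
        sleep(interval)
```

In practice the loop would be started with `wait_for_result(lambda: query_ocr_task("YOUR_API_KEY", task_id))`, and the returned URL should be downloaded within the 1-hour validity window.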
## Guidelines and Limits
- The result file URL is valid for **1 hour**. Please download the result file promptly.
- HTTP status 200 indicates that the HTTP request succeeded, not necessarily that the OCR task succeeded. Check `data.state` for task success or failure.
- When passing `image_url`, follow URL encoding standards and do not use ports other than 80 or 443.
- Uploaded files must meet the following format, resolution, and file size limits.
| Source format | Output format | Resolution | File size |
| --- | --- | --- | --- |
| pdf, ppt, pptx, xls, xlsx, doc, docx, jpeg, jpg, png, gif, bmp | pdf, docx, pptx, xlsx, txt | Image files up to 32512 x 32512 | Up to 200 MB |
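A local pre-flight check against the limits in this table can catch oversized or unsupported files before upload. The sketch below rests on stated assumptions: `check_upload` is a hypothetical helper, the source format is inferred from the file extension, 1 MB is taken as 1024 x 1024 bytes, and the resolution limit is read as a per-side maximum.

```python
import os
from typing import List, Optional

SUPPORTED_SOURCE_EXTENSIONS = {
    "pdf", "ppt", "pptx", "xls", "xlsx", "doc", "docx",
    "jpeg", "jpg", "png", "gif", "bmp",
}
MAX_FILE_SIZE_BYTES = 200 * 1024 * 1024  # assumes 1 MB = 1024 * 1024 bytes
MAX_IMAGE_SIDE = 32512  # assumes the limit applies to each side separately


def check_upload(
    filename: str,
    size_bytes: int,
    width: Optional[int] = None,
    height: Optional[int] = None,
) -> List[str]:
    """Return a list of problems; an empty list means no limit was violated."""
    problems: List[str] = []
    extension = os.path.splitext(filename)[1].lower().lstrip(".")
    if extension not in SUPPORTED_SOURCE_EXTENSIONS:
        problems.append(f"Unsupported source format: {extension or '(none)'}")
    if size_bytes > MAX_FILE_SIZE_BYTES:
        problems.append("File is larger than 200 MB")
    # Width and height are optional because they only apply to image files.
    if width is not None and height is not None:
        if width > MAX_IMAGE_SIDE or height > MAX_IMAGE_SIDE:
            problems.append("Image resolution exceeds 32512 x 32512")
    return problems
```

For a local file, `os.path.getsize(path)` supplies `size_bytes`; image dimensions would have to come from an imaging library, which is outside the scope of this sketch.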
## Status codes
Determine success by combining the HTTP response status code (`status`) with the task status code (`data.state`).
### HTTP response status codes
| Code | Meaning |
| --- | --- |
| 200 | The request is successful. |
| 400 | Wrong parameter passed by the client. Check whether a parameter is missing or has an incorrect value. |
| 401 | Unauthorized API key. Check that X-API-KEY is correct and the service is enabled. |
| 404 | The requested URL or resource does not exist. Check that the URL or task_id is correct. |
| 413 | The uploaded file exceeds the allowed size. See Guidelines and Limits for the supported file size. |
| 429 | Request frequency exceeds the QPS limit (default QPS is 2). Slow down or contact us to raise your QPS. |
| 500 | Server-side exception. Please contact support. |
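Because the default limit is 2 requests per second, a client may want to slow down and retry when it sees a 429. The following is a sketch of one possible policy, not a documented recommendation: `send_with_backoff` is a hypothetical helper, the exponential delay schedule is an assumption, and `send` is assumed to return the decoded response as a dict whose `status` field carries the code.

```python
import time
from typing import Any, Callable, Dict


def send_with_backoff(
    send: Callable[[], Dict[str, Any]],
    max_attempts: int = 4,
    base_delay: float = 0.5,  # 0.5 s matches the default limit of 2 requests per second
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call send(); on status 429, wait and retry with doubling delays."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        payload = send()
        if payload.get("status") != 429:
            return payload
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)
    # Still rate limited after the last attempt; let the caller decide what to do.
    return payload
```

Only 429 is retried here. Codes such as 400, 401, 404, and 413 indicate a problem with the request itself, so repeating the same call is unlikely to help.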
### Task status codes (data.state)
1 = succeeded; greater than 1 = still processing; less than 0 = failed.
| Code | Meaning |
| --- | --- |
| -17 | Processing failed because the prompt is invalid. |
| -16 | Processing failed because a third-party review detected prohibited content. |
| -15 | Processing failed due to insufficient resources. |
| -14 | Processing failed because the input image content does not meet the requirements. |
| -13 | Processing failed because the task was canceled due to an exception. |
| -11 | Processing failed because the result is empty. |
| -10 | Processing failed because internal review detected prohibited content. |
| -9 | Processing failed because the internal program failed during loop processing. |
| -8 | Processing timed out. The maximum processing time is 180 seconds. |
| -7 | Invalid image file (e.g. corrupted image or incorrect format). |
| -5 | The file at `image_url` exceeds the size limit (30 MB). |
| -3 | The server failed to download your file. Check that the source image URL is available. |
| -2 | Processing completed, but uploading the result to OSS failed. |
| -1 | Processing failed. |
| 0 | Queued. The task is waiting in the queue. |
| 1 | Completed. Processing succeeded. |
| 2 | Preparing. |
| 3 | Waiting. |
| 4 | Processing in progress. |
| 5 | Internally publishing the result. |
| 6 | Processing. Internal loop processing is in progress. |
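For logging or error messages, the table above can be turned into a lookup. This is a minimal sketch: `describe_state` is a hypothetical helper, the messages are shortened from the table, and codes not listed in the table fall back to the general rule (negative means failed, greater than 1 means still processing).

```python
from typing import Dict, Tuple

STATE_MEANINGS: Dict[int, str] = {
    -17: "Invalid prompt",
    -16: "Third-party review detected prohibited content",
    -15: "Insufficient resources",
    -14: "Input image content does not meet the requirements",
    -13: "Task canceled due to an exception",
    -11: "Result is empty",
    -10: "Internal review detected prohibited content",
    -9: "Internal program failed during loop processing",
    -8: "Processing timed out (maximum 180 seconds)",
    -7: "Invalid image file",
    -5: "image_url file exceeds the size limit",
    -3: "Server failed to download the source file",
    -2: "Uploading the result to OSS failed",
    -1: "Processing failed",
    0: "Queued",
    1: "Completed",
    2: "Preparing",
    3: "Waiting",
    4: "Processing in progress",
    5: "Internally publishing the result",
    6: "Internal loop processing in progress",
}


def describe_state(state: int) -> Tuple[str, str]:
    """Return (category, meaning), where category is succeeded, failed, or pending."""
    if state == 1:
        category = "succeeded"
    elif state < 0:
        category = "failed"
    else:
        category = "pending"  # 0 (queued) and every code above 1
    return category, STATE_MEANINGS.get(state, "Unlisted state code")
```

For example, `describe_state(-8)` returns `("failed", "Processing timed out (maximum 180 seconds)")`.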