abbyy-ocr-sdk

Clearly Local ABBYY Cloud OCR SDK

[!IMPORTANT] As of 2024-03-15, this repo is archived and kept for reference only. For Clearly Local users, please see:

Usage

OcrSdk class

import { OcrSdk } from 'path/to/OcrSdk.ts'

// credentials obtained from https://cloud.ocrsdk.com/Account/Register
const ocrSdk = new OcrSdk({
	applicationId: Deno.env.get('ABBYY_APPLICATION_ID')!, // e.g. 7ea53f47-8bbc-477b-b17c-989a3184c363
	password: Deno.env.get('ABBYY_PASSWORD')!, // e.g. n6WL0rCFlhU9bDXDri6AQEZV
	serviceUrl: Deno.env.get('ABBYY_SERVICE_URL')!, // e.g. https://cloud-eu.ocrsdk.com/
})

const { txt } = await ocrSdk.ocr(await Deno.readFile('input.jpg'), {
	languages: ['English'],
	exportFormats: ['txt'],
})

await Deno.writeFile('output.txt', new Uint8Array(await txt.arrayBuffer()))
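
The `!` non-null assertions in the example above only silence the type checker: if one of the variables is unset, `undefined` is passed straight to the constructor. Below is a minimal sketch of an up-front check. `REQUIRED_ENV_VARS` and `missingEnvVars` are hypothetical names introduced here for illustration and are not part of this repo.

```typescript
// Hypothetical helper, not part of this repo: reports which required
// variables are absent or blank in a plain key/value environment map.
const REQUIRED_ENV_VARS = [
	'ABBYY_APPLICATION_ID',
	'ABBYY_PASSWORD',
	'ABBYY_SERVICE_URL',
] as const

function missingEnvVars(
	env: Record<string, string | undefined>,
	required: readonly string[] = REQUIRED_ENV_VARS,
): string[] {
	// Unset, empty, and whitespace-only values all count as missing.
	return required.filter((name) => !env[name]?.trim())
}
```

In Deno, `Deno.env.toObject()` returns a map of this shape, so one option is to call `missingEnvVars(Deno.env.toObject())` before constructing `OcrSdk` and throw an error listing the returned names if the array is non-empty.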

CLI

The CLI requires ABBYY_APPLICATION_ID, ABBYY_PASSWORD, and ABBYY_SERVICE_URL to be set as environment variables.

# view help
deno task cli --help
# `convert` command, specifying output formats (default "txt")
deno task cli convert path/to/image.jpg -o txt -o xml
# `html`/`json` commands
deno task cli html path/to/image.jpg
deno task cli json path/to/image.jpg
# specify languages (default "English")
deno task cli json path/to/image.jpg -l ChinesePRC -l English

Files