[!IMPORTANT] As of 2024-03-15, this repo is now archived, for reference only. For Clearly Local users, please see:
- New demo site at https://cl-tools.deno.dev/pages/easylt-ocr/samples/ocr-sample-editor-poc.html (requires Clearly Local Toolkit account)
- New repo at https://github.com/clearlylocal/easylt-ocr (requires Clearly Local GitHub access)
OcrSdk class
import { OcrSdk } from 'path/to/OcrSdk.ts'
// credentials obtained from https://cloud.ocrsdk.com/Account/Register
const ocrSdk = new OcrSdk({
  applicationId: Deno.env.get('ABBYY_APPLICATION_ID')!, // e.g. 7ea53f47-8bbc-477b-b17c-989a3184c363
  password: Deno.env.get('ABBYY_PASSWORD')!, // e.g. n6WL0rCFlhU9bDXDri6AQEZV
  serviceUrl: Deno.env.get('ABBYY_SERVICE_URL')!, // e.g. https://cloud-eu.ocrsdk.com/
})
// OCR the image and request plain-text output
const { txt } = await ocrSdk.ocr(await Deno.readFile('input.jpg'), {
  languages: ['English'],
  exportFormats: ['txt'],
})
await Deno.writeFile('output.txt', new Uint8Array(await txt.arrayBuffer()))
To run the CLI, note that the relevant ABBYY_APPLICATION_ID, ABBYY_PASSWORD, and ABBYY_SERVICE_URL must be available as environment variables.
# view help
deno task cli --help
# `convert` command, specifying output formats (default "txt")
deno task cli convert path/to/image.jpg -o txt -o xml
# `html`/`json` commands
deno task cli html path/to/image.jpg
deno task cli json path/to/image.jpg
# specify languages (default "English")
deno task cli json path/to/image.jpg -l ChinesePRC -l English
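Because the CLI depends on the three environment variables named above, it can help to check for them before invoking it. The following is a minimal sketch, not part of this repo: `missingEnvVars` and `EnvLookup` are hypothetical names, and the lookup is passed in as a function so the check stays independent of the runtime.

```typescript
// Names of the variables the CLI expects (taken from the text above).
const REQUIRED_ENV_VARS = [
  'ABBYY_APPLICATION_ID',
  'ABBYY_PASSWORD',
  'ABBYY_SERVICE_URL',
] as const

// Any function that maps a variable name to its value (or undefined if unset).
type EnvLookup = (name: string) => string | undefined

// Hypothetical helper: returns the names of required variables that are unset or empty.
function missingEnvVars(getEnv: EnvLookup): string[] {
  return REQUIRED_ENV_VARS.filter((name) => !getEnv(name))
}

// Example with an in-memory lookup that deliberately omits ABBYY_PASSWORD.
const fakeEnv: Record<string, string> = {
  ABBYY_APPLICATION_ID: 'example-id',
  ABBYY_SERVICE_URL: 'https://cloud-eu.ocrsdk.com/',
}
const missing = missingEnvVars((name) => fakeEnv[name])
if (missing.length > 0) {
  console.error(`Missing environment variables: ${missing.join(', ')}`)
}
```

Under Deno, the lookup would presumably be `(name) => Deno.env.get(name)`, which requires the `--allow-env` permission.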
- src/
  - core/
    - OcrSdk class, with various methods for interacting with the ABBYY Cloud OCR API. Loosely based on ABBYY’s sample JS code, but with the following changes:
      - ocr method to OCR an image and return the output file binary in the requested format
    - imageMap function, for converting XML output to an image map that can be rendered in HTML etc.
    - prettifyXml function, for pretty-printing XML output while preserving significant whitespace
  - OcrSdk cli/
  - functions/
    - OcrSdk’s ocr method to get text and XML files of the OCRed content
    - imageMap.ts
- samples/
  - convertImage.ts on ocr-sample.jpg
  - convertImage.ts on ocr-sample.jpg
  - htmlImageMap.ts on ocr-sample.jpg
  - jsonImageMap.ts on ocr-sample.jpg
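To illustrate the idea behind the imageMap function listed under core/, here is a rough sketch of turning recognized regions into an HTML image map. It is not the repo's implementation: the `Region` shape, `escapeAttr`, and `toHtmlImageMap` are hypothetical, and the regions are assumed to have already been extracted from the XML output as pixel rectangles.

```typescript
// One recognized region; l, t, r, b are assumed pixel edges (left, top, right, bottom).
interface Region {
  text: string
  l: number
  t: number
  r: number
  b: number
}

// Escape characters that are unsafe inside a double-quoted HTML attribute.
function escapeAttr(value: string): string {
  return value
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
}

// Hypothetical helper: build a <map> element with one rectangular <area> per region.
// For shape="rect", HTML expects coords in the order left, top, right, bottom.
function toHtmlImageMap(name: string, regions: Region[]): string {
  const areas = regions.map(({ text, l, t, r, b }) => {
    const label = escapeAttr(text)
    return `  <area shape="rect" coords="${l},${t},${r},${b}" title="${label}" alt="${label}">`
  })
  return [`<map name="${escapeAttr(name)}">`, ...areas, '</map>'].join('\n')
}

// Example input with made-up coordinates.
const html = toHtmlImageMap('ocr', [
  { text: 'Hello', l: 10, t: 20, r: 110, b: 50 },
  { text: '"World" & co.', l: 120, t: 20, r: 260, b: 50 },
])
console.log(html)
```

The resulting markup can be attached to the source image with `<img src="ocr-sample.jpg" usemap="#ocr">`, so that hovering over a region shows the recognized text.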