Document Information Extraction

The Document Information Extraction Agentic Skill extracts information from documents of any format and any information hierarchy, ranging from structured (e.g. tax forms) through semi-structured (e.g. invoices) to unstructured (e.g. contracts).

Users can set up and start using the Agentic Skill in Document Information Extraction in minutes, without any training data. They can also provide Instructions in natural language, both for the AI Skill as a whole and for each individual Label, which enables advanced data processing features such as Generative Reasoning and synthesis.

A single AI model can capture and process the full variety of a document type, including a wide range of formats, layouts and languages, with high accuracy and only a small amount of training data.

Our models can process and extract line items, tables, checkboxes, handwriting, stamps and QR codes, and can use vision to interpret images, and more.

Supported file formats

The supported file formats are as follows:

.pdf
.png, .jpg, .jpeg
.tif, .tiff
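
As a convenience, files can be checked against the list above before they are uploaded. The following is a minimal sketch rather than part of the product: the names `SUPPORTED_EXTENSIONS` and `is_supported_file` are hypothetical, and case-insensitive matching of extensions is an assumption, not documented behavior.

```python
from pathlib import Path

# Extensions copied from the list above.
# Assumption: matching is case-insensitive (".PDF" is treated like ".pdf").
SUPPORTED_EXTENSIONS = {".pdf", ".png", ".jpg", ".jpeg", ".tif", ".tiff"}


def is_supported_file(filename: str) -> bool:
    """Return True if the filename ends with one of the listed extensions."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS
```

A call such as `is_supported_file("invoice.pdf")` returns `True`, while `is_supported_file("contract.docx")` returns `False`. The check only looks at the file name, so it does not confirm that the file content is actually valid.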
