Rossum Developer Hub

Rossum Data Capture for Developers and Integrators


Data Capture Automation with Rossum

Data Capture Automation means that document import, processing, validation, and export happen seamlessly without waiting for human review in most cases.

This represents the most advanced stage of a Cognitive Data Capture process implementation. Consider this: you import a document to Rossum. Right away, Rossum's AI processes the document. Once done, Rossum decides whether the extracted data is good enough to send automatically to the downstream system, or whether a human needs to review and confirm the results.

To make this happen, Rossum offers automation on two levels:

  • Per-field automation allows operators to validate only a subset of fields, while fields that were confirmed automatically need not be revisited by a human. This is represented by the eye symbol next to a field (or the absence thereof); while the TAB key moves field by field, the ENTER key skips only between fields that still require manual validation. A basic level of per-field automation is active by default, and additional implementation can expand on it.
  • Whole-document automation allows operators to entirely skip the manual validation phase for an entire document. The basic idea is that a document is automated if all its captured fields are automated. The detailed mechanism depends on the configuration of a particular queue.
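The relationship between the two levels can be sketched in a few lines of Python. This is only an illustration of the basic idea described above, not Rossum's implementation: the `Field` class, its `automated` flag, and both helper functions are hypothetical names, and the actual mechanism depends on the queue configuration.

```python
from dataclasses import dataclass


@dataclass
class Field:
    """Hypothetical captured field; not a Rossum API object."""
    name: str
    value: str
    automated: bool  # True if the field was confirmed without human review


def fields_needing_review(fields):
    """Per-field automation: only non-automated fields go to the operator."""
    return [f.name for f in fields if not f.automated]


def document_is_automated(fields):
    """Whole-document automation: every captured field must be automated.

    An empty document is treated as not automated (an assumption of this sketch).
    """
    return bool(fields) and all(f.automated for f in fields)


invoice = [
    Field("invoice_id", "2024-0017", automated=True),
    Field("date_due", "2024-03-01", automated=True),
    Field("amount_total", "1210.00", automated=False),
]

print(fields_needing_review(invoice))   # ['amount_total']
print(document_is_automated(invoice))   # False
```

A single field that still needs review is enough to keep the whole document in manual validation, which is why per-field automation tends to deliver value earlier than whole-document automation.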

Automation Requirements

While whole-document automation is ultimately desirable, achieving it is typically a gradual (and often long-term) process, and intermediate phases can already bring a lot of value in terms of overall effort reduction when compared to manual data entry.

The main prerequisite for automation is a high level of AI accuracy. The required level of accuracy depends entirely on how the data is post-processed and used: we see users with 85% accuracy opting for full automation, while others demand 99.9% accuracy.

If the out-of-the-box accuracy of the system is sufficient, automation rollout is fairly straightforward: automate all documents by default, and optionally set up exception checks to keep specific cases for human review.

However, even if the general AI accuracy is lower than needed, Rossum can be configured to automate only the subset of documents that meets a sufficiently high level of accuracy. Whether a document is accurate enough to automate is determined from the AI output, the platform's built-in features, and custom rules defined in your integration.
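To make the idea of automating only a sufficiently accurate subset concrete, here is a minimal sketch of how one might pick a confidence threshold from previously reviewed data. Everything in it is an assumption made for illustration: the `history` list of (confidence, was_correct) pairs, the candidate thresholds, and both function names are hypothetical and not part of the Rossum platform.

```python
def subset_accuracy(history, threshold):
    """Accuracy and coverage of items whose confidence is >= threshold.

    history: list of (confidence, was_correct) pairs from past reviewed data.
    Returns (accuracy, coverage), or (None, 0.0) if nothing clears the threshold.
    """
    selected = [ok for conf, ok in history if conf >= threshold]
    if not selected:
        return None, 0.0
    return sum(selected) / len(selected), len(selected) / len(history)


def lowest_threshold_meeting(history, target_accuracy, candidates):
    """Return the lowest candidate threshold whose subset meets the target."""
    for t in sorted(candidates):
        accuracy, _ = subset_accuracy(history, t)
        if accuracy is not None and accuracy >= target_accuracy:
            return t
    return None  # no candidate is accurate enough; keep human review


# Made-up review history: (confidence score, value was correct)
history = [
    (0.99, True), (0.97, True), (0.95, True),
    (0.90, False), (0.85, True), (0.60, False),
]

threshold = lowest_threshold_meeting(history, 0.95, [0.5, 0.8, 0.9, 0.95])
print(threshold)                            # 0.95
print(subset_accuracy(history, threshold))  # (1.0, 0.5)
```

In this made-up data set, overall accuracy is only about 67%, yet half of the items can be automated at the chosen threshold without a single error. The trade-off between accuracy and coverage is the business decision the paragraph above describes.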

Key components of Automation

To decide whether a document meets the business's expectations and should therefore be sent automatically to a downstream system, Rossum needs a strong "compass". Use cases and perspectives on data extraction automation vary, so it's important for us to give you the flexibility to control automation levels and parameters. Currently, we work with a combination of these tools for automation:

  • Extraction confidence scores
  • Built-in validation checks
  • Custom validation checks

Extraction confidence scores

Rossum's AI engine processes your documents and captures the values. Extracted data comes with metadata that reveals a number of useful details, including the confidence score for each field on a given document. That score indicates how confident the AI engine is that it got both the text and the location of the field correct. Confidence scores range from 0 to 1, with the exception of line item fields, which aren't currently scored (for such fields the confidence score is "null").
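When consuming these scores in an integration, the "null" case needs explicit handling. The sketch below shows one way to do that in Python, where a null score arrives as `None`. The function name, the field names, and the threshold of 0.9 are arbitrary illustrative assumptions, not Rossum defaults.

```python
def confident_enough(score, threshold=0.9):
    """Decide whether a confidence score alone justifies skipping review.

    score: float in [0, 1], or None for unscored fields (e.g. line items).
    threshold: arbitrary illustrative value; tune it for your own use case.
    """
    if score is None:
        # Unscored fields cannot be confirmed by confidence alone.
        return False
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"confidence score out of range: {score}")
    return score >= threshold


# Hypothetical per-field scores for one document
scores = {"invoice_id": 0.99, "amount_total": 0.82, "item_description": None}

needs_review = [name for name, s in scores.items() if not confident_enough(s)]
print(needs_review)  # ['amount_total', 'item_description']
```

Treating `None` as "not confident" is the conservative choice; an unscored field can still be confirmed by a validation check rather than by its score.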

Built-in validation checks

In addition to the confidence scores, Rossum automatically runs a number of basic checks on some of the extracted values. For example, if a set of amount values sums up properly, Rossum trusts these values and enables automatic export. Also, repeated sets of values for a particular supplier increase confidence over time.

These checks are used to confirm various fields, and their results are saved as validation sources. We are actively adding more checks and extending the configuration of this feature.

Custom validation checks

Built-in checks are naturally limited to a small number of generalizable rules. You can build your own business-specific validation checks to augment the automation logic. Like built-in checks, custom checks may prevent documents with errors from being automatically passed to the downstream system (negative checks). Alternatively, custom checks may verify that values are certain to be correct, so that there is no reason to stop automation (positive checks).

Examples of such custom automation checks could be:

  • Mark value as validated if the “Total amount” on the invoice matches the sum of the “Total amount” column in line items
  • Mark value as validated if “Total amount” equals the sum of “Total base” and “Total tax”
  • Stop automation if “Due date” is before “Issue date”
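The three example checks above could be sketched as follows. This is a standalone illustration of the logic only: the `run_custom_checks` function, the dictionary keys, and the return shape are hypothetical and do not represent how custom checks are wired into Rossum. Amounts use `Decimal` to avoid floating-point rounding surprises when comparing sums.

```python
from datetime import date
from decimal import Decimal


def run_custom_checks(doc):
    """Return (validated_fields, stop_automation) for one document.

    `doc` is a plain dict with hypothetical keys; it is not a Rossum payload.
    """
    validated = set()

    # Positive check: total matches the sum of the line item totals.
    line_totals = doc["line_item_totals"]
    if line_totals and doc["amount_total"] == sum(line_totals, Decimal("0")):
        validated.add("amount_total")

    # Positive check: total equals base plus tax.
    if doc["amount_total"] == doc["amount_total_base"] + doc["amount_total_tax"]:
        validated.add("amount_total")

    # Negative check: a due date before the issue date stops automation.
    stop_automation = doc["date_due"] < doc["date_issue"]

    return validated, stop_automation


invoice = {
    "amount_total": Decimal("121.00"),
    "amount_total_base": Decimal("100.00"),
    "amount_total_tax": Decimal("21.00"),
    "line_item_totals": [Decimal("60.50"), Decimal("60.50")],
    "date_issue": date(2024, 2, 1),
    "date_due": date(2024, 3, 1),
}

validated, stop = run_custom_checks(invoice)
print(validated, stop)  # {'amount_total'} False
```

Note the asymmetry: a passing positive check marks a value as validated, while a failing positive check simply leaves the field for the other automation tools to decide. Only the negative check actively blocks automation.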

Updated 4 months ago

