Each document is associated with a set of data fields. Most commonly, these are fields captured from the document itself, such as the due date, order ID, or total amount. These fields are pre-captured by the AI engine, shown in the sidebar, and included in the export file. The list of fields is controlled by the schema.
The easiest way to customize your fields is the Fields to capture screen, which is part of the Settings section of the web app. You can open it using the gear icon near the top left corner. On this screen, you can simply enable or disable fields. Follow this visual guide.
If you are using a fresh trial account, you will soon find that your schema contains all the AI fields, not just those visible in your sidebar; the rest are simply marked as hidden and non-exported. This gives you a good starting point when customizing your schema, because the user interface for configuring fields to capture shows only the fields that are actually listed in the schema.
Therefore, the checkbox approach above does not actually add fields to the schema or remove them from it; it merely enables and disables them (by flipping their hidden and can_export flags).
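To make the flag flipping concrete, here is a minimal Python sketch of what enabling or disabling a field amounts to. Only the hidden and can_export flags come from the text above; the nested layout (sections with a "children" list, an "id" key on each node), the sample field ids, and the helper names iter_nodes and set_field_enabled are assumptions made for illustration, not a description of the actual schema or API.

```python
def iter_nodes(nodes):
    """Yield every node of a nested schema, depth first."""
    for node in nodes:
        yield node
        children = node.get("children", [])
        # Assumption: a node may hold a single child as a dict instead of a list.
        if isinstance(children, dict):
            children = [children]
        yield from iter_nodes(children)


def set_field_enabled(schema, field_id, enabled):
    """Flip the hidden and can_export flags of one field.

    Returns True if the field was found, False otherwise.
    The field stays in the schema either way.
    """
    for node in iter_nodes(schema):
        if node.get("id") == field_id:
            node["hidden"] = not enabled
            node["can_export"] = enabled
            return True
    return False


# Hypothetical schema fragment: one section containing two fields.
schema = [
    {
        "id": "basic_info",
        "children": [
            {"id": "date_due", "hidden": True, "can_export": False},
            {"id": "order_id", "hidden": False, "can_export": True},
        ],
    }
]

found = set_field_enabled(schema, "date_due", True)
field_count = sum(1 for _ in iter_nodes(schema))
```

Note that the number of nodes in the schema is the same before and after the call: enabling a field changes its flags, not the schema's contents.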
You can gain much finer-grained control over your fields by editing the schema at a lower level.
Fields in your schema may come in more shapes and forms than what you see in the sidebar. A sidebar field can be bound to multiple AI fields in a fallback manner, carry a default value, and have a different type; even advanced types such as dropdown select boxes or multivalue fields can be set up.
Another feature (relevant when writing your own extension) is the ability to pre-fill the value during document upload, not just based on the AI. Perhaps you already know during upload what PO an invoice ought to be associated with, or who it came from.
What does finer-grained control mean in practice? You can control the types, constraints, and defaults of fields, the options available in dropdown (enum) fields, which AI field (or multiple fields) serves as the data source, and even the labels shown in the sidebar.
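As a rough illustration of these controls, the sketch below describes a single dropdown field as a Python dictionary and checks that its default is one of its options. Apart from hidden and can_export, every key name here (label, type, rir_field_names, default_value, options, constraints) and every value is an assumption chosen for the example; consult the schema reference for the exact names and structure.

```python
import json

# Hypothetical dropdown (enum) field. Key names and values are illustrative only.
order_type_field = {
    "id": "order_type",
    "label": "Order type",              # label shown in the sidebar
    "type": "enum",                     # field type
    # AI fields used as the data source, tried in order as fallbacks (placeholders).
    "rir_field_names": ["ai_field_primary", "ai_field_fallback"],
    "default_value": "standard",        # used when nothing is captured
    "options": [                        # choices offered in the dropdown
        {"value": "standard", "label": "Standard order"},
        {"value": "rush", "label": "Rush order"},
    ],
    "constraints": {"required": True},
    "hidden": False,
    "can_export": True,
}


def default_is_valid(field):
    """Return True if an enum field's default is unset or one of its options."""
    if field.get("type") != "enum":
        return True
    allowed = {option["value"] for option in field.get("options", [])}
    default = field.get("default_value")
    return default is None or default in allowed


# Serialize to JSON text, the form in which the schema is edited.
field_json = json.dumps(order_type_field, indent=2)
```

A check like default_is_valid is handy before saving a hand-edited schema, since a default that is missing from the option list is an easy mistake to make in long dropdowns.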
The easiest way to work with the schema in a JSON format (a code-like text file) is using Rossum's in-app raw schema editor. Follow the visual guide to understand not just the editor, but also the basics of the schema structure.
Sometimes it is useful to manage the schema via the API using the rossumctl tool. This tool lets you script schema updates and also edit the schema as a specially formatted XLSX spreadsheet. We have prepared a detailed tutorial on XLSX schema configuration for you. The most powerful capability of the XLSX format is editing long dropdown lists easily.