Document import specification

Documents may be imported into Rossum using the REST API or email gateway. To ensure successful import and subsequent processing of documents in Rossum, files must meet certain criteria. These are divided into requirements (which a file must meet to be processed by Rossum at all) and recommendations (which, when followed, ensure the highest possible processing reliability).

Rossum extracts data from all pages of a document. This behavior can be limited so that data is extracted only from the first few pages. It is generally not necessary to remove additional pages of other types (for example, a purchase order appended after an invoice). Documents can be split manually in the UI or automatically using a special Separator page.

File Requirements

  • supported file formats are PDF, PNG, JPEG, TIFF, XLSX, and DOCX
  • the documents sent in one API import call may not exceed 40 MB in total
  • a single document may not exceed 13 MB
  • one page may contain only one document (e.g. two receipts on one page cannot be extracted separately)
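
The format and size requirements above can be checked on the client before an import call is made. The following is a minimal sketch, not part of the Rossum API: the function name `check_batch`, the mapping from file extensions to formats, and the reading of 1 MB as 1024 * 1024 bytes are all assumptions made for illustration.

```python
from pathlib import Path

# Assumption: the file extension is used as a stand-in for the real format.
SUPPORTED_EXTENSIONS = {
    ".pdf", ".png", ".jpg", ".jpeg", ".tif", ".tiff", ".xlsx", ".docx",
}

# Assumption: 1 MB is taken as 1024 * 1024 bytes.
MB = 1024 * 1024
MAX_SINGLE_FILE_BYTES = 13 * MB   # limit for a single document
MAX_BATCH_BYTES = 40 * MB         # limit for one import call


def check_batch(files):
    """Return a list of problems for a batch of (file_name, size_in_bytes).

    An empty list means the batch meets the format and size requirements.
    """
    problems = []
    total = 0
    for name, size in files:
        ext = Path(name).suffix.lower()
        if ext not in SUPPORTED_EXTENSIONS:
            problems.append(f"{name}: unsupported format {ext or '(none)'}")
        if size > MAX_SINGLE_FILE_BYTES:
            problems.append(f"{name}: larger than 13 MB")
        total += size
    if total > MAX_BATCH_BYTES:
        problems.append("batch: total size exceeds 40 MB")
    return problems
```

With these limits, four 12 MB files each pass the per-document check but fail together, so such a batch would have to be split across more than one import call. The requirement of one document per page cannot be checked this way and still needs a manual review.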

Document Recommendations

  • image resolution of scans and photos should be at least 150 DPI
  • font size on a document should be at least 6 pt
  • documents should be in A4 or Letter format (small documents such as receipts should be scanned on top of a blank A4 page)
  • scans should not have extremely large dimensions, ideally no more than 3000 pixels on each side
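
The resolution and dimension recommendations interact, which a short calculation makes visible. The sketch below is illustrative only: the helper names `pixels_for_page` and `scan_warnings` are hypothetical, and the pixel size is derived purely from the physical page size (A4 is 210 x 297 mm) and the scan resolution.

```python
MIN_DPI = 150        # recommended minimum resolution
MAX_SIDE_PX = 3000   # recommended maximum length of either side
MM_PER_INCH = 25.4


def pixels_for_page(width_mm, height_mm, dpi):
    """Pixel dimensions of a page of the given physical size scanned at dpi."""
    return (
        round(width_mm / MM_PER_INCH * dpi),
        round(height_mm / MM_PER_INCH * dpi),
    )


def scan_warnings(width_px, height_px, dpi):
    """Return the recommendations a scan does not follow (empty if none)."""
    warnings = []
    if dpi < MIN_DPI:
        warnings.append(f"resolution {dpi} DPI is below {MIN_DPI} DPI")
    if max(width_px, height_px) > MAX_SIDE_PX:
        warnings.append(f"longest side exceeds {MAX_SIDE_PX} pixels")
    return warnings


# An A4 page at the minimum recommended resolution.
a4_at_150 = pixels_for_page(210, 297, 150)   # (1240, 1754)
print(a4_at_150, scan_warnings(*a4_at_150, 150))
```

Under these assumptions, an A4 page scanned at 300 DPI comes out at about 2480 x 3508 pixels, which is above the suggested 3000 pixels per side. For A4, a resolution between 150 and roughly 250 DPI follows both recommendations.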