...

Square 9 offers a number of traditional OCR options, but also has options that leverage tooling in the areas of AI and ML. While more modern extraction tooling can be very good at decreasing setup time, it’s often not a complete solution. Customers may need to blend modern and traditional approaches for extracting data to form a complete, all-encompassing data capture platform.

Square 9’s most recent offering in the AI extraction space, dubbed Form and Table Extraction (FTE), involves AI-assisted extraction models that are largely application/document/form independent. FTE works off of two core constructs: forms and tables.

FTE differs from other AI-driven extraction offerings from Square 9 like TransformAI (TAI). Most notably, it is not document specific and can operate on any document type. However, like TransformAI, successful extraction does depend on some rules. For TAI, those rules revolve around document characteristics that are common among invoices and receipts. For FTE, the rules revolve around data points being grouped into either key / value pairs or tables.

Key / Value Pairs

Keys and their associated values are a core construct of extraction with FTE. For the more technical audience, Key / Value pairs are a common type of data structure used in programming and scripts. In the context of a document, however, Key / Value pairs can take on new meaning.
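For readers less familiar with the programming sense of the term, the following is a minimal Python sketch of Key / Value pairs held in a dictionary. The field names, values, and the `get_value` helper are all invented for illustration; this is not the format FTE itself produces.

```python
# Hypothetical Key / Value pairs as they might be read from an invoice.
# Field names and values are invented for illustration only.
extracted = {
    "Invoice Number": "INV-1001",
    "Invoice Date": "2024-01-15",
    "Total Due": "125.00",
}


def get_value(pairs, key):
    """Look up a value by key, ignoring case and surrounding whitespace."""
    normalized = {k.strip().lower(): v for k, v in pairs.items()}
    return normalized.get(key.strip().lower())


print(get_value(extracted, "invoice number"))
```

The key is what gives each value its meaning: a lookup for a key that is not present returns nothing, even if a plausible value exists somewhere in the data.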

...

In a traditional extraction model, users are generally less concerned about keys and focus exclusively on values. It would be very simple to create an OCR template that extracted values for Customer Number, Invoice Number, Invoice Date, and Total Due. As your capture needs expand, however, this model becomes fragile. Variances in scan resolution might impact positioning, and similar documents produced by other vendors will almost certainly introduce differences in layout. Square 9’s GlobalCapture offers a number of tools to help with such discrepancies in a more traditional manner, such as Marker Zones and pattern matching. FTE takes a different approach.

...

While the OCR results are extremely good, success does require adherence to a pattern of some kind. In the case of FTE, each value is expected to have a descriptive key in its general vicinity. This does not mean keys and values need to be presented in a specific way visually, nor does it mean grid lines must be present in the document’s layout. It does mean, however, that for each value one cares to extract, there must in fact be a related key.

A case where this might be an issue is in the upper left corner of the W.B. Mason example above. In this case, both a phone number and a website address are present below the logo and address block.

...

In this example, there are two possible outcomes:

  1. The phone number, the website address, or both simply don’t extract.

  2. One or both values extract, but do so with a Key of “Address Service Requested”.

In either case, extracting one or both of these data points is likely better served with an alternate method. For example, a traditional zone extraction could be used to collect these data points. There are also other use case specific tools like TransformAI (which is tuned specifically for invoices and receipts) that look specifically for data points that might match phone numbers and web addresses regardless of the presence of a Key.
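As a rough illustration of finding values by their shape rather than by a nearby key, the Python sketch below scans plain text for strings that look like phone numbers and web addresses. It is only a sketch under stated assumptions: the patterns cover simple US-style phone numbers and a handful of top-level domains, the sample text is invented, and `find_unkeyed_values` is a hypothetical helper, not part of GlobalCapture or TransformAI.

```python
import re

# Assumption: simple US-style phone numbers such as 888-555-0100 or (888) 555-0100.
PHONE_RE = re.compile(r"\(?\d{3}\)?[-. ]?\d{3}[-. ]?\d{4}")

# Assumption: web addresses ending in .com, .net, or .org, with optional scheme and www.
URL_RE = re.compile(
    r"\b(?:https?://)?(?:www\.)?[a-z0-9-]+(?:\.[a-z0-9-]+)*\.(?:com|net|org)\b",
    re.IGNORECASE,
)


def find_unkeyed_values(text):
    """Return strings that look like phone numbers or web addresses, key or no key."""
    return {
        "phones": PHONE_RE.findall(text),
        "websites": URL_RE.findall(text),
    }


# Invented sample text standing in for OCR output from a page header.
sample = "Address Service Requested\n888-555-0100\nwww.example.com"
print(find_unkeyed_values(sample))
```

Because the match depends only on the shape of the value, it is unaffected by whichever label happens to sit nearby on the page.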

...

In addition to Key / Value extraction, FTE is also very good at identifying and extracting tables. Leveraging the power of AI, FTE can be used to identify and extract tabular structured data from a document page. Because it is not bound to a document type, FTE offers greater flexibility with tables and their associated values when compared to a feature like TAI.
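To make the idea of tabular structured data concrete, here is a small Python sketch that treats a table as a header row plus data rows and converts it into one record per line item. The column names, values, and the `rows_to_records` helper are assumptions made for illustration; they do not describe the output format of FTE.

```python
# Hypothetical table content: a header row and two line items (invented values).
header = ["Item", "Qty", "Unit Price"]
rows = [
    ["Paper", "2", "4.50"],
    ["Pens", "10", "0.75"],
]


def rows_to_records(header, rows):
    """Pair each cell with its column name, producing one dict per table row."""
    return [dict(zip(header, row)) for row in rows]


records = rows_to_records(header, rows)

# With named columns, downstream logic can work per line item.
total = sum(int(r["Qty"]) * float(r["Unit Price"]) for r in records)
print(records)
print(total)
```

In this shape the column headers play the same role for a table that keys play for individual values.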

...