PDF Separate Node
This node is available for GlobalCapture workflows only and must be downloaded from the Square 9 SDN before it can be used.
The PDF Separate node is used to separate a PDF file into single pages for further processing.
When GlobalCapture needs to perform page-level operations, multi-page documents are always split into single-page files. This allows multi-threaded, page-level processing, which improves overall performance and throughput, among other benefits. Using this node, the workflow designer can control when separation happens and use specific functions to improve handling of certain types of problem PDF files.
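As a rough illustration of why single-page files help throughput, the sketch below fans a page-level operation out across worker threads. It is not GlobalCapture's implementation; `process_page`, the page file names, and the worker count are hypothetical stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor


def process_page(page_name: str) -> str:
    """Hypothetical page-level operation (for example clean-up or OCR)."""
    return f"{page_name}: processed"


def process_document(page_names: list[str], workers: int = 4) -> list[str]:
    """Run process_page on every page concurrently, keeping page order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order, even if pages finish out of order.
        return list(pool.map(process_page, page_names))


# Assumed file names for a three-page document that has already been separated.
pages = [f"invoice_page_{n}.pdf" for n in range(1, 4)]
results = process_document(pages)
```

Because each page is an independent file, no worker has to wait on another page of the same document.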
Node Properties
Title
The Title of your node should be brief but descriptive about what is being separated. Titles are important when revisiting workflows in the future and when migrating workflows. The title of the node will be displayed when resolving conflicts during imports.
Description
The Description of your node should provide notes about this node. This could include information about its intended use, the documents it is separating, etc. Descriptions can be very useful when revisiting workflows in the future.
Remove Unused Resources
Remove Unused Resources is enabled by default, and leaving it enabled is recommended. It can significantly reduce the size of each separated document, which in turn reduces processing time and storage.
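To see why this setting matters, consider a toy model in which the source PDF keeps one shared pool of resources (fonts, images) for all pages. This is a common layout but an assumption here, and the resource names and byte sizes are invented for illustration. It does not describe how the node works internally.

```python
def split_pages(shared_resources: dict[str, int],
                page_usage: list[set[str]],
                remove_unused: bool = True) -> list[int]:
    """Return the resource bytes carried by each single-page file.

    shared_resources maps a resource name to its size in bytes.
    page_usage lists, per page, the resource names that page actually uses.
    """
    sizes = []
    for used in page_usage:
        if remove_unused:
            # Keep only the resources this page references.
            kept = {n: s for n, s in shared_resources.items() if n in used}
        else:
            # Naive split: every page file carries the full shared pool.
            kept = dict(shared_resources)
        sizes.append(sum(kept.values()))
    return sizes


# Hypothetical two-page scan: one shared font, one image per page.
resources = {"font_a": 40_000, "scan_1": 900_000, "scan_2": 850_000}
usage = [{"font_a", "scan_1"}, {"font_a", "scan_2"}]

pruned = split_pages(resources, usage, remove_unused=True)
naive = split_pages(resources, usage, remove_unused=False)
```

In this model each pruned page carries only its own image and the font, while each naive page carries both images, so the gap grows with the number of pages in the source file.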
Data Validation
Data Validation checks to ensure data being added to an index field conforms to the field’s properties such as length and data type. When enabled, a process error will occur if there is a mismatch. If disabled, data will populate the field even if there is a mismatch.
If you disable Data Validation, include a validation step in the workflow, because documents with invalid data formats will error on release to GlobalSearch.
Example: a numeric field may contain only numbers, with no letters or symbols.
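The kind of check described here can be sketched as follows. This is a minimal illustration only: `FieldSpec`, its two data types, and the `Invoice Number` field are hypothetical and do not reflect the actual GlobalCapture or GlobalSearch field model.

```python
from dataclasses import dataclass


@dataclass
class FieldSpec:
    """Hypothetical index field definition: a data type and a maximum length."""
    name: str
    data_type: str  # "numeric" or "character" in this sketch
    max_length: int


def validate(spec: FieldSpec, value: str) -> list[str]:
    """Return mismatch messages; an empty list means the value conforms."""
    problems = []
    if len(value) > spec.max_length:
        problems.append(f"{spec.name}: longer than {spec.max_length} characters")
    # Digits only for numeric fields: no letters or symbols.
    if spec.data_type == "numeric" and not value.isdigit():
        problems.append(f"{spec.name}: not numeric")
    return problems


invoice_number = FieldSpec(name="Invoice Number", data_type="numeric", max_length=10)
ok = validate(invoice_number, "100482")
bad = validate(invoice_number, "INV-100482")
```

With validation enabled, a non-empty result like `bad` corresponds to a process error. With it disabled, the value would pass through and could fail later on release.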
Important Notes
This node is generally not necessary unless specific PDF files being processed prove problematic. If PDFs are having output or processing problems, using this node will usually correct them. Insert the node into a workflow before any processing steps that might trigger separation. These include image clean-up, PDF conversion, and validation. The safest place for the node is immediately after the Import node.
Use Cases
Using the PDF Separate node to separate single page invoices scanned in bulk
In this example, I've configured the PDF Separate node to separate all of the pages scanned in a single document into individual pages. This is useful when bulk scanning documents that are always a single page, without having to use the Separate node.
Date | Version | Description |
---|---|---|
1/31/2022 | 1.0 | Initial Release |