...

  • TextOCR.cfg: C:\GetSmart\CaptureServices\GlobalCapture_1
  • FullPageOCR.cfg: C:\GetSmart\CaptureServices\GlobalCapture_1

...

Info
Please Be Advised

If the TextOCR.cfg files differ between the Template Designer and the capture engine, a template will read differently in the Template Designer than it will when the engine processes it.
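One way to catch this kind of mismatch is to diff the two copies of TextOCR.cfg before testing a template. The sketch below is illustrative only: the function names are hypothetical, it assumes both files are plain text readable with Python's default encoding, and it ignores blank lines and leading or trailing whitespace.

```python
import difflib
from pathlib import Path


def diff_config_text(designer_text: str, engine_text: str) -> list[str]:
    """Return unified-diff lines between two config contents (empty list if they match)."""
    # Normalize: drop blank lines and surrounding whitespace so only real changes show up.
    designer = [line.strip() for line in designer_text.splitlines() if line.strip()]
    engine = [line.strip() for line in engine_text.splitlines() if line.strip()]
    return list(
        difflib.unified_diff(designer, engine, "designer", "engine", lineterm="")
    )


def diff_config_files(designer_path: str, engine_path: str) -> list[str]:
    """Convenience wrapper (hypothetical helper) that reads both files from disk first."""
    return diff_config_text(
        Path(designer_path).read_text(), Path(engine_path).read_text()
    )
```

Calling `diff_config_files` with the Template Designer's copy and the capture engine's copy of TextOCR.cfg returns an empty list when the two files agree; any returned lines show which settings differ.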

...

 


Full Page OCR and Zonal Settings

...

  • FullPageOCR.cfg – TextPDF/Full Page OCR Configuration settings.
    • These settings are used when converting a document to a text searchable PDF or other electronic formats.
  • TextOCR.cfg
    • These settings are used when extracting data using Zonal OCR.
  • FullPageBaseSettings.cfg
    • These settings contain a profile of commonly used settings present in version 4.1, which are customized further by FullPageOCR.cfg.

...

Info
Please Be Advised

Changes to the default settings of these configuration files are not supported and are made at your own risk. Generally speaking, changes to these files should be made by, or on the advice of, a Square 9 technician.

...


Configuration Objects

Both files, TextOCR.cfg and FullPageOCR.cfg, have similar configurations when the GlobalSearch desktop client is installed. Each configuration file consists of one or more objects, and each object has a number of properties that can be defined. The following objects are found in both FullPageOCR.cfg and TextOCR.cfg:

  • PDFExportParams
  • PagePreprocessingParams
  • PageAnalysisParams
  • ObjectsExtractionParams
  • RecognizerParams
  • DocumentStructureDetectionParams

...


Info
Please Be Advised

These changes are global: modifying these settings affects all Zonal OCR and text PDF activities.

...


FullPageBaseSettings.cfg and ZonalBaseSettings.cfg Settings

...

DocumentStructureDetectionParams

...


Function | Description | Value
ClassifySeparators | Detects additional properties of separators, such as their type. GlobalSearch LAN does not need this information, so the value should be set to false. | Boolean
DetectFootnotes | Detects footnotes during document synthesis. GlobalSearch LAN does not require this, so for quicker extraction the value should be set to false. | Boolean
DetectTableOfContents | Detects tables of contents during document synthesis. GlobalSearch LAN does not require this, so for quicker extraction the value should be set to false. | Boolean


Info

The default values for these parameters are TRUE. Smart Search does not require these parameters, so for quicker extraction they should be set to FALSE.
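As an illustration of the change described above, the following sketch sets the three DocumentStructureDetectionParams properties to false in a config file's text. It is an assumption-laden example, not a supported procedure: the helper name is hypothetical, it assumes the INI-style layout used elsewhere on this page, and rewriting through `configparser` discards any comments and original spacing. Work on a backup copy, in keeping with the advisory about unsupported changes.

```python
import configparser
import io

SECTION = "DocumentStructureDetectionParams"
FLAGS = ("ClassifySeparators", "DetectFootnotes", "DetectTableOfContents")


def disable_structure_detection(text: str) -> str:
    """Return config text with the three detection flags set to false."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # preserve property-name casing
    parser.read_string(text)
    if not parser.has_section(SECTION):
        parser.add_section(SECTION)
    for flag in FLAGS:
        parser.set(SECTION, flag, "false")
    out = io.StringIO()
    # Write Property=Value without spaces around the delimiter.
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()
```

All other objects and properties are carried over unchanged in value; only the three flags are forced to false.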

...

Text Parameters

TextLanguage Value Table

...

Square 9 OCR Settings & Tested High Performance Settings

If your TextOCR or FullPageOCR configuration files are lost or corrupted, you can construct new ones using the settings directly below. Additionally, Square 9 offers a higher-performance set of OCR parameters in the second code box below. These settings increase OCR accuracy but also increase processing time, so they should not be used on burdened or slower servers. Any modification made to one configuration file should be made to all configuration files to maintain parity and consistency.

Original OCR Settings

[PDFExportParams]
PDFAComplianceMode=PCM_Pdfa_1b
TextExportMode=PEM_ImageOnText
Colority=PCM_KeepColority
[PagePreprocessingParams]
CorrectOrientation = true
[PrepareImageMode]
CorrectSkew = false
[PageAnalysisParams]
ProhibitModelAnalysis=true
[ObjectsExtractionParams]
FastObjectsExtraction=true
[RecognizerParams]
FastMode=true
[DocumentStructureDetectionParams]
ClassifySeparators=false
DetectFootnotes=false
DetectTableOfContents=false

...