...
- TextOCR.cfg: C:\GetSmart\CaptureServices\GlobalCapture_1
- FullPageOCR.cfg:\GetSmart\CaptureServices\GlobalCapture_1
...
Info | ||
---|---|---|
| ||
If the TextOCR.cfg files differ between the Template Designer and the capture engine, the Template Designer will read zones differently from how the engine reads them during processing. |
...
|
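One way to check that two copies of TextOCR.cfg match is to parse both and list the properties that differ. The following is a minimal sketch, assuming the files use an INI-style layout of `[Object]` headers followed by `Property=value` lines; the function names `load_cfg` and `diff_cfg` are hypothetical and are not part of any Square 9 tooling.

```python
import configparser


def load_cfg(text: str) -> dict:
    """Parse cfg text into {object: {property: value}}, keeping property case."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # do not lowercase property names
    parser.read_string(text)
    return {name: dict(parser[name]) for name in parser.sections()}


def diff_cfg(designer_text: str, engine_text: str) -> list:
    """Return (object, property, designer_value, engine_value) per mismatch."""
    designer, engine = load_cfg(designer_text), load_cfg(engine_text)
    mismatches = []
    for section in sorted(set(designer) | set(engine)):
        left = designer.get(section, {})
        right = engine.get(section, {})
        for key in sorted(set(left) | set(right)):
            # A property missing on one side shows up as None on that side.
            if left.get(key) != right.get(key):
                mismatches.append((section, key, left.get(key), right.get(key)))
    return mismatches


if __name__ == "__main__":
    # Inline sample text; in practice the text would be read from each copy.
    designer = "[RecognizerParams]\nFastMode=true\n"
    engine = "[RecognizerParams]\nFastMode=false\n"
    for row in diff_cfg(designer, engine):
        print(row)
```

An empty result suggests the two copies define the same properties with the same values. Values are compared as raw text, so `true` and `True` would be reported as a difference.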
Full Page OCR and Zonal Settings
...
- FullPageOCR.cfg – TextPDF/Full Page OCR Configuration settings.
- These settings are used when converting a document to a text-searchable PDF or other electronic formats.
- TextOCR.cfg
- These settings are used when extracting data using Zonal OCR.
- FullPageBaseSettings.cfg
- These settings contain a profile of commonly used settings present in version 4.1, which are customized further by FullPageOCR.cfg.
...
Info | ||
---|---|---|
| ||
Changes to the default settings of these configuration files are not supported; modify them at your own risk. Generally, changes to these files are made by, or on the advice of, a Square 9 technician. |
...
Configuration Objects
Both files, TextOCR.cfg and FullPageOCR.cfg, have similar configurations when the GlobalSearch desktop client is installed. Each configuration file consists of one or more objects, and each object has a number of properties that can be defined. The following objects are found in the FullPageOCR.cfg and TextOCR.cfg files:
- PDFExportParams
- PagePreprocessingParams
- PageAnalysisParams
- ObjectsExtractionParams
- RecognizerParams
- DocumentStructureDetectionParams
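To see which of these objects a given file actually defines, the file can be parsed and compared against the list above. This is a sketch only, assuming the files use an INI-style layout of `[Object]` headers with `Property=value` lines; `EXPECTED_OBJECTS`, `list_objects`, and `missing_objects` are hypothetical names introduced for illustration.

```python
import configparser

# Object names taken from the list above.
EXPECTED_OBJECTS = [
    "PDFExportParams",
    "PagePreprocessingParams",
    "PageAnalysisParams",
    "ObjectsExtractionParams",
    "RecognizerParams",
    "DocumentStructureDetectionParams",
]


def list_objects(cfg_text: str) -> dict:
    """Return {object: {property: value}} for every object in the cfg text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep property names case-sensitive
    parser.read_string(cfg_text)
    return {name: dict(parser[name]) for name in parser.sections()}


def missing_objects(cfg_text: str) -> list:
    """Return the expected objects that the cfg text does not define."""
    found = list_objects(cfg_text)
    return [name for name in EXPECTED_OBJECTS if name not in found]


if __name__ == "__main__":
    sample = "[RecognizerParams]\nFastMode=true\n"
    print(list_objects(sample))
    print(missing_objects(sample))
```

A file may contain objects beyond the six listed here; this check only reports which of the listed ones are absent.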
...
Info | ||
---|---|---|
| ||
These changes are global: changing these settings affects all zonal OCR and text PDF activities. |
...
FullPageBaseSettings.cfg and ZonalBaseSettings.cfg Settings
...
DocumentStructureDetectionParams
...
Function | Description | Value |
---|---|---|
ClassifySeparators | Detects additional properties of separators, such as their type. GlobalSearch LAN does not need this information, so the value should be set to false. | Boolean |
DetectFootnotes | Detects footnotes during document synthesis. GlobalSearch LAN does not require this; for quicker extraction, set the value to false. | Boolean |
DetectTableOfContents | Detects the table of contents during document synthesis. GlobalSearch LAN does not require this; for quicker extraction, set the value to false. | Boolean |
Info |
---|
The default values for these parameters are set to TRUE. Smart Search does not require these parameters; for quicker extraction, set these values to FALSE. |
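The change described above can be sketched as a small script that sets the three structure detection properties to false. This is an illustration under stated assumptions: the files are treated as INI-style text, `disable_structure_detection` is a hypothetical helper name, and the standard library parser used here rewrites the whole file and drops any comments. It is best tried on a backup copy first.

```python
import configparser
import io

SECTION = "DocumentStructureDetectionParams"
STRUCTURE_FLAGS = ("ClassifySeparators", "DetectFootnotes", "DetectTableOfContents")


def disable_structure_detection(cfg_text: str) -> str:
    """Return cfg text with the three structure detection flags set to false."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep property names case-sensitive
    parser.read_string(cfg_text)
    if not parser.has_section(SECTION):
        parser.add_section(SECTION)
    for flag in STRUCTURE_FLAGS:
        parser.set(SECTION, flag, "false")
    out = io.StringIO()
    # Write "Property=value" without spaces around the delimiter.
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()


if __name__ == "__main__":
    before = "[DocumentStructureDetectionParams]\nClassifySeparators=true\n"
    print(disable_structure_detection(before))
```

Other objects and properties in the input are carried over unchanged, although spacing around `=` is normalized on output.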
...
Text Parameters
TextLanguage Value Table
...
Square 9 OCR Settings & Tested High Performance Settings
In the event your TextOCR or FullPageOCR configuration files are lost or corrupted, you can construct a new one using the settings directly below. Additionally, Square 9 offers a higher-performance set of OCR parameters in the second code box below. The higher-performance settings increase OCR accuracy but also increase processing time, so they should not be used on burdened or slower servers. Any modification made to one configuration file should be made to all configuration files to maintain parity and consistency.
Original OCR Settings
    [PDFExportParams]
    PDFAComplianceMode=PCM_Pdfa_1b
    TextExportMode=PEM_ImageOnText
    Colority=PCM_KeepColority

    [PagePreprocessingParams]
    CorrectOrientation = true

    [PrepareImageMode]
    CorrectSkew = false

    [PageAnalysisParams]
    ProhibitModelAnalysis=true

    [ObjectsExtractionParams]
    FastObjectsExtraction=true

    [RecognizerParams]
    FastMode=true

    [DocumentStructureDetectionParams]
    ClassifySeparators=false
    DetectFootnotes=false
    DetectTableOfContents=false
...