...
If your TextOCR or FullPageOCR configuration files are lost or corrupted, you can construct a new one using the settings directly below. Additionally, Square 9 offers a higher-performance set of OCR parameters in the second code box below, which increases OCR accuracy at the cost of longer processing time. The third code box includes a "God Mode" setting that is an excellent compromise between speed and accuracy, at the expense of conversion in color. God Mode also excels at reading sub-optimal text, so if your text comes from a low-quality source (such as an old dot matrix printer), God Mode may be a good option. The second and third configurations should not be used on burdened or slower servers. Any modification made to one configuration file should be made to all configuration files to maintain parity and consistency.
...
Code Block | ||
---|---|---|
| ||
[PDFExportParams]
PDFAComplianceMode=PCM_Pdfa_1b
TextExportMode=PEM_ImageOnText
Colority=PCM_KeepColority
[PagePreprocessingParams]
CorrectOrientation = true
[PrepareImageMode]
CorrectSkew = true
[PageAnalysisParams]
ProhibitModelAnalysis=false
EnableTextExtractionMode=true
[ObjectsExtractionParams]
FastObjectsExtraction=false
EnableAggressiveTextExtraction=true
[RecognizerParams]
FastMode=false
[DocumentStructureDetectionParams]
ClassifySeparators=false
DetectFootnotes=false
DetectTableOfContents=false |
Square 9 "God Mode" OCR Settings (Square 9 Tested)
Code Block | ||
---|---|---|
| ||
[PDFExportParams]
PDFAComplianceMode=PCM_Pdfa_1b
TextExportMode=PEM_ImageOnText
Colority=PCM_KeepColority
[PagePreprocessingParams]
CorrectOrientation = true
[PrepareImageMode]
CorrectSkew = true
DiscardColorImage = true
PhotoProcessingMode = PPM_TreatAsPhoto
ImageCompression = IC_NoCompression
[PageAnalysisParams]
ProhibitModelAnalysis=false
EnableTextExtractionMode=false
DetectTables=false
[ObjectsExtractionParams]
FastObjectsExtraction=false
DetectTextOnPictures=true
EnableAggressiveTextExtraction=true
DetectPorousText = true
RemoveGarbage = true
RemoveTexture = true
[RecognizerParams]
FastMode=false
LowResolutionMode = true
TextTypes = 487
[DocumentStructureDetectionParams]
ClassifySeparators=false
DetectFootnotes=false
DetectTableOfContents=false |
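Because a change to one configuration file should be mirrored in the others, a small script can help confirm that two files still match. The following is a minimal sketch rather than a Square 9 tool: it assumes the files are plain INI-style text like the settings above, and the helper names `load_settings` and `diff_settings` and the sample strings are hypothetical. It uses only Python's standard `configparser` module.

```python
import configparser


def load_settings(text):
    """Parse INI-style profile text into a {(section, key): value} dict."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case exactly as written in the file
    parser.read_string(text)
    return {
        (section, key): value.strip()
        for section in parser.sections()
        for key, value in parser.items(section)
    }


def diff_settings(a, b):
    """Return sorted (section, key, value_a, value_b) tuples where a and b differ."""
    return [
        (section, key, a.get((section, key)), b.get((section, key)))
        for section, key in sorted(set(a) | set(b))
        if a.get((section, key)) != b.get((section, key))
    ]


# Hypothetical sample contents; in practice, read the text of each
# configuration file from disk instead.
text_ocr = """
[RecognizerParams]
FastMode=false
[PrepareImageMode]
CorrectSkew = true
"""
full_page_ocr = """
[RecognizerParams]
FastMode=true
[PrepareImageMode]
CorrectSkew=true
"""

differences = diff_settings(load_settings(text_ocr), load_settings(full_page_ocr))
for section, key, left, right in differences:
    print(f"[{section}] {key}: {left!r} vs {right!r}")
```

Spacing around `=` is ignored in the comparison, so `CorrectSkew = true` and `CorrectSkew=true` count as the same setting. A setting present in only one file is reported with `None` on the other side.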
...