...

If your TextOCR or FullPageOCR configuration files are lost or corrupted, you can construct new ones using the settings directly below. Additionally, Square 9 offers an "Aggressive OCR" set of parameters in the second code box below. These settings increase processing time but also increase OCR accuracy. The third code box includes a "High Performance" set of parameters that is an excellent compromise between speed and accuracy, at the expense of conversion in color. This set also excels at reading sub-optimal text, so if your text comes from a low-quality source (such as an old dot matrix printer), these settings may be a good option. The second and third configurations should not be used on burdened or slower servers. Any modification made to one configuration file should be made to all configuration files to maintain parity and consistency.
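One way to check that parity is to compare the configuration files key by key. The following is a minimal Python sketch, not a Square 9 tool. It assumes the files can be read as plain INI text, and the helper names `load_profile` and `profile_diff` are hypothetical. The sample strings stand in for the contents of two configuration files.

```python
import configparser


def load_profile(text):
    """Parse INI-style profile text into {section: {key: value}}."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key names case-sensitive
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}


def profile_diff(a, b):
    """Return (section, key, value_in_a, value_in_b) for every mismatch."""
    diffs = []
    for section in sorted(set(a) | set(b)):
        keys_a = a.get(section, {})
        keys_b = b.get(section, {})
        for key in sorted(set(keys_a) | set(keys_b)):
            value_a, value_b = keys_a.get(key), keys_b.get(key)
            if value_a != value_b:
                diffs.append((section, key, value_a, value_b))
    return diffs


# Sample contents standing in for two configuration files (illustrative only).
text_ocr = "[RecognizerParams]\nFastMode=true\n"
full_page_ocr = "[RecognizerParams]\nFastMode = false\nLowResolutionMode = true\n"

for section, key, left, right in profile_diff(load_profile(text_ocr),
                                              load_profile(full_page_ocr)):
    print(f"[{section}] {key}: {left!r} vs {right!r}")
```

A value of `None` in the output means the key is missing from that file. An empty result means the two files are in parity.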

...

[PDFExportParams]
PDFAComplianceMode=PCM_Pdfa_1b
TextExportMode=PEM_ImageOnText
Colority=PCM_KeepColority

[PagePreprocessingParams]
CorrectOrientation = true

[PrepareImageMode]
CorrectSkew = false

[PageAnalysisParams]
ProhibitModelAnalysis=true

[ObjectsExtractionParams]
FastObjectsExtraction=true

[RecognizerParams]
FastMode=true

[DocumentStructureDetectionParams]
ClassifySeparators=false
DetectFootnotes=false
DetectTableOfContents=false

...

Aggressive OCR Settings (Square 9 Tested)

[PDFExportParams]
PDFAComplianceMode=PCM_Pdfa_1b
TextExportMode=PEM_ImageOnText
Colority=PCM_KeepColority

[PagePreprocessingParams]
CorrectOrientation = true

[PrepareImageMode]
CorrectSkew = true

[PageAnalysisParams]
ProhibitModelAnalysis=false
EnableTextExtractionMode=true

[ObjectsExtractionParams]
FastObjectsExtraction=false
EnableAggressiveTextExtraction=true

[RecognizerParams]
FastMode=false

[DocumentStructureDetectionParams]
ClassifySeparators=false
DetectFootnotes=false
DetectTableOfContents=false

...

High Performance OCR Settings (Square 9 Tested)

[PDFExportParams]
PDFAComplianceMode=PCM_Pdfa_1b
TextExportMode=PEM_ImageOnText
Colority=PCM_KeepColority

[PagePreprocessingParams]
CorrectOrientation = true

[PrepareImageMode]
CorrectSkew = true
DiscardColorImage = true
PhotoProcessingMode = PPM_TreatAsPhoto
ImageCompression = IC_NoCompression

[PageAnalysisParams]
ProhibitModelAnalysis=false
EnableTextExtractionMode=false
DetectTables=false

[ObjectsExtractionParams]
FastObjectsExtraction=false
DetectTextOnPictures=true
EnableAggressiveTextExtraction=true
DetectPorousText = true
RemoveGarbage = true
RemoveTexture = true

[RecognizerParams]
FastMode=false
LowResolutionMode = true
TextTypes = 487

[DocumentStructureDetectionParams]
ClassifySeparators=false
DetectFootnotes=false
DetectTableOfContents=false
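
The value `TextTypes = 487` in the block above looks like a combination of flags. Assuming it is a bitwise OR of individual text-type flags (an assumption; this page does not say so), the short Python sketch below lists which bit values are set. The helper name `set_bits` is hypothetical, and the meaning of each bit should be confirmed against the OCR engine's own documentation.

```python
def set_bits(value):
    """Return the individual powers of two that add up to value."""
    return [1 << i for i in range(value.bit_length()) if value & (1 << i)]


# Assumption: TextTypes is a bitmask. 487 then decomposes as follows.
print(set_bits(487))  # [1, 2, 4, 32, 64, 128, 256]
```

Under that assumption, adding or removing a text type means adding or subtracting the corresponding flag value from 487.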

...