
...

This object defines how PDFs are exported after undergoing TextPDF Creation.

Function	Description	Value
PDFAComplianceMode	PDFs will be exported adhering to the defined standard.	PCM_None, PCM_Pdfa_1b, PCM_Pdf_1b
Colority	Defines if PDFs are exported as Color or Grayscale.	PCM_KeepColority, PCM_ForceToGrey
TextExportMode	Defines how the recognized text and the page image are combined in the exported PDF.	PEM_ImageOnText, PEM_ImageOnly, PEM_TextOnly

PagePreprocessingParams

Function	Description	Value
CorrectOrientation	Attempt to auto rotate the image.	Boolean

PrepareImageMode

Function	Description	Value
Rotation	Specifies the rotation angle to apply to the image during preparation.	RT_NoRotation, RT_Clockwise, RT_Counterclockwise, RT_Upsidedown
CorrectSkew	Tells the OCR engine to correct skew during image preparation.	Boolean
CorrectSkewMode	Specifies the mode of skew correction.	Do Not Alter
InvertImage	Tells the OCR engine to invert the colors of the prepared image.	Boolean
MirrorImage	Tells Square 9’s OCR engine to mirror the prepared image around its vertical axis.
EnhanceLocalContrast	Specifies whether the local contrast of the image should be increased.
DiscardColorImage	Tells the OCR engine to leave only the black-and-white plane in the prepared image.
UseFastBinarization	Tells the OCR engine to use fast image binarization algorithms.

PageAnalysisParams

Function	Description	Value
ProhibitModelAnalysis	When true, model analysis is disabled. When model analysis is allowed, typical page layout variants are tried during page analysis and the best variant is selected.	Boolean
DetectPictures	Pictures are detected as part of analysis.
DetectSeparators	Separators are detected during analysis.

ObjectsExtractionParams

Function	Description	Value
FastObjectsExtraction	Extraction speed may increase but quality may deteriorate.	Boolean
RemoveTexture	Background noise is removed from the image used for recognition. The original image is not altered.	Boolean

RecognizerParams

Function	Description	Value
FastMode	Data will be extracted more rapidly at the cost of accuracy.	Boolean
LowResolutionMode	This property is useful when recognizing faxes, small prints, images with low resolution or bad print quality.
BalancedMode	Data will be extracted more accurately but at the cost of speed.
OneLinePerBlock	The OCR engine will presume the text extracted contains no more than one string.
OneWordPerBlock	The OCR engine will presume the text extracted contains no more than one word.
CaseRecognitionMode	Specifies the letter case to assume during recognition.
TextTypes	The value of TextTypes defines the style of the text to be extracted.	See TextType Value table
TextLanguages	Specifies one or more recognition languages for the ABBYY engine. Helpful for accented character recognition (e.g. TextLanguage=English,French).	See TextLanguage Value table

Info
If neither FastMode nor BalancedMode is used, FullMode is used by default. Text is extracted with greater accuracy, but extraction may be significantly slower.
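
The mode selection described in this note can be sketched in a few lines of Python. This is only an illustration, not the product's implementation: the `effective_mode` helper is hypothetical, and because the page does not say what happens when both flags are enabled, the sketch treats that case as an error rather than guessing.

```python
def effective_mode(recognizer_params: dict) -> str:
    """Return the recognition mode implied by a RecognizerParams-style mapping."""
    def is_true(key: str) -> bool:
        # Missing keys are treated as false (an assumption of this sketch).
        return str(recognizer_params.get(key, "false")).strip().lower() == "true"

    fast, balanced = is_true("FastMode"), is_true("BalancedMode")
    if fast and balanced:
        # Assumption: the source does not say which flag wins, so refuse to guess.
        raise ValueError("FastMode and BalancedMode are both set")
    if fast:
        return "FastMode"
    if balanced:
        return "BalancedMode"
    # Neither flag is set, so the default applies.
    return "FullMode"


print(effective_mode({"FastMode": "true"}))  # FastMode
print(effective_mode({}))                    # FullMode
```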

...

DocumentStructureDetectionParams

Function	Description	Value
ClassifySeparators	Additional properties of separators, such as their type, are detected. GlobalSearch LAN does not need this information, so the value should be set to false.	Boolean
DetectFootnotes	Footnotes are detected during document synthesis. GlobalSearch LAN does not require this; for quicker extraction, set this value to false.
DetectTableOfContents	Tables of contents are detected during document synthesis. GlobalSearch LAN does not require this; for quicker extraction, set this value to false.

Info
The default value of each of these parameters is TRUE. Smart Search does not require these parameters; for quicker extraction, set them to FALSE.

...

Text Parameters

TextLanguage Value Table

Bulgarian	French	Portuguese (Brazilian)
Chinese simplified	German	Russian
Chinese traditional	Greek	Slovak
Czech	Hungarian	Spanish
Danish	Italian	Swedish
Dutch	Japanese	Turkish
English	Korean	Ukrainian
Estonian	Polish	Vietnamese

Text Type	Description	Value
TT_Normal	Common typographic texts.	1
TT_Typewriter	Tells the OCR engine to presume the text was produced on a typewriter.	2
TT_Matrix	Tells the OCR engine to presume the text was produced on a dot-matrix printer.	4
TT_Index	Corresponds to a special set of characters including only digits written in ZIP code style.	8
TT_OCR_A	A special font designed for Optical Character Recognition. Largely used by banks, credit card companies, and other financial institutions.	32
TT_OCR_B	Corresponds to a special font designed for Optical Character Recognition.	64
TT_MICR_E138	Corresponds to a special MICR font (E-13B).	128
TT_MICR_CMC7	Tells the OCR engine to presume it is reading a special MICR font (CMC-7).	256
TT_Gothic	Tells the OCR engine to presume the text is printed in Gothic type. Only the “Fraktur” font is supported.	512
TT_Receipt	Corresponds to a special font commonly used in thermal-printed receipts.	1024

TextType Value Table

You can select multiple text types by adding the values together.
Both TextLanguage and TextType are added to the RecognizerParams section.
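
Because every text type value in the table is a distinct power of two, adding values together is equivalent to combining bit flags. The following Python sketch shows the arithmetic under that reading. The numeric values are copied from the table; the `TextType` class and `combined_value` helper are illustrative names and not part of the product.

```python
from enum import IntFlag


class TextType(IntFlag):
    # Values copied from the Text Type table; the class itself is hypothetical.
    TT_Normal = 1
    TT_Typewriter = 2
    TT_Matrix = 4
    TT_Index = 8
    TT_OCR_A = 32
    TT_OCR_B = 64
    TT_MICR_E138 = 128
    TT_MICR_CMC7 = 256
    TT_Gothic = 512
    TT_Receipt = 1024


def combined_value(*types: TextType) -> int:
    """Return the single number that represents all the given text types."""
    total = 0
    for text_type in types:
        # Bitwise OR equals addition here and stays correct if a type is repeated.
        total |= int(text_type)
    return total


print(combined_value(TextType.TT_Normal, TextType.TT_Typewriter))                 # 3
print(combined_value(TextType.TT_Normal, TextType.TT_Matrix, TextType.TT_Receipt))  # 1029
```

For example, normal plus typewriter text gives 1 + 2 = 3, which is the number that would be supplied as the text type value.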

...

[PDFExportParams]
PDFAComplianceMode=PCM_Pdfa_1b
TextExportMode=PEM_ImageOnText
Colority=PCM_KeepColority

[PagePreprocessingParams]
CorrectOrientation=true

[PrepareImageMode]
CorrectSkew=false

[PageAnalysisParams]
ProhibitModelAnalysis=true

[ObjectsExtractionParams]
FastObjectsExtraction=true

[RecognizerParams]
FastMode=true

[DocumentStructureDetectionParams]
ClassifySeparators=false
DetectFootnotes=false
DetectTableOfContents=false
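
A settings block like the one above follows INI-style `[Section]` and `Key=Value` conventions, so it can be inspected with Python's standard `configparser` module. This is a sketch for checking a file before deploying it, assuming the format really is plain INI; how the product itself parses the file is not stated on this page. The `SAMPLE` text is an excerpt of the block above and `load_params` is a hypothetical helper.

```python
import configparser

# Excerpt of the sample configuration shown above.
SAMPLE = """\
[PDFExportParams]
PDFAComplianceMode=PCM_Pdfa_1b
TextExportMode=PEM_ImageOnText

[PagePreprocessingParams]
CorrectOrientation = true

[RecognizerParams]
FastMode=true

[DocumentStructureDetectionParams]
ClassifySeparators=false
DetectFootnotes=false
DetectTableOfContents=false
"""


def load_params(text: str) -> configparser.ConfigParser:
    """Parse INI-style text while preserving the case of parameter names."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # configparser lowercases keys unless told otherwise
    parser.read_string(text)
    return parser


params = load_params(SAMPLE)
print(params.sections())
print(params.get("PDFExportParams", "TextExportMode"))
print(params.getboolean("RecognizerParams", "FastMode"))

# Confirm the structure-detection options are all switched off.
structure = params["DocumentStructureDetectionParams"]
print(all(not structure.getboolean(key) for key in structure))
```

`configparser` accepts both `Key=Value` and `Key = Value`, so the mixed spacing in the sample does not affect the parsed result.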

...

[PDFExportParams]
PDFAComplianceMode=PCM_Pdfa_1b
TextExportMode=PEM_ImageOnText
Colority=PCM_KeepColority

[PagePreprocessingParams]
CorrectOrientation=false

[PrepareImageMode]
CorrectSkew=true

[PageAnalysisParams]
ProhibitModelAnalysis=false
EnableTextExtractionMode=true

[ObjectsExtractionParams]
FastObjectsExtraction=false
EnableAggressiveTextExtraction=true

[RecognizerParams]
FastMode=false

[DocumentStructureDetectionParams]
ClassifySeparators=false
DetectFootnotes=false
DetectTableOfContents=false
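
When two settings blocks differ in only a handful of keys, a small comparison script makes the differences easier to see. The sketch below assumes both blocks are plain INI text and uses Python's `configparser`; the `read_settings` and `diff_settings` helpers are hypothetical, and the two strings are shortened excerpts of the sample configurations on this page.

```python
import configparser

FIRST = """\
[PrepareImageMode]
CorrectSkew = false

[ObjectsExtractionParams]
FastObjectsExtraction=true

[RecognizerParams]
FastMode=true
"""

SECOND = """\
[PrepareImageMode]
CorrectSkew = true

[ObjectsExtractionParams]
FastObjectsExtraction=false
EnableAggressiveTextExtraction=true

[RecognizerParams]
FastMode=false
"""


def read_settings(text: str) -> dict:
    """Flatten INI text into {(section, key): value}."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep parameter names case-sensitive
    parser.read_string(text)
    return {
        (section, key): value
        for section in parser.sections()
        for key, value in parser[section].items()
    }


def diff_settings(old_text: str, new_text: str) -> dict:
    """Return {(section, key): (old, new)} for every setting that differs."""
    old, new = read_settings(old_text), read_settings(new_text)
    return {
        key: (old.get(key), new.get(key))
        for key in sorted(old.keys() | new.keys())
        if old.get(key) != new.get(key)
    }


for (section, key), (before, after) in diff_settings(FIRST, SECOND).items():
    print(f"[{section}] {key}: {before} -> {after}")
```

A value of `None` on either side means the key is present in only one of the two blocks.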

...
