...
This object defines how PDFs are exported after undergoing TextPDF Creation.
Function | Description | Value |
---|---|---|
PDFAComplianceMode | PDFs will be exported adhering to the defined standard. | PCM_None, PCM_Pdfa_1b, PCM_Pdf_1b |
Colority | Defines if PDFs are exported as Color or Grayscale. | PCM_KeepColority, PCM_ForceToGrey |
TextExportMode | Defines how the recognized text and the page image are combined in the exported PDF. | PEM_ImageOnText, PEM_ImageOnly, PEM_TextOnly |
PagePreprocessingParams
Function | Description | Value |
---|---|---|
CorrectOrientation | Attempt to auto rotate the image. | Boolean |
PrepareImageMode
Function | Description | Value |
---|---|---|
Rotation | Specifies the rotation angle to apply to the image during preparation. | RT_NoRotation, RT_Clockwise, RT_Counterclockwise, RT_Upsidedown |
CorrectSkew | Tells the OCR engine to correct skew during image preparation. | Boolean |
CorrectSkewMode | Specifies the mode of skew correction. | Do Not Alter |
InvertImage | Tells the OCR engine to invert the colors of the prepared image. | Boolean |
MirrorImage | Tells Square 9’s OCR engine to mirror the prepared image around its vertical axis. | |
EnhanceLocalContrast | Specifies whether the local contrast of the image should be increased. | |
DiscardColorImage | Tells the OCR engine to leave only the black-and-white plane in the prepared image. | |
UseFastBinarization | The OCR engine will use algorithms for fast image binarization. | |
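As an illustration of how the image preparation settings above might be combined, the fragment below is a minimal sketch for a batch of scans that were all fed in sideways. The key names and the `RT_Clockwise` value come from the table; the `;` comment lines are an assumption about the parser and can be removed if it does not accept comments.

```ini
[PrepareImageMode]
; Rotate every page clockwise before recognition
Rotation=RT_Clockwise
; Straighten slightly tilted scans
CorrectSkew=true
; Leave the colors as scanned
InvertImage=false
```

A fixed Rotation value only makes sense when every page in the batch has the same orientation; for mixed batches, CorrectOrientation under PagePreprocessingParams is probably the better fit.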
PageAnalysisParams
Function | Description | Value |
---|---|---|
ProhibitModelAnalysis | When set to false, typical variants of page layout are evaluated during page analysis and the best variant is selected. Set to true to skip this step. | Boolean |
DetectPictures | Pictures are detected as part of analysis. | |
DetectSeparators | Separators are detected during analysis. | |
ObjectsExtractionParams
Function | Description | Value |
---|---|---|
FastObjectsExtraction | Extraction speed may increase but quality may deteriorate. | Boolean |
RemoveTexture | Background noise is removed from the image used for recognition. The original image is not altered. | |
RecognizerParams
Function | Description | Value |
---|---|---|
FastMode | Data will be extracted more rapidly at the cost of accuracy. | Boolean |
LowResolutionMode | This property is useful when recognizing faxes, small prints, images with low resolution or bad print quality. | |
BalancedMode | Data will be extracted more accurately but at the cost of speed. | |
OneLinePerBlock | The OCR engine will presume the text extracted contains no more than one line. | |
OneWordPerBlock | The OCR engine will presume the text extracted contains no more than one word. | |
CaseRecognitionMode | This value specifies the letter case during recognition. | |
TextTypes | The value of TextTypes defines the style of the text to be extracted. | See TextType Value table |
TextLanguages | Specifies one or more ABBYY recognition languages. Helpful for accented character recognition (e.g. TextLanguage=English,French). | See TextLanguage Value table |
Info |
---|
If neither FastMode nor BalancedMode is used, FullMode is used by default. Text is extracted with greater accuracy, but extraction may be significantly slower. |
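To make the mode selection concrete, here is a minimal sketch of a RecognizerParams section that opts into the balanced mode for low-quality input. The key names are taken from the table above; the `;` comment lines are an assumption about the parser and can be removed if it does not accept comments.

```ini
[RecognizerParams]
; Middle ground between FastMode and the default FullMode
BalancedMode=true
; Intended for faxes, small print, and low-resolution scans
LowResolutionMode=true
```

Leaving out both FastMode and BalancedMode falls back to FullMode, as the note above describes.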
...
DocumentStructureDetectionParams
Function | Description | Value |
---|---|---|
ClassifySeparators | Additional properties of separators, such as their type, are detected. GlobalSearch LAN does not need this information, so the value should be set to false. | Boolean |
DetectFootnotes | Footnotes are detected during document synthesis. GlobalSearch LAN does not require this; for quicker extraction, set this value to false. | |
DetectTableOfContents | The table of contents is detected during document synthesis. GlobalSearch LAN does not require this; for quicker extraction, set this value to false. | |
Info |
---|
The default values for these parameters are set to TRUE. Smart Search does not require these parameters, so for quicker extraction these values should be set to FALSE. |
...
Text Parameters
TextLanguage Value Table
Bulgarian | French | Portuguese (Brazilian) |
Chinese simplified | German | Russian |
Chinese traditional | Greek | Slovak |
Czech | Hungarian | Spanish |
Danish | Italian | Swedish |
Dutch | Japanese | Turkish |
English | Korean | Ukrainian |
Estonian | Polish | Vietnamese |
Text Type | Description | Value |
---|---|---|
TT_Normal | Common typographic texts. | 1 |
TT_Typewriter | Tells the OCR engine to presume the text was generated on a typewriter. | 2 |
TT_Matrix | This value tells the OCR engine to presume the text was generated on a Matrix Printer. | 4 |
TT_Index | Corresponds to a special set of characters including only digits written in ZIPCode style. | 8 |
TT_OCR_A | A special font designed for Optical Character Recognition. Largely used by banks, credit card companies, and other financial institutions. | 32 |
TT_OCR_B | This value corresponds to a special font designed for Optical Character Recognition. | 64 |
TT_MICR_E13B | This value corresponds to a special MICR font (E-13B). | 128 |
TT_MICR_CMC7 | This value tells the OCR engine to assume that it is reading a special MICR font (CMC-7). | 256 |
TT_Gothic | This value tells the OCR engine to presume the text is printed in Gothic type. Only the “Fraktur” font is supported. | 512 |
TT_Receipt | This value corresponds to a special font commonly used in thermal printed receipts. | 1024 |
TextType Value Table
- You can select multiple text types by adding the values together.
- Both TextLanguage and TextType are added to the RecognizerParams section.
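As a worked example of adding text type values together: TT_Normal (1) + TT_Typewriter (2) + TT_Receipt (1024) = 1027. The fragment below is a sketch only. It assumes the keys are spelled TextTypes and TextLanguages as in the RecognizerParams table (the inline example in that table writes TextLanguage, so check which spelling your installation expects), that the combined value is written as a plain number, and that `;` comment lines are accepted by the parser.

```ini
[RecognizerParams]
; 1 (TT_Normal) + 2 (TT_Typewriter) + 1024 (TT_Receipt) = 1027
TextTypes=1027
; Comma-separated names from the TextLanguage Value Table
TextLanguages=English,French
```

Because each text type value is a distinct power of two, any sum maps back to exactly one combination of types.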
...
[PDFExportParams]
PDFAComplianceMode=PCM_Pdfa_1b
TextExportMode=PEM_ImageOnText
Colority=PCM_KeepColority

[PagePreprocessingParams]
CorrectOrientation = true

[PrepareImageMode]
CorrectSkew = false

[PageAnalysisParams]
ProhibitModelAnalysis=true

[ObjectsExtractionParams]
FastObjectsExtraction=true

[RecognizerParams]
FastMode=true

[DocumentStructureDetectionParams]
ClassifySeparators=false
DetectFootnotes=false
DetectTableOfContents=false
...
[PDFExportParams]
PDFAComplianceMode=PCM_Pdfa_1b
TextExportMode=PEM_ImageOnText
Colority=PCM_KeepColority

[PagePreprocessingParams]
CorrectOrientation = false

[PrepareImageMode]
CorrectSkew = true

[PageAnalysisParams]
ProhibitModelAnalysis=false
EnableTextExtractionMode=true

[ObjectsExtractionParams]
FastObjectsExtraction=false
EnableAggressiveTextExtraction=true

[RecognizerParams]
FastMode=false

[DocumentStructureDetectionParams]
ClassifySeparators=false
DetectFootnotes=false
DetectTableOfContents=false
...