Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

Source documents can come in a variety of formats and levels of quality. KeyFree Indexing uses optical scanner recognition ( OCR ) technology to pull data from the document image. OCR technology is not 100% accurate and results are largely based on the quality of the scanned image. Fuzzy printing, wrinkles, or dirt may affect the results. In general, the default settings work well, but some images may benefit from customizing your OCR settings to match the document.

  1. To configure KeyFree settings, from the User Settings on the GlobalSearch toolbar, select the KeyFree tab. Configure your settings and click Save.
      Note that KeyFree Indexing settings effect affect only your local client workstation. Settings in the browser-based GlobalSearch and in the desktop client are separate and do not affect one another. You must configure the settings in both clients separately.

    Image Added

  2. To restore the KeyFree Indexing to the factory default settings, click Set Defaults and at the OCR Settings – Confirm Reset confirmation question, click Yes.

Basic KeyFree Settings

On the KeyFree tab, select Basic and choose from the following settings.

...

  • Move the settings slider OCR Mode to the left for faster OCR scanning and to the right for slower but more accurate OCR scanning.

  • To improve text recognitionentered into index fields, select the text letter case typically found in the document you wish to persist from the Case Recognition Mode drop-down list. Select from the following choices:

    • Auto Case – For text to retain the capitalization seen in the original document.

    • Small Case – Enter text in lowercase letters in an Index Field.

    • Capital Case – Enter text in uppercase letters in an Index Field.

  • From the Text Types drop-list, select the type of text that will be recognized for OCR. You can select more than one Text Type, but for best results keep the number of types selected to a minimum. Text types include:

    • Normal – The default selection for most serif or sans serif text from a modern printer.

    • Typewriter – For text from a typewriter.

    • Matrix – For text from a dot-matrix printer.

    • OCR_A – For text set in OCR A monospaced font designed for OCR.

    • OCR_B – For text set in OCR B monospaced font designed for OCR.

    • MICR_E13B – When indexing a check or other banking documents which uses this MICR (Magnetic Ink Character Recognition Code) font, select this option.

    • MICR_CMC7– When indexing a check or other banking documents that uses this MICR barcode font.

...

  • To improve character recognition, select from the following:
    • Remove Garbage – Removes excess dots that are smaller than a certain size from the image during objects extraction (despeckle).

    • Remove Texture – Temporarily remove the background noise during OCR which might interfere with text recognition.

    • Detect Matrix Printer – If the source document was produced on a dot-matrix printer, use this option to interpret the text more accurately.

    • Detect Porous Text – Detect regions of the document with “porous” text.

    • Detect Text On Pictures – If the document has text on an image or colored background, use this selection to increase the contrast between the image and the textallow for OCR on the image.

    • Enable Aggressive Text Extraction – Enables the OCR engine to attempt to extract as much text on the image as possible. This is useful when the image contains some low-quality text, (although it may still require manual correction).

    • Fast Objects Extraction – When speed is required more than a high level of OCR accuracy, select this setting.

    • Prohibit Color Image – For the OCR engine to skip text laid over an image or colored background and only scan the black-and-white text.


  • Specify how to reuse the text and image layers of the source PDF file by selecting from the PDF Layer Reuse Mode drop-down list. Do not use this setting if the source file contains only raster-based data, such as image-only PDF files.

    • Auto – Have the OCR engine use both text and image layers. This is useful in most cases.

    • Do Not Reuse – Do not reuse the text layer which exists in the PDF file.

    • Content Only – For the OCR engine use only text layers in the PDF file, if they exist.

...