Configuring Import Data And Docs With XML

Import Data and Docs can batch import multiple files, with the relevant Index Field data defined in either a CSV or an XML file. Sometimes it is more convenient to use XML files instead of CSV files. However, for GlobalCapture to recognize the data, the XML has to follow a specific schema.

Import Data and Docs GlobalCapture Import

A standard Import Data and Docs import uses a very simple XML format. GlobalCapture picks up the XML from a hotfolder and then looks for the physical files specified in it. Ensure that the App Pool user has access to the repository or directory where the files live. A new batch is created for the import, and the imported documents then follow the workflow process. This is the most common and easiest-to-configure method of importing.

<Import>
	<Archive>
		<Document>
			<DocFile FileLoc="C:\GetSmart\Import\Document1.txt"/>
			<Fields>
				<Field Name="Status" value="Processing" />
				<Field Name="Other Field" value="Value1" />
			</Fields>
		</Document>
	</Archive>
</Import>
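
For larger batches, a file in this layout can be generated by a script instead of being written by hand. The following is a minimal Python sketch using only the standard library. The function name `build_import_xml` and the sample path and field values are illustrative assumptions, not part of GlobalCapture.

```python
import xml.etree.ElementTree as ET


def build_import_xml(documents):
    """Build a standard Import Data and Docs XML string.

    `documents` is a list of (file_path, fields) pairs, where `fields`
    is a dict mapping index field names to values.
    """
    root = ET.Element("Import")
    archive = ET.SubElement(root, "Archive")
    for file_path, fields in documents:
        document = ET.SubElement(archive, "Document")
        ET.SubElement(document, "DocFile", {"FileLoc": file_path})
        fields_el = ET.SubElement(document, "Fields")
        for name, value in fields.items():
            # Attribute names are case-sensitive: "Name" and "value".
            ET.SubElement(fields_el, "Field", {"Name": name, "value": str(value)})
    return ET.tostring(root, encoding="unicode")


xml_text = build_import_xml([
    (r"C:\GetSmart\Import\Document1.txt",
     {"Status": "Processing", "Other Field": "Value1"}),
])
print(xml_text)
```

Building the tree with `xml.etree.ElementTree` rather than string concatenation means special characters in field values (such as `&` or quotes) are escaped automatically.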

Direct XML Import

XML can be imported directly by dropping the XML file into the Processing Folder in GetSmart. This performs a direct import, bypassing the Engine, so no workflow needs to be constructed for this type of import. It is a faster, more efficient method for importing bulk data, but no batch data is created, and errors will cause the entire XML to import incorrectly. This method is not recommended in most cases.

The XML has to be formatted slightly differently from the one used for a normal Import Data and Docs import through GlobalCapture.

Below is an example of such an XML file:

<Import>
	<Archive ConnectionID="1" Name="2">
		<Document pass="True">
			<DocFile FileLoc="C:\GetSmart\Processing\Test\test.pdf"/>
			<Fields>
				<Field Name="Vendor Name" pass="True" value="POP"/>
				<Field Name="PO Amount" pass="True" value="110.00"/>
				<Field Name="Vendor email" pass="True" value="pop@pop.com"/>
			</Fields>
		</Document>
		<Document pass="True">
			<DocFile FileLoc="C:\GetSmart\Processing\Test\test1.pdf"/>
			<Fields>
				<Field Name="Vendor Name" pass="True" value="Amazon"/>
				<Field Name="PO Amount" pass="True" value="10000.00"/>
				<Field Name="Vendor email" pass="True" value="cs@amazon.com"/>
			</Fields>
		</Document>
	</Archive>
</Import>
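
The differences from the standard format are the ConnectionID and Name attributes on the Archive tag and the pass="True" attribute on each Document and Field tag. As a rough sketch, an existing standard-format file could be converted with a few lines of Python. The helper name `to_direct_import` and its parameters are hypothetical, and the ID values shown are the placeholders from the example above.

```python
import xml.etree.ElementTree as ET


def to_direct_import(xml_text, connection_id, archive_id):
    """Return a copy of a standard import XML with direct-import attributes added."""
    root = ET.fromstring(xml_text)
    for archive in root.iter("Archive"):
        archive.set("ConnectionID", str(connection_id))  # Database ID
        archive.set("Name", str(archive_id))             # Archive ID
    # Document and Field tags each need pass="True"; DocFile is left unchanged.
    for tag in ("Document", "Field"):
        for element in root.iter(tag):
            element.set("pass", "True")
    return ET.tostring(root, encoding="unicode")


standard = (
    '<Import><Archive><Document>'
    '<DocFile FileLoc="C:\\GetSmart\\Processing\\Test\\test.pdf" />'
    '<Fields><Field Name="Vendor Name" value="POP" /></Fields>'
    '</Document></Archive></Import>'
)
direct = to_direct_import(standard, connection_id=1, archive_id=2)
print(direct)
```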

Overview of Schema

<Import> ... </Import>
- Master tag for the entire XML file.
<Archive> ... </Archive>
- Secondary tag that sits inside the <Import> tag and wraps every <Document> entry. This is necessary for proper functionality. ConnectionID refers to the Database ID and Name refers to the Archive ID.
<Document> ... </Document>
- Information about the document to be imported. You need a <Document></Document> for each document being imported via this XML. This tag MUST have pass="True".
<DocFile FileLoc="X:\path\to\document" />
- The FileLoc property in the DocFile is the full file path to the document. The GlobalCapture engine will look for the file in this location.
<Fields> <Field pass="True" Name="Field Name" value="Field value" /> </Fields>
- Within the <Fields></Fields> tag, you specify the relevant index field information for the incoming document. You need a <Field /> tag for each index field. The Name and value properties are the Field Name and Field Value, respectively. Each <Field /> tag MUST have pass="True".

Other important information

All tag names and property names are case-sensitive. If your <Field /> tag says Value instead of value, then the data will not be captured.
All index fields listed in the XML file must be process fields in the workflow. If they are not process fields, the index field data will be ignored.
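
Because a capitalization mistake causes data to be skipped rather than reported, it can help to check a file before dropping it into the folder. The sketch below is one possible pre-check in Python. It only flags tags and properties whose spelling differs from the schema overview by case. The function name `find_case_problems` and the lookup tables are assumptions made for illustration, not a GlobalCapture feature.

```python
import xml.etree.ElementTree as ET

# Expected property names per tag, spelled exactly as in the schema overview.
EXPECTED_ATTRIBUTES = {
    "Archive": {"ConnectionID", "Name"},
    "Document": {"pass"},
    "DocFile": {"FileLoc"},
    "Field": {"Name", "value", "pass"},
}
EXPECTED_TAGS = {"Import", "Fields"} | set(EXPECTED_ATTRIBUTES)


def find_case_problems(xml_text):
    """Return messages for tags or properties whose spelling differs only by case."""
    problems = []
    tags_by_lower = {tag.lower(): tag for tag in EXPECTED_TAGS}
    for element in ET.fromstring(xml_text).iter():
        expected_tag = tags_by_lower.get(element.tag.lower())
        if expected_tag is None:
            continue  # Unknown tag; out of scope for this check.
        if element.tag != expected_tag:
            problems.append(f"<{element.tag}> should be <{expected_tag}>")
        attrs_by_lower = {a.lower(): a for a in EXPECTED_ATTRIBUTES.get(expected_tag, ())}
        for attr in element.attrib:
            expected_attr = attrs_by_lower.get(attr.lower())
            if expected_attr is not None and attr != expected_attr:
                problems.append(f"<{element.tag}> attribute {attr} should be {expected_attr}")
    return problems


sample = (
    '<Import><Archive><Document><Fields>'
    '<Field Name="Status" Value="Processing" />'
    '</Fields></Document></Archive></Import>'
)
print(find_case_problems(sample))
```

This check does not confirm that the referenced files exist or that the listed index fields are process fields in the workflow; those still need to be verified separately.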