Guides :: Symphony OCR

1. Background Information

1.1. Symphony OCR System Background

What is Symphony OCR?

Symphony OCR is part of Symphony Suite, The Complete Imaging Solution.   Symphony OCR is a back-end OCR engine. It will locate all image-only PDF and TIF files in your document management system and convert them to fully text searchable PDFs by adding an invisible layer of text over the image. Symphony OCR typically runs on a back-end PC or server (for Worldox sites, this is typically the Indexer PC).

Tip: Turn OCR off at your scanners and see significant (2 to 5x) improvement in scanning speeds. In fact, Adobe Acrobat turns OCR on by default, and we strongly recommend turning it off. Let Symphony OCR take care of the OCR in the background.

 

...

1.2. Components of Symphony OCR



The Symphony OCR system contains three components:


Document Repository - Stores the documents that Symphony OCR processes.

Symphony OCR - Monitors the Document Repository for image-only PDF and TIFF files, processes those files by adding an invisible layer of OCRed text, then saves them back to the Document Repository.

Web Browser - Displays the Symphony OCR user interface.  This can be viewed from any browser in your local area network, though most commonly it is displayed in a browser on the monitored PC.

 

 

...

2. Installation and Basic Configuration Guide

2.1. Preparation for Installation

To assist you with installing and configuring Symphony OCR, we recommend that you complete a short site survey for the firm so that you know which Symphony OCR features you wish to enable.

Symphony OCR Site Survey for Worldox
Symphony OCR Site Survey for NetDocuments

 

...

2.2. Installation Guide

Install Symphony OCR

For a video showing these steps see:  Symphony OCR - How to Install

  • Download and save the Symphony OCR setup file
  • Double-click on the installer executable to run the Installation Wizard
  • Accept the default installation location and install to the C:\ drive of the workstation
  • Select the appropriate radio button:     
    • Run as logged-in user
    • Run as a Windows Service
      • Enter the username
        Note:  This should be in the DOMAIN\User format
      • Enter the password for this user
  • Select "Next"
  • Leave the "Start Symphony OCR" checkbox checked
  • Select "Finish"
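The service account entered in the wizard is expected in DOMAIN\User form. As a rough illustration only, a value could be sanity-checked before it is typed in with something like the sketch below. The helper name `is_domain_user` and the pattern are assumptions for this example and are not part of Symphony OCR; the pattern is deliberately simple and only checks for one backslash separating a non-empty domain and user name.

```python
import re

# Simplified, hypothetical pattern: DOMAIN, one backslash, then the user name.
_DOMAIN_USER = re.compile(r"^[^\\/]+\\[^\\/]+$")


def is_domain_user(account: str) -> bool:
    """Return True if the account looks like DOMAIN\\User."""
    return bool(_DOMAIN_USER.match(account))
```

For example, `is_domain_user("FIRM\\ocrsvc")` is accepted, while a bare user name or an email-style login is rejected.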

Note to Channel Partners:  Installers can be downloaded from the Channel Partner Resource Center - Implementation Resources

Symphony OCR Configuration Wizard

When you launch Symphony OCR for the very first time, you will be prompted to walk through a few quick steps. Follow the steps below to get SOCR up and running.

Note:  You do not have to follow the wizard's steps, but they cover just about everything you need. If you leave the wizard, you may continue configuring manually using the links on the left side panel. See the Configuration Guide for details.

If you are unable to open Symphony OCR, try copying the URL to a different browser, or see this article: Unable to Open Symphony Interface.

Step 1. Enter the License

Paste in your Symphony OCR license and click "Save and Continue".

(See also Licensing)

Step 2. Input Email and Basic Settings

Input an email address and select a notification type (see Notifications for more info)

If you'd like to process TIFF files and/or email attachments then check the appropriate boxes to enable processing (see Basic Settings section in Processor for more info)


Step 3. DMS Integration

Your license dictates what document management system your software can integrate with. You'll need to configure them in order for Symphony OCR to find your documents and begin working. The buttons to "Configure" will take you to the SOCR settings to set it up, while the "Quick Start Guide" buttons will take you to articles with instructions. You can also refer to the links below.


Each document management system that Symphony OCR integrates with has different instructions for configuring. Click on the appropriate chapter links below to configure SOCR for your document management system:

Configuration Guide - Worldox

Configuration Guide - NetDocuments

Configuration Guide - ShareFile

Configuration Guide - Open Text

Configuration Guide - Practice Master

Configuration Guide - LSSe64

Configuration Guide - Folders

Configuration Guide - Box

Configuration Guide - Dropbox

Configuration Guide - Microsoft One Drive

Configuration Guide - Google Drive

Configuration Guide - SharePoint

Test Symphony OCR (optional)

  • Save a test PDF image file to a monitored folder in your document management system (a sample PDF image is in C:\Program Files\SymphonyOCR\Sample Images)
  • In the Symphony OCR interface, select "Finder" from the configuration side bar
  • Click "Refresh"
  • Confirm that the contents of the test file are OCRed

Adjustments to Scanning Software

Disable OCR in any desktop scanning software (such as Adobe Acrobat).
 
Why: OCR at the desktop level can slow the scanning process by 2 to 5 times and is completely unnecessary now that Symphony OCR is being used for background OCR.
...

3. Configuration Guide - Worldox

3.1. Quick Start Configuration Guide - Worldox

For a video of these steps, visit:  Symphony OCR - Configure for Worldox


Establish the Worldox User

  • Enter the Worldox user code (typically the 000000 user)
  • Select "Save Changes"

Select the 'Profile Groups to Monitor'

  • Select which Profile Groups / Cabinets you wish to process
  • Select "Save Changes"

For more detailed information and advanced settings for configuring Symphony OCR, visit Configuration Guide - Worldox


What Now?

Symphony OCR should be off and running now! The Finder is looking through your repository and sending documents to the Analyzer to determine what needs to be processed. The Analyzer sends any documents eligible for OCR to the Processor which applies an invisible layer of text to the document.

By default, Symphony OCR queries the Worldox document repository for newly saved and modified files every 15 minutes.  Generally speaking, newly saved files will be OCRed within about 15 minutes.  Depending on the volume of image-only documents already filed to Worldox, it may take a while for Symphony OCR to process the backlog (legacy files).  Symphony OCR gives precedence to newer files, so documents that are scanned today will be processed before the backlog.
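The "newer files first" behavior described above can be pictured as ordering the work queue by modification time, newest first. The following is a minimal sketch of that idea, not Symphony OCR's actual scheduler; the `QueuedDocument` type and the `order_for_processing` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class QueuedDocument:
    path: str           # location of the document in the repository
    modified: datetime  # last saved/modified timestamp


def order_for_processing(docs):
    """Return documents newest-first, so today's scans precede the backlog."""
    return sorted(docs, key=lambda d: d.modified, reverse=True)
```

With this ordering, a document scanned today is always placed ahead of a legacy file from years ago, regardless of the order in which the two were found.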

Refer to the section, Configuration Guide - Worldox - Finder for further information on finder settings that determine when Symphony OCR locates files for processing.

Refer to the section, Configuration Guide - Worldox - Processor for further information on configuration settings that determine which files are processed.

Note:  While Symphony OCR will likely process documents within about 15 minutes, they will not be immediately available from within Worldox via 'Text in file' searches.  Within 15 minutes, you will be able to open the file and search for text that way (i.e., Ctrl+F).  But for the documents to be returned in 'Text in file' searches, the Worldox text index needs to be updated (which normally happens every night).  So although your files have been OCRed, you will typically need to wait until the next day to do full 'Text in file' searching via Worldox.


...

3.2. Licensing



License

This is where your Symphony OCR license is set. To change your Symphony OCR license, simply click "Licensing" from the Configuration side bar, enter your new license, and select "Save Changes."

License Details

Provides you details of the license.

Features Allowed by your License

This area tells you which features are allowed by your license.

Updating your License

Starting with version 6.4.96, Symphony OCR includes an 'Automatic License Update' feature.  After you've paid your renewal invoice with Trumpet, a new license is automatically generated.  If your installation has access to the Trumpet servers, Symphony OCR will automatically see this new license, download it, and install it.

Note:  Symphony will check for a new license once every 3 days under normal circumstances, and once per day when your license is within 30 days of expiring.
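The check schedule in the note above can be expressed as a small rule. This is only a sketch of the stated behavior, not Symphony OCR's code; the function name `license_check_interval` is hypothetical, and treating exactly 30 days remaining as "within 30 days" is an assumption.

```python
from datetime import date, timedelta


def license_check_interval(today: date, expires: date) -> timedelta:
    """Daily checks within 30 days of expiry, otherwise every 3 days."""
    days_left = (expires - today).days
    if days_left <= 30:  # assumption: the 30-day boundary is inclusive
        return timedelta(days=1)
    return timedelta(days=3)
```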

If you've paid your invoice (and received notification of a new license) and don't want to wait for the automatic update to kick in, you can click the "Check for Updated License" link on this page.  This will manually trigger Symphony OCR to retrieve the updated license from Trumpet's servers.  As mentioned, all of this assumes your installation has access to Trumpet's servers.  If a connection cannot be established, you can always copy/paste your new license into this screen.

When you receive notification from Trumpet that your new license is generated, it is still highly recommended that you A) update your installation to the latest version of the software, and B) verify your license has been updated.

 

...

3.3. Notifications



Notifications allow users to be emailed nightly based on the status of Symphony OCR.

  • Enter the email address for the person you wish to receive notifications in the "Add e-mail address" box

  • Select the Notification Type from the drop down


Each email address may be configured with one of four types:

Never - nightly emails will never be sent to this recipient (instead, after entering an email address you can select "Send Now" and deliver an email to the recipient on demand).

When there are errors - the nightly email will only be sent to the recipient if the overall system condition is Error.  This is useful for recipients who only need to know when the system is not processing documents because of some major error (licensing issues are the most common major error).

When there are warnings or errors - the nightly email will only be sent to the recipient if the overall system condition is Warning or Error.  The warning condition is triggered by documents in the Needs Attention list, configuration problems or other system level issues that should be looked at, even though they haven't completely stopped processing from occurring.

Always (aka Daily) - the nightly email will be sent to the recipient every night regardless of system status.  This is useful for firms who want to monitor the 'Not Processed' lists to ensure that every document that couldn't be OCRed (e.g., because of security or corruption) has been reviewed.  Users can review documents in the various 'Not Processed' lists and either correct the underlying issue, or move the documents to the Ignore list using Bulk Operations > Ignore.

  • Select "Save Changes"

If you have a user leave the firm or you no longer wish for a particular user to be notified, you can change the Notification Type to "Never" or remove the user entirely by selecting "Remove" to the right of the address.
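The four notification types amount to a severity threshold applied to the overall system condition. The sketch below illustrates that reading; it is not taken from Symphony OCR. The dictionary names and the function `should_send_nightly` are hypothetical, and the condition name "OK" for a healthy system is an assumption (the text only names Warning and Error).

```python
# Assumed ordering of overall system conditions, least to most severe.
SEVERITY = {"OK": 0, "Warning": 1, "Error": 2}

# Minimum severity that triggers the nightly email; None means never send.
THRESHOLD = {
    "Never": None,
    "When there are errors": 2,
    "When there are warnings or errors": 1,
    "Always": 0,
}


def should_send_nightly(notification_type: str, condition: str) -> bool:
    """Decide whether a recipient gets tonight's email."""
    threshold = THRESHOLD[notification_type]
    if threshold is None:
        return False
    return SEVERITY[condition] >= threshold
```

Under this model a recipient set to "When there are errors" hears nothing while the system is merely in a Warning state, which matches the description of that type above.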

...

3.4. Worldox

Basic settings

Worldox User Code - This is where the Worldox user is specified. This is the user that Symphony OCR should search for documents as (note that Symphony OCR does not actually use a Worldox license). Symphony OCR will have access to all profile groups that the specified Worldox user has access to. This user should have Worldox Manager Rights. We recommend using the 000000 user. 

Worldox Network Folder - This is the network folder in which Worldox is installed.  It can be identified as a UNC path or a mapped network drive (e.g. \\server1\DMS\Worldox, or X:\Worldox) unless you are running Symphony OCR as a service, in which case it must be identified as a UNC path.
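The UNC requirement for service installs can be checked mechanically. The following sketch uses `PureWindowsPath` from Python's standard `pathlib` module to tell a UNC path from a mapped drive; the function names `is_unc_path` and `network_folder_ok` are hypothetical, and the check only looks at the form of the path, not whether the folder exists or is reachable.

```python
from pathlib import PureWindowsPath


def is_unc_path(folder: str) -> bool:
    """True for paths of the form \\\\server\\share\\..."""
    return PureWindowsPath(folder).drive.startswith("\\\\")


def network_folder_ok(folder: str, runs_as_service: bool) -> bool:
    """Services need a UNC path; otherwise a mapped drive letter also works."""
    if runs_as_service:
        return is_unc_path(folder)
    return bool(PureWindowsPath(folder).drive)
```

For example, `X:\Worldox` passes when Symphony OCR runs as the logged-in user but fails when it runs as a service, while `\\server1\DMS\Worldox` passes in both cases.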

Profile Groups to Monitor

This is the list of profile groups the specified user has access to. If a profile group does not appear in the list, this user does not have access to that profile group (or the profile group has not been properly configured in Worldox). You can select the checkbox in the header area to automatically select and process all documents in all profile groups. If you wish to process only certain profile groups, select just the applicable ones.  Be sure to select "Save Changes" at the bottom of the screen.

Default Priority - There are 6 processing priorities, which range from Very Low to Very High and include "Analyzer Only".  By default all profile groups will be processed with a "Normal" priority.  If you wish to change the priority for a particular profile group, select the appropriate item from the drop down arrow.  If you wish to re-prioritize documents that have already been found in that particular profile group as well as new documents that are in that profile group, select the "Reprioritize existing documents" checkbox.  For more information on Processing Priorities see:  Processing Priorities

Refresh - Allows you to refresh the list of available profile groups.  For example, if you have added a new profile group to Worldox, and wish to process that group, you can select this which will provide you with the newly added profile groups.

View Detailed Progress - Selecting this will take you to the Progress Details page.  This will provide you with a list of profile groups, the number of documents and pages that have been processed / not processed per profile group.

Advanced settings

Process Read Only Files - If you wish to process read-only files, you should check this checkbox.

Indexed Search Frequency - By default Symphony OCR will search for documents in selected profile groups once every 15 minutes using Indexed Searches.  This should be sufficient for your needs, however you can change this to search more or less frequently.

Non-indexed Search Frequency - By default Symphony OCR will search for documents in selected profile groups once every 12 hours without using Worldox indexes.  Because it takes a significant amount of time to crawl through the directory structure to find files, once every 12 hours should be sufficient.

Debugging

Reset Worldox Session - Selecting "Reset Worldox Session" will reset the Worldox session for the user defined in the Basic Settings above.

...

3.5. Enable Read Only Processing

To OCR files that are marked as Read-Only:

  • Open the Symphony OCR homepage
  • Select "Worldox" from the left sidebar
  • Under Advanced Settings, check "Process read-only files"

  • Select "Save Changes"
...

3.6. Finder

Status

Worldox Indexed Search performs an indexed search to find documents that have been created or modified *today* that are eligible for OCR. By default, it performs the query every 15 minutes. This can be adjusted by selecting "Manage".  This will take you to the Worldox page where you can adjust the search frequency under Advanced Settings.

Worldox Non-Indexed Finder performs a non-indexed search to find all documents in Worldox that are eligible for OCR, regardless of how recently the document has been created or modified. By default, it performs this search once every 12 hours. This can be adjusted by selecting "Manage".  This will take you to the Worldox page where you can adjust the search frequency under Advanced Settings.
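To picture how the two default frequencies interact, the sketch below decides which finder is due at a given moment. It is an illustration of the schedule described above, not Symphony OCR's implementation; the names `INTERVALS` and `finders_due` and the keys "indexed" and "non_indexed" are hypothetical.

```python
from datetime import datetime, timedelta

# Default frequencies from the text: 15 minutes and 12 hours.
INTERVALS = {
    "indexed": timedelta(minutes=15),
    "non_indexed": timedelta(hours=12),
}


def finders_due(last_run, now):
    """Return the names of finders whose interval has elapsed since last_run."""
    return [
        name
        for name, interval in INTERVALS.items()
        if now - last_run[name] >= interval
    ]
```

The frequent indexed search picks up today's documents quickly, while the slower non-indexed crawl catches everything else on a much longer cycle.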

 

...

3.7. Analyzer



The Analyzer is responsible for looking at each document and determining if it is eligible for OCR. If a document is eligible it is placed in the Processing list. If a document is not eligible, it is placed in the appropriate list (for more information on why a document might not be eligible for OCR, refer to the section, Not Processed List).

Control 

In the control area, you can choose to refresh the Analyzer or stop the Analyzer:

Refresh - Selecting Refresh will refresh the Status of the Analyzer page.

Stop Analyzer - Selecting this option will stop the Analyzer from analyzing documents in the document repository.

Status 

Displays the status of the Analyzer.

Information

Machine Processors - Indicates how many logical processors the workstation running Symphony OCR contains.

Licensed parallel processing - Indicates how many documents will be analyzed at a time based on your license features.

Recent Performance (since last restart)

Provides performance statistics such as the total number of documents and pages that Symphony OCR has found eligible for OCR, and the average speed of analysis per document since the last restart of Symphony OCR.

Overall Performance (since last restart)

Provides performance statistics such as the total number of documents and pages that Symphony OCR has found eligible for OCR, and the average speed of analysis per document since the last restart of Symphony OCR.

Settings

Do not analyze documents younger than - The default setting is 30 seconds. If you wish to have the Analyzer wait longer to analyze documents, simply change the value in the field. Trumpet recommends that this value is not decreased to less than 30 seconds to ensure that documents are fully written to the disk before processing.

To change this setting, simply type in the number of seconds, and then select "Save Changes".
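The purpose of the minimum-age setting is to skip files that may still be in the middle of being written. A minimal sketch of such a check follows; the function name `old_enough_to_analyze` is hypothetical and this is not Symphony OCR's own logic. Timestamps are plain seconds, as returned by `time.time()` or `os.path.getmtime()`.

```python
import time


def old_enough_to_analyze(modified_ts, min_age_seconds=30, now=None):
    """True if the file was last modified at least min_age_seconds ago."""
    now = time.time() if now is None else now
    return (now - modified_ts) >= min_age_seconds
```

A file modified 10 seconds ago is skipped on this pass and picked up on a later one, once it has been untouched for the configured 30 seconds.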

...

3.8. Processor

Accessing the Processor

Select Processor in the navigation panel:


The Processor manages the actual OCR processes. Once a document has been identified as eligible for OCR by the Analyzer, the Processor confirms that the file is still eligible for OCR, and then OCRs the file. If a document is successfully OCRed, it is moved to the Processed list (for more information about the flow of documents throughout Symphony OCR, refer to the section Symphony Workflow, Tools & Document Lists).


Control

In the control area, you can choose to refresh the Processor or stop the Processor:

Refresh - Selecting Refresh will refresh the status of the Processor page.

Stop Processor - Selecting this option will stop the Processor from processing documents in the document repository.

Status

The status of the Processor (what it is currently processing).

Information

Processing Capacity Remaining - If you have a license that limits the number of pages you can process per year, the number of pages remaining will appear here.
Machine Processors - Indicates how many logical processors the workstation running Symphony OCR contains.
Licensed parallel processing - Indicates the number of documents that will be processed by the processor simultaneously.

Recent Performance

Provides performance statistics such as the number of documents and pages that Symphony OCR has processed in a smaller sample size and the average speed of processing per page, the average number of pages per document and the effective throughput of the documents.

Overall Performance

Provides performance statistics such as the total number of documents and pages that Symphony OCR has processed and the average speed of processing per page, the average number of pages per document and the effective throughput of the documents.


Basic Settings

Process TIFFs (OCR and convert to PDF) - Symphony OCR can process TIFF files and convert them to image + text PDF files. This is an optional setting. If you wish to process TIFF documents, simply check this checkbox.

Note:  If the firm opts to process TIFF documents, this will change the file extension from .tif to .pdf.  This will "break" any relationships or projects that include this file.

Process MSG (email) attachments - Symphony OCR can process email message attachments.  This is an optional setting.  If you wish to process email message attachments, check this checkbox. 

<Big fat scary warning: 

Due to a limitation in newer versions of Office, Microsoft prevents us from accessing the DLLs that allow us to read/process emails under the following conditions: 
> Symphony OCR is configured to run as a service
> 'Process MSG (email) attachments' is checked
> Outlook 2013 (or possibly Outlook 2016) is open

In these circumstances, you're likely to see the following error:

Therefore, if Symphony OCR is being installed to run as a service *and* will be configured to process email attachments, it is our recommendation to install it on a machine that will not normally have Outlook 2013 (or possibly 2016) open.  On the bright side, our testing has shown that in these situations, Symphony is still processing normal documents and WILL eventually recover and process emails after Office is closed.  But if you can, we recommend avoiding this situation.  If your experience is different, we'd like to hear about it. 

End of big fat scary warning>

Do not process documents younger than - The default setting is 30 seconds. If you wish to have the Processor wait longer to process documents, simply change the value in the field. Trumpet recommends that this value is not decreased to less than 30 seconds to ensure that documents are fully written to the disk before processing.

Do not process documents older than - If you have older documents that you do not want Symphony OCR to process, enter a specific number of days for which the software should process backlog.

Automatically rotate pages to proper orientation -  If selected, the pages will rotate either landscape or portrait according to the text on the page.


Original retention settings

Retain originals of processed files - If selected, Symphony OCR retains copies of the documents that it has processed. These copies appear as versions (if Symphony OCR processes a document 3 times, it will maintain copies of all 3 versions of the document). The user can restore previous versions of a document from the Symphony OCR backup using the document Details screen.

Purge originals of processed files after - The default setting is to retain the originals of processed files for 7 days after which they will be purged. If you wish to change this setting, you can change the value to the appropriate number of days for your firm.
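The retention rule can be thought of as a simple cutoff date. The sketch below shows one way to select retained originals that have aged past the retention period; it is illustrative only, the function name `originals_to_purge` is hypothetical, and how Symphony OCR actually stores and purges its backups is not described here.

```python
from datetime import datetime, timedelta


def originals_to_purge(originals, now, retention_days=7):
    """originals: iterable of (name, processed_at) pairs.

    Returns the names whose retained copy is older than the retention period.
    """
    cutoff = now - timedelta(days=retention_days)
    return [name for name, processed_at in originals if processed_at < cutoff]
```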

Backlog throttling settings (only needed when your license does NOT have unlimited pages for processing)

Default processing capacity reserved for new documents (based on the actual number of new pages added each day) -  This is calculated from the number of pages that were added to the site in the past year.

Override the default processing capacity reserve - This will determine the number of pages you would like to reserve for new documents, evenly spreading the page count capacity across the entire year. To determine a reasonable reserve, allow the Symphony OCR Analyzer module to run, then look at the timeline for the Processing Queue. Adding the number of pages in the first 52 weeks and dividing by 365 gives the average number of pages added to the system per day. Trumpet recommends adding an additional 10% to accommodate future growth or above-average filing. This value should be a reasonable overclocking reserve.
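The arithmetic for the override can be sketched as follows: total the pages from the most recent 52 weeks of the Processing Queue timeline, divide by 365, and add 10%. The function name `daily_reserve` and the choice to round to a whole number of pages are assumptions made for this example.

```python
def daily_reserve(weekly_new_pages, growth_margin=0.10):
    """Estimate a per-day page reserve from 52 weekly page counts."""
    if len(weekly_new_pages) != 52:
        raise ValueError("expected 52 weekly page counts")
    average_per_day = sum(weekly_new_pages) / 365
    # Add the recommended margin for growth or above-average filing.
    return round(average_per_day * (1 + growth_margin))
```

For instance, a site that adds a steady 700 pages per week totals 36,400 pages over 52 weeks, or about 99.7 pages per day; with the 10% margin the sketch suggests a reserve of 110 pages.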

Advanced Settings:

Create thumbnails (if not already present) - Checking this checkbox will create thumbnails if they are not already present.

Enable OCR debug logging - This will help our support team address issues if necessary.  In order to conserve disk space, we recommend not enabling this unless requested by our support team.

Limit parallel processing to X documents - This allows you to limit the number of cores that Symphony OCR will utilize. It uses 1 core per document. For example, if you input 3, Symphony OCR will use only 3 cores and will process 3 documents simultaneously. See: How does Symphony OCR impact the performance of the server or indexer PC
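Since one core is used per document, the number of documents processed at once is bounded by several limits. The sketch below takes the smallest of the machine's logical processors, the licensed parallel processing count, and the optional limit entered here. The function name `effective_parallelism` is hypothetical, and the assumption that the machine's core count also acts as a cap is this example's, not a documented rule.

```python
import os


def effective_parallelism(licensed, user_limit=None, machine_cores=None):
    """Smallest of machine cores, licensed count, and the optional user limit."""
    cores = machine_cores if machine_cores is not None else (os.cpu_count() or 1)
    limits = [cores, licensed]
    if user_limit is not None:
        limits.append(user_limit)
    return max(1, min(limits))
```

On an 8-core machine licensed for 4 parallel documents, entering a limit of 3 yields 3 simultaneous documents; leaving the limit unset yields 4.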


Upon making changes to any of the above settings, select "Save Changes".

Special Note:   See  Indexing Email Attachments to enable the Worldox Indexer to process text.

...

3.9. Worldox security classifications so secured documents can be OCRed

Background

Symphony OCR honors the Worldox security model, so it will only be able to process documents that are fully accessible by the Worldox user that Symphony runs as.

If users want documents that have been restricted in Worldox to be OCRed, they must configure Worldox's security features to allow Symphony to modify the document.

If you do not do this configuration, documents will be placed on the Inaccessible list and will not be OCRed.

Configuration for Ethical Walls

In the ethical wall configuration, be sure that you have added the Symphony Worldox user (usually 000000) to the users list for the ethical wall, and that you have configured that user to have full access to documents covered by the ethical wall.

Configuration for Security Classifications

When users classify individual documents, they need to make sure that they include the Symphony Worldox user (usually 000000) in the classification and give full access to that user.   To make this easier, we suggest creating the following two security classifications (this is done from WDAdmin, Security->Classifications):

  • <Private - Symphony documents>
  • <Read Only - Symphony documents>

First, create the <Private - Symphony documents> classification:

  1. WDAdmin, Security->Classifications
  2. Click New
  3. Ensure that the <Everyone Else> group has NO rights assigned   (this is the default)
  4. Click Add User
  5. Choose the Symphony Worldox user (usually 000000)
  6. Ensure that the Symphony Worldox user has full rights
  7. Click OK
  8. You will be prompted for a classification name and Who Sees It
  9. Set the name to:   <Private - Symphony documents>
  10. Set the Who sees it to: Everyone
  11. Click OK

Now create the <Read Only - Symphony documents> classification:

  1. WDAdmin, Security->Classifications
  2. Click New
  3. Configure the <Everyone Else> group so it has Find and Read rights assigned
  4. Click Add User
  5. Choose the Symphony Worldox user (usually 000000)
  6. Ensure that the Symphony Worldox user has full rights
  7. Click OK
  8. You will be prompted for a classification name and Who Sees It
  9. Set the name to:   <Read Only - Symphony documents>
  10. Set the Who sees it to: Everyone
  11. Click OK

 

If you already have documents in the Inaccessible list, after making classification changes you will need to click Re-Analyze All to make those documents available for processing.

...

3.10. Process Files Outside the Worldox Document Repository

First, you will want to determine if these files have the potential to be migrated to the Worldox document repository. 

If the firm does not want to migrate these documents to the Worldox document repository, but would like to OCR them, you can enable Folder Processing (assuming that the firm has the appropriate licensing).  See: Configuration Guide - Folders.

If the firm will want to migrate these documents (or a subset of these documents) to the Worldox document repository, Trumpet recommends that you create a "Legacy Cabinet" that points at the legacy area.  This ensures that the document record for these files is nicely maintained.  Here are some tips and tricks for setting up the Legacy Profile:

  • Ensure that the Worldox user that Symphony OCR operates under has permissions to modify the documents
  • Ensure that 8.3 Support is enabled
  • Ensure that the files within the structure have 8.3 filenames
  • As you're probably aware, it's typically recommended that the Legacy Profile group be "read only" to prevent users from modifying the documents within Worldox.  If the Legacy Profile is set to "Read Only" however, that will also prevent Symphony OCR from processing the documents.  Therefore, you can set the legacy profile group to Deny Open and Deny Copy, but leave the "Read Only" check box on the profile group "unchecked" when configuring the profile group:

Once the legacy cabinet has been created, it will be listed in the Worldox Configuration page of Symphony OCR so that you can enable processing. 

Tip:  Trumpet offers an 8.3 Filename Tool to assist you in ensuring the 8.3 filenames are intact. Contact support@trumpetinc.com for more information on the tool.

 

 

 

...

4. Configuration Guide - NetDocuments

4.1. Quick Start Configuration Guide - NetDocuments

For a video showing the configuration, visit: Symphony OCR - How to configure for NetDocuments

Enable NetDocuments Integration

  • Select the "Log In" link
  • You will be re-directed to NetDocuments.  Enter the username and password with which you want to run Symphony OCR and select "Login"
    • Important: The user you connect as must be a NetDocuments Admin user.

Available Repositories

  • Select "Activate" next to the repository you wish to activate
  • Confirm by selecting "Yes — Activate this repository"
  • If you wish to process documents already stored in NetDocuments, check the "Look for Legacy Documents" checkbox*
  • If you wish to create a new version of a processed document, check "Create version of OCRed results"

Cabinets to Monitor

  • Select the cabinets you would like Symphony OCR to monitor / process

Save your Settings

  • In order to save your settings, select the "Save Changes" button.

For more detailed information and advanced settings for configuring Symphony OCR, visit Configuration Guide - NetDocuments

*When Symphony OCR processes documents, the 'modified date' of the document will be updated to the date that the OCR occurs, and the 'modified by' will change to the user that Symphony OCR runs as.

 

What Now?

Symphony OCR should be off and running now! The Finder is looking through your repository and sending documents to the Analyzer to determine what needs to be processed. The Analyzer sends any documents eligible for OCR to the Processor which applies an invisible layer of text to the document.

By default, Symphony OCR queries the document repository for newly saved and modified files every 15 minutes.  Generally speaking, newly saved files will be OCRed within about 15 minutes.    Symphony OCR can also optionally process the files already stored in NetDocuments.  By default, it performs a query for these files every 7 days.  Symphony OCR gives precedence to newer files, so documents that are scanned today will be processed before the legacy documents. 

Note that Text-in-File searches within NetDocuments may not return the text in your recently OCRed files for up to 6-8 hours. This is because NetDocuments is designed to run API operations at a lower priority. You will, however, be able to open the file and benefit from the OCR immediately after Symphony processes it.

Refer to the section, Configuration Guide - NetDocuments - Finder, for further information on finder settings that determine when Symphony OCR locates files for processing.

Refer to the section, Configuration Guide - NetDocuments - Processor, for further information on settings that determine which files are processed.

 

...

4.2. Licensing



License

This is where your Symphony OCR license is set. To change your Symphony OCR license, simply click "Licensing" from the Configuration side bar, enter your new license, and select "Save Changes."

License Details

Provides you details of the license.

Features Allowed by your License

This area tells you which features are allowed by your license.

Updating your License

Starting with version 6.4.96, Symphony OCR includes an 'Automatic License Update' feature.  After you've paid your renewal invoice with Trumpet, a new license is automatically generated.  If your installation has access to the Trumpet servers, Symphony OCR will automatically see this new license, download it, and install it.

Note:  Symphony will check for a new license once every 3 days under normal circumstances, and once per day when your license is within 30 days of expiring.

If you've paid your invoice (and received notification of a new license) and don't want to wait for the automatic update to kick in, you can click the "Check for Updated License" link on this page.  This will manually trigger Symphony OCR to retrieve the updated license from Trumpet's servers.  As mentioned, all of this assumes your installation has access to Trumpet's servers.  If a connection cannot be established, you can always copy/paste your new license into this screen.

When you receive notification from Trumpet that your new license is generated, it is still highly recommended that you A) update your installation to the latest version of the software, and B) verify your license has been updated.

 

...

4.3. Notifications



Notifications allow users to be emailed nightly based on the status of Symphony OCR.

  • Enter the email address for the person you wish to receive notifications in the "Add e-mail address" box

  • Select the Notification Type from the drop down


Each email address may be configured with one of four types:

Never - nightly emails will never be sent to this recipient (instead, after entering an email address you can select "Send Now" and deliver an email to the recipient on demand).

When there are errors - the nightly email will only be sent to the recipient if the overall system condition is Error.  This is useful for recipients who only need to know when the system is not processing documents because of some major error (licensing issues are the most common major error).

When there are warnings or errors - the nightly email will only be sent to the recipient if the overall system condition is Warning or Error.  The warning condition is triggered by documents in the Needs Attention list, configuration problems or other system level issues that should be looked at, even though they haven't completely stopped processing from occurring.

Always (aka Daily) - the nightly email will be sent to the recipient every night regardless of system status.  This is useful for firms who want to monitor the 'Not Processed' lists to ensure that every document that couldn't be OCRed (e.g., because of security or corruption) has been reviewed.  Users can review documents in the various 'Not Processed' lists and either correct the underlying issue, or move the documents to the Ignore list using Bulk Operations > Ignore.

  • Select "Save Changes"

If you have a user leave the firm or you no longer wish for a particular user to be notified, you can change the Notification Type to "Never" or remove the user entirely by selecting "Remove" to the right of the address.
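The four notification types above amount to a simple rule that maps the overall system condition to a send / don't send decision. The sketch below restates that rule; the enum and function names are hypothetical and do not come from Symphony OCR itself.

```python
from enum import Enum


class Condition(Enum):
    OK = "OK"
    WARNING = "Warning"
    ERROR = "Error"


class NotificationType(Enum):
    NEVER = "Never"
    ERRORS = "When there are errors"
    WARNINGS_OR_ERRORS = "When there are warnings or errors"
    ALWAYS = "Always"


def should_send_nightly(kind: NotificationType, condition: Condition) -> bool:
    """Decide whether the nightly email goes to a recipient of the given type."""
    if kind is NotificationType.NEVER:
        return False  # only "Send Now" reaches this recipient
    if kind is NotificationType.ALWAYS:
        return True
    if kind is NotificationType.ERRORS:
        return condition is Condition.ERROR
    # Remaining case: "When there are warnings or errors"
    return condition in (Condition.WARNING, Condition.ERROR)
```

A recipient set to "When there are errors" would therefore receive nothing on a night when the system condition is Warning.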

...

4.4. NetDocuments

Connect to NetDocuments

 

Enable NetDocuments Integration

  • Select the appropriate "Connect to NetDocuments" button for the location of your repository
  • You will be re-directed to NetDocuments.  Enter the username and password of the account you want Symphony OCR to run as and select "Login"
    • Important: The user you connect as must be a NetDocuments Admin user.

Available Repositories

This is the list of repositories the specified user has access to.  If you wish to only process certain repositories, you can simply select the applicable ones.  Select "Activate" to activate the repository for processing and confirm by selecting "Yes - Activate this repository".  If you do not wish to activate the repository, choose "Cancel".

Important Note: Adding a repository to Symphony OCR is a one-way action; once added, a repository cannot be removed. Be sure you only activate processing of repositories that you want to permanently tie to your license.  Activating a repository will increase your NetDocuments user count.

Basic Settings

Connected to NetDocuments as - Displays the user that Symphony OCR connects to NetDocuments as.

Active Repository - Displays the active repository name(s).

Preserve modified user and date - Should be selected by default.  This ensures that when Symphony OCR processes a document it does not change the modified date or the modified user of the document.

Process legacy documents - When checked, Symphony OCR will process eligible documents that are already stored in NetDocuments.  If the "Preserve modified user and date" checkbox is not set, processing will update the modified date to the date that the OCR occurs, and the 'modified by' user will be changed to the user that Symphony OCR is connected to NetDocuments as.  If the "Preserve modified user and date" checkbox is checked, the dates and users will not be modified.

Create versions of OCRed results - When selected, Symphony OCR will save the OCRed document as a new version.  If the "Preserve modified user and date" checkbox is not set, versioning lets you still see the original modified date of a document when you choose to process legacy documents; however, it will significantly increase your storage (see Modified Dates of Documents in NetDocuments for further details).  If the "Preserve modified user and date" checkbox is set, you may still opt to check this checkbox, again understanding that this will significantly increase your storage.

Cabinets to Monitor

This is the list of cabinets the Symphony OCR user has access to. If a cabinet does not appear in the list, this user does not have access to that cabinet. You can select the checkbox in the header area to automatically select and process all documents in all cabinets. If you wish to only process certain cabinets, you can simply select the applicable ones.  Be sure to select "Save Changes" at the bottom of the screen.  If you wish to process certain cabinets at a higher priority than others, you can do so by selecting the appropriate drop down in the list.  For more information see:  Processing Priorities

View detailed progress - Selecting this link will take you to the Progress Details page.  This will provide you with a list of cabinets and the number of documents and pages that have been processed / not processed per cabinet.

Advanced Settings

Create versions of OCRed results - When checked, Symphony OCR will save the OCRed document as a second version of the original.  This will significantly increase the amount of storage used, as it in essence duplicates documents.  We strongly recommend against enabling this feature, as Symphony OCR has several mechanisms available for recovering pre-OCRed versions of documents.

New documents search frequency - By default, Symphony OCR will perform a search for new documents every 15 minutes.  The value on the right may be adjusted if you require searching for documents less frequently.

Legacy documents search frequency - By default, Symphony OCR will perform a search for legacy documents (documents existing prior to installing Symphony OCR) every 7 days.

 

 

...

4.5. Finder

Status

NetDocuments Recent Documents Search - performs a search to find documents that have been created or modified *today* that are eligible for OCR.  By default it performs the query every 15 minutes.  This can be adjusted by selecting "Manage", which will take you to the NetDocuments page where you can adjust the search frequency under Advanced Settings.

NetDocuments Legacy Documents Search - performs a search to find legacy documents that are eligible for OCR.  By default it performs the query every 7 days.  This can be adjusted by selecting "Manage", which will take you to the NetDocuments page where you can adjust the search frequency under Advanced Settings.

...

4.6. Analyzer



The Analyzer is responsible for looking at each document and determining if it is eligible for OCR. If a document is eligible it is placed in the Processing list. If a document is not eligible, it is placed in the appropriate list (for more information on why a document might not be eligible for OCR, refer to the section, Not Processed List).

Control 

In the control area, you can choose to refresh the Analyzer or stop the Analyzer:

Refresh - Selecting Refresh will refresh the Status of the Analyzer page.

Stop Analyzer - Selecting this option will stop the Analyzer from analyzing documents in the document repository.

Status 

Displays the status of the Analyzer.

Information

Machine Processors - Indicates how many logical processors the workstation running Symphony OCR contains.

Licensed parallel processing - Indicates how many documents will be analyzed at a time based on your license features.

Recent Performance (since last restart)

Provides performance statistics such as the total number of documents and pages that Symphony OCR has found eligible for OCR, and the average speed of analysis per document since the last restart of Symphony OCR.

Overall Performance (since last restart)

Provides performance statistics such as the total number of documents and pages that Symphony OCR has found eligible for OCR, and the average speed of analysis per document since the last restart of Symphony OCR.

Settings

Do not analyze documents younger than - The default setting is 30 seconds. If you wish to have the Analyzer wait longer to analyze documents, simply change the value in the field. Trumpet recommends that this value is not decreased to less than 30 seconds to ensure that documents are fully written to the disk before processing.

To change this setting, simply type in the number of seconds, and then select "Save Changes".

...

4.7. Processor

Accessing the Processor

Select Processor in the navigation panel:


The Processor manages the actual OCR processes. Once a document has been identified as eligible for OCR by the Analyzer, the Processor confirms that the file is still eligible for OCR, and then OCRs the file. If a document is successfully OCRed, it is moved to the Processed list (for more information about the flow of documents throughout Symphony OCR, refer to the section Symphony Workflow, Tools & Document Lists).


Control

In the control area, you can choose to refresh the Processor or stop the Processor:

Refresh - Selecting Refresh will refresh the status of the Processor page.

Stop Processor - Selecting this option will stop the Processor from processing documents in the document repository.

Status

The status of the Processor (what it is currently processing).

Information

Processing Capacity Remaining - If you have a license that limits the number of pages you can process per year, the number of pages remaining will appear here.
Machine Processors - Indicates how many logical processors the workstation running Symphony OCR contains.
Licensed parallel processing - Indicates the number of documents that will be processed by the processor simultaneously.

Recent Performance

Provides performance statistics such as the number of documents and pages that Symphony OCR has processed in a smaller sample size and the average speed of processing per page, the average number of pages per document and the effective throughput of the documents.

Overall Performance

Provides performance statistics such as the total number of documents and pages that Symphony OCR has processed and the average speed of processing per page, the average number of pages per document and the effective throughput of the documents.


Basic Settings

Process TIFFs (OCR and convert to PDF) - Symphony OCR can process TIFF files and convert them to image + text PDF files. This is an optional setting. If you wish to process TIFF documents, simply check this checkbox.

Note:  If the firm opts to process TIFF documents, the file extension will change from .tif to .pdf.  This will "break" any relationships or projects that include the file.

Process MSG (email) attachments - Symphony OCR can process email message attachments.  This is an optional setting.  If you wish to process email message attachments, check this checkbox. 

<Big fat scary warning: 

Due to a limitation in newer versions of Office, Microsoft prevents us from accessing the DLLs that allow us to read/process emails under the following conditions: 
> Symphony OCR is configured to run as a service
> 'Process MSG (email) attachments' is checked
> Outlook 2013 (or possibly Outlook 2016) is open

In these circumstances, you're likely to see the following error:

Therefore, if Symphony OCR is being installed to run as a service *and* will be configured to process email attachments, it is our recommendation to install it on a machine that will not normally have Outlook 2013 (or possibly 2016) open.  On the bright side, our testing has shown that in these situations, Symphony is still processing normal documents and WILL eventually recover and process emails after Office is closed.  But if you can, we recommend avoiding this situation.  If your experience is different, we'd like to hear about it. 

End of big fat scary warning>

Do not process documents younger than - The default setting is 30 seconds. If you wish to have the Processor wait longer to process documents, simply change the value in the field. Trumpet recommends that this value is not decreased to less than 30 seconds to ensure that documents are fully written to the disk before processing.

Do not process documents older than - If you have older documents that you do not want Symphony OCR to process, enter a specific number of days for which the software should process backlog.

Automatically rotate pages to proper orientation -  If selected, the pages will rotate either landscape or portrait according to the text on the page.


Original retention settings

Retain originals of processed files - If selected, Symphony OCR retains copies of the documents that it has processed. These copies appear as versions (if Symphony OCR processes a document 3 times, it will maintain copies of all 3 versions of the document). The user can restore previous versions of a document from the Symphony OCR backup using the document Details screen.

Purge originals of processed files after - The default setting is to retain the originals of processed files for 7 days after which they will be purged. If you wish to change this setting, you can change the value to the appropriate number of days for your firm.

Backlog throttling settings (only needed when your license does NOT have unlimited pages for processing)

Default processing capacity reserved for new documents (based on the actual number of new pages added each day) -  This is calculated from the number of pages that were added to the site in the past year.

Override the default processing capacity reserve - This sets the number of pages you would like to reserve for new documents each day, spreading the page count capacity evenly across the entire year. To determine a reasonable reserve, allow the Symphony OCR Analyzer module to run, then look at the timeline for the Processing Queue. Adding up the number of pages in the first 52 weeks and dividing by 365 gives the average number of pages added to the system per day. Trumpet recommends adding an additional 10% to accommodate future growth or above-average filing. The result should be a reasonable value for the override.
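The reserve arithmetic described above (pages in the first 52 weeks, divided by 365, plus 10%) can be worked through with a short sketch. The function name and the rounding to a whole number of pages are assumptions made for illustration; Symphony OCR may calculate its own default differently.

```python
def daily_reserve(pages_first_52_weeks: int, growth_margin: float = 0.10) -> int:
    """Estimate the pages to reserve per day for new documents."""
    average_per_day = pages_first_52_weeks / 365
    # Add the recommended margin, then round to a whole number of pages.
    return round(average_per_day * (1 + growth_margin))
```

For example, if the first 52 weeks of the Processing Queue timeline add up to 365,000 pages, the average is 1,000 pages per day and the suggested reserve is 1,100.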

Advanced Settings:

Create thumbnails (if not already present) - Checking this checkbox will create thumbnails if they are not already present.

Enable OCR debug logging - This enables debug logging, which will help our support team address issues if necessary.  In order to conserve disk space, we recommend not enabling this unless requested by our support team.

Limit parallel processing to X documents - This allows you to limit the number of cores that Symphony OCR will utilize. It uses 1 core per document. For example, if you input 3, Symphony OCR will only use 3 cores and will process 3 documents simultaneously. See: How does Symphony OCR impact the performance of the server or indexer PC
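One way to picture how this limit interacts with the Machine Processors and Licensed parallel processing values shown in the Information area is to take the smallest of the three. This is an assumption made for illustration, not documented Symphony OCR behavior, and the function name is hypothetical.

```python
from typing import Optional


def effective_parallelism(machine_processors: int,
                          licensed_parallel: int,
                          limit: Optional[int] = None) -> int:
    """Documents processed at once, assuming one core per document."""
    candidates = [machine_processors, licensed_parallel]
    if limit is not None:
        candidates.append(limit)  # the "Limit parallel processing" setting
    # Assumption: the most restrictive value wins, and at least 1 document runs.
    return max(1, min(candidates))
```

Under this assumption, an 8-core machine with a license for 4 parallel documents and a limit of 3 would process 3 documents at a time.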


Upon making changes to any of the above settings, select "Save Changes".

...

4.8. Modified Dates & Modified Users of Documents in NetDocuments

Symphony OCR version 6.6.22 and higher

Modified Dates:

On a new installation, by default Symphony OCR will preserve the modified date of the document (assuming the "Preserve modified user and date" checkbox has been selected in the NetDocuments Integration Settings page). 

Modified Users:

On a new installation, by default Symphony OCR will preserve the modified user of the document (assuming the "Preserve modified user and date" checkbox has been selected in the NetDocuments Integration Settings page).

Create Version of OCRed Results

There are four options regarding creating new versions of documents in Symphony OCR:

Do not create versions:  Symphony OCR will not create a new version of the OCRed results with the exception of .msg files.  NetDocuments requires that Version 1 of emails not be overwritten.

Create versions for all documents:  Symphony OCR will create a new version of each document as it processes the documents.

Create versions for PDF files only:  Symphony OCR will create a new version of PDF files (not .tiff files).

Create versions for non-PDF documents only:  Symphony OCR will create versions for .msg files (default behavior) and .tiff files as well. 
Note:  Symphony OCR converts .tiff files to .pdf before processing them.  If you opt to process .tiff files they cannot be converted back to .tiff after processing.

Note:  NetDocuments disallows modifications to version 1 of any MSG document.  Therefore, the first change Symphony makes to an MSG will produce a second version, regardless of which of the above options you select.
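The four options and the MSG rule can be summarized as a small decision function. This is a sketch of the rules as described in this section, covering the first change Symphony OCR makes to a document; the enum values and the function name are hypothetical.

```python
from enum import Enum


class VersionOption(Enum):
    NONE = "Do not create versions"
    ALL = "Create versions for all documents"
    PDF_ONLY = "Create versions for PDF files only"
    NON_PDF_ONLY = "Create versions for non-PDF documents only"


def creates_new_version(option: VersionOption, extension: str) -> bool:
    """Return True if the OCRed result is saved as a new version."""
    ext = extension.lower().lstrip(".")
    if ext == "msg":
        # NetDocuments disallows modifying version 1 of an MSG document,
        # so a second version is produced regardless of the option.
        return True
    if option is VersionOption.NONE:
        return False
    if option is VersionOption.ALL:
        return True
    if option is VersionOption.PDF_ONLY:
        return ext == "pdf"
    return ext != "pdf"  # NON_PDF_ONLY
```

For instance, with "Create versions for PDF files only" selected, a .tiff file is overwritten in place while a .pdf file gets a new version.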

Symphony OCR version 6.6.21 and lower

(Update already, why don't 'cha :) )

Modified Dates:

Symphony OCR changed the modified dates of documents.

Symphony OCR exactly preserved all of the original content in a document.  However, when Symphony OCR added the invisible layer of text, the underlying file was modified because non-visual information was added.  This means that the modified date of those documents changed to the date that Symphony OCR processed them.  When processing documents in "real time" this only changed the modified date slightly (by a few minutes or hours).  When processing your legacy store of documents (Symphony OCR treats documents modified more than 7 days ago as 'legacy'), however, it may have significantly changed the modified date.


For example, you may have had an image-only PDF document stored in NetDocuments with the modified date of 6/17/2013.  When Symphony OCR was installed and processed that document, the modified date was changed to the date the document was processed.  So if it was OCRed on 9/18/2014, the modified date would have shown as 9/18/2014.  


If tracking the original modified date of your legacy store was important, you could have opted to enable versioning of your documents.  When Symphony OCR processed your documents, it created a new version of the document, ensuring the original modified date stayed intact.  If you opted to have Symphony OCR save the processed document as a new version, the amount of storage you were using will have increased (because there are multiple versions of the same document in your document repository).  To enable versioning of your documents, see:  NetDocuments - Basic Settings. 

Modified User:

Because Symphony OCR actually "changed" the PDF documents (and TIFF documents if you chose to process them) by adding an invisible layer of text to the document, the modified user of those documents changed to the user that Symphony OCR was running as.

Updating to Version 6.6.22:

Upon updating to Version 6.6.22 you will be prompted in the Summary Page that NetDocuments Integration has warnings.  Select "Manage" to see the warnings.

At the top of the screen, you will see the "Issue" presented is as follows:

NetDocuments is currently configured to NOT preserve modified user and date for OCRed documents.  We strongly recommend that you enable the 'Preserved modified user and date' option.  Tip:  You may also want to consider disabling 'Create versions of OCRed results' when you make this change.

To update this, select the 'Preserve modified user and date' checkbox and, if you wish, adjust the 'Create versions of OCRed results' option.

If you prefer NOT to have the Modified User and Dates preserved, you can select the "IGNORE" link which will remove the warning from the NetDocuments Integration Settings and the Summary pages.


...

4.9. Legacy Documents

Processing Legacy Documents in NetDocuments:

Symphony OCR will process documents that were stored in the NetDocuments repository prior to Symphony OCR having been installed if you opt to enable the functionality in the NetDocuments Configuration screen.  Symphony OCR recognizes documents modified more than 7 days ago as 'legacy' documents.  When you enable 'Process legacy documents', Symphony OCR will periodically search through all documents for files that are eligible for OCR. The frequency of this search is controlled in the 'Advanced settings'.  The default behavior is to perform the search every 7 days.
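The 7-day rule for 'legacy' documents can be expressed as a one-line check. This is only an illustration of the rule as stated here; the function name is hypothetical, and treating a document modified exactly 7 days ago as not yet legacy is an assumption.

```python
from datetime import datetime, timedelta

LEGACY_AGE = timedelta(days=7)  # threshold stated in this section


def is_legacy(modified: datetime, now: datetime) -> bool:
    """Return True if the document was modified more than 7 days before 'now'."""
    return now - modified > LEGACY_AGE
```

A document last modified three weeks ago is therefore only picked up when 'Process legacy documents' is enabled.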

...

4.10. Reset NetDocuments User

To reset the NetDocuments user

  • Close Symphony OCR (if it is running as a service then stop the service)
  • Navigate to C:\Program Files (x86)\Trumpet\SymphonyOCR\Config
  • Right click on the settings.xml file and choose to open it with Notepad
  • Find the line that starts with '<ndConnectionManager'
  • Delete the entire line (Be careful not to delete anything else!)
  • Save the file
  • Launch Symphony OCR
  • Navigate to the 'NetDocuments' tab
  • Click the 'Log In' link and follow the prompts
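
If you prefer not to edit the file by hand, the delete step above could be scripted along these lines. This is a minimal sketch rather than a Trumpet-provided tool: it assumes the file is UTF-8 encoded and that the '<ndConnectionManager' element sits on a single line, as the steps describe. The function name is hypothetical. It writes a .bak copy first so the original can be restored.

```python
import shutil
from pathlib import Path

MARKER = "<ndConnectionManager"


def remove_connection_line(settings_path: Path) -> int:
    """Delete lines starting with the marker; return how many were removed."""
    # Keep a backup next to the original, e.g. settings.xml.bak
    backup = settings_path.with_name(settings_path.name + ".bak")
    shutil.copy2(settings_path, backup)

    # newline="" preserves the file's existing line endings.
    with open(settings_path, encoding="utf-8", newline="") as handle:
        lines = handle.read().splitlines(keepends=True)

    kept = [line for line in lines if not line.lstrip().startswith(MARKER)]

    with open(settings_path, "w", encoding="utf-8", newline="") as handle:
        handle.write("".join(kept))
    return len(lines) - len(kept)
```

Run it only while Symphony OCR is closed (or the service is stopped), for example with remove_connection_line(Path(r"C:\Program Files (x86)\Trumpet\SymphonyOCR\Config\settings.xml")), and then continue with the "Launch Symphony OCR" step.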
...

4.11. Symphony OCR for NetDocuments Security Overview

Symphony OCR Software Architecture

Symphony OCR is installed as an on-premise Windows service.

Symphony OCR consists of a back-end service that monitors NetDocuments for new and changed documents, analyzes documents, and OCRs documents.  The Symphony OCR service also presents a web based interface for administration.  This web interface is only exposed on the firm’s internal network.

Authentication With NetDocuments

Symphony OCR interacts with NetDocuments via the standard NetDocuments REST API (full details of the NetDocuments API integration can be found here: https://support.netdocuments.com/hc/en-us/articles/205219850-API-Documentation ).

Symphony OCR uses the Internet standard OAuth2 authentication protocol to request permission from the user.  Once the user has approved integration, NetDocuments provides Symphony OCR with an access token that is used for operations that interact with NetDocuments (querying for document metadata, downloading document content, uploading document content).  A NetDocuments administrative user must give explicit permission for this access to be configured, and the administrative user may revoke the access via the NetDocuments administrative interface at any time.

All network communication between Symphony OCR and NetDocuments is encrypted using standard HTTPS protocols.

Details of the Initial Account Setup Use-Case

  1. During initial setup, the user clicks a "Connect to NetDocuments" button in the Symphony OCR interface.
  2. After clicking, they are redirected to the NetDocuments login screen, where they enter their NetDocuments administrative account credentials.
  3. A short-lived authorization code is then returned to Symphony OCR.
  4. Symphony OCR interacts directly with the NetDocuments authentication server to exchange the authorization code for an access token.  This exchange is protected by a pre-agreed Client ID and secret Client Token that is specific to Symphony OCR.
  5. The access token is used for all subsequent interaction between Symphony OCR and NetDocuments.  Under no circumstances does the browser have access to the NetDocuments access token.
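
The steps above follow the general shape of the OAuth2 authorization code flow. The sketch below illustrates that generic flow only: the URLs are placeholders, the parameter names are the standard OAuth2 ones rather than anything confirmed for NetDocuments, the "state" check is a common OAuth2 safeguard that this guide does not mention, and the functions are hypothetical. Nothing is sent over the network.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Placeholder endpoints, NOT the real NetDocuments URLs.
AUTHORIZE_URL = "https://auth.example.invalid/oauth/authorize"
TOKEN_URL = "https://auth.example.invalid/oauth/token"


def build_authorize_url(client_id: str, redirect_uri: str, state: str) -> str:
    """Steps 1-2: the URL the browser is redirected to for login."""
    query = urlencode({
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "state": state,
    })
    return f"{AUTHORIZE_URL}?{query}"


def extract_code(callback_url: str, expected_state: str) -> str:
    """Step 3: read the short-lived authorization code from the callback."""
    params = parse_qs(urlparse(callback_url).query)
    if params.get("state", [None])[0] != expected_state:
        raise ValueError("state mismatch; ignoring callback")
    return params["code"][0]


def build_token_request(code: str, client_id: str, client_secret: str,
                        redirect_uri: str) -> dict:
    """Step 4: form fields the service would POST to TOKEN_URL over HTTPS."""
    return {
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": redirect_uri,
        "client_id": client_id,
        "client_secret": client_secret,  # never exposed to the browser
    }
```

The design point this illustrates is step 5: the exchange in build_token_request happens between the back-end service and the authentication server, so the resulting token never passes through the browser.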


...

5. Configuration Guide - ShareFile

5.1. Quick Start Configuration Guide - ShareFile

Connect to ShareFile

  • Select the "Log In" button
  • A ShareFile window will open.  Enter the username and password of the account you want Symphony OCR to run as
  • Select "Login"

Folders to Monitor

  • Select the folders you wish to process

Save your Settings

  • Select "Save Changes" to save your settings

For more detailed information and advanced settings for configuring Symphony OCR, visit Configuration Guide - ShareFile.

...

5.2. Licensing



License

This is where your Symphony OCR license is set. To change your Symphony OCR license, simply click "Licensing" from the Configuration side bar, enter your new license, and select "Save Changes."

License Details

Provides you details of the license.

Features Allowed by your License

This area tells you which features are allowed by your license.

Updating your License

Starting with version 6.4.96, Symphony OCR includes an 'Automatic License Update' feature.  After you've paid your renewal invoice with Trumpet, a new license is automatically generated.  If your installation has access to the Trumpet servers, Symphony OCR will automatically see this new license, download it and install it.

Note:  Symphony will check for a new license once every 3 days under normal circumstances, and once per day when your license is within 30 days of expiring.

If you've paid your invoice (and received notification of a new license) and don't want to wait for the automatic update to kick in, you can click the "Check for Updated License" link on this page.  This will manually trigger Symphony OCR to retrieve the updated license from Trumpet's servers.  As mentioned, all of this assumes your installation has access to Trumpet's servers.  If a connection cannot be established, you can always copy/paste your new license into this screen.

When you receive notification from Trumpet that your new license is generated, it is still highly recommended that you A) update your installation to the latest version of the software, and B) verify your license has been updated.

 

...

5.3. Notifications



Notifications allow users to be emailed nightly based on the status of Symphony OCR.

  • Enter the email address for the person you wish to receive notifications in the "Add e-mail address" box

  • Select the Notification Type from the drop down


Each email address may be configured with one of four types:

Never - nightly emails will never be sent to this recipient (instead, after entering an email address you can select "Send Now" and deliver an email to the recipient on demand).

When there are errors - the nightly email will only be sent to the recipient if the overall system condition is Error.  This is useful for recipients who only need to know when the system is not processing documents because of some major error (licensing issues are the most common major error).

When there are warnings or errors - the nightly email will only be sent to the recipient if the overall system condition is Warning or Error.  The warning condition is triggered by documents in the Needs Attention list, configuration problems or other system level issues that should be looked at, even though they haven't completely stopped processing from occurring.

Always (aka Daily) - the nightly email will be sent to the recipient every night regardless of system status.  This is useful for firms who want to monitor the 'Not Processed' lists to ensure that every document that couldn't be OCRed (e.g., because of security or corruption) has been reviewed.  Users can review documents in the various 'Not Processed' lists and either correct the underlying issue, or move the documents to the Ignore list using Bulk Operations > Ignore.

  • Select "Save Changes"

If you have a user leave the firm or you no longer wish for a particular user to be notified, you can change the Notification Type to "Never" or remove the user entirely by selecting "Remove" to the right of the address.

...

5.4. ShareFile

Connect to ShareFile

  • Select the "Log In" link
  • You will be re-directed to ShareFile.  Enter the username and password of the account you want Symphony OCR to run as and select "Login"

Basic Settings

ShareFile Account - Displays the user that Symphony OCR connects to ShareFile as.

Folders to Monitor

This is the list of folders the specified user has access to. If a folder does not appear in the list, this user does not have access to that folder.

  • Select the checkbox in the header area to automatically select and process all documents in all folders.  If you wish to only process certain folders, select only the applicable ones.
  • Select "Save Changes" at the bottom of the screen.  If you wish to process certain folders at a higher priority than others, you can do so by selecting the appropriate drop down in the list.  For more information see:  Processing Priorities

View detailed progress - Selecting this link will take you to the Progress Details page.  This will provide you with a list of folders and the number of documents and pages that have been processed / not processed per folder.

Advanced Settings

Search frequency - By default, Symphony OCR will perform a search for new documents every 60 minutes.  The value on the right may be adjusted if you require searching for documents less frequently.

...

5.5. Finder

Status

ShareFile Search - performs a search in the monitored folder structure to find all documents that are eligible for OCR regardless of how recently the document has been created or modified.  By default it performs this search once every 60 minutes.  This can be adjusted by selecting "Manage", which will take you to the Folders page where you can adjust the search frequency for each folder.

 

...

5.6. Analyzer



The Analyzer is responsible for looking at each document and determining if it is eligible for OCR. If a document is eligible it is placed in the Processing list. If a document is not eligible, it is placed in the appropriate list (for more information on why a document might not be eligible for OCR, refer to the section, Not Processed List).

Control 

In the control area, you can choose to refresh the Analyzer or stop the Analyzer:

Refresh - Selecting Refresh will refresh the Status of the Analyzer page.

Stop Analyzer - Selecting this option will stop the Analyzer from analyzing documents in the document repository.

Status 

Displays the status of the Analyzer.

Information

Machine Processors - Indicates how many logical processors the workstation running Symphony OCR contains.

Licensed parallel processing - Indicates how many documents will be analyzed at a time based on your license features.

Recent Performance (since last restart)

Provides performance statistics such as the total number of documents and pages that Symphony OCR has found eligible for OCR, and the average speed of analysis per document since the last restart of Symphony OCR.

Overall Performance (since last restart)

Provides performance statistics such as the total number of documents and pages that Symphony OCR has found eligible for OCR, and the average speed of analysis per document since the last restart of Symphony OCR.

Settings

Do not analyze documents younger than - The default setting is 30 seconds. If you wish to have the Analyzer wait longer to analyze documents, simply change the value in the field. Trumpet recommends that this value is not decreased to less than 30 seconds to ensure that documents are fully written to the disk before processing.

To change this setting, simply type in the number of seconds, and then select "Save Changes".

...

5.7. Processor

Accessing the Processor

Select Processor in the navigation panel:


The Processor manages the actual OCR processes. Once a document has been identified as eligible for OCR by the Analyzer, the Processor confirms that the file is still eligible for OCR, and then OCRs the file. If a document is successfully OCRed, it is moved to the Processed list (for more information about the flow of documents throughout Symphony OCR, refer to the section Symphony Workflow, Tools & Document Lists).


Control

In the control area, you can choose to refresh the Processor or stop the Processor:

Refresh - Selecting Refresh will refresh the status of the Processor page.

Stop Processor - Selecting this option will stop the Processor from processing documents in the document repository.

Status

The status of the Processor (what it is currently processing).

Information

Processing Capacity Remaining - If you have a license that limits the number of pages you can process per year, the number of pages remaining will appear here.
Machine Processors - Indicates how many logical processors the workstation running Symphony OCR contains.
Licensed parallel processing - Indicates the number of documents that will be processed by the processor simultaneously.

Recent Performance

Provides performance statistics for a smaller, more recent sample: the number of documents and pages that Symphony OCR has processed, the average speed of processing per page, the average number of pages per document and the effective throughput of the documents.

Overall Performance

Provides performance statistics such as the total number of documents and pages that Symphony OCR has processed and the average speed of processing per page, the average number of pages per document and the effective throughput of the documents.


Basic Settings

Process TIFFs (OCR and convert to PDF) - Symphony OCR can process TIFF files and convert them to image + text PDF files. This is an optional setting. If you wish to process TIFF documents, simply check this checkbox.

Note:  If the firm opts to process TIFF documents, this will change the file extension from .tif to .pdf.  This will "break" any relationships or projects that include this file.

Process MSG (email) attachments - Symphony OCR can process email message attachments.  This is an optional setting.  If you wish to process email message attachments, check this checkbox. 

<Big fat scary warning: 

Due to a limitation in newer versions of Office, Microsoft prevents us from accessing the DLLs that allow us to read/process emails under the following conditions: 
> Symphony OCR is configured to run as a service
> 'Process MSG (email) attachments' is checked
> Outlook 2013 (or possibly Outlook 2016) is open

In these circumstances, you're likely to see the following error:

Therefore, if Symphony OCR is being installed to run as a service *and* will be configured to process email attachments, it is our recommendation to install it on a machine that will not normally have Outlook 2013 (or possibly 2016) open.  On the bright side, our testing has shown that in these situations, Symphony is still processing normal documents and WILL eventually recover and process emails after Office is closed.  But if you can, we recommend avoiding this situation.  If your experience is different, we'd like to hear about it. 

End of big fat scary warning>

Do not process documents younger than - The default setting is 30 seconds. To have the Processor wait longer before processing documents, increase the value in the field. Trumpet recommends not setting this value below 30 seconds, so that documents are fully written to disk before they are processed.

Do not process documents older than - If you have older documents that you do not want Symphony OCR to process, enter the maximum document age in days. Symphony OCR will only process backlog documents that fall within that number of days.

Automatically rotate pages to proper orientation -  If selected, the pages will rotate either landscape or portrait according to the text on the page.


Original retention settings

Retain originals of processed files - If selected, Symphony OCR retains copies of the documents that it has processed. These copies appear as versions (if Symphony OCR processes a document 3 times, it will maintain copies of all 3 versions of the document). The user can restore previous versions of a document from the Symphony OCR backup using the document Details screen.

Purge originals of processed files after - The default setting is to retain the originals of processed files for 7 days after which they will be purged. If you wish to change this setting, you can change the value to the appropriate number of days for your firm.

Backlog throttling settings (only needed when your license does NOT have unlimited pages for processing)

Default processing capacity reserved for new documents (based on the actual number of new pages added each day) -  This is calculated from the number of pages that were added to the site in the past year.

Override the default processing capacity reserve - This sets the number of pages you would like to reserve for new documents, spreading the page count capacity evenly across the entire year. To determine a reasonable reserve, allow the Symphony OCR Analyzer module to run, then look at the timeline for the Processing Queue. Adding the number of pages in the first 52 weeks and dividing by 365 gives the average number of pages added to the system per day. Trumpet recommends adding an additional 10% to accommodate future growth or above-average filing. The resulting value should be a reasonable reserve.
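As a rough illustration of the arithmetic described above, the following Python sketch sums 52 weekly page counts, divides by 365 to get a per-day average, and adds the recommended 10%. The function name, the rounding up to a whole page, and the sample weekly counts are assumptions made for the example; they are not part of Symphony OCR.

```python
import math

def daily_reserve(weekly_page_counts, growth_margin=0.10):
    """Estimate the pages per day to reserve for new documents.

    weekly_page_counts: pages added in each of the first 52 weeks,
    read manually from the Processing Queue timeline (hypothetical input).
    """
    if len(weekly_page_counts) != 52:
        raise ValueError("expected exactly 52 weekly page counts")
    pages_per_day = sum(weekly_page_counts) / 365
    # Add the recommended margin and round up to a whole page.
    return math.ceil(pages_per_day * (1 + growth_margin))

# Example: a site that files about 700 pages every week.
print(daily_reserve([700] * 52))
```

For a site filing 700 pages per week, this yields 110 pages per day (36,400 / 365 is about 99.7, plus 10% is about 109.7, rounded up).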

Advanced Settings:

Create thumbnails (if not already present) - Checking this checkbox will create thumbnails if they are not already present.

Enable OCR debug logging - This enables debug logging to help our support team address issues if necessary.  To conserve disk space, we recommend not enabling this unless requested by our support team.

Limit parallel processing to X documents - This allows you to limit the number of cores that Symphony OCR will utilize. It uses 1 core per document. For example, if you enter 3, Symphony OCR will use only 3 cores and will process 3 documents simultaneously. See: How does Symphony OCR impact the performance of the server or indexer PC


Upon making changes to any of the above settings, select "Save Changes".

...

6. Configuration Guide - OpenText

6.1. Quick Start Configuration Guide - OpenText

Enter the database credentials:

Database login credentials

Login to database with username: Enter the username for the database

Login to database with password:  Enter the database password

Database computer name:  Enter the database computer name

Database server instance name:  This is optional and is required only if there is more than one database on the server. In that case, enter the instance name you wish to process.

Database name:  Enter the name of the database


Select Save Changes.


...

6.2. Licensing



License

This is where your Symphony OCR license is set. To change your Symphony OCR license, simply click "Licensing" from the Configuration side bar, enter your new license, and select "Save Changes."

License Details

Provides details of the license.

Features Allowed by your License

This area tells you which features are allowed by your license.

Updating your License

Starting with version 6.4.96, Symphony OCR includes an 'Automatic License Update' feature.  After you've paid your renewal invoice with Trumpet, a new license is automatically generated.  If your installation has access to the Trumpet servers, Symphony OCR will automatically detect this new license, download it and install it.

Note:  Symphony will check for a new license once every 3 days under normal circumstances, and once per day when your license is within 30 days of expiring.

If you've paid your invoice (and received notification of a new license) and don't want to wait for the automatic update to kick in, you can click the "Check for Updated License" link on this page.  This will manually trigger Symphony OCR to retrieve the updated license from Trumpet's servers.  As mentioned, all of this assumes your installation has access to Trumpet's servers.  If a connection cannot be established, you can always copy/paste your new license into this screen.

When you receive notification from Trumpet that your new license is generated, it is still highly recommended that you A) update your installation to the latest version of the software, and B) verify your license has been updated.
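The check schedule from the note above can be summarized in a few lines of Python. This is only a sketch of the stated rule; the function name is hypothetical, and treating an already expired license the same as one about to expire is an assumption.

```python
from datetime import date, timedelta

def license_check_interval(today: date, expires_on: date) -> timedelta:
    """Return how long to wait between automatic license checks."""
    days_left = (expires_on - today).days
    # Within 30 days of expiry (or already expired, by assumption): check daily.
    if days_left <= 30:
        return timedelta(days=1)
    # Otherwise: check once every 3 days.
    return timedelta(days=3)
```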

 


...

6.3. Notifications


...

6.4. OpenText

Database login credentials

Login to database with username: Enter the username for the database

Login to database with password:  Enter the database password

Database computer name:  Enter the database computer name

Database server instance name:  This is optional and is required only if there is more than one database on the server. In that case, enter the instance name you wish to process.

Database name:  Enter the name of the database

Advanced settings:

New document search frequency:  By default, Symphony OCR will search for documents once every 15 minutes.  This should be sufficient for your needs; however, you can change this to search more or less frequently.

Legacy document search frequency:  By default, Symphony OCR will perform a search for legacy documents (documents existing prior to installing Symphony OCR) every 7 days.


...

6.5. Finder

OpenText New Document Search - performs a search for new documents that are eligible for OCR.  By default it performs this search once every 15 minutes.  This can be adjusted by selecting "Manage".  This will take you to the OpenText page where you can adjust the search frequency.

OpenText Legacy Document Search - performs a search for legacy documents that are eligible for OCR.  By default it performs this search once every 7 days.  This can be adjusted by selecting "Manage".  This will take you to the OpenText page where you can adjust the search frequency.


...

6.6. Analyzer



The Analyzer is responsible for looking at each document and determining if it is eligible for OCR. If a document is eligible it is placed in the Processing list. If a document is not eligible, it is placed in the appropriate list (for more information on why a document might not be eligible for OCR, refer to the section, Not Processed List).

Control 

In the control area, you can choose to refresh the Analyzer or stop the Analyzer:

Refresh - Selecting Refresh will refresh the Status of the Analyzer page.

Stop Analyzer - Selecting this option will stop the Analyzer from analyzing documents in the document repository.

Status 

Displays the status of the Analyzer.

Information

Machine Processors - Indicates how many logical processors the workstation running Symphony OCR contains.

Licensed parallel processing - Indicates how many documents will be analyzed at a time based on your license features.

Recent Performance (since last restart)

Provides performance statistics such as the total number of documents and pages that Symphony OCR has found eligible for OCR, and the average speed of analysis per document since the last restart of Symphony OCR.

Overall Performance (since last restart)

Provides performance statistics such as the total number of documents and pages that Symphony OCR has found eligible for OCR, and the average speed of analysis per document since the last restart of Symphony OCR.

Settings

Do not analyze documents younger than - The default setting is 30 seconds. To have the Analyzer wait longer before analyzing documents, increase the value in the field. Trumpet recommends not setting this value below 30 seconds, so that documents are fully written to disk before they are analyzed.

To change this setting, simply type in the number of seconds, and then select "Save Changes".


...

6.7. Processor

Accessing the Processor

Select Processor in the navigation panel:


The Processor manages the actual OCR processes. Once a document has been identified as eligible for OCR by the Analyzer, the Processor confirms that the file is still eligible for OCR, and then OCRs the file. If a document is successfully OCRed, it is moved to the Processed list (for more information about the flow of documents throughout Symphony OCR, refer to the section Symphony Workflow, Tools & Document Lists).


Control

In the control area, you can choose to refresh the Processor or stop the Processor:

Refresh - Selecting Refresh will refresh the status of the Processor page.

Stop Processor - Selecting this option will stop the Processor from processing documents in the document repository.

Status

The status of the Processor (what it is currently processing).

Information

Processing Capacity Remaining - If you have a license that limits the number of pages you can process per year, the number of pages remaining will appear here.
Machine Processors - Indicates how many logical processors the workstation running Symphony OCR contains.
Licensed parallel processing - Indicates the number of documents that will be processed by the processor simultaneously.

Recent Performance

Provides performance statistics for a smaller, more recent sample: the number of documents and pages that Symphony OCR has processed, the average speed of processing per page, the average number of pages per document and the effective throughput of the documents.

Overall Performance

Provides performance statistics such as the total number of documents and pages that Symphony OCR has processed and the average speed of processing per page, the average number of pages per document and the effective throughput of the documents.


Basic Settings

Process TIFFs (OCR and convert to PDF) - Symphony OCR can process TIFF files and convert them to image + text PDF files. This is an optional setting. If you wish to process TIFF documents, simply check this checkbox.

Note:  If the firm opts to process TIFF documents, this will change the file extension from .tif to .pdf.  This will "break" any relationships or projects that include this file.

Process MSG (email) attachments - Symphony OCR can process email message attachments.  This is an optional setting.  If you wish to process email message attachments, check this checkbox. 

<Big fat scary warning: 

Due to a limitation in newer versions of Office, Microsoft prevents us from accessing the DLLs that allow us to read/process emails under the following conditions: 
> Symphony OCR is configured to run as a service
> 'Process MSG (email) attachments' is checked
> Outlook 2013 (or possibly Outlook 2016) is open

In these circumstances, you're likely to see the following error:

Therefore, if Symphony OCR is being installed to run as a service *and* will be configured to process email attachments, it is our recommendation to install it on a machine that will not normally have Outlook 2013 (or possibly 2016) open.  On the bright side, our testing has shown that in these situations, Symphony is still processing normal documents and WILL eventually recover and process emails after Office is closed.  But if you can, we recommend avoiding this situation.  If your experience is different, we'd like to hear about it. 

End of big fat scary warning>

Do not process documents younger than - The default setting is 30 seconds. To have the Processor wait longer before processing documents, increase the value in the field. Trumpet recommends not setting this value below 30 seconds, so that documents are fully written to disk before they are processed.

Do not process documents older than - If you have older documents that you do not want Symphony OCR to process, enter the maximum document age in days. Symphony OCR will only process backlog documents that fall within that number of days.

Automatically rotate pages to proper orientation -  If selected, the pages will rotate either landscape or portrait according to the text on the page.


Original retention settings

Retain originals of processed files - If selected, Symphony OCR retains copies of the documents that it has processed. These copies appear as versions (if Symphony OCR processes a document 3 times, it will maintain copies of all 3 versions of the document). The user can restore previous versions of a document from the Symphony OCR backup using the document Details screen.

Purge originals of processed files after - The default setting is to retain the originals of processed files for 7 days after which they will be purged. If you wish to change this setting, you can change the value to the appropriate number of days for your firm.
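To make the retention window concrete, the sketch below picks out the retained originals that would be due for purging under the 7-day default. The function name and the (file name, processed time) pairs are hypothetical; how Symphony OCR tracks its retained originals internally is not described in this guide.

```python
from datetime import datetime, timedelta

def originals_due_for_purge(originals, now, retain_days=7):
    """Return names of retained originals older than the retention window.

    originals: iterable of (file_name, processed_at) pairs (assumed shape).
    """
    cutoff = now - timedelta(days=retain_days)
    return [name for name, processed_at in originals if processed_at < cutoff]

now = datetime(2024, 3, 15, 12, 0)
originals = [
    ("lease.pdf", datetime(2024, 3, 1, 9, 0)),     # 14 days old
    ("invoice.pdf", datetime(2024, 3, 12, 9, 0)),  # 3 days old
]
print(originals_due_for_purge(originals, now))
```

With the sample data, only lease.pdf falls outside the 7-day window.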

Backlog throttling settings (only needed when your license does NOT have unlimited pages for processing)

Default processing capacity reserved for new documents (based on the actual number of new pages added each day) -  This is calculated from the number of pages that were added to the site in the past year.

Override the default processing capacity reserve - This sets the number of pages you would like to reserve for new documents, spreading the page count capacity evenly across the entire year. To determine a reasonable reserve, allow the Symphony OCR Analyzer module to run, then look at the timeline for the Processing Queue. Adding the number of pages in the first 52 weeks and dividing by 365 gives the average number of pages added to the system per day. Trumpet recommends adding an additional 10% to accommodate future growth or above-average filing. The resulting value should be a reasonable reserve.

Advanced Settings:

Create thumbnails (if not already present) - Checking this checkbox will create thumbnails if they are not already present.

Enable OCR debug logging - This enables debug logging to help our support team address issues if necessary.  To conserve disk space, we recommend not enabling this unless requested by our support team.

Limit parallel processing to X documents - This allows you to limit the number of cores that Symphony OCR will utilize. It uses 1 core per document. For example, if you enter 3, Symphony OCR will use only 3 cores and will process 3 documents simultaneously. See: How does Symphony OCR impact the performance of the server or indexer PC


Upon making changes to any of the above settings, select "Save Changes".
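One way to picture how the parallel-processing values on this page interact is the sketch below. It assumes that the number of documents processed at once is the smallest of the machine's logical processors, the licensed parallel processing value, and the optional limit; this guide does not confirm that rule, and the function name is hypothetical.

```python
def effective_parallel_documents(machine_processors, licensed_parallel, limit=None):
    """Estimate how many documents are processed simultaneously.

    Assumes one core per document and that the smallest of the three
    values wins; limit=None means no manual limit has been entered.
    """
    candidates = [machine_processors, licensed_parallel]
    if limit is not None:
        candidates.append(limit)
    # Never report fewer than one document at a time.
    return max(1, min(candidates))
```

Under these assumptions, an 8-processor machine with a license for 4 parallel documents and a limit of 3 would process 3 documents at a time.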

...

7. Configuration Guide - PracticeMaster

7.1. Quick Start Configuration Guide - PracticeMaster

Determine Network Folder

  • Copy and paste the network folder for PracticeMaster
  • Select "Save Changes"

Determine the Documents folder

  • Copy and paste the root of the path where the documents reside within PracticeMaster

  • Select "Save Changes"

For more detailed information and advanced settings for configuring Symphony OCR, visit Configuration Guide - PracticeMaster

...

7.2. Licensing



License

This is where your Symphony OCR license is set. To change your Symphony OCR license, simply click "Licensing" from the Configuration side bar, enter your new license, and select "Save Changes."

License Details

Provides details of the license.

Features Allowed by your License

This area tells you which features are allowed by your license.

Updating your License

Starting with version 6.4.96, Symphony OCR includes an 'Automatic License Update' feature.  After you've paid your renewal invoice with Trumpet, a new license is automatically generated.  If your installation has access to the Trumpet servers, Symphony OCR will automatically detect this new license, download it and install it.

Note:  Symphony will check for a new license once every 3 days under normal circumstances, and once per day when your license is within 30 days of expiring.

If you've paid your invoice (and received notification of a new license) and don't want to wait for the automatic update to kick in, you can click the "Check for Updated License" link on this page.  This will manually trigger Symphony OCR to retrieve the updated license from Trumpet's servers.  As mentioned, all of this assumes your installation has access to Trumpet's servers.  If a connection cannot be established, you can always copy/paste your new license into this screen.

When you receive notification from Trumpet that your new license is generated, it is still highly recommended that you A) update your installation to the latest version of the software, and B) verify your license has been updated.

 

...

7.3. Notifications



Notifications allow users to be emailed nightly based on the status of Symphony OCR.

  • Enter the email address for the person you wish to receive notifications in the "Add e-mail address" box

  • Select the Notification Type from the drop down


Each email address may be configured with one of four types:

Never - nightly emails will never be sent to this recipient (instead, after entering an email address you can select "Send Now" and deliver an email to the recipient on demand).

When there are errors - the nightly email will only be sent to the recipient if the overall system condition is Error.  This is useful for recipients who only need to know when the system is not processing documents because of some major error (licensing issues are the most common major error).

When there are warnings or errors - the nightly email will only be sent to the recipient if the overall system condition is Warning or Error.  The warning condition is triggered by documents in the Needs Attention list, configuration problems or other system level issues that should be looked at, even though they haven't completely stopped processing from occurring.

Always (aka Daily) - the nightly email will be sent to the recipient every night regardless of system status.  This is useful for firms who want to monitor the 'Not Processed' lists to ensure that every document that couldn't be OCRed (e.g., because of security or corruption) has been reviewed.  Users can review documents in the various 'Not Processed' lists and either correct the underlying issue, or move the documents to the Ignore list using Bulk Operations > Ignore.

  • Select "Save Changes"

If you have a user leave the firm or you no longer wish for a particular user to be notified, you can change the Notification Type to "Never" or remove the user entirely by selecting "Remove" to the right of the address.
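The four Notification Types amount to a severity threshold on the overall system condition. The sketch below expresses that rule in Python; the function and table names are hypothetical, and "OK" is an assumed name for the condition in which there are no warnings or errors.

```python
# Severity of the overall system condition; "OK" is an assumed name
# for the state in which there are no warnings or errors.
SEVERITY = {"OK": 0, "Warning": 1, "Error": 2}

# Minimum severity at which each Notification Type sends the nightly email.
# None means the nightly email is never sent.
MIN_SEVERITY = {
    "Never": None,
    "When there are errors": 2,
    "When there are warnings or errors": 1,
    "Always": 0,
}

def sends_nightly_email(notification_type, condition):
    """Return True if a recipient of this type gets tonight's email."""
    threshold = MIN_SEVERITY[notification_type]
    return threshold is not None and SEVERITY[condition] >= threshold
```

For example, a recipient set to "When there are errors" receives nothing on a night when the condition is Warning, while a recipient set to "When there are warnings or errors" does.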

...

7.4. PracticeMaster

Basic settings

PracticeMaster network folder/current working directory - This is where the PracticeMaster network folder is identified.  Copy and paste the path to the network folder into the field.

Documents folder - This is the root of where the documents reside within PracticeMaster.  Copy and paste the path to the folder into the field.

Advanced settings

Process Read Only Files - If you wish to process read-only files, check this checkbox.

Finder Scan Frequency - By default, Symphony OCR will search for documents once every 120 minutes.  This should be sufficient for your needs; however, you can change this to search more or less frequently.

 

 

...

7.5. Finder

...

7.6. Analyzer



The Analyzer is responsible for looking at each document and determining if it is eligible for OCR. If a document is eligible it is placed in the Processing list. If a document is not eligible, it is placed in the appropriate list (for more information on why a document might not be eligible for OCR, refer to the section, Not Processed List).

Control 

In the control area, you can choose to refresh the Analyzer or stop the Analyzer:

Refresh - Selecting Refresh will refresh the Status of the Analyzer page.

Stop Analyzer - Selecting this option will stop the Analyzer from analyzing documents in the document repository.

Status 

Displays the status of the Analyzer.

Information

Machine Processors - Indicates how many logical processors the workstation running Symphony OCR contains.

Licensed parallel processing - Indicates how many documents will be analyzed at a time based on your license features.

Recent Performance (since last restart)

Provides performance statistics such as the total number of documents and pages that Symphony OCR has found eligible for OCR, and the average speed of analysis per document since the last restart of Symphony OCR.

Overall Performance (since last restart)

Provides performance statistics such as the total number of documents and pages that Symphony OCR has found eligible for OCR, and the average speed of analysis per document since the last restart of Symphony OCR.

Settings

Do not analyze documents younger than - The default setting is 30 seconds. To have the Analyzer wait longer before analyzing documents, increase the value in the field. Trumpet recommends not setting this value below 30 seconds, so that documents are fully written to disk before they are analyzed.

To change this setting, simply type in the number of seconds, and then select "Save Changes".

...

7.7. Processor

Accessing the Processor

Select Processor in the navigation panel:


The Processor manages the actual OCR processes. Once a document has been identified as eligible for OCR by the Analyzer, the Processor confirms that the file is still eligible for OCR, and then OCRs the file. If a document is successfully OCRed, it is moved to the Processed list (for more information about the flow of documents throughout Symphony OCR, refer to the section Symphony Workflow, Tools & Document Lists).


Control

In the control area, you can choose to refresh the Processor or stop the Processor:

Refresh - Selecting Refresh will refresh the status of the Processor page.

Stop Processor - Selecting this option will stop the Processor from processing documents in the document repository.

Status

The status of the Processor (what it is currently processing).

Information

Processing Capacity Remaining - If you have a license that limits the number of pages you can process per year, the number of pages remaining will appear here.
Machine Processors - Indicates how many logical processors the workstation running Symphony OCR contains.
Licensed parallel processing - Indicates the number of documents that will be processed by the processor simultaneously.

Recent Performance

Provides performance statistics for a smaller, more recent sample: the number of documents and pages that Symphony OCR has processed, the average speed of processing per page, the average number of pages per document and the effective throughput of the documents.

Overall Performance

Provides performance statistics such as the total number of documents and pages that Symphony OCR has processed and the average speed of processing per page, the average number of pages per document and the effective throughput of the documents.


Basic Settings

Process TIFFs (OCR and convert to PDF) - Symphony OCR can process TIFF files and convert them to image + text PDF files. This is an optional setting. If you wish to process TIFF documents, simply check this checkbox.

Note:  If the firm opts to process TIFF documents, this will change the file extension from .tif to .pdf.  This will "break" any relationships or projects that include this file.

Process MSG (email) attachments - Symphony OCR can process email message attachments.  This is an optional setting.  If you wish to process email message attachments, check this checkbox. 

<Big fat scary warning: 

Due to a limitation in newer versions of Office, Microsoft prevents us from accessing the DLLs that allow us to read/process emails under the following conditions: 
> Symphony OCR is configured to run as a service
> 'Process MSG (email) attachments' is checked
> Outlook 2013 (or possibly Outlook 2016) is open

In these circumstances, you're likely to see the following error:

Therefore, if Symphony OCR is being installed to run as a service *and* will be configured to process email attachments, it is our recommendation to install it on a machine that will not normally have Outlook 2013 (or possibly 2016) open.  On the bright side, our testing has shown that in these situations, Symphony is still processing normal documents and WILL eventually recover and process emails after Office is closed.  But if you can, we recommend avoiding this situation.  If your experience is different, we'd like to hear about it. 

End of big fat scary warning>

Do not process documents younger than - The default setting is 30 seconds. To have the Processor wait longer before processing documents, increase the value in the field. Trumpet recommends not setting this value below 30 seconds, so that documents are fully written to disk before they are processed.

Do not process documents older than - If you have older documents that you do not want Symphony OCR to process, enter the maximum document age in days. Symphony OCR will only process backlog documents that fall within that number of days.

Automatically rotate pages to proper orientation -  If selected, the pages will rotate either landscape or portrait according to the text on the page.
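Taken together, the "younger than" and "older than" settings above define a window of document ages that are eligible for processing. The sketch below shows that check under stated assumptions: the age is measured from a last-modified timestamp, the function and parameter names are hypothetical, and leaving max_age_days as None stands for no upper limit.

```python
from datetime import datetime, timedelta

def within_processing_window(modified_at, now, min_age_seconds=30, max_age_days=None):
    """Return True if a document's age falls inside the processing window."""
    age = now - modified_at
    if age < timedelta(seconds=min_age_seconds):
        return False  # too new: may still be being written to disk
    if max_age_days is not None and age > timedelta(days=max_age_days):
        return False  # too old: outside the configured backlog window
    return True
```

With the 30-second default and a 365-day limit, a document saved 10 seconds ago is skipped for now, a document saved an hour ago is eligible, and a document saved two years ago is left alone.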


Original retention settings

Retain originals of processed files - If selected, Symphony OCR retains copies of the documents that it has processed. These copies appear as versions (if Symphony OCR processes a document 3 times, it will maintain copies of all 3 versions of the document). The user can restore previous versions of a document from the Symphony OCR backup using the document Details screen.

Purge originals of processed files after - The default setting is to retain the originals of processed files for 7 days after which they will be purged. If you wish to change this setting, you can change the value to the appropriate number of days for your firm.

Backlog throttling settings (only needed when your license does NOT have unlimited pages for processing)

Default processing capacity reserved for new documents (based on the actual number of new pages added each day) -  This is calculated from the number of pages that were added to the site in the past year.

Override the default processing capacity reserve - This sets the number of pages you would like to reserve for new documents, spreading the page count capacity evenly across the entire year. To determine a reasonable reserve, allow the Symphony OCR Analyzer module to run, then look at the timeline for the Processing Queue. Adding the number of pages in the first 52 weeks and dividing by 365 gives the average number of pages added to the system per day. Trumpet recommends adding an additional 10% to accommodate future growth or above-average filing. The resulting value should be a reasonable reserve.

Advanced Settings:

Create thumbnails (if not already present) - Checking this checkbox will create thumbnails if they are not already present.

Enable OCR debug logging - This enables debug logging to help our support team address issues if necessary.  To conserve disk space, we recommend not enabling this unless requested by our support team.

Limit parallel processing to X documents - This allows you to limit the number of cores that Symphony OCR will utilize. It uses 1 core per document. For example, if you enter 3, Symphony OCR will use only 3 cores and will process 3 documents simultaneously. See: How does Symphony OCR impact the performance of the server or indexer PC


Upon making changes to any of the above settings, select "Save Changes".

...

8. Configuration Guide - LSSe64

8.1. Quick Start Configuration Guide - LSSe64

Determine Credentials

  • Enter the LSSe64 database username into the "Login to Database with username" field
  • Enter the LSSe64 database password into the "Login to database with password" field
  • Enter the name of the computer / workstation in the "Database computer name" field
  • If a database instance name is defined, enter the name of the SQL Server instance in the "Database server instance name" field.

Save Settings

  • Select "Save Changes" to save your settings.

For more detailed information and advanced settings for configuring Symphony OCR, visit Configuration Guide - LSSe64
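For reference, SQL Server clients conventionally address a named instance as the computer name, a backslash, and the instance name. The sketch below shows how the "Database computer name" and "Database server instance name" fields might combine into a single server address; whether Symphony OCR builds its connection this way is an assumption, and the function name and sample values are hypothetical.

```python
def sql_server_address(computer_name, instance_name=""):
    """Combine computer and optional instance name into a server address."""
    computer_name = computer_name.strip()
    instance_name = instance_name.strip()
    if instance_name:
        # Named instance: computer\instance
        return f"{computer_name}\\{instance_name}"
    # Default instance: the computer name alone
    return computer_name

print(sql_server_address("DBSERVER01"))
print(sql_server_address("DBSERVER01", "LSSE64"))
```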

...

8.2. Licensing



License

This is where your Symphony OCR license is set. To change your Symphony OCR license, simply click "Licensing" from the Configuration side bar, enter your new license, and select "Save Changes."

License Details

Provides details of the license.

Features Allowed by your License

This area tells you which features are allowed by your license.

Updating your License

Starting with version 6.4.96, Symphony OCR includes an 'Automatic License Update' feature.  After you've paid your renewal invoice with Trumpet, a new license is automatically generated.  If your installation has access to the Trumpet servers, Symphony OCR will automatically detect this new license, download it and install it.

Note:  Symphony will check for a new license once every 3 days under normal circumstances, and once per day when your license is within 30 days of expiring.
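The check schedule in the note above can be sketched as a small function. This is only an illustration of the stated rule, not Symphony OCR's actual code; the function name is hypothetical, and treating a license with exactly 30 days left (or one that has already expired) as "within 30 days" is an assumption.

```python
from datetime import date, timedelta


def license_check_interval(today: date, expires: date) -> timedelta:
    """Return how long to wait before the next automatic license check.

    Hypothetical helper: mirrors the rule "every 3 days normally,
    once per day when the license is within 30 days of expiring".
    """
    days_left = (expires - today).days
    # Assumption: 30 days or fewer remaining (including expired) means daily.
    if days_left <= 30:
        return timedelta(days=1)
    return timedelta(days=3)
```

For example, a license expiring in five months would be checked every 3 days, while one expiring in two weeks would be checked daily.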

If you've paid your invoice (and received notification of a new license) and don't want to wait for the automatic update to kick in, you can click the "Check for Updated License" link on this page.  This will manually trigger Symphony OCR to retrieve the updated license from Trumpet's servers.  As mentioned, all of this assumes your installation has access to Trumpet's servers.  If a connection cannot be established, you can always copy/paste your new license into this screen.

When you receive notification from Trumpet that your new license is generated, it is still highly recommended that you A) update your installation to the latest version of the software, and B) verify your license has been updated.

 

...

8.3. Notifications



Notifications allow users to be emailed nightly based on the status of Symphony OCR.

  • Enter the email address for the person you wish to receive notifications in the "Add e-mail address" box

  • Select the Notification Type from the drop down


Each email address may be configured with one of four types:

Never - nightly emails will never be sent to this recipient (instead, after entering an email address you can select "Send Now" and deliver an email to the recipient on demand).

When there are errors - the nightly email will only be sent to the recipient if the overall system condition is Error.  This is useful for recipients who only need to know when the system is not processing documents because of some major error (licensing issues are the most common major error).

When there are warnings or errors - the nightly email will only be sent to the recipient if the overall system condition is Warning or Error.  The warning condition is triggered by documents in the Needs Attention list, configuration problems or other system level issues that should be looked at, even though they haven't completely stopped processing from occurring.

Always (aka Daily) - the nightly email will be sent to the recipient every night regardless of system status.  This is useful for firms who want to monitor the 'Not Processed' lists to ensure that every document that couldn't be OCRed (e.g., because of security or corruption) has been reviewed.  Users can review documents in the various 'Not Processed' lists and either correct the underlying issue, or move the documents to the Ignore list using Bulk Operations > Ignore.

  • Select "Save Changes"

If you have a user leave the firm or you no longer wish for a particular user to be notified, you can change the Notification Type to "Never" or remove the user entirely by selecting "Remove" to the right of the address.
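The four notification types can be read as a lookup from type to the system conditions that trigger the nightly email. The sketch below is illustrative only: the function and table names are hypothetical, and "OK" is an assumed label for a system with no warnings or errors (the guide names only the Warning and Error conditions).

```python
# Conditions under which each notification type sends the nightly email.
# "OK" is an assumed name for the healthy state; it is not taken from the guide.
SEND_RULES = {
    "Never": set(),
    "When there are errors": {"Error"},
    "When there are warnings or errors": {"Warning", "Error"},
    "Always": {"OK", "Warning", "Error"},
}


def should_send_nightly(notification_type: str, condition: str) -> bool:
    """Return True if a recipient of this type gets tonight's email."""
    return condition in SEND_RULES[notification_type]
```

A recipient set to "When there are errors" would therefore receive nothing on a night when the overall condition is Warning, while an "Always" recipient receives the email every night.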

...

8.4. LSSe64

Database login credentials

Login to database with username - enter the LSSe64 database username in this field

Login to database with password - enter the LSSe64 database password in this field

Database computer name - enter the name of the computer / workstation

Database server instance name - (Optional)   If an instance name is defined, enter the name of the SQL server instance (on the Database computer name workstation) that is running LSSe64.   If no instance name is defined, leave this field blank.

Database name - enter the name of the SQL Database for LSSe64
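SQL Server conventionally addresses a named instance as computer name, backslash, instance name, and a default instance by the computer name alone. The sketch below shows how the "Database computer name" and optional "Database server instance name" fields would combine under that convention. Whether Symphony OCR builds its connection this way internally is an assumption; the function name and the sample values are hypothetical.

```python
def sql_server_address(computer_name: str, instance_name: str = "") -> str:
    """Combine computer and optional instance name into a server address.

    Illustrative only: follows the usual SQL Server "HOST\\INSTANCE" form.
    """
    instance_name = instance_name.strip()
    if instance_name:
        return f"{computer_name}\\{instance_name}"
    # No instance name defined: the field is left blank in the guide,
    # so the address is just the computer name.
    return computer_name
```

With the hypothetical values `DBSERVER01` and `LSSE`, this yields `DBSERVER01\LSSE`; with the instance field blank, it yields `DBSERVER01`.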

Advanced Settings

New documents search frequency - by default Symphony OCR will query the LSSe64 database for newly saved documents every 15 minutes.   This is typically sufficient, but you may adjust that accordingly.

Legacy documents search frequency - by default Symphony OCR will query the LSSe64 database for Legacy documents (documents saved to the database prior to installing Symphony OCR) every 7 days.   This is typically sufficient for handling the backlog, but you may adjust that according to the firm's specific needs.

...

8.5. Analyzer



The Analyzer is responsible for looking at each document and determining if it is eligible for OCR. If a document is eligible it is placed in the Processing list. If a document is not eligible, it is placed in the appropriate list (for more information on why a document might not be eligible for OCR, refer to the section, Not Processed List).

Control 

In the control area, you can choose to refresh the Analyzer or stop the Analyzer:

Refresh - Selecting Refresh will refresh the Status of the Analyzer page.

Stop Analyzer - Selecting this option will stop the Analyzer from analyzing documents in the document repository.

Status 

Displays the status of the Analyzer.

Information

Machine Processors - Indicates how many logical processors the workstation running Symphony OCR contains.

Licensed parallel processing - Indicates how many documents will be analyzed at a time based on your license features.

Recent Performance (since last restart)

Provides performance statistics such as the total number of documents and pages that Symphony OCR has found eligible for OCR, and the average speed of analysis per document since the last restart of Symphony OCR.

Overall Performance (since last restart)

Provides performance statistics such as the total number of documents and pages that Symphony OCR has found eligible for OCR, and the average speed of analysis per document since the last restart of Symphony OCR.

Settings

Do not analyze documents younger than - The default setting is 30 seconds. If you wish to have the Analyzer wait longer to analyze documents, simply change the value in the field. Trumpet recommends that this value is not decreased to less than 30 seconds to ensure that documents are fully written to the disk before processing.

To change this setting, simply type in the number of seconds, and then select "Save Changes".
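The "Do not analyze documents younger than" rule amounts to comparing a document's age with a minimum threshold. The sketch below illustrates that comparison with plain timestamps; it is not the product's implementation, the function name is hypothetical, and using the file's last-modified time as the measure of age is an assumption.

```python
def is_old_enough(modified_ts: float, now_ts: float, min_age_seconds: float = 30.0) -> bool:
    """Return True once a document has aged past the minimum threshold.

    modified_ts and now_ts are POSIX timestamps in seconds. The 30-second
    default matches the guide's default setting.
    """
    return (now_ts - modified_ts) >= min_age_seconds
```

On a real file this could be called as `is_old_enough(os.path.getmtime(path), time.time())`, which would skip a document still being written by a scanner a few seconds earlier.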

...

8.6. Processor

Accessing the Processor

Select Processor in the navigation panel:


The Processor manages the actual OCR processes. Once a document has been identified as eligible for OCR by the Analyzer, the Processor confirms that the file is still eligible for OCR, and then OCRs the file. If a document is successfully OCRed, it is moved to the Processed list (for more information about the flow of documents throughout Symphony OCR, refer to the section Symphony Workflow, Tools & Document Lists).


Control

In the control area, you can choose to refresh the Processor or stop the Processor:

Refresh - Selecting Refresh will refresh the status of the Processor page.

Stop Processor - Selecting this option will stop the Processor from processing documents in the document repository.

Status

The status of the Processor (what it is currently processing).

Information

Processing Capacity Remaining - If you have a license that limits the number of pages you can process per year, the number of pages remaining will appear here.
Machine Processors - Indicates how many logical processors the workstation running Symphony OCR contains.
Licensed parallel processing - Indicates the number of documents that will be processed by the processor simultaneously.

Recent Performance

Provides performance statistics such as the number of documents and pages that Symphony OCR has processed in a smaller sample size and the average speed of processing per page, the average number of pages per document and the effective throughput of the documents.

Overall Performance

Provides performance statistics such as the total number of documents and pages that Symphony OCR has processed and the average speed of processing per page, the average number of pages per document and the effective throughput of the documents.


Basic Settings

Process TIFFs (OCR and convert to PDF) - Symphony OCR can process TIFF files and convert them to image + text PDF files. This is an optional setting. If you wish to process TIFF documents, simply check this checkbox.

Note:  If the firm opts to process TIFF documents, this will change the file extension from .tif to .pdf.  This will "break" any relationships or projects that include this file.

Process MSG (email) attachments - Symphony OCR can process email message attachments.  This is an optional setting.  If you wish to process email message attachments, check this checkbox. 

<Big fat scary warning: 

Due to a limitation in newer versions of Office, Microsoft prevents us from accessing the DLLs that allow us to read/process emails under the following conditions: 
> Symphony OCR is configured to run as a service
> 'Process MSG (email) attachments' is checked
> Outlook 2013 (or possibly Outlook 2016) is open

In these circumstances, you're likely to see the following error:

Therefore, if Symphony OCR is being installed to run as a service *and* will be configured to process email attachments, it is our recommendation to install it on a machine that will not normally have Outlook 2013 (or possibly 2016) open.  On the bright side, our testing has shown that in these situations, Symphony is still processing normal documents and WILL eventually recover and process emails after Office is closed.  But if you can, we recommend avoiding this situation.  If your experience is different, we'd like to hear about it. 

End of big fat scary warning>

Do not process documents younger than - The default setting is 30 seconds. If you wish to have the Processor wait longer to process documents, simply change the value in the field. Trumpet recommends that this value is not decreased to less than 30 seconds to ensure that documents are fully written to the disk before processing.

Do not process documents older than - If you have older documents that you do not want Symphony OCR to process, enter a specific number of days for which the software should process backlog.

Automatically rotate pages to proper orientation -  If selected, the pages will rotate either landscape or portrait according to the text on the page.


Original retention settings

Retain originals of processed files - If selected, Symphony OCR retains copies of the documents that it has processed. These copies appear as versions (if Symphony OCR processes a document 3 times, it will maintain copies of all 3 versions of the document). The user can restore previous versions of a document from the Symphony OCR backup using the document Details screen.

Purge originals of processed files after - The default setting is to retain the originals of processed files for 7 days after which they will be purged. If you wish to change this setting, you can change the value to the appropriate number of days for your firm.
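The retention rule can be pictured as a simple age test on each retained original. This is a sketch under assumptions, not Symphony OCR's purge logic: the function name is hypothetical, and measuring age from the time the document was processed is an assumption.

```python
from datetime import datetime, timedelta


def is_due_for_purge(processed_at: datetime, now: datetime, retention_days: int = 7) -> bool:
    """Return True when a retained original is older than the retention period.

    The 7-day default matches the guide's default setting.
    """
    return (now - processed_at) > timedelta(days=retention_days)
```

An original retained from a document processed 8 days ago would be due for purging under the default, while one from 3 days ago would be kept.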

Backlog throttling settings (only needed when your license does NOT have unlimited pages for processing)

Default processing capacity reserved for new documents (based on the actual number of new pages added each day) -  This is calculated from the number of pages that were added to the site in the past year.

Override the default processing capacity reserve - This sets the number of pages you would like to reserve for new documents, spreading the page count capacity evenly across the entire year. To determine a reasonable reserve, allow the Symphony OCR Analyzer module to run, then look at the timeline for the Processing Queue. Adding the number of pages in the first 52 weeks and dividing by 365 gives the average number of pages added to the system per day. Trumpet recommends adding an additional 10% to accommodate future growth or above-average filing. This value should be a reasonable overclocking reserve.
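The reserve arithmetic described above can be worked through in a few lines. This sketch reads the result of dividing by 365 as a daily average, then applies the recommended 10% margin; the function name and the sample page count are hypothetical, and rounding to a whole number of pages is an assumption.

```python
def daily_page_reserve(pages_first_52_weeks: int, growth_margin: float = 0.10) -> int:
    """Estimate the pages to reserve per day for new documents.

    pages_first_52_weeks: total pages shown for the most recent 52 weeks
    in the Processing Queue timeline.
    growth_margin: extra headroom; 0.10 is the 10% Trumpet recommends.
    """
    average_per_day = pages_first_52_weeks / 365
    # Rounded to the nearest whole page (an assumption for illustration).
    return round(average_per_day * (1 + growth_margin))
```

For a hypothetical site that added 365,000 pages over the past 52 weeks, the daily average is 1,000 pages, and the reserve with the 10% margin comes to 1,100 pages.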

Advanced Settings:

Enable OCR debug logging - This will enable debugging for support purposes.

Create thumbnails (if not already present) - Checking this checkbox will create thumbnails if they are not already present.

Enable OCR debug logging - This will help our support team address issues if necessary.  To conserve disk space, we recommend not enabling this unless requested by our support team.

Limit parallel processing to X documents - This allows you to limit the number of cores that Symphony OCR will use. Symphony OCR uses 1 core per document. For example, if you enter 3, Symphony OCR will use only 3 cores and will process 3 documents simultaneously. See: How does Symphony OCR impact the performance of the server or indexer PC


Upon making changes to any of the above settings, select "Save Changes".
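To see how the "Limit parallel processing to X documents" setting relates to the Machine Processors and Licensed parallel processing values shown under Information, the sketch below takes the smallest of the three. That combination rule is an assumption made for illustration (the guide states only that one core is used per document), and the function name is hypothetical.

```python
from typing import Optional


def effective_parallel_documents(
    machine_processors: int,
    licensed_parallel: int,
    configured_limit: Optional[int] = None,
) -> int:
    """Estimate how many documents are processed at the same time.

    Assumption: neither the license nor the core count can be exceeded,
    and the configured limit can only lower the result.
    """
    ceiling = min(machine_processors, licensed_parallel)
    if configured_limit is None:
        return ceiling
    # Never drop below one document at a time.
    return max(1, min(ceiling, configured_limit))
```

On a hypothetical 8-core machine licensed for 4 parallel documents, entering 3 would result in 3 documents (and 3 cores) in use at once.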

...

9. Configuration Guide - Folders

9.1. Quick Start Configuration Guide - Folders

For a video showing these steps, visit:  Symphony OCR - Configure for Windows Folder Tree


Determine what folders to process

  • Copy and paste the folder path that you wish to process*
  • Click the "Add" button
  • Repeat for all folder paths
  • Select "Save Changes"

For more detailed information and advanced settings for configuring Symphony OCR, visit Configuration Guide - Folders

*Symphony OCR will process the entire directory tree of the path you provide (e.g. X:\Clients will process all documents in the subfolders beneath X:\Clients, like X:\Clients\Anderson, Matthew and X:\Clients\Anderson, Matthew\Agreements)


What Now?

Symphony OCR should be off and running now! The Finder is looking through your repository and sending documents to the Analyzer to determine what needs to be processed. The Analyzer sends any documents eligible for OCR to the Processor which applies an invisible layer of text to the document.

By default, Symphony OCR queries the Folder document repository for newly saved and modified files every 120 minutes.  Generally speaking, newly saved files will be OCRed within about 120 minutes.  Depending on the volume of image-only documents already stored in the monitored folders, it may take a while for Symphony OCR to process the backlog (legacy files).  Symphony OCR gives precedence to newer files, so documents that are scanned today will be processed before the backlog.
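The "newer files first" behavior can be pictured as sorting the queue by modification time, newest first. This is a simplified sketch, not the product's scheduler: the function name and file names are hypothetical, and it ignores any per-folder priority settings.

```python
from typing import List, Tuple


def processing_order(documents: List[Tuple[str, float]]) -> List[str]:
    """Order documents so the most recently modified come first.

    documents: (path, modified timestamp) pairs.
    """
    newest_first = sorted(documents, key=lambda item: item[1], reverse=True)
    return [path for path, _modified in newest_first]
```

Given a file scanned today and two older backlog files, today's scan would be returned first, followed by the backlog from newest to oldest.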

Refer to the section, Configuration Guide - Folder - Finder, for further information on finder settings that determine when Symphony OCR locates files for processing.

Refer to the section, Configuration Guide - Folder - Processor, for further information on configuration settings that determine which files are processed.

...

9.2. Licensing



License

This is where your Symphony OCR license is set. To change your Symphony OCR license, simply click "Licensing" from the Configuration side bar, enter your new license, and select "Save Changes."

License Details

Provides you details of the license.

Features Allowed by your License

This area tells you which features are allowed by your license.

Updating your License

Starting with version 6.4.96, Symphony OCR will have an 'Automatic License Update' feature.  Basically, after you've paid your renewal invoice with Trumpet, a new license is automatically generated.  So if your installation has access to the Trumpet servers, Symphony OCR will automatically see this new license, download it and install it.

Note:  Symphony will check for a new license once every 3 days under normal circumstances, and once per day when your license is within 30 days of expiring.

If you've paid your invoice (and received notification of a new license) and don't want to wait for the automatic update to kick in, you can click the "Check for Updated License" link on this page.  This will manually trigger Symphony OCR to retrieve the updated license from Trumpet's servers.  As mentioned, all of this assumes your installation has access to Trumpet's servers.  If a connection cannot be established, you can always copy/paste your new license into this screen.

When you receive notification from Trumpet that your new license is generated, it is still highly recommended that you A) update your installation to the latest version of the software, and B) verify your license has been updated.

 

...

9.3. Notifications



Notifications allow users to be emailed nightly based on the status of Symphony OCR.

  • Enter the email address for the person you wish to receive notifications in the "Add e-mail address" box

  • Select the Notification Type from the drop down


Each email address may be configured with one of four types:

Never - nightly emails will never be sent to this recipient (instead, after entering an email address you can select "Send Now" and deliver an email to the recipient on demand).

When there are errors - the nightly email will only be sent to the recipient if the overall system condition is Error.  This is useful for recipients who only need to know when the system is not processing documents because of some major error (licensing issues are the most common major error).

When there are warnings or errors - the nightly email will only be sent to the recipient if the overall system condition is Warning or Error.  The warning condition is triggered by documents in the Needs Attention list, configuration problems or other system level issues that should be looked at, even though they haven't completely stopped processing from occurring.

Always (aka Daily) - the nightly email will be sent to the recipient every night regardless of system status.  This is useful for firms who want to monitor the 'Not Processed' lists to ensure that every document that couldn't be OCRed (e.g., because of security or corruption) has been reviewed.  Users can review documents in the various 'Not Processed' lists and either correct the underlying issue, or move the documents to the Ignore list using Bulk Operations > Ignore.

  • Select "Save Changes"

If you have a user leave the firm or you no longer wish for a particular user to be notified, you can change the Notification Type to "Never" or remove the user entirely by selecting "Remove" to the right of the address.

...

9.4. Folders

Folders to Monitor

This is the list of folders that Symphony OCR is monitoring.

Search Frequency - How often the Finder will query this directory tree for new PDF and TIF documents.

Default Priority - The priority level at which this directory will be processed.  For more information on setting document priorities see:  Processing Priorities

Add a folder

To add a folder or directory tree to the list of folders that should be monitored by Symphony OCR, enter the path in the field and select the "Add" button on the right.  This will add the directory tree to the list of folders that Symphony OCR is monitoring.  Symphony OCR will process the entire directory tree of the path you provide (e.g. X:\Clients will process all documents in the subfolders beneath X:\Clients, like X:\Clients\Anderson, Matthew and X:\Clients\Anderson, Matthew\Agreements).

Note:  If you wish to process files in a hidden folder, you must explicitly indicate that folder. For example, if you have a root folder like X:\Clients and under that a hidden folder called "Inactive" (e.g. X:\Clients\Inactive), you must explicitly add that folder to the Monitored folders.
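Processing "the entire directory tree" means every subfolder beneath the monitored path is visited. The sketch below shows that kind of recursive walk using Python's standard library. It only illustrates the idea and is not how the Finder is implemented; the function name is hypothetical, the extension list is an assumption, and the sketch does not model the hidden-folder behavior described in the note above.

```python
import os
from typing import List

# Assumed set of candidate extensions; the guide mentions PDF and TIF/TIFF files.
OCR_EXTENSIONS = {".pdf", ".tif", ".tiff"}


def find_candidates(root: str) -> List[str]:
    """Return every PDF/TIF file found anywhere beneath root."""
    matches = []
    # os.walk descends into every subfolder of root.
    for folder, _subfolders, files in os.walk(root):
        for name in files:
            extension = os.path.splitext(name)[1].lower()
            if extension in OCR_EXTENSIONS:
                matches.append(os.path.join(folder, name))
    return sorted(matches)
```

Pointed at a folder such as X:\Clients, this would return files in X:\Clients itself as well as in nested folders like X:\Clients\Anderson, Matthew\Agreements.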

Advanced Settings

Process Read Only Files - If you wish to process read-only files, check this checkbox.

 

...

9.5. Enable Read Only Processing

  • Open the Symphony OCR homepage
  • Select "Folder" from the left side bar
  • Check the "Process read only files" checkbox
  • Select "Save Changes"
...

9.6. Scheduler


The Scheduler determines when and how frequently Symphony OCR performs specific tasks, such as when to send a heartbeat, when to search for new documents, when to purge backup files, etc.

To adjust a setting select "Edit" to the left of the specific setting you would like to adjust. 

To delete a specific Scheduler entry, select "Delete" on the right of the particular setting.

Most users will not need to change these items; however, there are special cases when you may wish to do so.  For example, if the firm runs their indexer software and Symphony OCR on a user's workstation, you may wish to only process items overnight.
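The "only process items overnight" case comes down to a time window that wraps past midnight. The sketch below illustrates the check for such a window. It is not the Scheduler's own logic (the Scheduler is configured through the Edit and Delete links described above); the function name and the 7 PM to 6 AM hours are hypothetical.

```python
from datetime import time


def in_processing_window(now: time, start: time = time(19, 0), end: time = time(6, 0)) -> bool:
    """Return True if 'now' falls inside the allowed processing window.

    Handles windows that cross midnight (start later than end).
    """
    if start <= end:
        # Same-day window, e.g. 09:00 to 17:00.
        return start <= now < end
    # Overnight window, e.g. 19:00 to 06:00.
    return now >= start or now < end
```

With the hypothetical defaults, 11 PM and 5:59 AM fall inside the window, while noon does not.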

...

9.7. Finder

Status

Folder Search - Performs a search in the monitored folder structure to find all documents that are eligible for OCR, regardless of how recently the document has been created or modified.  By default it performs this search once every 120 minutes.  This can be adjusted by selecting "Manage", which will take you to the Folders page where you can adjust the search frequency for each folder.

 

...

9.8. Analyzer



The Analyzer is responsible for looking at each document and determining if it is eligible for OCR. If a document is eligible it is placed in the Processing list. If a document is not eligible, it is placed in the appropriate list (for more information on why a document might not be eligible for OCR, refer to the section, Not Processed List).

Control 

In the control area, you can choose to refresh the Analyzer or stop the Analyzer:

Refresh - Selecting Refresh will refresh the Status of the Analyzer page.

Stop Analyzer - Selecting this option will stop the Analyzer from analyzing documents in the document repository.

Status 

Displays the status of the Analyzer.

Information

Machine Processors - Indicates how many logical processors the workstation running Symphony OCR contains.

Licensed parallel processing - Indicates how many documents will be analyzed at a time based on your license features.

Recent Performance (since last restart)

Provides performance statistics such as the total number of documents and pages that Symphony OCR has found eligible for OCR, and the average speed of analysis per document since the last restart of Symphony OCR.

Overall Performance (since last restart)

Provides performance statistics such as the total number of documents and pages that Symphony OCR has found eligible for OCR, and the average speed of analysis per document since the last restart of Symphony OCR.

Settings

Do not analyze documents younger than - The default setting is 30 seconds. If you wish to have the Analyzer wait longer to analyze documents, simply change the value in the field. Trumpet recommends that this value is not decreased to less than 30 seconds to ensure that documents are fully written to the disk before processing.

To change this setting, simply type in the number of seconds, and then select "Save Changes".

...

9.9. Processor

Accessing the Processor

Select Processor in the navigation panel:


The Processor manages the actual OCR processes. Once a document has been identified as eligible for OCR by the Analyzer, the Processor confirms that the file is still eligible for OCR, and then OCRs the file. If a document is successfully OCRed, it is moved to the Processed list (for more information about the flow of documents throughout Symphony OCR, refer to the section Symphony Workflow, Tools & Document Lists).


Control

In the control area, you can choose to refresh the Processor or stop the Processor:

Refresh - Selecting Refresh will refresh the status of the Processor page.

Stop Processor - Selecting this option will stop the Processor from processing documents in the document repository.

Status

The status of the Processor (what it is currently processing).

Information

Processing Capacity Remaining - If you have a license that limits the number of pages you can process per year, the number of pages remaining will appear here.
Machine Processors - Indicates how many logical processors the workstation running Symphony OCR contains.
Licensed parallel processing - Indicates the number of documents that will be processed by the processor simultaneously.

Recent Performance

Provides performance statistics such as the number of documents and pages that Symphony OCR has processed in a smaller sample size and the average speed of processing per page, the average number of pages per document and the effective throughput of the documents.

Overall Performance

Provides performance statistics such as the total number of documents and pages that Symphony OCR has processed and the average speed of processing per page, the average number of pages per document and the effective throughput of the documents.


Basic Settings

Process TIFFs (OCR and convert to PDF) - Symphony OCR can process TIFF files and convert them to image + text PDF files. This is an optional setting. If you wish to process TIFF documents, simply check this checkbox.

Note:  If the firm opts to process TIFF documents, this will change the file extension from .tif to .pdf.  This will "break" any relationships or projects that include this file.

Process MSG (email) attachments - Symphony OCR can process email message attachments.  This is an optional setting.  If you wish to process email message attachments, check this checkbox. 

<Big fat scary warning: 

Due to a limitation in newer versions of Office, Microsoft prevents us from accessing the DLLs that allow us to read/process emails under the following conditions: 
> Symphony OCR is configured to run as a service
> 'Process MSG (email) attachments' is checked
> Outlook 2013 (or possibly Outlook 2016) is open

In these circumstances, you're likely to see the following error:

Therefore, if Symphony OCR is being installed to run as a service *and* will be configured to process email attachments, it is our recommendation to install it on a machine that will not normally have Outlook 2013 (or possibly 2016) open.  On the bright side, our testing has shown that in these situations, Symphony is still processing normal documents and WILL eventually recover and process emails after Office is closed.  But if you can, we recommend avoiding this situation.  If your experience is different, we'd like to hear about it. 

End of big fat scary warning>

Do not process documents younger than - The default setting is 30 seconds. If you wish to have the Processor wait longer to process documents, simply change the value in the field. Trumpet recommends that this value is not decreased to less than 30 seconds to ensure that documents are fully written to the disk before processing.

Do not process documents older than - If you have older documents that you do not want Symphony OCR to process, enter a specific number of days for which the software should process backlog.

Automatically rotate pages to proper orientation -  If selected, the pages will rotate either landscape or portrait according to the text on the page.


Original retention settings

Retain originals of processed files - If selected, Symphony OCR retains copies of the documents that it has processed. These copies appear as versions (if Symphony OCR processes a document 3 times, it will maintain copies of all 3 versions of the document). The user can restore previous versions of a document from the Symphony OCR backup using the document Details screen.

Purge originals of processed files after - The default setting is to retain the originals of processed files for 7 days after which they will be purged. If you wish to change this setting, you can change the value to the appropriate number of days for your firm.

Backlog throttling settings (only needed when your license does NOT have unlimited pages for processing)

Default processing capacity reserved for new documents (based on the actual number of new pages added each day) -  This is calculated from the number of pages that were added to the site in the past year.

Override the default processing capacity reserve - This sets the number of pages you would like to reserve for new documents, spreading the page count capacity evenly across the entire year. To determine a reasonable reserve, allow the Symphony OCR Analyzer module to run, then look at the timeline for the Processing Queue. Adding the number of pages in the first 52 weeks and dividing by 365 gives the average number of pages added to the system per day. Trumpet recommends adding an additional 10% to accommodate future growth or above-average filing. This value should be a reasonable overclocking reserve.

Advanced Settings:

Enable OCR debug logging - This will enable debugging for support purposes.

Create thumbnails (if not already present) - Checking this checkbox will create thumbnails if they are not already present.

Enable OCR debug logging - This will help our support team address issues if necessary.  To conserve disk space, we recommend not enabling this unless requested by our support team.

Limit parallel processing to X documents - This allows you to limit the number of cores that Symphony OCR will use. Symphony OCR uses 1 core per document. For example, if you enter 3, Symphony OCR will use only 3 cores and will process 3 documents simultaneously. See: How does Symphony OCR impact the performance of the server or indexer PC


Upon making changes to any of the above settings, select "Save Changes".

...

9.10. Enabling Text Search in 64-bit versions of Windows

To enable a 64-bit version of Windows to search for text in PDF files, there are a few additional steps to take.  You will need to download and install Adobe's PDF iFilter for 64-bit versions of Windows.

To get started, visit: http://www.adobe.com/support/downloads/detail.jsp?ftpID=5542 to download and install the Adobe iFilter.

 

Once you have installed the Adobe iFilter, open your Control Panel and go to "Indexing Options".

 

Click "Advanced" at the bottom and then select the "File Types" tab.

     

 

Scroll down to "pdf" under "Extensions" and it should now say "PDF Filter".

 

...

9.11. Integrating with Time Matters?

Symphony OCR's Folder integration can be pointed at your Time Matters repository directory. Simply enter the path to the repository in the Folders configuration using the instructions found earlier in this chapter.  Remember, if Symphony is installed as a service, be sure to enter this as a UNC path.
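A UNC path has the form \\server\share\folder rather than a mapped drive letter such as X:\. The sketch below is one way to check a path before entering it, using Python's standard `pathlib`. It is an illustration only; the function name and the server and share names are hypothetical.

```python
from pathlib import PureWindowsPath


def is_unc_path(path: str) -> bool:
    """Return True if path is a UNC path (\\\\server\\share\\...).

    PureWindowsPath reports the "\\\\server\\share" part as the drive for
    UNC paths, and a letter such as "X:" for mapped drives.
    """
    return PureWindowsPath(path).drive.startswith("\\\\")
```

For example, `is_unc_path(r"\\fileserver\tm\docs")` is true, while `is_unc_path(r"X:\Clients")` is false, so the second path would need to be rewritten in UNC form for a service installation.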

To get the absolute most out of Symphony OCR, make sure your Time Matters' text indexing functionality is enabled so that you can do 'text-in-file' searches. Refer to your Time Matters rep/support for assistance with that.

...

10. Configuration Guide - Box

10.1. Quick Start Configuration Guide - Box

Note:  Symphony OCR works with the locally synced (either desktop or server) folder tree of Box and uses a Windows Folder Tree License.  For more information on Box's sync tool visit:  Box Sync Installation Information

Determine what folders to process

  • Copy and paste the folder path that you wish to process*
  • Click the "Add" button
  • Repeat for all folder paths
  • Select "Save Changes"

For more detailed information and advanced settings for configuring Symphony OCR, visit Configuration Guide - Box

*Symphony OCR will process the entire directory tree of the path you provide (e.g. X:\Clients will process all documents in the subfolders beneath X:\Clients, like X:\Clients\Anderson, Matthew and X:\Clients\Anderson, Matthew\Agreements) 


What Now?

Symphony OCR should be off and running now! The Finder is looking through your repository and sending documents to the Analyzer to determine what needs to be processed. The Analyzer sends any documents eligible for OCR to the Processor which applies an invisible layer of text to the document. 

By default, Symphony OCR queries the Folder document repository for newly saved and modified files every 120 minutes.  Generally speaking, newly saved files will be OCRed within about 120 minutes.  Depending on the volume of image-only documents already stored in the synced Box folders, it may take a while for Symphony OCR to process the backlog (legacy files).  Symphony OCR gives precedence to newer files, so documents that are scanned today will be processed before the backlog.

Refer to the section, Configuration Guide - Box - Finder, for further information on finder settings that determine when Symphony OCR locates files for processing.

Refer to the section, Configuration Guide - Box - Processor, for further information on configuration settings that determine which files are processed.


...

10.2. Licensing



License

This is where your Symphony OCR license is set. To change your Symphony OCR license, simply click "Licensing" from the Configuration side bar, enter your new license, and select "Save Changes."

License Details

Provides you details of the license.

Features Allowed by your License

This area tells you which features are allowed by your license.

Updating your License

Starting with version 6.4.96, Symphony OCR includes an 'Automatic License Update' feature.  After you've paid your renewal invoice with Trumpet, a new license is automatically generated.  If your installation has access to the Trumpet servers, Symphony OCR will automatically detect this new license, download it, and install it.

Note:  Symphony will check for a new license once every 3 days under normal circumstances, and once per day when your license is within 30 days of expiring.
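
The check schedule in the note above can be expressed as a small rule. The following Python sketch is only an illustration of that rule; the function name and the choice to treat an already expired license as "within 30 days" are assumptions, not documented behavior.

```python
from datetime import date, timedelta


def license_check_interval(today, expires_on):
    """Return how long to wait between license checks.

    Once per day when the license is within 30 days of expiring,
    otherwise once every 3 days (figures taken from the guide).
    """
    days_left = (expires_on - today).days
    # Assumption: an expired license (negative days_left) is also checked daily.
    if days_left <= 30:
        return timedelta(days=1)
    return timedelta(days=3)
```

For example, a license expiring in five months would be checked every 3 days, while one expiring in two weeks would be checked daily.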

If you've paid your invoice (and received notification of a new license) and don't want to wait for the automatic update to kick in, you can click the "Check for Updated License" link on this page.  This will manually trigger Symphony OCR to retrieve the updated license from Trumpet's servers.  As mentioned, all of this assumes your installation has access to Trumpet's servers.  If a connection cannot be established, you can always copy/paste your new license into this screen.

When you receive notification from Trumpet that your new license is generated, it is still highly recommended that you A) update your installation to the latest version of the software, and B) verify your license has been updated.

 


...

10.3. Notifications



Notifications allow users to be emailed nightly based on the status of Symphony OCR.

  • Enter the email address for the person you wish to receive notifications in the "Add e-mail address" box

  • Select the Notification Type from the drop down


Each email address may be configured with one of four types:

Never - nightly emails will never be sent to this recipient (instead, after entering an email address you can select "Send Now" and deliver an email to the recipient on demand).

When there are errors - the nightly email will only be sent to the recipient if the overall system condition is Error.  This is useful for recipients who only need to know when the system is not processing documents because of some major error (licensing issues are the most common major error).

When there are warnings or errors - the nightly email will only be sent to the recipient if the overall system condition is Warning or Error.  The warning condition is triggered by documents in the Needs Attention list, configuration problems or other system level issues that should be looked at, even though they haven't completely stopped processing from occurring.

Always (aka Daily) - the nightly email will be sent to the recipient every night regardless of system status.  This is useful for firms who want to monitor the 'Not Processed' lists to ensure that every document that couldn't be OCRed (e.g., because of security or corruption) has been reviewed.  Users can review documents in the various 'Not Processed' lists and either correct the underlying issue, or move the documents to the Ignore list using Bulk Operations > Ignore.

  • Select "Save Changes"

If you have a user leave the firm or you no longer wish for a particular user to be notified, you can change the Notification Type to "Never" or remove the user entirely by selecting "Remove" to the right of the address.
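
The four notification types amount to a simple decision on the overall system condition. The sketch below restates that decision in Python; the Status enum and the function name are hypothetical, and only the type labels and their meanings come from this guide.

```python
from enum import Enum


class Status(Enum):
    """Overall system condition (names assumed for illustration)."""
    OK = 0
    WARNING = 1
    ERROR = 2


def should_send_nightly(notification_type, status):
    """Decide whether a recipient gets tonight's email."""
    if notification_type == "Never":
        return False
    if notification_type == "When there are errors":
        return status is Status.ERROR
    if notification_type == "When there are warnings or errors":
        return status in (Status.WARNING, Status.ERROR)
    if notification_type == "Always":
        return True
    raise ValueError(f"Unknown notification type: {notification_type}")
```

Note that "Send Now" is a separate, on-demand action and is not modeled here.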


...

10.4. Folders

Folders to Monitor

This is the list of folders that Symphony OCR is monitoring.

Search Frequency - The frequency at which the Finder will query this directory tree for new PDF and TIF documents.

Default Priority - The priority level at which this directory will be processed.  For more information on setting document priorities see:  Processing Priorities

Add a folder

To add a folder or directory tree to the list of folders monitored by Symphony OCR, enter the path in the field and select the "Add" button on the right.  Symphony OCR will process the entire directory tree of the path you provide (e.g. X:\Clients will process all documents in the subfolders beneath X:\Clients, like X:\Clients\Anderson, Matthew and X:\Clients\Anderson, Matthew\Agreements).

Note:  If you wish to process files in a hidden folder, you must explicitly indicate that folder. For example, if you have a root folder like X:\Clients and under that a hidden folder called "Inactive" (e.g. X:\Clients\Inactive), you must explicitly add that folder to the Monitored folders.

Advanced Settings

Process Read Only Files - If you wish to process read-only files, check this checkbox.

 


...

10.5. Enable Read Only Processing

  • Open the Symphony OCR homepage
  • Select "Folder" from the left side bar
  • Check the "Process read only files" checkbox
  • Select "Save Changes"

...

10.6. Scheduler


The Scheduler determines when and how frequently Symphony OCR performs specific tasks, such as when to send a heartbeat, when to search for new documents, when to purge backup files, etc.

To adjust a setting select "Edit" to the left of the specific setting you would like to adjust. 

To delete a specific Scheduler entry, select "Delete" on the right of the particular setting.

Most users will not need to change these items; however, there are special cases when you may wish to do so.  For example, if the firm runs their indexer software and Symphony OCR on a user's workstation, you may wish to only process items overnight.


...

10.7. Finder

Status

Folder Search - Performs a search in the monitored folder structure to find all documents that are eligible for OCR, regardless of how recently the document has been created or modified.  By default it performs this search once every 120 minutes.  This can be adjusted by selecting "Manage", which takes you to the Folders page where you can adjust the search frequency for each folder.

 


...

10.8. Analyzer



The Analyzer is responsible for looking at each document and determining if it is eligible for OCR. If a document is eligible it is placed in the Processing list. If a document is not eligible, it is placed in the appropriate list (for more information on why a document might not be eligible for OCR, refer to the section, Not Processed List).

Control 

In the control area, you can choose to refresh the Analyzer or stop the Analyzer:

Refresh - Selecting Refresh will refresh the Status of the Analyzer page.

Stop Analyzer - Selecting this option will stop the Analyzer from analyzing documents in the document repository.

Status 

Displays the status of the Analyzer.

Information

Machine Processors - Indicates how many logical processors the workstation running Symphony OCR contains.

Licensed parallel processing - Indicates how many documents will be analyzed at a time based on your license features.

Recent Performance (since last restart)

Provides performance statistics such as the total number of documents and pages that Symphony OCR has found eligible for OCR, and the average speed of analysis per document since the last restart of Symphony OCR.

Overall Performance (since last restart)

Provides performance statistics such as the total number of documents and pages that Symphony OCR has found eligible for OCR, and the average speed of analysis per document since the last restart of Symphony OCR.

Settings

Do not analyze documents younger than - The default setting is 30 seconds. If you wish to have the Analyzer wait longer to analyze documents, simply change the value in the field. Trumpet recommends that this value is not decreased to less than 30 seconds to ensure that documents are fully written to the disk before processing.

To change this setting, simply type in the number of seconds, and then select "Save Changes".
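
The purpose of the minimum age is to avoid touching a file that is still being written. A minimal Python sketch of such a check follows; it assumes the file's last-modified time is what determines its age, which this guide does not confirm, and the function name is hypothetical.

```python
import os
import time

MIN_AGE_SECONDS = 30  # default value given in this guide


def is_old_enough(path, min_age_seconds=MIN_AGE_SECONDS, now=None):
    """Return True once the file has gone unmodified for min_age_seconds."""
    # Allow the caller to pass a fixed "now" so the check is easy to test.
    current_time = time.time() if now is None else now
    age_seconds = current_time - os.path.getmtime(path)
    return age_seconds >= min_age_seconds
```

A file saved 10 seconds ago would be skipped on this pass and picked up on a later one, once it has been idle for at least 30 seconds.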


...

10.9. Processor

Accessing the Processor

Select Processor in the navigation panel:


The Processor manages the actual OCR processes. Once a document has been identified as eligible for OCR by the Analyzer, the Processor confirms that the file is still eligible for OCR, and then OCRs the file. If a document is successfully OCRed, it is moved to the Processed list (for more information about the flow of documents throughout Symphony OCR, refer to the section Symphony Workflow, Tools & Document Lists).


Control

In the control area, you can choose to refresh the Processor or stop the Processor:

Refresh - Selecting Refresh will refresh the status of the Processor page.

Stop Processor - Selecting this option will stop the Processor from processing documents in the document repository.

Status

The status of the Processor (what it is currently processing).

Information

Processing Capacity Remaining - If you have a license that limits the number of pages you can process per year, the number of pages remaining will appear here.
Machine Processors - Indicates how many logical processors the workstation running Symphony OCR contains.
Licensed parallel processing - Indicates the number of documents that will be processed by the processor simultaneously.

Recent Performance

Provides performance statistics such as the number of documents and pages that Symphony OCR has processed in a smaller sample size and the average speed of processing per page, the average number of pages per document and the effective throughput of the documents.

Overall Performance

Provides performance statistics such as the total number of documents and pages that Symphony OCR has processed and the average speed of processing per page, the average number of pages per document and the effective throughput of the documents.


Basic Settings

Process TIFFs (OCR and convert to PDF) - Symphony OCR can process TIFF files and convert them to image + text PDF files. This is an optional setting. If you wish to process TIFF documents, simply check this checkbox.

Note:  If the firm opts to process TIFF documents, this will change the file extension from .tif to .pdf.  This will "break" any relationships or projects that include this file.

Process MSG (email) attachments - Symphony OCR can process email message attachments.  This is an optional setting.  If you wish to process email message attachments, check this checkbox. 

<Big fat scary warning: 

Due to a limitation in newer versions of Office, Microsoft prevents us from accessing the DLLs that allow us to read/process emails under the following conditions: 
> Symphony OCR is configured to run as a service
> 'Process MSG (email) attachments' is checked
> Outlook 2013 (or possibly Outlook 2016) is open

In these circumstances, you're likely to see the following error:

Therefore, if Symphony OCR is being installed to run as a service *and* will be configured to process email attachments, it is our recommendation to install it on a machine that will not normally have Outlook 2013 (or possibly 2016) open.  On the bright side, our testing has shown that in these situations, Symphony is still processing normal documents and WILL eventually recover and process emails after Office is closed.  But if you can, we recommend avoiding this situation.  If your experience is different, we'd like to hear about it. 

End of big fat scary warning>

Do not process documents younger than - The default setting is 30 seconds. If you wish to have the Processor wait longer to process documents, simply change the value in the field. Trumpet recommends that this value is not decreased to less than 30 seconds to ensure that documents are fully written to the disk before processing.

Do not process documents older than - If you have older documents that you do not want Symphony OCR to process, enter a specific number of days for which the software should process backlog.

Automatically rotate pages to proper orientation -  If selected, the pages will rotate either landscape or portrait according to the text on the page.


Original retention settings

Retain originals of processed files - If selected, Symphony OCR retains copies of the documents that it has processed. These copies appear as versions (if Symphony OCR processes a document 3 times, it will maintain copies of all 3 versions of the document). The user can restore previous versions of a document from the Symphony OCR backup using the document Details screen.

Purge originals of processed files after - The default setting is to retain the originals of processed files for 7 days after which they will be purged. If you wish to change this setting, you can change the value to the appropriate number of days for your firm.

Backlog throttling settings (only needed when your license does NOT have unlimited pages for processing)

Default processing capacity reserved for new documents (based on the actual number of new pages added each day) -  This is calculated from the number of pages that were added to the site in the past year.

Override the default processing capacity reserve - This will determine the number of pages you would like to reserve for new documents, evenly spreading the page count capacity across the entire year. To determine a reasonable reserve, allow the Symphony OCR Analyzer module to run, then look at the timeline for the Processing Queue. Adding the number of pages in the first 52 weeks and dividing by 365 gives the average number of pages added to the system per day. Trumpet recommends adding an additional 10% to accommodate future growth or above-average filing. The resulting value should be a reasonable reserve.
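
The reserve calculation above is simple arithmetic: a year's page count divided by 365 gives a daily average, which is then increased by 10%. The Python sketch below works through it; the function name and the sample figure of 73,000 pages are made up for illustration.

```python
def daily_reserve(pages_last_52_weeks, growth=0.10):
    """Pages per day to reserve for new documents.

    pages_last_52_weeks: total pages from the first 52 weeks of the
    Processing Queue timeline. growth: the extra margin recommended
    in this guide (10%).
    """
    average_per_day = pages_last_52_weeks / 365
    return round(average_per_day * (1 + growth))


# Example with a made-up figure: 73,000 pages in the past year is
# 200 pages per day on average, or 220 with the 10% margin.
print(daily_reserve(73000))  # prints 220
```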

Advanced Settings:

Create thumbnails (if not already present) - Checking this checkbox will create thumbnails if they are not already present.

Enable OCR debug logging - This will help our support team address issues if necessary.  In order to conserve disk space, we recommend not enabling this unless requested by our support team.

Limit parallel processing to X documents - This allows you to limit the number of cores that Symphony OCR will utilize. It uses 1 core per document. For example, if you enter 3, Symphony OCR will only use 3 cores and will process 3 documents simultaneously. See: How does Symphony OCR impact the performance of the server or indexer PC
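
One way to picture how this limit interacts with the Machine Processors and Licensed parallel processing values is to take the smallest of the three. That rule is an assumption made for this sketch, not a documented formula, and the function name is hypothetical.

```python
import os


def effective_parallelism(licensed_limit, configured_limit=None, cores=None):
    """Estimate how many documents are processed at once.

    Assumption: the result is the smallest of the machine's logical
    processors, the licensed parallel processing value, and the
    optional "Limit parallel processing to X documents" setting.
    """
    if cores is None:
        cores = os.cpu_count() or 1
    limits = [cores, licensed_limit]
    if configured_limit is not None:
        limits.append(configured_limit)
    # Never report fewer than one document at a time.
    return max(1, min(limits))
```

On an 8-core machine with a license allowing 4 parallel documents, entering 3 would give 3 under this model, leaving the remaining cores free for other software such as the indexer.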


Upon making changes to any of the above settings, select "Save Changes".

...

11. Configuration Guide - Dropbox

11.1. Quick Start Configuration Guide - Dropbox

Note:  Symphony OCR works with the locally synced (either desktop or server) folder tree of Dropbox and uses a Windows Folder Tree License.

Determine what folders to process

  • Copy and paste the folder path that you wish to process*
  • Click the "Add" button
  • Repeat for all folder paths
  • Select "Save Changes"

For more detailed information and advanced settings for configuring Symphony OCR, visit Configuration Guide - Dropbox

*Symphony OCR will process the entire directory tree of the path you provide (e.g. X:\Clients will process all documents in the subfolders beneath X:\Clients, like X:\Clients\Anderson, Matthew and X:\Clients\Anderson, Matthew\Agreements) 


What Now?

Symphony OCR should be off and running now! The Finder is looking through your repository and sending documents to the Analyzer to determine what needs to be processed. The Analyzer sends any documents eligible for OCR to the Processor which applies an invisible layer of text to the document. 

By default, Symphony OCR queries the Folder document repository for newly saved and modified files every 120 minutes.  Generally speaking, newly saved files will be OCRed within about 120 minutes.  Depending on the volume of image-only documents already stored in your monitored folders, it may take a while for Symphony OCR to process the backlog (legacy files).  Symphony OCR gives precedence to newer files, so documents that are scanned today will be processed before the backlog.
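
The "newer files first" behavior can be pictured as a sort on file date. The Python sketch below assumes the last-modified timestamp is what defines "newer"; the guide does not specify the exact ordering rule, and the function name is hypothetical.

```python
def processing_order(documents):
    """Order documents so the most recently modified come first.

    documents: list of (path, modified_timestamp) tuples.
    """
    newest_first = sorted(documents, key=lambda doc: doc[1], reverse=True)
    return [path for path, _modified in newest_first]
```

With this ordering, a document scanned today sits ahead of legacy files in the queue no matter how large the backlog is.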

Refer to the section, Configuration Guide - Dropbox - Finder, for further information on finder settings that determine when Symphony OCR locates files for processing.

Refer to the section, Configuration Guide - Dropbox - Processor, for further information on configuration settings that determine which files are processed.


...

11.2. Licensing



License

This is where your Symphony OCR license is set. To change your Symphony OCR license, simply click "Licensing" from the Configuration side bar, enter your new license, and select "Save Changes."

License Details

Provides you details of the license.

Features Allowed by your License

This area tells you which features are allowed by your license.

Updating your License

Starting with version 6.4.96, Symphony OCR includes an 'Automatic License Update' feature.  After you've paid your renewal invoice with Trumpet, a new license is automatically generated.  If your installation has access to the Trumpet servers, Symphony OCR will automatically detect this new license, download it, and install it.

Note:  Symphony will check for a new license once every 3 days under normal circumstances, and once per day when your license is within 30 days of expiring.

If you've paid your invoice (and received notification of a new license) and don't want to wait for the automatic update to kick in, you can click the "Check for Updated License" link on this page.  This will manually trigger Symphony OCR to retrieve the updated license from Trumpet's servers.  As mentioned, all of this assumes your installation has access to Trumpet's servers.  If a connection cannot be established, you can always copy/paste your new license into this screen.

When you receive notification from Trumpet that your new license is generated, it is still highly recommended that you A) update your installation to the latest version of the software, and B) verify your license has been updated.

 


...

11.3. Notifications



Notifications allow users to be emailed nightly based on the status of Symphony OCR.

  • Enter the email address for the person you wish to receive notifications in the "Add e-mail address" box

  • Select the Notification Type from the drop down


Each email address may be configured with one of four types:

Never - nightly emails will never be sent to this recipient (instead, after entering an email address you can select "Send Now" and deliver an email to the recipient on demand).

When there are errors - the nightly email will only be sent to the recipient if the overall system condition is Error.  This is useful for recipients who only need to know when the system is not processing documents because of some major error (licensing issues are the most common major error).

When there are warnings or errors - the nightly email will only be sent to the recipient if the overall system condition is Warning or Error.  The warning condition is triggered by documents in the Needs Attention list, configuration problems or other system level issues that should be looked at, even though they haven't completely stopped processing from occurring.

Always (aka Daily) - the nightly email will be sent to the recipient every night regardless of system status.  This is useful for firms who want to monitor the 'Not Processed' lists to ensure that every document that couldn't be OCRed (e.g., because of security or corruption) has been reviewed.  Users can review documents in the various 'Not Processed' lists and either correct the underlying issue, or move the documents to the Ignore list using Bulk Operations > Ignore.

  • Select "Save Changes"

If you have a user leave the firm or you no longer wish for a particular user to be notified, you can change the Notification Type to "Never" or remove the user entirely by selecting "Remove" to the right of the address.


...

11.4. Folders

Folders to Monitor

This is the list of folders that Symphony OCR is monitoring.

Search Frequency - The frequency at which the Finder will query this directory tree for new PDF and TIF documents.

Default Priority - The priority level at which this directory will be processed.  For more information on setting document priorities see:  Processing Priorities

Add a folder

To add a folder or directory tree to the list of folders monitored by Symphony OCR, enter the path in the field and select the "Add" button on the right.  Symphony OCR will process the entire directory tree of the path you provide (e.g. X:\Clients will process all documents in the subfolders beneath X:\Clients, like X:\Clients\Anderson, Matthew and X:\Clients\Anderson, Matthew\Agreements).

Note:  If you wish to process files in a hidden folder, you must explicitly indicate that folder. For example, if you have a root folder like X:\Clients and under that a hidden folder called "Inactive" (e.g. X:\Clients\Inactive), you must explicitly add that folder to the Monitored folders.
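
The hidden-folder rule can be modeled as follows: a folder is covered by a monitored root unless a hidden folder sits between the root and that folder, in which case the hidden folder itself must be listed. This Python sketch is a simplified model under that assumption, with hypothetical function names; it does not read real folder attributes.

```python
from pathlib import PureWindowsPath


def _is_within(path, ancestor):
    """True if path is ancestor itself or lies somewhere beneath it."""
    return path == ancestor or ancestor in path.parents


def is_monitored(folder, monitored_roots, hidden_folders):
    """Model whether folder would be searched, given the monitored list."""
    folder = PureWindowsPath(folder)
    hidden = [PureWindowsPath(h) for h in hidden_folders]
    for root in (PureWindowsPath(r) for r in monitored_roots):
        if not _is_within(folder, root):
            continue
        # A hidden folder below the root blocks everything beneath it,
        # unless that hidden folder is itself the monitored root.
        blocked = any(
            h != root and _is_within(h, root) and _is_within(folder, h)
            for h in hidden
        )
        if not blocked:
            return True
    return False
```

In this model, monitoring only X:\Clients leaves X:\Clients\Inactive untouched; adding X:\Clients\Inactive as its own entry brings it and its subfolders into scope.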

Advanced Settings

Process Read Only Files - If you wish to process read-only files, check this checkbox.

 


...

11.5. Scheduler


The Scheduler determines when and how frequently Symphony OCR performs specific tasks, such as when to send a heartbeat, when to search for new documents, when to purge backup files, etc.

To adjust a setting select "Edit" to the left of the specific setting you would like to adjust. 

To delete a specific Scheduler entry, select "Delete" on the right of the particular setting.

Most users will not need to change these items; however, there are special cases when you may wish to do so.  For example, if the firm runs their indexer software and Symphony OCR on a user's workstation, you may wish to only process items overnight.
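
An overnight schedule has to handle a time window that crosses midnight. The Python sketch below shows one way to test whether a given time falls inside such a window; the 7:00 PM to 6:00 AM hours and the function name are hypothetical examples, not Symphony OCR defaults.

```python
from datetime import time


def in_processing_window(now, start=time(19, 0), end=time(6, 0)):
    """Return True if now falls within the window [start, end).

    Handles windows that wrap past midnight (start later than end).
    """
    if start <= end:
        # Same-day window, for example 09:00 to 17:00.
        return start <= now < end
    # Overnight window, for example 19:00 to 06:00.
    return now >= start or now < end
```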


...

11.6. Finder

Status

Folder Search - Performs a search in the monitored folder structure to find all documents that are eligible for OCR, regardless of how recently the document has been created or modified.  By default it performs this search once every 120 minutes.  This can be adjusted by selecting "Manage", which takes you to the Folders page where you can adjust the search frequency for each folder.

 


...

11.7. Analyzer



The Analyzer is responsible for looking at each document and determining if it is eligible for OCR. If a document is eligible it is placed in the Processing list. If a document is not eligible, it is placed in the appropriate list (for more information on why a document might not be eligible for OCR, refer to the section, Not Processed List).

Control 

In the control area, you can choose to refresh the Analyzer or stop the Analyzer:

Refresh - Selecting Refresh will refresh the Status of the Analyzer page.

Stop Analyzer - Selecting this option will stop the Analyzer from analyzing documents in the document repository.

Status 

Displays the status of the Analyzer.

Information

Machine Processors - Indicates how many logical processors the workstation running Symphony OCR contains.

Licensed parallel processing - Indicates how many documents will be analyzed at a time based on your license features.

Recent Performance (since last restart)

Provides performance statistics such as the total number of documents and pages that Symphony OCR has found eligible for OCR, and the average speed of analysis per document since the last restart of Symphony OCR.

Overall Performance (since last restart)

Provides performance statistics such as the total number of documents and pages that Symphony OCR has found eligible for OCR, and the average speed of analysis per document since the last restart of Symphony OCR.

Settings

Do not analyze documents younger than - The default setting is 30 seconds. If you wish to have the Analyzer wait longer to analyze documents, simply change the value in the field. Trumpet recommends that this value is not decreased to less than 30 seconds to ensure that documents are fully written to the disk before processing.

To change this setting, simply type in the number of seconds, and then select "Save Changes".


...

11.8. Processor

Accessing the Processor

Select Processor in the navigation panel:


The Processor manages the actual OCR processes. Once a document has been identified as eligible for OCR by the Analyzer, the Processor confirms that the file is still eligible for OCR, and then OCRs the file. If a document is successfully OCRed, it is moved to the Processed list (for more information about the flow of documents throughout Symphony OCR, refer to the section Symphony Workflow, Tools & Document Lists).


Control

In the control area, you can choose to refresh the Processor or stop the Processor:

Refresh - Selecting Refresh will refresh the status of the Processor page.

Stop Processor - Selecting this option will stop the Processor from processing documents in the document repository.

Status

The status of the Processor (what it is currently processing).

Information

Processing Capacity Remaining - If you have a license that limits the number of pages you can process per year, the number of pages remaining will appear here.
Machine Processors - Indicates how many logical processors the workstation running Symphony OCR contains.
Licensed parallel processing - Indicates the number of documents that will be processed by the processor simultaneously.

Recent Performance

Provides performance statistics such as the number of documents and pages that Symphony OCR has processed in a smaller sample size and the average speed of processing per page, the average number of pages per document and the effective throughput of the documents.

Overall Performance

Provides performance statistics such as the total number of documents and pages that Symphony OCR has processed and the average speed of processing per page, the average number of pages per document and the effective throughput of the documents.


Basic Settings

Process TIFFs (OCR and convert to PDF) - Symphony OCR can process TIFF files and convert them to image + text PDF files. This is an optional setting. If you wish to process TIFF documents, simply check this checkbox.

Note:  If the firm opts to process TIFF documents, this will change the file extension from .tif to .pdf.  This will "break" any relationships or projects that include this file.
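
Because TIFF files are converted to PDF, any link that stores the original file name will no longer resolve after processing. The Python sketch below shows the kind of name change involved; it assumes the base name is kept and only the extension changes, which this guide does not confirm, and the function name is hypothetical.

```python
from pathlib import PureWindowsPath


def converted_name(path):
    """Return the path a TIFF would have after conversion to PDF."""
    original = PureWindowsPath(path)
    if original.suffix.lower() in {".tif", ".tiff"}:
        # Assumption: same folder and base name, new .pdf extension.
        return str(original.with_suffix(".pdf"))
    # Non-TIFF files keep their name.
    return str(original)
```

A relationship or project that points at X:\Clients\scan.tif would therefore need to be updated to point at the converted PDF.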

Process MSG (email) attachments - Symphony OCR can process email message attachments.  This is an optional setting.  If you wish to process email message attachments, check this checkbox. 

<Big fat scary warning: 

Due to a limitation in newer versions of Office, Microsoft prevents us from accessing the DLLs that allow us to read/process emails under the following conditions: 
> Symphony OCR is configured to run as a service
> 'Process MSG (email) attachments' is checked
> Outlook 2013 (or possibly Outlook 2016) is open

In these circumstances, you're likely to see the following error:

Therefore, if Symphony OCR is being installed to run as a service *and* will be configured to process email attachments, it is our recommendation to install it on a machine that will not normally have Outlook 2013 (or possibly 2016) open.  On the bright side, our testing has shown that in these situations, Symphony is still processing normal documents and WILL eventually recover and process emails after Office is closed.  But if you can, we recommend avoiding this situation.  If your experience is different, we'd like to hear about it. 

End of big fat scary warning>

Do not process documents younger than - The default setting is 30 seconds. If you wish to have the Processor wait longer to process documents, simply change the value in the field. Trumpet recommends that this value is not decreased to less than 30 seconds to ensure that documents are fully written to the disk before processing.

Do not process documents older than - If you have older documents that you do not want Symphony OCR to process, enter a specific number of days for which the software should process backlog.

Automatically rotate pages to proper orientation -  If selected, the pages will rotate either landscape or portrait according to the text on the page.


Original retention settings

Retain originals of processed files - If selected, Symphony OCR retains copies of the documents that it has processed. These copies appear as versions (if Symphony OCR processes a document 3 times, it will maintain copies of all 3 versions of the document). The user can restore previous versions of a document from the Symphony OCR backup using the document Details screen.

Purge originals of processed files after - The default setting is to retain the originals of processed files for 7 days after which they will be purged. If you wish to change this setting, you can change the value to the appropriate number of days for your firm.
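
The retention rule can be read as a simple age test on each retained original. The sketch below illustrates it in Python; it assumes the retention period is counted from the time the file was processed, which this guide does not state, and the function name is hypothetical.

```python
from datetime import datetime, timedelta


def should_purge(processed_at, now, retention_days=7):
    """Return True once a retained original is older than the retention period."""
    return now - processed_at > timedelta(days=retention_days)
```

With the default of 7 days, an original from a file processed on a Monday at noon would remain restorable until noon the following Monday.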

Backlog throttling settings (only needed when your license does NOT have unlimited pages for processing)

Default processing capacity reserved for new documents (based on the actual number of new pages added each day) -  This is calculated from the number of pages that were added to the site in the past year.

Override the default processing capacity reserve - This will determine the number of pages you would like to reserve for new documents, evenly spreading the page count capacity across the entire year. To determine a reasonable reserve, allow the Symphony OCR Analyzer module to run, then look at the timeline for the Processing Queue. Adding the number of pages in the first 52 weeks and dividing by 365 gives the average number of pages added to the system per day. Trumpet recommends adding an additional 10% to accommodate future growth or above-average filing. The resulting value should be a reasonable reserve.

Advanced Settings:

Create thumbnails (if not already present) - Checking this checkbox will create thumbnails if they are not already present.

Enable OCR debug logging - This will help our support team address issues if necessary.  In order to conserve disk space, we recommend not enabling this unless requested by our support team.

Limit parallel processing to X documents - This allows you to limit the number of cores that Symphony OCR will utilize. It uses 1 core per document. For example, if you enter 3, Symphony OCR will only use 3 cores and will process 3 documents simultaneously. See: How does Symphony OCR impact the performance of the server or indexer PC


Upon making changes to any of the above settings, select "Save Changes".

...

12. Configuration Guide - Google Drive

12.1. Quick Start Configuration Guide - Google Drive

Note:  Symphony OCR works with the locally synced (either desktop or server) folder tree of Google Drive and uses a Windows Folder Tree License.

Determine what folders to process

  • Copy and paste the folder path that you wish to process*
  • Click the "Add" button
  • Repeat for all folder paths
  • Select "Save Changes"

For more detailed information and advanced settings for configuring Symphony OCR, visit Configuration Guide - Google Drive

*Symphony OCR will process the entire directory tree of the path you provide (e.g. X:\Clients will process all documents in the subfolders beneath X:\Clients, like X:\Clients\Anderson, Matthew and X:\Clients\Anderson, Matthew\Agreements) 


What Now?

Symphony OCR should be off and running now! The Finder is looking through your repository and sending documents to the Analyzer to determine what needs to be processed. The Analyzer sends any documents eligible for OCR to the Processor which applies an invisible layer of text to the document. 

By default, Symphony OCR queries the Folder document repository for newly saved and modified files every 120 minutes.  Generally speaking, newly saved files will be OCRed within about 120 minutes.  Depending on the volume of image-only documents already stored in your monitored folders, it may take a while for Symphony OCR to process the backlog (legacy files).  Symphony OCR gives precedence to newer files, so documents that are scanned today will be processed before the backlog.

Refer to the section, Configuration Guide - Google Drive - Finder, for further information on finder settings that determine when Symphony OCR locates files for processing.

Refer to the section, Configuration Guide - Google Drive - Processor, for further information on configuration settings that determine which files are processed.



...

12.2. Licensing



License

This is where your Symphony OCR license is set. To change your Symphony OCR license, simply click "Licensing" from the Configuration side bar, enter your new license, and select "Save Changes."

License Details

Provides you details of the license.

Features Allowed by your License

This area tells you which features are allowed by your license.

Updating your License

Starting with version 6.4.96, Symphony OCR includes an 'Automatic License Update' feature.  After you've paid your renewal invoice with Trumpet, a new license is automatically generated.  If your installation has access to the Trumpet servers, Symphony OCR will automatically see this new license, download it and install it.

Note:  Symphony will check for a new license once every 3 days under normal circumstances, and once per day when your license is within 30 days of expiring.
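The check schedule in the note above can be expressed as a small Python sketch. The function name is hypothetical, and treating exactly 30 days remaining as "within 30 days" is an assumption; the guide does not state how the boundary is handled.

```python
from datetime import date, timedelta


def license_check_interval(today, expiry):
    """Return how long to wait between automatic license checks."""
    days_left = (expiry - today).days
    # Assumption: a license with 30 or fewer days left is "within 30 days".
    if days_left <= 30:
        return timedelta(days=1)
    return timedelta(days=3)
```

For example, with an expiry date five months away the function returns a 3-day interval, and with twelve days remaining it returns a 1-day interval.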

If you've paid your invoice (and received notification of a new license) and don't want to wait for the automatic update to kick in, you can click the "Check for Updated License" link on this page.  This will manually trigger Symphony OCR to retrieve the updated license from Trumpet's servers.  As mentioned, all of this assumes your installation has access to Trumpet's servers.  If a connection cannot be established, you can always copy/paste your new license into this screen.

When you receive notification from Trumpet that your new license is generated, it is still highly recommended that you A) update your installation to the latest version of the software, and B) verify your license has been updated.

 


...

12.3. Notifications



Notifications allow users to be emailed nightly based on the status of Symphony OCR.

  • Enter the email address for the person you wish to receive notifications in the "Add e-mail address" box

  • Select the Notification Type from the drop down


Each email address may be configured with one of four types:

Never - nightly emails will never be sent to this recipient (instead, after entering an email address you can select "Send Now" and deliver an email to the recipient on demand).

When there are errors - the nightly email will only be sent to the recipient if the overall system condition is Error.  This is useful for recipients who only need to know when the system is not processing documents because of some major error (licensing issues are the most common major error).

When there are warnings or errors - the nightly email will only be sent to the recipient if the overall system condition is Warning or Error.  The warning condition is triggered by documents in the Needs Attention list, configuration problems or other system level issues that should be looked at, even though they haven't completely stopped processing from occurring.

Always (aka Daily) - the nightly email will be sent to the recipient every night regardless of system status.  This is useful for firms who want to monitor the 'Not Processed' lists to ensure that every document that couldn't be OCRed (e.g., because of security or corruption) has been reviewed.  Users can review documents in the various 'Not Processed' lists and either correct the underlying issue, or move the documents to the Ignore list using Bulk Operations >Ignore.

  • Select "Save Changes"

If you have a user leave the firm or you no longer wish for a particular user to be notified, you can change the Notification Type to "Never" or remove the user entirely by selecting "Remove" to the right of the address.
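The four Notification Types can be summarized as a lookup from type to the system conditions that trigger the nightly email. This Python sketch is illustrative only: the condition name "OK" for a healthy system is an assumption (the guide names only Warning and Error), and the function name is hypothetical.

```python
# Which overall system conditions cause a nightly email for each type.
# Assumption: "OK" stands for a system with no warnings or errors.
SEND_ON = {
    "Never": set(),
    "When there are errors": {"Error"},
    "When there are warnings or errors": {"Warning", "Error"},
    "Always": {"OK", "Warning", "Error"},
}


def should_send_nightly_email(notification_type, system_condition):
    """Return True if the recipient should get tonight's email."""
    return system_condition in SEND_ON[notification_type]
```

A recipient set to "When there are errors" is skipped on a Warning night, while a recipient set to "When there are warnings or errors" is notified.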


...

12.4. Folders

Folders to Monitor

This is the list of folders that Symphony OCR is monitoring.

Search Frequency - How often the Finder queries this directory tree for new PDF and TIF documents.

Default Priority - The priority level at which this directory will be processed.  For more information on setting document priorities see:  Processing Priorities

Add a folder

To add a folder or directory tree to the list of folders that should be monitored by Symphony OCR, enter the path in the field and select the "Add" button on the right.  Symphony OCR will process the entire directory tree of the path you provide (e.g. X:\Clients will process all documents in the subfolders beneath X:\Clients, like X:\Clients\Anderson, Matthew and X:\Clients\Anderson, Matthew\Agreements).  This will add the directory tree to the list of folders that Symphony OCR is monitoring.

Note:  If you wish to process files in a hidden folder, you must explicitly add that folder. For example, if you have a root folder like X:\Clients and under that a hidden folder called "Inactive" (e.g. X:\Clients\Inactive), you must explicitly add that folder to the Monitored folders.

Advanced Settings

Process Read Only Files - To process read-only files, check this checkbox.

 


...

12.5. Enable Read Only Processing

  • Open the Symphony OCR homepage
  • Select "Folders" from the left side bar
  • Check the "Process read only files" checkbox
  • Select "Save Changes"

...

12.6. Scheduler


The Scheduler determines when and how frequently Symphony OCR performs specific tasks, such as when to send a heartbeat, when to search for new documents, when to purge backup files, etc.

To adjust a setting, select "Edit" to the left of the specific setting you would like to adjust.

To delete a specific Scheduler entry, select "Delete" on the right of the particular setting.

Most users will not need to change these items; however, there are special cases when you may wish to do so.  For example, if the firm runs its indexer software and Symphony OCR on a user's workstation, you may wish to only process items overnight.


...

12.7. Finder

Status

Folder Search - Performs a search in the monitored folder structure to find all documents that are eligible for OCR, regardless of how recently the document has been created or modified.  By default it performs this search once every 120 minutes.  This can be adjusted by selecting "Manage", which takes you to the Folders page where you can adjust the search frequency for each folder.

 


...

12.8. Analyzer



The Analyzer is responsible for looking at each document and determining if it is eligible for OCR. If a document is eligible it is placed in the Processing list. If a document is not eligible, it is placed in the appropriate list (for more information on why a document might not be eligible for OCR, refer to the section, Not Processed List).

Control 

In the control area, you can choose to refresh the Analyzer or stop the Analyzer:

Refresh - Selecting Refresh will refresh the Status of the Analyzer page.

Stop Analyzer - Selecting this option will stop the Analyzer from analyzing documents in the document repository.

Status 

Displays the status of the Analyzer.

Information

Machine Processors - Indicates how many logical processors the workstation running Symphony OCR contains.

Licensed parallel processing - Indicates how many documents will be analyzed at a time based on your license features.

Recent Performance (since last restart)

Provides performance statistics such as the total number of documents and pages that Symphony OCR has found eligible for OCR, and the average speed of analysis per document since the last restart of Symphony OCR.

Overall Performance (since last restart)

Provides performance statistics such as the total number of documents and pages that Symphony OCR has found eligible for OCR, and the average speed of analysis per document since the last restart of Symphony OCR.

Settings

Do not analyze documents younger than - The default setting is 30 seconds. If you wish to have the Analyzer wait longer to analyze documents, simply change the value in the field. Trumpet recommends that this value is not decreased to less than 30 seconds to ensure that documents are fully written to the disk before processing.

To change this setting, simply type in the number of seconds, and then select "Save Changes".
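The reason for the 30-second minimum is that a file still being written by a scanner or sync client should not be read yet. A minimal Python sketch of such an age check is shown below. It assumes the file's last-modified time is the measure of age, which the guide does not confirm, and the names are hypothetical.

```python
import os
import time

# Default taken from the guide; the setting is expressed in seconds.
MIN_AGE_SECONDS = 30


def is_old_enough(path, min_age_seconds=MIN_AGE_SECONDS, now=None):
    """Return True once the file has been untouched for min_age_seconds."""
    # Assumption: age is measured from the last-modified timestamp.
    current_time = time.time() if now is None else now
    return (current_time - os.path.getmtime(path)) >= min_age_seconds
```

A file modified 5 seconds ago is skipped on this pass and picked up on a later one, once 30 seconds have elapsed without further changes.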


...

12.9. Processor

Accessing the Processor

Select Processor in the navigation panel:


The Processor manages the actual OCR processes. Once a document has been identified as eligible for OCR by the Analyzer, the Processor confirms that the file is still eligible for OCR, and then OCRs the file. If a document is successfully OCRed, it is moved to the Processed list (for more information about the flow of documents throughout Symphony OCR, refer to the section Symphony Workflow, Tools & Document Lists).


Control

In the control area, you can choose to refresh the Processor or stop the Processor:

Refresh - Selecting Refresh will refresh the status of the Processor page.

Stop Processor - Selecting this option will stop the Processor from processing documents in the document repository.

Status

The status of the Processor (what it is currently processing).

Information

Processing Capacity Remaining - If you have a license that limits the number of pages you can process per year, the number of pages remaining will appear here.
Machine Processors - Indicates how many logical processors the workstation running Symphony OCR contains.
Licensed parallel processing - Indicates the number of documents that will be processed by the processor simultaneously.

Recent Performance

Provides performance statistics such as the number of documents and pages that Symphony OCR has processed in a smaller sample size and the average speed of processing per page, the average number of pages per document and the effective throughput of the documents.

Overall Performance

Provides performance statistics such as the total number of documents and pages that Symphony OCR has processed and the average speed of processing per page, the average number of pages per document and the effective throughput of the documents.


Basic Settings

Process TIFFs (OCR and convert to PDF) - Symphony OCR can process TIFF files and convert them to image + text PDF files. This is an optional setting. If you wish to process TIFF documents, simply check this checkbox.

Note:  If the firm opts to process TIFF documents, this will change the file extension from .tif to .pdf.  This will "break" any relationships or projects that include this file.

Process MSG (email) attachments - Symphony OCR can process email message attachments.  This is an optional setting.  If you wish to process email message attachments, check this checkbox. 

<Big fat scary warning: 

Due to a limitation in newer versions of Office, Microsoft prevents us from accessing the DLLs that allow us to read/process emails under the following conditions: 
> Symphony OCR is configured to run as a service
> 'Process MSG (email) attachments' is checked
> Outlook 2013 (or possibly Outlook 2016) is open

In these circumstances, you're likely to see the following error:

Therefore, if Symphony OCR is being installed to run as a service *and* will be configured to process email attachments, it is our recommendation to install it on a machine that will not normally have Outlook 2013 (or possibly 2016) open.  On the bright side, our testing has shown that in these situations, Symphony is still processing normal documents and WILL eventually recover and process emails after Office is closed.  But if you can, we recommend avoiding this situation.  If your experience is different, we'd like to hear about it. 

End of big fat scary warning>

Do not process documents younger than - The default setting is 30 seconds. If you wish to have the Processor wait longer to process documents, simply change the value in the field. Trumpet recommends that this value is not decreased to less than 30 seconds to ensure that documents are fully written to the disk before processing.

Do not process documents older than - If you have older documents that you do not want Symphony OCR to process, enter a specific number of days for which the software should process backlog.

Automatically rotate pages to proper orientation -  If selected, the pages will rotate either landscape or portrait according to the text on the page.


Original retention settings

Retain originals of processed files - If selected, Symphony OCR retains copies of the documents that it has processed. These copies appear as versions (if Symphony OCR processes a document 3 times, it will maintain copies of all 3 versions of the document). The user can restore previous versions of a document from the Symphony OCR backup using the document Details screen.

Purge originals of processed files after - The default setting is to retain the originals of processed files for 7 days after which they will be purged. If you wish to change this setting, you can change the value to the appropriate number of days for your firm.

Backlog throttling settings (only needed when your license does NOT have unlimited pages for processing)

Default processing capacity reserved for new documents (based on the actual number of new pages added each day) -  This is calculated from the number of pages that were added to the site in the past year.

Override the default processing capacity reserve - This sets the number of pages you would like to reserve for new documents, evenly spreading the page count capacity across the entire year. To determine a reasonable reserve, allow the Symphony OCR Analyzer module to run, then look at the timeline for the Processing Queue. Adding the number of pages in the first 52 weeks and dividing by 365 gives the average number of pages added to the system per day. Trumpet recommends adding an additional 10% to accommodate future growth or above-average filing. This value should be a reasonable overclocking reserve.
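The reserve arithmetic described above (sum the first 52 weeks of pages, divide by 365 to get a daily figure, then add 10%) can be worked through with a short Python sketch. The function name and the sample weekly figures are made up for illustration; use the numbers from your own Processing Queue timeline.

```python
def suggested_daily_reserve(weekly_page_counts, growth_margin=0.10):
    """Estimate a daily page reserve from weekly page counts."""
    # Only the first 52 weeks of the timeline are counted.
    first_year = weekly_page_counts[:52]
    average_per_day = sum(first_year) / 365
    # Add the recommended margin for growth or above-average filing.
    return round(average_per_day * (1 + growth_margin))


# Hypothetical firm that files 700 pages every week.
reserve = suggested_daily_reserve([700] * 52)
```

In this example, 52 weeks of 700 pages is 36,400 pages, which is about 99.7 pages per day, and adding 10% gives a reserve of 110 pages.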

Advanced Settings:

Create thumbnails (if not already present) - Checking this checkbox will create thumbnails if they are not already present.

Enable OCR debug logging - This will help our support team address issues if necessary.  To conserve disk space, we recommend leaving this disabled unless our support team requests it.

Limit parallel processing to X documents - This allows you to limit the number of cores that Symphony OCR will utilize. It uses 1 core per document. For example, if you enter 3, Symphony OCR will use only 3 cores and will process 3 documents simultaneously. See: How does Symphony OCR impact the performance of the server or indexer PC


Upon making changes to any of the above settings, select "Save Changes".

...

13. Configuration Guide - Microsoft One Drive

13.1. Quick Start Configuration Guide - Microsoft One Drive

Note:  Symphony OCR works with the locally synced (either desktop or server) folder tree of Microsoft One Drive and uses a Windows Folder Tree License.

Determine what folders to process

  • Copy and paste the folder path that you wish to process*
  • Click the "Add" button
  • Repeat for all folder paths
  • Select "Save Changes"

For more detailed information and advanced settings for configuring Symphony OCR, visit Configuration Guide - Microsoft One Drive

*Symphony OCR will process the entire directory tree of the path you provide (e.g. X:\Clients will process all documents in the subfolders beneath X:\Clients, like X:\Clients\Anderson, Matthew and X:\Clients\Anderson, Matthew\Agreements) 


What Now?

Symphony OCR should be off and running now! The Finder is looking through your repository and sending documents to the Analyzer to determine what needs to be processed. The Analyzer sends any documents eligible for OCR to the Processor which applies an invisible layer of text to the document. 

By default, Symphony OCR queries the Folder document repository for newly saved and modified files every 120 minutes.  Generally speaking, newly saved files will be OCRed within about 120 minutes.  Depending on the volume of image-only documents already stored in the monitored folders, it may take a while for Symphony OCR to process the backlog (legacy files).  Symphony OCR gives precedence to newer files, so documents that are scanned today will be processed before the backlog.

Refer to the section, Configuration Guide - Microsoft One Drive - Finder, for further information on finder settings that determine when Symphony OCR locates files for processing.

Refer to the section, Configuration Guide - Microsoft One Drive - Processor, for further information on configuration settings that determine which files are processed.


...

13.2. Licensing



License

This is where your Symphony OCR license is set. To change your Symphony OCR license, simply click "Licensing" from the Configuration side bar, enter your new license, and select "Save Changes."

License Details

Provides you details of the license.

Features Allowed by your License

This area tells you which features are allowed by your license.

Updating your License

Starting with version 6.4.96, Symphony OCR includes an 'Automatic License Update' feature.  After you've paid your renewal invoice with Trumpet, a new license is automatically generated.  If your installation has access to the Trumpet servers, Symphony OCR will automatically see this new license, download it and install it.

Note:  Symphony will check for a new license once every 3 days under normal circumstances, and once per day when your license is within 30 days of expiring.

If you've paid your invoice (and received notification of a new license) and don't want to wait for the automatic update to kick in, you can click the "Check for Updated License" link on this page.  This will manually trigger Symphony OCR to retrieve the updated license from Trumpet's servers.  As mentioned, all of this assumes your installation has access to Trumpet's servers.  If a connection cannot be established, you can always copy/paste your new license into this screen.

When you receive notification from Trumpet that your new license is generated, it is still highly recommended that you A) update your installation to the latest version of the software, and B) verify your license has been updated.

 


...

13.3. Notifications



Notifications allow users to be emailed nightly based on the status of Symphony OCR.

  • Enter the email address for the person you wish to receive notifications in the "Add e-mail address" box

  • Select the Notification Type from the drop down


Each email address may be configured with one of four types:

Never - nightly emails will never be sent to this recipient (instead, after entering an email address you can select "Send Now" and deliver an email to the recipient on demand).

When there are errors - the nightly email will only be sent to the recipient if the overall system condition is Error.  This is useful for recipients who only need to know when the system is not processing documents because of some major error (licensing issues are the most common major error).

When there are warnings or errors - the nightly email will only be sent to the recipient if the overall system condition is Warning or Error.  The warning condition is triggered by documents in the Needs Attention list, configuration problems or other system level issues that should be looked at, even though they haven't completely stopped processing from occurring.

Always (aka Daily) - the nightly email will be sent to the recipient every night regardless of system status.  This is useful for firms who want to monitor the 'Not Processed' lists to ensure that every document that couldn't be OCRed (e.g., because of security or corruption) has been reviewed.  Users can review documents in the various 'Not Processed' lists and either correct the underlying issue, or move the documents to the Ignore list using Bulk Operations >Ignore.

  • Select "Save Changes"

If you have a user leave the firm or you no longer wish for a particular user to be notified, you can change the Notification Type to "Never" or remove the user entirely by selecting "Remove" to the right of the address.


...

13.4. Folders

Folders to Monitor

This is the list of folders that Symphony OCR is monitoring.

Search Frequency - How often the Finder queries this directory tree for new PDF and TIF documents.

Default Priority - The priority level at which this directory will be processed.  For more information on setting document priorities see:  Processing Priorities

Add a folder

To add a folder or directory tree to the list of folders that should be monitored by Symphony OCR, enter the path in the field and select the "Add" button on the right.  Symphony OCR will process the entire directory tree of the path you provide (e.g. X:\Clients will process all documents in the subfolders beneath X:\Clients, like X:\Clients\Anderson, Matthew and X:\Clients\Anderson, Matthew\Agreements).  This will add the directory tree to the list of folders that Symphony OCR is monitoring.

Note:  If you wish to process files in a hidden folder, you must explicitly add that folder. For example, if you have a root folder like X:\Clients and under that a hidden folder called "Inactive" (e.g. X:\Clients\Inactive), you must explicitly add that folder to the Monitored folders.

Advanced Settings

Process Read Only Files - To process read-only files, check this checkbox.

 


...

13.5. Enable Read Only Processing

  • Open the Symphony OCR homepage
  • Select "Folders" from the left side bar
  • Check the "Process read only files" checkbox
  • Select "Save Changes"

...

13.6. Scheduler


The Scheduler determines when and how frequently Symphony OCR performs specific tasks, such as when to send a heartbeat, when to search for new documents, when to purge backup files, etc.

To adjust a setting, select "Edit" to the left of the specific setting you would like to adjust.

To delete a specific Scheduler entry, select "Delete" on the right of the particular setting.

Most users will not need to change these items; however, there are special cases when you may wish to do so.  For example, if the firm runs its indexer software and Symphony OCR on a user's workstation, you may wish to only process items overnight.
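When planning an overnight-only schedule, the one subtlety is that the window crosses midnight. The Python sketch below shows how such a window can be tested. The 7:00 PM to 6:00 AM hours and the function name are purely hypothetical, and the actual window is whatever you enter in the Scheduler.

```python
from datetime import time


def in_processing_window(now, start=time(19, 0), end=time(6, 0)):
    """Return True if 'now' falls inside the allowed processing window."""
    if start <= end:
        # Window within a single day, e.g. 01:00 to 05:00.
        return start <= now < end
    # Window that wraps past midnight, e.g. 19:00 to 06:00.
    return now >= start or now < end
```

With the example hours, 11:00 PM and 3:00 AM fall inside the window, while noon does not.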


...

13.7. Finder

Status

Folder Search - Performs a search in the monitored folder structure to find all documents that are eligible for OCR, regardless of how recently the document has been created or modified.  By default it performs this search once every 120 minutes.  This can be adjusted by selecting "Manage", which takes you to the Folders page where you can adjust the search frequency for each folder.

 


...

13.8. Analyzer



The Analyzer is responsible for looking at each document and determining if it is eligible for OCR. If a document is eligible it is placed in the Processing list. If a document is not eligible, it is placed in the appropriate list (for more information on why a document might not be eligible for OCR, refer to the section, Not Processed List).

Control 

In the control area, you can choose to refresh the Analyzer or stop the Analyzer:

Refresh - Selecting Refresh will refresh the Status of the Analyzer page.

Stop Analyzer - Selecting this option will stop the Analyzer from analyzing documents in the document repository.

Status 

Displays the status of the Analyzer.

Information

Machine Processors - Indicates how many logical processors the workstation running Symphony OCR contains.

Licensed parallel processing - Indicates how many documents will be analyzed at a time based on your license features.

Recent Performance (since last restart)

Provides performance statistics such as the total number of documents and pages that Symphony OCR has found eligible for OCR, and the average speed of analysis per document since the last restart of Symphony OCR.

Overall Performance (since last restart)

Provides performance statistics such as the total number of documents and pages that Symphony OCR has found eligible for OCR, and the average speed of analysis per document since the last restart of Symphony OCR.

Settings

Do not analyze documents younger than - The default setting is 30 seconds. If you wish to have the Analyzer wait longer to analyze documents, simply change the value in the field. Trumpet recommends that this value is not decreased to less than 30 seconds to ensure that documents are fully written to the disk before processing.

To change this setting, simply type in the number of seconds, and then select "Save Changes".


...

13.9. Processor

Accessing the Processor

Select Processor in the navigation panel:


The Processor manages the actual OCR processes. Once a document has been identified as eligible for OCR by the Analyzer, the Processor confirms that the file is still eligible for OCR, and then OCRs the file. If a document is successfully OCRed, it is moved to the Processed list (for more information about the flow of documents throughout Symphony OCR, refer to the section Symphony Workflow, Tools & Document Lists).


Control

In the control area, you can choose to refresh the Processor or stop the Processor:

Refresh - Selecting Refresh will refresh the status of the Processor page.

Stop Processor - Selecting this option will stop the Processor from processing documents in the document repository.

Status

The status of the Processor (what it is currently processing).

Information

Processing Capacity Remaining - If you have a license that limits the number of pages you can process per year, the number of pages remaining will appear here.
Machine Processors - Indicates how many logical processors the workstation running Symphony OCR contains.
Licensed parallel processing - Indicates the number of documents that will be processed by the processor simultaneously.

Recent Performance

Provides performance statistics such as the number of documents and pages that Symphony OCR has processed in a smaller sample size and the average speed of processing per page, the average number of pages per document and the effective throughput of the documents.

Overall Performance

Provides performance statistics such as the total number of documents and pages that Symphony OCR has processed and the average speed of processing per page, the average number of pages per document and the effective throughput of the documents.


Basic Settings

Process TIFFs (OCR and convert to PDF) - Symphony OCR can process TIFF files and convert them to image + text PDF files. This is an optional setting. If you wish to process TIFF documents, simply check this checkbox.

Note:  If the firm opts to process TIFF documents, this will change the file extension from .tif to .pdf.  This will "break" any relationships or projects that include this file.

Process MSG (email) attachments - Symphony OCR can process email message attachments.  This is an optional setting.  If you wish to process email message attachments, check this checkbox. 

<Big fat scary warning: 

Due to a limitation in newer versions of Office, Microsoft prevents us from accessing the DLLs that allow us to read/process emails under the following conditions: 
> Symphony OCR is configured to run as a service
> 'Process MSG (email) attachments' is checked
> Outlook 2013 (or possibly Outlook 2016) is open

In these circumstances, you're likely to see the following error:

Therefore, if Symphony OCR is being installed to run as a service *and* will be configured to process email attachments, it is our recommendation to install it on a machine that will not normally have Outlook 2013 (or possibly 2016) open.  On the bright side, our testing has shown that in these situations, Symphony is still processing normal documents and WILL eventually recover and process emails after Office is closed.  But if you can, we recommend avoiding this situation.  If your experience is different, we'd like to hear about it. 

End of big fat scary warning>

Do not process documents younger than - The default setting is 30 seconds. If you wish to have the Processor wait longer to process documents, simply change the value in the field. Trumpet recommends that this value is not decreased to less than 30 seconds to ensure that documents are fully written to the disk before processing.

Do not process documents older than - If you have older documents that you do not want Symphony OCR to process, enter a specific number of days for which the software should process backlog.

Automatically rotate pages to proper orientation -  If selected, the pages will rotate either landscape or portrait according to the text on the page.


Original retention settings

Retain originals of processed files - If selected, Symphony OCR retains copies of the documents that it has processed. These copies appear as versions (if Symphony OCR processes a document 3 times, it will maintain copies of all 3 versions of the document). The user can restore previous versions of a document from the Symphony OCR backup using the document Details screen.

Purge originals of processed files after - The default setting is to retain the originals of processed files for 7 days after which they will be purged. If you wish to change this setting, you can change the value to the appropriate number of days for your firm.
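The retention rule above amounts to comparing each retained original's age with a cutoff date. The Python sketch below illustrates that comparison only. The backup names, the dictionary layout and the function name are hypothetical, since the guide does not describe how Symphony OCR stores its backups.

```python
from datetime import datetime, timedelta


def backups_to_purge(backups, now, retention_days=7):
    """Return names of retained originals older than the retention period.

    backups maps a backup name to the datetime it was created.
    """
    cutoff = now - timedelta(days=retention_days)
    return sorted(name for name, created in backups.items() if created < cutoff)
```

With the default of 7 days, an original retained two weeks ago is selected for purging, while one retained five days ago is kept and can still be restored from the document Details screen.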

Backlog throttling settings (only needed when your license does NOT have unlimited pages for processing)

Default processing capacity reserved for new documents (based on the actual number of new pages added each day) -  This is calculated from the number of pages that were added to the site in the past year.

Override the default processing capacity reserve - This sets the number of pages you would like to reserve for new documents, evenly spreading the page count capacity across the entire year. To determine a reasonable reserve, allow the Symphony OCR Analyzer module to run, then look at the timeline for the Processing Queue. Adding the number of pages in the first 52 weeks and dividing by 365 gives the average number of pages added to the system per day. Trumpet recommends adding an additional 10% to accommodate future growth or above-average filing. This value should be a reasonable overclocking reserve.

Advanced Settings:

Create thumbnails (if not already present) - Checking this checkbox will create thumbnails if they are not already present.

Enable OCR debug logging - This will help our support team address issues if necessary.  To conserve disk space, we recommend leaving this disabled unless our support team requests it.

Limit parallel processing to X documents - This allows you to limit the number of cores that Symphony OCR will utilize. It uses 1 core per document. For example, if you enter 3, Symphony OCR will use only 3 cores and will process 3 documents simultaneously. See: How does Symphony OCR impact the performance of the server or indexer PC


Upon making changes to any of the above settings, select "Save Changes".
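The effect of the "Limit parallel processing to X documents" setting can be pictured as a worker pool with a fixed size. The Python sketch below is a generic illustration of that idea and not Symphony OCR's implementation: process_one stands in for the real OCR step, and Python threads are used here for simplicity even though they do not map one-to-one onto CPU cores.

```python
from concurrent.futures import ThreadPoolExecutor


def process_all(documents, process_one, parallel_limit=3):
    """Run process_one over documents, at most parallel_limit at a time."""
    # max_workers caps how many documents are in flight simultaneously.
    with ThreadPoolExecutor(max_workers=parallel_limit) as pool:
        # pool.map returns results in the same order as the input list.
        return list(pool.map(process_one, documents))
```

With a limit of 3 and five queued documents, three start immediately and the remaining two wait until a worker becomes free, which leaves the other cores available to the indexer or other software on the same machine.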

...

14. Configuration Guide - SharePoint

14.1. Quick Start Configuration Guide - SharePoint

Connect to SharePoint

For a quick video showing the installation and configuration of SharePoint visit:  https://youtu.be/UNGbJiaRn9A

  • Enter the SharePoint Tenant URL (be sure to include the https://)  and choose "Connect to SharePoint"
  • You will be redirected to the SharePoint tenant site, where you will be prompted for the SharePoint username and password of the account under which you would like to run Symphony OCR.
  • Enter the Username and Password
  • You will be prompted by SharePoint to trust "extranet.trumpetinc.com"; choose "Trust It"
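A common mistake in the first step above is leaving off the https:// prefix. A quick way to sanity-check a tenant URL before pasting it in is sketched below in Python. The function name is hypothetical, the contoso address in the usage note is a placeholder, and the check only confirms the URL's shape, not that the tenant exists.

```python
from urllib.parse import urlparse


def looks_like_tenant_url(url):
    """Return True if the URL starts with https:// and names a host."""
    parts = urlparse(url.strip())
    return parts.scheme == "https" and bool(parts.netloc)
```

For example, "https://contoso.sharepoint.com" passes the check, while "contoso.sharepoint.com" fails because the scheme is missing.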


By default, SymphonyOCR will process all sites and subsites within your site tenant.



...

14.2. Licensing



License

This is where your Symphony OCR license is set. To change your Symphony OCR license, simply click "Licensing" from the Configuration side bar, enter your new license, and select "Save Changes."

License Details

Provides you details of the license.

Features Allowed by your License

This area tells you which features are allowed by your license.

Updating your License

Starting with version 6.4.96, Symphony OCR has an 'Automatic License Update' feature.  After you've paid your renewal invoice with Trumpet, a new license is automatically generated.  If your installation has access to the Trumpet servers, Symphony OCR will automatically detect this new license, download it and install it.

Note:  Symphony will check for a new license once every 3 days under normal circumstances, and once per day when your license is within 30 days of expiring.
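The check schedule in the note above can be sketched as a small function. This is only an illustration of the stated rule, not Symphony OCR's actual code: the function name is hypothetical, and treating "within 30 days" as including exactly 30 days is an assumption.

```python
from datetime import date, timedelta


def license_check_interval(today: date, expires: date) -> timedelta:
    """Return how long to wait before the next license check (hypothetical helper)."""
    days_left = (expires - today).days
    # Assumption: "within 30 days of expiring" includes exactly 30 days
    # and also covers a license that has already expired.
    if days_left <= 30:
        return timedelta(days=1)
    # Normal circumstances: check once every 3 days.
    return timedelta(days=3)


# A license expiring in about six months is checked every 3 days.
print(license_check_interval(date(2024, 1, 1), date(2024, 6, 30)).days)
# A license expiring in 15 days is checked daily.
print(license_check_interval(date(2024, 6, 15), date(2024, 6, 30)).days)
```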

If you've paid your invoice (and received notification of a new license) and don't want to wait for the automatic update to kick in, you can click the "Check for Updated License" link on this page.  This will manually trigger Symphony OCR to retrieve the updated license from Trumpet's servers.  As mentioned, all of this assumes your installation has access to Trumpet's servers.  If a connection cannot be established, you can always copy/paste your new license into this screen.

When you receive notification from Trumpet that your new license is generated, it is still highly recommended that you A) update your installation to the latest version of the software, and B) verify your license has been updated.

 


...

14.3. Notifications



Notifications allow users to be emailed nightly based on the status of Symphony OCR.

  • Enter the email address for the person you wish to receive notifications in the "Add e-mail address" box

  • Select the Notification Type from the drop down


Each email address may be configured with one of four types:

Never - nightly emails will never be sent to this recipient (instead, after entering an email address you can select "Send Now" and deliver an email to the recipient on demand).

When there are errors - the nightly email will only be sent to the recipient if the overall system condition is Error.  This is useful for recipients who only need to know when the system is not processing documents because of some major error (licensing issues are the most common major error).

When there are warnings or errors - the nightly email will only be sent to the recipient if the overall system condition is Warning or Error.  The warning condition is triggered by documents in the Needs Attention list, configuration problems or other system level issues that should be looked at, even though they haven't completely stopped processing from occurring.

Always (aka Daily) - the nightly email will be sent to the recipient every night regardless of system status.  This is useful for firms that want to monitor the 'Not Processed' lists to ensure that every document that couldn't be OCRed (e.g., because of security or corruption) has been reviewed.  Users can review documents in the various 'Not Processed' lists and either correct the underlying issue, or move the documents to the Ignore list using Bulk Operations > Ignore.

  • Select "Save Changes"

If you have a user leave the firm or you no longer wish for a particular user to be notified, you can change the Notification Type to "Never" or remove the user entirely by selecting "Remove" to the right of the address.
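To make the four Notification Types concrete, here is a minimal sketch of the decision they describe. The function and enum names are hypothetical and the logic is simply a restatement of the descriptions above, not the product's implementation.

```python
from enum import Enum


class Condition(Enum):
    """Overall system condition, as described in this guide."""
    OK = "OK"
    WARN = "Warn"
    ERROR = "Error"


def should_send_nightly(notification_type: str, condition: Condition) -> bool:
    """Decide whether the nightly email goes to a recipient (hypothetical helper)."""
    if notification_type == "Never":
        return False  # only "Send Now" delivers an email on demand
    if notification_type == "When there are errors":
        return condition is Condition.ERROR
    if notification_type == "When there are warnings or errors":
        return condition in (Condition.WARN, Condition.ERROR)
    if notification_type == "Always":
        return True
    raise ValueError(f"unknown notification type: {notification_type}")


print(should_send_nightly("When there are errors", Condition.WARN))
print(should_send_nightly("When there are warnings or errors", Condition.WARN))
```

With the system in the Warn condition, the first call yields False and the second yields True.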


...

14.4. SharePoint

Connect to SharePoint

For a quick video showing the installation and configuration of SharePoint visit:  https://youtu.be/UNGbJiaRn9A

  • Enter the SharePoint Tenant URL (be sure to include the https://) and choose "Connect to SharePoint"
  • You will be redirected to the SharePoint tenant site, where you will be asked for the username and password of the SharePoint account under which you would like SymphonyOCR to run
  • Enter the Username and Password
  • When SharePoint prompts you to trust "extranet.trumpetinc.com", choose "Trust It"


By default, SymphonyOCR will process all sites and subsites within your site tenant.



...

14.5. Reset SharePoint integration

Background:

Use this procedure when you need to update the SharePoint credentials that SymphonyOCR connects with.

This is typically needed when the SharePoint user's password has been changed, or when the user that SymphonyOCR connected to SharePoint with has been deactivated. In either case, Symphony enters an error state because it can no longer connect to SharePoint.

Solution: 

  • In the root SymphonyOCR folder, open the Config folder and then open Settings.xml
  • Identify the lines containing the following and delete them:
    • <sharePointConnectionManager realm=
    • refreshToken=
    • sharepointurl=
  • Restart the SymphonyOCR service.
    • If you are not running Symphony as a service, close the interface and then close Symphony from the System tray.
  • When Symphony starts back up, you will be able to reconnect to the SharePoint Tenant and enter the new credentials.
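If you prefer to script the line removal, the idea can be sketched as below. This is a rough illustration only: the sample file content is invented for the example (the real layout of Settings.xml may differ), the function name is hypothetical, and you should always work on a backup copy and review the result by hand before restarting the service.

```python
# Markers taken from the steps above; any line containing one is dropped.
MARKERS = (
    "<sharePointConnectionManager realm=",
    "refreshToken=",
    "sharepointurl=",
)


def strip_sharepoint_lines(text: str) -> str:
    """Return the text without any line that contains one of the markers."""
    kept = [
        line
        for line in text.splitlines(keepends=True)
        if not any(marker in line for marker in MARKERS)
    ]
    return "".join(kept)


# Hypothetical sample content, NOT the real Settings.xml layout.
sample = (
    "<settings>\n"
    '  <sharePointConnectionManager realm="example-realm"\n'
    '      refreshToken="example-token"\n'
    '      sharepointurl="https://example.sharepoint.com" />\n'
    '  <other value="1" />\n'
    "</settings>\n"
)

print(strip_sharepoint_lines(sample))
```

Because the sketch works line by line, it is only safe when each marker sits on a line that contains nothing you need to keep. If in doubt, edit the file manually or contact support.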

For assistance with this process you can reach out to your SymphonyOCR Channel Partner or Trumpet at Support@trumpetinc.com.

 

...

14.6. Scheduler


The Scheduler determines when and how frequently Symphony OCR performs specific tasks, such as when to send a heartbeat, when to search for new documents, when to purge backup files, etc.

To adjust a setting select "Edit" to the left of the specific setting you would like to adjust. 

To delete a specific Scheduler entry, select "Delete" on the right of the particular setting.

Most users will not need to change these items; however, there are special cases where you may wish to do so.  For example, if the firm runs their indexer software and Symphony OCR on a user's workstation, you may wish to process items only overnight.


...

14.7. Finder

SharePoint New Document Search in "X" Folders ("X" indicates the number of folders Symphony OCR will process):  Performs a search in the folder structure to find all documents that are eligible for OCR.  By default, the Finder performs this search every hour.  To adjust this, select "Manage", which takes you to the SharePoint page where you can adjust the search frequency for each folder.

SharePoint Legacy Document Search in "X" ("X" indicates the number of folders Symphony OCR will process):  Performs a search in the folder structure to find legacy documents that are eligible for OCR.  By default, the Finder searches for legacy documents every 12 hours.  To adjust this, select "Manage", which takes you to the SharePoint page where you can adjust the frequency of legacy document searches.

...

14.8. Analyzer



The Analyzer is responsible for looking at each document and determining if it is eligible for OCR. If a document is eligible it is placed in the Processing list. If a document is not eligible, it is placed in the appropriate list (for more information on why a document might not be eligible for OCR, refer to the section, Not Processed List).

Control 

In the control area, you can choose to refresh the Analyzer or stop the Analyzer:

Refresh - Selecting Refresh will refresh the Status of the Analyzer page.

Stop Analyzer - Selecting this option will stop the Analyzer from analyzing documents in the document repository.

Status 

Displays the status of the Analyzer.

Information

Machine Processors - Indicates how many logical processors the workstation running Symphony OCR contains.

Licensed parallel processing - Indicates how many documents will be analyzed at a time based on your license features.

Recent Performance (since last restart)

Provides performance statistics such as the total number of documents and pages that Symphony OCR has found eligible for OCR, and the average speed of analysis per document since the last restart of Symphony OCR.

Overall Performance (since last restart)

Provides performance statistics such as the total number of documents and pages that Symphony OCR has found eligible for OCR, and the average speed of analysis per document since the last restart of Symphony OCR.

Settings

Do not analyze documents younger than - The default setting is 30 seconds. If you wish to have the Analyzer wait longer before analyzing documents, simply change the value in the field. Trumpet recommends that this value not be decreased below 30 seconds, to ensure that documents are fully written to the disk before processing.

To change this setting, simply type in the number of seconds, and then select "Save Changes".


...

14.9. Processor

Accessing the Processor

Select Processor in the navigation panel:


The Processor manages the actual OCR processes. Once a document has been identified as eligible for OCR by the Analyzer, the Processor confirms that the file is still eligible for OCR, and then OCRs the file. If a document is successfully OCRed, it is moved to the Processed list (for more information about the flow of documents throughout Symphony OCR, refer to the section Symphony Workflow, Tools & Document Lists).


Control

In the control area, you can choose to refresh the Processor or stop the Processor:

Refresh - Selecting Refresh will refresh the status of the Processor page.

Stop Processor - Selecting this option will stop the Processor from processing documents in the document repository.

Status

The status of the Processor (what it is currently processing).

Information

Processing Capacity Remaining - If you have a license that limits the number of pages you can process per year, the number of pages remaining will appear here.
Machine Processors - Indicates how many logical processors the workstation running SymphonyOCR contains.
Licensed parallel processing - Indicates the number of documents that will be processed by the processor simultaneously.

Recent Performance

Provides performance statistics over a smaller, recent sample: the number of documents and pages that Symphony OCR has processed, the average speed of processing per page, the average number of pages per document and the effective throughput of the documents.

Overall Performance

Provides performance statistics such as the total number of documents and pages that Symphony OCR has processed and the average speed of processing per page, the average number of pages per document and the effective throughput of the documents.


Basic Settings

Process TIFFs (OCR and convert to PDF) - Symphony OCR can process TIFF files and convert them to image + text PDF files. This is an optional setting. If you wish to process TIFF documents, simply check this checkbox.

Note:  If the firm opts to process TIFF documents, processing will change the file extension from .tif to .pdf.  This will "break" any relationships or projects that include this file.

Process MSG (email) attachments - Symphony OCR can process email message attachments.  This is an optional setting.  If you wish to process email message attachments, check this checkbox. 

<Big fat scary warning: 

Due to a limitation in newer versions of Office, Microsoft prevents us from accessing the DLLs that allow us to read/process emails under the following conditions: 
> Symphony OCR is configured to run as a service
> 'Process MSG (email) attachments' is checked
> Outlook 2013 (or possibly Outlook 2016) is open

In these circumstances, you're likely to see the following error:

Therefore, if Symphony OCR is being installed to run as a service *and* will be configured to process email attachments, it is our recommendation to install it on a machine that will not normally have Outlook 2013 (or possibly 2016) open.  On the bright side, our testing has shown that in these situations, Symphony is still processing normal documents and WILL eventually recover and process emails after Office is closed.  But if you can, we recommend avoiding this situation.  If your experience is different, we'd like to hear about it. 

End of big fat scary warning>

Do not process documents younger than - The default setting is 30 seconds. If you wish to have the Processor wait longer before processing documents, simply change the value in the field. Trumpet recommends that this value not be decreased below 30 seconds, to ensure that documents are fully written to the disk before processing.

Do not process documents older than - If you have older documents that you do not want Symphony OCR to process, enter the maximum age, in days, of the backlog documents that the software should process.

Automatically rotate pages to proper orientation -  If selected, the pages will rotate either landscape or portrait according to the text on the page.
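The two age settings above together define a window of documents that are eligible for processing. The following sketch shows one way to express that window. The function name is hypothetical, and using the file modified date for both limits is an assumption (this guide only states it explicitly for the Too Old list).

```python
from datetime import datetime, timedelta
from typing import Optional


def within_processing_window(
    modified: datetime,
    now: datetime,
    min_age_seconds: int = 30,
    max_age_days: Optional[int] = None,
) -> bool:
    """Return True if a document is old enough, but not too old, to process."""
    age = now - modified
    if age < timedelta(seconds=min_age_seconds):
        return False  # too young: the file may still be being written
    if max_age_days is not None and age > timedelta(days=max_age_days):
        return False  # older than the cut-off: belongs in the Too Old list
    return True


now = datetime(2024, 6, 1, 12, 0, 0)
print(within_processing_window(now - timedelta(seconds=10), now))
print(within_processing_window(now - timedelta(days=400), now, max_age_days=365))
print(within_processing_window(now - timedelta(minutes=5), now, max_age_days=365))
```

The first two calls report False (too young, then too old) and the third reports True.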


Original retention settings

Retain originals of processed files - If selected, Symphony OCR retains copies of the documents that it has processed. These copies appear as versions (if Symphony OCR processes a document 3 times, it will maintain copies of all 3 versions of the document). The user can restore previous versions of a document from the Symphony OCR backup using the document Details screen.

Purge originals of processed files after - The default setting is to retain the originals of processed files for 7 days after which they will be purged. If you wish to change this setting, you can change the value to the appropriate number of days for your firm.

Backlog throttling settings (only needed when your license does NOT have unlimited pages for processing)

Default processing capacity reserved for new documents (based on the actual number of new pages added each day) -  This is calculated from the number of pages that were added to the site in the past year.

Override the default processing capacity reserve - This sets the number of pages you would like to reserve for new documents, spreading the page count capacity evenly across the entire year. To determine a reasonable reserve, allow the Symphony OCR Analyzer module to run, then look at the timeline for the Processing Queue. Adding the number of pages in the first 52 weeks and dividing by 365 gives the average number of pages added to the system per day. Trumpet recommends adding an additional 10% to accommodate future growth or above-average filing. The result should be a reasonable value for the override.
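As a worked example of the reserve calculation: sum the pages for the first 52 weeks of the timeline, divide by 365 to get an average number of pages per day, then add 10%. The function name and the shape of the input (a list of weekly page counts read off the Processing Queue timeline) are assumptions made for this sketch.

```python
from typing import Sequence


def daily_reserve(weekly_page_counts: Sequence[int], growth: float = 0.10) -> float:
    """Average pages per day over the first 52 weeks, plus a growth margin."""
    first_year = weekly_page_counts[:52]   # only the first 52 weeks are used
    average_per_day = sum(first_year) / 365
    return average_per_day * (1 + growth)


# Hypothetical site that files 730 pages every week:
# 730 * 52 = 37,960 pages, / 365 = 104 pages per day, + 10% = about 114.4.
print(round(daily_reserve([730] * 52), 1))
```

Rounding the result up to a whole number of pages before entering it is a reasonable choice, since the goal is a small safety margin rather than an exact figure.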

Advanced Settings:

Create thumbnails (if not already present) - Checking this checkbox will create thumbnails if they are not already present.

Enable OCR debug logging - This enables debug logging, which helps our support team address issues if necessary.  To conserve disk space, we recommend not enabling this unless requested by our support team.

Limit parallel processing to X documents - This allows you to limit the number of cores that Symphony OCR will utilize. Symphony OCR uses 1 core per document. For example, if you enter 3, Symphony OCR will use only 3 cores and will process 3 documents simultaneously. See: How does Symphony OCR impact the performance of the server or indexer PC
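One way to picture how this limit relates to the "Machine Processors" and "Licensed parallel processing" values shown on the Processor page is to take the smallest of the three. This interaction is an assumption made for illustration; the guide does not spell out the exact rule, and the function name is hypothetical.

```python
from typing import Optional


def effective_parallelism(
    machine_processors: int,
    licensed_parallel: int,
    user_limit: Optional[int] = None,
) -> int:
    """Number of documents processed at once (1 core per document)."""
    # Assumption: neither the hardware nor the license can be exceeded.
    workers = min(machine_processors, licensed_parallel)
    if user_limit is not None:
        workers = min(workers, user_limit)
    return max(1, workers)  # always process at least one document


# 8 logical processors, license allows 4, administrator enters 3.
print(effective_parallelism(8, 4, user_limit=3))
```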


Upon making changes to any of the above settings, select "Save Changes".

...

15. Administrator Guide

15.1. Symphony OCR - Basic Functionality

Opening and Closing Symphony

To access Symphony OCR directly, log on to the workstation where it is installed.

  • To open Symphony OCR:
    • If the web browser is closed, but Symphony OCR is still open, the user interface can be accessed by right-clicking on the Symphony OCR icon in the system tray and choosing "Show Browser Window"
    • If Symphony OCR is not running, use the desktop shortcut "Symphony OCR" to launch Symphony OCR
  • To close Symphony OCR (Run as a logged in User):
    • Right-click on the Symphony OCR icon in the system tray and select "Quit"
    • Or, select "Quit" from the bottom left corner of the web browser window
  • To close Symphony OCR (Run as a Windows Service):
    • Navigate to the Control Panel -> Administrative Tools -> Services
    • Select Symphony OCR and click the "Stop" link

You can also access Symphony OCR from your own workstation; see:  Accessing Symphony OCR

 

Refreshing Data

Since Symphony OCR uses a web interface, the display may not automatically refresh as it performs its work. You can manually refresh by selecting the "Refresh" button in Symphony OCR or in the web browser. The Symphony OCR summary page refreshes automatically every 60 seconds. All other pages require a manual refresh.

...

15.2. Accessing Symphony OCR

Symphony OCR can be accessed from the web browser of any workstation connected to the network by typing in the address found in the web browser in which Symphony OCR runs:


If you would prefer that end users see only the Summary View of Symphony OCR, do the following:

In the main Symphony OCR page, select "Simple View"


This will open the Summary Screen without the additional navigation panel:

Copy and paste the URL from here to provide to those users.


...

15.3. Summary Page

The Symphony OCR Summary page is the "Dashboard" for Symphony OCR. It allows users to view and manage the system condition of Symphony OCR, including current and historical progress.  The most common features of the dashboard are highlighted below.



  1. Graph - This is a graphical display to show the number of pages that have been processed, the number of pages pending processing and a timeline to show when those documents were originally saved to the document repository.
  2. License Info - Tells you the remaining processing capacity (the number of pages available in your license and when it renews), how many new pages were added to the document repository in the past year and the recommended processing capacity based on historical analysis.
  3. Statistics - Tells you how many documents are pending analysis and how many documents are pending OCR. The section "Current OCR throughput" tells you how long it takes Symphony OCR to process a page and does some calculations to determine the estimated time to OCR the backlog. Finally, this section of the Summary page tells you when the last document was OCRed.
  4. Document Summary - Provides you with a dashboard that includes the ability to view documents in each of the document lists.  For more on document lists, visit Workflow, Tools & Document Lists
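
The backlog estimate in the Statistics area can be approximated by hand. The sketch below assumes the simplest possible calculation (pending pages multiplied by the effective time per page); the actual calculation used by Symphony OCR is not documented here, and the function name is hypothetical.

```python
def estimated_backlog_hours(pending_pages: int, seconds_per_page: float) -> float:
    """Rough time, in hours, to OCR the backlog at the current throughput."""
    return pending_pages * seconds_per_page / 3600


# Hypothetical example: 90,000 pending pages at an effective 2 seconds per page
# is 180,000 seconds, or 50 hours.
print(estimated_backlog_hours(90_000, 2.0))
```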
...

15.4. Deleted

Deleted:  Documents in the Deleted list are those whose document record is in the process of being purged from the database (documents should only be in this state for a very short period of time).

...

15.5. Lookup Document

Checking the Details of a Document

There are two methods for looking up the details of a particular document:

Lookup By Path - Enter the full path of the document and click "Query" (See also:  Checking the Status of a Document).

Document Lists - Simply select the document in any of the document lists and this will open a details page.

See also our YouTube Videos Here:

Look up a document - NetDocuments
Look up Document - Worldox
Look up a document - Windows Folder Tree

Interpreting the Details of a Document

Once on the details page, the user can perform these functions:

Refresh - Provides the most current details of a document.

View - Opens the file.

Delete detail - Deletes the details for the document.

Re-Analyze - Re-analyzes the document (if it has been unable to be processed) and attempts to process the document again.

Purge Backups - Deletes any copies of the file that Symphony has saved.

There are also various bits of data or history showing what's been found on the file, and what's been done to it. For example, the "History" section shows all of the events logged for that file.

Note: If you delete the details of this document, it will delete this history and start from scratch.

The "Page Analysis Details (before processing)" indicates how many words per page were found within the file BEFORE Symphony OCRed it. Visible words are computer-readable words (like digital headers or footers, or text generated by Word, etc). Hidden (aka invisible) words would be words applied by something like Symphony OCR. Note that these numbers are PRE-processing and they do not update after Symphony OCRs the file.


...

15.6. System Condition

Symphony OCR is a back-end processing engine. This means that very little user interaction is required, but as an administrator, you may wish to check on the status of the software.

For additional suggestions on monitoring Symphony OCR, visit Ongoing Care & Feeding.

Symphony OCR has three system conditions:

  • OK (green) - System is running with no errors or warnings
  • Warn (orange) - System has warnings.  Possible causes could be:
    • There are documents in the "Needs Attention" list (Refer to the section, Not Processed Lists, for additional information)
    • The Analyzer or Processor is not running
    • There are configuration problems in the non-critical sub-systems, such as the Heartbeat system
  • Error (red) - System is not running.  Possible causes could be:
    • The Symphony OCR license has expired or is not valid
    • The Worldox Indexer application is not running
    • The Finder, Analyzer or Processor has errors
    • The annual page count limitation has been reached (please contact Trumpet, Inc. (support@trumpetinc.com) for information on increasing the page count license)
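
The way these causes roll up into a single system condition can be sketched as follows. The sketch covers only a subset of the causes listed above (it leaves out, for example, the Worldox Indexer and Heartbeat checks), and the function and parameter names are hypothetical.

```python
def system_condition(
    *,
    license_valid: bool,
    page_limit_reached: bool,
    component_errors: bool,
    needs_attention_count: int,
    analyzer_running: bool,
    processor_running: bool,
) -> str:
    """Return "Error", "Warn" or "OK" from a subset of the listed causes."""
    # Error causes take priority: the system is not running.
    if not license_valid or page_limit_reached or component_errors:
        return "Error"
    # Warning causes: still working, but something should be looked at.
    if needs_attention_count > 0 or not analyzer_running or not processor_running:
        return "Warn"
    return "OK"


print(system_condition(
    license_valid=True,
    page_limit_reached=False,
    component_errors=False,
    needs_attention_count=2,
    analyzer_running=True,
    processor_running=True,
))
```

In this example the two documents in the "Needs Attention" list are enough to produce "Warn".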

 

...

15.7. What will be processed

There are four different types of files pertinent to this discussion:

  • Image-only PDF - This is an image created by a scanner.
  • Rendered PDF - This is a PDF created by a computer (e.g. a Word document converted to PDF). It contains computer readable text by default, so it is fully text searchable as is without the need for OCR.
  • Hybrid PDF - This is a PDF that contains both images and rendered content or annotations. For example, if you scan a document and then use Adobe's markup tools to annotate the image, the result is a hybrid PDF.
  • Image + text PDF - This is a PDF that is created when the OCR engine 'reads' an image-only PDF and adds a layer of invisible, computer readable text to the original image. These files retain the exact original image, but also provide the ability to perform context sensitive search for text inside the PDF, as well as copying text to the Windows clipboard.

What Symphony OCR Will Process by Default

  • Symphony OCR will process image-only PDF files (and TIFF files if you choose to do so) and convert them to image + text PDF files.
  • Symphony OCR will process image-only pages within a hybrid PDF and convert them to an image + text PDF (but will not process the rendered pages as they are already text searchable).

What Symphony OCR Will Not Process

  • Symphony OCR will not process rendered PDFs, as these documents are already text searchable. Instead, it will place these files into the "Contains Text" or "Already OCRed" lists.
  • To ensure integrity of the original PDF content, Symphony OCR will not process a PDF if the PDF has been encrypted.
  • Symphony OCR will not recognize handwriting.
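
The rules above can be summarized as a simple lookup. This sketch is a paraphrase of the lists in this section, not the Analyzer's real logic: the function name and the short "kind" labels are hypothetical, and the split between the "Contains Text" and "Already OCRed" lists is an interpretation.

```python
def disposition(kind: str, encrypted: bool = False, process_tiff: bool = False) -> str:
    """Describe what happens to a file of the given kind (illustrative only)."""
    if kind == "tiff":
        # TIFF processing is optional and off unless enabled in the Processor.
        return "OCR and convert to PDF" if process_tiff else "Wrong Type list"
    if encrypted:
        return "Encrypted / Restricted list"
    if kind == "image-only":
        return "OCR all pages"
    if kind == "hybrid":
        return "OCR image-only pages only"
    if kind == "rendered":
        return "Contains Text list"
    if kind == "image+text":
        return "Already OCRed list"
    raise ValueError(f"unknown kind: {kind}")


for kind in ("image-only", "hybrid", "rendered", "image+text", "tiff"):
    print(kind, "->", disposition(kind))
```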
...

15.8. When will documents be processed

Worldox Document Repository

By default, Symphony OCR queries the Worldox document repository for newly saved and modified files every 15 minutes.  Generally speaking, newly saved files will be OCRed within about 15 minutes.  Depending on the volume of image-only documents already filed to Worldox, it may take a while for Symphony OCR to process the backlog (legacy files).  Symphony OCR gives precedence to newer files, so documents that are scanned today will be processed before the backlog.  

Refer to the section, Configuration Guide - Worldox - Finder, for further information on finder settings that determine when Symphony OCR locates files for processing.

Refer to the section, Configuration Guide - Worldox - Processor, for further information on configuration settings that determine which files are processed.

Note:  While Symphony OCR may process documents within about 15 minutes, you will need to wait until the text indexes are updated (typically overnight) in order to do full text in file searching for the documents using the Worldox document management system.

Folder Document Repository

By default, Symphony OCR queries the Folder document repository for newly saved and modified files every 120 minutes.  Generally speaking, newly saved files will be OCRed within about 120 minutes.  Depending on the volume of image-only documents already stored in the monitored folders, it may take a while for Symphony OCR to process the backlog (legacy files).  Symphony OCR gives precedence to newer files, so documents that are scanned today will be processed before the backlog.  

Refer to the section, Configuration Guide - Folder - Finder, for further information on finder settings that determine when Symphony OCR locates files for processing.

Refer to the section, Configuration Guide - Folder - Processor, for further information on configuration settings that determine which files are processed.

NetDocuments Repository

By default, Symphony OCR queries the NetDocuments repository for newly saved and modified files every 15 minutes.  Generally speaking, newly saved files will be OCRed within about 15 minutes. Symphony OCR can also optionally process the files already stored in NetDocuments.  By default, it performs a query for these files every 7 days.  Symphony OCR gives precedence to newer files, so documents that are scanned today will be processed before the legacy documents. 

Note: While Symphony OCR may process documents within about 15 minutes, NetDocuments may take up to 6-8 hours to update its text index. This means that if you run a NetDocuments search for words within a document that was recently OCRed, there may be a 6-8 hour delay. This is due to how NetDocuments prioritizes API activity.

Refer to the section, Configuration Guide - NetDocuments - Finder, for further information on finder settings that determine when Symphony OCR locates files for processing.

Refer to the section, Configuration Guide - NetDocuments - Processor, for further information on settings that determine which files are processed.
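
As a quick reference, the default search intervals for newly saved files in this section can be collected into a small table, which also gives a rough upper bound on when a new file should be found. The names below are hypothetical, the values are the defaults quoted above, and actual timing depends on your Scheduler and Finder settings and on the size of the backlog.

```python
from datetime import datetime, timedelta

# Default intervals (in minutes) for finding newly saved files, per repository.
NEW_DOCUMENT_QUERY_MINUTES = {
    "Worldox": 15,
    "Folder": 120,
    "NetDocuments": 15,
}


def latest_expected_pickup(saved_at: datetime, repository: str) -> datetime:
    """Latest time by which the Finder should have seen a newly saved file."""
    return saved_at + timedelta(minutes=NEW_DOCUMENT_QUERY_MINUTES[repository])


saved = datetime(2024, 1, 1, 9, 0)
for repository in NEW_DOCUMENT_QUERY_MINUTES:
    print(repository, latest_expected_pickup(saved, repository).strftime("%H:%M"))
```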

...

15.9. Symphony OCR Workflow, Tools & Document Lists

Common Workflow Diagram

Symphony OCR searches the document repository for documents to process. It then organizes those documents into one of several lists (these lists are available on the left side of the Symphony OCR web interface). The following diagram displays the tools, lists and explains how they interact:

Symphony OCR Tools

Symphony OCR consists of three main tools that interact to provide full OCR services:

Finder - locates documents in your document repository

Analyzer - determines if a given document is a candidate for OCR

Processor - performs the actual OCR

As documents flow through Symphony OCR, each of the above components works on the document, then places it in a particular Document List, as described in the next section.

Symphony OCR Document Lists

Backlog Lists

The backlog consists of documents that have not been analyzed or OCRed. These are documents that Symphony OCR is still working on.  The following document lists represent the backlog:

Analyzing - Documents waiting for the Analyzer to determine if they are candidates for OCR or not

In Process - Documents that are in the process of being analyzed

Processing - Documents that are candidates for OCR, but have not been processed yet

Reprocessing - Documents that had some recoverable problem during OCR, and will be processed again later.  Typical causes are that the document was open by a user, or was modified while OCR was taking place

Processed Lists

These lists represent documents that were successfully analyzed or OCRed.  They are documents that have either been OCRed or were already text searchable (and thus not in need of OCR):

Processed  - documents that have been successfully OCRed by Symphony OCR

In Process  - documents that are currently being OCRed

Already OCRed - documents that were already OCRed (by some other processor or by an earlier version of Symphony OCR)

Contains text - documents that are already text searchable (no OCR needed)

No image or text - some rare documents contain no text, but also contain no images. These generally do not need to be OCRed; however, you may choose to process them on a one-off basis.  See "How to Process No Image or Text Documents" for instructions

Email Messages - contains the list of email messages that contained attachments that were processed by Symphony OCR

Not Processed Lists

These lists represent documents that could not be processed for some reason. In most cases, an administrator will want to glance over these lists from time to time to ensure that there are no issues with the documents that didn't get processed:

Needs Attention:  Documents in the 'Needs Attention' list are those that appear to be eligible for OCR, but encountered problems during processing.  Files in this list could be corrupted or contain invalid images (try opening them in an image viewer to be sure), or they may be images that Symphony OCR does not handle yet. 

Occasionally, a document can fall into the 'Needs Attention' list because of bad timing - Symphony OCR trying to process the document when it isn't fully available.  So we always recommend clicking the "Show Bulk Operations" button and then "Re-Analyze All", just to ensure this isn't the case.

If the document is corrupted, you can either remove the document from Worldox, or manually tell Symphony OCR to "ignore" it, which will put it on the 'Ignored' list.  If the 'Needs Attention' list contains any documents, the overall system condition will show as "Warn."  Ignoring a document that you have already checked is a good way to change the system condition back to "OK". 

If the document does not appear corrupted, the next step would be to allow us to see a copy of the file.  Because PDFs can be generated in countless different ways, we occasionally run into a specific sub-type of PDF that we've not encountered before.  If we can get a copy of the file that is falling into the 'Needs Attention' list, we can in almost all cases, add support for the file.  Please contact us at support@trumpetinc.com for instructions to upload documents to our secure site.


New:   Documents in the New list are those that have been found by the Finder tool, but not yet allocated to another document list (documents are only in the New state for a very short period of time).

Deleted:  Documents in the Deleted list are those whose document record is in the process of being purged from the database (documents should only be in this state for a very short period of time).

Too Old:   Documents in the Too Old list are those that have a file modified date older than the cut-off age defined in the Processor configuration.

Inaccessible:   Documents in the Inaccessible list are those that could not be processed because of file system security, Worldox security, read-only attributes or other conditions that prevent the document from being accessed and worked on.   In addition, if the profile group in which the documents reside contains an invalid base path (containing a space, for example), or if the file has a space immediately prior to the document extension, the documents will be shown in the Inaccessible list.

Corrupted Documents - Documents in the corrupted list are those that Symphony OCR does not recognize as valid files. The most common reason is that the file is an invalid or corrupted PDF (try opening in Adobe to be sure).  Another possibility is that there is some characteristic of the PDF that the Symphony OCR parsing algorithm isn't handling properly.  Trumpet does periodically update the PDF parsing algorithms to address corner cases that have not been encountered before.

What to do?

Try opening the file in Acrobat, then hit Save (Acrobat will try to open and auto-repair corrupted files - when you save the document, it will save uncorrupted).  After saving and closing the document, click the Re-Analyze button on the document record in Symphony OCR.  This will only work if the file is only lightly corrupted, but is worth a shot.

If that doesn't help, next check to see if the file is already text searchable (i.e. can you search for text inside the PDF already?).  If you can, then the document isn't a candidate for OCR anyway, and you can just move the document to the Ignore list.

If the document does need to be OCRed, and the Adobe repair doesn't help, then you may want to submit the document to us for analysis.  Open a support ticket by emailing support@trumpetinc.com and we will send information on how to securely upload the document to us.  If we find a problem in our parsing algorithms, we'll fix the issue and get you a patch.

If there are a large number of files that have the same corruption reason, and the files don't appear to actually be corrupted, please open a support ticket by emailing support@trumpetinc.com and we will send information on how to securely upload a sample document to us.  If we find a problem in our parsing algorithms, we'll fix the issue and get you a patch.  Alternatively, you can use a bulk Ignore operation to move the documents to Ignore.

Encrypted / Restricted:   Documents in the Encrypted/Restricted list are those that are restricted from being processed because of some characteristic of the file itself (for example, an encrypted or partially restricted PDF file will not be processed).

Ignored:   Documents in the Ignored list are documents that a Symphony OCR administrator has explicitly told Symphony OCR not to process. Any document on this list was explicitly placed there by human intervention.

Wrong Type:   Documents in the Wrong Type list are TIFF documents that were found while TIFF processing is not enabled.

Moved / Unavailable:   Documents in the Moved / Unavailable list are no longer available in the Document Management System (DMS).   This could mean that the DMS has gone "offline" or that the DMS settings have been adjusted so that the documents would no longer be found for processing (e.g., if a user selects a profile group to analyze and OCR, and then later un-checks that profile group or chooses to no longer process it).   Documents can also appear in the Moved / Unavailable list if they are no longer at their recorded location.   Document records in the Moved / Unavailable list will be deleted from the database after 15 days.

Digitally Signed:   Documents that are digitally signed will not be processed by Symphony OCR because adding OCR information to these documents would invalidate the digital signature.   If you wish to have these documents OCRed anyway (and are OK with invalidating the digital signature), please send an email to support@trumpetinc.com and request that functionality be added.

Too Big (8.0.0 and higher)

If a document falls into this list, it does NOT mean the document contains too many pages.  Symphony OCR processes files one page at a time, so if a document falls into this list, it means the document contains one or more pages with pixel dimensions larger than a specified value.  In this version of Symphony OCR that value is 32512 x 32512 pixels.

This is a hard limit and cannot be overridden.
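To estimate in advance whether a scanned page would exceed this limit, a check like the following can help. It is a minimal Python sketch under stated assumptions: the function names are hypothetical, and it assumes the limit applies to each dimension independently (a page is too big if either its width or its height exceeds 32512 pixels), which the text above does not spell out.

```python
MAX_PAGE_PIXELS = 32512  # limit quoted above for version 8.0.0 and higher


def pixels_from_inches(inches: float, dpi: int) -> int:
    """Convert a physical page dimension to pixels at a given scan resolution."""
    return round(inches * dpi)


def page_is_too_big(width_px: int, height_px: int, limit: int = MAX_PAGE_PIXELS) -> bool:
    """Assumption: exceeding the limit in either dimension is enough."""
    return width_px > limit or height_px > limit
```

Under that assumption, a 36 x 48 inch drawing scanned at 600 dpi (21600 x 28800 pixels) fits, while the same sheet at 1200 dpi (43200 x 57600 pixels) does not.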

Too Big (Prior to 8.0.0)

If a document falls into this list, it does NOT mean the file as a whole is too large.  Symphony OCR processes files one page at a time, so if a document falls into this list, it means the document contains one or more pages with pixel dimensions larger than a specified value (i.e. the page couldn't be loaded into memory).  We usually see this in documents like blueprints or schematic drawings.  There are, however, some things we can do to try to get these types of documents processed if you find that they need to be.

Clicking on the document in the 'Too Big' list will tell you the size of the offending page. 

1) Click on the 'Too big' list.

2) Click on the individual document in question.

3) The offending size of the document is available in the document details.



If you find you have a series of the same type of documents, it's usually the case that pages of the same size are exceeding the limit.  You can attempt to process these documents by modifying the value(s) declared in the settings.xml file.  (Defaults differ depending on the version you're running.)

See Manipulating Document Lists for more information on how to manage these lists

...

15.10. Manipulating Document Lists

You are able to manipulate the document records for documents in the various document lists.   You may wish to manipulate the records in bulk or on a per document basis.    

Control

Refresh - refreshes the current list

Export as CSV - exports the current list to a .csv file

View Timeline
- provides you with a timeline of documents processed.   See Document Timelines for more information

Show Bulk Operations / Hide Bulk Operations
- this is a 'toggle' button which will show or hide the bulk operations sections

Filter
- filters the list based on criteria you enter.   In the "Filter" field, enter the criteria you would like to use to filter the list, and then select "Filter" (e.g. X:\Docvault\Client\).

Bulk Operations - Bulk operations will be applied to the filtered (or unfiltered) list seen below

Reanalyze All - will reanalyze the documents in the list

Delete All
- will delete the Symphony OCR database record for the documents in the list

Ignore - will place all the documents in the Ignore list

Adjust Priority to
- will allow you to set the priority for all documents in the list.   By default, Symphony OCR will analyze/process the most recent documents first, and then work backwards to process other documents.   See Processing Priorities for more information.

Single Document Control - the following apply to the documents in the list and not bulk operations

Document Details - to see the details regarding the document (e.g. document history, visible words, hidden words, etc) select the document from the list

Reanalyze Document - will place the document in the Analyzing list for re-analysis

Ignore - will place the document in the Ignore list

Open
- opens the document

...

15.11. Processing Priorities

Some firms may wish to decide which documents should be OCRed and which should not.   Symphony OCR has always provided firms with the option to process only documents in particular Worldox profile groups, but in version 6.0 and higher, we have added finer-grained control.

As you may already know, your Symphony license allows you to OCR a certain number of pages each year.   Therefore, the ability to control processing priority allows firms to ensure that their highest priority documents are processed first, saving lower priority documents for times when excess page count is available.

Priority Levels

There are five levels of processing priority in Symphony OCR:

Very High - these documents are always processed first, and will be processed even if the system is throttling based on page count needs
High - these documents will be processed after those with the priority level of "Very High," but will still be processed even if the system is throttling based on page count needs
Normal - this is the default priority for documents
Low - these documents will not be processed if the system is throttling based on page count needs unless there is sufficient page count available
Very Low - this is the lowest priority, and these documents will not be processed if the system is throttling based on page count needs unless there is sufficient page count available and the "Low" priority documents have been processed

Note: Documents inside each of the priority levels will still get processed by age (most recent first).
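The ordering described above (priority level first, then most recent modified date within each level) can be illustrated with a short sort. This is a minimal Python sketch, not Symphony OCR's actual implementation; the `Doc` class and function name are hypothetical, and throttling based on page count is deliberately left out.

```python
from dataclasses import dataclass
from datetime import date

# Priority levels from highest to lowest, as listed above.
PRIORITY_ORDER = ["Very High", "High", "Normal", "Low", "Very Low"]


@dataclass
class Doc:
    path: str
    priority: str
    modified: date


def processing_order(docs):
    """Sort by priority level, then by modified date (most recent first)."""
    return sorted(
        docs,
        key=lambda d: (PRIORITY_ORDER.index(d.priority), -d.modified.toordinal()),
    )
```

With this ordering, an older "Very High" document is still placed ahead of a newer "Normal" one, and two "Normal" documents are ordered newest first.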

For further information on throttling, see:   How Backlog Throttling Works

There is a difference between the priority level of a given document and the default priority level that is assigned at the Worldox profile group or folder level.   For example, you may set the priority level for a particular profile group to be "High".   This will automatically set the priority level for any new documents saved to that location to "High".   However, documents that were found prior to the assignment will still be processed at the default level (Normal) unless you re-prioritize them.   In addition, you may opt to assign a particular document (or set of documents) in the profile group a "Low" priority level even though the default for newly saved documents is "High".

Use Cases

Setting Priority Levels for Profile Groups

One example of utilizing this particular tool in Symphony OCR is assigning the documents in a particular profile group with a Very Low priority.   For example, a firm may have a legacy store of documents and there is a Legacy profile group pointing to this legacy store.   While the firm would like to process the legacy store, it's much more important to the firm to process the documents in the current "live" Worldox document repository before processing these.   Therefore, you can set the Legacy profile group to have a Very Low or Low priority level.

For further information on configuring profile groups to process by priority levels see:   Configuring Worldox

Setting Priority Levels for Monitored Folders

One example of utilizing this particular tool in Symphony OCR is assigning the documents in a particular monitored folder a Very High priority.   For example, a firm may have a set of documents that it needs to process ahead of others.   While the firm would still like to process its other documents, they are much less important than this set.   Therefore, you can create a separate monitored folder for these documents and assign it a Very High (or High) priority.

For further information on configuring monitored folders to process by priority levels see:   Configuring Monitored Folders

Setting Priority Levels for Documents

Another example may be that a particular firm has a need to process a certain set of documents very quickly.   For example, perhaps the firm has an impending court case and needs these documents OCRed urgently.   The default priority level for the location in which these documents reside may be set to "Normal", but the firm would like to process these particular files right away.   In this instance, you can find this set of documents by filtering the Processing list.   Once the documents have been found in the Processing list, you can reassign their priority to "Very High", for example.   This will ensure that they are processed before other documents.   These files may be in the same profile group or they may be in different profile groups.

For further information on re-prioritizing documents in existing lists see:   Manipulating Document Lists and How to Adjust Processing Priorities

...

15.12. Checking Status of a Document

From the Symphony OCR Interface

Use the Document Info tool to enter the full path of the document or the document id and click "Query."

Using Worldox's Audit Trail

You can also determine whether or not a document has been processed by checking the Audit Trail on the document.     Simply display the AppName column in Worldox:

For more information on how to display Audit trail information for a particular document, see Audit a Single Document's Events

...

15.13. Checking the Details of a Document

Checking the Details of a Document

There are two methods for looking up the details of a particular document:

Lookup By Path - Enter the full path of the document and click "Query" (See also:  Checking the Status of a Document).

Document Lists
 - Simply select the document in any of the document lists and this will open a details page.

See also our YouTube Videos Here:

Look up a document - NetDocuments
Look up Document- Worldox

Look up a document - Windows Folder Tree

Interpreting the Details of a Document

Once on the details page, the user can perform these functions:

Refresh - Provides the most current details of a document.

View
 - Opens the file.

Delete detail
 - Deletes the details for the document.

Re-Analyze
 - Re-analyzes the document (if it has been unable to be processed) and attempts to process the document again.

Purge Backups
 - Deletes any copies of the file that Symphony has saved.

There are also various bits of data or history showing what's been found on the file, and what's been done to it. For example, the "History" section shows all of the events logged for that file.

Note: If you delete the details of this document, it will delete this history and start from scratch. The "Page Analysis Details (before processing)" indicates how many words per page were found within the file BEFORE Symphony OCRed it. Visible words are computer-readable words (like digital headers or footers, or text generated by Word, etc). Hidden (aka invisible) words would be words applied by something like Symphony OCR. Note that these numbers are PRE-processing and they do not update after Symphony OCRs the file.

...

15.14. Status Notification Emails

Depending on how Symphony OCR Notifications are configured, Status Notifications may be sent to you nightly, when there are errors, or when there are warnings.  See Notifications for how to set this up / edit the notification frequency.

The email notification will look and feel very similar to the Summary Page but without the large graph.

You can utilize the buttons in the Notifications to manage Symphony OCR providing that you have network connectivity to the Symphony OCR servers.

If you're not on the same network as Symphony OCR then you can't use the buttons, but the data presented can still give you a quick glance at its progress.

System Statistics tells you how many files are in the Analyzer or OCRing backlog, and how long it estimates it will take to complete those backlogs.

Document Lists give you the itemized numbers of documents found that were Processed or Not Processed. Read the article titled "Symphony OCR Workflow, Tools & Document Lists" for more information on those lists.

...

15.15. Ongoing Care and Feeding

Symphony OCR is a back-end process that requires very little ongoing maintenance. However, as an administrator, there are a few tasks that should be performed on a monthly basis to ensure that everything continues to run smoothly. This will take maybe 5 minutes per month, and will help to detect and correct any systemic issues before they become problems:

  • Ensure that the system condition is Green (and either fix or ignore documents that are in the Needs Attention list)
  • Check the Not Processed documents lists for anything that looks unusual
  • Check the Summary page and ensure that the available page count is not getting low (page count resets automatically each year)
  • Every few months use the "Check for Updates" link in the top right corner of the interface to apply software updates. See here for how to do that: Update Symphony OCR
...

15.16. Symphony OCR Trials: How Backlog Throttling Works (and how to configure Backlog Overclocking)

If you have a Symphony OCR Trial then this applies to you!

Backlog throttling is an advanced feature in Symphony OCR that ensures that your system will have processing capacity to use on new documents as they are added to your document repository.  If that capacity does not get used in a given day, then the backlog will be processed using that unused capacity.  This only applies for folks that have a limited page count (if you have a Symphony OCR Trial license this applies to you).  If you have an unlimited page count with Symphony OCR, then this won't be necessary.

Symphony OCR considers any document with a modified date older than 5 days to be part of the backlog.  Any document with a modified date within the last 5 days will be processed regardless of any throttling.
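The 5-day rule above amounts to a simple age comparison. The following Python sketch illustrates it; the function name is hypothetical, and how Symphony OCR treats a document that is exactly 5 days old is an assumption here (it is treated as not yet part of the backlog).

```python
from datetime import datetime, timedelta

BACKLOG_AGE = timedelta(days=5)  # threshold quoted above


def is_backlog(modified: datetime, now: datetime) -> bool:
    """Return True if the document's modified date is older than 5 days."""
    return now - modified > BACKLOG_AGE
```

A document modified 3 days ago is processed regardless of throttling, while one modified 10 days ago is subject to it.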

If you wish your backlog to be processed more quickly than the throttling allows, you have two choices:

  1. Increase your processing capacity (contact your Symphony Channel Partner for pricing)
  2. Adjust the backlog throttling algorithm parameters by enabling "overclocking"

Of these, increasing the processing capacity is almost always the correct answer.  If you enable overclocking, you could wind up with your entire backlog being processed, but not being able to process any new documents added to your document repository.

If you are certain that adjusting the backlog throttling algorithm is what you need, here's how:

  1. The settings that control backlog overclocking are found under the Processor configuration screen
  2. Enable backlog overclocking with the checkbox
  3. Set the number of new pages per day you expect your firm to be adding - these will be held in reserve, until the day is passed.  Reserved capacity that is not used in a given day is immediately released for processing of the backlog. A value of 0 would mean that Symphony is NOT holding back, and it will process your backlog without reservation and use up your allotted pages quickly.

Important:  Enabling overclocking can result in Symphony OCR having insufficient capacity to process your newly scanned documents - this is almost always not what you want to do if you would like to process documents for longer term (like the full 30 day duration of your trial license).  Contact your Trumpet sales rep when you're ready to buy the full version with unlimited page processing!

...

15.17. Changing the Retain Originals (Backup) Location

Background

By default, Symphony OCR will retain originals of documents it processes for between 7 and 14 days.   The retention period can be configured in the Processor Settings screen.

Symphony OCR stores these originals in the Work\Backups folder beneath the Symphony OCR installation directory (normally on the C drive of the Symphony OCR workstation).   If you wish to move the backup directory to a different volume, here is how:

Important: While you can technically change this storage location to a network drive, we strongly recommend against it.

Procedure

  1. Close Symphony OCR
  2. Copy the work\backupfiles folder to the new location (this could take a while)
  3. Locate the Config\Settings.xml file, and open it in a text editor
  4. Locate the backupFileRoot= entry and change it to point at the new location.   This value should always point at the backupfiles directory.
  5. Save settings.xml
  6. Launch Symphony OCR
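
Steps 2 through 5 of the procedure can also be scripted. The following is a minimal Python sketch, not a Trumpet-supplied tool: the function name is hypothetical, and it assumes Settings.xml is UTF-8 text in which backupFileRoot appears exactly once as a double-quoted attribute. It keeps a copy of the settings file before changing it. Symphony OCR should be closed before running anything like this and launched again afterwards.

```python
import re
import shutil
from pathlib import Path


def move_backup_root(settings_path: Path, old_root: Path, new_root: Path) -> None:
    """Copy the backupfiles folder and point backupFileRoot at the new location."""
    # Step 2: copy the existing backups (new_root must not already exist).
    shutil.copytree(old_root, new_root)

    # Keep a safety copy of the settings file, e.g. Settings.xml.bak.
    shutil.copy2(settings_path, settings_path.with_suffix(".xml.bak"))

    # Steps 3-5: rewrite the backupFileRoot="..." value and save.
    text = settings_path.read_text(encoding="utf-8")
    updated, count = re.subn(
        r'backupFileRoot="[^"]*"',
        lambda m: f'backupFileRoot="{new_root}"',  # lambda avoids backslash escapes
        text,
    )
    if count != 1:
        raise ValueError(f"expected exactly one backupFileRoot entry, found {count}")
    settings_path.write_text(updated, encoding="utf-8")
```

The sketch refuses to write anything if the attribute is missing or appears more than once, so an unexpected file layout leaves the original settings untouched.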
...

15.18. Roll Back to Original Version

From time to time you may need to roll back to the original of a particular document (the document that has not been OCR'd).  This functionality has been available for some time but has been exposed in the user interface in builds of Symphony OCR that are version 6.4.44 and higher.

In order to "roll back" a file that has been OCR'd to its original version:

  • Open the Symphony OCR interface (see Accessing Symphony OCR)
  • Perform a "Look up" or "Query" for the document using the "Lookup Document" link in the navigation panel on the left hand side of the Symphony OCR Window (see Checking the Status of a Document)
  • In the "Details" area of the Document Status window, select the link "Roll Back"

  • This will restore the original version to your document repository so that you can retrieve it.  In addition, Symphony OCR places the document in the Backlog's "Reprocessing" list to be reprocessed at a later date (Symphony OCR waits until the Finder locates the document again, and then reprocesses it).

 Note:  If you do not want the document to be reprocessed, you can move it to the "Ignore" list.  This will ensure that Symphony OCR does not re-OCR the document.

...

15.19. Lookup Document to Force processing in NetDocuments

Users may need a document found and processed prior to Symphony OCR finding the document in NetDocuments due to a lengthy backlog.  To do so in NetDocuments, you can perform the following steps:

  • Find the DOC ID in NetDocuments
  • In Symphony OCR, choose "Lookup Document" in the navigation panel
  • Enter the Document ID in the Document Info path/id field:
  • Select Query

This will search NetDocuments for the document and add a document record so that Symphony OCR can begin its process of OCR'ing the file.

...

15.20. Anti-Virus Exclusions

Recently, we have seen instances where certain anti-virus software has been interfering with SymphonyOCR processing. 

One clearly identified example is WebRoot blocking access to one of the working folders ("C:\Program Files (x86)\Trumpet\SymphonyOCR\work\processor1"), or producing the error "Unable to create working folder C:\Program Files (x86)\Trumpet\SymphonyOCR\work\msg1".  This is a new development, and a curious one, because this portion of SymphonyOCR has not been changed in years, so we're not sure why it is suddenly a problem for WebRoot.  We are nonetheless working with them to help resolve this issue.

Until we are able to work out a resolution with WebRoot, we recommend starting with adding exclusions for all executables in the SymphonyOCR folder and all subfolders ("C:\Program Files (x86)\Trumpet\SymphonyOCR").  There are no risks for adding exclusions for our executables.

NOTE:  The names of the executables can change slightly between versions.  So this will need to be checked and possibly updated with any update to SymphonyOCR.

If the anti-virus software is not monitoring executables, but rather doing some type of file level access monitoring, it may be necessary to add a full exclusion to the "C:\Program Files (x86)\Trumpet" folder and all sub-folders.  We will update this when and if we are able to get further information, or if we find any other anti-virus software causing similar issues.
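
Because the executable names can change between versions, it can help to list them again after each update before adjusting exclusions. The following is a minimal Python sketch; the function name is hypothetical, and the install path in the usage note is the one quoted above.

```python
from pathlib import Path


def list_executables(install_dir: str) -> list[str]:
    """Return every .exe file under install_dir, including all subfolders."""
    root = Path(install_dir)
    return sorted(
        str(p) for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() == ".exe"
    )
```

Calling `list_executables(r"C:\Program Files (x86)\Trumpet\SymphonyOCR")` on the Symphony OCR workstation gives the list of files to compare against the current exclusions.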



...

15.21. Symphony OCR use of other Servers

Symphony OCR communicates with outside servers for the following purposes:

  • Cloud-based DMS Systems (e.g. NetDocuments, SharePoint) for the purposes of finding, analyzing and processing documents
  • Downloading parts of the system during installations and updates including the Symphony Engine and the necessary 32-bit version of Java Runtime
  • Sending nightly notification emails
  • Checking for license updates
  • Checking for software updates
  • Providing you with links to user documentation
  • Sending heartbeat information.  Heartbeats provide a small amount of detail that allow our support team to monitor the health of your Symphony OCR software in order to take proactive actions if there are issues.  The heartbeats do not include reference to any specific document, or the contents of the document.  Here's an example of a heartbeat:

    Status: OK
    Service: false
    User: admin.nd
    AvailableCPUs: 6
    ParallelDocuments: 4
    ParallelPages: 1
    Host: XXXXXXX.LOCAL
    Pages left: 12344
    OCR backlog: 3158221 pgs
    Usage: {}
    AnalyzerUsage: {6632=[6632,989948,9715768]}
    ND preserve update info: true
    ND versioning enabled: Do not create versions
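
To show how little structure a heartbeat carries, the sample above can be read into simple key/value pairs. This is a minimal Python sketch that assumes the heartbeat is plain "Name: value" text exactly as shown; the actual transmission format is not documented here, and the function name is hypothetical.

```python
def parse_heartbeat(text: str) -> dict[str, str]:
    """Split each 'Name: value' line of a heartbeat sample into a dictionary."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields
```

Applied to the sample, this yields entries such as `"Status"` mapped to `"OK"` and `"Pages left"` mapped to `"12344"`; none of the fields names a document or its contents.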

If you require the list of servers to configure your firewall and ensure that Symphony OCR will work nicely on your network, please email us at support@trumpetinc.com requesting the confidential document entitled: 

External Servers Used by Symphony OCR (134350)

...

15.22. Uninstalling Symphony OCR

If you need to remove Symphony OCR, follow the steps below. Note: if you're looking to migrate Symphony OCR to another workstation, the instructions to do so can be found here: Migrating Symphony OCR to a new workstation


Open the Control Panel and navigate to Programs and Features.


From within the Programs and Features list either find or search for 'Symphony OCR (remove only)' and double click to launch the uninstaller.


When the uninstaller launches, simply click 'Uninstall', wait for it to complete and then hit 'Finish'.

  



Symphony OCR will then be uninstalled.


...

16. Enabling Functionality in Symphony OCR

16.1. Enabling MICR Processing

MICR processing can be enabled by editing the ocrHandlerProvider configuration in settings.xml by adding enableMicrProcessing="true"

Once MICR processing is enabled, that will be indicated in the Advanced Settings section of the Processor Config screen.
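
The edit described above can be sketched in Python as follows. This is an illustration only: it assumes ocrHandlerProvider is an XML element in settings.xml (the text above only says "configuration"), the function name is hypothetical, and re-serializing with ElementTree may change whitespace or drop comments, so editing by hand in a text editor remains the safer route.

```python
import xml.etree.ElementTree as ET


def enable_micr(settings_xml: str) -> str:
    """Return the settings XML with enableMicrProcessing="true" set."""
    root = ET.fromstring(settings_xml)
    # Assumption: the element is named ocrHandlerProvider and appears once.
    node = root if root.tag == "ocrHandlerProvider" else root.find(".//ocrHandlerProvider")
    if node is None:
        raise ValueError("no ocrHandlerProvider element found")
    node.set("enableMicrProcessing", "true")
    return ET.tostring(root, encoding="unicode")
```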




...

16.2. Enable the Processing of Email Attachments

To enable processing of Email (.msg) message attachments:

  • Open the Symphony OCR interface (Accessing Symphony OCR)
  • In the Navigation panel on the left side, select "Processor"
  • In the Basic Settings section, select the "Process MSG (email) attachments" check box:

  • Select "Save Changes"

Notes: 

- 32-bit Outlook will need to be installed and launched once on the workstation where SOCR is installed; however, you don't have to configure an actual email address. If this has not been done, SOCR will provide an alert on your Summary page.

- Worldox Document Management System only:  The above enables the OCR of the email attachments, but in order to enable full indexing of email attachments in Worldox, you will need to enable the feature.  Here are the instructions:  How to Enable Text Indexing of Email Attachments

...

16.3. Enable Tiff Processing

To enable processing of TIFF (.tif or .tiff) documents:

  • In the Navigation panel on the left side, select "Processor"
  • In the Basic Settings section, select the "Process TIFFs (OCR and convert to PDF)" check box:

  • Select "Save Changes"

Note:  Enabling this feature will convert files with the extension .tif to .pdf so if you have any shortcuts that rely on the full file path of the document (including the file extension), you will need to update those.
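
When updating shortcuts, the new path can be derived from the old one. The following is a minimal Python sketch that assumes only the extension changes and the folder and base file name stay the same; it also treats .tiff like .tif, which the note above does not state explicitly. The function name is hypothetical.

```python
from pathlib import PureWindowsPath


def path_after_tiff_conversion(path: str) -> str:
    """Return the expected path of a TIFF file after conversion to PDF."""
    p = PureWindowsPath(path)
    if p.suffix.lower() in {".tif", ".tiff"}:
        return str(p.with_suffix(".pdf"))
    return path  # non-TIFF paths are returned unchanged
```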

...

16.4. Enable Read Only Processing in Worldox

To OCR files that are marked as Read-Only:

  • Open the Symphony OCR homepage
  • Select "Worldox" from the left sidebar
  • Under Advanced Settings, check "Process read-only files"

  • Select "Save Changes"

...

16.5. Enable Read Only Processing for Folders

  • Open the Symphony OCR homepage
  • Select "Folder" from the left side bar
  • Check the "Process read only files" checkbox
  • Select "Save Changes"

...

16.6. Disable Automatic Page Rotation

By default Symphony OCR will rotate processed pages so that the page orientation is correct when opening the files in the document repository.  If you prefer for Symphony OCR not to rotate pages, you can disable the functionality.

  • Open the Symphony OCR interface (Accessing Symphony OCR)
  • In the Navigation panel on the left side, select "Processor"
  • In the Basic Settings section, uncheck the "Automatically rotate pages to proper orientation" checkbox:


  • Select "Save Changes"
...

16.7. Disable Back Up of Original Files

By default, Symphony OCR will retain originals of documents it processes for between 7 and 14 days.    Here are the steps for disabling the back up:

  • From the Symphony OCR navigation list, select the "Processor" page
  • In the Original retention settings section, uncheck the "Retain originals of processed files" checkbox

  • Select "Save Changes"

 

 

...

16.8. Change the Retention Period for Processed Files

By default, Symphony OCR will retain originals of documents it processes for between 7 and 14 days.   You can adjust the retention settings to maintain the documents for a longer period of time.  To do so:

  • From the Symphony OCR navigation list, select the "Processor" page
  • In the Original retention settings section, update the field entitled "Purge originals of processed files after" to be the number of days you'd like to retain the originals

  • Select "Save Changes"

...

16.9. Change the Retain Originals (Back up) Location

Background

By default, Symphony OCR will retain originals of documents it processes for between 7 and 14 days.   The retention period can be configured in the Processor Settings screen.

Symphony OCR stores these originals in the Work\Backups folder beneath the Symphony OCR installation directory (normally on the C drive of the Symphony OCR workstation).   If you wish to move the backup directory to a different volume, here is how:

Important: While you can technically change this storage location to a network drive, we strongly recommend against it.

Procedure

  1. Close Symphony OCR
  2. Copy the work\backupfiles folder to the new location (this could take a while)
  3. Locate the Config\Settings.xml file, and open it in a text editor
  4. Locate the backupFileRoot= entry and change it to point at the new location.   This value should always point at the backupfiles directory.
  5. Save settings.xml
  6. Launch Symphony OCR

...

16.10. Allow Filtering based on modified date

Background

For sites that need to split processing between multiple installs of SOCR, the preferred method is to split by cabinet or folder (i.e. have one set of folders that one install of SOCR is responsible for, and another set of folders that another install is responsible for).   This isn't always possible, and another strategy is to use date filters to segment the processing.   The idea here is to allow configuration of a filter that completely blocks the instance of SOCR from finding the documents that it shouldn't process (they won't appear in the second Symphony OCR database at all).

Note:   DMSes that force the modified date to change (e.g. NetDocuments) will ultimately end up with the document being discovered by the other SOCR install.   The document will then get analyzed a second time and moved immediately to the Processing list.   This is not the end of the world, but it will involve unnecessary downloading of the file.

Resolution

Currently, this functionality cannot be configured through the UI.   In order to configure this option:

  • Update Symphony OCR to version 6.4.81 or higher (if necessary) on both workstations that will run Symphony OCR
  • Go to the license page and click "Save Settings" (this updates the settings.xml file to include the necessary entries)
  • Close Symphony OCR
  • Navigate to C:\Program Files\SymphonyOCR\Config
  • Open settings.xml in Notepad
  • Look for this entry:
    <finderHandler cutoffTimeHigh="NOCUTOFF" cutoffTimeLow="NOCUTOFF"/>

    You can change either or both of the NOCUTOFF values to a date.   The date is specified in the GMT time zone.

    For example:

              <finderHandler cutoffTimeHigh="12/31/2010" cutoffTimeLow="NOCUTOFF"/>

    will make this instance of SOCR see only documents with modified dates prior to 12/31/2010.

    You'd then configure the second instance of Symphony OCR like this:

              <finderHandler cutoffTimeHigh="NOCUTOFF" cutoffTimeLow="12/31/2010"/>

This will ensure that one instance of the software is only processing documents with a later modified date than the one specified, and the other instance is only processing documents with an earlier date than the one specified.
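
The effect of the two cutoff values can be illustrated with a small Python sketch. It is not Symphony OCR's own logic: the function names are hypothetical, the MM/DD/YYYY format is taken from the example above, and the handling of a document modified exactly on the cutoff date is an assumption (here it belongs to the "later" instance, so that no document is seen by both or by neither).

```python
from datetime import date, datetime

NOCUTOFF = "NOCUTOFF"


def parse_cutoff(value: str):
    """Return None for NOCUTOFF, otherwise the date in MM/DD/YYYY form."""
    if value == NOCUTOFF:
        return None
    return datetime.strptime(value, "%m/%d/%Y").date()


def instance_sees(modified: date, cutoff_low: str, cutoff_high: str) -> bool:
    """True if a document with this modified date falls inside the window."""
    low = parse_cutoff(cutoff_low)
    high = parse_cutoff(cutoff_high)
    if low is not None and modified < low:
        return False  # older than the low cutoff
    if high is not None and modified >= high:
        return False  # on or after the high cutoff (assumed exclusive)
    return True
```

With the two configurations shown above, a document modified on 06/15/2009 is seen only by the first instance, and one modified on 03/01/2015 only by the second.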

 

...

17. Advanced Reporting Features

17.1. Progress Details

The search summary provides a list of cabinets and the number of documents and pages within that cabinet (profile group). 

To access the Progress Details page, select "View Detailed Progress" in the appropriate DMS's configuration page.


From the Progress Details, you can evaluate the number of documents / pages per cabinet (profile group) and determine the percentage of completion within the cabinet (profile group).  This can assist you in understanding how many documents are eligible for OCR in a particular cabinet (profile group) and can lead to decision making regarding the priority in which you may want to process those documents.


Once you have accessed this page, you can select any of the cabinets (profile groups) to open the Processing Document List.  This will automatically filter the Processing Document List to contain only the documents in this particular cabinet (profile group).

See Manipulating Document Lists  for further information on how to manipulate the document records for these documents.

 

...

17.2. Document Timelines

Document Timelines give a week-by-week summary of the number of documents and pages in a given document list. The timelines are organized around the document's modified date, so they represent approximately when the document was added to the system. To view the timeline for a given document list, click into the list then click the "View Timeline" button at the top of the list.

Timelines can be useful for determining how quickly new documents are added to your document management system. For example, the timelines of the Processing and Processed document lists can show how many documents and pages that are eligible for OCR have been added to the system in the past 52 weeks. This will give an approximate rate of new documents per year.

To view the timeline for processed documents:

  1. Click on the Processed link on the left side of the screen.

  2. On the "Documents of Type Processed" screen, locate and select the "View Timeline" link.

  3. This will take you to the page for "The Timeline of Processed" documents.

This screen will show how many documents and pages were processed from week to week (cumulatively as well). The timeline can also be exported as a CSV or Image file by selecting the appropriate button (this will give you the full history as opposed to going back only 100 weeks).

...

18. FAQs

18.1. What Systems Does Symphony OCR Integrate With?

Symphony OCR can integrate with the following systems. The links will take you to the configuration instructions:

Configuration Guide - Worldox

Configuration Guide - NetDocuments

Configuration Guide - ShareFile

Configuration Guide - Open Text

Configuration Guide - Practice Master

Configuration Guide - LSSe64

Configuration Guide - Folders

Configuration Guide - Box

Configuration Guide - Dropbox

Configuration Guide - Microsoft One Drive

Configuration Guide - Google Drive

Configuration Guide - SharePoint

...

18.2. Does Symphony OCR support "fast web view" format?

If the file is "web optimized" before Symphony processes it, then it will continue to be "web optimized" afterward, so this is more a question of how your scanning is configured.  Trumpet's mandate is to completely preserve the original.  Preserving the original is much more important than optimizing an existing file; optimization is not worth the risk, especially when you consider that these processes run at high volume and unattended.

...

18.3. System Requirements

Symphony OCR System Requirements

Symphony OCR is typically installed to a single machine.

Symphony OCR Requires:

Windows 7 or later 
Windows Server 2003 or later
Physical or virtual machine
750 MHz or faster processor 
1 GB RAM
1 GB available disk space
100 Mbps or better network connection to your file server
 
The operating system must be 64-bit (starting in version 8.0)

For Worldox DMS integration, Symphony OCR is typically installed to the PC that also runs the Worldox indexer, but this is not required.

Note: The OCR process will automatically throttle itself if another process on the same computer needs to run.

Note: If you wish to OCR email attachments, then the 32-bit version of Outlook (2010 or newer) will need to be installed as well. This feature does not work with 64-bit Outlook.

Workstation considerations for maximum performance:

4 core CPU, highest clock speed available (up to 16 cores if you purchase a license that supports additional cores)
At least 4 GB RAM
Fast network connection between the PC and server (1 Gbps is recommended)
100 GB+ Disk space

 

Backups:

While definitely not required, some clients like to back up the Symphony OCR database, which resides in the 'C:\Program Files (x86)\Trumpet\SymphonyOCR\data' folder, on a nightly basis.  If the database is lost, the Symphony OCR machine can be recreated using the documents in your document management system.

...

18.4. How much does OCR increase file size?

OCR adds between 1 and 5% to the total size of the source file if the source file is scanned in black and white.  For grayscale or color images, the increase in size is less than 1%.

If that's not making sense to you then here's an explanation and a metaphor:

Grayscale and color images are larger (in bits) than black and white images. This means that a 5-page document scanned to PDF in color/grayscale has more bits than the same 5-page document scanned to PDF in black and white. Symphony OCR, however, applies the same layer of text to both documents, and that text adds the same number of bits to each of the two scans. So, the percentage of size Symphony OCR adds actually goes DOWN as the scan gets larger.

Here's the metaphor: picture those scans as a sink of water (black/white scan) versus a tub of water (Color/GS). Now add a rubber duck (SOCR text) to each. The space the duck takes up in each body of water has a different percentage in relation to that body of water. The duck's percentage of space added in the tub is LESS than it is in the sink.
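
The same point can be shown with a little arithmetic. The Python sketch below uses made-up file sizes purely for illustration (they are not measurements from Symphony OCR); the only thing it demonstrates is that a fixed-size text layer is a smaller percentage of a larger file.

```python
def ocr_overhead_percent(source_bytes: int, text_layer_bytes: int) -> float:
    """Percentage that the OCR text layer adds to the source file size."""
    return 100.0 * text_layer_bytes / source_bytes


# Hypothetical sizes, chosen only to illustrate the arithmetic.
TEXT_LAYER = 10_000      # the same text layer is added to both scans
BW_SCAN = 250_000        # black and white scan of a document
COLOR_SCAN = 2_500_000   # color/grayscale scan of the same pages

bw_percent = ocr_overhead_percent(BW_SCAN, TEXT_LAYER)
color_percent = ocr_overhead_percent(COLOR_SCAN, TEXT_LAYER)
print(f"black and white: {bw_percent:.1f}%")
print(f"color/grayscale: {color_percent:.1f}%")
```

With these assumed sizes the black and white scan grows by a few percent while the color scan grows by well under 1%, which matches the ranges quoted above.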

...

18.5. How accurate is Symphony OCR?

OCR accuracy is highly dependent on the quality of the scan.   But given the same scan, the engine that Symphony OCR uses (ABBYY) is widely recognized as the most accurate in the industry.

Here is an analysis of OCR accuracy performed by outside sources

...

18.6. Recommended Scan Settings

How should I configure my scanner for optimal use with Symphony OCR?

For best accuracy, we recommend scanning at:

  • 300 dpi for black and white images, or
  • 200 dpi for grayscale and color

Note: 200 dpi will still work for black and white, but accuracy will drop slightly.  Below 200 dpi, accuracy drops off sharply for smaller fonts, and is not recommended.  Above 300 dpi, accuracy does not measurably improve (and results in much larger files that are slower to process).

...

18.7. Is Symphony OCR PDF/A compliant?

Symphony OCR's content preservation system is PDF/A aware.   If the source document is PDF/A compliant, then the results after OCR will also be PDF/A compliant.

...

18.8. Does Symphony Suite Run on a Mac OS?

No, Symphony Suite is only compatible with the Windows operating system.

...

18.9. How does Symphony OCR impact the performance of the server or indexer PC?

How does Symphony OCR impact the performance of the PC/Server it is running on?

Symphony OCR is designed to automatically throttle down its CPU usage if another application on the computer needs the CPU, so the performance impact of Symphony OCR on the indexer PC is negligible.

Symphony OCR is designed to use as much of the computer's CPU capacity as is available (it will use up to 5 cores in full processing mode, 4 of which are used for the actual OCR operation), so you will see the CPU pegged at 100% utilization while there are documents to process, but this processing will not prevent other processes on the computer from running normally.

If you wish to limit the number of cores that Symphony OCR utilizes, you can do so in the Advanced Settings of the Processor.  See Processor for more information.

 

 

...

18.10. Configuring 3rd party monitoring software

If your firm uses a solution like Servers Alive to track the health of software and servers, you may want to have it monitor the Symphony OCR system status.

As of version 5.2.45, Symphony OCR adds a special status page that is easy for monitoring applications to parse:

http://your.symphony.workstation.name:14722/maestro/do/status

This will return a plain text response that looks like this:

Depending on the state of Symphony OCR, the status values could be one of:

  • Status: OK
  • Status: WARN
  • Status: ERROR

Tip:   To test a change in status, go to the Processor configuration screen and stop the processor - this will switch the system status from OK to WARN.   Be sure to re-start the Processor after you are done testing.
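
If your monitoring tool can run a script, the check can be sketched in Python as follows. This is a minimal illustration, not a supported integration: the function names are hypothetical, and because the exact layout of the response is not reproduced in this guide, the parser only assumes that the text contains a line beginning with "Status: " followed by one of the three values listed above.

```python
import re
import urllib.request

# Replace the host name with your own Symphony OCR workstation.
STATUS_URL = "http://your.symphony.workstation.name:14722/maestro/do/status"


def parse_status(body: str) -> str:
    """Return OK, WARN or ERROR from the status page text.

    Assumption: the response contains a line that starts with
    "Status: <value>". Anything else is reported as UNKNOWN.
    """
    match = re.search(r"^\s*Status:\s*(OK|WARN|ERROR)\b", body, flags=re.MULTILINE)
    return match.group(1) if match else "UNKNOWN"


def fetch_status(url: str = STATUS_URL, timeout: float = 10.0) -> str:
    """Download the status page and return the parsed status value."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        body = response.read().decode("utf-8", errors="replace")
    return parse_status(body)
```

Calling `fetch_status()` on a schedule and raising an alert for anything other than `OK` mirrors what a dedicated monitoring application does with this page.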

Finally, here is the configuration screen for Servers Alive to check status of Symphony - if you use a different monitoring application, adjust as you see fit:

...

18.11. How to Disable OCR in Adobe Acrobat

Background

Optical Character Recognition (OCR) converts images into characters for text searching.   Because OCR is time and resource intensive, performing OCR during scanning significantly reduces your efficiency. Symphony OCR performs the OCR task in a background process, allowing you to turn OCR off during scanning.   This procedure covers how to disable OCR when scanning using Adobe Acrobat.

Note that this setting is made on a per-user basis, so you'll have to do it for each user on a given workstation (you can thank Adobe for that - if anyone does figure out a way to get this turned off using a registry setting or anything like that, please let us know).

Procedure

Acrobat 9 & 10

  1. Select Create > PDF from Scanner > Configure Presets
  2. In the Configure Presets window, deselect the Make Searchable (Run OCR) checkbox
  3. Click OK
...

18.12. How to Process No Image or Text Documents

For the most part documents that do not contain an image or text don't need to be processed.  One example of this might be your company's PDF letterhead template.  By default these PDF documents will not be processed and will be placed in the "No Image or Text" Document List.  How can a PDF have no image and no text?

Content can be drawn in a PDF using one of three mechanisms:
  1. A bitmap image can be placed on the page. 
  2. Text can be rendered on the page using fonts. 
  3. Simple drawing operations (i.e. line or curve segments).
Scanned documents are always bitmaps drawn on the page - these documents are not text searchable without OCR.  If a document is created electronically (e.g. print to PDF from Word), the document will typically consist of text rendered using fonts - these documents are generally text searchable without OCR.  However, in some cases the font can't be embedded in the PDF (for font licensing or other reasons), and the text content is rendered using simple drawing operations instead.  When this happens, the documents are not text searchable without OCR.  

Certain tools like AutoCAD always use drawing operations to render all content (line segments in an architectural drawing, for example), including text.

How does Symphony OCR handle no-image/no-text documents?  It is not possible for Symphony to reliably differentiate between drawn line segments that are part of words and drawn line segments that are just lines that are part of an architectural drawing, table lines, etc...  As such, we mark these documents for special handling 'No Image/No Text', and allow the user (that's you!) to force OCR of a given document if desired.

If you would like Symphony OCR to process a specific document (or set of documents), you may force processing.  Here's how:

To process only a single document in the list:

  • Select the "No Image or Text" item from the navigation panel on the left
  • Click on the appropriate document in the document path to open the Details view
  • Click "Enable Processing"

To process all No Image or Text files on a "per document" level:

  • Select the "No Image or Text" item from the navigation panel on the left
  • Optionally use the Filter box to get a sub-set of the documents
  • Select "Show Bulk Operations"
  • Choose "Enable Processing"

To process all No Image or Text files permanently going forward:

  • Close Symphony OCR by using the Quit button in the bottom left corner of the interface (or, if Symphony OCR is installed as a service, stop the service)
  • Go to C:\Program Files (x86)\Trumpet\SymphonyOCR\config (may vary if you installed in a different location)
  • Right-click on the settings.xml file and select Open With > Notepad
  • Locate this line: <heuristicComputerProvider alwaysAnalyzeAndProcessNoImageNoTextPages="false"/>
  • Change the "false" to "true"
  • Save and close the file
  • Re-launch Symphony OCR (start the service if Symphony OCR is installed as a service)
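
The manual edit above amounts to a single text substitution. The Python sketch below is an illustration only: the function name is hypothetical, and it works on the text of settings.xml rather than re-serializing the XML, so everything else in the file is left exactly as it was.

```python
import re

ATTRIBUTE = "alwaysAnalyzeAndProcessNoImageNoTextPages"


def enable_no_image_no_text(settings_xml: str) -> str:
    """Return the settings text with the attribute switched to "true".

    Only the attribute value is rewritten; the rest of the text is untouched.
    Raises ValueError when the attribute is not present.
    """
    pattern = re.compile(rf'({ATTRIBUTE}\s*=\s*")(?:true|false)(")')
    updated, count = pattern.subn(r"\g<1>true\g<2>", settings_xml)
    if count == 0:
        raise ValueError(f"{ATTRIBUTE} not found in settings text")
    return updated
```

If you script the change, apply it only while Symphony OCR is closed (or the service is stopped), and keep a backup copy of settings.xml first.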

You can confirm this setting has been applied by viewing the Analyzer and Processor pages:


...

18.13. Can I process Magnetic Ink Characters (MIC) often found on checks?

Yes! If you have a lot of checks that you scan and save into your repository then you may want to ensure that checking account numbers are getting OCR'd. Symphony OCR supports MIC recognition (MICR), but a Trumpet technician must enable it for you. It's a quick process, just contact support@trumpetinc.com to schedule a call.

If MICR processing is enabled, the setting will be indicated in the Advanced Settings section of the Processor Config screen. If you do not see the setting then it has not been enabled.

NOTE: When enabled, only a single processor core will be used for OCR, which will decrease performance. So, this should only be enabled for sites that truly need MICR processing.

Trumpet Support:  see Page 1979

...

18.14. How to Enable the Processing of Email Attachments

To enable processing of Email (.msg) message attachments:

  • Open the Symphony OCR interface (Accessing Symphony OCR)
  • In the Navigation panel on the left side, select "Processor"
  • In the Basic Settings section, select the "Process MSG (email) attachments" check box:

  • Select "Save Changes"

Notes: 

- 32-bit Outlook will need to be installed and launched once on the workstation where SOCR is installed; however, you don't have to configure an actual email address. If this has not been done, SOCR will display an alert on your Summary page.

- Worldox Document Management System only:  The above enables the OCR of the email attachments, but in order to enable full indexing of email attachments in Worldox, you will need to enable the feature.  Here are the instructions:  How to Enable Text Indexing of Email Attachments

...

18.15. How to Enable Text Indexing of Email Attachments in Worldox

**This article is for Worldox integration only. If you have NetDocuments or another document management solution, refer to the provider for more information to ensure Email Text Indexing is available and enabled. **

Worldox Professional

If your firm wishes to text index email attachments, you must enable the setting. 

Here's how:

Note:  In order for the Indexer to index email attachments (and .msg files) the Indexer must have Outlook installed.

  • From the Windows Start menu, type the following (where "X" is the network location of Worldox):

    X:\Worldox\wdadmin.exe /ini

  • Hit Enter.  This will launch the administrative properties dialog

  • Select the WDIndex tab

  • Select the category "Common Options" and set the "Index Email Attachments" setting to "Yes"

  • After enabling the feature, perform an INIT on the text indexes. The Worldox Indexer performs INITs once a week by default; you can find the scheduled time by opening WDIndexer and checking the schedule

Worldox Cloud

If you wish to Index your email attachments, please notify Trumpet by sending an email to support@trumpetinc.com.

...

18.16. How to determine which email attachments are processed

Symphony OCR assigns a unique identifier to each email attachment that it processes. The identifier will start with the path of the actual document and then include a unique identifier for that particular attachment:

To determine the "name" of the email attachment that is assigned the unique identifier, select the document in the "Document Path" of any of the Worldox lists.

This will provide you with the "Details" of that particular document record, and the attachment name will be listed in parenthesis:

 

 

...

18.17. How to Adjust Processing Priorities

There may be instances where you need to OCR a set of documents more quickly than others.   For example, you may have a particular matter going to trial next week and need to OCR the discovery for that matter or perhaps you’d like to set a priority of processing all discovery documents in your document repository first.

Note:   This procedure is a one-time adjustment of the priorities of filtered files.   You can also adjust the default priority of the documents by setting up separate monitored folders for each and assigning different priorities as applicable by following the instructions found here:   Processing Priorities


To adjust the processing priorities for Symphony OCR for a one-time instance, you can use the following procedure:

  • Navigate to the Processing Document List   (this is a list of all documents that are in queue for processing)
  • Use the “Filter” area to filter the documents.     Here are some example filters using wildcards:
    • C:\Document Repository\123-7789\*.   Use something like this to find all folders and subfolders for this particular matter.
    • C:\Document Repository\*\Discovery\*.   Use something like this to find all Discovery subfolders for all matters
  • Then select   the “Filter” button
  • This will filter your Symphony OCR document List to show only the files you’ve identified in your filter:

    Note:   This will only display 20 to 30 of the documents at a time, but you can scroll through the list using the “Next” button at the top of the list.
  • Next, select the “Show Bulk Operations” link.  
  • This will display the various "Bulk" operations that you can perform.   In this instance we’re focusing on the “Priority” buttons:
  • Select the appropriate priority for the documents (e.g. High or Very High, depending on the urgency for this filter in conjunction with other filters).   For more information on Processing Levels see:   Processing Priorities
  • The Documents in the Processing list will now show a sub-heading for the priority:

 

...

18.18. How to Run Symphony OCR as a Service

Background

Because Worldox now runs as a service, Trumpet has received many requests to have Symphony OCR also run as a service.  This is possible using version 6.6.13 and higher of Symphony OCR.

Resolution

If you are performing this update in concert with updating Worldox to run as a service, after the Worldox update you'll want to ensure that you've launched the Worldox client on the machine running the Symphony OCR service (typically the Indexer) as the Symphony OCR user.  You can determine the user by selecting the "Worldox" link in the navigation panel. It's typically 000000, but yours may be different. 

To run Symphony OCR as a service:

  • Click the "Check for Updates" link in the top right corner of Symphony OCR
  • Select the pre-release version to download the installer
  • Run the installer - accept the defaults until you come to the "How would you like to run Symphony OCR?" step
  • On this window, select "Run as a Windows Service" - Enter the Domain\User and Password.  Click 'Next'

Password Requirements: You must use the user's Windows password. PINs or other security keys won't work.




...

18.19. Life Cycle of a Tif Document in Symphony OCR

By default, Symphony OCR does not process .tif files.   Because it is not possible to add an invisible layer of text to a .tif file, Symphony OCR converts .tif files to .pdf when it processes them, which is why this is not enabled by default.   Firms can enable .tif processing if they choose to do so.

Here's the life cycle of a .tif file in Symphony OCR:

Finder Tool

The "Finder" tool in Symphony OCR is responsible for finding .tif files in the document repository (amongst other file types).   It will search for .tif files regardless of whether or not the firm has chosen to process .tif files.   Once the Finder has found the documents, it passes them to the Analyzer Tool.

Analyzer Tool

The "Analyzer" tool in Symphony OCR is responsible for analyzing documents to determine if they're eligible candidates for OCR.   The Analyzer will determine if the .tif file can be OCR'ed regardless of whether or not the firm has chosen to process .tif files.   If the .tif file is eligible for processing, it will place the .tif file in the "Processing" queue.

Processor Tool

When .tif processing is *not* enabled

When the Processor determines that the file is a .tif file, it will immediately place the .tif file in the Not Processed \ Wrong Type list.

When .tif processing is enabled

The Processor will process the .tif file by converting the .tif file to a .pdf file.   Why?   The .tif format does not allow the invisible layer of text to be added.   Therefore, it must be converted to a .pdf file when processed.  

Analysis Only

If you wish to simply determine how many .tif documents are eligible for OCR in a particular document repository you can simply *not* enable .tif processing.   Then check the "Wrong Type" list's Timeline to determine the number of documents / pages that could be processed.
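
The life cycle above can be summarized as a small decision function. This Python sketch is only a restatement of the rules in this section: the function name and the returned labels are hypothetical, and the guide does not say where a .tif file that is not eligible for OCR ends up, so that branch is deliberately vague.

```python
def tif_destination(eligible_for_ocr: bool, tif_processing_enabled: bool) -> str:
    """Where a .tif file ends up, following the life cycle described above."""
    if not eligible_for_ocr:
        # The Analyzer did not queue the file; the guide does not name the list.
        return "not queued for processing"
    if not tif_processing_enabled:
        # The Processor sets the file aside without converting it.
        return "Not Processed \\ Wrong Type"
    # The Processor converts the .tif to a text searchable .pdf.
    return "converted to .pdf"
```

For example, `tif_destination(True, False)` corresponds to the "Analysis Only" scenario, where eligible .tif files accumulate in the Wrong Type list.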

...

18.20. TIF and MSG Version Settings for NetDocuments Users

MSG Versioning Implications for NetDocuments/Symphony OCR

There are four different levels of versioning available for NetDocuments Symphony OCR users, and the type of versioning you choose will determine the behavior that Symphony OCR uses.   The levels and how they work with .msg files are listed below:

  1. Do not create versions - when this is set, Symphony OCR will not create new versions when processing a document with the exception of .msg files.  NetDocuments requires that version 1 of emails not be overwritten. Therefore, even when Symphony OCR is set to "Do not create versions" , it will create a Version 2 of email files.  If there are multiple attachments to the email, Symphony OCR will only create a Version 2 (not any additional versions) as it processes multiple attachments.
  2. Create versions for all documents - when this is set, Symphony OCR will create new versions for all documents it processes regardless of the file type / extension.  Therefore, if there are multiple attachments to the email, Symphony OCR will create a separate version for each email attachment. (If there are 10 email attachments, there will be 10 versions created)
  3. Create versions for pdf documents only - when this is set, Symphony OCR will create new versions for all PDF documents.  NetDocuments requires that version 1 of emails not be overwritten.  Therefore, even when Symphony OCR is set to "Create versions for pdf documents only", it will create a Version 2 of email files.  If there are multiple attachments to the email, Symphony OCR will only create a Version 2 (not any additional versions) as it processes multiple attachments.  This behavior for .msg files is the same as "Do not create versions".
  4. Create versions for non-pdf documents only - when this is set, Symphony OCR will create new versions of .tif and .msg files.   Therefore, if there are multiple attachments to the email, Symphony OCR will create a separate version for each email attachment. (If there are 10 email attachments, there will be 10 versions created)


TIF Versioning Implications for NetDocuments/Symphony OCR

Let's discuss the behavior with versioning for .tif files and the four applicable settings:

  1. Do not create versions - when this is set, Symphony OCR will not create new versions when processing a document.  The .tif file is converted to .pdf when processed, but no new version is created. 
  2. Create versions for all documents - when this is set, Symphony OCR will create a new version for all documents it processes regardless of the file type / extension.  Therefore,  the .tif file is converted to .pdf and saved as a new version, v1 remains a .tif file.
  3. Create versions for pdf documents only - when this is set, Symphony OCR will create new versions for PDF documents only.  Symphony OCR will not create a new version for .tif files.
  4. Create versions for non-pdf documents only - when this is set, Symphony OCR will create new versions of .tif and .msg files but not create versions for .pdf files.
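
The two lists above can be condensed into one lookup. The Python sketch below is an illustration of the rules as stated in this section, not code from Symphony OCR or NetDocuments; the function name `new_versions` and its parameters are hypothetical.

```python
SETTINGS = (
    "Do not create versions",
    "Create versions for all documents",
    "Create versions for pdf documents only",
    "Create versions for non-pdf documents only",
)


def new_versions(setting: str, extension: str, attachments: int = 1) -> int:
    """Number of new versions created for one processed document.

    `extension` is "pdf", "tif" or "msg"; `attachments` only matters for
    .msg files and is the number of attachments that get processed.
    """
    if setting not in SETTINGS:
        raise ValueError(f"unknown setting: {setting}")
    extension = extension.lower().lstrip(".")
    versions_all = setting == SETTINGS[1]
    versions_pdf = setting == SETTINGS[2]
    versions_non_pdf = setting == SETTINGS[3]

    if extension == "msg":
        if attachments < 1:
            return 0  # nothing was processed, so nothing is versioned
        # Version 1 of an email is never overwritten, so at least a
        # Version 2 is always created.
        if versions_all or versions_non_pdf:
            return attachments  # one version per attachment
        return 1  # a single Version 2, however many attachments
    if extension == "pdf":
        return 1 if (versions_all or versions_pdf) else 0
    if extension == "tif":
        return 1 if (versions_all or versions_non_pdf) else 0
    raise ValueError(f"unsupported extension: {extension}")
```

For instance, an email with 10 attachments yields 10 new versions under "Create versions for all documents" but only a single Version 2 under "Do not create versions".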



...

18.21. How do I know Symphony OCR put text in my file?

If Symphony OCR has indicated in your Document Details that it has processed (OCR'd) a file but you'd like to see it with your own eyes, here are a couple of tips/tricks we use to do that:

Remember: Symphony OCR applies an invisible layer of text to your files - it does NOT control the search mechanisms within your document repository. If you're trying to do text-in-file searches inside your document repository and you're getting no hits, then check to make sure file text/content is being indexed by your repository tools. Again, Symphony doesn't control text searches, it only puts text in your files.

Find:

You can open your PDF and use Ctrl+F (find) and then type out a word you see on the page. Presuming the file has text now, your PDF text finder should highlight the word you're searching for. Be sure you're searching for a word that you already know exists within your document.

Copy Paste:

Alternatively, you can try copying and pasting text from your file as well. This will show you that the text is there and is accurate.

Once a document has been OCR'd you can copy the OCR'd results of a document to Microsoft Word or other word processing editor.

To do so:

  • Open the OCR'd document in your PDF Viewer (See Checking Status of a Document to determine if the document has been OCR'd)
  • Select the text
    • Press Ctrl + A on your keyboard to select all the text in the document
    • Or use the "Text" tool to select portions of the document
  • Right-click and select Copy
  • In Word or another text editor (e.g. Notepad), press Ctrl + V to paste the text (or you can right-click and select "Paste")

Note: This will not preserve the formatting of the document (Headers, fonts, etc) which can be adjusted in Microsoft Word.

...

18.22. My CPU is showing 100% - Isn't that a problem?

Background

After installing Symphony OCR, you will almost certainly notice that the machine running Symphony OCR is showing 100% CPU usage.

What's Going On

Symphony OCR is designed to consume all available CPU resources during OCR, up to 16 cores (depending on the number of cores and the license you purchased).  OCRing documents is an *extremely* CPU intensive operation, which means that it will use far more CPU than almost any other application you may be familiar with.  With many applications, seeing the CPU spike for a long period of time is cause for concern - but with Symphony OCR it is absolutely expected and desirable behavior.

That said, it is important that Symphony OCR be a good digital citizen and allow other applications to use those CPUs when they need to.  Symphony OCR is designed to allow exactly that to happen.  Symphony OCR runs at a lower priority than all other tasks, so it will always yield when another task needs the CPU.  You may notice a little delay when other applications need the CPU, but we've had no reports of Symphony preventing anything else from running as needed.  If you are seeing other apps hung up, we'd definitely like to know about it. 

To limit the cores that Symphony OCR utilizes, use the "Advanced settings" section of the 'Processor' tab (very bottom of the page).  Just enter the maximum number of CPUs for Symphony OCR to use in that field.  See Processor for more information.

Additional Information

In some extremely rare situations (we've seen this twice now), if thermal management of the CPU is not designed properly (e.g. incorrectly applied thermal paste between CPU and heatsink), it is possible for a machine running at 100% of CPU to overheat and shut down.  The two times we saw this, the machine powered itself down without any warning or user interaction.  After fixing the thermal paste, the problem never recurred.

Note: if you have other CPU-intensive or time-sensitive apps that need to run and you feel that Symphony OCR is interfering, you can add events in the Symphony OCR Scheduler to stop processing documents during time periods.  In practice, we've seen very few sites that require this type of schedule management.

 

...

18.23. Does Symphony OCR impact the performance of the document management system?

Symphony OCR is intelligently designed to be a "good digital citizen."   Bottom line, other applications get first priority after which Symphony OCR will use all available resources to process, so it throttles based on demand from other apps.


...

18.24. Symphony OCR and SharePoint Integration

These are some of the frequently asked questions with regards to how Symphony OCR works with SharePoint:

Does Symphony OCR integrate with OneDrive?

Symphony OCR supports OneDrive for Business (SharePoint Online), not personal OneDrive accounts.

Where does Symphony OCR run?

Symphony OCR integrates with your Office365 SharePoint tenant.  It does not actually run on the Office365 cloud, but runs on a workstation and integrates with SharePoint directly.

Can Symphony OCR be scheduled to run?

Symphony OCR has a scheduler component which is incorporated in the software.  You can determine which days of the week and what times of the days that it will run.  The OCR process is CPU intensive.  Data transfer accounts for approximately 3% of the processing time for a given document.  A fast server typically sees around 2-3 seconds per page throughput, workstation class operating systems are approximately twice that (4-6 seconds per page throughput).  This is very dependent on hardware and network speeds.
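
The per-page figures above can be turned into a rough estimate of how long a backlog will take. The Python sketch below is simple arithmetic under stated assumptions: the backlog size is made up, the function name is hypothetical, and the estimate assumes continuous processing with no scheduled pauses.

```python
def estimated_hours(pages: int, seconds_per_page: float) -> float:
    """Rough processing time in hours for a backlog of `pages` pages."""
    return pages * seconds_per_page / 3600.0


# Hypothetical backlog of 100,000 pages.
BACKLOG_PAGES = 100_000
server_range = (estimated_hours(BACKLOG_PAGES, 2), estimated_hours(BACKLOG_PAGES, 3))
workstation_range = (estimated_hours(BACKLOG_PAGES, 4), estimated_hours(BACKLOG_PAGES, 6))
print(f"fast server: {server_range[0]:.0f} to {server_range[1]:.0f} hours")
print(f"workstation: {workstation_range[0]:.0f} to {workstation_range[1]:.0f} hours")
```

Actual times depend heavily on hardware and network speeds, so treat the result as an order-of-magnitude guide when planning a schedule.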

How does licensing work?

The Symphony OCR license must be greater than or equal to the SharePoint user count.  To determine your user count:

  • Log into SharePoint as an admin user
  • Paste the following into your browser replacing "yourcompany" with your SharePoint site at the beginning:
    https://<yourcompany>.sharepoint.com/_api/search/query?querytext='AccountName:"i:0%23.*"'&enablefql=false&rowsperpage=0&rowlimit=100&selectproperties='AccountName'&sourceid='B09A7990-05EA-4AF9-81EF-EDFAB16C4E31'
  • This will return xml information
  • Do a text search for 'd:RowCount' (don't include the quotes), you will find something like this:
    <d:RowCount m:type="Edm.Int32">10</d:RowCount>
    The number of users is "10"
  • To find each of the users, you can search through the same xml file for '<d:Key>AccountName</d:Key>' (don't include the quotes)
    e.g.: 

<d:element m:type="SP.KeyValue">
<d:Key>AccountName</d:Key>
<d:Value>i:0#.f|membership|kevin@trumpetinc.onmicrosoft.com</d:Value>
<d:ValueType>Edm.String</d:ValueType>
</d:element>

In the above example, the username is kevin@trumpetinc.onmicrosoft.com
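
If you save the XML response to a file, the count and the user names can be pulled out with a short script instead of a manual text search. The Python sketch below is an illustration only: the function names are hypothetical, and it matches elements by their local name (RowCount, element, Key, Value) as shown in the excerpt above, so it does not depend on the namespace declarations in the response.

```python
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Strip the "{namespace}" prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def sharepoint_users(xml_text: str):
    """Return (row_count, account_names) from the search query response."""
    root = ET.fromstring(xml_text)
    row_count = None
    accounts = []
    for element in root.iter():
        name = local_name(element.tag)
        if name == "RowCount" and row_count is None:
            # Use the first RowCount found in the response.
            row_count = int(element.text)
        elif name == "element":
            fields = {local_name(child.tag): (child.text or "") for child in element}
            if fields.get("Key") == "AccountName":
                # "i:0#.f|membership|user@domain" -> "user@domain"
                accounts.append(fields.get("Value", "").rsplit("|", 1)[-1])
    return row_count, accounts
```

Pass the saved response text to `sharepoint_users` and compare the returned count with your Symphony OCR license count.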

If you add an additional user, Symphony OCR will send the person you indicate in the Notification center an email notification indicating you have exceeded your license count.   Symphony OCR will continue to process during a 10-day grace period to allow the software to continue running and ensure you can order additional users for Symphony OCR.

How to adjust your SharePoint Tenant:

If you need to adjust your SharePoint Tenant, you'll need to do the following:

  1. Shut down the SOCR Service
  2. Open the Settings.xml file located at C:\Program Files (x86)\Trumpet\SymphonyOCR\config using Notepad
  3. Find the section entitled <sharePointConnectionManager
  4. Delete that section from the "realm" through to the existing SharePoint URL
  5. Save the settings.xml file
  6. Relaunch Symphony OCR and start the service
  7. Re-establish the connection to the new tenant

Modified Dates

The SharePoint API calls do not allow Symphony OCR to preserve the modified date of files.  Therefore, when Symphony OCR processes a document, the modified date will be changed to the date the document was processed by Symphony OCR.


 

...

18.25. NetDocuments Download Warning Threshold

Background

You integrate your Symphony OCR software with your NetDocuments repository and you're getting emails saying something to the extent of "NetDocuments Download Threshold Exceeded".

What does this mean?

As far as we know, this email comes from NetDocuments and is triggered by repository settings that alert an admin when a user's download activity exceeds a defined number. We do not control these settings or email notifications, but we've located an article published by NetDocuments that discusses this in more detail and may be of use.

Symphony OCR must download your documents in order to analyze and OCR them.  This process does count towards the NetDocuments warning threshold.

The setting to adjust the threshold is located in the NetDocuments Admin Settings under "Edit name, logo and billing information".

Please note that Trumpet, Inc. does not in any way support or represent NetDocuments, and if you would like to learn more about the email you received, or to adjust the settings that control it, please reach out to NetDocuments directly.

...

18.26. What is the maximum number of processors supported by Symphony OCR before a performance limit/wall will be hit because of application bitness?

The maximum is 16 cores, provided your license allows for our maximum core count.  These 16 cores are used for OCR itself, and we'll use another 8 cores for pre-analysis.  Pre-analysis takes 1/50th the time of OCR, so it's only important to have 16 dedicated physical cores if you are trying to maximize Symphony performance.

Application bitness has no influence on our performance for multi-core operation, so there is no performance degradation from that as you add cores.

What's more likely is that Symphony will eventually saturate the bandwidth available for transferring files (either because of limitations of your internet connection, or of the server bandwidth).  Other potential limiting factors would be free RAM and disk performance (but on any modern server, these are negligible compared to internet bandwidth).


...

18.27. If retention is set, what is the status of the working copy kept on the server/workstation?

The document retention setting controls a short term backup that we can keep on the local disks on the Symphony machine, just in case something goes horribly wrong.  This functionality was added to the original Symphony application 15 years ago as a belt-and-suspenders just-in-case feature.  We've actually never had to use this backup, and there are better ways to restore original documents, so very few of our sites bother to use it anymore.

Firms that want to be able to recover the original document (prior to OCR) generally use two strategies:

  1. They rely on Symphony's "Rollback" capability, described in this knowledge book article:  https://support.trumpetinc.com/index.php?pg=kb.page&id=1824  - the advantage of Rollback is that you get the same functionality as if you had retained a completely separate copy of the original document, without having to store a second copy or increase storage.
  2. They turn on 'Create versions' - see 'Create Versions of OCR results' in  https://support.trumpetinc.com/index.php?pg=kb.page&id=1378   - this approach does literally make an extra copy of the original (saving as version 1), then saves the OCR results back into the DMS as version 2.  This will obviously double the storage space of any document that is OCRed.


...

19. Troubleshooting Tips and Tricks

19.1. Troubleshooting Procedures

Symphony OCR may not yet have processed some documents, and / or may not be able to process some documents.   You can set up Notifications to track which documents are in the various Not Processed lists; see Configuring Notifications for instructions.

If you do not currently receive notifications but want to troubleshoot why a particular document is not text searchable, here is a quick procedure:

  1. Open the PDF file.   Search for text inside the document itself.   If you can find the text, then the file has been OCR'd and the problem may be that the Worldox text search isn't working.   Worldox updates the text databases nightly, so it may be a matter of the text indexes not having been updated yet.   If the text indexes have been updated, see:   OCRed documents not text searchable for a potential solution.
  2. If you open the document and cannot find text within the file itself, then you can check the status of the document:   Checking the Status of a Document
    • If the document does not have a status at all, the issue is more than likely that Symphony OCR is not able to locate the document.   This is typically due to a Worldox permissions issue.   You should ensure that the user that is running Symphony OCR has full access to the files in the document management system.
    • If the document does have a status, you can look at the details page to determine the history of the document, or you can determine which document list the document resides in.   Here is a listing of the various document lists with further explanation:   Document Lists
    • If the document appears in the "Needs Attention" list in particular, please send an email to support@trumpetinc.com including a screenshot of the details screen.   Trumpet support may request that you upload a copy of the document via our secure upload portal.
...

19.2. There was a problem downloading the Symphony OCR Engine

Background

The Symphony OCR installer downloads and runs the Symphony OCR Engine installer.   If it is unable to download the Symphony OCR Engine you will receive an error message like this:

This type of error almost always means that you have a hardware firewall or border router that is interfering with the download of a required component.

Resolution

1)   Configure your firewall to allow traffic coming from the www.trumpetinc.com domain.
2)   If you are unable to configure the firewall, you can download the required component from the URL provided by the installer error message

After you download it, install it to the folder that Symphony OCR is located in (this is typically 'C:\Program Files\Trumpet\SymphonyOCR', but may be called 'Maestro' in older installations).   Once you complete this install, run the regular installer.   The regular installer will detect the engine and will not attempt to download it again.

 

...

19.3. Insufficient Disk Space Warning

Symptoms

Symphony OCR is in either a warning or error state with the error message indicating that there is Insufficient Disk Space.

Background

As a "belts and suspenders" measure, Symphony OCR creates a backup copy of files prior to processing them and saves that copy to the local workstation that runs Symphony OCR.   For more information on setting up retention, see:   Configuration Guide: Processor

If the workstation's available disk space falls below 1.5 GB, Symphony OCR will enter a "Warning" state.
If the workstation's available disk space falls below 1.0 GB, Symphony OCR will enter an "Error" state.
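
If you want to check a drive against these thresholds yourself, the following Python sketch shows one way to do it. It is not Symphony OCR's own check: the function names are illustrative, and it assumes decimal gigabytes (the guide does not say whether GB means 10^9 or 2^30 bytes).

```python
import shutil

GB = 1000 ** 3  # assumption: decimal gigabytes; the guide does not specify

WARN_BYTES = 1.5 * GB   # below this: "Warning" state
ERROR_BYTES = 1.0 * GB  # below this: "Error" state


def classify_free_space(free_bytes):
    """Map a free-space figure to the state described in this article."""
    if free_bytes < ERROR_BYTES:
        return "Error"
    if free_bytes < WARN_BYTES:
        return "Warning"
    return "OK"


def check_drive(path):
    """Classify the free space of the drive that holds the given path."""
    return classify_free_space(shutil.disk_usage(path).free)
```

For example, check_drive("C:\\") reports the state for the C drive of the machine the script runs on.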

Resolution

There are a couple of options for you:

Adjust the firm's purge rules

If the firm has opted to keep files for longer than the standard 7 days, you may wish to shorten the firm's retention period so that documents are purged more frequently.   See the section Configuration Guide: Processor to find out how to adjust it.

Change the firm's retention location

You can also change the location where the originals are stored by adjusting the retention location.   Here are instructions for doing so:   Changing the Retain Originals (Backup) Location

Other

You can also edit the firm's settings.xml to change the levels at which Symphony OCR warns and errors.   Adjust the levels by manually adding errorUsableSpace and warnUsableSpace parameters to the <backupManager .... /> element

 

 

 

...

19.4. Unable to determine number of Worldox users

Background

Symphony OCR uses the Worldox API to find the applicable profile groups to search for documents.   It also uses the Worldox API to determine whether the Worldox license count matches the Symphony OCR license count, as the two must be the same for Symphony to process documents.

If you see error messages like the above, it's typically caused by Worldox not having been launched (mirrored) as the user that Symphony OCR is configured to use.

Resolution

Launch Worldox (in Mirrored Mode) as the user that Symphony OCR is configured to use, then close Worldox.   (You do not need to leave Worldox running - launching one time is sufficient to get the Worldox API to register properly)

 

 

...

19.5. Unable to determine number of NetDocuments users

Background

We've seen at least one instance where a site received the following error message and all processing stopped until it was corrected:

"Unable to determine the number of NetDocuments users - java.lang.NullPointerException: path is 'null'."


Resolution

The problem appeared to be caused by strict Internet Security settings that were preventing the machine from accessing the necessary NetDocuments integration points. The solution was to reset the Advanced Settings in Internet Options.

To do this follow these steps:

  • Open Control Panel> Internet Options (can also access through your Browser)
  • Go to the "Advanced" tab
  • Click on the button that says "Reset..." under the Reset Internet Explorer Settings section

  • Hit OK
  • Close everything and Restart the machine.
  • Launch SOCR and confirm that the error is gone and that the Finder, Analyzer, and Processor are running smoothly again.
...

19.6. You have more users on your Worldox licensing than you have on your Symphony Licensing

Background

Symphony OCR uses the Worldox API to find the applicable profile groups to search for documents.   It also uses the Worldox API to determine whether the Worldox license count matches the Symphony OCR license count, as the two must be the same for Symphony to process documents.

If you see error messages like the above, it's typically because you have more Worldox licenses than Symphony licenses.   Symphony OCR provides you with a 15-day grace period to get your Symphony OCR license up to date.

Resolution

Contact your Channel Partner to request additional Symphony licenses.

 

...

19.7. License Update Check Failed - Error Checking for License Updates

 

Background

Communication errors between your machine and our servers, such as the error above, can be caused by a discrepancy between your machine’s clock and our server’s clock. If the two clocks do not match to within a few seconds, our server will deny access for security reasons.

 

Resolution

Change the time on the machine to the correct time. If your computer is part of a network that is all on the same time then make sure the changes are made globally.
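
To get a rough idea of how far off a machine's clock is, you can compare it against the Date header returned by a web server whose clock you trust. The Python sketch below is an illustration only and is not how Symphony OCR performs its check: the function names are hypothetical, the URL you pass in is your choice, and the Date header has one-second resolution, so treat the result as approximate.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
import urllib.request


def skew_seconds(date_header, local_now=None):
    """Seconds by which the local clock is ahead of the server's Date header."""
    server_time = parsedate_to_datetime(date_header)
    if local_now is None:
        local_now = datetime.now(timezone.utc)
    return (local_now - server_time).total_seconds()


def fetch_date_header(url):
    """Return the Date header from an HTTP HEAD request to the given URL."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.headers["Date"]
```

A call such as skew_seconds(fetch_date_header(url)) returns a positive number when the local clock is ahead and a negative number when it is behind. A result of more than a few seconds in either direction suggests the clock needs correcting.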

...

19.8. Migrating Symphony OCR to a new workstation

If you need to move Symphony OCR to a new workstation, you have two choices:

  1. Copy settings and database files from the old PC to the new - this is the preferred approach
  2. Install clean to the new workstation, reconfigure settings, then let Symphony OCR rebuild its databases by processing the document repository (if you need to perform a clean install please refer to the Symphony OCR Installation Guide ) - this approach must be used if the original installation of Symphony OCR is not available (e.g. hard drive failure)

This procedure covers the first approach.

Prepare the Computer for Installation 

  • Ensure that the workstation and Windows user login for Symphony OCR meet the minimum system requirements
  • Ensure that a modern web browser and Adobe Reader (or Acrobat) are installed on the PC (this is not strictly necessary, but it makes troubleshooting easier)
  • For Worldox sites, make sure that Worldox has been launched at least one time on the new machine

Install and migrate

  1. Ensure Symphony OCR has been shut down on the old workstation
  2. Download and install the latest version of Symphony OCR to the new workstation (you can get a download link from our support team by emailing support@trumpetinc.com), but do NOT launch it after the installer finishes
  3. Copy the following files from the Symphony OCR directory on the old PC to the new PC (typically C:\Program Files\Trumpet\SymphonyOCR):
    • Config\settings.xml
    • Data\* (i.e. all files and sub-folders.  *Contents only, not the entire directory*) - this could take a while, be patient
    • Logs\* (i.e. all files - this isn't strictly necessary, but it helps to maintain continuity.  *Contents only, not the entire directory*)
  4. If running Symphony OCR as a service, open the service manager and start the service
  5. Launch Symphony OCR on the new workstation
  6. For Worldox sites, open the Symphony OCR web interface (if not already showing in your browser, you can use the system tray applet to do this), go to the Worldox settings screen, adjust the Worldox location if it is still set to the old server
  7. Uninstall Symphony OCR from the old workstation (this is important to ensure that you do not violate your EULA for Symphony)
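
The file copy in step 3 can be scripted if you prefer. The following Python sketch (Python 3.8 or later) is an illustration under stated assumptions, not a Trumpet-supplied tool: the function name is hypothetical, and both arguments are the Symphony OCR directories on the old and new PCs as reachable from wherever the script runs. It copies Config\settings.xml and merges the contents of Data and Logs into the folders the installer created.

```python
import shutil
from pathlib import Path


def copy_symphony_files(old_root, new_root):
    """Copy settings, data and logs from the old install folder to the new one."""
    old_root, new_root = Path(old_root), Path(new_root)
    copied = []

    # Config\settings.xml: a single file.
    (new_root / "Config").mkdir(parents=True, exist_ok=True)
    shutil.copy2(old_root / "Config" / "settings.xml",
                 new_root / "Config" / "settings.xml")
    copied.append("Config/settings.xml")

    # Data\* and Logs\*: contents only, merged into the existing folders.
    for folder in ("Data", "Logs"):
        source = old_root / folder
        if source.is_dir():
            shutil.copytree(source, new_root / folder, dirs_exist_ok=True)
            copied.append(folder)
    return copied
```

Run it only after Symphony OCR has been shut down on the old workstation and before it is launched on the new one, as the steps above require.
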
...

19.9. The Program Can't Start Because MSVCR71.dll is Missing Error

 

For the technical details, here's a link that tells a bit more about this error message: http://www.duckware.com/tech/java6msvcr71.html

An update to the latest version of Symphony Profiler or Symphony OCR should resolve the issue.  Please use your Symphony OCR / Symphony Profiler Installation Guide, or contact Trumpet support (http://support.trumpetinc.com) for update instructions.

...

19.10. Compact, repair or reset the Symphony OCR database

Background

If the internal Symphony OCR database becomes corrupted, the system will throw all sorts of interesting errors. This sort of problem should not be happening very much, so if you see it multiple times, be sure to request support. If you are directed by Trumpet support to either repair, reset or compact the database, here's how to do it:

Procedure

Compact/Repair

If the Symphony OCR database becomes corrupted, a 'Compact' operation will often repair the damage - here's how:

  1. Click the Debug Link (Advanced Link in older versions)
  2. Click the Compact Database Link
  3. Wait for the rebuild to finish (you will have to click refresh to update the Task list)

Reset

In some cases, a Symphony OCR database will be so damaged that a full reset will be necessary - here's how:

  1. Close/Stop Symphony OCR application/service
  2. Navigate to the databases folder inside the Symphony OCR folder (typically located at c:\Program files\Trumpet\SymphonyOCR - but it could also be under a folder called Maestro on older installations)
  3. Rename the documents.db and documents.lg files
  4. Launch Symphony OCR
  5. Symphony OCR will now re-build the entire database by looking at each document in Worldox - this can take a little while.  Note:  the system isn't re-OCRing any documents that have already been OCRed.
...

19.11. Backlog Processing is Throttled

Symptoms

Your firm has more pages to process than you have processing capacity.   Symphony OCR's backlog throttling system is kicking in to ensure that you have sufficient processing capacity for the new documents that you are likely to add between now and the end of the year.

Background

Please refer to the following article for a discussion on backlog throttling: How Backlog Throttling Works   Please note that backlog throttling is a very advanced feature, and it is very unlikely that changing the parameters is the correct thing to do for your site.

Resolution

You have two choices:

  • Contact your Symphony Channel Partner to inquire about increasing your processing capacity, or
  • Be patient - that backlog will get processed over time as capacity becomes available
...

19.12. Removing many documents from the Symphony OCR database

Background

In some situations, you may wish to completely remove records from the Symphony OCR database.   One example of this is when the user changes the selected PGs in the Worldox configuration screen.   Changes to the Worldox configuration screen impact the *finder* component of Symphony OCR only.   Any documents that are already in the database will continue to be processed, even if they reside in profile groups that have been de-selected.

Procedure

Important: this procedure uses the Advanced/Debug interface for Symphony OCR.   Be careful!

Option 1 - remove many explicit documents

  1. Create a text file with the full path of the document records you wish to remove.   Place the file in a location that is available to Symphony OCR (generally a file on the disk of the workstation that runs Symphony OCR)
    Tip: You can use the Export CSV button in the Processing list to obtain a list of all documents in the processing queue.   This CSV can then be edited to include only the paths that you wish to delete.   Be sure to remove the other columns!
  2. Click Advanced
  3. Type the full path of the text file created in step 1 in the 'Or files in the following file' field:
  4. Click the Delete Database Records for Specified Files button
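
If you start from the exported CSV mentioned in the tip above, trimming it down to a plain list of paths can be scripted. The Python sketch below is an illustration only: the column name "Path" and the sample rows are assumptions, so check the header row of your own export and pass the real column name.

```python
import csv
import io


def extract_paths(csv_text, path_column="Path"):
    """Return the non-empty values of the path column, one per document."""
    reader = csv.DictReader(io.StringIO(csv_text))
    paths = []
    for row in reader:
        value = (row.get(path_column) or "").strip()
        if value:
            paths.append(value)
    return paths


def write_path_list(paths, output_file):
    """Write one path per line to the text file used in step 1."""
    with open(output_file, "w") as handle:
        for path in paths:
            handle.write(path + "\n")
```

Review the resulting text file before using it, since every path it contains will have its database record deleted.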

Option 2 - Copy and paste from Worldox

This option works well if you wish to delete records for a handful of documents (up to about 150)

  1. In Worldox, search for the documents
  2. Select the documents and type Ctrl+C - this will place the paths of all of the documents onto the clipboard
  3. In Symphony OCR, click Advanced
  4. Paste the paths into the 'List files to perform debug operations on' field
  5. Click the Delete Database Records for Specified Files button

Option 3 - Remove all document records from Symphony OCR

If you wish to remove all document records from Symphony OCR, you can reset the Symphony OCR internal database.   When Symphony OCR is restarted, it will rebuild the database based on the documents that it finds.   This will result in a loss of all processing history, but any documents that have already been OCRed will not be affected, and records for those documents will be recreated and placed in the Processed list without consuming additional processing capacity.

  1. Close Symphony OCR using the Quit side-bar link
  2. Navigate to Program Files\Trumpet\Symphony OCR\Data
  3. Rename the following files:
    • documents.db   -> documents.db_old
    • documents.lg   -> documents.lg_old
    • ProcessorPerformance.dat   ->   ProcessorPerformance.dat_old
  4. Restart Symphony OCR
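
The renames in step 3 can be sketched in Python as follows. This is an illustration, not a Trumpet-supplied tool: the function name is hypothetical, and you pass in the Data folder of your own installation. It refuses to run if a file with the _old suffix already exists, so that an earlier backup is not overwritten.

```python
from pathlib import Path

FILES_TO_RENAME = ("documents.db", "documents.lg", "ProcessorPerformance.dat")


def rename_database_files(data_dir, suffix="_old"):
    """Rename the database files listed above by appending the suffix."""
    data_dir = Path(data_dir)
    present = [data_dir / name for name in FILES_TO_RENAME
               if (data_dir / name).exists()]

    # Refuse to overwrite an earlier backup.
    for source in present:
        target = source.with_name(source.name + suffix)
        if target.exists():
            raise FileExistsError(f"{target} already exists")

    renamed = []
    for source in present:
        target = source.with_name(source.name + suffix)
        source.rename(target)
        renamed.append(target.name)
    return renamed
```

As with the manual procedure, Symphony OCR must be closed before the files are renamed.
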
...

19.13. Explicitly adding documents for processing

Background

In some situations, you may wish to explicitly tell Symphony OCR to process a set of documents with higher priority than others.   This can also be used to process documents that wouldn't normally be discoverable by Symphony OCR (e.g. because of profile group selection).

This functionality was added in version 5.2.46

Procedure

Important: this procedure uses the Advanced/Debug interface for Symphony OCR.   Be careful!

  1. Click Advanced
  2. Paste the full paths of all of the documents you wish to process (one to a line) into the "List files to perform debug operations on" text area
    Tip: If you have a set of documents selected in Worldox, you can type Ctrl+C to put all of their paths onto the clipboard, then paste the results into the "List files to perform..." text area of the Advanced screen
  3. Click the  Process Specified Documents Immediately button
...

19.14. Needs Attention file - Adding text to image failed — null

Background

After updating Symphony OCR to version 8.0.7 or higher, some files appear in the 'Needs Attention' list with the reason being "Adding text to image failed — null". When the files are reanalyzed, they are not processed and they return to the 'Needs Attention' list.

Resolution Updated

We have an update available that resolves the error. Update Symphony OCR to version 8.1.3 or higher.

Symphony will automatically reanalyze any files in the Needs Attention list after the update. If you have Ignored files related to this error you can manually tell Symphony to reanalyze them.

 

 

...

19.15. Symphony OCR is consuming lots of disk space on our indexer's local disk

Background

Important:   There was a bug in earlier versions of Symphony OCR (prior to 5.2.77) that could cause a large number of files to accumulate in the SymphonyOCR\Work\processor1\ocr folder.   If this folder contains lots of sub-folders or files, please use the Check for Updates link to install the latest version of Symphony OCR.   It is safe to manually purge the SymphonyOCR\Work\processor1\ocr folder (if you are comfortable using the command prompt, the command rmdir /s "C:\Program files\Trumpet\SymphonyOCR\Work\processor1\ocr" will do this).   The rest of this article applies if you are running version 5.2.77 or higher.

 

 

By default, Symphony OCR will retain versions of the documents it OCRs for a period of 7 days.   If your firm is processing a large backlog, this can represent a significant number of files.

The retained versioning system is merely a belts-and-suspenders feature - just in case something goes wrong.

These past versions are stored in the Symphony OCR\Work\backupfiles sub-folder.

Resolution

You can adjust the retention period for the Symphony OCR backups in the Processor Configuration Screen or you can disable the retention entirely.

If you wish to purge old files immediately, make the adjustments to the retention policy, then use Debug->Purge old backups (Advanced->Purge old backups, on older versions.)

...

19.16. Performing Maintenance screen never updates

Description

When Symphony OCR launches, it displays the Performing Maintenance screen.   This screen never refreshes, and the "progress" graphic does not spin (i.e. is not animated).

Clicking on the Symphony OCR icon in the upper left corner of each page should refresh that page - when this issue is happening, the refresh does not occur.

Background

This behavior can be caused by restrictive policies in Internet Explorer.

Solution

Add the Symphony PC to the list of Trusted Sites in Internet Explorer.   Here's how:

  1. In Internet Explorer, with the maintenance screen showing, click Tools->Internet Options (you may have to press the Alt key to see the Tools menu)
  2. Switch to the Security tab
  3. Select Trusted Sites
  4. Click the Sites button
  5. Confirm that the URL in the 'Add this website to the zone' field is the Symphony PC
    Important:   The site should start with "http://" and NOT "https://" - if Internet Explorer won't let you add the site because of the https prefix requirement, uncheck the 'Require Server Verification' checkbox
  6. Click Add then Close
  7. Click OK to close the Internet Options dialog
  8. Click Refresh in Internet Explorer and confirm that the maintenance screen updates
...

19.17. OCRed documents not text searchable

Background

Symphony OCR converts image only PDF files into text searchable PDF files.   Once the PDF file contains text, users can immediately search *within* the PDF.   However, the Worldox text indexer must update before the text contents of the PDF become available for text-in-file searching (i.e. searching *for* the document).

We recently found an interaction problem between Symphony OCR and the Worldox indexer (WDINDEX) that prevents some PDF files from being added to the full text index.

Cause

WDINDEX maintains a local cache of the text it extracts from documents.   This allows Worldox to use the cached text extraction during text database inits - instead of re-parsing text from every file on the network.   This local text cache dramatically improves text database rebuild times.   WDINDEX determines whether it should use the text cache for a given file by checking the network file's modified date.   Symphony OCR preserves the file's modified date when it performs OCR.   This can result in WDINDEX using cached text extraction (consisting of no text) instead of the updated version of the file on the network.

Resolution

In version 5.2.67, Symphony OCR now modifies the date of files it OCRs by a single minute.   This will ensure that WDINDEX will not use cached text extraction results for files OCRed after S-OCR was updated to 5.2.67.
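
The interaction described above can be illustrated with a toy model. The Python sketch below is not WDINDEX's actual implementation; it is a simplified cache that, like the behavior described in this article, trusts its stored text whenever a file's modified date is unchanged. The class name, path and text values are made up, and shifting the date forward (rather than backward) by one minute is an assumption.

```python
class TextCache:
    """Toy text cache that trusts an entry while the modified time matches."""

    def __init__(self):
        self._entries = {}  # path -> (modified_time, extracted_text)

    def get_text(self, path, modified_time, extract_text):
        cached = self._entries.get(path)
        if cached is not None and cached[0] == modified_time:
            return cached[1]   # cache hit: the file is not re-read
        text = extract_text()  # cache miss: parse the file again
        self._entries[path] = (modified_time, text)
        return text


cache = TextCache()
path, scanned_at = r"W:\DocVault\scan.pdf", 1_000_000

# Before OCR: the image-only PDF yields no text, and that result is cached.
before = cache.get_text(path, scanned_at, lambda: "")

# OCR that preserves the modified date: the stale empty result is reused.
preserved = cache.get_text(path, scanned_at, lambda: "invoice 42")

# OCR that shifts the modified date by one minute: the file is re-parsed.
shifted = cache.get_text(path, scanned_at + 60, lambda: "invoice 42")
```

In this model, preserved is still the empty string while shifted contains the OCRed text, which is why changing the modified date makes newly OCRed files searchable.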

For files that were OCRed before the update to 5.2.67, the user will need to purge the WDINDEX text cache - this is easy to do, but can take a bit of time, depending on the size of the document repository.   Here are instructions:

Purging the WDINDEX local text cache

  1. Log on to the Indexer PC
  2. Make sure you have updated to Symphony OCR 5.2.67 or higher - (see the update instructions if you need them)
  3. If WDINDEX is counting down, click Close Server to return to the main WDINDEX configuration screen
  4. Select the first drive in the Active Drives list
  5. Click Drive->Purge Local Cache:
  6. When prompted to delete the local text cache data, click Yes
  7. After the purge finishes (this could take 5 to 20 minutes), select the next drive in the Active Drives list and repeat the above procedure

 

The next time WDINDEX rebuilds the text databases, it will rebuild the local text cache (this could cause the first rebuild to take longer than normal).   After that, everything should work just like before - except that all of your OCRed files will be text searchable.

...

19.18. Java.io.IOException: StringLong keys can't be longer than 300K error

Symptoms

User notices that Symphony OCR has a system condition of Red and the finder screen displays an error of "java.io.IOException: StringLong keys can't be longer than 300K - database must be corrupted."

Background

There is a bug in Symphony OCR prior to version 5.2.63 that can result in corruption of the Symphony OCR database indexes.

Solution

  1. Update Symphony OCR to 5.2.63 or higher (see How to update your version of Symphony OCR)
  2. Compact the database (see 'Compact/Repair' in How to repair, reset or compact the Symphony OCR database)
...

19.19. Documents in Inaccessible list are not OCRed - Security prevents manipulation of document

Symptoms

Documents appear in the Inaccessible list, and have history with the following note in it:

01-30-2013 03:17:45 PM : Security prevents manipulation of document

In Worldox: Searching for the document in Worldox shows a yellow or red lock icon next to the document.

Background

Symphony honors your DMS's security model. If a document is hidden, or secured in a way that the Symphony User cannot modify it then it will not be processed.  In Worldox this means: if a document appears with a red or yellow lock icon, it cannot be processed.

Resolution

If the issue is related to Worldox security classifications:

  1. See: Configuring Worldox security classifications so secured documents can be OCRed
  2. If necessary, reclassify documents
  3. Click Re-analyze All in the Inaccessible list

If the issue is caused by the file being marked read-only:

...

19.20. Documents in Inaccessible list are not OCRed - File system or DMS security settings block processing (Worldox sites only)

Symptoms

Documents appear in the Inaccessible list, and have history with the following note in it:

File system or DMS security settings block processing

Searching for the document in Worldox shows an icon in the VER# column that looks like this:  .

Background

This icon indicates that the documents have been F-Locked, which is a special read-only setting in Worldox.  Once a file has been locked using this method you cannot unlock it and it becomes read-only.  The document will be identified as a locked file and a hash value ensures that file contents will not change.  If you want to edit the file after locking it, you must check it out, edit it, and check it back in.  Because the only option is to create a new version, Symphony OCR cannot process a document that has been locked using this method.

Alternate Symptoms

Documents appear in the Inaccessible list, and have history with the following note in it:

File system or DMS security settings block processing

Searching for the document in Worldox shows no security setting that would prevent processing.

Background

This will be the result if the server doesn't have 8.3 filenames populated for the file.  8.3 filenames are required by Worldox, but we see this sometimes when older legacy cabinets are processed.  To determine if 8.3 filenames are populated, open a 'cmd' window and do a 'dir /x' on one of those file locations (see below for more instructions) - you should see the long filename *and* the 8.3 filename:


If the 8.3 filenames have not been populated, Trumpet has a tool to take care of that - 8.3 Filename Tool  (Purchase required.  Contact operations@trumpetinc.com for more information).

Note: For those that may not be super familiar with the command window, here's an example of what to type in the cmd line:

If the file in question lives in a path like this W:\DocVault\CLIENT\CLARCA1\BILLING\CHECK, then you'll type this in the command window:

dir /x W:\DocVault\CLIENT\CLARCA1\BILLING\CHECK

Hit enter and the window should generate a list of all of the files within that location. The farthest right column will be the document's name (doc ID) and the column to the left of that will show the 8.3 name, if available.


...

19.21. Documents in the Inaccessible List when security not applied

Symptom

You see documents in the Inaccessible list, with the "Inaccessible Reason" showing:  "File system or DMS security settings block processing".

Background

Symphony honors the Worldox security model.  If a document appears with a red or yellow lock icon, it cannot be processed.   If these files are secured see:  Documents in Inaccessible list are not OCRed

If these files are not secured (do not have the red or yellow lock icon next to them), this may be caused by having an invalid base path defined for the profile group.  Worldox requires that each part of the Profile Group base path be less than 8 characters and contain no spaces.
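
A quick way to spot offending path components is to check each one for spaces and length. The Python sketch below is an illustration only, with a hypothetical function name. It assumes a drive-letter path and, by default, allows up to 8 characters per component (the 8.3 convention). The wording above says "less than 8 characters", so pass max_length=7 if you want the stricter reading, and confirm the exact limit with your Worldox reseller.

```python
import re


def invalid_base_path_parts(base_path, max_length=8):
    """Return (component, reason) pairs for parts that break the rule above."""
    parts = [part for part in re.split(r"[\\/]+", base_path) if part]
    problems = []
    for part in parts:
        if part.endswith(":"):
            continue  # drive letter such as 'W:'
        if " " in part:
            problems.append((part, "contains a space"))
        elif len(part) > max_length:
            problems.append((part, f"{len(part)} characters"))
    return problems
```

An empty result means no component was flagged under the chosen limit.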

Resolution

Change the Worldox Profile Group base paths (all directories and filename) to contain no spaces and be less than 8 characters, rename the folders on disk, and adjust the indexing rules accordingly. 

Another strategy is to configure a new UNC share pointing at the base of the profile, and ensure that the UNC share name is 8 characters, no spaces.  Then use ?:\ notation to configure the base path (your Worldox reseller will probably need to help you do this).

 

...

19.22. Worldox API (WDAPI32, WBAPI) crashing

Background

Symphony OCR uses the Worldox API to interact with your Worldox document repository.   This API consists of a DLL called WDAPI32.DLL and an executable called WBAPI.EXE.   If these libraries are loaded from the server (instead of local to the Symphony workstation), any network interruption can cause the Worldox API to crash.   This will not completely bring Symphony OCR down, but it does cause problems, and it is generally best to load those libraries from the local C drive of the Symphony machine.

How Symphony Decides Where to Load the Worldox API Libraries

Symphony OCR uses the workstation's system path to determine the location it should load libraries from.   Worldox sets the system path appropriately whenever Worldox, WDMirror or WDIndex are launched.   If you launch Worldox or WDIndex directly from your file server, then the system path will be configured pointing at the file server.   If you use WDMirror, then the system path will be configured to point at the local C:\Worldox folder of the workstation.   The latter is what you want.

Checking the System Path

  1. Open a command prompt (Start->Run, cmd)
  2. Type 'path'
  3. Check the resultant path for the Worldox folder.   If it is C:\Worldox, then this article doesn't apply to you.   If the path points to the Worldox folder on your network, then use the steps in this article to resolve the issue.
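
The same check can be scripted. The Python sketch below is an illustration with hypothetical function names. It assumes that the Worldox entry can be recognized by the word "worldox" appearing in the folder name, which may not hold if your Worldox folder has been renamed.

```python
import os


def worldox_path_entries(path_value=None):
    """Return the entries of a Windows PATH string that mention Worldox."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    entries = [entry.strip() for entry in path_value.split(";") if entry.strip()]
    return [entry for entry in entries if "worldox" in entry.lower()]


def is_local_worldox(entry):
    """True when the entry is the local C:\\Worldox folder."""
    return entry.rstrip("\\").lower() == r"c:\worldox"
```

If any entry returned by worldox_path_entries() fails is_local_worldox, the path points at a network location and the resolution below applies.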

Resolution

On the Symphony workstation, check all of the shortcuts you use to launch Worldox or WDIndex and confirm that they are using WDMIRROR.EXE to launch.   If you find any shortcuts that are launching worldox.exe, wdindex.exe, wbindex.exe (or any other executable besides wdmirror.exe), replace them with equivalent wdmirror.exe commands.

For example, here is the correct way to launch Worldox:

<network path to Worldox>\wdmirror.exe

or

C:\Worldox\wdmirror.exe

 

And here is the correct way to launch WDINDEX:

<network path to Worldox>\wdmirror.exe /wdindex

 

After you fix the shortcuts (remember to check the Start->Startup shortcuts also!!):

  1. Close Worldox, WDIndex, Symphony OCR (and Symphony Profiler if applicable)
  2. Launch Worldox
  3. Check the system path again and confirm that the path now refers to C:\Worldox
...

19.23. Very Large Documents Preventing others from Processing

Background

You may have a document in your document repository that is several thousand pages long.   While Symphony OCR will patiently process this document, it may take several days to complete, preventing other documents from being processed in the meantime.

Resolution

You can set a maximum page count for Symphony OCR to process.   This will keep Symphony OCR from getting tied up processing these very large documents.   To enable this, you can do the following:

  • Close Symphony OCR
  • Navigate to C:\Program Files\Trumpet\SymphonyOCR\Config\ and open the settings.xml file using notepad
  • The setting you want to add is the maximumPageCount attribute at the end of the line below:
                  <documentProcessor autostart="true" dryRun="false" maxProcessingErrorAge="3600000" maximumFileAge="9223372036854775807" minimumFileAge="30000" processTif="false" maximumPageCount="300" />
    Change the "300" to whatever you feel is appropriate
  • Save the settings.xml file
  • Launch Symphony OCR
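
For sites that prefer to script this change, here is a minimal sketch using Python's standard xml.etree.ElementTree module. It is an illustration only: the function name is hypothetical, and the sample XML wraps a shortened documentProcessor line in an invented <settings> root, because the full structure of the real file is not shown in this guide. ElementTree does not preserve comments or the original formatting, so keep a backup of settings.xml before writing anything back.

```python
import xml.etree.ElementTree as ET


def set_maximum_page_count(xml_text, page_count):
    """Set maximumPageCount on every <documentProcessor> element and return the XML."""
    root = ET.fromstring(xml_text)
    for element in root.iter("documentProcessor"):
        element.set("maximumPageCount", str(page_count))
    return ET.tostring(root, encoding="unicode")


# Hypothetical, trimmed-down stand-in for settings.xml.
SAMPLE = (
    '<settings>'
    '<documentProcessor autostart="true" dryRun="false" processTif="false" />'
    '</settings>'
)

if __name__ == "__main__":
    print(set_maximum_page_count(SAMPLE, 300))
```

As with the manual steps, Symphony OCR should be closed before the file is changed and relaunched afterwards.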

If you do opt to make this setting change, please let us know by emailing support@trumpetinc.com.   If we have several requests to do this, we may opt to add this to the User Interface.

...

19.24. ABBYY is probably not installed

Symptoms

Symphony OCR is in either a warning or error state with the error message indicating that ABBYY is not installed.

Background

Occasionally, as a part of a normal update, the OCR engine will need to be updated.   We've seen very rare cases where this engine will not automatically install.   This error is a result of the OCR engine not being installed properly.

Resolution

Manually install the OCR engine.   To do this, download www.trumpetinc.com/getresource/symphonyenginesetup and save it to the desktop.   Run this engine installer, and install to the default directory.

After this, re-run the normal Symphony OCR installer and launch.

 

...

19.25. "Unable to process msg request" or "Outlook has not been launched one time on this machine" error during MSG processing

Background

When processing any MSG file, an error is thrown stating one of the following:

Unable to process msg request [0xfffffffd]

or

Outlook has not been launched one time on this machine or the Outlook installation needs to be Repaired.  Error [0xfffffffd]

 

Discussion/Resolution

This error is ultimately caused by Symphony being unable to work with the Outlook MAPI sub-system.  We have seen a few causes of this:

1.  The Outlook installation is corrupted (in which case, use Programs & Features to do a Repair on the Microsoft Office installation)

2.   Outlook is in the middle of an automated update and is waiting for the machine to be rebooted (in which case, reboot the machine and see if that takes care of things)

3.  Outlook hasn't been launched and set up on the machine (you don't have to set up an actual mail account - but you do need to go through the Outlook Welcome Wizard)

4.  There are multiple instances of Outlook installed on the same machine (e.g. Outlook 2010 and Outlook 365 on the same machine), MAPI is configured to use one of the versions, but the user is launching and using a different version (in which case either run a Repair on the install that the user does use, or launch the other Outlook install one time and go through the Outlook Welcome Wizard)

 NOTE: SOCR is not compatible with Outlook 64bit. To resolve, install Outlook 32bit and launch one time.

...

19.26. Unable to bring up Symphony OCR interface

Symptoms

Unable to open Symphony OCR - the browser behaves as if the URL were bad.



Background

First of all, if Symphony OCR is installed as a service, please make sure the Symphony OCR service is running. Open up Windows Services to check for the service and make sure it is running.

Otherwise, we've seen some sites that don't handle the fully-qualified domain names properly.  Consequently, the 'Machine.Domain.local' URL fails.  To confirm this, change the URL to omit the ".Domain.local" portion.

For example, my default URL may be:  Indexer.Trumpet.local:14722/maestro/do/showWelcomeScreen

If this failed, I would then try:  Indexer:14722/maestro/do/showWelcomeScreen

(Omitting the ".Trumpet.local" portion.)

If this works, then the fully-qualified domain name is causing the problem.


Alternatively, if even that is not working, you could try inputting the machine's IP address:

192.168.1.###:14722/maestro/do/showWelcomeScreen


And if that continues to fail then you may have an issue with rerouting entirely, at which point you can input "localhost":

localhost:14722/maestro/do/showWelcomeScreen

*Please note that with this solution (using localhost), the Symphony OCR interface will NOT be able to open from another machine.
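
The fallback order described above can be written down as a short sketch. This is only an illustration: the function name is hypothetical, the port and page path are copied from the example URLs above, and the IP address must be supplied by you. The script only builds the addresses to try; it does not contact the server.

```python
def candidate_urls(fqdn, ip_address=None, port=14722,
                   path="/maestro/do/showWelcomeScreen"):
    """List the addresses to try, in the order described above."""
    hosts = [fqdn]
    short_name = fqdn.split(".")[0]  # drop the ".Domain.local" portion
    if short_name != fqdn:
        hosts.append(short_name)
    if ip_address:
        hosts.append(ip_address)
    hosts.append("localhost")  # only works on the Symphony machine itself
    return [f"{host}:{port}{path}" for host in hosts]


if __name__ == "__main__":
    for url in candidate_urls("Indexer.Trumpet.local"):
        print(url)
```

Try each address in turn in the browser and stop at the first one that brings up the Symphony OCR interface.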

Resolution

Once you're able to input a working URL, you may want to adjust the settings file so that it uses the new URL every time, instead of the default.  To do this, edit the default URL in the settings config file.  These are the steps:

  1. Close Symphony OCR (or if Symphony OCR is installed as a service, stop the service)
  2. Using Notepad, open the following file:  C:\Program Files (x86)\Trumpet\SymphonyOCR\config\settings.xml
  3. Find the line that begins with: <webServerConfiguration...
  4. Add the following attribute to that line: hostname="Indexer"

    (Note:  Your hostname will be different.  I'm following the first example from the 'Background' section above.)

    So the line might look like this:
    <webServerConfiguration autoPort="false" listenPort="14722" hostname="Indexer"/>

  5. Save file and restart Symphony OCR


...

19.27. Documents aren't searchable with NetDocument's ECHO turned on

Background

If you have the NetDocuments Echoing feature turned on, a checked-out document is downloaded to the Echo folder to improve performance and redundancy.  When you open that document from the NetDocuments interface, NetDocuments checks whether a copy exists in the Echo folder and, if so, opens that copy.

If your document has been OCR'd by Symphony OCR, the OCR'd copy will be in the document repository, but the local copy in your Echo folder may not have the OCR results.  If other users can see the OCR results within the file, but you cannot, it might be because you are utilizing the Echoing feature.

Resolution(s)

The NetDocuments knowledge base article "How does the Check In List and Echo Folder Work" describes how to disable Echoing and how to reset the Check In List, either of which may resolve the issue.





...

19.28. Symphony OCR doesn't automatically redirect to home page

Background

Symphony OCR does not automatically redirect to the home page upon launching, and the user is required to click the "Click Here" button to proceed.


Resolution

Control Panel > Internet Options > Security > Internet > Custom Level > Scripting > Enable



Control Panel > Internet Options > Security > Trusted Sites > Sites > Enter current website for Symphony OCR > Add


...

19.29. Symphony fails sending email: PKIX path building failed

Symptoms

Symphony displays the following error:

Unable to send emails - Communication error - sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target 

Background

This is caused by a change in new Java libraries.

Solution

Update Java:

  1. Stop Symphony (or stop the service if configured to run as a service)
  2. Go to Control Panel and uninstall all instances of Java
  3. Go to Java.com and select 'Download' in the main title bar
  4. Select 'See all Java downloads'
  5. Select 'Windows Offline'
  6. Install that version and then re-launch Symphony (or start the service if configured to run as a service)



...

19.30. Symphony OCR crashes during analysis

Background

Symphony OCR crashes unexpectedly with the user interface showing Java errors.  If you open the "C:\Program Files (x86)\Trumpet\SymphonyOCR\logs\maestro.log" log file, you'll notice an entry that looks like this:

2020-06-01 06:43:15,637 [Analyzer-2] ERROR com.trumpetinc.maestro.processormanager.ProcessorThreadPoolExecutor - Processing task failed with an error: Java heap space
java.lang.OutOfMemoryError: Java heap space
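
If the log is large, a short script can confirm whether this entry is present. The following is a minimal sketch: the function name is hypothetical, and the sample lines are copied from the excerpt above. It only searches for the literal exception text and makes no other assumption about the log format.

```python
def heap_space_errors(log_lines):
    """Return the log lines that report a Java heap space OutOfMemoryError."""
    marker = "java.lang.OutOfMemoryError: Java heap space"
    return [line.rstrip("\n") for line in log_lines if marker in line]


SAMPLE_LOG = [
    "2020-06-01 06:43:15,637 [Analyzer-2] ERROR "
    "com.trumpetinc.maestro.processormanager.ProcessorThreadPoolExecutor - "
    "Processing task failed with an error: Java heap space\n",
    "java.lang.OutOfMemoryError: Java heap space\n",
]

if __name__ == "__main__":
    print(len(heap_space_errors(SAMPLE_LOG)), "heap space error(s) found")
```

To run it against the real log, pass an open file object instead of SAMPLE_LOG, for example the result of open() on the maestro.log path above with errors="replace".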

Cause

Symphony OCR requires a substantial amount of memory to be available for analyzing and processing documents.  Some documents require more memory than what Symphony OCR is able to allocate to analyzing and processing the document.

Resolution

  • In version 7.2.49 and higher, Symphony OCR increased the default memory allocation to 1024M.  If you haven't already done so, go ahead and update to this version or later
  • Open the "C:\Program Files (x86)\Trumpet\SymphonyOCR\launch.ini" file in Notepad.  Ensure the 'MaxHeap' entry is set to '1024m'.  If it is not, change the value and save the file.  Restart the Symphony OCR program/service after the change is made to verify the Analysis queue is processed.
  • Lastly, if the above steps don't resolve your issue, there still may not be enough memory allocated to process a very small subset of documents.  We've seen this with large topographical maps, for example.  The following steps can be used to manually ignore offending documents.  This process may need to be repeated many times until the analysis backlog has been processed:
    • Stop the Symphony OCR program/service
    • Open the config file "C:\Program Files (x86)\Trumpet\SymphonyOCR\config\settings.xml" in Notepad (this is the default location, your system could be different)
    • Find the following lines and change the 'autostart' value from "true" to "false"

      <ocrManager autostart="true" maxThreads="0"/>
      <analyzerManager autostart="true" maxThreads="0"/>

    • Start the Symphony OCR program/service
    • Open the log file "C:\Program Files (x86)\Trumpet\SymphonyOCR\logs\maestro.log" in Notepad
    • Go to the bottom of the file and locate the last document with the status INPROCESS - Copy the path or Document ID number to the clipboard
    • Select 'Lookup Document' on the main Symphony OCR interface, and paste the path or document id number from the clipboard and click the "Query" button
    • In the details of the document, choose "Ignore"
    • After the file is ignored, restart the Analyzer - If Symphony OCR crashes again, repeat the above process until the Analyzer backlog is empty.
    • After the Analyzer queue has been fully processed, reset the default values in the "C:\Program Files (x86)\Trumpet\SymphonyOCR\config\settings.xml" config file - Find the following lines and change the 'autostart' value from "false" to "true"

      <ocrManager autostart="true" maxThreads="0"/>
      <analyzerManager autostart="true" maxThreads="0"/>
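
Locating the last INPROCESS entry by hand can be tedious in a long log. Here is a minimal sketch that returns the last line mentioning INPROCESS. The function name is hypothetical, and because the exact layout of those log lines is not documented in this guide, the sketch returns the whole line and leaves it to you to copy the path or Document ID out of it.

```python
def last_inprocess_line(log_lines):
    """Return the last line that mentions INPROCESS, or None if there is none."""
    last = None
    for line in log_lines:
        if "INPROCESS" in line:
            last = line.rstrip("\n")
    return last


if __name__ == "__main__":
    # Invented placeholder lines; real maestro.log entries will look different.
    demo = ["first INPROCESS entry\n", "something else\n", "second INPROCESS entry\n"]
    print(last_inprocess_line(demo))
```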

If you are unable to find the offending file(s), feel free to reach out to us at support@trumpetinc.com and we'll be happy to help.

If you do find an offending file is causing the problem, we'd love to get a representative sample to investigate in detail.  If possible, you can upload the file here:  https://extranet.trumpetinc.com/upload/clientdata

IMPORTANT: Under no circumstances should you email sensitive data to us. This upload site is secure; email is not a secure transport medium.



...

19.31. Handwritten pages are rotated by Symphony OCR when they shouldn't be

Background

Handwritten documents are upside down after having been processed by Symphony OCR.

Symphony OCR has the setting to Automatically Rotate documents turned on by default.  It makes every effort to determine the proper orientation for handwritten documents; however, handwriting often makes the page orientation hard to determine.


Resolution

Unfortunately, this algorithm is not something that can be tuned, so you have two options:

  • Turn off automatic page rotation
    • Open Symphony OCR
    • Choose "Processor"
    • Under "Basic Settings", uncheck the "Automatically rotate pages to proper orientation" check box
      Note: this will disable the rotation for all documents
  • Turn on logging to indicate which pages have been rotated:
    • On the workstation running Symphony OCR, navigate to C:\Program Files (x86)\Trumpet\SymphonyOCR\config
    • Open the log4j.properties file using notepad
    • Change the "log4j.logger.autorotatelog=ERROR, AUTOROTATEFILE" line to "log4j.logger.autorotatelog=INFO, AUTOROTATEFILE".
    • You will now have a listing of files that have been rotated in the log file at 'C:\Program Files (x86)\Trumpet\SymphonyOCR\logs\autorotate.log', and can manually re-rotate them.
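
The log4j.properties change can also be made with a few lines of script. This is a minimal sketch with a hypothetical function name: it replaces only the exact line quoted above and leaves every other line untouched. It works on text you pass in, so reading and writing the file (and keeping a backup) is left to you.

```python
OLD_LINE = "log4j.logger.autorotatelog=ERROR, AUTOROTATEFILE"
NEW_LINE = "log4j.logger.autorotatelog=INFO, AUTOROTATEFILE"


def enable_autorotate_logging(properties_text):
    """Switch the autorotate logger from ERROR to INFO, leaving other lines as they are."""
    lines = properties_text.splitlines()
    updated = [NEW_LINE if line.strip() == OLD_LINE else line for line in lines]
    return "\n".join(updated)


if __name__ == "__main__":
    print(enable_autorotate_logging(OLD_LINE))
```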
...

20. Not Processed List Description

20.1. Needs Attention

Needs Attention:  Documents in the 'Needs Attention' list are those that appear to be eligible for OCR, but encountered problems during processing.  Files in this list could be corrupted or contain invalid images (try opening them in an image viewer to be sure), or they may be images that Symphony OCR does not handle yet. 

Occasionally, a document can fall into the 'Needs Attention' list because of bad timing - Symphony OCR trying to process the document when it isn't fully available.  So we always recommend clicking the "Show Bulk Operations" button and then "Re-Analyze All", just to ensure this isn't the case.

If the document is corrupted, you can either remove the document from Worldox, or manually tell Symphony OCR to "ignore" it, which will put it on the 'Ignored' list.  If the 'Needs Attention' list contains any documents, the overall system condition will show as "Warn."  Ignoring a document that you have already checked is a good way to change the system condition back to "OK". 

If the document does not appear corrupted, the next step would be to allow us to see a copy of the file.  Because PDFs can be generated in countless different ways, we occasionally run into a specific sub-type of PDF that we've not encountered before.  If we can get a copy of the file that is falling into the 'Needs Attention' list, we can, in almost all cases, add support for the file.  Please contact us at support@trumpetinc.com for instructions to upload documents to our secure site.


...

20.2. New

New:   Documents in the New list are those that have been found by the finder tool, but not yet allocated to another document list (documents are only in the New state for a very short period of time).

...

20.3. Not Safe

Not Safe:  Documents in the 'Not Safe' list are those that appear to be eligible for OCR, but due to the PDF library that created those documents, they are not safe to process.  If Symphony OCR processes those documents, they become corrupted.  


...

20.4. Too Old

Too Old:   Documents in the Too Old list are those that have a file modified date older than the cut off age defined in the Processor configuration.

...

20.5. Inaccessible

Inaccessible:   Documents in the Inaccessible list are those that could not be processed because of file system security, Worldox security, read-only attributes or other conditions that prevent the document from being accessed and worked on.   In addition, if the profile group in which the documents reside contains an invalid base path (containing a space for example), or if the file has a space immediately prior to the document extension, they will be shown in the Inaccessible list.

...

20.6. Corrupted

Corrupted Documents - Documents in the corrupted list are those that Symphony OCR does not recognize as valid files. The most common reason is that the file is an invalid or corrupted PDF (try opening in Adobe to be sure).  Another possibility is that there is some characteristic of the PDF that the Symphony OCR parsing algorithm isn't handling properly.  Trumpet does periodically update the PDF parsing algorithms to address corner cases that have not been encountered before.

What to do?

Try opening the file in Acrobat, then hit Save (Acrobat will try to open and auto-repair corrupted files - when you save the document, it will save uncorrupted).  After saving and closing the document, click the Re-Analyze button on the document record in Symphony OCR.  This will only work if the file is only lightly corrupted, but is worth a shot.

If that doesn't help, next check to see if the file is already text searchable (i.e. can you search for text inside the PDF already?).  If you can, then the document isn't a candidate for OCR anyway, and you can just move the document to the Ignore list.

If the document does need to be OCRed, and the Adobe repair doesn't help, then you may want to submit the document to us for analysis.  Open a support ticket by emailing support@trumpetinc.com and we will send information on how to securely upload the document to us.  If we find a problem in our parsing algorithms, we'll fix the issue and get you a patch.

If there are a large number of files that have the same corruption reason, and the files don't appear to actually be corrupted, please open a support ticket by emailing support@trumpetinc.com and we will send information on how to securely upload a sample document to us.  If we find a problem in our parsing algorithms, we'll fix the issue and get you a patch.  Alternatively, you can use a bulk Ignore operation to move the documents to Ignore.

...

20.7. Encrypted / Restricted List

Encrypted / Restricted:   Documents in the Encrypted/Restricted list are those that are restricted from being processed because of some characteristic of the file itself (for example, an encrypted or partially restricted PDF file will not be processed).

...

20.8. Ignored

Ignored:   Documents in the Ignored list are documents that a Symphony OCR administrator has explicitly told Symphony OCR not to process. Any document on this list was explicitly placed there by human intervention.

...

20.9. Wrong Type

Wrong Type:   Documents in the Wrong Type list are TIF documents that were found while TIFF processing is not enabled.

...

20.10. Moved / Unavailable List

Moved / Unavailable:   Documents in the Moved / Unavailable list are no longer available in the Document Management System (DMS).   This could mean that the DMS has gone "offline" or the DMS settings have been adjusted so that the documents would not have been found for processing (e.g., if a user selects a profile group to analyze and OCR, and then chooses to un-check that profile group or no longer process it).   Document records in the Moved/Unavailable list will be deleted from the database after 15 days.   Documents can also appear in the Moved / Unavailable list if they are no longer at that current location.

...

20.11. Digitally Signed

Digitally Signed:   Documents that are digitally signed will not be processed by Symphony OCR because adding OCR information to these documents would invalidate the digital signature.   If you wish to have these documents OCRed anyway (and are OK with invalidating the digital signature), please send an email to support@trumpetinc.com and request that functionality be added.

...

20.12. Too Big

Too Big (8.0.0 and higher)

If a document falls into this list, it does NOT mean the document contains too many pages.  Symphony OCR processes files one page at a time.  So if a document falls into this list, it means the document contains one or more pages with pixel dimensions larger than a specified value.  In this version of Symphony OCR that value is 32512 x 32512 pixels.

This is a hard limit and cannot be overridden.
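
To get a feel for which scans hit this limit, the pixel size of a page can be estimated from its physical size and scan resolution. The following is a rough sketch with hypothetical function names. It assumes a page counts as too big when either dimension exceeds 32512 pixels; the guide does not spell out exactly how Symphony OCR applies the comparison.

```python
PIXEL_LIMIT = 32512  # per-page limit stated above for 8.0.0 and higher


def page_pixels(width_in, height_in, dpi):
    """Convert a physical page size in inches to pixel dimensions at a scan resolution."""
    return round(width_in * dpi), round(height_in * dpi)


def exceeds_limit(width_px, height_px, limit=PIXEL_LIMIT):
    """Assumption: a page is 'too big' if either dimension exceeds the limit."""
    return width_px > limit or height_px > limit


if __name__ == "__main__":
    for size in [(8.5, 11, 300), (36, 48, 600), (36, 48, 1200)]:
        width, height = page_pixels(*size)
        print(size, "->", (width, height), "too big:", exceeds_limit(width, height))
```

For example, a letter-size page at 300 dpi is 2550 x 3300 pixels, far below the limit, while a 36 x 48 inch drawing scanned at 1200 dpi is 43200 x 57600 pixels and would exceed it.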

Too Big (Prior to 8.0.0)

If a document falls into this list, it does NOT mean the document is too big.  Symphony OCR processes files one page at a time.  So if a document falls into this list, it means the document contains one or more pages with pixel dimensions larger than a specified value (i.e., the page couldn't be loaded into memory).  We usually see this in documents like blueprints or schematic drawings.  But there are some things we can do to try to get these types of documents processed, if you find that they need to be processed.

Clicking on the document in the 'Too Big' list will tell you the size of the offending page. 

1) Click on the 'Too big' list.

2) Click on the individual document in question.

3) The offending size of the document is available in the document details.



If you find you have a series of the same type of documents, it's usually the case that the same size file is exceeding the limit.  You can attempt to process these documents by modifying the value(s) declared in the settings.xml file.  (Defaults differ depending on the version you're running.)

...

20.13. Unprocessed Email

Unprocessed Email: This number indicates the number of email messages (.msg files) found in your repository. Of those .msg files, there may be attachments that would benefit from being OCR'd. The number you see under "Not Processed" does not indicate the number of eligible .msg attachments because those documents have not yet been analyzed.

Symphony OCR has a setting that allows you to analyze those .msg files and OCR any eligible attachments. To enable that setting review this article: Enable the Processing of Email Attachments.

Note: 32bit Outlook will need to be installed and launched once on the workstation that SOCR is installed to. If you have Worldox, you will also want to ensure that Text Indexing of Email Attachments is enabled.

...

20.14. No Image or Text

For the most part documents that do not contain an image or text don't need to be processed.  One example of this might be your company's PDF letterhead template.  By default these PDF documents will not be processed and will be placed in the "No Image or Text" Document List.  How can a PDF have no image and no text?

Content can be drawn in a PDF using one of three mechanisms:
  1. A bitmap image can be placed on the page. 
  2. Text can be rendered on the page using fonts. 
  3. Simple drawing operations (i.e. line or curve segments).
Scanned documents are always bitmaps drawn on the page - these documents are not text searchable without OCR.  If a document is created electronically (e.g. print to PDF from Word), the document will typically consist of text rendered using fonts - these documents are generally text searchable without OCR.  However, in some cases the font can't be embedded in the PDF (for font licensing or other reasons), and the text content is instead rendered using simple drawing operations.  When this happens, the documents are not text searchable without OCR.

Certain tools like AutoCAD always use drawing operations to render all content (line segments in an architectural drawing, for example), including text.

How does Symphony OCR handle no-image/no-text documents?  It is not possible for Symphony to reliably differentiate between drawn line segments that are part of words and drawn line segments that are just lines that are part of an architectural drawing, table lines, etc...  As such, we mark these documents for special handling 'No Image/No Text', and allow the user (that's you!) to force OCR of a given document if desired.
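
The reasoning above can be summarized in a small sketch. This is a simplified illustration, not Symphony OCR's actual analyzer: the function name, the mechanism labels and the category names are all hypothetical, and the handling of pages that mix mechanisms is an assumption. The point it shows is that drawing operations count as neither image nor text.

```python
def classify_page(mechanisms):
    """Classify a page by the drawing mechanisms it uses.

    mechanisms is a set drawn from {"bitmap", "font_text", "drawing"}.
    """
    if "font_text" in mechanisms:
        return "text-searchable"   # text rendered with fonts
    if "bitmap" in mechanisms:
        return "image-only"        # scanned image, candidate for OCR
    # Only line/curve drawing operations (or nothing at all) remain.
    return "no-image-or-text"


if __name__ == "__main__":
    print(classify_page({"drawing"}))
```

A page that consists only of drawn line segments, such as a CAD export, falls through to the last category, which matches the "No Image or Text" list.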

If you would like Symphony OCR to process a specific document (or set of documents), you may force processing.  Here's how:

To process only a single document in the list:

  • Select the "No Image or Text" item from the navigation panel on the left
  • Click on the appropriate document in the document path to open the Details view
  • Click "Enable Processing"

To process all No Image or Text files on a "per document" level:

  • Select the "No Image or Text" item from the navigation panel on the left
  • Optionally use the Filter box to get a sub-set of the documents
  • Select "Show Bulk Operations"
  • Choose "Enable Processing"

 To process all No Image or Text files permanently going forward:

  • Close Symphony OCR by using the Quit button in the bottom left corner of the interface (or, if Symphony OCR is installed as a service, stop the service)
  • Go to C:\Program Files (x86)\Trumpet\SymphonyOCR\config (may vary if you installed in a different location)
  • Right-click on the settings.xml file and select Open With > Notepad
  • Locate this line: <heuristicComputerProvider alwaysAnalyzeAndProcessNoImageNoTextPages="false"/>
  • Change the "false" to "true"
  • Save and close the file
  • Re-launch Symphony OCR (start the service if Symphony OCR is installed as a service)
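
Before relaunching, the edit can be double-checked with a short script. This is a minimal sketch using Python's standard xml.etree.ElementTree module. The function name is hypothetical, and the XML strings in the example use an invented <settings> root because the full layout of settings.xml is not shown in this guide.

```python
import xml.etree.ElementTree as ET

ATTRIBUTE = "alwaysAnalyzeAndProcessNoImageNoTextPages"


def always_process_no_image_no_text(xml_text):
    """Return True if the heuristicComputerProvider flag is set to "true"."""
    root = ET.fromstring(xml_text)
    for element in root.iter("heuristicComputerProvider"):
        return element.get(ATTRIBUTE, "false").strip().lower() == "true"
    return False  # element not present in this XML


if __name__ == "__main__":
    sample = '<settings><heuristicComputerProvider %s="true"/></settings>' % ATTRIBUTE
    print(always_process_no_image_no_text(sample))
```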

You can confirm this setting has been applied by viewing the Analyzer and Processor pages:


 

...

Configuration Guide

1. Licensing



License

This is where your Symphony OCR license is set. To change your Symphony OCR license, simply click "Licensing" from the Configuration side bar, enter your new license, and select "Save Changes."

License Details

Provides you details of the license.

Features Allowed by your License

This area tells you which features are allowed by your license.

Updating your License

Starting with version 6.4.96, Symphony OCR will have an 'Automatic License Update' feature.  Basically, after you've paid your renewal invoice with Trumpet, a new license is automatically generated.  So if your installation has access to the Trumpet servers, Symphony OCR will automatically see this new license, download it and install it.

Note:  Symphony will check for a new license once every 3 days under normal circumstances, and once per day when your license is within 30 days of expiring.
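
The checking schedule in the note above amounts to a simple rule, sketched here. The function name is hypothetical, and treating exactly 30 days as already inside the daily window is an assumption; the guide only says "within 30 days".

```python
def license_check_interval_days(days_until_expiry):
    """Days between license checks: daily inside the 30-day window, else every 3 days."""
    return 1 if days_until_expiry <= 30 else 3


if __name__ == "__main__":
    for days in (90, 30, 5):
        print(days, "days left -> check every", license_check_interval_days(days), "day(s)")
```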

If you've paid your invoice (and received notification of a new license) and don't want to wait for the automatic update to kick in, you can click the "Check for Updated License" link on this page.  This will manually trigger Symphony OCR to retrieve the updated license from Trumpet's servers.  As mentioned, all of this assumes your installation has access to Trumpet's servers.  If a connection cannot be established, you can always copy/paste your new license into this screen.

When you receive notification from Trumpet that your new license is generated, it is still highly recommended that you A) update your installation to the latest version of the software, and B) verify your license has been updated.

 

...

2. Notifications



Notifications allow users to be emailed nightly based on the status of Symphony OCR.

  • Enter the email address for the person you wish to receive notifications in the "Add e-mail address" box

  • Select the Notification Type from the drop down


Each email address may be configured with one of four types:

Never - nightly emails will never be sent to this recipient (instead, after entering an email address you can select "Send Now" and deliver an email to the recipient on demand).

When there are errors - the nightly email will only be sent to the recipient if the overall system condition is Error.  This is useful for recipients who only need to know when the system is not processing documents because of some major error (licensing issues are the most common major error).

When there are warnings or errors - the nightly email will only be sent to the recipient if the overall system condition is Warning or Error.  The warning condition is triggered by documents in the Needs Attention list, configuration problems or other system level issues that should be looked at, even though they haven't completely stopped processing from occurring.

Always (aka Daily) - the nightly email will be sent to the recipient every night regardless of system status.  This is useful for firms who want to monitor the 'Not Processed' lists to ensure that every document that couldn't be OCRed (e.g., because of security or corruption) has been reviewed.  Users can review documents in the various 'Not Processed' lists and either correct the underlying issue, or move the documents to the Ignore list using Bulk Operations >Ignore.

  • Select "Save Changes"

If you have a user leave the firm or you no longer wish for a particular user to be notified, you can change the Notification Type to "Never" or remove the user entirely by selecting "Remove" to the right of the address.
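
The four notification types can be read as a lookup from type to the system conditions that trigger the nightly email. The sketch below illustrates that reading. The function name is hypothetical, and the condition labels "OK", "Warning" and "Error" are assumed spellings of the overall system conditions described above.

```python
RULES = {
    "Never": set(),
    "When there are errors": {"Error"},
    "When there are warnings or errors": {"Warning", "Error"},
    "Always": {"OK", "Warning", "Error"},
}


def should_send_nightly_email(notification_type, system_condition):
    """Return True if a recipient with this type gets tonight's email."""
    return system_condition in RULES[notification_type]


if __name__ == "__main__":
    for kind in RULES:
        print(kind, "->", should_send_nightly_email(kind, "Warning"))
```

Note that "Never" only suppresses the nightly email; "Send Now" can still deliver one on demand.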

...

3. Worldox

Basic settings

Worldox User Code - This is where the Worldox user is specified. This is the user that Symphony OCR should search for documents as (note that Symphony OCR does not actually use a Worldox license). Symphony OCR will have access to all profile groups that the specified Worldox user has access to. This user should have Worldox Manager Rights. We recommend using the 000000 user. 

Worldox Network Folder - This is the network folder in which Worldox is installed.  It can be identified as a UNC path or a mapped network drive (e.g. \\server1\DMS\Worldox, or X:\Worldox) unless you are running Symphony OCR as a service, in which case it must be identified as a UNC path.

Profile Groups to Monitor

This is the list of profile groups the specified user has access to. If a profile group does not appear in the list, this user does not have access to that profile group (or the profile group has not been properly configured in Worldox). You can select the checkbox in the header area to automatically select and process all documents in all profile groups. If you wish to only process certain profile groups, you can simply select the applicable ones.  Be sure to select "Save Changes" at the bottom of the screen.

Default Priority - There are 6 processing priorities, which range from Very Low to Very High and include "Analyzer Only".  By default all profile groups will be processed with a "Normal" priority.  If you wish to change the priority for a particular profile group, select the appropriate item from the drop down arrow.  If you wish to re-prioritize documents that have already been found in that particular profile group as well as new documents that are in that profile group, select the "Reprioritize existing documents" checkbox.  For more information on Processing Priorities see:  Processing Priorities

Refresh - Allows you to refresh the list of available profile groups.  For example, if you have added a new profile group to Worldox, and wish to process that group, you can select this which will provide you with the newly added profile groups.

View Detailed Progress - Selecting this will take you to the Progress Details page.  This will provide you with a list of profile groups, the number of documents and pages that have been processed / not processed per profile group.

Advanced settings

Process Read Only Files - If you wish to process read-only files, you should check this checkbox.

Indexed Search Frequency - By default Symphony OCR will search for documents in selected profile groups once every 15 minutes using Indexed Searches.  This should be sufficient for your needs, however you can change this to search more or less frequently.

Non-indexed Search Frequency - By default Symphony OCR will search for documents in selected profile groups once every 12 hours without using Worldox indexes.  Because it takes a significant amount of time to crawl through the directory structure to find files, once every 12 hours should be sufficient.

Debugging

Reset Worldox Session - Selecting "Reset Worldox Session" will reset the Worldox session for the user defined in the Basic Settings above.

...

4. Open Text

Database login credentials

Login to database with username: Enter the username for the database

Login to database with password:  Enter the database password

Database computer name:  Enter the database computer name

Database server instance name:  This is optional and is required only if there is more than one database on the server.  In that case, enter the instance name you wish to process

Database name:  enter the name of the database

Advanced settings:

New document search frequency:  By default Symphony OCR will search for documents once every 15 minutes.  This should be sufficient for your needs, however you can change this to search more or less frequently

Legacy document search frequency:  By default, Symphony OCR will perform a search for legacy documents (documents existing prior to installing Symphony OCR) every 7 days.

...

5. PracticeMaster

Basic settings

PracticeMaster network folder/current working directory - This is where the PracticeMaster network folder is identified.  Copy and paste the path to the network folder into the field.

Documents folder - This is the root of where the documents reside within PracticeMaster.  Copy and paste the path to the folder into the field.

Advanced settings

Process Read Only Files - if you wish to process read-only files, you should check this check box

Finder Scan Frequency - by default Symphony OCR will search for documents once every 120 minutes.  This should be sufficient for your needs, however you can change this to search more or less frequently

 

 

...

6. ShareFile

Connect to ShareFile

  • Select the "Log In" link
  • You will be redirected to ShareFile.  Enter the username and password of the account that Symphony OCR should run as, then select "Login"

Basic Settings

ShareFile Account - Displays the user that Symphony OCR connects to ShareFile as

Folders to Monitor

This is the list of folders that the specified user has access to. If a folder does not appear in the list, this user does not have access to that folder.

  • Select the checkbox in the header area to automatically select and process all documents in all folders or if you wish to only process certain folders, you can simply select the applicable ones. 
  • Select "Save Changes" at the bottom of the screen.  If you wish to process certain folders at a higher priority than others, you can do so by selecting the appropriate drop down in the list.  For more information see:  Processing Priorities

View detailed progress - Selecting this link will take you to the Progress Details page.  This will provide you with a list of Cabinets, the number of documents and pages that have been processed / not processed per cabinet.

Advanced Settings

Search frequency - By default, Symphony OCR will perform a search for new documents every 60 minutes.  This value may be adjusted if you require searching for documents less frequently.

...

7. Folder

Folders to Monitor

This is the list of folders that Symphony OCR is monitoring.

Search Frequency - The frequency at which the Finder will query this directory tree for new PDF and TIF documents.

Default Priority - The priority level at which this directory will be processed.  For more information on setting document priorities see:  Processing Priorities

Add a folder

To add a folder or directory tree to the list of folders that Symphony OCR monitors, enter the path in the field and select the "Add" button on the right.  Symphony OCR will process the entire directory tree beneath the path you provide (e.g. X:\Clients will process all documents in the subfolders beneath X:\Clients, like X:\Clients\Anderson, Matthew and X:\Clients\Anderson, Matthew\Agreements).

Note:  If you wish to process files in a hidden folder, you must explicitly add that folder. For example, if you have a root folder like X:\Clients and under it a hidden folder called "Inactive" (e.g. X:\Clients\Inactive), you must explicitly add that folder to the monitored folders.
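The directory-tree behavior described above can be pictured with a short Python sketch. This is only an illustration of "process everything beneath the path you provide", not Symphony OCR's actual Finder code; the function name find_candidates and the set of extensions are assumptions made for the example, and the sketch does not model the hidden-folder rule from the note.

```python
from pathlib import Path

# Extensions treated as OCR candidates in this sketch (an assumption based on
# the guide's "PDF and TIF" wording, not a list taken from the product).
CANDIDATE_SUFFIXES = {".pdf", ".tif", ".tiff"}


def find_candidates(root):
    """Return every PDF/TIF file under root, including all subfolders."""
    root = Path(root)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in CANDIDATE_SUFFIXES
    )
```

Calling find_candidates on a monitored root returns files from every level of the tree, which is why adding X:\Clients is enough to cover X:\Clients\Anderson, Matthew\Agreements.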

Advanced Settings

Process Read Only Files - if you wish to process read-only files, you should check this check box

 

...

8. Box

Folders to Monitor

This is the list of folders that Symphony OCR is monitoring.

Search Frequency - The frequency at which the Finder will query this directory tree for new PDF and TIF documents.

Default Priority - The priority level at which this directory will be processed.  For more information on setting document priorities see:  Processing Priorities

Add a folder

To add a folder or directory tree to the list of folders that Symphony OCR monitors, enter the path in the field and select the "Add" button on the right.  Symphony OCR will process the entire directory tree beneath the path you provide (e.g. X:\Clients will process all documents in the subfolders beneath X:\Clients, like X:\Clients\Anderson, Matthew and X:\Clients\Anderson, Matthew\Agreements).

Note:  If you wish to process files in a hidden folder, you must explicitly add that folder. For example, if you have a root folder like X:\Clients and under it a hidden folder called "Inactive" (e.g. X:\Clients\Inactive), you must explicitly add that folder to the monitored folders.

Advanced Settings

Process Read Only Files - if you wish to process read-only files, you should check this check box

 


...

9. Dropbox

Folders to Monitor

This is the list of folders that Symphony OCR is monitoring.

Search Frequency - The frequency at which the Finder will query this directory tree for new PDF and TIF documents.

Default Priority - The priority level at which this directory will be processed.  For more information on setting document priorities see:  Processing Priorities

Add a folder

To add a folder or directory tree to the list of folders that Symphony OCR monitors, enter the path in the field and select the "Add" button on the right.  Symphony OCR will process the entire directory tree beneath the path you provide (e.g. X:\Clients will process all documents in the subfolders beneath X:\Clients, like X:\Clients\Anderson, Matthew and X:\Clients\Anderson, Matthew\Agreements).

Note:  If you wish to process files in a hidden folder, you must explicitly add that folder. For example, if you have a root folder like X:\Clients and under it a hidden folder called "Inactive" (e.g. X:\Clients\Inactive), you must explicitly add that folder to the monitored folders.

Advanced Settings

Process Read Only Files - if you wish to process read-only files, you should check this check box

 


...

10. Google Drive

Folders to Monitor

This is the list of folders that Symphony OCR is monitoring.

Search Frequency - The frequency at which the Finder will query this directory tree for new PDF and TIF documents.

Default Priority - The priority level at which this directory will be processed.  For more information on setting document priorities see:  Processing Priorities

Add a folder

To add a folder or directory tree to the list of folders that Symphony OCR monitors, enter the path in the field and select the "Add" button on the right.  Symphony OCR will process the entire directory tree beneath the path you provide (e.g. X:\Clients will process all documents in the subfolders beneath X:\Clients, like X:\Clients\Anderson, Matthew and X:\Clients\Anderson, Matthew\Agreements).

Note:  If you wish to process files in a hidden folder, you must explicitly add that folder. For example, if you have a root folder like X:\Clients and under it a hidden folder called "Inactive" (e.g. X:\Clients\Inactive), you must explicitly add that folder to the monitored folders.

Advanced Settings

Process Read Only Files - if you wish to process read-only files, you should check this check box

 


...

11. SharePoint

Connect to SharePoint

For a quick video showing the installation and configuration of SharePoint visit:  https://youtu.be/UNGbJiaRn9A

  • Enter the SharePoint Tenant URL (be sure to include the https://)  and choose "Connect to SharePoint"
  • Symphony OCR will redirect you to the SharePoint tenant site, where you will be prompted for the SharePoint username and password of the account that Symphony OCR should run as.
  • Enter the Username and Password
  • You will be prompted by SharePoint to trust "extranet.trumpetinc.com"; choose "Trust It"


By default, Symphony OCR will process all sites and subsites within your site tenant.


...

12. Scheduler


The Scheduler determines when and how frequently Symphony OCR performs specific tasks, such as when to send a heartbeat, when to search for new documents, when to purge backup files, etc.

To adjust a setting select "Edit" to the left of the specific setting you would like to adjust. 

To delete a specific Scheduler entry, select "Delete" on the right of the particular setting.

Most users will not need to change these items; however, there are special cases where you may wish to.  For example, if the firm runs its indexer software and Symphony OCR on a user's workstation, you may wish to only process items overnight.

...

13. Finder


The Finder is responsible for locating the documents in the document repository. 

Control

In the control area, you can choose to refresh the Finder or stop the Finder:

Refresh - Selecting Refresh will refresh the status of the Finder.

Stop Finder - Selecting this option will stop the Finder from finding documents in the document repository.

Status

There can be multiple tasks in the status depending on the firm's licensing and applicable document repository:

Worldox

Status

Worldox Indexed Search performs an indexed search to find documents that have been created or modified *today* that are eligible for OCR. By default, it performs the query every 15 minutes. This can be adjusted by selecting "Manage".  This will take you to the Worldox page where you can adjust the search frequency under Advanced Settings.

Worldox Non-Indexed Finder performs a non-indexed search to find all documents in Worldox that are eligible for OCR, regardless of how recently the document has been created or modified. By default, it performs this search once every 12 hours. This can be adjusted by selecting "Manage".  This will take you to the Worldox page where you can adjust the search frequency under Advanced Settings.

 

NetDocuments

Status

NetDocuments Recent Documents Search - performs a search to find documents that have been created or modified *today* that are eligible for OCR.   By default it performs the query every 15 minutes.   This can be adjusted by selecting "Manage".   This will take you to the NetDocuments page where you can adjust the search frequency under  Advanced Settings.

NetDocuments Legacy Documents Search - performs a search to find legacy documents that are eligible for OCR.   By default it performs the query every 7 days.   This can be adjusted by selecting "Manage".   This will take you to the NetDocuments page where you can adjust the search frequency under Advanced Settings.

Folders

Status

Folder Search - performs a search in the monitored folder structure to find all documents that are eligible for OCR regardless of how recently the document has been created or modified.   By default it performs this search once every 120 minutes.     This can be adjusted by selecting "Manage".   This will take you to the Folders page where you can adjust the search frequency for each folder.

 

 

 

 

 

...

14. Analyzer



The Analyzer is responsible for looking at each document and determining if it is eligible for OCR. If a document is eligible it is placed in the Processing list. If a document is not eligible, it is placed in the appropriate list (for more information on why a document might not be eligible for OCR, refer to the section, Not Processed List).

Control 

In the control area, you can choose to refresh the Analyzer or stop the Analyzer:

Refresh - Selecting Refresh will refresh the Status of the Analyzer page.

Stop Analyzer - Selecting this option will stop the Analyzer from analyzing documents in the document repository.

Status 

Displays the status of the Analyzer.

Information

Machine Processors - Indicates how many logical processors the workstation running Symphony OCR contains.

Licensed parallel processing - Indicates how many documents will be analyzed at a time based on your license features.

Recent Performance (since last restart)

Provides performance statistics such as the total number of documents and pages that Symphony OCR has found eligible for OCR, and the average speed of analysis per document since the last restart of Symphony OCR.

Overall Performance (since last restart)

Provides performance statistics such as the total number of documents and pages that Symphony OCR has found eligible for OCR, and the average speed of analysis per document since the last restart of Symphony OCR.

Settings

Do not analyze documents younger than - The default setting is 30 seconds. If you wish to have the Analyzer wait longer to analyze documents, simply change the value in the field. Trumpet recommends that this value is not decreased to less than 30 seconds to ensure that documents are fully written to the disk before processing.

To change this setting, simply type in the number of seconds, and then select "Save Changes".

...

15. Processor

Accessing the Processor

Select Processor in the navigation panel:


The Processor manages the actual OCR processes. Once a document has been identified as eligible for OCR by the Analyzer, the Processor confirms that the file is still eligible for OCR, and then OCRs the file. If a document is successfully OCRed, it is moved to the Processed list (for more information about the flow of documents throughout Symphony OCR, refer to the section Symphony Workflow, Tools & Document Lists).


Control

In the control area, you can choose to refresh the Processor or stop the Processor:

Refresh - Selecting Refresh will refresh the status of the Processor page.

Stop Processor - Selecting this option will stop the Processor from processing documents in the document repository.

Status

The status of the Processor (what it is currently processing).

Information

Processing Capacity Remaining - If you have a license that limits the number of pages you can process per year, the number of pages remaining will appear here.

Machine Processors - Indicates how many logical processors the workstation running Symphony OCR contains.

Licensed parallel processing - Indicates the number of documents that will be processed by the Processor simultaneously.

Recent Performance

Provides performance statistics such as the number of documents and pages that Symphony OCR has processed in a smaller sample size and the average speed of processing per page, the average number of pages per document and the effective throughput of the documents.

Overall Performance

Provides performance statistics such as the total number of documents and pages that Symphony OCR has processed and the average speed of processing per page, the average number of pages per document and the effective throughput of the documents.


Basic Settings

Process TIFFs (OCR and convert to PDF) - Symphony OCR can process TIFF files and convert them to image + text PDF files. This is an optional setting. If you wish to process TIFF documents, simply check this checkbox.

Note:  If the firm opts to process TIFF documents, this will change the file extension from .tif to .pdf.  This will "break" any relationships or projects that include this file.

Process MSG (email) attachments - Symphony OCR can process email message attachments.  This is an optional setting.  If you wish to process email message attachments, check this checkbox. 

<Big fat scary warning: 

Due to a limitation in newer versions of Office, Microsoft prevents us from accessing the DLLs that allow us to read/process emails under the following conditions: 
> Symphony OCR is configured to run as a service
> 'Process MSG (email) attachments' is checked
> Outlook 2013 (or possibly Outlook 2016) is open

In these circumstances, you're likely to see the following error:

Therefore, if Symphony OCR is being installed to run as a service *and* will be configured to process email attachments, it is our recommendation to install it on a machine that will not normally have Outlook 2013 (or possibly 2016) open.  On the bright side, our testing has shown that in these situations, Symphony is still processing normal documents and WILL eventually recover and process emails after Office is closed.  But if you can, we recommend avoiding this situation.  If your experience is different, we'd like to hear about it. 

End of big fat scary warning>

Do not process documents younger than - The default setting is 30 seconds. If you wish to have the Processor wait longer to process documents, simply change the value in the field. Trumpet recommends that this value is not decreased to less than 30 seconds to ensure that documents are fully written to the disk before processing.

Do not process documents older than - If you have older documents that you do not want Symphony OCR to process, enter a specific number of days for which the software should process backlog.

Automatically rotate pages to proper orientation -  If selected, the pages will rotate either landscape or portrait according to the text on the page.
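The two age settings above ("Do not process documents younger than" and "Do not process documents older than") amount to an age window. The following is a minimal sketch of that check, assuming a document's age is measured in seconds since it was last written; the function name is_within_age_window and its parameters are hypothetical and are not part of Symphony OCR.

```python
SECONDS_PER_DAY = 86400


def is_within_age_window(age_seconds, min_age_seconds=30, max_age_days=None):
    """Return True if a document of the given age should be processed.

    min_age_seconds mirrors the 30-second default described in the guide;
    max_age_days=None means no upper limit has been configured.
    """
    if age_seconds < min_age_seconds:
        # Too new: the file may not be fully written to disk yet.
        return False
    if max_age_days is not None and age_seconds > max_age_days * SECONDS_PER_DAY:
        # Older than the configured backlog limit.
        return False
    return True
```

With the defaults, a file written 10 seconds ago is skipped for now, while one written a minute ago is eligible.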


Original retention settings

Retain originals of processed files - If selected, Symphony OCR retains copies of the documents that it has processed. These copies appear as versions (if Symphony OCR processes a document 3 times, it will maintain copies of all 3 versions of the document). The user can restore previous versions of a document from the Symphony OCR backup using the document Details screen.

Purge originals of processed files after - The default setting is to retain the originals of processed files for 7 days after which they will be purged. If you wish to change this setting, you can change the value to the appropriate number of days for your firm.

Backlog throttling settings (only needed when your license does NOT have unlimited pages for processing)

Default processing capacity reserved for new documents (based on the actual number of new pages added each day) -  This is calculated from the number of pages that were added to the site in the past year.

Override the default processing capacity reserve - This sets the number of pages you would like to reserve for new documents, evenly spreading the page count capacity across the entire year. To determine a reasonable reserve, allow the Symphony OCR Analyzer module to run, then look at the timeline for the Processing Queue. Adding the number of pages in the first 52 weeks and dividing by 365 gives the average number of pages added to the system per day. Trumpet recommends adding an additional 10% to accommodate future growth or above-average filing. This value should be a reasonable overclocking reserve.
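The reserve arithmetic described above (sum the pages in the first 52 weeks, divide by 365, add 10%) can be written out as a small Python sketch. The function name daily_reserve and the idea of passing the weekly page counts as a list are assumptions made for illustration; in practice the numbers would be read off the Processing Queue timeline.

```python
def daily_reserve(weekly_page_counts, growth_margin=0.10):
    """Estimate the pages per day to reserve for new documents.

    weekly_page_counts: pages added per week, most recent week first.
    growth_margin: extra headroom; 0.10 matches the suggested 10%.
    """
    first_year = weekly_page_counts[:52]      # only the first 52 weeks count
    average_per_day = sum(first_year) / 365   # average pages added per day
    return average_per_day * (1 + growth_margin)
```

For example, a site that added 365 pages in each of the last 52 weeks averages 52 pages per day, so the sketch suggests a reserve a little above 57 pages per day.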

Advanced Settings:

Create thumbnails (if not already present) - Checking this checkbox will create thumbnails if they are not already present.

Enable OCR debug logging - This enables debug logging for support purposes and will help our support team address issues if necessary.  To conserve disk space, we recommend not enabling this unless requested by our support team.

Limit parallel processing to X documents - This allows you to limit the number of cores that Symphony OCR will utilize. It uses 1 core per document. For example, if you input 3, Symphony OCR will only use 3 cores and will process 3 documents simultaneously. See: How does Symphony OCR impact the performance of the server or indexer PC
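As a rough mental model of how this limit interacts with the "Machine Processors" and "Licensed parallel processing" values shown in the Information area, the sketch below takes the smallest of the three. That combination rule is an assumption made for illustration and is not stated in this guide; the function name effective_parallelism is likewise hypothetical.

```python
def effective_parallelism(machine_processors, licensed_parallel, user_limit=None):
    """Return how many documents would be processed at once.

    Assumes one core per document (as the guide states) and that the
    smallest applicable value wins (an assumption of this sketch).
    """
    candidates = [machine_processors, licensed_parallel]
    if user_limit is not None:
        candidates.append(user_limit)
    # Never report fewer than one document at a time.
    return max(1, min(candidates))
```

Under these assumptions, an 8-core machine with a license for 4 parallel documents and a configured limit of 3 would process 3 documents at a time.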


Upon making changes to any of the above settings, select "Save Changes".
...

16. Windows Firewall to Allow Web Browser Administration

Symphony OCR is administered via a web browser interface.   If Windows Firewall (or some other software firewall) prevents Symphony OCR from accepting inbound connections from the web browser, administration will not be possible.

Here's how to configure Windows Firewall to allow Symphony OCR to accept inbound connections:

Configure the Firewall on the Indexer PC:

  1. Connect to the Indexer PC
  2. Control Panel > Security > Windows Firewall
  3. In the left pane, click Allow a program through Windows Firewall.   The screen looks like this:
  4. Select "Allow Another Program"
  5. Select "Symphony OCR"
  6. Click "Add"
  7. Click "OK"
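Administrators who prefer the command line can create an equivalent inbound rule with Windows' built-in netsh advfirewall tool instead of the Control Panel steps above. The Python sketch below only builds the command text for review and does not run it. The rule name and the executable path are placeholders (assumptions), since this guide does not state where the Symphony OCR executable is installed.

```python
def firewall_rule_command(program_path, rule_name="Symphony OCR"):
    """Build a netsh command that allows inbound connections to a program.

    program_path is a placeholder: substitute the real path of the
    Symphony OCR executable on the Indexer PC.
    """
    return (
        "netsh advfirewall firewall add rule "
        f'name="{rule_name}" dir=in action=allow '
        f'program="{program_path}" enable=yes'
    )


# Print the command so it can be reviewed and then run from an
# elevated (Administrator) command prompt.
print(firewall_rule_command(r"C:\Path\To\SymphonyOCR.exe"))
```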
...

Updates & Licensing

1. Trumpet's Product Release Cycle

Version Numbers

Trumpet products are versioned with a three-part number (e.g. 2.7.3, 2.18.4).  All versions of the product are released serially (i.e. version 2.7.3 contains all of the changes from 2.7.2, 2.7.1, 2.6.18, 2.6.17, etc…).  New versions are created frequently (often once or twice per week), and each version change consists of a very small amount of changed or additional functionality (i.e. one bug fix or one new feature).

It is not at all unusual for Trumpet to produce 2 or 3 versions of a given product in a single week.

Trumpet also has reporting on which customers have which versions – and whether those installations are in an OK, WARN, or ERROR status level.  We also track support requests (issues) by which version of the software was installed at the time of the request.  This allows us to make quantitative assessment of the risk of a given version of each product.  If a given version is in use at many sites, all of which are OK, and there have been no reported issues for that version, then we can say with confidence that the build is stable and safe to deploy broadly.

Because each version bump incorporates a very small number of changes, it is very, very easy to identify any regression issues that arise.

Release Management

Trumpet has 3 phases that a given software version might go through:

DevRelease

Software is highly unstable.  No testing has been performed.  May contain known and unknown huge glaring bugs and problems. This is not available to download through your software, and should not (and could not) be deployed to any system unless Development is involved

PreRelease

The software is considered stable, and has passed internal QA, but we don’t have exhaustive experience at tons of sites – it may still contain unknown bugs, but regressions are highly unlikely

Production Release

Software has been proven to be robust at a large number of sites – any bugs remaining are small.

 

This means that the latest Production release will always have a version number equal to or lower than that of the latest PreRelease.  For example, 2.3.7 might be the latest Production release, and 2.3.23 might be the latest PreRelease.  The PreRelease would contain 16 small changes since the Production release was made.
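Because each part of the version is a number rather than a single digit, versions should be compared part by part (2.18.4 is newer than 2.7.3). The sketch below illustrates that comparison and the "16 small changes" arithmetic from the example above. The function names are hypothetical, and counting changes by subtracting the third number assumes both versions share the same first two numbers.

```python
def parse_version(text):
    """Turn '2.18.4' into (2, 18, 4) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def changes_between(production, prerelease):
    """Count the versions released after `production` up to `prerelease`."""
    prod = parse_version(production)
    pre = parse_version(prerelease)
    if prod > pre:
        raise ValueError("Production release cannot be newer than the PreRelease")
    if prod[:2] != pre[:2]:
        raise ValueError("this sketch only counts within one x.y series")
    return pre[2] - prod[2]
```

Here changes_between("2.3.7", "2.3.23") returns 16, matching the example.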

Periodically (usually around every 3 months), a review determines the latest PreRelease that is considered to be ready for Production.  This review consists of looking at how many sites are running the PreRelease, whether there have been any support requests made for versions between the current Production release and the PreRelease that may indicate an issue with the underlying code, and whether sites currently using the PreRelease are in a warning or error state. 

Installers for DevRelease are named ‘DevRelease-ProductName-x.y.z.exe’.  Installers for PreRelease are named ‘PreRelease-ProductName-x.y.z.exe’.  Installers for Production are named without a prefix (‘ProductName-x.y.z.exe’).
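The naming convention above makes it possible to tell a release phase from the installer's file name alone. The following sketch shows one way to do that. The pattern, the function name classify_installer, and the sample product name "SymphonyOCR" are assumptions for illustration; the guide only specifies the general 'ProductName-x.y.z.exe' shape.

```python
import re

# Optional phase prefix, then the product name, then an x.y.z version.
INSTALLER_PATTERN = re.compile(
    r"^(?:(?P<phase>DevRelease|PreRelease)-)?"
    r"(?P<product>.+)-(?P<version>\d+\.\d+\.\d+)\.exe$"
)


def classify_installer(filename):
    """Return (phase, product, version) for an installer file name."""
    match = INSTALLER_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"not a recognized installer name: {filename}")
    # No prefix means the installer is a Production release.
    phase = match.group("phase") or "Production"
    return phase, match.group("product"), match.group("version")
```

For instance, classify_installer("PreRelease-SymphonyOCR-2.3.23.exe") yields ("PreRelease", "SymphonyOCR", "2.3.23"), while the same name without the prefix is classified as "Production".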

Once a PreRelease version has been declared ready for production, a Production release is created with the same version number as the PreRelease (this is the exact same installer – we literally just rename the installer exe).

Once a Production version is identified, we generally bump the second number of the version for the next PreRelease we create.  For example, if a 2.3.16 PreRelease is marked as a Production release, the Production release will be 2.3.16, and the next change we make to the product will be under version 2.4.1.
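The usual version bump after a promotion can be expressed in a couple of lines. This is a sketch of the general rule described above ("we generally bump the second number"), so exceptions are possible; the function name is hypothetical.

```python
def next_prerelease_after_promotion(production_version):
    """Return the version expected for the first change after a promotion.

    Bumps the second number and restarts the third at 1, following the
    example in the guide (2.3.16 is followed by 2.4.1).
    """
    major, minor, _patch = (int(p) for p in production_version.split("."))
    return f"{major}.{minor + 1}.1"
```

Applied to the example, next_prerelease_after_promotion("2.3.16") gives "2.4.1".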

Risk Management

The reason we have these phases is to minimize the risk of exposing a given problem to a large number of users.  PreReleases tend to roll out gradually to a small handful of sites as we work with firms who actually need functionality, or to those sites who choose to install it pro-actively.  PreRelease and Production releases could be installed by anyone at any time by doing a Help->Check for Updates (or we may send a blast email announcing a new version’s availability).

This approach results in customers being able to install the latest ‘Known Good’ version on a regular basis (3 or 4 times per year), while still enabling customers who need changes or fixes to get rapid updates at extremely low risk.

How risky are PreReleases?

PreRelease versions are very stable.  If we are fixing a bug or adding new functionality, it is extremely rare that the work could cause problems for the useful functionality of earlier versions (we refer to this sort of issue as a ‘regression’, and our release management cycle is designed to prevent this sort of issue).  So there is a small chance that a given PreRelease might not completely fix the bug it was intended to fix, or there may be a subtle issue with new functionality that was added, but it is very rare that a given PreRelease would actually break the application in a meaningful way.

If you have an issue or need that the latest PreRelease fixes, it is generally a good idea to update, unless the issue is truly not important to your organization.

PreReleases (if available) can be obtained using the Check for Updates functionality available in all of Trumpet’s applications.

How risky are Production releases?

Production versions are not only very stable, but have had a good number of sites using the version without issue.

We recommend that you install all Production updates as they become available (though it’s perfectly fine to schedule this into your regular maintenance schedule).


...

2. Symphony OCR Licensing & Updates

Updating your Symphony OCR license / software is a two-step process: first update your Symphony OCR software, then update your Symphony OCR license.  The following are instructions for each of these operations:

Update Symphony OCR Software

These steps assume that you have received an email instructing you to update your Symphony installation.  Depending on the update notification, that email may contain your client code and/or license number.

  1. Connect to the workstation that Symphony OCR is running on
  2. From the upper right-hand corner of the Symphony OCR main page, click the 'Check for Updates' link

    Check for Updates location

  3. Save the Symphony OCR installer to your desktop or browser
  4. Once the installer has finished downloading, double-click to launch the install
  5. Click "Next", then "I Agree"
  6. Leave the installation folder as the default, then click "Next"
  7. Leave the selected radio button under "How would you like to run Symphony OCR", and click "Next". Alternatively, if you'd like to adjust this then do so now to change the way it runs (Service vs logged in user)
  8. After the installation completes, leave the "Start Symphony OCR" checkbox checked, then click "Finish"
  9. Symphony OCR will launch

That's all there is to it!

Update Symphony OCR License

Starting with version 6.4.96, Symphony OCR has an 'Automatic License Update' feature.  After you've paid your yearly invoice with Trumpet, a new license is automatically generated.  If your installation has access to the Trumpet servers, Symphony OCR will see this new license, download it, and install it.

Note:  Symphony will check for a new license once every 3 days under normal circumstances, and once per day when your license is within 30 days of expiring.
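The checking schedule in the note can be summarized as a simple rule. The sketch below is an illustration only, assuming "within 30 days of expiring" means 30 or fewer days remain; the function name license_check_interval is hypothetical and this is not Symphony's actual scheduling code.

```python
from datetime import date, timedelta


def license_check_interval(today, expiry):
    """Return how long to wait before the next automatic license check."""
    days_left = (expiry - today).days
    if days_left <= 30:
        # Close to expiry: check once per day.
        return timedelta(days=1)
    # Normal circumstances: check once every 3 days.
    return timedelta(days=3)
```

For a license expiring on 1 June, a check made on 1 January would be followed by another three days later, while a check made on 15 May would be followed by one the next day.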

If you've paid your invoice (and received notification of a new license) and don't want to wait for the automatic update to kick in, you can perform the following steps:

  • Navigate to the Licensing Page by selecting the Licensing link in the left hand navigation panel of Symphony OCR.
  • Select the "Check for Updated License" link on this page. 
  • This will manually trigger Symphony OCR to retrieve the updated license from Trumpet's servers. 

As mentioned, all of this assumes your installation has access to Trumpet's servers.  If a connection cannot be established, you can always copy/paste your new license into this screen.

When you receive notification from Trumpet that your new license is generated, it is still highly recommended that you A) update your installation to the latest version of the software, and B) verify your license has been updated.

...

3. Update Symphony OCR

These steps assume that you have received an email instructing you to update your Symphony installation.  Depending on the update notification, that email may contain your client code and/or license number.

  1. Connect to the workstation that Symphony OCR is running on
  2. From the upper right-hand corner of the Symphony OCR main page, click the 'Check for Updates' link

    Check for Updates location

  3. Save the Symphony OCR installer to your desktop or browser
  4. Once the installer has finished downloading, double-click to launch the install
  5. Click "Next", then "I Agree"
  6. Leave the installation folder as the default, then click "Next"
  7. Leave the selected radio button under "How would you like to run Symphony OCR", and click "Next". Alternatively, if you'd like to adjust this then do so now to change the way it runs (Service vs logged in user)
  8. After the installation completes, leave the "Start Symphony OCR" checkbox checked, then click "Finish"
  9. Symphony OCR will launch

That's all there is to it!

...

4. Symphony Suite Licensing & Updates

Updating your Symphony Suite license / software is a four step process:

  1. Update your Symphony Profiler Software
  2. Update your Symphony Profiler License
  3. Update your Symphony OCR Software
  4. Update your Symphony OCR License

The following are instructions for doing each of these operations:

Updating your Symphony Profiler license / software is a two-step process: first update your Symphony Profiler software, then update your Symphony Profiler license.  The following are instructions for each of these operations:

Update Symphony Profiler Software

These steps assume that you have received an email instructing you to update your Symphony installation.  Depending on the update notification, that email may contain your client code and/or license number.

  1. Connect to the workstation that Symphony Profiler Processor is running on (this is usually the indexer PC)
  2. In the Symphony Profiler Processor application, click Help > Check for Updates.  Save to your desktop
  3. Double-click the Symphony Profiler installer
  4. Click "Next", then "I Agree"
  5. Leave the installation folder as the root of the *network* folder that contains your Symphony Profiler installation (if you are updating, this should be pre-filled), then click "Next"
  6. After the installation completes, leave the "Install Symphony Profiler Processor" checkbox checked, then click "Finish"
  7. The Symphony Profiler Processor installer will now launch
  8. Click "Next"
  9. Leave the installation folder as the default (this will be installing to the local C drive of the workstation), click "Next"
  10. After the installation finishes, leave the "Start Symphony Profiler Processor" checkbox checked, then click "Finish"
  11. The new version of Symphony Profiler Processor will launch

That's all there is to it!

Note for sites where users have the local workstation component installed on their computers (the component is not technically required, but many still have it and prefer it): Once the back-end is updated, the workstations will receive an update notification the next time Symphony Profiler Workstation is launched (normally when users log in). To get the Workstation update sooner, you can close Symphony Profiler Workstation and re-launch it, then follow the update prompts.

Update Symphony Profiler License

Starting with version 1.7.28, Symphony Profiler includes an 'Automatic License Update' feature.  After you've paid your yearly invoice with Trumpet, a new license is generated automatically.  If your installation has access to the Trumpet servers, Symphony Profiler will detect the new license, download it, and install it.

Note:  Symphony will check for a new license once every 3 days under normal circumstances, and once per day when your license is within 30 days of expiring.

If you've paid your invoice (and received notification of a new license) and don't want to wait for the automatic update to kick in, you can perform the following steps:

  • Open the Symphony Profiler Processor (from the Indexer workstation)
  • Navigate to Edit -> Preferences
  • Select "Licensing" in the left hand navigation panel
  • Click the 'Check for Latest License' button on this page.

This will manually trigger Symphony Profiler to retrieve the updated license from Trumpet's servers.  As mentioned, all of this assumes your installation has access to Trumpet's servers.  If a connection cannot be established, you can always copy/paste your new license into this screen.

When you receive notification from Trumpet that your new license is generated, it is still highly recommended that you A) update your installation to the latest version of the software, and B) verify your license has been updated.

Updating Symphony OCR is a two-step process: first update your Symphony OCR software, then update your Symphony OCR license.  Instructions for each step follow:

Update Symphony OCR Software

These steps assume that you have received an email instructing you to update your Symphony installation.  Depending on the update notification, that email may contain your client code and/or license number.

  1. Connect to the workstation that Symphony OCR is running on
  2. From the upper right-hand corner of the Symphony OCR main page, click the 'Check for Updates' link

    Check for Updates location

  3. Save the Symphony OCR installer to your desktop or your browser's download location
  4. Once the installer has finished downloading, double-click to launch the install
  5. Click "Next", then "I Agree"
  6. Leave the installation folder as the default, then click "Next"
  7. Leave the selected radio button under "How would you like to run Symphony OCR" unchanged, then click "Next". If you want to change how Symphony OCR runs (as a service vs. as the logged-in user), select the other option before clicking "Next"
  8. After the installation completes, leave the "Start Symphony OCR" checkbox checked, then click "Finish"
  9. Symphony OCR will launch

That's all there is to it!

Update Symphony OCR License

Starting with version 6.4.96, Symphony OCR includes an 'Automatic License Update' feature.  After you've paid your yearly invoice with Trumpet, a new license is generated automatically.  If your installation has access to the Trumpet servers, Symphony OCR will detect the new license, download it, and install it.

Note:  Symphony will check for a new license once every 3 days under normal circumstances, and once per day when your license is within 30 days of expiring.
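
The check schedule in the note above can be summarized in a few lines. This is a minimal sketch of the rule as described, not Symphony's actual code: the function name and parameter are hypothetical, and it assumes that "within 30 days of expiring" includes day 30 itself.

```python
def license_check_interval_days(days_until_expiration: int) -> int:
    """Return how many days to wait between license checks.

    Hypothetical helper that mirrors the documented schedule only.
    """
    # Assumption: "within 30 days of expiring" includes exactly 30 days.
    if days_until_expiration <= 30:
        return 1  # check once per day close to expiration
    return 3      # otherwise check once every 3 days
```

For example, an installation whose license expires in 200 days would check every 3 days, while one that expires in 10 days would check daily.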

If you've paid your invoice (and received notification of a new license) and don't want to wait for the automatic update to kick in, you can perform the following steps:

  • Navigate to the Licensing Page by selecting the Licensing link in the left hand navigation panel of Symphony OCR.
  • Select the "Check for Updated License" link on this page. 
  • This will manually trigger Symphony OCR to retrieve the updated license from Trumpet's servers. 

As mentioned, all of this assumes your installation has access to Trumpet's servers.  If a connection cannot be established, you can always copy/paste your new license into this screen.

When you receive notification from Trumpet that your new license is generated, it is still highly recommended that you A) update your installation to the latest version of the software, and B) verify your license has been updated.

...

5. Update Symphony Suite

Symphony Suite consists of two software components that need to be updated individually.  Here are instructions for each:

Symphony OCR Update Instructions

Symphony Profiler Update Instructions

...

6. Symphony Suite Cloud Updates & Licensing

Because Symphony Suite Cloud (both the Symphony OCR and Symphony Profiler Cloud products) is updated on the cloud servers, no software updates are required.  In addition, licensing for the Symphony Suite products is pushed to the cloud servers automatically.  You can save the license you received for your records; no action is required on your part.


...

Release Notes

1. Release Summary 8.0

Summary

- Updated to the latest 64-bit version of the ABBYY FineReader Engine (FRE)

- Symphony OCR versions 8.0 and higher require a 64-bit Operating System

- Documents that are larger than 32512x32512 pixels will be moved to the Too Big list. These documents will not be processed, regardless of the checkForLargePages setting

- Updated SharePoint integration

- Resolved issues with null pointer exceptions

- Out of an abundance of caution, Symphony OCR was updated to ensure there are no log4j dependencies that could be vulnerable to log4shell

To see a complete list of changes, visit:  Change Log

...

2. Change Log

Changes

8.1.67

- No functional changes

8.1.66

- Explicitly disable the following SSL algorithms:
 SSLv3, TLSv1, TLSv1.1, RC4, DES, MD5withRSA, DH keySize < 1024, EC keySize < 224, 3DES_EDE_CBC, K_NULL, C_NULL, M_NULL, DHE_DSS_EXPORT, DHE_RSA_EXPORT, DH_anon_EXPORT, DH_DSS_EXPORT, DH_RSA_EXPORT, RSA_EXPORT, DH_anon, ECDH_anon, RC4_128, RC4_40, DES_CBC, DES40_CBC, DESede, TLS_RSA_WITH_AES_128_CBC_SHA, TLS_RSA_WITH_AES_128_CBC_SHA256, TLS_RSA_WITH_AES_128_GCM_SHA256, TLS_RSA_WITH_AES_256_CBC_SHA, TLS_RSA_WITH_AES_256_CBC_SHA256, TLS_RSA_WITH_AES_256_GCM_SHA384, TLS_DHE_RSA_WITH_AES_128_CBC_SHA, TLS_DHE_RSA_WITH_AES_128_CBC_SHA256, TLS_DHE_RSA_WITH_AES_128_GCM_SHA256, TLS_DHE_RSA_WITH_AES_256_CBC_SHA, TLS_DHE_RSA_WITH_AES_256_CBC_SHA256, TLS_DHE_RSA_WITH_AES_256_GCM_SHA384

8.1.64

- Fix for error 0xc0000135 on some Windows 11 machines

8.1.61 -

- Add temp file delete retry loop during storing of some .dat files (workaround for antivirus holding locks on files when it shouldn't be)
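
A delete retry loop of the kind described above might look like the following. This is an illustrative Python sketch only: Symphony OCR itself is not written in Python, and the function name, attempt count, and delay are assumptions.

```python
import os
import time


def delete_with_retry(path: str, attempts: int = 5, delay_seconds: float = 0.5) -> bool:
    """Try to delete a file, retrying while another process holds a lock on it.

    Returns True if the file is gone afterwards, False if every attempt failed.
    """
    for attempt in range(attempts):
        try:
            os.remove(path)
            return True
        except FileNotFoundError:
            return True  # nothing left to delete
        except PermissionError:
            # Typical on Windows while antivirus still has the file open.
            if attempt < attempts - 1:
                time.sleep(delay_seconds)
    return False
```

Retrying with a short pause gives the other process time to release its lock instead of failing the whole store operation on the first attempt.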

8.1.57 -

- Improved performance in how we copy files during database maintenance

8.1.55 -

- Bug fix - Worldox Index Finder not working for sites with Locales that use a date format other than m/dd/yyyy.  Symphony will now use the machine locale to determine the date format used in Worldox index search query strings.

8.1.54 -

- Java install download will now use https://resources.trumpetinc.com instead of software.trumpetinc.com

8.1.53 -

- Added futureModifiedDateDateCutoff config setting (default is 24 hours in the future - 24L*60L*60L*1000L) - any document with a modified date more than this cutoff in the future will be eligible for immediate processing
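
The default value 24L*60L*60L*1000L is 86,400,000 milliseconds (24 hours). A minimal sketch of the comparison, assuming the cutoff is applied against the current time and that a date exactly at the cutoff does not qualify (the function and constant names are hypothetical):

```python
from datetime import datetime, timedelta

# Default from the change log entry: 24L*60L*60L*1000L milliseconds.
DEFAULT_CUTOFF_MS = 24 * 60 * 60 * 1000


def is_future_dated(modified: datetime, now: datetime,
                    cutoff_ms: int = DEFAULT_CUTOFF_MS) -> bool:
    """Return True if the modified date is in the future by more than the cutoff."""
    # Assumption: strictly "more than" the cutoff, per the wording of the entry.
    return modified - now > timedelta(milliseconds=cutoff_ms)
```

With the default, a document stamped 25 hours ahead of the current time would qualify, while one stamped 23 hours ahead would not.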
 
8.1.52 -

- Added support for specifying folder locations in an optional folders.properties file (in the root folder of the application)
 - appHome
 - work

8.1.50 - 

-  Added File modified time to document detail display

8.1.48 -

- License update checks now go to https://lms.trumpetinc.com instead of https://partners.trumpetinc.com

8.1.46 -

- Bug fix - some TIFFs could result in mis-sized PDF files after OCR

8.1.44 - 

- Change OAuth bounce server from https://extranet.trumpetinc.com to https://oauth.trumpetinc.com
- Change notifications URL to host https://notifications.trumpetinc.com

8.1.42 -

- Improved handling when low level database failures happen (auto rebuild on restart)

8.1.39 -

- Bug fix - Status screen would list "XXXX integration has 0 warnings" when there actually was a warning

8.1.38 -

- Bug fix - if SharePoint connection dropped mid-run, the Connect button didn't show in the SharePoint config screen (user was forced to restart SOCR to get the Connect button in this scenario)

8.1.37 -

- Bug fix - Welcome wizard was showing even after the license was entered

8.1.36 -

- Bug Fix - "RPC Server" error messages during OCR on some Azure servers

8.1.34 -

- Change URL used for sending notification emails to use https://extranet.trumpetinc.com instead of https://webservices.symphonysuite.com

8.1.33 - 

- Bug fix - Application crash with Heap Space error when handling certain types of pages with huge rendered content streams

8.1.32 - 

- SOCR will now use the Windows SSL trust store instead of requiring private certificate registration

8.1.31 - 

- Issue fix - SOCR consumes excessive disk space (many logs/*.hprof files) when out of memory errors result in service restarting frequently
- We now purge all but the latest logs/*.hprof files when we startup 
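
The startup purge described above can be sketched as follows. This is an assumption-laden illustration, not the shipped implementation: it takes "latest" to mean the single most recently modified .hprof file, and the function name is hypothetical.

```python
from pathlib import Path


def purge_old_heap_dumps(logs_dir: str, keep: int = 1) -> list:
    """Delete all but the newest `keep` *.hprof files in logs_dir.

    Returns the names of the files that were removed.
    """
    # Newest first, judged by last-modified time (an assumption).
    dumps = sorted(Path(logs_dir).glob("*.hprof"),
                   key=lambda p: p.stat().st_mtime,
                   reverse=True)
    removed = []
    for old_dump in dumps[keep:]:
        old_dump.unlink()
        removed.append(old_dump.name)
    return removed
```

Other files in the logs folder, such as regular log files, are left untouched because only the *.hprof pattern is matched.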

8.1.29 - 

- Improvements to recovery when the application crashes due to a memory problem. Documents that were actively being analyzed are now moved to a special "Memory Error Suspect" list when the application restarts, instead of attempting to analyze them again.
- Documents that are actively being analyzed are now tracked in a new 'Analysis In Process' list instead of the regular 'In Process' list

8.1.27 -

- Added localhostOnly setting to <webServerConfiguration .... /> config - when true, the user interface will only be visible via 'localhost' or '127.0.0.1' URLs.

8.1.26 -

- Switch heartbeats to send over HTTPS instead of HTTP

8.1.25 -

- Improvement in handling connection timeout issues with SharePoint (documents will now be marked for reprocessing instead of going to the error list)

8.1.23 -

- Include a few more details in the overall system status (if there is only a single problem, surface it at the top level, and in the heartbeats)

8.1.20 -

- Added File Size and Page Count to CSV list export

8.1.17 - 

- Improve status messages during backup purge (we now give a count of the files that have been purged)


8.1.16 - 

- Backup purge algorithm has been changed to purge based on the backup file's creation date instead of the database document last modified date


8.1.13 - 

- Reduce database contention and improve performance during backup purge (we no longer grab database mutator locks unless the document actually has backups to purge)

8.1.11 -

- NetDocuments Finder enhancement - if there is a failure during finding, the Finder now enters a 60 second recovery loop before continuing with the next cabinet
- NetDocuments Finder enhancement - if the legacy finder had a failure, the error message from that failure stays in the Issues list until the legacy finder completes all cabinets.  For sites with really big legacy backlogs, this can result in the error appearing in the finder Issues list for a long time, even though the Finder is continuing to run.

 

20220128

8.1.10 -

- Bug fix - log output wasn't being written to files (regression introduced in 8.1.9)

8.1.9 -

- Update to ensure that there are no log4j dependencies that could be vulnerable to log4shell (abundance of caution, SOCR isn't public facing)

8.1.8 -

- Bug fix - NullPointerException in SharePoint finding under some situations
- SharePoint finders now report on throughput (docs/hr) while they are finding documents
- Add support for throttling SharePoint searches to a maximum number of documents per hour (when throttling, the Finder status will give a message that throttling is happening)
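
One simple way to hold a search to a maximum number of documents per hour is to space documents evenly, 3600 / max seconds apart. The sketch below illustrates that idea only; the class name and the even-spacing strategy are assumptions, and the actual Finder may throttle differently.

```python
import time


class DocsPerHourThrottle:
    """Space out work so that no more than max_docs_per_hour items are handled."""

    def __init__(self, max_docs_per_hour: int, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 3600.0 / max_docs_per_hour  # seconds between documents
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = clock()

    def wait_for_slot(self) -> float:
        """Block until the next document may be handled; return seconds waited."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay > 0:
            self._sleep(delay)  # this is the point where "throttling is happening"
        self._next_allowed = max(now, self._next_allowed) + self.min_interval
        return delay
```

A caller would invoke wait_for_slot() before handling each search result. A non-zero return value indicates that throttling took effect, which is the condition a status message could report.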

8.1.7 -

- Enhancement: Automatically create document record for NetDocuments documents when the user searches for them (i.e. 'Lookup Document', then pasting the ND doc id)

20211217

8.1.6 -

- Adjust 'Reanalyze Not Processed' task so it does NOT do anything with INACCESSIBLE documents

8.1.4 -

- Better error message if Worldox indexed search fails because of too many index hits (error message will now include the text: "Search xxxxxxxxxxxxxx resulted in more results than WDAPI can return. This limit is set by worldox.ini > [Debug] > FindSilentMax=xxxx where xxxx is the limit" )

8.1.3 -

- Bug fix - 'Adding Text to Image Failed - Null' error when processing some PDFs (this is a different issue from what was fixed in 8.1.2)

8.1.2 -

- Bug fix - 'Adding text to image failed - null' error on some malformed PDF documents

20211001

8.1.0 -

- Notification emails in cloud installs no longer include hyperlinks

- Document Details screen now has an Orientation Errors Detected value (to help with diagnosing the regression bug described in the next item) - this is only visible after orientation detection is needed (i.e. document processed by 8.0.0 -> 8.0.11). It will display Yes, No or Unknown (unknown means that orientation analysis still needs to be done - these had better be in the To Analyze queue!)

- Regression bug fix - introduced in 8.0.0 - Pages that required re-orientation prior to OCR did not get re-oriented, resulting in junk OCR results. This build automatically detects these problem documents, rolls them back to pre-ocr state, and re-OCRs them with auto-orientation. No user interaction is required for this to work - users may see a sudden large increase in the number of documents being analyzed, then OCRed

8.0.11 -

- NetDocuments legacy finder can now restart if it is interrupted (this should address situations where the backlog is very large and networking or ND errors cause the legacy search to fail at some point)

8.0.10 -

- Improvements to MICR processing (some MICR text wasn't being extracted in some documents)

8.0.9 -

- Write enableMicrProcessing to settings.xml even if it is false (makes it a little easier for users to turn MICR processing on)
- Re-enabled support for MICR processing (was disabled as part of the initial 8.0.0 release) - controlled by enableMicrProcessing setting in settings.xml

8.0.8 -

- Re-enable OCR of barcodes (was turned off starting in 8.0.0)

 

20210331

8.0.5 -

- Bug fix - SharePoint integration was completely broken since Feb 17, 2021 (change on Microsoft's end broke existing integration)

8.0.4 -

- Added explicit check for whether the page is greater than the maximum image size supported by the OCR engine - 32512x32512 - if it is, the document is moved to the Too Big list. This happens regardless of the checkForLargePages setting.
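
The size check above amounts to a comparison against the engine limit. In this sketch the function name is hypothetical, and it assumes a page counts as too big when either dimension exceeds 32512 pixels (the entry does not say whether one or both dimensions must exceed the limit).

```python
# Maximum image size supported by the OCR engine, per the change log entry.
MAX_OCR_DIMENSION_PX = 32512


def is_too_big(width_px: int, height_px: int) -> bool:
    """Return True if a page exceeds the OCR engine's maximum image size."""
    # Assumption: exceeding the limit in either dimension is enough.
    return width_px > MAX_OCR_DIMENSION_PX or height_px > MAX_OCR_DIMENSION_PX
```

Because this check is independent of checkForLargePages, a page that fails it would go to the Too Big list even when that setting is disabled.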

8.0.3 -

- updated SharePoint token so it does not expire

8.0.2 - 

- Large page check is now disabled by default. To enable, set <documentPreProcessor checkForLargePages="true" /> in settings.xml
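
If you prefer to script this change rather than edit settings.xml by hand, something like the following could work. It is a hedged sketch: it assumes only that a documentPreProcessor element exists somewhere in the file (the `<settings>` root used in the usage note is made up), and the helper name is hypothetical. Back up settings.xml first, and note that ElementTree does not preserve XML comments.

```python
import xml.etree.ElementTree as ET


def enable_large_page_check(settings_xml: str) -> str:
    """Return the settings XML with checkForLargePages="true" set."""
    root = ET.fromstring(settings_xml)
    # Search the whole tree, since the element's exact location is not documented here.
    elements = list(root.iter("documentPreProcessor"))
    if not elements:
        raise ValueError("no <documentPreProcessor> element found")
    for element in elements:
        element.set("checkForLargePages", "true")
    return ET.tostring(root, encoding="unicode")
```

For example, passing `<settings><documentPreProcessor checkForLargePages="false" /></settings>` returns the same document with the attribute switched to "true".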

8.0.0 - 

- OCR engine now requires a 64-bit operating system

 

20201119

7.3.2 -

- Bug fix - large pages were causing SOCR to crash with out of memory errors
- Improve detection of pages that are too large to fit in memory


20200922

7.3.0

- Bump version

7.2.57

- Updated license agreement to refer to Trumpet, LLC instead of Trumpet, Inc.

7.2.56

7.2.55

- Bug fix - Aspose and Nuance files that were already marked as not-safe, continued to be marked as not-safe, even during re-processing

7.2.54

- Bug fix - introduced in 7.2.51 - really big documents were failing with errors about the total content stream being too large. We now limit only the size of a single content stream - not the total across all pages

7.2.53

- Documents in the NotSafe list will be marked for reprocessing when SOCR launches following an install update

7.2.52 - 

- If page content requires more than 1M of RAM to extract, we will mark the page as not needing OCR. This will make the file NOT appear on the corrupted list, and will allow SOCR to process other pages that might need OCR.

7.2.51 - 


- If page content requires more than 1M of RAM to extract, the document is marked as corrupted with a note that reads ""Page " + page + " can not be read into memory - the page is not really corrupted, but cannot be analyzed because of it's size""
- Enhance Aspose.PDF detection so it is not case sensitive

7.2.50 - dev release

- Documents with producer "Aspose.PDF" are no longer marked as 'not safe' (reversal of change made in 7.2.19) - these documents can now be safely processed by SOCR
- Documents with producer "Nuance PDF Creator" are no longer marked as 'not safe' (reversal of change made in 7.2.29) - these documents can now be safely processed by SOCR

7.2.49

- Add Begin Analysis and Analysis complete entries to document history
- Added document history for documents found in INPROCESS during application startup
- Added file size to documents that are in INPROCESS during application startup

7.2.38 - dev release

- Bug fix - inaccurate 'Document is accessible - can't query for reason' error message

7.2.35 -

- For normal installs, a default launch.ini will be installed (if there isn't one already) that sets the max heap space to 1024m


7.2.34 -

- Update learn more hyperlink for NOT_SAFE lists to point at new kbook article

7.2.33 -

- Make Nuance PDF rollback debug screen skip documents that have unknown document sources (previously, the entire operation failed)

7.2.32 -

- Bug fix - Nuance PDF rollback debug screen (added in 7.2.29) could have NullPointerException if the Symphony database has entries added by really old versions of Symphony

7.2.31 -

- Bug fix - Nuance PDF rollback debug screen (added in 7.2.29) could have NullPointerException if the Symphony database has entries added by really old versions of Symphony

7.2.29 -

- Added special handling for Nuance PDF Creator documents (they are moved to Not Processable for the time being while we work through a bug)
- Added Debug screen operation to roll back Nuance PDF Creator documents that Symphony OCRed during affected versions (7.1.0 through 7.2.29)

20200313

7.2.28 -

- Some SharePoint search results had 'null' file sizes - this was causing Finder to fail. We now handle these oddities gracefully.

7.2.27 -

- More informative error logging if SharePoint API fails

7.2.24 -

- Bug fix - corner case - if MSG record is deleted from the SOCR database, analysis of attachments for that MSG fail with NullPointerException

7.2.23 -

- Bug fix - We now check if a document was checked out by a user during the OCR process. If so, the OCR results are thrown away and the document will be reprocessed instead of saving a new version.

7.2.22 -

- Bug fix - reanalyzing MSG files was resulting in error java.lang.IllegalStateException: initialFileModifiedTime can only be set once

7.2.21 -

- Added Debug Screen command ("Search for and roll back non-unity CTM problem documents") to identify and roll back documents impacted by the regression that was fixed in 7.2.20

7.2.20 -

- Regression fix - introduced in 7.1.0 - some source PDFs result in OCR invisible text not being placed properly on the page

7.2.19 -

- Special handling for PDFs with producer string containing "Aspose.PDF" - these files can't be safely processed by SOCR yet - files with this producer are placed on the Not Safe list
- Added Not Safe list
- Adjusted the document details screen so it includes the corrupted or not-safe reason (if one is present)
- Added 'Search for and roll back Aspose problem documents' to the debug screen - when pressed, it will iterate through all documents that were modified by SOCR between 7.1.0 and 7.2.19, check their PDF Producer string. Any documents with producer containing "Aspose.PDF" will be marked for roll-back.

7.2.15 -

- Bug fix - if the underlying DMS fails when copying a read-only copy of a document, the Document was being put directly into the Reprocessing list. It now goes into the Inaccessible list.

7.2.14 - dev release

- Bug fix - null pointer exception in exception handler in getDocumentsByStateReverse()

7.2.13 -

- Bug fix (minor) - MSG attachments weren't capturing initial modified time properly - this resulted in a lot of log chatter when the weekly summary routine was running

7.2.12 -

- Bug fix - SharePoint user count was including external users

7.2.11 -

- Bug fix - digitally signed documents were not being OCRed when the settings.xml documentPreProcessor setting allowDigitallySigned was set to true

7.2.10 -

- Regression bug fix (introduced in 7.2.9) - MetaJure feature resulted in Folder feature not working properly. This is now fixed.

7.2.9 -

- MetaJure integration licensing (J feature code) now enables Folder DMS integration

 

20190923

7.2.8 -

- Labeling consistency change - Change instances of 'Invisible Words' to read 'Hidden Words'

7.2.7 -

- Remove extraneous apostrophe from the 'Symphony OCR is already running' dialog

7.2.6 -

- Add additional handling for 0x80030109 responses during MSG editing (STG_E_DOCFILECORRUPT - The doc file has been corrupted) - when this happens during OCR, we now mark the document as corrupted instead of reprocess

7.2.5 -

- Add handling for 0x80030109 responses during MSG editing (STG_E_DOCFILECORRUPT - The doc file has been corrupted)

7.2.4 -

- Added 'System Memory Information' and 'Disk Information' to log output when application launches

7.2.0 -

- Moved to JRE 11 (and Trumpet private Java runtime, non-Oracle)

7.1.8 -

- Work around for PDFs with very deep xref versions (tpt86114)

7.1.7 -

- Bug fix - SOCR hangs when interacting with UNC based very long filenames (>260 characters)

7.1.5 -

- Added Needs Attention document count to warning message in UI and heartbeats

7.1.3 -

- Workaround for invalid PDFs that don't have bounding boxes defined for all pages (the bounding box defaults to standard portrait letter in these cases). Note that these PDFs are not compliant with the PDF spec (MediaBox is required), so there is no guarantee that the resulting text placement will be correct, but testing on the few problem files we've seen has been successful.

7.1.2 - dev release only

- Mark documents with "CCITT codec error" messages as corrupted instead of needs attention

7.0.30 -

- Added corrupt_db.log file (new log4j.properties) that contains only log errors related to corrupted databases (should allow us to narrow down time window of when corruption occurs)

20190403

7.0.29 -

- Bug fix - Worldox API re-initialization could cause Symphony to crash under Worldox WDU14 (race condition exacerbated by WDU14 changes)

20190225

7.0.28 -

- Bug fix - SharePoint files with single quotes in their names resulted in errors
- Bug fix - timeout/429 errors caused some SharePoint integration calls to fail

20190104

7.0.27 -

- Support for firms using proxy servers that use private certificate authorities to proxy HTTPS traffic
- Custom certificate authorities can now be registered in a Java Key Store file stored at /config/cacerts.private - see document 141530 for instructions on obtaining a certificate and loading it into a private trust store

7.0.26 -

- Some corrupted MSG files were being put on the Reprocessing list (error code 0x8004010f now sends the document to Corrupted)

7.0.24 -

- Process Read Only setting in Folder settings wasn't being honored (resulted in 'Access Is Denied' error after OCR completed for documents with read only attribute set)

7.0.22 -

- Added initial support for multiple libraries with OpenText

7.0.19 -

- Bug fix - MSG attachments were being moved to the priority of the parent MSG document record when the attachment was reprocessed (if the MSG was set to Analysis Only priority, but the attachment was set to High, if the attachment was re-analyzed for any reason, the priority of the attachment was switched to Analysis Only)

7.0.17 -

- Workaround for non-compliant PDFs that don't properly specify page size for all pages (MEDIABOX missing)

7.0.16 -

- Enhancement - special analysis handling for pages that have invisible text in the margins (i.e. invisible text stamp)

7.0.15 - 

- Adjust OpenText/eDocs integration to handle documents that are in-use and documents that have been deleted

7.0.14 -

- Adjusting OpenText/eDocs integration to work properly with live site

 

20181017

7.0.13 -

- Bug fix - Worldox integration fails at sites that had UNC paths containing spaces and wdcommon\wdmirror.ini files referencing drive letter based CPs

7.0.12 -

- Bug fix - ND OAuth token wasn't being saved for brand new sites

7.0.11 -

- Changes to NetDocuments OAuth token handling to support upcoming NetDocuments OAuth changes

20180706

7.0.10 -

- Bug fix - 'Process Read Only' setting in Folder configuration screen didn't stick

7.0.8 -

- Bug fix - Occasional 'java.lang.ArrayIndexOutOfBoundsException: 100' error when processing under load - could result in 'Database is corrupted' error message in user interface.

7.0.5 -

- Bug fix - uninstaller wasn't being registered for "run as logged in user" installations


6.6.98 -

- Bug fix - Refresh button in NetDocuments settings screen took user to NetDocuments US Vault login screen. The refresh button now just refreshes the cabinet list.
- Added Renew Connection button to NetDocuments setting screen

6.6.97 - dev release

- Bug fix - 'New pages added in past year' on summary screen could show incorrect values for up to 12 hours when documents are re-analyzed
- Bug fix - 'New pages added in past year' label always showed '(this year)' instead of '(pages/year)' when we had a full year's worth of data available

 

20180511


6.6.92 -

- Bug fix - NetDocuments made an API change on 4/20/2018 that caused our "Open" links to not take the user to the document in ND

6.6.90 -

- Bug fix - SharePoint sites that had spaces in their name were resulting in "Illegal character in path" Finder errors


6.6.89 -

- Added ability to reprocess documents in the Processing (TOPROCESS) list (just in case they need to be re-analyzed manually)

6.6.88 -

- Statistics screen now displays estimated time to process backlog based on 4 cores (1.2 seconds/page) if there is no OCR performance data to baseline against (i.e. analysis only licenses). The label on the estimate will show "(assuming 4 CPU cores)" in this case.
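
The fallback estimate is simple arithmetic: backlog pages multiplied by seconds per page. The sketch below assumes that 1.2 seconds/page is the effective overall rate for a 4-core machine (not a per-core rate). The function name is hypothetical.

```python
# Fallback rate from the change log entry: 1.2 seconds/page, assuming 4 CPU cores.
FALLBACK_SECONDS_PER_PAGE = 1.2


def estimate_backlog_hours(backlog_pages: int,
                           seconds_per_page: float = FALLBACK_SECONDS_PER_PAGE) -> float:
    """Estimate the hours needed to process a backlog at a given page rate."""
    return backlog_pages * seconds_per_page / 3600.0
```

Under these assumptions, a backlog of 3,000 pages works out to about 1 hour, and 300,000 pages to about 100 hours.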

6.6.87 -

- Bug fix - Fixed OutOfMemoryError under high analysis or OCR load


6.6.85 -

- Bug fix (kinda) - Legacy files in Worldox with ~ at the beginning were appearing in the corrupted list.  Worldox indexed searches were sometimes returning files with ~ at the beginning (these are generally temp files that shouldn't have been part of the indexes and certainly shouldn't be OCRed)

6.6.84 -

- Added accessibility message for Worldox integration informing the user that WD versions prior to 20180412 do not support paths longer than 255 characters
- Added accessibility message for Worldox integration for sites with WD versions later than 20180412 that WD does not support paths longer than 380 characters
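
The two accessibility messages imply a version-dependent path limit, which can be sketched as below. The function names are hypothetical, the Worldox version is represented as a YYYYMMDD integer for illustration, and the handling of version 20180412 itself (treated here as supporting 380 characters) is an assumption, since the entries only mention versions prior to and later than that date.

```python
def worldox_max_path_length(wd_version: int) -> int:
    """Return the longest path (in characters) supported by a Worldox version.

    wd_version is the Worldox build date as an integer, e.g. 20180412.
    """
    # Assumption: build 20180412 itself gets the newer 380-character limit.
    return 255 if wd_version < 20180412 else 380


def is_path_supported(path: str, wd_version: int) -> bool:
    """Return True if the path length is within the limit for this version."""
    return len(path) <= worldox_max_path_length(wd_version)
```

A 300-character path would therefore be flagged as inaccessible on a 2017 build but accepted on a build newer than 20180412.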


6.6.82 -

- Added conditional logic to Worldox integration to allow spaces after filename and before the file extension (WD code running after 20170601 allows spaces)


6.6.81 -

- Improved detection of pages that should be OCRed even though they have excessive text in margins
- Improved detection of pages that should not be OCRed even if they have a full image on the page and rendered text beneath the image
- Added additional columns to Document Details screen, page analysis results


6.6.76 -

- Bug fix - setting for limiting maximum number of cores used during OCR was not being honored
- Changed Processor configuration UI for clarity around maximum allowed parallel processing settings

6.6.73 -

- Bug fix - SOCR doesn't shut down at NetDocuments sites that were actively searching for documents

6.6.71 -

- Bug fix - NetDocuments configuration screen not showing error/warning details for Analyzer-only licenses
- Bug fix - NetDocuments integration was having preserveModifiedInfo set to false after setup with Analyzer-only licenses

6.6.69 -

- Added separate NetDocuments connection buttons for US, EU and AU vaults

6.6.68

- Bug fix - Welcome Wizard didn't work properly if there were features in the license that required DMS configuration to work properly (# of ND or SP users, for example)
- Removed the 'Manage' button from Issues list in welcome wizard
- Improved error message for invalid tenant URLs


6.6.64 -

- Bug fix - blank pages without any content stream were marked as corrupted instead of blank

6.6.59 -

- Bug fix - the user interface prevented setting the SharePoint legacy search frequency to 0.  This is now allowed.

6.6.58 -

- If SharePoint legacy search frequency is set to 0, the legacy search will be skipped

6.6.57 -

- Bug fix - Fixed a bug with Rollback so it would work with the first button click (from the document detail page)
- Added the bulk operation "Rollback" to the Processed documents page. When clicked, all documents in the current search will be rolled back to their non-OCRed version
- Added the bulk operation "Reanalyze" to the Reprocessing documents page. When clicked, all documents in the current search will be moved to the Analyzing bucket


6.6.55 -

- Reworked the processing metrics on the Analyzer, Processor and Summary pages to show more useful data in a more user friendly fashion
- Removed all support for Bonus Page tracking
  
6.6.54 -

- Moved to jWDAPI 1.0.22 to have the WorldoxSession fast fail if the user is invalid
- Backed out the previous Worldox invalid user fast fail code

6.6.53 -

- Fixed bug where wrong page tracker was being used by the Analyzer

6.6.52 -

- Modified processor core algorithm to not consider physical cores on the machine. Solely determined by the license now

6.6.51 -

- Fixed bug in the ProcessorManager config that was causing maxThreads to be persisted and thus override calculated maxThreads
- Updated the version check url to be https to work with the new version update https redirection

6.6.50 -

- Tracked WorldoxConnection failure due to invalid user, and quick fail on repeated calls to the connection until a valid user is provided.

6.6.48 -

- Moved to jlicensing 1.0.10 to support new license expiration warning logic

 
6.6.46 -

- Created a ProcessorManager that will now create Processor tasks that do the work. These tasks will be added to an executor service so we can run multiples in parallel. A ProcessorManager will be created for each document processing type (analysis, ocr, rollback)
- Split the processor mgmt (stop, start) config into a new section, and provided migration for it
- Renamed the processors to AnalyzerProcessor, OCRProcessor and RollbackProcessor. Supporting classes followed suit
- Added a WorkingFolderProvider to track and manage working folders for the managers
- Created ProcessorThreadPoolExecutor for use by the ProcessorManager. It has the ability to block task addition until a thread is available
- Refactored the OCRProvider to remove the generic parts, since we only support a single OCR engine
- Refactored the page count handling
- Implemented dripMode in the ProcessorManager
- Deleted OldPageCountFeature
- Made processor factory config classes immutable
- Added more support for Processor status
- Added a maxDripsBeforeHalting setting to ProcessorManager, to allow dripMode to halt after X docs are processed, rather than just 1
- Fixed bug in the Analyzer config screen that wasn't persisting the isAllowMsgAttachments setting
- Added custom message support to the ErrorTracker so we could get better error messages in the UI
- Updated Processor and Analyzer web pages to show a list of documents being processed
- Updated statistic verbiage per changes
- Modified page statistics to divide results by the number of running threads
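
The "block task addition until a thread is available" behavior mentioned for ProcessorThreadPoolExecutor can be illustrated with a small Python analogue. This is not the product's class (which is part of a Java application); the class name and the semaphore-based design are assumptions chosen to show the technique.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class BlockingExecutor:
    """Thread pool whose submit() blocks while every worker is busy."""

    def __init__(self, max_threads: int):
        self._pool = ThreadPoolExecutor(max_workers=max_threads)
        # One slot per worker thread; submit() must obtain a slot first.
        self._slots = threading.Semaphore(max_threads)

    def submit(self, task, *args):
        self._slots.acquire()  # blocks until a worker is free
        try:
            future = self._pool.submit(task, *args)
        except BaseException:
            self._slots.release()
            raise
        # Free the slot as soon as the task finishes (successfully or not).
        future.add_done_callback(lambda _finished: self._slots.release())
        return future

    def shutdown(self):
        self._pool.shutdown(wait=True)
```

Blocking at submission keeps the producer from queuing an unbounded number of documents ahead of the workers, so no more tasks are outstanding than there are threads.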


6.6.45e -

- Bug fix - SOCR was checking to make sure 8.3 filename information was available for all versions of Worldox integration.  We now only check if the version is prior to the WDU10 release (which fixed 8.3 related issues in WDAPI)

6.6.45d -

- Modified calls to NDAPI for creating new versions to ensure we don't modify lastModified info

6.6.45c -

- Performance improvement when retrieving number of active users in NetDocuments integration

6.6.45b -

- Add support for unlimited user count licensing for NetDocuments integration

6.6.44 -

- Bug fix - some malformed PDFs (huge page catalogs, large number of pages) could result in out of memory exceptions

20171006

6.6.24 -

- Updated to NDAPI 0.0.53 (upgraded document getSize() methods)
- Modified NetDocuments processing to use the new NDAPI getSizeBytes() method for consistent file size checking
- Moved configuration loading of the preprocessor and processor to the end of the config load, to prevent feature initialization issues


6.6.22

- Made SystemStatusProvider get the NetDocs status from the ND source, not connection manager
- Modified NetdocsDocumentSource to store an ignore warnings flag for the new warning
- Modified NetdocsDocumentSource to be Status aware and provide the overall status for NetDocuments
- Modified the Netdocs web page handler to use the new NetdocsDocumentSource getStatus() call rather than the ND conn manager call
- Fixed a bug in NetDocuments processing that was using different file sizes during file searching, resulting in unnecessary downloading of files for reprocessing


6.6.19 -

- Regression bug - Log output wasn't being written to the maestro.log or error.log files.  Introduced in 6.6.12

6.6.18 -

- Added belts and suspenders to OpenText implementation
- Added belts and suspenders to LSSe64 database methods
- Added dripMode to Processor, allowing for the processor to be stopped after a single document is processed
- When OpenText feature is enabled, set the Processor and PreProcessor to not autostart, and to be in dripMode
- Modified OpenText search for files SQL to use enhanced SQL and never return docs that don't have allowed extensions
- Made the OpenText modified flag value customizable, to allow for easier beta testing
- Prevented actual file changes to OpenText files, until beta testing shows us the correct way to make changes
- Modified OpenText lastModified time for files to use the database value rather than the file value
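
As a rough illustration of dripMode as described above (the processor stops after a single document), a small counter can gate how many documents are allowed through before halting. The class DripGate and its parameter are hypothetical names used only for this sketch; a limit of 1 corresponds to the behavior described in this release.

```java
// Hypothetical sketch of drip mode: allow a fixed number of documents, then halt.
class DripGate {
    private final int maxDrips;
    private int processed = 0;

    DripGate(int maxDrips) {
        if (maxDrips < 1) {
            throw new IllegalArgumentException("maxDrips must be at least 1");
        }
        this.maxDrips = maxDrips;
    }

    // Returns true if another document may be processed, and counts it.
    synchronized boolean tryAcquire() {
        if (processed >= maxDrips) {
            return false;
        }
        processed++;
        return true;
    }

    // True once the limit has been reached and the processor should stop.
    synchronized boolean isHalted() {
        return processed >= maxDrips;
    }
}
```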


6.6.16 -

- Regression Bug fix - introduced in 6.6.10 - installer creating 'work' folder in the installer exe folder

6.6.13 -

- Fixed null pointer exception in PDFFileAnalyzer
- Fixed null pointer exception in PurgeUnavailableTask

6.6.12 -

- Updated to NetDocumentsAPI v0.0.51 to support document content change without changing modification info for file extension changed (tiff->pdf)


6.6.11 -

- Updated to NetDocumentsAPI v0.0.50 to support document content change without changing modification info


6.6.7 -

- Added a description to the SymphonyOCR Windows service

6.6.5 -

- White labeling is currently only available for SOCRCLOUD licenses. If enabled, the "D" Worldox feature must be disabled to prevent conflicts
- White labeling reports the number of used Worldox seats in the heartbeat, but the license does not rely on seats for validation
- Added a WhiteLabelingFeature, which extends WorldoxFeature, ignores seat checking for validation, but checks the user domain for a valid cloud domain.
- Added a WHITELABEL user count strategy to WorldoxFeature
 

6.6.3 - DevRelease

- The license code for OpenText is "O", and licensing is based on active people in the system (seats)

20170707

6.6.2 -

- Bug fix - some encrypted PDF files weren't being flagged as encrypted during analysis (they were failing during OCR and landing in the Needs Attention list)


6.5.71 -

- Better handling for quasi-invalid PDF files (PDFs that have null AcroForms objects now are handled cleanly) - ClassCastException after processing


6.5.70

- Added a new global "alwaysAnalyzeAndProcessNoImageNoTextPages" setting to the config file, under a new "heuristicComputerProvider" setting.
- The new setting will default to false (existing behavior) but can be manually set in the config file.
- The new setting will provide a way to allow no image no text pages to be forcibly processed across the board, instead of manually per document.

20170614


6.5.69

No Changes
 
6.5.68 -

- Fixed UI bugs - Folder configuration 'Add' button wasn't rendering properly.  Some buttons didn't display 'Hand' cursor to indicate they are clickable.

 
6.5.66 -

- Modified the buttons on the NetDocuments approval page to use the same styles as the other pages


6.5.65 -

- Removed the reprocessing of TOO_OLD docs on startup

6.5.64 -

- Modified Analyzer installer to elevate level to admin
- Created a new "createInstaller.cmd" script that creates the Analyzer installer and digitally signs it
- "createInstaller.cmd" relies on two new settings in deployment.properties
    deployment.resources.signature.location=[signature location]
    deployment.resources.signature.secret=[signature password]

6.5.63 -

- Modified the NetdocsFinder to set documents in the Too Old folder to REPROCESS, when the Process legacy documents setting
    is enabled.
- Added TOO_OLD as a new value in DocumentAccessibility enum
- Reprocess TOO_OLD docs on startup   
 
6.5.62 -

- Added styling for disabled buttons
- Fixed issues with paging buttons on Document List page

6.5.61 -

- Removed legacy and deprecated css links, which were overriding our settings
- Added new reset.css to undo button settings set by the browser
- Modified the button style to force a hand pointer when the mouse is over the button
- Removed [] from the NetDocs Log In button
- Returned the Learn More links on the Document List pages to be hyperlinks rather than buttons

6.5.60 -

- Fixed more buttons that missed the style upgrade.
- Modified button styles again to make shadows less invasive

6.5.59 -

- Modified style for Scheduler config Delete buttons to remove the border

20170310

6.5.56 -

- Modified the Analyzer installer script to create a file with "c" version, and to point to the SOCR Pre-release installer

   
6.5.54 -

- Added ability to override the host name used when displaying the user interface.  Settings.xml, webServerConfiguration, hostname (this is not added to the config file by default - you'll have to set it explicitly)

6.5.53 -

- The email of the logged in NetDocuments user now appears after the login name on the NetDocuments configuration screen


6.5.51 -

- Modified Analyzer configuration page to allow for the setting of MSG (email) attachment processing. This setting will sync with the
    same setting on the Processor configuration page, so that both are either on or off.

6.5.50 - dev release

- Bug fix - Symphony OCR hangs when processing some large, complex PDF files.  java.lang.OutOfMemoryError: Java heap space message appears in logs


6.5.49 - dev release

- Added the Windows user name and fully qualified hostname to heartbeats

   
6.5.48 - dev release

- Fixed potential issue with supporting MacRoman character sets (used by PDFs generated on Mac computers) under newer versions of Java


6.5.43

- Updated verbiage of SharePoint configuration page, and added ability to set frequency for a legacy file finder


6.5.40

- Added SharePoint URL to the SharePoint configuration screen

6.5.37

- Updated SOCR to work with NetDocumentsAPI v0.0.41
- An audit entry will be added to NetDocuments documents when OCR is completed, rolled back, etc.
- The audit entry for Worldox documents when OCR is completed was modified to be more descriptive. Audit entries were added when
    documents are also rolled back, etc.

6.5.36 -

- Bug fix - [View Timeline] links in Simple View allowed users access to the non-simple UI


6.5.33 -

- Regression bug fix - introduced 6.5.31 - Installation on machines without Java 8 resulted in 'UnsupportedClassVersionError' popup dialog on launch

6.5.32 -

- Bug fix - OCR and Analyzer working directories had garbage files left behind if SOCR was shut down in the middle of OCR or analysis
- Adjusted default settings for determining maximum page size that will be processed.  This is now specified in total pixels (instead of maxHeightPixels and maxWidthPixels).  The new setting is maxPixels, the default is 36000000, which is a little larger than a C sized sheet of paper.  For backwards compatibility, if the existing configuration has maxHeightPixels and maxWidthPixels set to 10000 or 12000, the default is used, otherwise maxPixels is set to maxHeightPixels x maxWidthPixels
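
The backwards-compatibility rule above can be sketched as follows. The class and method names are hypothetical, and the sketch assumes that "set to 10000 or 12000" means both legacy values sit at one of the two old defaults; the real migration code may interpret this differently.

```java
// Hypothetical sketch of the maxHeightPixels/maxWidthPixels -> maxPixels migration.
class MaxPixelsMigration {
    static final long DEFAULT_MAX_PIXELS = 36_000_000L;

    // True if the value is one of the old shipped defaults.
    private static boolean isOldDefault(int value) {
        return value == 10_000 || value == 12_000;
    }

    // Returns the maxPixels value to use for an existing configuration.
    static long migrate(int maxHeightPixels, int maxWidthPixels) {
        if (isOldDefault(maxHeightPixels) && isOldDefault(maxWidthPixels)) {
            return DEFAULT_MAX_PIXELS; // untouched legacy defaults: use the new default
        }
        // Customized legacy values: preserve the same total area.
        return (long) maxHeightPixels * maxWidthPixels;
    }
}
```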


20170217

6.5.30

- Regression bug fix, introduced in 6.5.22 - OCR engine fails to run on Windows XP workstations - error message refers to ADVAPI32.dll procedure entry point RegSetKeyValueA
* Move to SymphonyOCRProcess.exe 6.5.0.30

6.5.28

- Bug fix - out of memory errors when analyzing PDFs that contain an excessive number of embedded fonts
* Move to 5.5.11-SNAPSHOT.jar (disables font caching in PdfContentStreamProcessor if the file has more than 10 fonts)

6.5.25 -

- Bug fix - ShareFile integration was only searching 'Shared Folders' (not 'My Files & Folders' or 'Favorite Folders')

6.5.22

- Added support for MICR processing (magnetic ink characters typically found on bank checks).  When enabled, only a single processor core will be used for OCR, so this should only be enabled for sites that truly need MICR processing.

- If MICR processing is enabled, that will be indicated in the Advanced Settings section of the Processor Config screen

6.5.21 -

- Improved task status message when manually initiating Compact Database command in Debug screen (it used to say 'maintenance completed' until the maintenance got underway)

20161014

6.5.20 -

- Refinement to 6.5.19 bug fix - if a mutation failed during processing, the document was being put on Needs Attention.  It now gets put onto Reprocess.

6.5.19 -

- Bug fix - If processing was actively working on a document while nightly database maintenance happened, database corruption (or forced shutdown of SOCR) would occur


6.5.17 -

- Bug fix - files marked as read-only weren't being processed, even though 'Allow processing of read-only files' was enabled


6.5.16 -

- When rolling back, we now reset the previous analysis results
- Additional handling for invalid PDFs that have page rotations other than 0, 90, 180 and 270.  SOCR now handles these pages just like Acrobat (which means that it ignores the angle specification entirely)
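
A minimal sketch of the rotation rule above, assuming that "ignoring the angle specification" means treating the page as unrotated. The class and method names are hypothetical and only illustrate the idea.

```java
// Hypothetical sketch: accept only the four valid page rotations.
class PageRotation {
    // Returns the rotation to apply for a declared page rotation in degrees.
    static int effectiveRotation(int declaredDegrees) {
        switch (declaredDegrees) {
            case 0:
            case 90:
            case 180:
            case 270:
                return declaredDegrees;
            default:
                return 0; // invalid angle: ignore it and treat the page as unrotated
        }
    }
}
```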


20160909

6.5.13 -

- No change

6.5.12 -

- If SOCR encounters an out of memory error, it now kills the application (want to prevent database corruption in the event of a memory problem)

6.5.11 -

- Corrupted file detection enhancement - some rare corrupted PDFs caused documents to appear on the Needs Attention list instead of Corrupted
- Bug fix - post-OCR PDF failures could cause the working directory to become locked - after that, all future processing would wind up in the Reprocessing list

6.5.10 -
 
- Adjustment to NDApi calls to specify NDVaultLocation
- Bug fix 'Engine not initialized — Cannot run program' and 'CreateProcess error=19' and '%1 is not a valid Win32 application' errors on Windows XP machines


6.5.9 -

- Folder document source now adjusts the modified time of the file by 1 minute (this is to make sure that indexing services will see that the file has changed)
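
For illustration, adjusting a file's modified time by one minute can be done with the standard java.nio.file API. This sketch assumes the time is moved forward (the entry above does not state the direction), and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.time.Duration;

// Hypothetical sketch: nudge a file's modified time so indexers notice the change.
class ModifiedTimeBump {
    // Moves the last-modified time forward by one minute and returns the new value.
    static FileTime bumpByOneMinute(Path file) {
        try {
            FileTime current = Files.getLastModifiedTime(file);
            FileTime bumped = FileTime.from(current.toInstant().plus(Duration.ofMinutes(1)));
            Files.setLastModifiedTime(file, bumped);
            return bumped;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```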

6.5.8 -

- Regression bug fix - 'null' error when processing partial documents (only some pages in the PDF needed to be OCRed)


6.5.7 -

- Adjusted Processor error handling behavior - we now pause processing (or analysis) if we encounter more than 10 errors in a 15 minute window (prior behavior was 5 errors in a 60 minute window)
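
The "more than 10 errors in a 15 minute window" rule can be sketched as a sliding window over error timestamps. The class ErrorWindow is a hypothetical name for this illustration and is not the actual Processor error handling code.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of a sliding-window error threshold.
class ErrorWindow {
    private final int maxErrors;
    private final long windowMillis;
    private final Deque<Long> timestamps = new ArrayDeque<>();

    ErrorWindow(int maxErrors, long windowMillis) {
        this.maxErrors = maxErrors;
        this.windowMillis = windowMillis;
    }

    // Records an error at the given time; returns true if processing should pause.
    synchronized boolean recordError(long nowMillis) {
        timestamps.addLast(nowMillis);
        // Drop errors that have fallen out of the window.
        while (!timestamps.isEmpty() && nowMillis - timestamps.peekFirst() > windowMillis) {
            timestamps.removeFirst();
        }
        return timestamps.size() > maxErrors;
    }
}
```

Under these assumptions, `new ErrorWindow(10, 15 * 60 * 1000L)` matches the new behavior and `new ErrorWindow(5, 60 * 60 * 1000L)` matches the prior one.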

6.5.6 -

- Regression bug fix - "Execution of parallel task failed: Not enough memory! Failed - 0x80004005" error when processing documents that have low image quality.


6.5.5 -

- Bug fix - If Processor was restarted at exactly the wrong time, Processor would display error message "Unable to add pages to page tracker - null. Restart Symphony to clear this error."


6.5.4 -

- Bug fix - PDFs that were missing MediaBox field on a page definition resulted in the file being marked as corrupted.  Technically, these are not valid PDFs, but we can still analyze them by assuming a default page size of 8.5x11"
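
A sketch of the fallback described above: when a page has no MediaBox, assume 8.5x11 inches, which is 612 x 792 points at 72 points per inch. The class and method names are hypothetical; the MediaBox array is assumed to hold the usual four values (lower-left x, lower-left y, upper-right x, upper-right y).

```java
// Hypothetical sketch: fall back to US Letter when a page has no MediaBox.
class PageSizeFallback {
    static final double POINTS_PER_INCH = 72.0;

    // Returns {width, height} in points, using 8.5x11 inches if the box is missing.
    static double[] sizeOrDefault(double[] mediaBox) {
        if (mediaBox != null && mediaBox.length == 4) {
            return new double[] { mediaBox[2] - mediaBox[0], mediaBox[3] - mediaBox[1] };
        }
        return new double[] { 8.5 * POINTS_PER_INCH, 11.0 * POINTS_PER_INCH };
    }
}
```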

6.5.3 -

- Bug fix - foreign characters didn't display properly in the SOCR web interface
- Bug fix - some files with foreign characters in filename would fail to OCR and appear in the Needs Attention list with error messages about the file not existing (and the file name shows many question mark symbols)
- Bug fix - files that a user had opened in Folder and Worldox document sources were placed on Inaccessible list, and didn't reprocess until the next day.  Now these files will be placed on the Reprocessing list, and will be processed when they are found again (assuming the user has closed the document by then)

6.5.2 -

- Improved error message if Outlook isn't available to indicate that 32 bit Outlook is required
- Added note about 32 bit Outlook to the Processor config screen

6.5.1 -

- No change

6.4.125 -

- Added support for non-western characters in OCR results

20160707

6.4.124 -

- Bug fix - Some documents wind up in the Reprocessing list repeatedly with "Unable to read for analysis - null" in the history. Null Pointer Exception when scanning some PDFs for digital signatures.

6.4.123 -

- Bug fix - MSG attachments for MSG files in deep folder paths resulted in the document being continuously placed on the Reprocess list

6.4.122 -

- Re-enable flush of content pages every 100 pages (this was originally enabled in 6.4.23, but inadvertently disabled in 6.4.30)

6.4.121 -

- Certain "page too big" errors were resulting in document being placed in No Image/No Text list instead of the Too Big list

6.4.115 -

- Added support for additional languages (language dictionary files are not being deployed yet, so that needs to be done before we really use this - currently only English, Spanish, Brazil dictionaries are part of the Engine installation - more can be added as needed)
Specifying languages is currently done in settings.xml in the <ocrHandlerProvider languages=""> element. Values should be comma separated, with no spaces. Default is "English".

6.4.110 -

- Add support for custom NetDocuments OAuth connection parameters (this will allow the firm to request that NetDocuments preserve modified by and modified on values during OCR)

6.4.104 -

- Bug fix - ND documents that were on legal hold, archived, signed or approved were repeatedly processed. They will now be placed in the appropriate Not Processed queue.

6.4.103

- Bug fix - detection of non-8.3 paths in Worldox document repositories wasn't working properly on certain NTFS volumes

6.4.102

- Improve error handling and reporting in PracticeMaster integration

6.4.100 - 

- Bug fix - Memory leak in NetDocuments integration (eventually resulting in OutOfMemoryException errors and a heap dump in the logs folder)
- This bug would have also impacted ShareFile integration, so that has been fixed as well

20160201

6.4.99 -

- Added ability to process digitally signed documents (note that this *will* invalidate the digital signature) - this is enabled by adding allowDigitallySigned="true" to the documentPreProcessor element in settings.xml

6.4.97 -

- If manual license update check fails, we now display the error message (before, it was showing an exception trace)

6.4.96 -

- Bug fix - grace period wasn't working properly
- Added new scheduler entry for auto-checking for license updates from Trumpet license server. These updates are auto-scheduled for a random time on a weekday. The update check won't actually happen unless at least 3 days have passed since the last update check, or if the license status is in a warning or error state (in which case it'll check once per day at the scheduled time)
- Bug fix - ND and SF finders weren't starting up on brand new sites (had to stop and restart SOCR after entering the license number)
- Updated help hyperlinks for Not Processed lists
- Bug fix - heartbeats were including pages left, even for unlimited page licenses
- Notification configuration now says "When there are warnings or errors" instead of "When there are warnings"
- Added Check for Updated License button on Licensing page
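
The decision rule for the scheduled license update check described above can be sketched like this. The class, enum, and method names are hypothetical; only the 3-day interval and the warning/error exception come from the release note.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical sketch of the "should the scheduled license check actually run?" rule.
class LicenseCheckPolicy {
    enum LicenseStatus { OK, WARNING, ERROR }

    static final Duration MIN_INTERVAL = Duration.ofDays(3);

    // Called at the scheduled time; returns true if the license server should be contacted.
    static boolean shouldCheck(Instant lastCheck, Instant now, LicenseStatus status) {
        if (status != LicenseStatus.OK) {
            return true; // warning or error state: check on every scheduled run
        }
        return Duration.between(lastCheck, now).compareTo(MIN_INTERVAL) >= 0;
    }
}
```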

20151214

6.4.94 -

- Bug fix - Analyzer and Processor failed to start - error message about 'The application has failed to start because its side-by-side configuration is incorrect'

6.4.92 -

- Better handling if the processor or preprocessor throws an uncaught error - we now shut the processor down and report an error

6.4.91 -

- Bug fix - if backupFileRoot was pointing at an old SOCR installation directory (e.g. SOCR folders in C:\Program Files\ instead of C:\Program Files (x86)), backups and processing would fail. We now detect this situation and change the setting to the relative .\work\backupfiles value.

6.4.90 - dev release

- Adding debug lines to troubleshoot failed backup copy

6.4.89 -

- Bug fix - NetDocuments support for EU data centers wasn't allowing login to the EU site
- Added support for AU data center (still have to manually adjust in the settings.xml file)

6.4.88 -

- Bug fix - NetDocuments support for EU data centers wasn't directing to EU login page

6.4.87 -

- Database compaction algorithm will now purge any document records that were damaged by small database corruptions

6.4.86 -

- Bug fix - on machines running on unreliable networks, network hiccups in the middle of analysis could cause SOCR to crash (error message EXCEPTION_IN_PAGE_ERROR (0xc0000006) )
- Bug fix - backup results were being written to the <install directory>\work\backupfiles folder instead of the app working folder override
- Made GeneratePerformanceSummary task do nothing (it's really not needed anymore), removed GENERATE_PERFORMANCESUMMARY_TOPROCESS and GENERATE_PERFORMANCESUMMARY_PROCESSED from any existing scheduler configuration
- Change default schedule time for "Re-analyze Re-Process lists" to be 11:30pm every day, instead of 12:00 every day
- Change default schedule time for "Purge backups" task to be 4:00am every day, instead of 12:00 every day

6.4.85 -

- Added Processing Time (ms) to CSV export of document list

6.4.84 -

- Bug fix (Case 35047) - When starting or restarting SOCR with LSSe64, the LSSe64 finder would often fail to start. The LSSe64 finder now starts reliably.
- Bug fix (Case 35024) - SOCR was processing documents with extensions that weren't PDF, TIF or MSG and adding these to the Corrupted list. Such documents are now ignored.

6.4.83 -

- Bug fix - PDFs with a small number of pages but really big image content (i.e. high-resolution photographs embedded in the PDF) could cause OCR to fail with an out of memory error
- Bug fix - When entering text into the LSSe64 DB password field, it was not masked with ****. This is fixed by using a HTML "password" input type rather than "text".
- Bug fix - SOCR for LSSe64 was attempting to process non-image documents (e.g. .DOCX files) and this led to them appearing in the corrupted documents list. SOCR for LSSe64 has been modified to now only process documents with a PDF, TIFF or MSG extension.

6.4.82 -

- Adding support for NetDocuments EU data center (just added to settings.xml at this point - not in UI yet)

6.4.81 -

- Added ability to filter found documents by explicit dates (this is done by editing the settings.xml file, finderHandler section, cutoffTimeHigh and cutoffTimeLow values)

6.4.80 -

- We now send heartbeat when the user changes their license

6.4.79 -

- Analyzer now ignores stroke and fill color operators in PDF (rg and RG) (we were seeing some PDF files where these operators weren't being properly used, but that failure isn't going to prevent us from determining whether the file is safe to process, so we'll just ignore those types of errors)

6.4.78 -

- Document processor is now stopped and started during backup purges

6.4.77 -

- Doc source type for Practice Master was wrong because it inherited the folder finder task's characteristics. Added a simple override to address this.
- The check for whether a filename is within a folder pathname was incorrect; the logic needed to be reversed.

6.4.76 -

- Better error handling for NetDocuments API timeouts

6.4.75 -

- Better logging on 0xfffffffd msgedit errors

6.4.74 -

- Replaced 'Advanced' side menu with 'Debug' link in upper right corner of Welcome screen only
- Adjust build script so it reads from ${user.home}/.m2/deployment.properties instead of properties being defined explicitly in settings.xml

6.4.73 -

- Better error message for Outlook installation issues (msgedit error 0xfffffffd)

6.4.72 -

- Cosmetic fixes throughout (PracticeMaster instead of Practice Master, Advanced heading in PM config screen appeared twice)

6.4.71 -

- Added whether we are running as a service or not ('Service: true') to heartbeat status

6.4.70 -

- Refactored Practice Master code to reduce complexity.
- Improved validation and error reporting when user makes erroneous configuration changes to Practice Master.
- Simplified the generation and processing of the HTML delivered to user's browser for Practice Master.

6.4.69 -

- Improved error handling if a database commit fails (database is marked as corrupted and is stopped so further damage can't occur)

6.4.68 -

- Added Advanced Processor configuration setting to control the maximum number of cores that will be used during OCR. If left empty, we will use all available cores (up to 4). Right now, the setting must be empty or a number between 1 and 4.
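
How such a setting might be resolved into an actual core count is sketched below. The class and method names are hypothetical; the sketch only encodes what the entry states (empty means all available cores up to 4, otherwise a number between 1 and 4).

```java
// Hypothetical sketch: resolve the "maximum cores" setting into a core count.
class OcrCoreSetting {
    static final int MAX_CORES = 4;

    // setting: the raw text from the config screen; may be null or empty.
    static int resolve(String setting, int availableCores) {
        if (setting == null || setting.trim().isEmpty()) {
            // Empty: use every available core, capped at 4.
            return Math.max(1, Math.min(availableCores, MAX_CORES));
        }
        int requested = Integer.parseInt(setting.trim()); // NumberFormatException if not numeric
        if (requested < 1 || requested > MAX_CORES) {
            throw new IllegalArgumentException("cores must be between 1 and " + MAX_CORES);
        }
        return requested;
    }
}
```

On a typical JVM the available core count could be supplied from `Runtime.getRuntime().availableProcessors()`.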

6.4.67 -

- Switch resource server to http://resources.trumpetinc.com

6.4.66 -

- Bug fix - OutOfMemory errors when processing really large NetDocuments documents
- Move to NetDocsAPI-0.0.14.jar

6.4.65 -

- Switch heartbeat server to http://heartbeat.trumpetinc.com/heartbeat/sendheartbeat.jsp

6.4.64 -

- Added Worldox validation message for version WDAPI.20150624.1852 indicating that version of WD has a bug when processing legacy documents

6.4.63 -

- Improvement to orientation correction code to avoid analyzer lockups when hitting huge pages

6.4.62 -

- Bug - divide by zero exception in Processor configuration screen when all pages of a trial license are consumed

6.4.61 -

- Moved to sswebservices-0.0.3.jar (logging enhancement)

6.4.60 -

- Bug fix - deleting a document record from the Document Detail view resulted in ClassCastException instead of returning the user to the Welcome screen

6.4.59 -

- Improvements to Active User count strategy (we will now fail if the WD license isn't valid or if the version of WD doesn't support active user count determination)

20150923

6.4.58 -

- Bug fix: Some sites running Server 2008 R2 in rare configurations (SMB1 with loopback mapped drives) would kernel fault with a Blue Screen of Death (operating system bug triggered by behavior in one of SOCR's analysis modules). This is now fixed.

6.4.49 -

- Added specific error message if a Worldox document couldn't be processed because of missing 8.3 filename information

6.4.48 -

- Added a message to the end of the maintenance screen indicating that maintenance is complete

6.4.44 -

- New feature: Users can now roll back to the un-ocred version of the document as long as the document hasn't been modified since it was OCRed (this is available even if the short term retained versions have been purged) 

- Enhancement when processing huge files (thousands of pages) - disk space usage is now limited to around 12 GB during processing (prior, it could expand indefinitely - approximately 12MB per page)

6.4.43 -

- Bumped the maxWidthPixels and maxHeightPixels values from 10,000 to 12,000 - there were a lot of engineering drawings that were just barely above the 10K limit (i.e. 10804x7212)

6.4.42 -

- Minor bug fix - LSSe64 integration was trying to connect to the SQL database, even if the license wasn't activated for LSSe64 (move to lazy loading of connection)

6.4.41 -

- Changed logging when profile group isn't available via WDAPI to be debug logging (logs were getting flooded when a PG was removed from processing)

6.4.40 -

- Added debug lines troubleshooting FOLDER document source issue
- REGRESSION - Bug fix - all documents at Folder Tree sites were being sent to Unavailable list.

6.4.38 -

- Bug fix - Null Pointer Exception in some corner cases when saving changes to NetDocuments integration configuration

6.4.37 -

- Added support for Practice Master DMS which uses a simple file system folder based document storage strategy.
- Added support for validating SOCR max licensed users against Practice Master's own configuration file's max licensed users.
- Small number of internal code refactorings with no functional visibility.

6.4.36 -

- Bug fix - clicking on Open links for Worldox documents resulted in an empty tab opening in the web browser

6.4.35 -

- Ensure we validate the SOCR licensed user count against the LSSE64 licensed user count.

6.4.34 -

- Fix bug in which null was returned for a document's priority level.

6.4.33 -

- Increase lengths of input fields for DB credentials and restart the LSSe64 finder when we change credentials.

6.4.32 -

- Disable a test and adjust the logic of another test in a suspect area, add native paths, remove an old import, and bump the overall version number.

6.4.31 -

- Added support for flagging certain versions of Worldox as having problems (see WorldoxConnection#getStatusResult() )

20150721

6.4.30 -

- Changed installer so Run as a Service message indicates that it won't work with Worldox sites

6.4.29 -

- NetDocuments compatibility fix - sites that didn't have legacy processing enabled weren't finding documents to process since the most recent ND update

6.4.28 -

- Bug fix - introduced in 6.4.11 - Files with mismatched extensions (e.g. a PDF with a TIF extension) wound up in an infinite 'analyzing' loop
- add ability to specify location of dev mode abbyy home
- adjusted how development mode path determination is made (com.trumpetinc.development.abbyybinfolder system property)

6.4.26 -

- Bug fix - the backupFileRoot was being stored as an absolute path instead of a relative path. End result is that copying configuration from a 32 bit machine to a 64 bit machine resulted in the default backup location incorrectly pointing at C:\Program Files\ instead of C:\Program Files (x86)\

6.4.25 -

- Added a setting to settings.xml to set a maximum page count limit on what documents will be processed. Documents with more than the limit of pages will be put onto the Too Big list. Setting is not configurable through the UI - you must edit settings.xml directly and add the following to the existing <documentProcessor ..... /> element: maximumPageCount="300" (or whatever page count limit you wish to set)

6.4.24 -

- Bug fix - if MSG handling was enabled on a machine without Outlook installed, it was not possible to disable MSG processing (though it looked like it was disabled in the Processor config screen). End result was a permanent warning message about the MSG sub-system not working

6.4.23 -

- Accuracy improvements in OCR engine
- Bug fix - out of memory errors when processing really big PDF files (thousands of pages)

6.4.22 -

- Improve OCR accuracy in some corner case scenarios

6.4.21 -

- Increased accuracy of OCR engine by making it slightly slower

6.4.20 -

- Added better error message if NetDocuments login doesn't have permissions to query user count
- Add note to NetDocuments Login button indicating that the user must be a NetDocuments Admin

6.4.19 -

- Added caching to progress graph display - this writes the graph data to disk every 5 minutes, and reads that data on launch. This should allow us to display the graph right away, without there being a refresh period (which could be quite long at sites with lots of documents)

6.4.17 - Dev release

- Experimental functionality for emitting OCRed PDF page content incrementally - should fix problems with running out of memory when processing really big files
- Right now, this flushes every single page - before moving to pre-release we probably should change that so it flushes every 50 or 100 pages

6.4.14 -

- Bug fix - issues when reading MSG files could result in user interface displaying RuntimeException stack trace

6.4.13 -

- Added ability to override maximum heap that Symphony OCR will use. This is done in a launch.ini file that must be stored in the SOCR application directory. The setting is controlled in the [JVM] MaxHeap=512m setting (this is the default - 512 MB heap). It can be increased - so for example, MaxHeap=1024m would increase it to 1GB.
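
To make the units concrete, the following sketch converts a JVM-style size string such as the MaxHeap value into bytes. It is only an illustration of how "512m" and "1024m" relate to 512 MB and 1 GB; the class is hypothetical and is not how the SOCR launcher reads launch.ini.

```java
// Hypothetical sketch: convert a JVM-style size string ("512m", "1g") into bytes.
class HeapSize {
    static long toBytes(String value) {
        String v = value.trim().toLowerCase();
        if (v.isEmpty()) {
            throw new IllegalArgumentException("empty heap size");
        }
        long multiplier;
        switch (v.charAt(v.length() - 1)) {
            case 'k': multiplier = 1024L; break;
            case 'm': multiplier = 1024L * 1024; break;
            case 'g': multiplier = 1024L * 1024 * 1024; break;
            default: return Long.parseLong(v); // no suffix: value is already in bytes
        }
        return Long.parseLong(v.substring(0, v.length() - 1)) * multiplier;
    }
}
```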

6.4.12 -

- Fix issue where integration with old Worldox GX2 sites failed with 0xfffffff error

6.4.11 -

- Issue fix - web based DMSes can lose connectivity. When this happens, documents pending analysis and processing wind up being moved to the Unavailable list. When they become available again, SOCR was re-analyzing the files. This caused a lot of unnecessary downloading.
- If a document hasn't been changed since last analysis, it will not be re-analyzed unless the user explicitly clicked Re-Analyze in the Document Detail screen or the Document List screen

6.4.9 -

- Suppress error log entry that is logged if graph generation is interrupted (not adding any value - this is normal behavior)

6.4.8 -

- Make SOCR so it only complains about spaces in a profile group's base path if the profile is defined on a mapped network drive (i.e. doesn't start with \\)
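
The check described above can be sketched as a simple predicate: warn about spaces only when the base path is not a UNC path. The class and method names are hypothetical.

```java
// Hypothetical sketch: warn about spaces only for mapped-drive base paths.
class ProfileGroupPathCheck {
    static boolean shouldWarnAboutSpaces(String basePath) {
        boolean isUnc = basePath.startsWith("\\\\"); // UNC paths begin with two backslashes
        return !isUnc && basePath.contains(" ");
    }
}
```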

6.4.7 -

- Added Ignore capability to all documents in the Backlog list (in case users want to ignore documents that haven't been analyzed and/or processed yet)

6.4.6 -

- Removed warning message added in 6.4.5 - it looks like WDAPI won't actually block if the PG is set to read-only

6.4.5 -

- Added warning message if a selected Worldox profile group is marked read only

6.4.4 -

- Add history when file has too many pages to process with the current license

6.4.3 -

- Bug fix - huge PDF files (>10K pages) processed during free trials would cause SOCR Processor to stop. These documents will now be moved to REPROCESS

6.4.2 -

- Make Statistics panel on main page show number of OCR backlog documents as well as pages

Summary 6.3

Here's an overview of the major changes:

  • Unlimited page count processing: For firms that have a per-user annual subscription, you now have the freedom to scan and save all of your documents without worrying about your page count processing limit (unique exceptions may apply).
  • Email notifications: Configure Symphony OCR to send daily status emails to alert your Symphony administrator of errors or problems OCRing documents.
  • Doc ID search: You can now use the document filepath or Doc ID to determine whether a document has been OCRed.
  • Streamlined home page: The Symphony OCR home (summary) page has been redesigned with a cleaner, simpler look.

1/20/2015

6.3.65 -

- Bug fix - in rare situations, SOCR could fail to launch with 'ConcurrentModificationException' stack trace

6.3.63 -

- Bug fix - install as a service always resulted in 0x421 error
- Adjusted label on username field to indicate domain\user
- Bug fix - installer wasn't always remembering the correct installation path

6.3.62 -

- Bug fix - scheduler task wasn't working properly if tasks were scheduled for later the same day
- Bug fix - changes to scheduler configuration weren't taking effect until SOCR was restarted
- Display issue - label on Stop OCR Processor task was missing 'OCR'
- Small tweaks to layout of scheduler interface (moved the activity, time and days fields around so they are more intuitive, changed the order that activities appear in the drop down list so they are in order most likely to be used)

6.3.60 -

- Installer - display appropriate header text in screen that prompts for the user to run as

6.3.59 -

- NetDocuments Create Versions setting was turned off by default - it is now turned on by default
- Fixed documentation link on option to enable legacy processing

6.3.58 -

- Workaround for NetDocuments Invalid Hashable error when trying to get the display path of a document (ND changed their API)

6.3.57 -

- Make display of Ignore, Reprocess, Delete and Adjust Priority so they are consistent between the document lists and document details views

6.3.56 -

- Fix 'communication timeout' errors when sending nightly notifications

6.3.53 -

- Installer bug fix - cloud installs were launching SOCRTray after the installer finished (now it correctly launches SOCR.exe)

6.3.52 - 

- Bug fix - ShareFile integration would say that it wasn't connected when it clearly was

6.3.51 -

- Removed finding of TIF files (changing file extension of an existing document causes duplicate files to be created in ShareFile) - we can bring this back in when we make version creation optional

6.3.50 -

- First iteration of ShareFile integration
- Added ShareFileAPI 0.0.2-SNAPSHOT

6.3.49 -

- Added a log (logs/autorotate.log) to capture the number of pages that were auto-rotated during processing - this is disabled by default - to enable, edit log4j.properties and change the "log4j.logger.autorotatelog=ERROR, AUTOROTATEFILE" line to "log4j.logger.autorotatelog=INFO, AUTOROTATEFILE"
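
For sites that want to apply the auto-rotate logging change on several machines, the edit described above can be scripted. The following is only a sketch: the two property lines are taken from the note above, while the helper names and the idea of passing in the path to log4j.properties are assumptions for illustration.

```python
from pathlib import Path

# Both lines are quoted from the 6.3.49 release note.
OLD_LINE = "log4j.logger.autorotatelog=ERROR, AUTOROTATEFILE"
NEW_LINE = "log4j.logger.autorotatelog=INFO, AUTOROTATEFILE"


def enable_autorotate_log(properties_text: str) -> str:
    """Return the properties text with the auto-rotate logger set to INFO."""
    out = []
    for line in properties_text.splitlines(keepends=True):
        body = line.rstrip("\r\n")
        ending = line[len(body):]  # keep the original line ending untouched
        out.append((NEW_LINE if body.strip() == OLD_LINE else body) + ending)
    return "".join(out)


def enable_in_file(path: Path) -> bool:
    """Rewrite the file in place (hypothetical helper); True if it changed."""
    original = path.read_bytes().decode("latin-1")
    changed = enable_autorotate_log(original)
    if changed != original:
        path.write_bytes(changed.encode("latin-1"))
        return True
    return False
```

Only the one logger line is touched; every other line, including its line ending, is written back unchanged.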

6.3.47 -

- NetDocuments configuration screen had two Basic Settings sections - these have been merged

6.3.46 -

- NetDocuments configuration screen now has an option to "Look for legacy documents". Disabled by default. When enabled, the ND integration will find all documents. Otherwise only documents modified in the past 7 days will be included in the Find phase.
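
As a rough illustration of the rule above, the Find phase can be thought of as a date filter that the legacy option switches off. This is a sketch under assumptions: the function and parameter names are hypothetical, and treating a document modified exactly 7 days ago as still included is a guess.

```python
from datetime import datetime, timedelta

RECENT_WINDOW = timedelta(days=7)  # "modified in the past 7 days" per the note


def included_in_find_phase(modified_at: datetime, now: datetime,
                           look_for_legacy_documents: bool = False) -> bool:
    """Return True if a document would be picked up by the Find phase."""
    if look_for_legacy_documents:
        return True  # option enabled: all documents are found
    return now - modified_at <= RECENT_WINDOW
```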

6.3.45 -

- Notification emails now include the name of the SOCR machine in the subject

6.3.44 -

- SOCR now tracks the original file modified date of each document it finds. This is the date used in reporting backlog metrics. End result is that if the DMS forces the modified date to change during processing, the backlog progress graphs will still display the number of pages added over time properly

6.3.43 -

- Changed NetDocuments 'Create versions' option to default to 'true'

6.3.42 -

- SOCR uninstaller will now shut down existing running instances of SOCR (both running as user AND running as service)
- SOCR will no longer give the "run as user/run as service" dialog for cloud installs (default will be "run as user")
- Startup shortcut wasn't being removed during uninstall

6.3.41 -

- Adding support for running SOCR as a windows service
- When running as a windows service, it is not possible to shut down SOCR from inside the user interface - shutdown must be performed using Windows Services
- Added SymphonyOCRTray.exe - this is an applet that runs and puts the SOCR icon in the system tray when SOCR is running as a service

6.3.40 -

- Bug fix - HTML in Needs Attention document lists could be misrendered if the reason contained <<snip>>

6.3.38 -

- Special handling for ND files that were emailed directly into ND (ND forces us to create a version of these types of files)

6.3.37 -

- Installer wasn't adjusting the modified date on the sample images
- Improvements in error/warnings condition reporting during typical ND initial configuration use-case

6.3.36 -

- Ignore button missing in Details screen for Too Big documents
- Add knowledgebook hyperlink for ND config screen
- Added 'Processor > Basic Settings > Automatically rotate pages to proper orientation' option. Enabled by default. If turned off, SOCR will not adjust the page orientation in the output PDF.

6.3.35 -

- Display the repository name along with the cabinet name in the NetDocuments configuration screen
- Tweak system status display so we can click into the license screen if there are licensing issues

6.3.34 -

- Bug fix - if NetDocs document meta data wasn't available for computing workspace path, an IllegalStateException was being thrown

6.3.33 - 

- Bug fix - nullpointerexception if unable to get path from workspace information for ND document
- Enhancement - Changed Lookup By Path to 'Lookup Document'. Users can now type in the doc ID of the document (as it appears in the source document management system). SOCR will query the DMS to locate the actual path of the document and display the details.

6.3.32 -

- Bug fix - ND integration wasn't searching for MSG files

6.3.31 -

- Cloud installer now auto-detects that we are on a WD Cloud terminal server and sets the default install location to "<path to CID Folder>\blah"

6.3.30 -

- Bug fix - files in FOLDER finder were still being processed, even if the folder was marked as inactive
- Added Create Versions option to NetDocuments integration
- MSG files from NetDocuments will now always create a version if the current file only has a single version (this is a special requirement from ND)
- If a file in ND is detected with an incorrect extension, a new version will be created with the correct extension

6.3.29 -

- Installer is now 'cloud aware' (safe to install into the WD Cloud environment)

6.3.28 -

- Summary: Don't display 'Current OCR throughput' data unless OCR has actually happened
- Bug fix - SSCLOUD and SOCRCLOUD licenses weren't working

6.3.27 -

- Added Powered By Abbyy and Trumpet logos to maintenance screen
- Removed msg files from NetDocs finder - NetDocs doesn't do full text indexing of MSG files, so there's no point in processing them

6.3.26 -

- Added spacing between table cells in document list display (prevent path and page number from being too close together)
- Improved error message if ND workspace configuration isn't set up properly
- We now track when the ND refresh token is valid through (1 year) and display a warning to the user 15 days prior to them needing to manually re-authenticate the SOCR -> ND connection
- We now track when the ND authentication token is valid through (24 hours, or 45 minutes of inactivity, whichever comes first) and auto-reset the connection instead of waiting for it to fail on an actual call
- Changed icon to indicate OCR (differentiate between S-Pro Workstation sys tray icon)
- Changed the UI for setting frequency for backlog searches so it works in days instead of hours
- Changed default search frequency for backlog searches to be 7 days
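
The two NetDocuments token lifetimes mentioned above (24 hours or 45 minutes of inactivity for the authentication token, 1 year with a 15 day warning for the refresh token) can be sketched as simple time checks. The function names are hypothetical, and treating "1 year" as 365 days is an assumption.

```python
from datetime import datetime, timedelta

AUTH_MAX_AGE = timedelta(hours=24)         # from the note above
AUTH_MAX_IDLE = timedelta(minutes=45)      # from the note above
REFRESH_LIFETIME = timedelta(days=365)     # "1 year", assumed to mean 365 days
REFRESH_WARNING_WINDOW = timedelta(days=15)


def auth_token_expired(issued_at: datetime, last_used_at: datetime,
                       now: datetime) -> bool:
    """True once either limit is reached, whichever comes first."""
    return (now - issued_at >= AUTH_MAX_AGE
            or now - last_used_at >= AUTH_MAX_IDLE)


def refresh_warning_due(refresh_issued_at: datetime, now: datetime) -> bool:
    """True during the 15 days before the refresh token runs out."""
    expires_at = refresh_issued_at + REFRESH_LIFETIME
    return now >= expires_at - REFRESH_WARNING_WINDOW
```

Checking expiry before each call is what allows the connection to be reset proactively instead of waiting for a failed request.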

6.3.25 -

- NetDocuments configuration screen now hides the settings unless the connection to ND is established
- If the connection to ND is not established a button appears for the user to explicitly connect Symphony to ND


6.3.24 -

- Bug fix - NetDocuments Finder wasn't auto-starting after connecting to ND
- Bug fix - heartbeats were being initialized and sent before everything was configured - this caused the 000000 Worldox user to be used for the first several minutes of the application running, even if a different user was configured
- Worldox 'open' links will now use wdox:// hyperlinks instead of generating wdl files if WD is newer than 8/15/2014

6.3.23 -

- Added Open link for NetDocuments document records

6.3.22 -

- Added support for NetDocuments DMS
- Added Progress Details screen (detail hyperlink next to system summary progress bar on Welcome page)

6.3.19 -

- Added display path in addition to the canonical path for each document. Filtering from the document lists will be performed against the display path. If the display path is different from the canonical path, the canonical path will be displayed as an additional attribute in the Detail screen of the document record
- Made email attachments so their display path is the name of the attachment instead of the awkward 0000000001 number

6.3.18 -

- Changed sample scan and msg files to be more fun (drink recipes)
- Bug fix - problems with MSG handling support weren't being detected immediately at launch (they only appeared after several minutes)

6.3.17 -

- Notification warning message will now display the full message in the Welcome screen instead of just 'Notifications have problems'

6.3.16 -

- Tweaked wording on No Image/No Text document list 'What's this' description

6.3.15 -

- Added ability for user to force processing of No Image/No Text documents. This can be initiated using the 'Enable Processing' button on the Document Detail screen, or using the 'Enable Processing' bulk action button on the Document List screen. These buttons only appear for documents that are in the No Image/No Text list.

6.3.14 -

- Added e-mail notifications (see new Notifications screen)
- Added overall progress bar to Summary (welcome) screen
- Removed the License Info section of the Summary screen for unlimited page count sites
- Added 'pages processed in past year' to the Statistics area of the Summary page
- Bug fix - SOCR was loading its configuration twice during launch
- On startup, SOCR will now kill any lingering WBAPI.EXE instances that were started by other instances of SOCR
- Add Simple View links to Summary (welcome) screen - this will display a view of the summary page that doesn't contain any links to other areas of SOCR
- Changed the 'Basic View' link on the Search Summary screens to be 'Simple View', and moved it to the upper right corner of the Search Summary page

6.3.13 -

- Added a 10 day grace period if the Worldox user count goes above the licensed Symphony user count. During this grace period, the license issue will display as an Error, but processing will continue. After the 10 days, the issue displays as an Error and processing will stop.
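
A minimal sketch of the grace period rule above, assuming the count starts on the day the overage is first detected and that day 10 is still inside the grace period. All names here are hypothetical.

```python
from datetime import date, timedelta

GRACE_PERIOD = timedelta(days=10)  # from the note above


def processing_allowed(worldox_users: int, licensed_users: int,
                       overage_first_seen: date, today: date) -> bool:
    """Return True while processing may continue."""
    if worldox_users <= licensed_users:
        return True  # no license issue at all
    # Over the licensed count: keep processing only during the grace period.
    return today - overage_first_seen <= GRACE_PERIOD
```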

6.3.11 -

- Improved note behavior from 6.3.10 to encourage users to actually pay attention

6.3.10 -

- Added a note reminding users to enable email attachment indexing in their DMS next to the 'Process MSG (email) attachments' setting on the Processor configuration screen. This note hides/shows depending on whether MSG processing is disabled or enabled

6.3.9 -

- Regression bug fix - the hidden file bug fix from 6.3.6 got reintroduced in 6.3.7

6.3.6 -

- Bug fix - files that were marked as Hidden would be OCRed, but the conversion results couldn't be returned to the file system
- Bug fix - in some extremely rare instances, SOCR could generate a corrupted file (charset encoding issue)

6.3.5 -

- Added additional error trapping for corrupted MSG files (0x8004011b error code)
- If internals of MSG cause attachments to be inaccessible, the document record will now be put in the Inaccessible list (old behavior was to put it in the Reprocessing list)

6.3.4 -

- Added low disk space warning and error to backup manager. By default, these are set to 1.5 GB for warnings and 1 GB for errors
- If disk space for backups drops below 'error' disk space level, documents will be moved to the Reprocessing list
- Levels can be adjusted by manually editing settings.xml and adding errorUsableSpace and warnUsableSpace parameters to the <backupManager .... /> element
- This check only occurs if backups are enabled
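
The threshold logic above can be sketched as follows. This is an illustration, not the product's code: the constant and function names are hypothetical, "GB" is assumed to mean 1024^3 bytes, and the units expected by the errorUsableSpace and warnUsableSpace parameters are not stated in the note.

```python
import shutil

GIB = 1024 ** 3
WARN_USABLE_SPACE = int(1.5 * GIB)   # default warning level per the note
ERROR_USABLE_SPACE = 1 * GIB         # default error level per the note


def classify_free_space(usable_bytes: int,
                        warn: int = WARN_USABLE_SPACE,
                        error: int = ERROR_USABLE_SPACE) -> str:
    """Map free space to 'ok', 'warning' or 'error'."""
    if usable_bytes < error:
        return "error"    # documents would be moved to the Reprocessing list
    if usable_bytes < warn:
        return "warning"
    return "ok"


def check_backup_volume(path: str, backups_enabled: bool = True) -> str:
    """Check the volume holding backups; skipped when backups are disabled."""
    if not backups_enabled:
        return "ok"
    return classify_free_space(shutil.disk_usage(path).free)
```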

Summary 6.0-6.2

Here's an overview of the major changes:

  • Added support for email attachments - now, attachments will automatically be OCRed when the email is saved to Worldox.
  • OCR activity is now logged in the Worldox Audit Trail, so that you can note the full OCR history of a document within Worldox
  • Better page count management (for firms that have a large backlog of files to OCR)
  • New document prioritization levels
  • Ability to specify when each profile group should be OCRed
  • Better backlog reporting (explicitly display page usage in past year, recommended license size)
  • Ability to report on backlog progress for each profile group
  • Streamlined installation (Client ID and Partner ID data entry eliminated)
  • Page count will now reset on anniversary date, not January 1
  • User interface overhaul with consistent page layout and links to online knowledge books
  • Consistent heartbeat error/warning reporting

 

6.3.2 -

- Add support for unlimited page count Trumpet licenses (P0 feature in the Trumpet license)

6.3.1 -

- UI improvement - MSG analysis was showing page X of Y of the previous PDF analysis status, even though there weren't any pages being analyzed

6.1.62 -

- SOCR installer was creating empty config, data, logs and work folders in the directory containing the setup executable

6.1.56 -

- Bug fix - some database states weren't reporting in the system status
- If database fails to open, its state is switched back to Closed before throwing an exception
- Graph generation failed with exception trace if there was no data

6.1.55 -

- Bug fix - SOCR was hard coding the full path of the backup folder, instead of using relative paths
- Bug fix - Backup manager would fail to make backups if the user migrated the configuration from a 32 bit machine to a 64 bit machine - this now gets automatically corrected

6.1.53 -

- If we fail to open database on launch or on scheduled rebuild, we now do a hard fail - present an error dialog on screen, then kill SOCR (with exit code 999)
- If we fail to re-open the database after scheduled maintenance, we now do a hard fail

6.1.52 -

- When OCR results are returned to Worldox, post an audit trail entry (Save)

6.1.51 -

- If documents.lg file is corrupted, we now attempt to delete it

6.1.50 -

- Backlog throttling algorithm adjustments (undoing some of the 6.1.48 changes)
- SOCR now bases its default reserve on an assumed license duration that is 13/12 of the actual license duration (for a 380 day license, this equates to an additional 42 days). This will cover cases where licenses are entered prior to the license start date, at the price of slightly higher initial backlog processing before throttling kicks in.

6.1.49 -

- Changes to document priority in Worldox and Folders screens now run as a background task with progress displayed in the standard background task frame
- Changes to the Processor Config screen that result in bulk changes to documents (moving Wrong Type to Analyzing, or moving unprocessed email message to Analyzing, etc...) now run as background tasks

6.1.48 -

- Backlog throttling algorithm adjustments
- If there is more than 30 days worth of data, SOCR will now dynamically compute the reserve capacity (130% of the average number of OCRable pages added over the past year).
- If there is insufficient data, SOCR will now reserve 3/4 of future processing capacity for new pages
- The minimum default reserve capacity is 50 pages/day (this can be overridden using overclocking)
- Not a change, just a reminder: In all cases, if the reserve capacity isn't used on a given day, those pages become available for backlog processing
- Split up the Processor setting blocks (Backup retention and backlog throttling settings are now grouped separately)
- Changed wording on backlog throttling / processing capacity reserve settings (plus put the checkbox and pages/day input field on the same line)
- The counts for the pages added in the past year are now stored to disk so we don't have to compute them immediately on startup (they are refreshed once per day)
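
The reserve rules listed for 6.1.48 can be read as a small calculation. The sketch below is one possible interpretation only: it assumes the "average" is a per-day average over the days of data available, rounds up to whole pages, and uses hypothetical names throughout. Later builds (see 6.1.50) adjusted the algorithm again.

```python
MIN_RESERVE_PAGES_PER_DAY = 50  # minimum default reserve per the note


def daily_reserve(pages_added: int, days_of_data: int,
                  daily_capacity: int) -> int:
    """Pages per day held back for newly added documents."""
    if days_of_data > 30:
        # 130% of the average pages added per day, rounded up (integer math).
        reserve = (130 * pages_added + 100 * days_of_data - 1) // (100 * days_of_data)
    else:
        # Insufficient data: reserve 3/4 of processing capacity for new pages.
        reserve = (3 * daily_capacity) // 4
    return max(MIN_RESERVE_PAGES_PER_DAY, reserve)
```

For example, 36,500 pages added over 365 days averages 100 pages per day, so the reserve under these assumptions would be 130 pages per day.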

6.1.46 -

- If MsgEdit.exe doesn't return results within 60 seconds, we now destroy the sub-process, abandon the call and throw an error
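
The general pattern described above (run a helper process, give up after 60 seconds, destroy it and raise an error) looks roughly like this. The wrapper name is hypothetical and this is not how SOCR itself invokes MsgEdit.exe; it is a generic sketch of the technique.

```python
import subprocess


def run_with_timeout(command: list, timeout_seconds: float = 60) -> bytes:
    """Run a helper process and return its stdout, or raise if it hangs."""
    try:
        # On timeout, subprocess.run kills the child before re-raising.
        completed = subprocess.run(command, capture_output=True,
                                   timeout=timeout_seconds)
    except subprocess.TimeoutExpired as exc:
        raise RuntimeError(
            f"{command[0]} did not return within {timeout_seconds} seconds"
        ) from exc
    return completed.stdout
```

Abandoning the call this way keeps one stuck message from blocking the rest of the queue.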

6.1.45 -

- Bug fix - bulk operation buttons were showing by default, they are now hidden until the user clicks the Show Bulk Operations button
- Bug fix - email messages containing MSG attachments whose names had double quotes and periods near the end, and which in turn contained attachments, resulted in errors during processing

6.1.44 -

- Added new process priority level: Analysis Only (no OCR) - these documents will be analyzed and will stay in the Processing list, but will not be processed
- Changed label on Very High and High processing priority to have "(no throttling)" at the end

6.1.43 -

- Bug fix - background tasks would disappear from UI before they finished running (only on some browsers)

6.1.42 -

- Added new setting to Worldox config (only in settings.xml, not UI): autoMapDisconnectedDrivesEnabled if true (the default), any disconnected drives are mapped. If false, no drive mapping will be attempted.

6.1.41 -

- Bug fix - Worldox connection was getting reset immediately after launch
- Tweaking display of pages left in Processor feature on License screen (displayed inaccurate data for Jan 1 resetting licenses)


6.1.38 -

- Bug fix - backlog throttling wasn't working properly.  In some cases, it would allow runaway, unthrottled processing of backlog
- License detail screen now displays additional information about the license
- License detail screen Features list now gives info about how many pages have been used and how many are remaining
- Added columns to the CSV document export for priority, last modified

6.1.37 -

- Bug fix - sites that had UNC mapped profile groups where the UNC share was no longer valid would wind up with no PGs being found at all

6.1.36 -

- Added pages left readout to Processor config screen

6.1.35 -

- Documents with pages larger than a threshold (10,000x10,000 pixels by default) are now placed in a 'Too big' document list.  The size limits are configured in the PreProcessor section of the settings.xml file - maxWidthPixels, maxHeightPixels
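
A small sketch of the size check, assuming a document counts as 'Too big' when any single page exceeds either limit. The function names are hypothetical; the defaults mirror the maxWidthPixels and maxHeightPixels values given above.

```python
MAX_WIDTH_PIXELS = 10_000    # maxWidthPixels default per the note
MAX_HEIGHT_PIXELS = 10_000   # maxHeightPixels default per the note


def page_is_too_big(width_px: int, height_px: int,
                    max_width: int = MAX_WIDTH_PIXELS,
                    max_height: int = MAX_HEIGHT_PIXELS) -> bool:
    """True if the page exceeds either dimension limit."""
    return width_px > max_width or height_px > max_height


def document_is_too_big(page_sizes) -> bool:
    """page_sizes is an iterable of (width_px, height_px) tuples."""
    return any(page_is_too_big(w, h) for w, h in page_sizes)
```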

6.1.34 -

- Bug fix - processing files with huge numbers of pages (>1000) could result in 'CreateProcess error=206, The filename or extension is too long' error
- Processor and Analyzer will now allow up to 5 errors in a one hour period before pausing processing
- Handle 0x80030050 errors (STG_E_FILEALREADYEXISTS) - these files are now flagged as corrupted
- Handle 0x80030005 errors (STG_E_ACCESSDENIED) - these files are now flagged as restricted
- Handle 0x80004005, 0x800300fa errors - these are flagged as corrupted now
- Added 'Reason' to Needs Attention list
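
The "up to 5 errors in a one hour period" allowance above is a rolling-window counter. The sketch below shows one way such a counter can work; the class name is hypothetical, and pausing on the sixth error within the hour is an assumption about how the limit is applied.

```python
from collections import deque
from datetime import datetime, timedelta


class ErrorWindow:
    """Counts errors inside a rolling time window."""

    def __init__(self, max_errors: int = 5,
                 window: timedelta = timedelta(hours=1)):
        self.max_errors = max_errors
        self.window = window
        self._timestamps = deque()

    def record_error(self, when: datetime) -> bool:
        """Record an error; return True if processing should pause."""
        self._timestamps.append(when)
        cutoff = when - self.window
        # Drop errors that have aged out of the window.
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps) > self.max_errors
```

Occasional isolated failures never trip the limit, while a burst of failures (for example a DMS outage) pauses processing quickly.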

6.1.33 -

- the order that PGs are searched is now driven by the default priority assigned to that group (high priority searched before low priority)
- the order that folder searches are added to the finder task is driven by the default priority assigned to each root folder

6.1.32 -

- documents in CORRUPTED list will no longer be auto-reanalyzed after every update
- Corrupted MSG files were being put into the Email Messages list instead of the Corrupted list

6.1.31 -

- Document List views now support Background Tasks display for bulk operations (Delete, Change State, Change Priority)

6.1.31 -

- Bug fix - attachments with tiff extensions (i.e. 4 characters) were not being handled properly (continuously put back on REPROCESS list)

6.1.29 -

- Backup purge now removes files from work\backupfiles folder tree if those files aren't referenced by a Document record

6.1.28 -

- MsgEdit failed to write output file if attachment names contained unicode characters
- New Background Task sub-system has been added - currently integrated into Processor Config (and parts of Advanced and Scheduler Config)
- Backup purge is now implemented as a Background Task

6.1.27 -

- Bug fix - MsgEdit.exe crashes sometimes
- Sub-attachment names are now prefixed with the sub-email message name that they came from
- Msg working folder is now flushed each time we invoke MsgEdit.exe
- SOCR will attempt to connect network drives for Worldox profiles that have been disconnected
- WDAPI32.DLL will now be completely unloaded when we reset the Worldox connection

6.1.24 -

- Fix for 'Premature end of file' and 'Content is not allowed in trailing section' errors when processing MSG files (MsgEdit wasn't closing results.txt properly).  MsgEdit 1.0.0.7

6.1.23 -

- If PDF was corrupted, but could be rebuilt during analysis, we now still mark the PDF as corrupted (error message is 'PDF is partially corrupted - but it can probably be repaired in Acrobat then resubmitted for processing').  These types of PDFs can't be modified in 'append mode' to place the invisible text layer, so it makes no sense to continue processing them (even though technically we can OCR them)
- Restricted documents now only go to the Encrypted/Restricted list if they would have otherwise been processed
- Digitally signed documents now go to the new Digitally Signed list if they would have otherwise been processed
- Digitally signed and encrypted state is now displayed in Document detail screen (if the document is signed and/or encrypted)
- Added list descriptions for a few document lists that were missing descriptions
- If document modified date is more than 1 day in the future, we process the document (instead of putting it in the re-process list) - we had some sites that had whacked modified dates (like 5 years in the future) on documents and SOCR was thinking that they were modified recently so kept putting them in the re-process list

6.1.19 -

- Better error message if something goes wrong during PDF analysis (include the filename and pagenumber)

6.1.17 -

- Add document name to warning logging when inline image parsing of pages in a PDF fails
- Added better description to the email related document lists (there was no description before)
- Better error handling if Outlook wasn't installed on the workstation (or isn't working for some other reason) - warning now appears in system status if there is a problem, AND MSG handling has been enabled
- We now have a new list for unprocessed email messages (MSG files go into here if MSG processing isn't enabled, or if Outlook isn't installed properly)
- When MSG handling is changed from disabled to enabled, SOCR will now mark unprocessed email messages and unprocessed email attachments for re-processing
- On launch, if MSG handling is enabled, SOCR will now mark unprocessed email messages and unprocessed email attachments for re-processing

6.1.14 -

- change installer - the Java bundle id download link is now BundleId=81819 (Java 7_u45)
- change installer - the Java installation now completely runs in silent mode - the user doesn't have to click through Java installation screens, and they aren't taken to a web site to test the Java install after it completes
- change installer - the Java installation is configured to NOT integrate with the web browser on the machine

6.1.13 -

- Message dialog when user launches SOCR multiple times is friendlier - and clicking OK on it displays the UI of the already running instance.

6.1.12 -

- First pre-release with MSG attachment handling

6.1.10 -

- Bug fix - sometimes after saving processor changes, analyzer would wind up not running

6.1.8 -

- Change labels on Processor config screen
- Adjusted 'Change state to' history messages to display the 'pretty name' instead of the CONTAINER, ERROR, etc... enum name
- Display attachment name in document detail screen
- Bug fix - when processing TIFF attachments, the attachment record in the parent document was still referring to the .TIF file extension - we now update the analysis results (which is where the attachment information lives) whenever we rename an attachment

6.1.7 -

- Bug fix - TIFF email attachments were staying with tif file extension even after being converted to PDF
- Bug fix - TIFF email attachments were being named 000000 instead of retaining the original name (with new extension)

6.1.5 -

- MSG files weren't being found in Worldox finders
- Eliminated 'allowedExtensions' setting in Worldox and Folder finder configuration - replaced with 'disallowedExtensions' - this isn't surfaced in the UI, but if there are firms that don't want us finding MSG files or what-not, we can set this

6.1.3 -

- Added 'Refresh' button next to Worldox PG list (allow users to see changes to the PG lists made since SOCR launched)

6.1.2 -

- Added 'Allow processing of email attachments' setting to Processor config
- Cleaned up Processor config UI a bit

6.1.1 -

- First build with support for MSG handling

6.0.17 -

- Bug fix - if the file in the underlying file system has no modified date set, database corruption could result when the document is added to the database.  See ticket 20124 for details.

6.0.15 -

- Bug fix - if a database corruption occurs, and the user does a database reset (renamed documents.db and documents.lg) before a rebuild can happen, all future launches fail

6.0.14 -

- Bug fix - profile groups with base paths containing spaces could prevent Finder from working, even if the profile group wasn't selected for processing

6.0.13 -

- Changed Folder feature description to "Windows folder tree integration"  (old description referred to 'processing' which could cause confusion)

6.0.12 -

- Bug fix - Warning wasn't showing if no Worldox PGs were selected

6.0.10 -

- Licensing no longer warns about unknown feature codes in new license format

6.0.8 -

- Bug - null pointer exception during page count calculations for old license type if Abbyy engine fails to initialize

6.0.7 -

- Move to 10.5.0.58b engine installer

6.0.6 -

- Added support for OCR of Spanish and Brazilian Portuguese documents

6.0.5 -

- Bug fix - older sites that were using the Auto Select PGs checkbox wound up with no PGs selected after upgrading.  We removed the Auto Select PGs option when we moved to version 6, so a compatibility shim was needed for those sites

6.0.4 -

- Bug fix - heartbeats were reporting the OCR backlog size incorrectly (often times not reporting it at all)

6.0.3 -

- Bug fix - long running finder tasks were showing 'Waiting for other tasks to complete' when they were actually running

6.0.2 -

- Simple view of search summary screen (the bar graph progress dialog) no longer has hyperlinks on the cabinet paths (this was allowing users to easily get to a non-simple view of things)

Summary 6.0

Symphony 6.0 brings a major change to the workstation user interface.  Here's an overview of the major changes:

  • New document prioritization levels
  • Ability to specify when each profile group should be OCRed
  • Better backlog reporting (explicitly display page usage in past year, recommended license size)
  • Ability to report on backlog progress for each profile group
  • Streamlined installation (Client ID and Partner ID data entry eliminated)
  • Page count will now reset on anniversary date, not January 1
  • User interface overhaul with consistent page layout and links to online knowledge books
  • Consistent heartbeat error/warning reporting

6.0.1 -

- Moved all changes from versions 5.3.13 to 5.4.51 to version 6.

5.4.51 -

- Bug fix - page count and document count usage displayed in the heartbeats was flipping (pages,documents then documents,pages) every time a new document was processed.

5.4.50 -

- Added debug output to track when processor and pre-processor are started and stopped

5.4.49 -

- Tweak to Unavailable list text - changed to Moved/Unavailable

5.4.48 -

- Heartbeats now include the actual status (WARN, ERROR) before the details

5.4.47 -

- Added download URL for engine installer if auto-download fails
- Bug fix - in Worldox and Folder config screens, the 'apply to existing documents' checkboxes were displaying initially, even in modern browsers
- Bug fix - in Worldox and Folder config screens, clicking the 'select all' checkbox had no effect
- Worldox API session IDs are now generated using current clock time - avoid issues with accidentally reconnecting to old (corrupted) WDAPI

5.4.46 -

- Bug fix - Worldox configuration screen could fail to apply changes with NullPointerException in rare situations

5.4.45 -

- Bug fix - setting changes could not be saved if config.xml file didn't exist

5.4.44 -

- Bug fix - Heartbeat sender was crashing SOCR on launch if license wasn't populated

5.4.43 -

- Updated to jWDAPI 20130508 - adding ability to detect when WDAPI doesn't load (vs has errors)
- Much better error message when we aren't able to load the Worldox API: "Unable to load Worldox libraries - please close Symphony, launch and close Worldox, then re-launch Symphony.  Error details: Worldox API not initialized. Worldox must be launched and closed at least once for a given Windows login (this registers the Worldox programming interface). If you have recently updated Worldox, you may need to launch and close Worldox one time to get the update downloaded."

5.4.42 -

- Bug fix - heartbeat wasn't sending appid
- Switched to using new heartbeat post type (no client or partner ID)
- Added link to WD config screen from license error message (ticket 18458)
- Added 'Send Heartbeat Now' link to the License page (under Advanced section)

5.4.41 -

- Fix typo in search summary screen

5.4.40 -

- Bug fix - analysis of some PDF files was showing incorrect image ratios (crop and media box extents issue)
- Bug fix - visible text outside the crop box was being included in the visible text counts - this text is now being excluded from the visible text counts
- Added ability for SOCR to generate thumbnail images of pages that don't have one already (disabled by default)
- Added 'Generate Thumbnails' setting to Processor configuration

5.4.39 -

- S-OCR now tracks page counts based on the license renewal date (will take effect at the next license renewal)
- When the new page count system is active, the display of remaining processing capacity reflects the renewal date in MMM, yyyy format (or MMMM d, yyyy format if the license is for less than a year)
- Added New pages per year calculation to the welcome screen
- Added 'Recommended license capacity' to welcome screen if the current license isn't big enough to handle the backlog and one year's new documents
- Added explicit link for displaying All Weeks of the Backlog Summary screen
- Fixes a number of display issues with backlog throttling warnings and other error and warning messages
- Heartbeat now includes page usage summary for each year (only take effect when the new page count tracker is active, so it'll be awhile until we have good data on this)
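
The two renewal date formats mentioned above (MMM, yyyy for full-year licenses and MMMM d, yyyy for shorter ones) would render as shown in this sketch. The function name is hypothetical, "less than a year" is assumed to mean fewer than 365 days, and month names assume an English locale.

```python
from datetime import date


def format_renewal(renewal: date, license_days: int) -> str:
    """Format the renewal date the way the note above describes."""
    if license_days < 365:
        # MMMM d, yyyy: full month name and day without zero padding
        return f"{renewal:%B} {renewal.day}, {renewal.year}"
    # MMM, yyyy: abbreviated month name and year only
    return f"{renewal:%b}, {renewal.year}"
```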

5.4.38 -

- Add debug lines to troubleshoot issue with PG's not being found
- Remove config\TRE_settings.reg from installer
- Bug fix - an extra heartbeat sender was being created, and it was putting error messages in the log files

5.4.37 -

- SOCR now saves a backup of the previous settings.xml file (in the config\bak folder) every time the user changes settings.  These backups are retained for 90 days.
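
The backup-and-retain behavior above follows a common pattern: copy the file aside before each change, then purge copies older than the retention period. The sketch below illustrates that pattern only; the backup file naming and the function name are hypothetical, and the bak folder is assumed to sit next to settings.xml as the note indicates.

```python
import shutil
import time
from pathlib import Path

RETENTION_SECONDS = 90 * 24 * 60 * 60  # backups are retained for 90 days


def backup_settings(settings_path: Path) -> Path:
    """Copy settings.xml into the sibling bak folder and purge old copies."""
    bak_dir = settings_path.parent / "bak"
    bak_dir.mkdir(exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = bak_dir / f"settings-{stamp}.xml"  # hypothetical naming scheme
    # shutil.copy gives the copy a fresh modified time, which the purge uses.
    shutil.copy(settings_path, target)
    now = time.time()
    for old in bak_dir.glob("settings-*.xml"):
        if now - old.stat().st_mtime > RETENTION_SECONDS:
            old.unlink()
    return target
```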

5.4.36 -

- Change headings on document detail screen to have 'Control' separate from 'Details'

5.4.35 -

- Change label on Processor Config 'Performance' section to be 'Performance  (since last restart)'

5.4.34 -

- Make summary screen progress bar show percent complete instead of percent remaining

5.4.33 -

- UI tweaks on summary screen

5.4.32 -

- Legacy schedule entries "RUNFINDER_QUERY" and "RUNFINDER_SPIDER" are now discarded. Previously, they would appear as "Operation (RUNFINDER_QUERY) not available". Because these particular tasks will never be available, we now discard them
- Improved look and feel of summary screen

5.4.31 -

- Bug fix - Heartbeats weren't sending on a regular basis (they only sent when the application launched).  Heartbeats should now be sent once per hour.

5.4.30 -

- Added SYMPHONYANALYZER license type

5.4.29 -

- Bug fix - documents in very low and low priority weren't being processed at all (even if backlog throttling wasn't active)

5.4.28 -

- Bug fix - queue analysis was messing things up if file modified date was greater than when the queue analyzer was first created
- Bug fix - Show/hide bulk operations wasn't working in Internet Explorer

5.4.27 -

- Bug fix - huge PDF files caused out of memory exceptions during processing

5.4.26 -

- Bug fix - invalid document modified times could cause queue analyzer to mis-calculate
- Bug fix - errors during document mutator notifications could cause DB to actually be corrupted

5.4.25 -

- Ignore All is now available in the Processing list bulk operations

5.4.24 -

- bug fix - missing files could cause process summary graph to be improperly computed ( java.lang.ArrayIndexOutOfBoundsException: -1 error )

5.4.23 -

- Enabled heartbeat sending using the old heartbeat format (otherwise heartbeats aren't showing up)

5.4.22 -

- Search Summary screen now displays a progress bar for each profile group (or folder)

5.4.21 -

- Bug fix - if two web requests were hitting at exactly the same time, they could conflict and result in ClassCastException errors.  This may also address problems where sometimes clicking a link didn't seem to always work.  This problem has been in the code since forever - I'm glad we got it fixed
- Bug fix - document in NEW list on Tia's VM testing - we now reprocess anything in the NEW list for good measure
- Fixed two knowledge article links (they were pointing at admin.php instead of index.php)

5.4.20 -

- Bug fix - log file had error messages related to Scheduler during initialization
- Bug fix - Summary screen was only showing results for first 50 documents
- Added display of which list the Summary is for (added 'in Processing list' to the end of the search criteria)

5.4.19 -

- Bug fix /maestro/do/status wasn't showing the status level
- We now re-analyze any document in the DELETED list when SOCR launches (documents should never be in the DELETED list on launch, this is a cleanup from an earlier bug) - these document records will almost certainly wind up being on the UNAVAILABLE list.

5.4.18 -

- Show Bulk Operations button now changes its text to 'Hide Bulk Operations' if appropriate
- Bug fix - the bulk operations were always showing in certain versions of Internet Explorer (even with Javascript turned on)

5.4.17 -

- Bug fix - installer was forcing Cleaner Recovery every time SOCR was updated (yuck)

5.4.16 -

- Bug fix - in document list, clicking Reanalyze on a document, then applying a filter to the next screen resulted in errors
- If installer is unable to clear out existing files, it now gives the user Retry and Cancel buttons (instead of just aborting the installation)

5.4.15 -

- Bug fix - loading pre-5.4 databases could result in a failed launch with error in logs: com.trumpetinc.maestro.MaestroApp  - Problem during initialization - will attempt database maintenence when we launch again - -1757588944

5.4.14 -

- Bug fix - Reanalyze button on individual files was resulting in stack trace error
- If filter is specified as empty, or ending with a backslash, we now add an * to the end
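
The filter rule in this release can be illustrated with a short sketch. The function name `normalize_filter` is hypothetical, and leaving all other filters untouched is an assumption; only the empty and trailing-backslash cases come from the note above.

```python
def normalize_filter(filter_text: str) -> str:
    """Append '*' when the filter is empty or ends with a backslash."""
    if filter_text == "" or filter_text.endswith("\\"):
        return filter_text + "*"
    # Assumption: any other filter is used exactly as the user typed it.
    return filter_text
```

For example, a filter of `X:\Clients\` would be treated as `X:\Clients\*`, and an empty filter as `*`.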

5.4.13 -

- Removed Detail link from document list - details are now obtained by clicking on the filename itself
- Added 'Show Bulk Operations' button - clicking this displays a panel with the bulk operations on it
- Added 'Bulk Operations' pane with individual buttons for performing bulk operations on the filtered list results
- Changed 'View' to 'Open' for Worldox documents (will open the document in Worldox)
- Re-arranged per-document operation buttons so they lay out nicer for Folder and Worldox sourced documents ('Open' is now at the end of the list)


5.4.12 -

- Redesign Folder configuration screen - individual items can be enabled/disabled, added default priority setting, simplified adding new folders
- Added Default Priority to Worldox configuration screen
- Added 'Reprioritize existing documents' checkbox to Folder and Worldox configuration screens (this appears when the priority is changed)
- Added View Summary link to Folder configuration screen
- Added What's this section to Lookup by Path screen

5.4.11 -

- Advanced config option: New webServerConfigration -> listenPort setting - controls which port the internal SOCR web server will listen on.  Default is 14722.

5.4.10 -

- All links in the "What's this" sections of each page now open in separate tabs
- Fix capitalization of Processor, Analyzer and Finder in scheduler task names

5.4.9 -

- Lookup By Path - added Tip line
- Scheduler - removed Run Now button (doing that feature properly would require a lot of work - not worth it right now)
- Analyzer config screen, changed 'Settings' to 'Advanced Settings'
- In Licensing screen, Change Analyzer feature to just read 'Analyzer' and Processor feature to just read 'Processor'

5.4.8 -

- Scheduler will no longer schedule tasks that the license doesn't allow (prior, if a task was configured, then the license changed, the task would still be scheduled for execution)
- Scheduler Configuration screen only displays schedule entries that are for tasks that are allowed by the license
- Scheduler Entry Edit screen now only displays task types that are allowed by the license
- Added bulk priority change buttons to document list screens (not sure if this is the best UI for this...)
- Bug fix - Search Criteria in Search Summary page weren't showing the search description
- Search Criteria in Search Summary page can be clicked to get a list of documents meeting that criteria (i.e. all documents in a particular PG) - this will eventually allow users to adjust priority on all documents in a PG, for example

5.4.7 -

- Change 'read only' to 'read-only' in WD config screen
- Added Save button to Worldox basic settings section
- Save button in Processor and Analyzer screens are now below the Settings areas (consistent with other screens)
- Scheduler list now has a 'Delete' button for removing schedule entries.  The Edit Schedule Entry screen no longer has a 'delete' button
- Bug fix - "Client ID not set" error on clean install
- Configuration file settings:
   - documentProcessor autostart="true"  - if 'false', the processor will not start when S-OCR starts (useful for troubleshooting)
   - documentPreProcessor autostart="true"  - if 'false', the analyzer will not start when S-OCR starts (useful for troubleshooting)
   - MaestroConfig checkDatabaseIntegrityOnLaunch="false"  - if 'true', SOCR will check the database integrity as it launches (problems are logged as fatal errors to the maestro.log file, an error dialog will appear on screen and the launch will fail) - this slows the launch down, and shouldn't be used in production, but it would be very good to have this turned on in our test environments
- Bug fix - caption on Search Summary screen said 'Search Summary for Search Summary' - it now properly says the name of the search summary (e.g. Worldox profile groups)
- Ability to change individual document priority from the document list
- Bug fix - backlog calculation wasn't looking back 5 days (for all intents and purposes, anything older than today was considered to be part of backlog)
- "Backlog throttling active" warning message in Processor configuration screen now displays the actual date of the backlog cutoff
- Processing and Analyzing lists now have headings for processing priority groupings

5.4.6 -

- Bug fix - Scheduler is coming up without default entries on clean installs
- Date column in Processing and To Analyzing lists displayed the last time the file was processed.  For these two lists, this column will now display the modified date of the file.
- Added mechanism for controlling processing priority (Very Low, Low, Normal, High and Very High) - documents are grouped by their processing priority (and ordered by file modified date inside each priority group).  Groups Very Low and Low are always considered to be part of the backlog.  Group Normal documents are part of the backlog if they are older than 5 days.
- Detail:  Database rebuild is required by processing priority implementation - this will happen automatically the first time the DB is opened
- Worldox PG analysis results can now be obtained by clicking the new View Summary link in the PG selection list  
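
The priority grouping and backlog rules described in this release can be sketched as follows. This is an illustration only, not the product's implementation: the `Doc` class, the numeric ranks, the choice of "newest first" inside a group, and the treatment of High and Very High as never being backlog are all assumptions. The priority names and the 5-day rule for Normal come from the note above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Priority names are from the release note; the numeric ranks are hypothetical.
PRIORITY_RANK = {"Very Low": 0, "Low": 1, "Normal": 2, "High": 3, "Very High": 4}
BACKLOG_AGE = timedelta(days=5)


@dataclass
class Doc:
    path: str
    priority: str
    modified: datetime  # file modified date


def is_backlog(doc: Doc, now: datetime) -> bool:
    """Very Low and Low are always backlog; Normal is backlog when older than 5 days."""
    if doc.priority in ("Very Low", "Low"):
        return True
    if doc.priority == "Normal":
        return now - doc.modified > BACKLOG_AGE
    # Assumption: High and Very High documents are never treated as backlog.
    return False


def processing_order(docs):
    """Group by priority, then order by modified date inside each group.

    Assumption: higher priorities come first and newer files come first.
    """
    return sorted(docs, key=lambda d: (PRIORITY_RANK[d.priority], d.modified), reverse=True)
```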

5.4.5 -

- Bug fix - adding an empty string as a folder gave a null pointer exception
- Added hyperlink for support docs to Folder config screen
- UI improvements to monitored folder add screen

5.4.4 -

- Bug fix - clicking on View on a document that didn't originate in Worldox caused Worldox to try to view that document.  The View button is no longer displayed unless the document originated in WD.

5.4.3 -

- Bug fix - warning messages related to folder processing were showing up in the Finder screen, even though Folder processing wasn't enabled by the license.
- Bug fix - If license disallowed Folder Finder, then a new license was entered that allowed Folder Finder, the Folder Finder did not get turned on

5.4.2 -

- Bug fix - old Worldox selected PG configuration wasn't being carried over into the new system
- Bug fix - "0 is 0 or negative" error when analyzing zero sized PDF files - these now get properly flagged as being corrupted
- Bug fix - the warning "No Worldox profile groups are selected" error shouldn't display "Configure Worldox" link when the user is on the Configure Worldox screen.  Same for folder screen.
- Bug fix - during installation, errors pop up about not being able to configure firewall exclusions - for now, we are going to remove this from the installer - it's not worth the hassle if it's not going to work robustly

5.4.1 -

- Bug fix - backlog throttling was activating improperly on short trial licenses because it was using December 31 of the current year as the license reset point in the calculations. Now, if the license duration is less than 90 days, the backlog throttling algorithm will use the expiration date of the license itself (instead of Dec 31 of the current year)
- Installer will now configure Symphony OCR rule in windows firewall
- Breaking change:  maximum file modified date cutoff in the Finder is no longer honored - any sites that have this value set will start 'finding' older files.  This does NOT impact the maximum file age setting in the processor.  Note that this setting was removed from the UI a while back (unless the user had it actually set)
- Breaking change: Auto select PGs has been removed - we should probably have users check the selected PGs after they apply this update
- Launcher has additional error messages (Should provide more feedback in some cases when the Java runtime is corrupted)
- Finer grained history notes if document is inaccessible
- Breaking change: It is no longer possible to force a document record to be created by typing its path into the Search By Path dialog - if the document doesn't exist in the database already, it will no longer be created.  See http://forum.trumpetinc.local/viewtopic.php?f=2&t=905&p=7271#p7271
- Major new feature: Ability to page through list results
- Major new feature: Ability to filter list results.  All operations like Reanalyze All, Ignore All will apply to the filtered list
- Added a new document list - Unavailable for documents that have their document source (Worldox, folder, etc...) become unavailable (this can happen b/c of licensing, b/c the source is offline, or if the configuration of the source no longer includes the document - i.e. if the user changes the selected PGs in the Worldox DMS source)
- Major new feature: Support for new Folder DMS types
- Complete overhaul of Finder implementation to support pluggable DMS types (these are called Document Sources)
- Finder will now check existing document records for reprocessing, regardless of which list they are in (this check used to only be made against the document modified date - now it is made against the modified date AND file size)
- Uncommon document lists are now shown under a 'Other Lists' heading, if those lists have any documents
- If the analyzer or processor finds that a document is unavailable to the DMS (i.e. file was deleted, DMS is offline, etc...), those documents are moved to UNAVAILABLE (they used to be moved to DELETED) - see
- New feature: Ability to run a scheduler task by clicking 'Run now' in the scheduler configuration screen
- Complete overhaul of how the scheduler works - we now run tasks at a specific time, instead of running at a certain frequency between a start and stop time
- Breaking change: some existing scheduler entries will show as "Operation (XXXXXX) not available" - this will definitely happen for RUNFINDER_SPIDER and RUNFINDER_QUERY, which are no longer part of the scheduler
- Errors when unable to connect to Worldox are more readable
- Moved to new license feature scheme (based on letter codes - see http://forum.trumpetinc.local/viewtopic.php?f=2&t=885&p=7154&hilit=symphony+ocr+license#p7192 )
- Bug fix - documents could wind up in a DELETED list
- Bug fix - retained backups weren't being purged for all documents (if documents weren't on the PROCESSED list, they wouldn't have backups purged).  Now the purge considers all document lists.
- The version of S-OCR that actually processed a document is now embedded in the PDF.  If we re-analyze the PDF, that version will be displayed in the document details under the 'Marked' heading
- S-OCR now uses 'append mode' when adding invisible text to PDF files.  In append mode, the changes to the PDF are added to the end of the original PDF file.  This makes it possible to completely recover the original PDF by just deleting the last XXX bytes off the file.  Testing shows that the file size is not adversely impacted by this (in many cases, the resulting file is actually smaller)
- Bug fix - SOCR was OCRing documents that had been configured to allow Adobe Reader Form Filling (in the process of doing this, the form is no longer savable by Reader).  The Analyzer will now detect this and move the file to encrypted/restricted.
- Move to itext-5.4.1-20130310.jar - adds ability to see if a PDF has usage rights (Reader form filling enabled)
- Complete overhaul of the web pages that make up the UI - trying to make them much more consistent
- Change in behavior when processing TIF files - we now preserve the document record (we just change the document path in the record).  In the past, we created a new Document record for the resulting PDF, and left the record for the TIF file hanging around.  This meant a lot of bookkeeping with keeping track of which document got converted to which other document.  This has all been stripped clean.
- Document backup restore has been rewritten so the restored document record is preserved if the file extension is different (we used to create a new document record)
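
The backlog throttling fix at the top of this release (short trial licenses) boils down to choosing a different reset point. A minimal sketch, assuming "license duration" means the number of days between a license start date and its expiration date; the function and parameter names are hypothetical:

```python
from datetime import date

SHORT_LICENSE_DAYS = 90  # threshold from the release note


def license_reset_point(start: date, expires: date, today: date) -> date:
    """Return the date the throttling calculation treats as the license reset point."""
    # Assumption: duration is measured from the license start to its expiration.
    if (expires - start).days < SHORT_LICENSE_DAYS:
        # Short (trial) license: use the license's own expiration date.
        return expires
    # Otherwise: December 31 of the current year.
    return date(today.year, 12, 31)
```

With a 30-day trial running from March 1 to March 31, the reset point is March 31 rather than December 31, so the remaining page allowance is no longer spread over the rest of the year.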

5.3.13 -

- Bug fix - cleaner was not cleaning files that had been split by Symphony Profiler.  This has now been fixed - any documents in the Processed queue will be re-analyzed and re-cleaned again

5.3.12 -

Added logs/cleaned_files.log output with details of every file that is cleaned

5.3.11 -

Critical bug fix - Symphony OCR was misplacing OCRed text on pages that had to be rotated during processing.  In most cases, the text was placed completely off the page, making it possible to search *for* a document, but not search within the page (or copy text from the page).  This issue was introduced by a change in a 3rd party library on 6/20/2012 in S-OCR version 5.2.58.  This issue is now fixed.

After applying this update, any site currently running 5.2.58 through 5.3.8 will automatically enter a special recovery mode.  In this mode, all documents processed since 6/20/2012 will be investigated to see if any pages have text that lies outside the visual page boundaries.  If so, the invisible text on those pages will be removed, and the page will be re-OCRed.  This operation will not count against the site's annual page processing count.

The cleaning operation can be triggered manually by re-analyzing any document in the Processed list.  If the document is identified as having the problem, it will be moved to a new Backlog list named 'Cleaning'.  A special module ('Cleaner') will process this list.  The Cleaner module can be manually stopped and started from the Advanced screen.

5.3.10 -

Bug fix - Analyzer was identifying pages that had text right on the edge as being candidates for cleaning

Automatically give the client "bonus" processing capacity for any pages that we wind up cleaning  

Heartbeat now includes # of bonus pages for the current year (if > 0)

5.3.9 -

Critical bug fix - Symphony OCR was misplacing OCRed text on pages that had to be rotated during processing.  In most cases, the text was placed completely off the page, making it possible to search *for* a document, but not search within the page (or copy text from the page).  This issue was introduced by a change in a 3rd party library on 6/20/2012 in S-OCR version 5.2.58.  This issue is now fixed.

Any site currently running 5.2.58 through 5.3.8 will automatically enter a special recovery mode.  In this mode, all documents processed since 6/20/2012 will be investigated to see if any pages have text that lies outside the visual page boundaries.  If so, the invisible text on those pages will be removed, and the page will be re-OCRed.

The cleaning operation can be triggered manually by re-analyzing any document in the Processed list.  If the document is identified as having the problem, it will be moved to a new Backlog list named 'Cleaning'.  A special module ('Cleaner') will process this list.  The Cleaner module can be manually stopped and started from the Advanced screen.

For most sites, the automatic recovery mode should take care of everything without user involvement.

Added automatic detection of files that had misplaced OCR text.   

5.3.8 -

Bug fix - If pages had to be rotated (CW, CCW or upside down scans) during processing, the resultant invisible text was not being placed properly on the page (in many cases, the text was entirely off the page).  Introduced in version 5.2.58 (when we moved to Abbyy 10).  This fix does not address documents that have already been scanned - we are working on that.

5.3.7 -

Bug fix - Analyze Worldox Profile Groups was presenting totals for all documents, not just the OCR backlog

5.3.6 -

Advanced screen now has Analyze Worldox Profile Groups command - presents a summary table with the total number of documents and pages in each profile group

5.3.5 -

Added support for files with really long filenames (>260 characters) (due to limitations in Worldox, files must still have 8.3 filename equivalents)

5.3.3 -

Added /wait=X command line argument (waits X seconds before launching) - for example, /wait=5 will wait 5 seconds before really launching the application

5.3.2 -

Bug fix - small corruption in library caused S-OCR to crash in some rare cases (legacy profile groups)
Updated jWDAPI.jar and jWDAPI.dll - 20121005

 

Summary 5.2

  • Better processing of documents with over 1,000 pages
  • Improved OCR status messages
  • Several bug fixes and enhancements

Changes

5.2.77-

Added additional 80004005 message checks: "This image file format is not supported", "Unknown error while opening"
Bug fix: Files were being left behind after processing, causing the Symphony PC's disks to fill up - this was introduced in 5.2.58 - recommend that any site on 5.2.58 or higher update

5.2.76 -

Error 0x80004005 with error output of "Invalid PDF file" or "PDF data is corrupted" now causes the document to get moved to the Corrupted list, instead of the Needs Attention list

5.2.75 -

Bug fix: Crashes and documents winding up in Needs Attention list when processing really big files (> 1000 pages)

5.2.74 -

Bug fix: Some errors during OCR could result in an on-screen crash dialog (MaestroOCRProcess has encountered a problem and needs to close).  This dialog prevents further processing until Close is clicked.


5.2.73 -

/maestro/do/status screen now includes a line that says whether the backlog is "large" or not (>2000 pages)

5.2.72 -

Bug fix: some documents that were prevented from being modified by NTFS security weren't being moved to the INACCESSIBLE list (they were being placed on the REPROCESS list in a continuous loop)

5.2.71 -

If we fail to open a file for analysis (i.e. b/c of file system security), we now discard previous analysis results (this will, hopefully, trap the case where document security was changed AFTER we did analysis - right now, we are using cached analysis results, so we don't see that the security has changed)
Bug fix: In some older sites, the processing priority for a document was set to 0 - this caused an infinite loop that prevented those documents from ever getting processed

5.2.70 -

Bug fix: estimated time to process backlog calculation displayed incorrectly if the time was less than a day, but more than an hour

5.2.69 -

Bug fix: TIF files were being placed in the Needs Attention list with error "Adding text to image failed - ASDFHKWSEEWQI\Trumpet\Symphony.....\image_1.tif not found".  Also, Processor could show error "TifImagePath is empty" errors.

5.2.67 -

Bug fix: Text content of files OCRed *after* they were picked up by the text indexer weren't being added to the text indexes during nightly rebuilds

5.2.66 -

License status will now show an error if the license doesn't specify the allowed number of pages

5.2.65 -

Fix null pointer exception when displaying Processor config in some corner case scenarios

5.2.64 -

Bug fix - null pointer exception if document factory was closed due to an error, then Symphony is quit (this could prevent Symphony from quitting)
Bug fix - Compact database now purges document records that have the same path  

5.2.63 -

Bug fix - StringLong key would have exceeded 300K error messages in logs
New /maintenance command line argument - forces a full compact of the database as S-OCR launches (may be useful in cases where database corruption prevents S-OCR from launching)

5.2.61 -

Improved OCR status messages - they will now display the page that is being worked on (if a page number is available)

5.2.60 -

Bug fix - [0x80020005] - ERROR: Type mismatch error when processing documents that contain barcodes

5.2.58 -

Bug fix: "Comparison method violates its general contract!" error when analyzing some PDF files
Changed MaestroOCRProcess.exe to be SymphonyOCRProcess.exe

5.2.56 -

Bug fix - when compacting database, if finder process was running, the system would lock up after the compact completed

5.2.55 -

Added Compact Database scheduled task to scheduler - by default it runs at midnight on Wednesday

5.2.54 -

Added database status to heartbeat
Added database rebuild progress messages to the advanced screen

5.2.53 -

Heartbeats will be sent even if license is invalid
Heartbeat includes information about license expiration, etc...
Changed license configuration screen so error and warning messages are more prominent
Added a license warning (to the user interface and the heartbeats) if the license will expire in the next 30 days
Bug fix: If no PGs are selected, system status should show ERROR

5.2.51 -

Fix for "TIFF files winding up in Needs Attention" list - special handling for old-style TIFF files that use an unsupported JPEG compression format

5.2.50 -

Make sample image have modified date of 'today'
During install, we now log the version of installer to Trumpet-UpdateHistory.txt
Admin Guide link now points to Trumpet knowledge book instead of PDF
If the Windows user doesn't have sufficient permissions for OCR engine to work, we now produce meaningful error messages

5.2.48 -

Better handling of partially corrupted database indexes during rebuilds (should fix Java memory/heap space errors on rebuilds in some situations)

5.2.47 -

Added 'Ignore Specified' button to Advanced screen (allows bulk setting of Ignore items)

5.2.46 -

REPROCESS queue is automatically re-processed on launch
Advanced screen addition - you can now put in a bunch of file paths and click 'Process Specified Documents Immediately' to get them all to process at higher priority than other documents.  If the document is NEW or REPROCESS, it'll get moved to the PREPROCESS (analysis) queue, otherwise it stays in the queue it's in already (but with adjusted priority).  If the document is in a Post-Process document list, nothing will happen to it.
To Analyze queue is now ordered by 'Process Priority' (instead of when the document was found)
Improvements to user interface responsiveness (trying to eliminate problems where the user clicks a link, and nothing happens)
Adjusted labels on welcome screen to indicate that times to process backlog are *estimates*

5.2.45 -

Corrupted documents will now be reprocessed immediately after applying a code update
Added a new /maestro/do/status page that gives a plain text status of the system (useful for parsing by monitoring software)
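
As a rough illustration of how monitoring software might poll this page, here is a minimal sketch. Several things are assumptions: the `http` scheme, the port (14722 is the default `listenPort` documented for 5.4.11, so older installs may differ), and the idea that the word ERROR in the text signals a problem (the 5.2.53 notes mention a system status of ERROR, but the exact layout of the page is not documented here). The function names are hypothetical.

```python
from urllib.request import urlopen


def status_url(host: str = "localhost", port: int = 14722) -> str:
    """Build the URL of the plain text status page (scheme and port are assumptions)."""
    return f"http://{host}:{port}/maestro/do/status"


def fetch_status(host: str = "localhost", port: int = 14722, timeout: float = 10.0) -> str:
    """Download the status page and return it as text."""
    with urlopen(status_url(host, port), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def looks_unhealthy(status_text: str) -> bool:
    """Assumption: any occurrence of ERROR in the page means attention is needed."""
    return "ERROR" in status_text.upper()
```

A monitoring job could call `fetch_status()` on a schedule and raise an alert when `looks_unhealthy()` returns True or when the request itself fails.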

5.2.44 -

Fix for "Colorspace not supported" and "Color depth not supported" corrupted PDF warnings

5.2.43 -

Fix for database index corruption issues (this update will trigger a rebuild of the queue metric tracker)
Optimize performance when doing bulk document state changes
Fix for 'Dictionary key is not a name' corrupted file issue
Adjusted HTML layout of UI screen so it renders properly on old versions of IE

5.2.42 -

Clicking on backlog summary graph gives a larger image with all weeks (instead of just 104 weeks)
Fixed 'Unsupported color space' PDF corruption errors

5.2.40 -

Maintenance screen now refreshes once per second
Maintenance screen now has better progress messages
Added an animation to the maintenance screen

5.2.39 -

Adjusted graph y-axis so it automatically displays a more pleasing scale
Display a blank graph if no data is available

5.2.38 -

Fixed issue with 0 sized files showing ArrayIndexOutOfBoundsException: 0 error

5.2.35 -

Adjustments to Check for Updates to allow user to get latest production or pre-release updates

5.2.34 -

Fix 'Page N is corrupted - NNN' message on files in the corrupted document list

5.2.30 -

High performance support for huge files (>2GB)

5.2.29 -

Bug fix - Interruptions to pre-processing were causing the document to show as corrupted - the document will now be moved to Reprocess
Bug fix - exceptions during processing resulted in file being left in In Process queue

 

5.2.28 -

We have found a regression issue introduced in version 5.2.26 that can (in some rare cases) result in corrupted S-OCR database indexes.  This corruption can result in S-OCR refusing to launch.  We recommend that anyone currently running 5.2.26 or 5.2.27 update to 5.2.28 immediately.

  • Fix bug that could result in database index corruption
  • Force rebuild of database indexes for older sites (correct issues caused by the database index corruption bug) - this could cause 'Maintaining Indexes' screen to display for a little while during the first launch - just be patient!
  • Misc. bug fixes and tweaks
...

3. Release Summary 7.0

Summary

- Resolved issue with the "Uninstaller" not being available when installed as a logged-in user

- Changed NetDocuments OAuth token handling to support NetDocuments OAuth changes

- Added support for firms using proxy servers that use private certificate authorities to proxy HTTPS traffic.  Custom certificate authorities can now be registered in a Java Key Store file stored at /config/cacerts.private. 

- Updated to private Java Runtime, based on the open source Liberica JDK

- Resolved issues with SharePoint integration including: 
     1) Improved user count detection
     2) Added more informative error logging if SharePoint API fails
     3) Better handling for search results with 'null' file sizes

- Resolved issues with Aspose and Nuance PDF files that were not able to be safely processed by Symphony OCR

- Resolved issues with Java Heap Space crashing Symphony OCR

To see a complete list of changes, visit:  Change Log

...

4. Release Summary: 6.6

Summary 6.6:

  • Added OpenText integration
  • Worldox executable and dll files are now mirrored to a private copy. 
  • Added Basic Setting configuration option for specifying the Worldox network path
  • Added Advanced Setting to display the location of the private mirror folder
  • Updated the Welcome Wizard
  • Adjusted the “Always” email notification to read “Daily”
  • Symphony OCR now checks that the version of Worldox is earlier than the WDU10 release before checking 8.3 filename information
  • Added Multi CPU support - Symphony OCR will now process documents in parallel based on the number of CPU cores available (the default is 4)
  • Added the ability to Delete files in the Moved / Unavailable document lists
  • Improved the detection of pages that should be OCR'd even though they have excessive text in margins
  • Improved the detection of pages that should not be OCR'd even if they have full images on the page and rendered text beneath the image.
  • Added a new "Renew Connection" button to NetDocuments settings window

...

5. Release Summary: 6.5

Summary 6.5:
  • Folder document source now adjusts the time of the modified file to 1 minute to ensure that the indexing service sees that the file has changed.
  • If Symphony OCR encounters an Out of Memory error, the application will now be shut down.  This is in order to prevent a database corruption in the event of a memory problem
  • Added additional handling for invalid PDF files that have a rotation other than 0, 90, 180 and 270 degrees.  Symphony OCR handles these pages by ignoring the angle specification entirely
  • Improved ShareFile integration by searching not only Shared Folders but also My Files & Folders and Favorite Folders
  • Added support for processing .tiff files in ShareFile
  • Added SharePoint integration
  • Added a Welcome Wizard that appears when Symphony OCR is launched for the first time with no license configured
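
The rotation handling described in this summary can be pictured with a small sketch. It assumes that "ignoring the angle specification entirely" means treating the page as unrotated; the function name is hypothetical.

```python
VALID_ROTATIONS = (0, 90, 180, 270)  # the only angles named as valid in the note


def effective_rotation(declared_angle: int) -> int:
    """Return the rotation to apply to a page, given the angle declared in the PDF."""
    if declared_angle in VALID_ROTATIONS:
        return declared_angle
    # Assumption: an invalid angle is ignored, so the page is treated as unrotated.
    return 0
```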

...

6. Release Summary: 6.4

Summary 6.4

  • Statistics panel of the Summary screen now shows the number of documents in the OCR backlog as well as the number of pages.
  • Resolved issues with progress graph display
  • Added the ability to determine which
  • Added a warning message regarding processing MSG files when Outlook is not installed on the workstation running Symphony OCR
  • Added the ability to set a page count limit in the settings.xml file, which determines the maximum page count a document can have before it is placed in the “Too Big” list.  To utilize this feature, open the settings.xml file in Notepad and change maximumPageCount="300", where 300 is the maximum page count you want to process
  • Changed installer so that Run as a Service message indicates that this feature is not available for Worldox sites
  • Users can now roll back to the original (non-OCR’d) version of a document as long as that document has not been modified since it was OCR’d
  • Added error message if Worldox document could not be processed because of missing 8.3 filename information
  • Added Advanced Processor configuration setting to control the maximum number of cores that will be used during OCR.  If left empty, we will use all available cores (up to 4).  The setting must be empty or a number between 1 and 4.
  • Added the ability to filter found documents based on dates.  This allows for multiple Symphony OCR licensing on repositories that do not have distinct cabinets / separation.
  • Added support for NetDocuments' AU (Australian) data center
  • Added the ability to process digitally signed documents.  This is enabled by adding allowDigitallySigned="true" to the documentPreProcessor element in the settings.xml file.  NOTE:  This will invalidate the signature
  • Added support for additional languages (currently only English, Spanish, and Brazil dictionaries are part of the engine installation)
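
Two of the items above are enabled by editing attributes in settings.xml. The sketch below shows one way to make such an edit programmatically instead of in Notepad. The sample XML is a hypothetical, heavily simplified layout (a real settings.xml contains much more, and its nesting may differ); only the `documentPreProcessor` element and the `allowDigitallySigned` attribute are taken from the notes. The notes do not say which element carries `maximumPageCount`, so it is not shown. Back up the real file before changing it.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for settings.xml.
SAMPLE_SETTINGS = """<MaestroConfig>
  <documentPreProcessor autostart="true" />
  <documentProcessor autostart="true" />
</MaestroConfig>"""


def set_attribute(xml_text: str, element_name: str, attribute: str, value: str) -> str:
    """Set an attribute on the first element with the given name and return the new XML."""
    root = ET.fromstring(xml_text)
    # iter() also considers the root element itself, so nesting depth does not matter.
    target = next(root.iter(element_name), None)
    if target is None:
        raise ValueError(f"element {element_name!r} not found")
    target.set(attribute, value)
    return ET.tostring(root, encoding="unicode")


updated = set_attribute(SAMPLE_SETTINGS, "documentPreProcessor", "allowDigitallySigned", "true")
```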

...

7. Release Summary: 6.3

Summary 6.3

  • Added low disk space warning error.  If there is less than 1.5 GB of space left on the workstation, an error will be provided; if there is less than 1 GB of space left on the workstation, Symphony OCR will show in an Error state.
  • Added note to remind users to enable email attachment indexing in their DMS if MSG processing is enabled
  • Added 15-day grace period if the number of users exceeds the licensed Symphony user count.  During this grace period, the license issues an error, but processing will continue.  After the grace period has ended, Symphony OCR will no longer process documents.
  • Added the ability to allow the user to force processing of No Image/No Text documents.  If a user wishes to force processing of No Image/No text documents, they can either utilize the “Enable Processing” button on the document detail screen or select “Enable Processing” as a bulk action in the Document List screen. 
  • Added display path in addition to the canonical path for each document.  Filtering document lists will be performed against the display path.
  • Symphony OCR now tracks NetDocuments refresh tokens and issues a warning to the user 15 days prior to the user needing to manually re-authenticate the SOCR / NetDocuments connection.
  • Added the ability to install on Worldox Cloud environment
  • Lookup Document now allows you to type in the document id of a document as it appears in the document management system.
  • Adjusted the “Create Versions” option for processing documents in NetDocuments to True
  • Notification emails now include the name of the workstation that is running Symphony OCR in the subject line
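
The two disk space thresholds in this summary can be sketched as a simple classification. The level names are hypothetical labels, and treating 1 GB as 1024^3 bytes is an assumption; only the 1.5 GB and 1 GB thresholds come from the note.

```python
import shutil

GB = 1024 ** 3  # assumption: binary gigabytes


def disk_space_level(free_bytes: int) -> str:
    """Classify free space using the 1.5 GB and 1 GB thresholds from the note."""
    if free_bytes < 1 * GB:
        return "error-state"        # Symphony OCR shows in an Error state
    if free_bytes < 1.5 * GB:
        return "low-space-message"  # a low disk space message is reported
    return "ok"


def check_drive(path: str) -> str:
    """Classify the drive that holds the given path."""
    return disk_space_level(shutil.disk_usage(path).free)
```

Calling `check_drive` with the drive that holds the Symphony working folder gives a quick way to see which of the two conditions, if any, applies.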

...

8. Release Summary: 6.1

Summary 6.1
  • Added the “Simple” view in the Search Summary screen which allows users to view the status of Symphony OCR without seeing cabinet paths
  • Updated to new version of Abbyy engine (10.5.0.58)
  • Added support for .msg attachment handling
  • Added Digitally Signed list for documents that have been digitally signed.  To ensure that the digital signature stays valid, Symphony OCR does not process them.
  • Improved modified date handling – if a document’s modified date is more than one day in the future, Symphony OCR will process the document.
  • Improved the License detail screen to display information about the license
  • Added a new processing priority level:  Analysis Only which allows the documents in a particular portion of the repository to be analyzed but not processed (they are left in the processing list)

...

9. Release Summary: 6.0

Symphony OCR 6.0 brings a major change to the workstation user interface.  Here's an overview of the major changes:

  • New document prioritization levels
  • Ability to specify when each profile group should be OCRed
  • Better backlog reporting (explicitly display page usage in past year, recommended license size)
  • Ability to report on backlog progress for each profile group
  • Streamlined installation (Client ID and Partner ID data entry eliminated)
  • Page count will now reset on anniversary date, not January 1
  • User interface overhaul with consistent page layout and links to online knowledge books
  • Consistent heartbeat error/warning reporting


...

© 2022 Trumpet, Inc., All Rights Reserved