
2. Change Log



8.0.5 -

- Bug fix - SharePoint integration was completely broken since Feb 17, 2021 (change on Microsoft's end broke existing integration)

8.0.4 -

- Added explicit check for whether the page is greater than the maximum image size supported by the OCR engine - 32512x32512 - if it is, the document is moved to the Too Big list. This happens regardless of the checkForLargePages setting.
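
The 8.0.4 check can be pictured with a short Python sketch. This is illustrative only and is not Symphony OCR's actual code: the function name and return values are hypothetical, and it assumes a page counts as too large when either side exceeds 32512 pixels.

```python
# Illustrative sketch only; names and return values are hypothetical.
ENGINE_MAX_DIMENSION = 32512  # per-side limit quoted in the 8.0.4 entry


def route_page(width_px: int, height_px: int, check_for_large_pages: bool = False) -> str:
    """Return the (hypothetical) list name a page would be routed to."""
    # Assumption: exceeding the limit on either side is enough to reject the page.
    # This check runs unconditionally, whatever checkForLargePages is set to.
    if width_px > ENGINE_MAX_DIMENSION or height_px > ENGINE_MAX_DIMENSION:
        return "TOO_BIG"
    # The optional large-page check is not modeled here; it would only run
    # when check_for_large_pages is True.
    return "OK"
```

Note that the flag does not influence the engine-limit branch, which mirrors the "regardless of the checkForLargePages setting" wording above.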

8.0.3 -

- Updated SharePoint token so it does not expire

8.0.2 - 

- Large page check is now disabled by default. To enable, set <documentPreProcessor checkForLargePages="true" /> in settings.xml
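
As a rough illustration of how the setting might be read, the Python sketch below parses a minimal settings.xml fragment. The `<settings>` root element and the function name are assumptions made for the example; only the `documentPreProcessor` element and its `checkForLargePages` attribute come from the entry above.

```python
import xml.etree.ElementTree as ET

# Hypothetical, minimal settings.xml. The real file's root element and other
# contents are not described in this change log.
SETTINGS_XML = """
<settings>
  <documentPreProcessor checkForLargePages="true" />
</settings>
"""


def large_page_check_enabled(xml_text: str) -> bool:
    """Return True only when checkForLargePages is explicitly set to "true"."""
    root = ET.fromstring(xml_text.strip())
    element = root.find("documentPreProcessor")
    if element is None:
        return False  # the check is disabled by default as of 8.0.2
    return element.get("checkForLargePages", "false").lower() == "true"
```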

8.0.0 - 

- OCR engine now requires a 64-bit operating system


7.3.2 -

- Bug fix - large pages were causing SOCR to crash with out of memory errors
- Improve detection of pages that are too large to fit in memory



- Bump version


- Updated license agreement to refer to Trumpet, LLC instead of Trumpet, Inc.



- Bug fix - Aspose and Nuance files that were already marked as not-safe continued to be marked as not-safe, even during re-processing


- Bug fix - introduced in 7.2.51 - really big documents were failing with errors about the total content stream being too large. We now limit only the size of a single content stream - not the total across all pages


- Documents in the NotSafe list will be marked for reprocessing when SOCR launches following an install update

7.2.52 - 

- If page content requires more than 1M of RAM to extract, we will mark the page as not needing OCR. This will make the file NOT appear on the corrupted list, and will allow SOCR to process other pages that might need OCR.

7.2.51 - 

- If page content requires more than 1M of RAM to extract, the document is marked as corrupted with a note that reads ""Page " + page + " can not be read into memory - the page is not really corrupted, but cannot be analyzed because of it's size""
- Enhance Aspose.PDF detection so it is not case sensitive

7.2.50 - dev release

- Documents with producer "Aspose.PDF" are no longer marked as 'not safe' (reversal of change made in 7.2.19) - these documents can now be safely processed by SOCR
- Documents with producer "Nuance PDF Creator" are no longer marked as 'not safe' (reversal of change made in 7.2.29) - these documents can now be safely processed by SOCR


- Add Begin Analysis and Analysis complete entries to document history
- Added document history for documents found in INPROCESS during application startup
- Added file size to documents that are in INPROCESS during application startup

7.2.38 - dev release

- Bug fix - inaccurate 'Document is accessible - can't query for reason' error message

7.2.35 -

- For normal installs, a default launch.ini will be installed (if there isn't one already) that sets the max heap space to 1024m

7.2.34 -

- Update learn more hyperlink for NOT_SAFE lists to point at new kbook article

7.2.33 -

- Make Nuance PDF rollback debug screen skip documents that have unknown document sources (previously, the entire operation failed)

7.2.32 -

- Bug fix - Nuance PDF rollback debug screen (added in 7.2.29) could have NullPointerException if the Symphony database has entries added by really old versions of Symphony

7.2.31 -

- Bug fix - Nuance PDF rollback debug screen (added in 7.2.29) could have NullPointerException if the Symphony database has entries added by really old versions of Symphony

7.2.29 -

- Added special handling for Nuance PDF Creator documents (they are moved to Not Processable for the time being while we work through a bug)
- Added Debug screen operation to roll back Nuance PDF Creator documents that Symphony OCRed during affected versions (7.1.0 through 7.2.29)


7.2.28 -

- Some SharePoint search results had 'null' file sizes - this was causing Finder to fail. We now handle these oddities gracefully.

7.2.27 -

- More informative error logging if SharePoint API fails

7.2.24 -

- Bug fix - corner case - if MSG record is deleted from the SOCR database, analysis of attachments for that MSG fails with NullPointerException

7.2.23 -

- Bug fix - We now check if a document was checked out by a user during the OCR process. If so, the OCR results are thrown away and the document will be reprocessed instead of saving a new version.

7.2.22 -

- Bug fix - reanalyzing MSG files was resulting in error java.lang.IllegalStateException: initialFileModifiedTime can only be set once

7.2.21 -

- Added Debug Screen command ("Search for and roll back non-unity CTM problem documents") to identify and roll back documents impacted by the regression that was fixed in 7.2.20

7.2.20 -

- Regression fix - introduced in 7.1.0 - some source PDFs result in OCR invisible text not being placed properly on the page

7.2.19 -

- Special handling for PDFs with producer string containing "Aspose.PDF" - these files can't be safely processed by SOCR yet - files with this producer are placed on the Not Safe list
- Added Not Safe list
- Adjusted the document details screen so it includes the corrupted or not-safe reason (if one is present)
- Added 'Search for and roll back Aspose problem documents' to the debug screen - when pressed, it will iterate through all documents that were modified by SOCR between 7.1.0 and 7.2.19, check their PDF Producer string. Any documents with producer containing "Aspose.PDF" will be marked for roll-back.

7.2.15 -

- Bug fix - if the underlying DMS fails when copying a read-only copy of a document, the Document was being put directly into the Reprocessing list. It now goes into the Inaccessible list.

7.2.14 - dev release

- Bug fix - null pointer exception in exception handler in getDocumentsByStateReverse()

7.2.13 -

- Bug fix (minor) - MSG attachments weren't capturing initial modified time properly - this resulted in a lot of log chatter when the weekly summary routine was running

7.2.12 -

- Bug fix - SharePoint user count was including external users

7.2.11 -

- Bug fix - digitally signed documents were not being OCRed when the settings.xml documentPreProcessor setting allowDigitallySigned was set to true

7.2.10 -

- Regression bug fix (introduced in 7.2.9) - MetaJure feature resulted in Folder feature not working properly. This is now fixed.

7.2.9 -

- MetaJure integration licensing (J feature code) now enables Folder DMS integration


7.2.8 -

- Labeling consistency change - Change instances of 'Invisible Words' to read 'Hidden Words'

7.2.7 -

- Remove extraneous apostrophe from Symphony OCR is already running dialog

7.2.6 -

- Add additional handling for 0x80030109 responses during MSG editing (STG_E_DOCFILECORRUPT - The doc file has been corrupted) - when this happens during OCR, we now mark the document as corrupted instead of sending it to reprocessing

7.2.5 -

- Add handling for 0x80030109 responses during MSG editing (STG_E_DOCFILECORRUPT - The doc file has been corrupted)

7.2.4 -

- Added 'System Memory Information' and 'Disk Information' to log output when application launches

7.2.0 -

- Moved to JRE 11 (and Trumpet private Java runtime, non-Oracle)

7.1.8 -

- Work around for PDFs with very deep xref versions (tpt86114)

7.1.7 -

- Bug fix - SOCR hangs when interacting with UNC based very long filenames (>260 characters)

7.1.5 -

- Added Needs Attention document count to warning message in UI and heartbeats

7.1.3 -

- Workaround for invalid PDFs that don't have bounding boxes defined for all pages (the bounding box defaults to standard portrait letter in these cases). Note that these PDFs are not compliant with the PDF spec (MediaBox is required), so there is no guarantee that the resulting text placement will be correct, but testing on the few problem files we've seen has been successful.
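
The fallback described in 7.1.3 could look roughly like the Python sketch below. It is a simplified illustration, not the product's code: `page` is a hypothetical dict standing in for a parsed PDF page dictionary, and inheritance of MediaBox from parent page-tree nodes is not modeled. Portrait US Letter is 8.5 x 11 inches, which is 612 x 792 points at 72 points per inch.

```python
# Portrait US Letter in PDF user-space units (1/72 inch): 612 x 792 points.
DEFAULT_MEDIA_BOX = (0.0, 0.0, 612.0, 792.0)


def effective_media_box(page: dict) -> tuple:
    """Return the page's MediaBox, or portrait letter when it is missing.

    `page` is a hypothetical stand-in for a parsed PDF page dictionary.
    """
    box = page.get("MediaBox")
    if box is None or len(box) != 4:
        # Non-compliant PDF: fall back to the default instead of failing.
        return DEFAULT_MEDIA_BOX
    return tuple(float(value) for value in box)
```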

7.1.2 - dev release only

- Mark documents with "CCITT codec error" messages as corrupted instead of needs attention

7.0.30 -

- Added corrupt_db.log file (new log4j.properties) that contains only log errors related to corrupted databases (should allow us to narrow down time window of when corruption occurs)


7.0.29 -

- Bug fix - Worldox API re-initialization could cause Symphony to crash under Worldox WDU14 (race condition exacerbated by WDU14 changes)


7.0.28 -

- Bug fix - SharePoint files with single quotes in their names resulted in errors
- Bug fix - timeout/429 errors caused some SharePoint integration calls to fail


7.0.27 -

- Support for firms using proxy servers that use private certificate authorities to proxy HTTPS traffic
- Custom certificate authorities can now be registered in a Java Key Store file stored at /config/cacerts.private - see document 141530 for instructions on obtaining a certificate and loading it into a private trust store

7.0.26 -

- Some corrupted MSG files were being put on the Reprocessing list (error code 0x8004010f now sends the document to Corrupted)

7.0.24 -

- Process Read Only setting in Folder settings wasn't being honored (resulted in 'Access Is Denied' error after OCR completed for documents with read only attribute set)

7.0.22 -

- Added initial support for multiple libraries with OpenText

7.0.19 -

- Bug fix - MSG attachments were being moved to the priority of the parent MSG document record when the attachment was reprocessed (if the MSG was set to Analysis Only priority, but the attachment was set to High, if the attachment was re-analyzed for any reason, the priority of the attachment was switched to Analysis Only)

7.0.17 -

- Workaround for non-compliant PDFs that don't properly specify page size for all pages (MEDIABOX missing)

7.0.16 -

- Enhancement - special analysis handling for pages that have invisible text in the margins (i.e. invisible text stamp)

7.0.15 - 

- Adjust OpenText/eDocs integration to handle documents that are in-use and documents that have been deleted

7.0.14 -

- Adjusting OpenText/eDocs integration to work properly with live site


7.0.13 -

- Bug fix - Worldox integration fails at sites that had UNC paths containing spaces and wdcommon\wdmirror.ini files referencing drive letter based CPs

7.0.12 -

- Bug fix - ND OAuth token wasn't being saved for brand new sites

7.0.11 -

- Changes to NetDocuments OAuth token handling to support upcoming NetDocuments OAuth changes


7.0.10 -

- Bug fix - 'Process Read Only' setting in Folder configuration screen didn't stick

7.0.8 -

- Bug fix - Occasional 'java.lang.ArrayIndexOutOfBoundsException: 100' error when processing under load - could result in 'Database is corrupted' error message in user interface.

7.0.5 -

- Bug fix - uninstaller wasn't being registered for "run as logged in user" installations

6.6.98 -

- Bug fix - Refresh button in NetDocuments settings screen took user to NetDocuments US Vault login screen. The refresh button now just refreshes the cabinet list.
- Added Renew Connection button to NetDocuments settings screen

6.6.97 - dev release

- Bug fix - 'New pages added in past year' on summary screen could show incorrect values for up to 12 hours when documents are re-analyzed
- Bug fix - 'New pages added in past year' label always showed '(this year)' instead of '(pages/year)' when we had a full year's worth of data available


6.6.92 -

- Bug fix - NetDocuments made an API change on 4/20/2018 that caused our "Open" links to not take the user to the document in ND

6.6.90 -

- Bug fix - SharePoint sites that had spaces in their name were resulting in "Illegal character in path" Finder errors

6.6.89 -

- Added ability to reprocess documents in the Processing (TOPROCESS) list (just in case they need to be re-analyzed manually)

6.6.88 -

- Statistics screen now displays estimated time to process backlog based on 4 cores (1.2 seconds/page) if there is no OCR performance data to baseline against (i.e. analysis only licenses). The label on the estimate will show "(assuming 4 CPU cores)" in this case.
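
A minimal Python sketch of the 6.6.88 estimate follows. It assumes the quoted 1.2 seconds/page is the overall throughput of a 4-core machine (the entry does not say whether the figure is per core); the function and parameter names are hypothetical.

```python
from datetime import timedelta

# Fallback rate quoted in the 6.6.88 entry. Assumption: this is the combined
# rate for 4 cores, not a per-core rate.
FALLBACK_SECONDS_PER_PAGE = 1.2


def estimate_backlog(pages: int, measured_seconds_per_page=None):
    """Return (estimated duration, label suffix) for a backlog of pages."""
    if measured_seconds_per_page is not None:
        # Real OCR performance data is available, so no caveat is needed.
        return timedelta(seconds=pages * measured_seconds_per_page), ""
    # No baseline (for example, analysis-only licenses): use the fallback rate.
    return timedelta(seconds=pages * FALLBACK_SECONDS_PER_PAGE), "(assuming 4 CPU cores)"
```

Under that assumption, a backlog of 3000 pages with no performance data is estimated at one hour.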

6.6.87 -

- Bug fix - Fixed OutOfMemoryError under high analysis or OCR load

6.6.85 -

- Bug fix (kinda) - Legacy files in Worldox with ~ at the beginning were appearing in the corrupted list.  Worldox indexed searches were sometimes returning files with ~ at the beginning (these are generally temp files that shouldn't have been part of the indexes and certainly shouldn't be OCRed)

6.6.84 -

- Added accessibility message for Worldox integration informing the user that WD versions prior to 20180412 do not support paths longer than 255 characters
- Added accessibility message for Worldox integration for sites with WD versions later than 20180412 that WD does not support paths longer than 380 characters

6.6.82 -

- Added conditional logic to Worldox integration to allow spaces after filename and before the file extension (WD code running after 20170601 allows spaces)

6.6.81 -

- Improved detection of pages that should be ocr'ed even though they have excessive text in margins
- Improved detection of pages that should not be ocr'ed even if they have full image on the page and rendered text beneath the image
- Added additional columns to Document Details screen, page analysis results


- Bug fix - setting for limiting maximum number of cores used during OCR was not being honored
- Changed Processor configuration UI for clarity around maximum allowed parallel processing settings

6.6.73 -

- Bug fix - SOCR doesn't shut down at NetDocuments sites that were actively searching for documents

6.6.71 -

- Bug fix - NetDocuments configuration screen not showing error/warning details for Analyzer-only licenses
- Bug fix - NetDocuments integration was having preserveModifiedInfo set to false after setup with Analyzer-only licenses

6.6.69 -

- Added separate NetDocuments connection buttons for US, EU and AU vaults


- Bug fix - Welcome Wizard didn't work properly if there were features in the license that required DMS configuration to work properly (# of ND or SP users, for example)
- Removed the 'Manage' button from Issues list in welcome wizard
- Improved error message for invalid tenant URLs

6.6.64 -

- Bug fix - blank pages without any content stream were marked as corrupted instead of blank

6.6.59 -

- Bug fix - the user interface prevented setting the SharePoint legacy search frequency to 0.  This is now allowed.

6.6.58 -

- If SharePoint legacy search frequency is set to 0, the legacy search will be skipped

6.6.57 -

- Bug fix - Fixed a bug with Rollback so it would work with the first button click (from the document detail page)
- Added the bulk operation "Rollback" to the Processed documents page. When clicked, all documents in the current search will be rolled back to their non-OCRed version
- Added the bulk operation "Reanalyze" to the Reprocessing documents page. When clicked, all documents in the current search will be moved to the Analyzing bucket

6.6.55 -

- Reworked the processing metrics on the Analyzer, Processor and Summary pages to show more useful data in a more user friendly fashion
- Removed all support for Bonus Page tracking

6.6.54 -

- Moved to jWDAPI 1.0.22 to have the WorldoxSession fast fail if the user is invalid
- Backed out the previous Worldox invalid user fast fail code

6.6.53 -

- Fixed bug where wrong page tracker was being used by the Analyzer

6.6.52 -

- Modified processor core algorithm to not consider physical cores on the machine. Solely determined by the license now

6.6.51 -

- Fixed bug in the ProcessorManager config that was causing maxThreads to be persisted and thus override calculated maxThreads
- Updated the version check url to be https to work with the new version update https redirection

6.6.50 -

- Tracked WorldoxConnection failure due to invalid user, and quick fail on repeated calls to the connection until a valid user is provided.

6.6.48 -

- Moved to jlicensing 1.0.10 to support new license expiration warning logic

6.6.46 -

- Created a ProcessorManager that will now create Processor tasks that do the work. These tasks will be added to
    an executor service so we can run multiples in parallel. A ProcessorManager will be created for each document
    processing type (analysis, ocr, rollback)
- Split the processor mgmt (stop, start) config into a new section, and provided migration for it
- Renamed the processors to AnalyzerProcessor, OCRProcessor and RollbackProcessor. Supporting classes followed suit
- Added a WorkingFolderProvider to track and manage working folders for the managers
- Created ProcessorThreadPoolExecutor for use by the ProcessorManager. It has the ability to block task addition until a thread
    is available
- Refactored the OCRProvider to remove the generic parts, since we only support a single OCR engine
- Refactored the page count handling
- Implemented dripMode in the ProcessorManager
- Deleted OldPageCountFeature
- Made processor factory config classes immutable
- Added more support for Processor status
- Added a maxDripsBeforeHalting setting to ProcessorManager, to allow dripMode to halt after X docs are processed, rather than just 1
- Fixed bug in the Analyzer config screen that wasn't persisting the isAllowMsgAttachments setting
- Added custom message support to the ErrorTracker so we could get better error messages in the UI
- Updated Processor and Analyzer web pages to show a list of documents being processed
- Updated statistic verbiage per changes
- Modified page statistics to divide results by the number of running threads
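
The 6.6.46 design (a manager feeding tasks into a pool that blocks submission until a thread is free, plus drip mode with a document limit) can be sketched in Python as below. SOCR itself is a Java application, so this is only an analogy: the class and function names here are hypothetical, and the real ProcessorThreadPoolExecutor may behave differently.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class BlockingExecutor:
    """Thread pool whose submit() blocks until a worker slot is free."""

    def __init__(self, max_threads: int):
        self._pool = ThreadPoolExecutor(max_workers=max_threads)
        self._slots = threading.Semaphore(max_threads)

    def submit(self, fn, *args):
        self._slots.acquire()  # blocks the caller while all workers are busy
        try:
            future = self._pool.submit(fn, *args)
        except BaseException:
            self._slots.release()
            raise
        # Free the slot when the task finishes, whether it succeeded or failed.
        future.add_done_callback(lambda _future: self._slots.release())
        return future

    def shutdown(self):
        self._pool.shutdown(wait=True)


def run_manager(documents, process, max_threads=2,
                drip_mode=False, max_drips_before_halting=1):
    """Process documents in parallel; in drip mode, halt after N documents."""
    executor = BlockingExecutor(max_threads)
    futures = []
    for count, document in enumerate(documents, start=1):
        futures.append(executor.submit(process, document))
        if drip_mode and count >= max_drips_before_halting:
            break  # halt: remaining documents are left unprocessed
    executor.shutdown()
    return [future.result() for future in futures]
```

Blocking in submit() keeps the manager from queuing more work than the pool can run, so a halt in drip mode never leaves a long tail of already-queued tasks.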

6.6.45e -

- Bug fix - SOCR was checking to make sure 8.3 filename information was available for all versions of Worldox integration.  We now only check if the version is prior to the WDU10 release (which fixed 8.3 related issues in WDAPI)

6.6.45d -

- Modified calls to NDAPI for creating new versions to ensure we don't modify lastModified info

6.6.45c -

- Performance improvement when retrieving number of active users in NetDocuments integration

6.6.45b -

- Add support for unlimited user count licensing for NetDocuments integration

6.6.44 -

- Bug fix - some malformed PDFs (huge page catalogs, large number of pages) could result in out of memory exceptions


6.6.24 -

- Updated to NDAPI 0.0.53 (upgraded document getSize() methods)
- Modified NetDocuments processing to use the new NDAPI getSizeBytes() method for consistent file size checking
- Moved configuration loading of the preprocessor and processor to the end of the config load, to prevent feature initialization issues


- Made SystemStatusProvider get the NetDocs status from the ND source, not connection manager
- Modified NetdocsDocumentSource to store an ignore warnings flag for the new warning
- Modified NetdocsDocumentSource to be Status aware and provide the overall status for NetDocuments
- Modified the Netdocs web page handler to use the new NetdocsDocumentSource getStatus() call rather than the ND conn manager call
- Fixed a bug in NetDocuments processing that was using different file sizes during file searching, resulting in unnecessary downloading of files for reprocessing

6.6.19 -

- Regression bug - Log output wasn't being written to the maestro.log or error.log files.  Introduced in 6.6.12

6.6.18 -

- Added belts and suspenders to OpenText implementation
- Added belts and suspenders to LSSe64 database methods
- Added dripMode to Processor, allowing for the processor to be stopped after a single document is processed
- When OpenText feature is enabled, set the Processor and PreProcessor to not autostart, and to be in dripMode
- Modified OpenText search for files SQL to use enhanced SQL and never return docs that don't have allowed extensions
- Made the OpenText modified flag value customizable, to allow for easier beta testing
- Prevented actual file changes to OpenText files, until beta testing shows us the correct way to make changes
- Modified OpenText lastModified time for files to use the database value rather than the file value

6.6.16 -

- Regression Bug fix - introduced in 6.6.10 - installer creating 'work' folder in the installer exe folder

6.6.13 -

- Fixed null pointer exception in PDFFileAnalyzer
- Fixed null pointer exception in PurgeUnavailableTask

6.6.12 -

- Updated to NetDocumentsAPI v0.0.51 to support document content change without changing modification info for file extension changed (tiff->pdf)

6.6.11 -

- Updated to NetDocumentsAPI v0.0.50 to support document content change without changing modification info

6.6.7 -

- Added a description to the SymphonyOCR Windows service

6.6.5 -

- White labeling is currently only available for SOCRCLOUD licenses. If enabled, the "D" Worldox feature must be disabled to prevent conflicts
- White labeling reports the number of used Worldox seats in the heartbeat, but the license does not rely on seats for validation
- Added a WhiteLabelingFeature, which extends WorldoxFeature, ignores seat checking for validation, but checks the user domain for a valid cloud domain.
- Added a WHITELABEL user count strategy to WorldoxFeature

6.6.3 - DevRelease

- The license code for OpenText is "O", and licensing is based on active people in the system (seats)


6.6.2 -

- Bug fix - some encrypted PDF files weren't being flagged as encrypted during analysis (they were failing during OCR and landing in the Needs Attention list)

6.5.71 -

- Better handling for quasi-invalid PDF files (PDFs that have null AcroForms objects now are handled cleanly) - ClassCastException after processing


- Added a new global "alwaysAnalyzeAndProcessNoImageNoTextPages" setting to the config file, under a new "heuristicComputerProvider" setting.
- The new setting will default to false (existing behavior) but can be manually set in the config file.
- The new setting will provide a way to allow no image no text pages to be forcibly processed across the board, instead of manually per document.



- No changes

6.5.68 -

- Fixed UI bugs - Folder configuration 'Add' button wasn't rendering properly.  Some buttons didn't display 'Hand' cursor to indicate they are clickable.

6.5.66 -

- Modified the buttons on the NetDocuments approval page to use the same styles as the other pages

6.5.65 -

- Removed the reprocessing of TOO_OLD docs on startup

6.5.64 -

- Modified Analyzer installer to elevate level to admin
- Created a new "createInstaller.cmd" script that creates the Analyzer installer and digitally signs it
- "createInstaller.cmd" relies on two new settings in deployment.properties
    deployment.resources.signature.location=[signature location]
    deployment.resources.signature.secret=[signature password]

6.5.63 -

- Modified the NetdocsFinder to set documents in the Too Old folder to REPROCESS, when the Process legacy documents setting
    is enabled.
- Added TOO_OLD as a new value in DocumentAccessibility enum
- Reprocess TOO_OLD docs on startup

6.5.62 -

- Added styling for disabled buttons
- Fixed issues with paging buttons on Document List page

6.5.61 -

- Removed legacy and deprecated css links, which were overriding our settings
- Added new reset.css to undo button settings set by the browser
- Modified the button style to force a hand pointer when the mouse is over the button
- Removed [] from the NetDocs Log In button
- Returned the Learn More links on the Document List pages to be hyperlinks rather than buttons

6.5.60 -

- Fixed more buttons that missed the style upgrade.
- Modified button styles again to make shadows less invasive

6.5.59 -

- Modified style for Scheduler config Delete buttons to remove the border


6.5.56 -

- Modified the Analyzer installer script to create a file with "c" version, and to point to the SOCR Pre-release installer

6.5.54 -

- Added ability to override the host name used when displaying the user interface.  Settings.xml, webServerConfiguration, hostname (this is not added to the config file by default - you'll have to set it explicitly)

6.5.53 -

- The email of the logged in NetDocuments user now appears after the login name on the NetDocuments configuration screen

6.5.51 -

- Modified Analyzer configuration page to allow for the setting of MSG (email) attachment processing. This setting will sync with the
    same setting on the Processor configuration page, so that both are either on or off.

6.5.50 - dev release

- Bug fix - Symphony OCR hangs when processing some large, complex PDF files.  java.lang.OutOfMemoryError: Java heap space message appears in logs

6.5.49 - dev release

- Added the Windows user name and fully qualified hostname to heartbeats

6.5.48 - dev release

- Fixed potential issue with supporting MacRoman character sets (used by PDFs generated on Mac computers) under newer versions of Java


- Updated verbiage of SharePoint configuration page, and added ability to set frequency for a legacy file finder


- Added SharePoint URL to the SharePoint configuration screen


- Updated SOCR to work with NetDocumentsAPI v0.0.41
- An audit entry will be added to NetDocuments documents when OCR is completed, rolled back, etc.
- The audit entry for Worldox documents when OCR is completed was modified to be more descriptive. Audit entries are
    also added when documents are rolled back, etc.

6.5.36 -

- Bug fix - [View Timeline] links in Simple View allowed users access to the non-simple UI

6.5.33 -

- Regression bug fix - introduced in 6.5.31 - Installation on machines without Java 8 resulted in 'UnsupportedClassVersionError' popup dialog on launch

6.5.32 -

- Bug fix - OCR and Analyzer working directories had garbage files left behind if SOCR was shut down in the middle of OCR or analysis
- Adjusted default settings for determining maximum page size that will be processed.  This is now specified in total pixels (instead of maxHeightPixels and maxWidthPixels).  The new setting is maxPixels, the default is 36000000, which is a little larger than a C sized sheet of paper.  For backwards compatibility, if the existing configuration has maxHeightPixels and maxWidthPixels set to 10000 or 12000, the default is used, otherwise maxPixels is set to maxHeightPixels x maxWidthPixels
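
The backwards-compatibility rule for maxPixels in 6.5.32 can be expressed as a small Python sketch. The function name is hypothetical, and the sketch assumes the default applies only when both legacy values are 10000 or 12000 (the entry's wording leaves room for other readings).

```python
DEFAULT_MAX_PIXELS = 36_000_000  # default quoted in the 6.5.32 entry
LEGACY_DEFAULTS = {10000, 12000}  # legacy values that map onto the new default


def migrate_max_pixels(max_height_pixels=None, max_width_pixels=None) -> int:
    """Derive maxPixels from the legacy maxHeightPixels/maxWidthPixels pair."""
    if max_height_pixels is None or max_width_pixels is None:
        return DEFAULT_MAX_PIXELS  # nothing to migrate
    # Assumption: both legacy values must be 10000 or 12000 to get the default.
    if max_height_pixels in LEGACY_DEFAULTS and max_width_pixels in LEGACY_DEFAULTS:
        return DEFAULT_MAX_PIXELS
    # A customized legacy configuration keeps its effective pixel budget.
    return max_height_pixels * max_width_pixels
```

For example, a legacy configuration of 8000 x 5000 migrates to maxPixels of 40000000, while 10000 x 12000 migrates to the default.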



- Regression bug fix, introduced in 6.5.22 - OCR engine fails to run on Windows XP workstations - error message refers to ADVAPI32.dll procedure entry point RegSetKeyValueA
* Move to SymphonyOCRProcess.exe


- Bug fix - out of memory errors when analyzing PDFs that contain an excessive number of embedded fonts
* Move to 5.5.11-SNAPSHOT.jar (disables font caching in PdfContentStreamProcessor if the file has more than 10 fonts)

6.5.25 -

- Bug fix - ShareFile integration was only searching 'Shared Folders' (not 'My Files & Folders' or 'Favorite Folders')


- Added support for MICR processing (magnetic ink characters typically found on bank checks).  When enabled, only a single processor core will be used for OCR, so this should only be enabled for sites that truly need MICR processing.

- If MICR processing is enabled, that will be indicated in the Advanced Settings section of the Processor Config screen

6.5.21 -

- Improved task status message when manually initiating Compact Database command in Debug screen (it used to say 'maintenance completed' until the maintenance got underway)


6.5.20 -

- Refinement to 6.5.19 bug fix - if a mutation failed during processing, the document was being put on Needs Attention.  It now gets put onto Reprocess.

6.5.19 -

- Bug fix - If processing was actively working on a document while nightly database maintenance happened, database corruption (or forced shutdown of SOCR) would occur

6.5.17 -

- Bug fix - files marked as read-only weren't being processed, even though 'Allow processing of read-only files' was enabled

6.5.16 -

- When rolling back, we now reset the previous analysis results
- Additional handling for invalid PDFs that have page rotations other than 0, 90, 180 and 270.  SOCR now handles these pages just like Acrobat (which means that it ignores the angle specification entirely)
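
Read literally, the rotation handling in 6.5.16 amounts to the Python sketch below. The function name is hypothetical, and treating an ignored angle as 0 degrees is an assumption about what "ignores the angle specification" means in practice.

```python
VALID_ROTATIONS = {0, 90, 180, 270}  # the only angles the entry treats as valid


def effective_rotation(rotate_value: int) -> int:
    """Return the rotation to apply; any other angle is ignored (treated as 0)."""
    return rotate_value if rotate_value in VALID_ROTATIONS else 0
```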


6.5.13 -

- No change

6.5.12 -

- If SOCR encounters an out of memory error, it now kills the application (want to prevent database corruption in the event of a memory problem)

6.5.11 -

- Corrupted file detection enhancement - some rare corrupted PDFs caused documents to appear on the Needs Attention list instead of Corrupted
- Bug fix - post-OCR PDF failures could cause the working directory to become locked - after that, all future processing would wind up in the Reprocessing list

6.5.10 -

- Adjustment to NDApi calls to specify NDVaultLocation
- Bug fix 'Engine not initialized — Cannot run program' and 'CreateProcess error=19' and '%1 is not a valid Win32 application' errors on Windows XP machines

6.5.9 -

- Folder document source now adjusts the modified time of the file by 1 minute (this is to make sure that indexing services will see that the file has changed)

6.5.8 -

- Regression bug fix - 'null' error when processing partial documents (only some pages in the PDF needed to be OCRed)

6.5.7 -

- Adjusted Processor error handling behavior - we now pause processing (or analysis) if we encounter more than 10 errors in a 15 minute window (prior behavior was 5 errors in a 60 minute window)
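
One way to picture the 6.5.7 threshold is a sliding-window counter, sketched in Python below. The class name is hypothetical and the sliding-window approach is an assumption; the entry only states the limits (more than 10 errors in 15 minutes, previously 5 in 60).

```python
from collections import deque


class ErrorWindow:
    """Tracks recent errors and reports when processing should pause."""

    def __init__(self, max_errors: int = 10, window_seconds: int = 15 * 60):
        self.max_errors = max_errors
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def record_error(self, now: float) -> bool:
        """Record an error at time `now` (in seconds); return True to pause."""
        self._timestamps.append(now)
        # Drop errors that have aged out of the window.
        while self._timestamps and now - self._timestamps[0] > self.window_seconds:
            self._timestamps.popleft()
        # Pause only when the count exceeds the limit ("more than 10").
        return len(self._timestamps) > self.max_errors
```

The prior behavior would correspond to `ErrorWindow(max_errors=5, window_seconds=60 * 60)` under the same assumptions.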

6.5.6 -

- Regression bug fix - "Execution of parallel task failed: Not enough memory! Failed - 0x80004005" error when processing documents that have low image quality.

6.5.5 -

- Bug fix - If Processor was restarted at exactly the wrong time, Processor would display error message "Unable to add pages to page tracker - null. Restart Symphony to clear this error."

6.5.4 -

- Bug fix - PDFs that were missing MediaBox field on a page definition resulted in the file being marked as corrupted.  Technically, these are not valid PDFs, but we can still analyze them by assuming a default page size of 8.5x11"

6.5.3 -

- Bug fix - foreign characters didn't display properly in the SOCR web interface
- Bug fix - some files with foreign characters in filename would fail to OCR and appear in the Needs Attention list with error messages about the file not existing (and the file name shows many question mark symbols)
- Bug fix - files that a user had opened in Folder and Worldox document sources were placed on Inaccessible list, and didn't reprocess until the next day.  Now these files will be placed on the Reprocessing list, and will be processed when they are found again (assuming the user has closed the document by then)

6.5.2 -

- Improved error message if Outlook is unavailable to indicate that 32 bit Outlook is required
- Added note about 32 bit Outlook to the Processor config screen

6.5.1 -

- No change

6.4.125 -

- Added support for non-western characters in OCR results


6.4.124 -

- Bug fix - Some documents wind up in the Reprocessing list repeatedly with "Unable to read for analysis - null" in the history. Null Pointer Exception when scanning some PDFs for digital signatures.

6.4.123 -

- Bug fix - MSG attachments for MSG files in deep folder paths resulted in the document being continuously placed on the Reprocess list


- Re-enable flush of content pages every 100 pages (this was originally enabled in 6.4.23, but inadvertently disabled in 6.4.30)

6.4.121 -

- Certain "page too big" errors were resulting in document being placed in No Image/No Text list instead of the Too Big list

6.4.115 -

- Added support for additional languages (language dictionary files are not being deployed yet, so that needs to be done before we really use this - currently only English, Spanish, Brazil dictionaries are part of the Engine installation - more can be added as needed)
- Specifying languages is currently done in settings.xml in the <ocrHandlerProvider languages=""> element. Values should be comma separated, with no spaces. Default is "English".
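
The expected format of the languages attribute can be illustrated with a short Python sketch. The function name is hypothetical, and treating an empty or missing value as the "English" default is an assumption based on the entry above.

```python
DEFAULT_LANGUAGES = ["English"]  # default stated in the 6.4.115 entry


def parse_languages(attribute_value):
    """Split a languages attribute such as "English,Spanish" into a list."""
    if not attribute_value:
        # Assumption: an empty or missing attribute falls back to the default.
        return list(DEFAULT_LANGUAGES)
    # Values are comma separated with no spaces, so a plain split is enough.
    return attribute_value.split(",")
```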

6.4.110 -

- Add support for custom NetDocuments OAuth connection parameters (this will allow the firm to request that NetDocuments preserve modified by and modified on values during OCR)

6.4.104 -

- Bug fix - ND documents that were on legal hold, archived, signed or approved were repeatedly processed. They will now be placed in the appropriate Not Processed queue.

- Bug fix - detection of non-8.3 paths in Worldox document repositories wasn't working properly on certain NTFS volumes


- Improve error handling and reporting in PracticeMaster integration

6.4.100 - 

- Bug fix - Memory leak in NetDocuments integration (eventually resulting in OutOfMemoryException errors and a heap dump in the logs folder)
- This bug would have also impacted ShareFile integration, so that has been fixed as well


6.4.99 -

- Added ability to process digitally signed documents (note that this *will* invalidate the digital signature) - this is enabled by adding allowDigitallySigned="true" to the documentPreProcessor element in settings.xml

6.4.97 -

- If manual license update check fails, we now display the error message (before, it was showing an exception trace)

6.4.96 -

- Bug fix - grace period wasn't working properly
- Added new scheduler entry for auto-checking for license updates from Trumpet license server. These updates are auto-scheduled for a random time on a weekday. The update check won't actually happen unless at least 3 days have passed since the last update check, or if the license status is in a warning or error state (in which case it'll check once per day at the scheduled time)
- Bug fix - ND and SF finders weren't starting up on brand new sites (had to stop and restart SOCR after entering the license number)
- Updated help hyperlinks for Not Processed lists
- Bug fix - heartbeats were including pages left, even for unlimited page licenses
- Notification configuration now says "When there are warnings or errors" instead of "When there are warnings"
- Added Check for Updated License button on Licensing page
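
The rule for when the scheduled license check actually contacts the server can be sketched in Python as follows. This is only an illustration of the rule as described in the scheduler entry above: the function name, the status strings and the treatment of a site that has never checked before are assumptions, and the function is assumed to be called once per day at the scheduled time.

```python
from datetime import datetime, timedelta

MIN_INTERVAL = timedelta(days=3)  # "at least 3 days" from the note above


def should_check_license(now, last_check, status):
    """Decide whether the scheduled run should contact the license server.

    `status` is a hypothetical string: "ok", "warning" or "error".
    """
    if last_check is None:
        return True  # assumption: never checked before, so check now
    if status in ("warning", "error"):
        return True  # checked on every scheduled run, i.e. once per day
    return now - last_check >= MIN_INTERVAL


now = datetime(2016, 3, 10, 2, 0)
print(should_check_license(now, now - timedelta(days=2), "ok"))
```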


6.4.94 -

- Bug fix - Analyzer and Processor failed to start - error message about 'The application has failed to start because its side-by-side configuration is incorrect'

6.4.92 -

- Better handling if the processor or preprocessor throws an uncaught error - we now shut the processor down and report an error

6.4.91 -

- Bug fix - if backupFileRoot was pointing at an old SOCR installation directory (e.g. SOCR folders in C:\Program Files\ instead of C:\Program Files (x86)), backups and processing would fail. We now detect this situation and change the setting to the relative .\work\backupfiles value.

6.4.90 - dev release

- Adding debug lines to troubleshoot failed backup copy

6.4.89 -

- Bug fix - NetDocuments support for EU data centers wasn't allowing login to the EU site
- Added support for AU data center (still have to manually adjust in the settings.xml file)

6.4.88 -

- Bug fix - NetDocuments support for EU data centers wasn't directing to EU login page

6.4.87 -

- Database compaction algorithm will now purge any document records that were damaged by small database corruptions

6.4.86 -

- Bug fix - on machines running on unreliable networks, network hiccups in the middle of analysis could cause SOCR to crash (error message EXCEPTION_IN_PAGE_ERROR (0xc0000006))
- Bug fix - backup results were being written to the <install directory>\work\backupfiles folder instead of the app working folder override
- Made GeneratePerformanceSummary task do nothing (it's really not needed anymore), removed GENERATE_PERFORMANCESUMMARY_TOPROCESS and GENERATE_PERFORMANCESUMMARY_PROCESSED from any existing scheduler configuration
- Change default schedule time for "Re-analyze Re-Process lists" to be 11:30pm every day, instead of 12:00 every day
- Change default schedule time for "Purge backups" task to be 4:00am every day, instead of 12:00 every day

6.4.85 -

- Added Processing Time (ms) to CSV export of document list

6.4.84 -

- Bug fix (Case 35047) - When starting or restarting SOCR with LSSe64, the LSSe64 finder would often fail to start. The LSSe64 finder now starts reliably.
- Bug fix (Case 35024) - SOCR was processing documents with extensions that weren't PDF, TIF or MSG and adding these to the Corrupted list. Such documents are now ignored.

6.4.83 -

- Bug fix - PDFs with a small number of pages but really big image content (e.g. high-resolution photographs embedded in the PDF) could cause OCR to fail with an out of memory error
- Bug fix - When entering text into the LSSe64 DB password field, it was not masked with ****. This is fixed by using an HTML "password" input type rather than "text".
- Bug fix - SOCR for LSSe64 was attempting to process non-image documents (e.g. .DOCX files) and this led to them appearing in the corrupted documents list. SOCR for LSSe64 has been modified to now only process documents with a PDF, TIFF or MSG extension.

6.4.82 -

- Adding support for NetDocuments EU data center (just added to settings.xml at this point - not in UI yet)

6.4.81 -

- Added ability to filter found documents by explicit dates (this is done by editing the settings.xml file, finderHandler section, cutoffTimeHigh and cutoffTimeLow values)

6.4.80 -

- We now send heartbeat when the user changes their license

6.4.79 -

- Analyzer now ignores stroke and fill color operators in PDF (rg and RG) (we were seeing some PDF files where these operators weren't being properly used, but that failure isn't going to prevent us from determining whether the file is safe to process, so we'll just ignore those types of errors)

6.4.78 -

- Document processor is now stopped and started during backup purges

6.4.77 -

- Bug fix - the doc source type for Practice Master was wrong because it inherited the folder finder task's characteristics. Added a simple override to address this.
- Bug fix - the check for whether a filename falls within a folder's pathname was incorrect; the logic needed to be reversed.

6.4.76 -

- Better error handling for NetDocuments API timeouts

6.4.75 -

- Better logging on 0xfffffffd msgedit errors

6.4.74 -

- Replaced 'Advanced' side menu with 'Debug' link in upper right corner of Welcome screen only
- Adjust build script so it reads from ${user.home}/.m2/deployment.properties instead of properties being defined explicitly in settings.xml

6.4.73 -

- Better error message for Outlook installation issues (msgedit error 0xfffffffd)

6.4.72 -

- Cosmetic fixes throughout (PracticeMaster instead of Practice Master, Advanced heading in PM config screen appeared twice)

6.4.71 -

- Added whether we are running as a service or not ('Service: true') to heartbeat status

6.4.70 -

- Refactored Practice Master code to reduce complexity.
- Improved validation and error reporting when user makes erroneous configuration changes to Practice Master.
- Simplified the generation and processing of the HTML delivered to user's browser for Practice Master.

6.4.69 -

- Improved error handling if a database commit fails (database is marked as corrupted and is stopped so further damage can't occur)

6.4.68 -

- Added Advanced Processor configuration setting to control the maximum number of cores that will be used during OCR. If left empty, we will use all available cores (up to 4). Right now, the setting must be empty or a number between 1 and 4.
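
A minimal sketch of how such a setting could be interpreted, assuming the raw value arrives as text from the configuration screen. The resolve_core_count name and the way the available core count is obtained are illustrative assumptions; only the "empty means all available cores, up to 4" and "1 to 4" rules come from the note above.

```python
import os

MAX_CORES = 4  # upper bound stated in the note above


def resolve_core_count(setting, available=None):
    """Return how many cores to use for a raw text `setting`."""
    if available is None:
        available = os.cpu_count() or 1
    text = (setting or "").strip()
    if not text:
        # Empty setting: use all available cores, up to 4.
        return min(available, MAX_CORES)
    value = int(text)  # raises ValueError for non-numeric input
    if not 1 <= value <= MAX_CORES:
        raise ValueError("setting must be empty or a number between 1 and 4")
    return value


print(resolve_core_count("", available=8))
```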

6.4.67 -

- Switch resource server to http://resources.trumpetinc.com

6.4.66 -

- Bug fix - OutOfMemory errors when processing really large NetDocuments documents
- Move to NetDocsAPI-0.0.14.jar

6.4.65 -

- Switch heartbeat server to http://heartbeat.trumpetinc.com/heartbeat/sendheartbeat.jsp

6.4.64 -

- Added Worldox validation message for version WDAPI.20150624.1852 indicating that version of WD has a bug when processing legacy documents

6.4.63 -

- Improvement to orientation correction code to avoid analyzer lockups when hitting huge pages

6.4.62 -

- Bug - divide by zero exception in Processor configuration screen when all pages of a trial license are consumed

6.4.61 -

- Moved to sswebservices-0.0.3.jar (logging enhancement)

6.4.60 -

- Bug fix - deleting a document record from the Document Detail view resulted in ClassCastException instead of returning the user to the Welcome screen

6.4.59 -

- Improvements to Active User count strategy (we will now fail if the WD license isn't valid or if the version of WD doesn't support active user count determination)


6.4.58 -

- Bug fix: Some sites running Server 2008 R2 in rare configurations (SMB1 with loopback mapped drives) would kernel fault with a Blue Screen of Death (operating system bug triggered by behavior in one of SOCR's analysis modules). This is now fixed.

6.4.49 -

- Added specific error message if a Worldox document couldn't be processed because of missing 8.3 filename information

6.4.48 -

- Added a message to the end of the maintenance screen indicating that maintenance is complete

6.4.44 -

- New feature: Users can now roll back to the un-OCRed version of the document as long as the document hasn't been modified since it was OCRed (this is available even if the short term retained versions have been purged)

- Enhancement when processing huge files (thousands of pages) - disk space usage is now limited to around 12 GB during processing (prior, it could expand indefinitely - approximately 12MB per page)

6.4.43 -

- Bumped the maxWidthPixels and maxHeightPixels values from 10,000 to 12,000 - there were a lot of engineering drawings that were just barely above the 10K limit (i.e. 10804x7212)

6.4.42 -

- Minor bug fix - LSSe64 integration was trying to connect to the SQL database, even if the license wasn't activated for LSSe64 (move to lazy loading of connection)

6.4.41 -

- Changed logging when profile group isn't available via WDAPI to be debug logging (logs were getting flooded when a PG was removed from processing)

6.4.40 -

- Added debug lines troubleshooting FOLDER document source issue
- REGRESSION - Bug fix - all documents at Folder Tree sites were being sent to Unavailable list.

6.4.38 -

- Bug fix - Null Pointer Exception in some corner cases when saving changes to NetDocuments integration configuration

6.4.37 -
- Added support for Practice Master DMS which uses a simple file system folder based document storage strategy.
- Added support for validating SOCR max licensed users against Practice Master's own configuration file's max licensed users.
- Small number of internal code refactorings with no functional visibility.

6.4.36 -

- Bug fix - clicking on Open links for Worldox documents resulted in an empty tab opening in the web browser

6.4.35 -

- Ensure we validate the SOCR licensed user count against the LSSE64 licensed user count.

6.4.34 -

- Fix bug in which null was returned for a document's priority level.

6.4.33 -

- Increase lengths of input fields for DB credentials and restart the LSSe64 finder when we change credentials.

6.4.32 -

- Disabled one test and adjusted the logic of another test in a suspect area, added native paths, removed an old import and bumped the overall version number.

6.4.31 -

- Added support for flagging certain versions of Worldox as having problems (see WorldoxConnection#getStatusResult() )

6.4.30 -

- Changed installer so Run as a Service message indicates that it won't work with Worldox sites

6.4.29 -

- NetDocuments compatibility fix - sites that didn't have legacy processing enabled weren't finding documents to process since the most recent ND update

6.4.28 -

- Bug fix - introduced in 6.4.11 - Files with mismatched extensions (e.g. a PDF with a TIF extension) wound up in an infinite 'analyzing' loop
- add ability to specify location of dev mode abbyy home
- adjusted how development mode path determination is made (com.trumpetinc.development.abbyybinfolder system property)

6.4.26 -

- Bug fix - the backupFileRoot was being stored as an absolute path instead of a relative path. End result is that copying configuration from a 32 bit machine to a 64 bit machine resulted in the default backup location incorrectly pointing at C:\Program Files\ instead of C:\Program Files (x86)\

6.4.25 -

- Added a setting to settings.xml to set a maximum page count limit on what documents will be processed. Documents with more than the limit of pages will be put onto the Too Big list. Setting is not configurable through the UI - you must edit settings.xml directly and add the following to the existing <documentProcessor ..... /> element: maximumPageCount="300" (or whatever page count limit you wish to set)
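
To make the behavior concrete, here is a small Python sketch that reads maximumPageCount from a documentProcessor element and decides whether a document belongs on the Too Big list. The helper names are hypothetical and the sample element omits whatever other attributes the real element carries; treating a missing attribute as "no limit" is also an assumption.

```python
import xml.etree.ElementTree as ET

# Fragment based on the note above; other attributes of the real element are omitted.
SAMPLE = '<documentProcessor maximumPageCount="300" />'


def max_page_count(xml_text):
    """Return the configured page limit, or None when the attribute is absent."""
    value = ET.fromstring(xml_text).get("maximumPageCount")
    return int(value) if value else None


def is_too_big(page_count, limit):
    """Documents with more pages than the limit go to the Too Big list."""
    return limit is not None and page_count > limit


limit = max_page_count(SAMPLE)
print(is_too_big(301, limit), is_too_big(300, limit))
```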

6.4.24 -

- Bug fix - if MSG handling was enabled on a machine without Outlook installed, it was not possible to disable MSG processing (though it looked like it was disabled in the Processor config screen). End result was a permanent warning message about the MSG sub-system not working

6.4.23 -

- Accuracy improvements in OCR engine
- Bug fix - out of memory errors when processing really big PDF files (thousands of pages)

6.4.22 -

- Improve OCR accuracy in some corner case scenarios

6.4.21 -

- Increased accuracy of OCR engine by making it slightly slower

6.4.20 -

- Added better error message if NetDocuments login doesn't have permissions to query user count
- Add note to NetDocuments Login button indicating that the user must be a NetDocuments Admin

6.4.19 -

- Added caching to progress graph display - this writes the graph data to disk every 5 minutes, and reads that data on launch. This should allow us to display the graph right away, without there being a refresh period (which could be quite long at sites with lots of documents)

6.4.17 - Dev release

- Experimental functionality for emitting OCRed PDF page content incrementally - should fix problems with running out of memory when processing really big files
- Right now, this flushes every single page - before moving to pre-release we probably should change that so it flushes every 50 or 100 pages

6.4.14 -

- Bug fix - issues when reading MSG files could result in user interface displaying RuntimeException stack trace

6.4.13 -

- Added ability to override maximum heap that Symphony OCR will use. This is done in a launch.ini file that must be stored in the SOCR application directory. The setting is controlled in the [JVM] MaxHeap=512m setting (this is the default - 512 MB heap). It can be increased - so for example, MaxHeap=1024m would increase it to 1GB.
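
For illustration, the Python sketch below reads a MaxHeap value from text laid out like the launch.ini described above and converts it to bytes. It assumes the file can be parsed as a standard INI file and that the size suffix follows the usual JVM convention (k, m, g); the max_heap_bytes helper is hypothetical and is not how SOCR itself reads the file.

```python
import configparser

# Sample content matching the [JVM] MaxHeap setting described above.
SAMPLE = "[JVM]\nMaxHeap=1024m\n"
UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def max_heap_bytes(ini_text, default="512m"):
    """Return the configured maximum heap in bytes (default 512 MB)."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    raw = parser.get("JVM", "MaxHeap", fallback=default).strip().lower()
    if raw[-1] in UNITS:
        return int(raw[:-1]) * UNITS[raw[-1]]
    return int(raw)  # plain number of bytes


print(max_heap_bytes(SAMPLE))
```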

6.4.12 -

- Fix issue where integration with old Worldox GX2 sites failed with 0xfffffff error

6.4.11 -

- Issue fix - web based DMSes can lose connectivity. When this happens, documents pending analysis and processing wind up being moved to the Unavailable list. When they become available again, SOCR was re-analyzing the files. This caused a lot of unnecessary downloading.
- If a document hasn't been changed since last analysis, it will not be re-analyzed unless the user explicitly clicked Re-Analyze in the Document Detail screen or the Document List screen

6.4.9 -

- Suppress error log entry that is logged if graph generation is interrupted (not adding any value - this is normal behavior)

6.4.8 -

- SOCR now only complains about spaces in a profile group's base path if the profile is defined on a mapped network drive (i.e. the path doesn't start with \\)
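
The check described above can be sketched in a few lines of Python. The function name is hypothetical; the only rule taken from the note is that a base path starting with \\ is exempt from the warning about spaces.

```python
def should_warn_about_spaces(base_path):
    """Warn about spaces only when the path does not start with \\\\ (a UNC path)."""
    is_unc = base_path.startswith("\\\\")
    return " " in base_path and not is_unc


print(should_warn_about_spaces("X:\\Client Files"))
print(should_warn_about_spaces("\\\\server\\Client Files"))
```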

6.4.7 -

- Added Ignore capability to all documents in the Backlog list (in case users want to ignore documents that haven't been analyzed and/or processed yet)

6.4.6 -

- Removed warning message added in 6.4.5 - it looks like WDAPI won't actually block if the PG is set to read-only

6.4.5 -

- Added warning message if a selected Worldox profile group is marked read only

6.4.4 -

- Add history when file has too many pages to process with the current license

6.4.3 -

- Bug fix - huge PDF files (>10K pages) processed during free trials would cause SOCR Processor to stop. These documents will now be moved to REPROCESS

6.4.2 -

- Make Statistics panel on main page show number of OCR backlog documents as well as pages

Summary 6.3

Here's an overview of the major changes:

  • Unlimited page count processing: For firms that have a per-user annual subscription, you now have the freedom to scan and save all of your documents without worrying about your page count processing limit (unique exceptions may apply).
  • Email notifications: Configure Symphony OCR to send daily status emails to alert your Symphony administrator of errors or problems OCRing documents.
  • Doc ID search: You can now use the document filepath or Doc ID to determine whether a document has been OCRed.
  • Streamlined home page: The Symphony OCR home (summary) page has been redesigned with a cleaner, simpler look.


6.3.65 -

- Bug fix - in rare situations, SOCR could fail to launch with 'ConcurrentModificationException' stack trace

6.3.63 -

- Bug fix - install as a service always resulted in 0x421 error
- Adjusted label on username field to indicate domain\user
- Bug fix - installer wasn't always remembering the correct installation path

6.3.62 -

- Bug fix - scheduler task wasn't working properly if tasks were scheduled for later the same day
- Bug fix - changes to scheduler configuration weren't taking effect until SOCR was restarted
- Display issue - label on Stop OCR Processor task was missing 'OCR'
- Small tweaks to layout of scheduler interface (moved the activity, time and days fields around so they are more intuitive, changed the order that activities appear in the drop down list so they are in order most likely to be used)

6.3.60 -

- Installer - display appropriate header text in screen that prompts for the user to run as

6.3.59 -

- NetDocuments Create Versions setting was turned off by default - it is now turned on by default
- Fixed documentation link on option to enable legacy processing

6.3.58 -

- Workaround for NetDocuments Invalid Hashable error when trying to get the display path of a document (ND changed their API)

6.3.57 -

- Made the display of Ignore, Reprocess, Delete and Adjust Priority consistent between the document lists and document details views

6.3.56 -

- Fix 'communication timeout' errors when sending nightly notifications

6.3.53 -

- Installer bug fix - cloud installs were launching SOCRTray after the installer finished (now it correctly launches SOCR.exe)

6.3.52 - 

- Bug fix - ShareFile integration would say that it wasn't connected when it clearly was

6.3.51 -

- Removed finding of TIF files (changing file extension of an existing document causes duplicate files to be created in ShareFile) - we can bring this back in when we make version creation optional

6.3.50 -

- First iteration of ShareFile integration
- Added ShareFileAPI 0.0.2-SNAPSHOT

6.3.49 -

- Added a log (logs/autorotate.log) to capture the number of pages that were auto-rotated during processing - this is disabled by default - to enable, edit log4j.properties and change the "log4j.logger.autorotatelog=ERROR, AUTOROTATEFILE" line to "log4j.logger.autorotatelog=INFO, AUTOROTATEFILE"

6.3.47 -

- NetDocuments configuration screen had two Basic Settings sections - these have been merged

6.3.46 -

- NetDocuments configuration screen now has an option to "Look for legacy documents". Disabled by default. When enabled, the ND integration will find all documents. Otherwise only documents modified in the past 7 days will be included in the Find phase.

6.3.45 -

- Notification emails now include the name of the SOCR machine in the subject

6.3.44 -

- SOCR now tracks the original file modified date of each document it finds. This is the date used in reporting backlog metrics. End result is that if the DMS forces the modified date to change during processing, the backlog progress graphs will still display the number of pages added over time properly

6.3.43 -

- Changed NetDocuments 'Create versions' option to default to 'true'

6.3.42 -

- SOCR uninstaller will now shut down existing running instances of SOCR (both running as user AND running as service)
- SOCR will no longer give the "run as user/run as service" dialog for cloud installs (default will be "run as user")
- Startup shortcut wasn't being removed during uninstall

6.3.41 -

- Adding support for running SOCR as a windows service
- When running as a windows service, it is not possible to shut down SOCR from inside the user interface - shutdown must be performed using Windows Services
- Added SymphonyOCRTray.exe - this is an applet that runs and puts the SOCR icon in the system tray when SOCR is running as a service

6.3.40 -

- Bug fix - HTML in Needs Attention document lists could be misrendered if the reason contained <<snip>>

6.3.38 -

- Special handling for ND files that were emailed directly into ND (ND forces us to create a version of these types of files)

6.3.37 -

- Installer wasn't adjusting the modified date on the sample images
- Improvements in error/warnings condition reporting during typical ND initial configuration use-case

6.3.36 -

- Ignore button missing in Details screen for Too Big documents
- Add knowledgebook hyperlink for ND config screen
- Added 'Processor > Basic Settings > Automatically rotate pages to proper orientation' option. Enabled by default. If turned off, SOCR will not adjust the page orientation in the output PDF.

6.3.35 -

- Display the repository name along with the cabinet name in the NetDocuments configuration screen
- Tweak system status display so we can click into the license screen if there are licensing issues

6.3.34 -

- Bug fix - if NetDocs document meta data wasn't available for computing workspace path, an IllegalStateException was being thrown

6.3.33 - 

- Bug fix - NullPointerException if unable to get path from workspace information for ND document
- Enhancement - Changed Lookup By Path to 'Lookup Document'. Users can now type in the doc ID of the document (as it appears in the source document management system). SOCR will query the DMS to locate the actual path of the document and display the details.

6.3.32 -

- Bug fix - ND integration wasn't searching for MSG files

6.3.31 -

- Cloud installer now auto-detects that we are on a WD Cloud terminal server and sets the default install location to "<path to CID Folder>\blah"

6.3.30 -

- Bug fix - files in FOLDER finder were still being processed, even if the folder was marked as inactive
- Added Create Versions option to NetDocuments integration
- MSG files from NetDocuments will now always create a version if the current file only has a single version (this is a special requirement from ND)
- If a file in ND is detected with an incorrect extension, a new version will be created with the correct extension

6.3.29 -

- Installer is now 'cloud aware' (safe to install into the WD Cloud environment)

6.3.28 -

- Summary: Don't display 'Current OCR throughput' data unless OCR has actually happened
- Bug fix - SSCLOUD and SOCRCLOUD licenses weren't working

6.3.27 -

- Added Powered By Abbyy and Trumpet logos to maintenance screen
- Removed msg files from NetDocs finder - NetDocs doesn't do full text indexing of MSG files, so there's no point in processing them

6.3.26 -

- Added spacing between table cells in document list display (prevent path and page number from being too close together)
- Improved error message if ND workspace configuration isn't set up properly
- We now track when the ND refresh token is valid through (1 year) and display a warning to the user 15 days prior to them needing to manually re-authenticate the SOCR -> ND connection
- We now track when the ND authentication token is valid through (24 hours, or 45 minutes of inactivity, whichever comes first) and auto-reset the connection instead of waiting for it to fail on an actual call
- Changed icon to indicate OCR (differentiate between S-Pro Workstation sys tray icon)
- Changed the UI for setting frequency for backlog searches so it works in days instead of hours
- Changed default search frequency for backlog searches to be 7 days
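
The two token lifetimes tracked in this release can be modeled roughly as shown below. This is a sketch only: the function names are hypothetical, "1 year" is approximated as 365 days, and the real integration may track these timestamps differently.

```python
from datetime import datetime, timedelta

REFRESH_LIFETIME = timedelta(days=365)    # "1 year", approximated
REFRESH_WARNING = timedelta(days=15)      # warn this long before expiry
ACCESS_LIFETIME = timedelta(hours=24)
ACCESS_IDLE_LIMIT = timedelta(minutes=45)


def refresh_warning_due(issued_at, now):
    """True once the refresh token is within 15 days of expiring."""
    return now >= issued_at + REFRESH_LIFETIME - REFRESH_WARNING


def access_token_expired(issued_at, last_used, now):
    """True after 24 hours or 45 idle minutes, whichever comes first."""
    expiry = min(issued_at + ACCESS_LIFETIME, last_used + ACCESS_IDLE_LIMIT)
    return now >= expiry


start = datetime(2015, 1, 1)
print(refresh_warning_due(start, start + timedelta(days=351)))
```

A caller would reset the connection proactively whenever access_token_expired returns True, rather than waiting for an API call to fail.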

6.3.25 -

- NetDocuments configuration screen now hides the settings unless the connection to ND is established
- If the connection to ND is not established a button appears for the user to explicitly connect Symphony to ND

6.3.24 -

- Bug fix - NetDocuments Finder wasn't auto-starting after connecting to ND
- Bug fix - heartbeats were being initialized and sent before everything was configured - this caused the 000000 Worldox user to be used for the first several minutes of the application running, even if a different user was configured
- Worldox 'open' links will now use wdox:// hyperlinks instead of generating wdl files if WD is newer than 8/15/2014

6.3.23 -

- Added Open link for NetDocuments document records

6.3.22 -

- Added support for NetDocuments DMS
- Added Progress Details screen (detail hyperlink next to system summary progress bar on Welcome page)

6.3.19 -

- Added display path in addition to the canonical path for each document. Filtering from the document lists will be performed against the display path. If the display path is different from the canonical path, the canonical path will be displayed as an additional attribute in the Detail screen of the document record
- Email attachments now use the name of the attachment as their display path instead of the awkward 0000000001 number

6.3.18 -

- Changed sample scan and msg files to be more fun (drink recipes)
- Bug fix - problems with MSG handling support weren't being detected immediately at launch (they only appeared after several minutes)

6.3.17 -

- Notification warning message will now display the full message in the Welcome screen instead of just 'Notifications have problems'

6.3.16 -

- Tweaked wording on No Image/No Text document list 'What's this' description

6.3.15 -

- Added ability for user to force processing of No Image/No Text documents. This can be initiated using the 'Enable Processing' button on the Document Detail screen, or using the 'Enable Processing' bulk action button on the Document List screen. These buttons only appear for documents that are in the No Image/No Text list.

6.3.14 -

- Added e-mail notifications (see new Notifications screen)
- Added overall progress bar to Summary (welcome) screen
- Removed the License Info section of the Summary screen for unlimited page count sites
- Added 'pages processed in past year' to the Statistics area of the Summary page
- Bug fix - SOCR was loading its configuration twice during launch
- On startup, SOCR will now kill any lingering WBAPI.EXE instances that were started by other instances of SOCR
- Add Simple View links to Summary (welcome) screen - this will display a view of the summary page that doesn't contain any links to other areas of SOCR
- Changed the 'Basic View' link on the Search Summary screens to be 'Simple View', and moved it to the upper right corner of the Search Summary page

6.3.13 -

- Added a 10 day grace period if the Worldox user count goes above the licensed Symphony user count. During this grace period, the license issue will display as an Error, but processing will continue. After the 10 days, the issue displays as an Error and processing will stop.

6.3.11 -

- Improved note behavior from 6.3.10 to encourage users to actually pay attention

6.3.10 -

- Added a note reminding users to enable email attachment indexing in their DMS next to the 'Process MSG (email) attachments' setting on the Processor configuration screen. This note hides/shows depending on whether MSG processing is disabled or enabled

6.3.9 -

- Regression bug fix - the hidden file bug fix from 6.3.6 got reintroduced in 6.3.7

6.3.6 -

- Bug fix - files that were marked as Hidden would be OCRed, but the conversion results couldn't be returned to the file system
- Bug fix - in some extremely rare instances, SOCR could generate a corrupted file (charset encoding issue)

6.3.5 -

- Added additional error trapping for corrupted MSG files (0x8004011b error code)
- If internals of MSG cause attachments to be inaccessible, the document record will now be put in the Inaccessible list (old behavior was to put it in the Reprocessing list)

6.3.4 -

- Added low disk space warning and error to backup manager. By default, these are set to 1.5 GB for warnings, and 1GB for error
- If disk space for backups drops below 'error' disk space level, documents will be moved to the Reprocessing list
- Levels can be adjusted by manually editing settings.xml and adding errorUsableSpace and warnUsableSpace parameters to the <backupManager .... /> element
- This check only occurs if backups are enabled
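
The threshold logic above might look like the following Python sketch. The default levels (1.5 GB warning, 1 GB error) and the "only when backups are enabled" rule come from the notes above; the function name, the status strings and the use of bytes as the unit are assumptions, since the unit expected by errorUsableSpace and warnUsableSpace is not stated here.

```python
GB = 1024 ** 3
WARN_USABLE_SPACE = int(1.5 * GB)  # default warning level
ERROR_USABLE_SPACE = 1 * GB        # default error level


def backup_space_status(usable_bytes, backups_enabled=True,
                        warn=WARN_USABLE_SPACE, error=ERROR_USABLE_SPACE):
    """Classify free space for the backup folder as ok, warning or error."""
    if not backups_enabled:
        return "ok"  # the check only occurs if backups are enabled
    if usable_bytes < error:
        return "error"  # documents would be moved to the Reprocessing list
    if usable_bytes < warn:
        return "warning"
    return "ok"


print(backup_space_status(int(1.2 * GB)))
```

In practice the usable_bytes value could come from shutil.disk_usage(path).free for the backup folder.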

Summary 6.0-6.2

Here's an overview of the major changes:

  • Added support for email attachments - now, attachments will automatically be OCRed when the email is saved to Worldox.
  • OCR activity is now logged in the Worldox Audit Trail, so that you can note the full OCR history of a document within Worldox
  • Better page count management (for firms that have a large backlog of files to OCR)
  • New document prioritization levels
  • Ability to specify when each profile group should be OCRed
  • Better backlog reporting (explicitly display page usage in past year, recommended license size)
  • Ability to report on backlog progress for each profile group
  • Streamlined installation (Client ID and Partner ID data entry eliminated)
  • Page count will now reset on anniversary date, not January 1
  • User interface overhaul with consistent page layout and links to online knowledge books
  • Consistent heartbeat error/warning reporting


6.3.2 -

- Add support for unlimited page count Trumpet licenses (P0 feature in the Trumpet license)

6.3.1 -

- UI improvement - MSG analysis was showing page X of Y of the previous PDF analysis status, even though there weren't any pages being analyzed

6.1.62 -

- SOCR installer was creating empty config, data, logs and work folders in the directory containing the setup executable

6.1.56 -

- Bug fix - some database states weren't reporting in the system status
- If database fails to open, its state is switched back to Closed before throwing an exception
- Graph generation failed with exception trace if there was no data

6.1.55 -

- Bug fix - SOCR was hard coding the full path of the backup folder, instead of using relative paths
- Bug fix - Backup manager would fail to make backups if the user migrated the configuration from a 32 bit machine to a 64 bit machine - this now gets automatically corrected

6.1.53 -

- If we fail to open database on launch or on scheduled rebuild, we now do a hard fail - present an error dialog on screen, then kill SOCR (with exit code 999)
- If we fail to re-open the database after scheduled maintenance, we now do a hard fail

6.1.52 -

- When OCR results are returned to Worldox, post an audit trail entry (Save)

6.1.51 -

- If documents.lg file is corrupted, we now attempt to delete it

6.1.50 -

- Backlog throttling algorithm adjustments (undoing some of the 6.1.48 changes)
- SOCR now bases its default reserve on an assumed license duration that is 13/12 of the actual license duration (for a 380 day license, this equates to an additional 42 days). This will cover cases where licenses are entered prior to the license start date, at the price of slightly higher initial backlog processing before throttling kicks in.

6.1.49 -

- Changes to document priority in Worldox and Folders screens now run as a background task with progress displayed in the standard background task frame
- Changes to the Processor Config screen that result in bulk changes to documents (moving Wrong Type to Analyzing, or moving unprocessed email message to Analyzing, etc...) now run as background tasks

6.1.48 -

- Backlog throttling algorithm adjustments
- If there is more than 30 days worth of data, SOCR will now dynamically compute the reserve capacity (130% of the average number of OCRable pages added over the past year).
- If there is insufficient data, SOCR will now reserve 3/4 of future processing capacity for new pages
- The minimum default reserve capacity is 50 pages/day (this can be overridden using overclocking)
- Not a change, just a reminder: In all cases, if the reserve capacity isn't used on a given day, those pages become available for backlog processing
- Split up the Processor setting blocks (Backup retention and backlog throttling settings are now grouped separately)
- Changed wording on backlog throttling / processing capacity reserve settings (plus put the checkbox and pages/day input field on the same line)
- The counts for the pages added in the past year are now stored to disk so we don't have to compute them immediately on startup (they are refreshed once per day)
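
The reserve rules listed for 6.1.48 can be sketched roughly as below. This is only an illustration, not the SOCR implementation: the function name and inputs are hypothetical, the average is assumed to be a per-day figure, and the 50 pages/day minimum is assumed to apply to both branches.

```python
MIN_RESERVE_PAGES_PER_DAY = 50  # minimum default reserve from the notes above


def default_reserve(daily_new_pages, future_capacity_per_day):
    """Return an illustrative default reserve in pages/day.

    daily_new_pages: OCRable pages added per day, oldest first (hypothetical input).
    future_capacity_per_day: remaining processing capacity per day (hypothetical input).
    """
    if len(daily_new_pages) > 30:
        # Enough history: 130% of the average over (at most) the past year.
        window = daily_new_pages[-365:]
        reserve = 1.30 * sum(window) / len(window)
    else:
        # Insufficient data: reserve 3/4 of future capacity for new pages.
        reserve = 0.75 * future_capacity_per_day
    return max(MIN_RESERVE_PAGES_PER_DAY, reserve)
```

With 60 days of history at 100 new pages per day this yields about 130 pages/day; with only 10 days of history and 400 pages/day of capacity it yields 300.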

6.1.46 -

- If MsgEdit.exe doesn't return results within 60 seconds, we now destroy the sub-process, abandon the call and throw an error
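
A minimal sketch of the timeout behaviour described above, written in Python for illustration only. The helper name is hypothetical and the real MsgEdit.exe command line is not shown in these notes; `subprocess.run` kills the child process when the timeout expires, which corresponds to "destroy the sub-process, abandon the call and throw an error".

```python
import subprocess


def run_with_timeout(command, timeout_seconds=60):
    """Run an external tool and fail loudly if it does not finish in time."""
    try:
        completed = subprocess.run(
            command, capture_output=True, timeout=timeout_seconds
        )
    except subprocess.TimeoutExpired as exc:
        # subprocess.run has already killed the child process at this point.
        raise RuntimeError(
            f"{command[0]} did not return results within {timeout_seconds} seconds"
        ) from exc
    return completed.stdout
```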

6.1.45 -

- Bug fix - bulk operation buttons were showing by default, they are now hidden until the user clicks the Show Bulk Operations button
- Bug fix - Email messages containing MSG attachments whose names had double quotes and periods near the end, and which in turn contained attachments, resulted in errors during processing

6.1.44 -

- Added new process priority level: Analysis Only (no OCR) - these documents will be analyzed and will stay in the Processing list, but will not be processed
- Changed label on Very High and High processing priority to have "(no throttling)" at the end

6.1.43 -

- Bug fix - background tasks would disappear from UI before they finished running (only on some browsers)

6.1.42 -

- Added new setting to Worldox config (only in settings.xml, not UI): autoMapDisconnectedDrivesEnabled. If true (the default), any disconnected drives are mapped. If false, no drive mapping will be attempted.

6.1.41 -

- Bug fix - Worldox connection was getting reset immediately after launch
- Tweaking display of pages left in Processor feature on License screen (displayed inaccurate data for Jan 1 resetting licenses)

6.1.38 -

- Bug fix - backlog throttling wasn't working properly.  In some cases, it would allow runaway, unthrottled processing of backlog
- License detail screen now displays additional information about the license
- License detail screen Features list now gives info about how many pages have been used and how many are remaining
- Added columns to the CSV document export for priority, last modified

6.1.37 -

- Bug fix - sites that had UNC mapped profile groups where the UNC share was no longer valid would wind up with no PGs being found at all

6.1.36 -

- Added pages left readout to Processor config screen

6.1.35 -

- Documents with pages larger than a threshold (10,000x10,000 pixels by default) are now placed in a 'Too big' document list.  The size limits are configured in the PreProcessor section of the settings.xml file - maxWidthPixels, maxHeightPixels
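
As an illustration of the size check, here is a small sketch. The function name is hypothetical; the parameter names mirror maxWidthPixels and maxHeightPixels. It assumes a page is "too big" when either dimension exceeds its limit, which the note above does not spell out.

```python
def is_too_big(width_px, height_px, max_width_px=10_000, max_height_px=10_000):
    """Return True if a page exceeds either configured pixel limit."""
    return width_px > max_width_px or height_px > max_height_px
```

Under this assumption a 2550x3300 page passes, while a 12000x3300 page would send the document to the 'Too big' list.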

6.1.34 -

- Bug fix - processing files with huge numbers of pages (>1000) could result in 'CreateProcess error=206, The filename or extension is too long' error
- Processor and Analyzer will now allow up to 5 errors in a one hour period before pausing processing
- Handle 0x80030050 errors (STG_E_FILEALREADYEXISTS) - these files are now flagged as corrupted
- Handle 0x80030005 errors (STG_E_ACCESSDENIED) - these files are now flagged as restricted
- Handle 0x80004005, 0x800300fa errors - these are flagged as corrupted now
- Added 'Reason' to Needs Attention list
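
The "up to 5 errors in a one hour period" rule above could be tracked with a sliding window, sketched below. The class is hypothetical, and it assumes the pause triggers on the sixth error inside the window.

```python
from collections import deque


class ErrorWindow:
    """Track recent errors and report when processing should pause."""

    def __init__(self, max_errors=5, window_seconds=3600):
        self.max_errors = max_errors
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def record_error(self, now):
        """Record an error at time `now` (in seconds); return True to pause."""
        self._timestamps.append(now)
        # Drop errors that have aged out of the one hour window.
        while now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        return len(self._timestamps) > self.max_errors
```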

6.1.33 -

- the order that PGs are searched is now driven by the default priority assigned to that group (high priority searched before low priority)
- the order that folder searches are added to the finder task is driven by the default priority assigned to each root folder
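
A small sketch of priority-driven search ordering, for illustration only. The function and the input tuples are hypothetical; the priority names are the ones introduced in 5.4.6. Because Python's sort is stable, groups with the same priority keep their configured order.

```python
PRIORITY_ORDER = ["Very High", "High", "Normal", "Low", "Very Low"]


def search_order(groups):
    """groups: list of (name, default_priority); returns names, highest priority first."""
    rank = {priority: index for index, priority in enumerate(PRIORITY_ORDER)}
    ordered = sorted(groups, key=lambda group: rank[group[1]])
    return [name for name, _priority in ordered]
```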

6.1.32 -

- documents in CORRUPTED list will no longer be auto-reanalyzed after every update
- Corrupted MSG files were being put into the Email Messages list instead of the Corrupted list

6.1.31 -

- Document List views now support Background Tasks display for bulk operations (Delete, Change State, Change Priority)

6.1.31 -

- Bug fix - attachments with tiff extensions (i.e. 4 characters) were not being handled properly (continuously put back on REPROCESS list)

6.1.29 -

- Backup purge now removes files from work\backupfiles folder tree if those files aren't referenced by a Document record
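
One way to picture the orphan cleanup is the sketch below. It is not the actual purge code: the function name is hypothetical and the set of referenced paths is assumed to come from the Document records. It only reports orphaned files and leaves deletion to the caller.

```python
from pathlib import Path


def find_orphaned_backups(backup_root, referenced_paths):
    """Return backup files under backup_root that no Document record references."""
    referenced = {Path(p).resolve() for p in referenced_paths}
    return sorted(
        p
        for p in Path(backup_root).rglob("*")
        if p.is_file() and p.resolve() not in referenced
    )
```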

6.1.28 -

- MsgEdit failed to write output file if attachment names contained unicode characters
- New Background Task sub-system has been added - currently integrated into Processor Config (and parts of Advanced and Scheduler Config)
- Backup purge is now implemented as a Background Task

6.1.27 -

- Bug fix - MsgEdit.exe crashes sometimes
- Sub-attachment names are now prefixed with the sub-email message name that they came from
- Msg working folder is now flushed each time we invoke MsgEdit.exe
- SOCR will attempt to connect network drives for Worldox profiles that have been disconnected
- WDAPI32.DLL will now be completely unloaded when we reset the Worldox connection

6.1.24 -

- Fix for 'Premature end of file' and 'Content is not allowed in trailing section' errors when processing MSG files (MsgEdit wasn't closing results.txt properly).  MsgEdit

6.1.23 -

- If PDF was corrupted, but could be rebuilt during analysis, we now still mark the PDF as corrupted (error message is 'PDF is partially corrupted - but it can probably be repaired in Acrobat then resubmitted for processing').  These types of PDFs can't be modified in 'append mode' to place the invisible text layer, so it makes no sense to continue processing them (even though technically we can OCR them)
- Restricted documents now only go to the Encrypted/Restricted list if they would have otherwise been processed
- Digitally signed documents now go to the new Digitally Signed list if they would have otherwise been processed
- Digitally signed and encrypted state is now displayed in Document detail screen (if the document is signed and/or encrypted)
- Added list descriptions for a few document lists that were missing descriptions
- If document modified date is more than 1 day in the future, we process the document (instead of putting it in the re-process list) - we had some sites that had whacked modified dates (like 5 years in the future) on documents and SOCR was thinking that they were modified recently so kept putting them in the re-process list

6.1.19 -

- Better error message if something goes wrong during PDF analysis (include the filename and page number)

6.1.17 -

- Add document name to warning logging when inline image parsing of pages in a PDF fails
- Added better description to the email related document lists (there was no description before)
- Better error handling if Outlook wasn't installed on the workstation (or isn't working for some other reason) - warning now appears in system status if there is a problem, AND MSG handling has been enabled
- We now have a new list for unprocessed email messages (MSG files go into here if MSG processing isn't enabled, or if Outlook isn't installed properly)
- When MSG handling is changed from disabled to enabled, SOCR will now mark unprocessed email messages and unprocessed email attachments for re-processing
- On launch, if MSG handling is enabled, SOCR will now mark unprocessed email messages and unprocessed email attachments for re-processing

6.1.14 -

- change installer - the Java bundle id download link is now BundleId=81819 (Java 7_u45)
- change installer - the Java installation now completely runs in silent mode - the user doesn't have to click through Java installation screens, and they aren't taken to a web site to test the Java install after it completes
- change installer - the Java installation is configured to NOT integrate with the web browser on the machine

6.1.13 -

- Message dialog when user launches SOCR multiple times is friendlier - and clicking OK on it displays the UI of the already running instance.

6.1.12 -

- First pre-release with MSG attachment handling

6.1.10 -

- Bug fix - sometimes after saving processor changes, analyzer would wind up not running

6.1.8 -

- Change labels on Processor config screen
- Adjusted 'Change state to' history messages to display the 'pretty name' instead of the CONTAINER, ERROR, etc... enum name
- Display attachment name in document detail screen
- Bug fix - when processing TIFF attachments, the attachment record in the parent document was still referring to the .TIF file extension - we now update the analysis results (which is where the attachment information lives) whenever we rename an attachment

6.1.7 -

- Bug fix - TIFF email attachments were staying with tif file extension even after being converted to PDF
- Bug fix - TIFF email attachments were being named 000000 instead of retaining the original name (with new extension)

6.1.5 -

- MSG files weren't being found in Worldox finders
- Eliminated 'allowedExtensions' setting in Worldox and Folder finder configuration - replaced with 'disallowedExtensions' - this isn't surfaced in the UI, but if there are firms that don't want us finding MSG files or what-not, we can set this
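
The switch from an allow-list to a deny-list can be illustrated as follows. The function is hypothetical, and the case-insensitive matching and the optional leading dot are assumptions, not documented behaviour of 'disallowedExtensions'.

```python
import os


def is_allowed(filename, disallowed_extensions):
    """Return False if the file's extension appears in the deny-list."""
    blocked = {ext.lower().lstrip(".") for ext in disallowed_extensions}
    extension = os.path.splitext(filename)[1].lower().lstrip(".")
    return extension not in blocked
```

For example, with a deny-list of `["msg"]`, `mail.MSG` would be skipped by the finder while `scan.pdf` would still be found.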

6.1.3 -

- Added 'Refresh' button next to Worldox PG list (allow users to see changes to the PG lists made since SOCR launched)

6.1.2 -

- Added 'Allow processing of email attachments' setting to Processor config
- Cleaned up Processor config UI a bit

6.1.1 -

- First build with support for MSG handling

6.0.17 -

- Bug fix - if the file in the underlying file system has no modified date set, database corruption could result when the document is added to the database.  See ticket 20124 for details.

6.0.15 -

- Bug fix - if a database corruption occurs, and the user does a database reset (renamed documents.db and documents.lg) before a rebuild can happen, all future launches fail

6.0.14 -

- Bug fix - profile groups with base paths containing spaces could prevent Finder from working, even if the profile group wasn't selected for processing

6.0.13 -

- Changed Folder feature description to "Windows folder tree integration"  (old description referred to 'processing' which could cause confusion)

6.0.12 -

- Bug fix - Warning wasn't showing if no Worldox PGs were selected

6.0.10 -

- Licensing no longer warns about unknown feature codes in new license format

6.0.8 -

- Bug - null pointer exception during page count calculations for old license type if Abbyy engine fails to initialize

6.0.7 -

- Move to engine installer

6.0.6 -

- Added support for OCR of Spanish and Brazilian Portuguese documents

6.0.5 -

- Bug fix - older sites that were using the Auto Select PGs checkbox wound up with no PGs selected after upgrading.  We removed the Auto Select PGs option when we moved to version 6, so a compatibility shim was needed for those sites

6.0.4 -

- Bug fix - heartbeats were reporting the OCR backlog size incorrectly (often times not reporting it at all)

6.0.3 -

- Bug fix - long running finder tasks were showing 'Waiting for other tasks to complete' when they were actually running

6.0.2 -

- Simple view of search summary screen (the bar graph progress dialog) no longer has hyperlinks on the cabinet paths (this was allowing users to easily get to a non-simple view of things)

Summary 6.0

Symphony 6.0 brings a major change to the workstation user interface.  Here's an overview of the major changes:

  • New document prioritization levels
  • Ability to specify when each profile group should be OCRed
  • Better backlog reporting (explicitly display page usage in past year, recommended license size)
  • Ability to report on backlog progress for each profile group
  • Streamlined installation (Client ID and Partner ID data entry eliminated)
  • Page count will now reset on anniversary date, not January 1
  • User interface overhaul with consistent page layout and links to online knowledge books
  • Consistent heartbeat error/warning reporting

6.0.1 -

- Moved all changes from versions 5.3.13 to 5.4.51 to version 6.

5.4.51 -

- Bug fix - page count and document count usage displayed in the heartbeats was flipping (pages,documents then documents,pages) every time a new document was processed.

5.4.50 -

- Added debug output to track when processor and pre-processor are started and stopped

5.4.49 -

- Tweak to Unavailable list text - changed to Moved/Unavailable

5.4.48 -

- Heartbeats now include the actual status (WARN, ERROR) before the details

5.4.47 -

- Added download URL for engine installer if auto-download fails
- Bug fix - in Worldox and Folder config screens, the 'apply to existing documents' checkboxes were displaying initially, even in modern browsers
- Bug fix - in Worldox and Folder config screens, clicking the 'select all' checkbox had no effect
- Worldox API session IDs are now generated using current clock time - avoid issues with accidentally reconnecting to old (corrupted) WDAPI

5.4.46 -

- Bug fix - Worldox configuration screen could fail to apply changes with NullPointerException in rare situations

5.4.45 -

- Bug fix - setting changes could not be saved if config.xml file didn't exist

5.4.44 -

- Bug fix - Heartbeat sender was crashing SOCR on launch if license wasn't populated

5.4.43 -

- Updated to jWDAPI 20130508 - adding ability to detect when WDAPI doesn't load (vs has errors)
- Much better error message when we aren't able to load the Worldox API: "Unable to load Worldox libraries - please close Symphony, launch and close Worldox, then re-launch Symphony.  Error details: Worldox API not initialized. Worldox must be launched and closed at least once for a given Windows login (this registers the Worldox programming interface). If you have recently updated Worldox, you may need to launch and close Worldox one time to get the update downloaded."

5.4.42 -

- Bug fix - heartbeat wasn't sending appid
- Switched to using new heartbeat post type (no client or partner ID)
- Added link to WD config screen from license error message (ticket 18458)
- Added 'Send Heartbeat Now' link to the License page (under Advanced section)

5.4.41 -

- Fix typo in search summary screen

5.4.40 -

- Bug fix - analysis of some PDF files was showing incorrect image ratios (crop and media box extents issue)
- Bug fix - visible text outside the crop box was being included in the visible text counts - this text is now being excluded from the visible text counts
- Added ability for SOCR to generate thumbnail images of pages that don't have one already (disabled by default)
- Added 'Generate Thumbnails' setting to Processor configuration

5.4.39 -

- S-OCR now tracks page counts based on the license renewal date (will take effect at the next license renewal)
- When the new page count system is active, the display of remaining processing capacity reflects the renewal date in MMM, yyyy format (or MMMM d, yyyy format if the license is for less than a year)
- Added New pages per year calculation to the welcome screen
- Added 'Recommended license capacity' to welcome screen if the current license isn't big enough to handle the backlog and one year's new documents
- Added explicit link for displaying All Weeks of the Backlog Summary screen
- Fixes a number of display issues with backlog throttling warnings and other error and warning messages
- Heartbeat now includes page usage summary for each year (this only takes effect when the new page count tracker is active, so it'll be a while until we have good data on this)
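
The renewal date display rule in 5.4.39 uses Java-style patterns (MMM, yyyy and MMMM d, yyyy). A rough Python equivalent is sketched below. The function name is hypothetical, "less than a year" is assumed to mean fewer than 365 days, and the month names assume the default English locale.

```python
from datetime import date


def format_renewal_date(renewal, license_days):
    """Format the renewal date the way the remaining-capacity display describes."""
    if license_days < 365:
        # Short license: full month name, day and year (MMMM d, yyyy).
        return f"{renewal:%B} {renewal.day}, {renewal.year}"
    # Year-long (or longer) license: abbreviated month and year (MMM, yyyy).
    return f"{renewal:%b}, {renewal.year}"
```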

5.4.38 -

- Add debug lines to troubleshoot issue with PG's not being found
- Remove config\TRE_settings.reg from installer
- Bug fix - an extra heartbeat sender was being created, and it was putting error messages in the log files

5.4.37 -

- SOCR now saves a backup of the previous settings.xml file (in the config\bak folder) every time the user changes settings.  These backups are retained for 90 days.
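
The 90 day retention could be enforced with a purge along these lines. This is a sketch only: the function name is hypothetical, and it assumes backups are .xml files whose file modified time reflects when the backup was made.

```python
from pathlib import Path


def purge_old_backups(backup_dir, now, retention_days=90):
    """Delete backups older than the retention period; return the removed names."""
    cutoff = now - retention_days * 86400  # `now` is a POSIX timestamp in seconds
    removed = []
    for backup in Path(backup_dir).glob("*.xml"):
        if backup.stat().st_mtime < cutoff:
            backup.unlink()
            removed.append(backup.name)
    return sorted(removed)
```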

5.4.36 -

- Change headings on document detail screen to have 'Control' separate from 'Details'

5.4.35 -

- Change label on Processor Config 'Performance' section to be 'Performance  (since last restart)'

5.4.34 -

- Make summary screen progress bar show percent complete instead of percent remaining

5.4.33 -

- UI tweaks on summary screen

5.4.32 -

- Legacy schedule entries "RUNFINDER_QUERY" and "RUNFINDER_SPIDER" are now discarded (previously, they would appear as "Operation (RUNFINDER_QUERY) not available"). Because these particular tasks will never be available, we discard them
- Improved look and feel of summary screen

5.4.31 -

- Bug fix - Heartbeats weren't sending on a regular basis (they only sent when the application launched).  Heartbeats should now be sent once per hour.

5.4.30 -

- Added SYMPHONYANALYZER license type

5.4.29 -

- Bug fix - documents in very low and low priority weren't being processed at all (even if backlog throttling wasn't active)

5.4.28 -

- Bug fix - queue analysis was messing things up if file modified date was greater than when the queue analyzer was first created
- Bug fix - Show/hide bulk operations wasn't working in Internet Explorer

5.4.27 -

- Bug fix - huge PDF files caused out of memory exceptions during processing

5.4.26 -

- Bug fix - invalid document modified times could cause queue analyzer to mis-calculate
- Bug fix - errors during document mutator notifications could cause DB to actually be corrupted

5.4.25 -

- Ignore All is now available in the Processing list bulk operations

5.4.24 -

- bug fix - missing files could cause process summary graph to be improperly computed ( java.lang.ArrayIndexOutOfBoundsException: -1 error )

5.4.23 -

- Enabled heartbeat sending using the old heartbeat format (otherwise heartbeats aren't showing up)

5.4.22 -

- Search Summary screen now displays a progress bar for each profile group (or folder)

5.4.21 -

- Bug fix - if two web requests were hitting at exactly the same time, they could conflict and result in ClassCastException errors.  This may also address problems where sometimes clicking a link didn't seem to always work.  This problem has been in the code since forever - I'm glad we got it fixed
- Bug fix - document in NEW list on Tia's VM testing - we now reprocess anything in the NEW list for good measure
- Fixed two knowledge article links (they were pointing at admin.php instead of index.php)

5.4.20 -

- Bug fix - log file had error messages related to Scheduler during initialization
- Bug fix - Summary screen was only showing results for first 50 documents
- Added display of which list the Summary is for (added 'in Processing list' to the end of the search criteria)

5.4.19 -

- Bug fix /maestro/do/status wasn't showing the status level
- We now re-analyze any document in the DELETED list when SOCR launches (documents should never be in the DELETED list on launch, this is a cleanup from an earlier bug) - these document records will almost certainly wind up being on the UNAVAILABLE list.

5.4.18 -

- Show Bulk Operations button now changes its text to 'Hide Bulk Operations' if appropriate
- Bug fix - the bulk operations were always showing in certain versions of Internet Explorer (even with Javascript turned on)

5.4.17 -

- Bug fix - installer was forcing Cleaner Recovery every time SOCR was updated (yuck)

5.4.16 -

- Bug fix - in document list, clicking Reanalyze on a document, then applying a filter to the next screen resulted in errors
- If installer is unable to clear out existing files, it now gives the user Retry and Cancel buttons (instead of just aborting the installation)

5.4.15 -

- Bug fix - loading pre-5.4 databases could result in a failed launch with error in logs: com.trumpetinc.maestro.MaestroApp  - Problem during initialization - will attempt database maintenence when we launch again - -1757588944

5.4.14 -

- Bug fix - Reanalyze button on individual files was resulting in stack trace error
- If filter is specified as empty, or ending with a backslash, we now add an * to the end
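
The filter rule above is simple enough to show directly. This is a sketch with a hypothetical function name, not the actual SOCR code.

```python
def normalize_filter(path_filter):
    """Append a wildcard when the filter is empty or ends with a backslash."""
    if path_filter == "" or path_filter.endswith("\\"):
        return path_filter + "*"
    return path_filter
```

So an empty filter becomes `*`, `W:\Clients\` becomes `W:\Clients\*`, and a filter such as `W:\Clients\*.pdf` is left unchanged.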

5.4.13 -

- Removed Detail link from document list - details are now obtained by clicking on the filename itself
- Added 'Show Bulk Operations' button - clicking this displays a panel with the bulk operations on it
- Added 'Bulk Operations' pane with individual buttons for performing bulk operations on the filtered list results
- Changed 'View' to 'Open' for Worldox documents (will open the document in Worldox)
- Re-arranged per-document operation buttons so they lay out nicer for Folder and Worldox sourced documents ('Open' is now at the end of the list)

5.4.12 -

- Redesign Folder configuration screen - individual items can be enabled/disabled, added default priority setting, simplified adding new folders
- Added Default Priority to Worldox configuration screen
- Added 'Reprioritize existing documents' checkbox to Folder and Worldox configuration screens (this appears when the priority is changed)
- Added View Summary link to Folder configuration screen
- Added What's this section to Lookup by Path screen

5.4.11 -

- Advanced config option: New webServerConfigration -> listenPort setting - controls which port the internal SOCR web server will listen on.  Default is 14722.

5.4.10 -

- All links in the "What's this" sections of each page now open in separate tabs
- Fix capitalization of Processor, Analyzer and Finder in scheduler task names

5.4.9 -

- Lookup By Path - added Tip line
- Scheduler - removed Run Now button (doing that feature properly would require a lot of work - not worth it right now)
- Analyzer config screen, changed 'Settings' to 'Advanced Settings'
- In Licensing screen, Change Analyzer feature to just read 'Analyzer' and Processor feature to just read 'Processor'

5.4.8 -

- Scheduler will no longer schedule tasks that the license doesn't allow (prior, if a task was configured, then the license changed, the task would still be scheduled for execution)
- Scheduler Configuration screen only displays schedule entries that are for tasks that are allowed by the license
- Scheduler Entry Edit screen now only displays task types that are allowed by the license
- Added bulk priority change buttons to document list screens (not sure if this is the best UI for this...)
- Bug fix - Search Criteria in Search Summary page weren't showing the search description
- Search Criteria in Search Summary page can be clicked to get a list of documents meeting that criteria (i.e. all documents in a particular PG) - this will eventually allow users to adjust priority on all documents in a PG, for example

5.4.7 -

- Change 'read only' to 'read-only' in WD config screen
- Added Save button to Worldox basic settings section
- Save button in Processor and Analyzer screens are now below the Settings areas (consistent with other screens)
- Scheduler list now has a 'Delete' button for removing schedule entries.  The Edit Schedule Entry screen no longer has a 'delete' button
- Bug fix - "Client ID not set" error on clean install
- Configuration file settings:
   - documentProcessor autostart="true"  - if 'false', the processor will not start when S-OCR starts (useful for troubleshooting)
   - documentPreProcessor autostart="true"  - if 'false', the analyzer will not start when S-OCR starts (useful for troubleshooting)
   - MaestroConfig checkDatabaseIntegrityOnLaunch="false"  - if 'true', SOCR will check the database integrity as it launches (problems are logged as fatal errors to the maestro.log file, an error dialog will appear on screen and the launch will fail) - this slows the launch down, and shouldn't be used in production, but it would be very good to have this turned on in our test environments
- Bug fix - caption on Search Summary screen said 'Search Summary for Search Summary' - it now properly says the name of the search summary (e.g. Worldox profile groups)
- Ability to change individual document priority from the document list
- Bug fix - backlog calculation wasn't looking back 5 days (for all intents and purposes, anything older than today was considered to be part of backlog)
- "Backlog throttling active" warning message in Processor configuration screen now displays the actual date of the backlog cutoff
- Processing and Analyzing lists now have headings for processing priority groupings

5.4.6 -

- Bug fix - Scheduler is coming up without default entries on clean installs
- Date column in Processing and To Analyzing lists displayed the last time the file was processed.  For these two lists, this column will now display the modified date of the file.
- Added mechanism for controlling processing priority (Very Low, Low, Normal, High and Very High) - documents are grouped by their processing priority (and ordered by file modified date inside each priority group).  Groups Very Low and Low are always considered to be part of the backlog.  Group Normal documents are part of the backlog if they are older than 5 days.
- Detail:  Database rebuild is required by processing priority implementation - this will happen automatically the first time the DB is opened
- Worldox PG analysis results can now be obtained by clicking the new View Summary link in the PG selection list  
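
The priority grouping and backlog rules introduced in 5.4.6 can be sketched as below. The names are hypothetical. Two points are assumptions: High and Very High documents are treated as never being backlog, and documents are ordered oldest first inside each group (the notes say only that they are ordered by modified date).

```python
from datetime import datetime, timedelta

PRIORITIES = ["Very High", "High", "Normal", "Low", "Very Low"]


def is_backlog(priority, modified, now, cutoff_days=5):
    """Return True if a document counts as backlog under the 5.4.6 rules."""
    if priority in ("Very Low", "Low"):
        return True  # always backlog
    if priority == "Normal":
        return now - modified > timedelta(days=cutoff_days)
    return False  # assumption: High and Very High are never backlog


def processing_order(documents):
    """documents: list of (name, priority, modified); group by priority, then by date."""
    return sorted(documents, key=lambda d: (PRIORITIES.index(d[1]), d[2]))
```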

5.4.5 -

- Bug fix - adding an empty string as a folder gave a null pointer exception
- Added hyperlink for support docs to Folder config screen
- UI improvements to monitored folder add screen

5.4.4 -

- Bug fix - clicking on View on a document that didn't originate in Worldox caused Worldox to try to view that document.  The View button is no longer displayed unless the document originated in WD.

5.4.3 -

- Bug fix - warning messages related to folder processing were showing up in the Finder screen, even though Folder processing wasn't enabled by the license.
- Bug fix - If license disallowed Folder Finder, then a new license was entered that allowed Folder Finder, the Folder Finder did not get turned on

5.4.2 -

- Bug fix - old Worldox selected PG configuration wasn't being carried over into the new system
- Bug fix - "0 is 0 or negative" error when analyzing zero sized PDF files - these now get properly flagged as being corrupted
- Bug fix - the warning "No Worldox profile groups are selected" error shouldn't display "Configure Worldox" link when the user is on the Configure Worldox screen.  Same for folder screen.
- Bug fix - during installation, errors pop up about not being able to configure firewall exclusions - for now, we are going to remove this from the installer - it's not worth the hassle if it's not going to work robustly

5.4.1 -

- Bug fix - backlog throttling was activating improperly on short trial licenses because it was using December 31 of the current year as the license reset point in the calculations - Now, if the license duration is less than 90 days, the backlog throttling algorithm will use the expiration date of the license itself (instead of Dec 31 of the current year)
- Installer will now configure Symphony OCR rule in windows firewall
- Breaking change:  maximum file modified date cutoff in the Finder is no longer honored - any sites that have this value set will start 'finding' older files.  This does NOT impact the maximum file age setting in the processor.  Note that this setting was removed from the UI a while back (unless the user had it actually set)
- Breaking change: Auto select PGs has been removed - we should probably have users check the selected PGs after they apply this update
- Launcher has additional error messages (Should provide more feedback in some cases when the Java runtime is corrupted)
- Finer grained history notes if document is inaccessible
- Breaking change: It is no longer possible to force a document record to be created by typing its path into the Search By Path dialog - if the document doesn't exist in the database already, it will no longer be created.  See http://forum.trumpetinc.local/viewtopic.php?f=2&t=905&p=7271#p7271
- Major new feature: Ability to page through list results
- Major new feature: Ability to filter list results.  All operations like Reanalyze All, Ignore All will apply to the filtered list
- Added a new document list - Unavailable for documents that have their document source (worldox, folder, etc...) become unavailable (this can happen b/c of licensing, b/c the source is offline, or if the configuration of the source no longer includes the document - i.e. if the user changes the selected PGs in the Worldox DMS source)
- Major new feature: Support for new Folder DMS types
- Complete overhaul of Finder implementation to support pluggable DMS types (these are called Document Sources)
- Finder will now check existing document records for reprocessing, regardless of which list they are in (this check used to only be made against the document modified date - now it is made against the modified date AND file size)
- Uncommon document lists are now shown under a 'Other Lists' heading, if those lists have any documents
- If the analyzer or processor finds that a document is unavailable to the DMS (i.e. file was deleted, DMS is offline, etc...), those documents are moved to UNAVAILABLE (they used to be moved to DELETED) - see
- New feature: Ability to run a scheduler task by clicking 'Run now' in the scheduler configuration screen
- Complete overhaul of how the scheduler works - we now run tasks at a specific time, instead of running at a certain frequency between a start and stop time
- Breaking change: some existing scheduler entries will show as "Operation (XXXXXX) not available" - this will definitely happen for RUNFINDER_SPIDER and RUNFINDER_QUERY, which are no longer part of the scheduler
- Errors when unable to connect to Worldox are more readable
- Moved to new license feature scheme (based on letter codes - see http://forum.trumpetinc.local/viewtopic.php?f=2&t=885&p=7154&hilit=symphony+ocr+license#p7192 )
- Bug fix - documents could wind up in a DELETED list
- Bug fix - retained backups weren't being purged for all documents (if documents weren't on the PROCESSED list, they wouldn't have backups purged).  Now the purge considers all document lists.
- The version of S-OCR that actually processed a document is now embedded in the PDF.  If we re-analyze the PDF, that version will be displayed in the document details under the 'Marked' heading
- S-OCR now uses 'append mode' when adding invisible text to PDF files.  In append mode, the changes to the PDF are added to the end of the original PDF file.  This makes it possible to completely recover the original PDF by just deleting the last XXX bytes off the file.  Testing shows that the file size is not adversely impacted by this (in many cases, the resulting file is actually smaller)
- Bug fix - SOCR was OCRing documents that had been configured to allow Adobe Reader Form Filling (in the process of doing this, the form is no longer savable by Reader).  The Analyzer will now detect this and move the file to encrypted/restricted.
- Move to itext-5.4.1-20130310.jar - adds ability to see if a PDF has usage rights (Reader form filling enabled)
- Complete overhaul of the web pages that make up the UI - trying to make them much more consistent
- Change in behavior when processing TIF files - we now preserve the document record (we just change the document path in the record).  In the past, we created a new Document record for the resulting PDF, and left the record for the TIF file hanging around.  This meant a lot of bookkeeping with keeping track of which document got converted to which other document.  This has all been stripped clean.
- Document backup restore has been rewritten so the restored document record is preserved if the file extension is different (we used to create a new document record)
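
As a rough illustration of the short trial license fix listed first under 5.4.1, the choice of reset point for backlog throttling might be expressed as follows. The function and parameter names are hypothetical; only the 90 day threshold and the two candidate dates come from the note.

```python
from datetime import date


def throttling_reset_point(license_start, license_expiration, today):
    """Pick the date the throttling calculation treats as the license reset."""
    if (license_expiration - license_start).days < 90:
        # Short (trial) license: use the license's own expiration date.
        return license_expiration
    # Otherwise: December 31 of the current year, as before.
    return date(today.year, 12, 31)
```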

5.3.13 -

- Bug fix - cleaner was not cleaning files that had been split by Symphony Profiler.  This has now been fixed - any documents in the Processed queue will be re-analyzed and re-cleaned again

5.3.12 -

Added logs/cleaned_files.log output with details of every file that is cleaned

5.3.11 -

Critical bug fix - Symphony OCR was misplacing OCRed text on pages that had to be rotated during processing.  In most cases, the text was placed completely off the page, making it possible to search *for* a document, but not search within the page (or copy text from the page).  This issue was introduced by a change in a 3rd party library on 6/20/2012 in S-OCR version 5.2.58.  This issue is now fixed.

After applying this update, any site currently running 5.2.58 through 5.3.8 will automatically enter a special recovery mode.  In this mode, all documents processed since 6/20/2012 will be investigated to see if any pages have text that lies outside the visual page boundaries.  If so, the invisible text on those pages will be removed, and the page will be re-OCRed.  This operation will not count against the site's annual page processing count.

The cleaning operation can be triggered manually by re-analyzing any document in the Processed list.  If the document is identified as having the problem, it will be moved to a new Backlog list named 'Cleaning'.  A special module ('Cleaner') will process this list.  The Cleaner module can be manually stopped and started from the Advanced screen.
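
The off-page check described above can be pictured as a simple rectangle test. The following is a minimal sketch in Python, not Symphony OCR's actual code: it assumes each run of invisible text is available as a bounding box in page coordinates, and the `Box` type, the function names and the `tolerance` parameter are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in page units: (left, bottom) to (right, top)."""
    left: float
    bottom: float
    right: float
    top: float


def is_off_page(text: Box, page: Box, tolerance: float = 1.0) -> bool:
    """Return True if the text box extends past the page by more than tolerance.

    The tolerance (an assumed value) keeps text that sits right on the page
    edge from being flagged.
    """
    return (
        text.left < page.left - tolerance
        or text.bottom < page.bottom - tolerance
        or text.right > page.right + tolerance
        or text.top > page.top + tolerance
    )


def pages_needing_cleaning(pages):
    """pages: iterable of (page_number, page_box, list_of_text_boxes).

    Returns the page numbers that have at least one text box off the page.
    """
    return [
        number
        for number, page_box, text_boxes in pages
        if any(is_off_page(t, page_box) for t in text_boxes)
    ]
```

In this sketch, a document with any page returned by `pages_needing_cleaning` would be a candidate for the 'Cleaning' list.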

5.3.10 -

Bug fix - Analyzer was identifying pages that had text right on the edge as being candidates for cleaning

Automatically give the client "bonus" processing capacity for any pages that we wind up cleaning  

Heartbeat now includes # of bonus pages for the current year (if > 0)

5.3.9 -

Critical bug fix - Symphony OCR was misplacing OCRed text on pages that had to be rotated during processing.  In most cases, the text was placed completely off the page, making it possible to search *for* a document, but not search within the page (or copy text from the page).  This issue was introduced by a change in a 3rd party library on 6/20/2012 in S-OCR version 5.2.58.  This issue is now fixed.

Any site currently running 5.2.58 through 5.3.8 will automatically enter a special recovery mode.  In this mode, all documents processed since 6/20/2012 will be investigated to see if any pages have text that lies outside the visual page boundaries.  If so, the invisible text on those pages will be removed, and the page will be re-OCRed.

The cleaning operation can be triggered manually by re-analyzing any document in the Processed list.  If the document is identified as having the problem, it will be moved to a new Backlog list named 'Cleaning'.  A special module ('Cleaner') will process this list.  The Cleaner module can be manually stopped and started from the Advanced screen.

For most sites, the automatic recovery mode should take care of everything without user involvement.

Added automatic detection of files that had misplaced OCR text.   

5.3.8 -

Bug fix - If pages had to be rotated (CW, CCW or upside down scans) during processing, the resultant invisible text was not being placed properly on the page (in many cases, the text was entirely off the page).  Introduced in version 5.2.58 (when we moved to Abbyy 10).  This fix does not address documents that have already been scanned - we are working on that.

5.3.7 -

Bug fix - Analyze Worldox Profile Groups was presenting totals for all documents, not just the OCR backlog

5.3.6 -

Advanced screen now has Analyze Worldox Profile Groups command - presents a summary table with the total number of documents and pages in each profile group

5.3.5 -

Added support for files with really long filenames (>260 characters) (due to limitations in Worldox, files must still have 8.3 filename equivalents)

5.3.3 -

Added /wait=X command line argument (wait X seconds before launching) - for example, /wait=5 will wait 5 seconds before launching the application
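
To illustrate how an argument of this shape might be handled, here is a minimal Python sketch. It is an illustration only, not the S-OCR launcher's code: the function name is hypothetical, and the case-insensitive match and the choice to ignore malformed or negative values are assumptions.

```python
def parse_wait_seconds(args):
    """Return the delay requested by a /wait=X argument, or 0 if none is usable."""
    for arg in args:
        name, sep, value = arg.partition("=")
        # Assumption: the switch name is matched case-insensitively.
        if name.lower() == "/wait" and sep:
            try:
                # Assumption: negative values are treated as no delay.
                return max(0, int(value))
            except ValueError:
                # Assumption: a malformed value is ignored rather than fatal.
                return 0
    return 0
```

A launcher could call `time.sleep(parse_wait_seconds(sys.argv[1:]))` before starting the application.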

5.3.2 -

Bug fix - small corruption in library caused S-OCR to crash in some rare cases (legacy profile groups)
Updated jWDAPI.jar and jWDAPI.dll - 20121005


Summary 5.2

  • Better processing of documents with over 1,000 pages
  • Improved OCR status messages
  • Several bug fixes and enhancements



Added additional 80004005 message checks: "This image file format is not supported", "Unknown error while opening"
Bug fix: Files were being left behind after processing, causing the Symphony PC's disks to fill up - this was introduced in 5.2.58 - we recommend that any site on 5.2.58 or higher update

5.2.76 -

Error 0x80004005 with error output of "Invalid PDF file" or "PDF data is corrupted" now causes the document to get moved to the Corrupted list, instead of the Needs Attention list
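
The routing rule in this entry can be sketched as a small lookup. This Python example is illustrative only: the function name and the substring matching are assumptions, and sending every other error to Needs Attention is a simplification rather than documented behavior.

```python
# Error output fragments that this entry says indicate a corrupted PDF.
CORRUPTED_MARKERS = ("Invalid PDF file", "PDF data is corrupted")


def target_list(error_code: int, error_output: str) -> str:
    """Pick the document list for a failed OCR attempt."""
    if error_code == 0x80004005 and any(
        marker in error_output for marker in CORRUPTED_MARKERS
    ):
        return "Corrupted"
    # Assumption: anything else still lands in Needs Attention.
    return "Needs Attention"
```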

5.2.75 -

Bug fix: Crashes and documents winding up in Needs Attention list when processing really big files (> 1000 pages)

5.2.74 -

Bug fix: Some errors during OCR could result in an on-screen crash dialog (MaestroOCRProcess has encountered a problem and needs to close).  This dialog prevents further processing until Close is clicked.

5.2.73 -

/maestro/do/status screen now includes a line that says whether the backlog is "large" or not (>2000 pages)

5.2.72 -

Bug fix: some documents that were prevented from being modified by NTFS security weren't being moved to the INACCESSIBLE list (they were being placed on the REPROCESS list in a continuous loop)

5.2.71 -

If we fail to open a file for analysis (e.g. because of file system security), we now discard previous analysis results (this will, hopefully, trap the case where document security was changed AFTER we did analysis - until now, we were using cached analysis results, so we didn't see that the security had changed)
Bug fix: In some older sites, the processing priority for a document was set to 0 - this caused an infinite loop that prevented those documents from ever getting processed

5.2.70 -

Bug fix: estimated time to process backlog calculation displayed incorrectly if the time was less than a day, but more than an hour
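
For reference, the range mentioned here (more than an hour, less than a day) is one of three cases an estimate formatter has to handle. The sketch below is a hypothetical Python formatter, not the code S-OCR uses, and the output wording is invented for illustration.

```python
def format_estimate(total_seconds: int) -> str:
    """Format a backlog estimate as days/hours, hours/minutes, or minutes."""
    minutes = total_seconds // 60
    days, remainder = divmod(minutes, 24 * 60)
    hours, mins = divmod(remainder, 60)
    if days:
        return f"{days} day(s), {hours} hour(s)"
    if hours:
        # The case from this entry: less than a day, but more than an hour.
        return f"{hours} hour(s), {mins} minute(s)"
    return f"{mins} minute(s)"
```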

5.2.69 -

Bug fix: TIF files were being placed in the Needs Attention list with error "Adding text to image failed - ASDFHKWSEEWQI\Trumpet\Symphony.....\image_1.tif not found".  Also, the Processor could show "TifImagePath is empty" errors.

5.2.67 -

Bug fix: Text content of files OCRed *after* they were picked up by the text indexer wasn't being added to the text indexes during nightly rebuilds

5.2.66 -

License status will now show an error if the license doesn't specify the allowed number of pages

5.2.65 -

Fix null pointer exception when displaying Processor config in some corner case scenarios

5.2.64 -

Bug fix - null pointer exception if the document factory was closed due to an error and Symphony was then quit (this could prevent Symphony from quitting)
Bug fix - Compact database now purges document records that have the same path  

5.2.63 -

Bug fix - "StringLong key would have exceeded 300K" error messages in logs
New /maintenance command line argument - forces a full compact of the database as S-OCR launches (may be useful in cases where database corruption prevents S-OCR from launching)

5.2.61 -

Improved OCR status messages - they will now display the page that is being worked on (if a page number is available)

5.2.60 -

Bug fix - [0x80020005] - ERROR: Type mismatch error when processing documents that contain barcodes

5.2.58 -

Bug fix: "Comparison method violates its general contract!" error when analyzing some PDF files
Changed MaestroOCRProcess.exe to be SymphonyOCRProcess.exe

5.2.56 -

Bug fix - when compacting database, if finder process was running, the system would lock up after the compact completed

5.2.55 -

Added Compact Database scheduled task to scheduler - by default it runs at midnight on Wednesday

5.2.54 -

Added database status to heartbeat
Added database rebuild progress messages to the advanced screen

5.2.53 -

Heartbeats will be sent even if license is invalid
Heartbeat includes information about license expiration, etc.
Changed license configuration screen so error and warning messages are more prominent
Added a license warning (to the user interface and the heartbeats) if the license will expire in the next 30 days
Bug fix: If no PGs are selected, system status should show ERROR
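
The 30-day license warning listed above amounts to a date comparison. Here is a minimal Python sketch under stated assumptions: the function name and the returned status strings are hypothetical, and treating a license that expires today as still valid is a guess.

```python
from datetime import date, timedelta

WARNING_WINDOW = timedelta(days=30)  # "will expire in the next 30 days"


def license_status(expires_on: date, today: date) -> str:
    """Classify a license as 'expired', 'warning' or 'ok'."""
    if expires_on < today:
        return "expired"
    if expires_on - today <= WARNING_WINDOW:
        # Within the warning window: surface in the UI and in heartbeats.
        return "warning"
    return "ok"
```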

5.2.51 -

Fix for "TIFF files winding up in Needs Attention" list - special handling for old-style TIFF files that use an unsupported JPEG compression format

5.2.50 -

Make sample image have modified date of 'today'
During install, we now log the version of installer to Trumpet-UpdateHistory.txt
Admin Guide link now points to Trumpet knowledge book instead of PDF
If the Windows user doesn't have sufficient permissions for OCR engine to work, we now produce meaningful error messages

5.2.48 -

Better handling of partially corrupted database indexes during rebuilds (should fix Java memory/heap space errors on rebuilds in some situations)

5.2.47 -

Added 'Ignore Specified' button to Advanced screen (allows bulk setting of Ignore items)

5.2.46 -

REPROCESS queue is automatically re-processed on launch
Advanced screen addition - you can now put in a bunch of file paths and click 'Process Specified Documents Immediately' to get them all to process at higher priority than other documents.  If the document is NEW or REPROCESS, it'll get moved to the PREPROCESS (analysis) queue, otherwise it stays in the queue it's in already (but with adjusted priority).  If the document is in a Post-Process document list, nothing will happen to it.
To Analyze queue is now ordered by 'Process Priority' (instead of when the document was found)
Improvements to user interface responsiveness (trying to eliminate problems where the user clicks a link, and nothing happens)
Adjusted labels on welcome screen to indicate that times to process backlog are *estimates*
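
The 'Process Specified Documents Immediately' rules above can be summarized as a small state change. This Python sketch is an illustration, not the product's implementation: the `Document` type, the membership of `POST_PROCESS_LISTS` and the priority values are hypothetical placeholders.

```python
from dataclasses import dataclass

# Assumption: the exact set of Post-Process lists is not given in this entry;
# these names are placeholders.
POST_PROCESS_LISTS = frozenset({"PROCESSED", "CORRUPTED", "INACCESSIBLE"})
URGENT_PRIORITY = 1  # hypothetical value meaning "ahead of other documents"


@dataclass
class Document:
    path: str
    queue: str
    priority: int = 100  # hypothetical default priority


def process_immediately(doc: Document) -> Document:
    """Apply the rules for a document named on the Advanced screen."""
    if doc.queue in POST_PROCESS_LISTS:
        return doc  # nothing happens to post-processed documents
    if doc.queue in ("NEW", "REPROCESS"):
        doc.queue = "PREPROCESS"  # move to the analysis queue
    # Other queues keep their place but get the adjusted priority.
    doc.priority = URGENT_PRIORITY
    return doc
```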

5.2.45 -

Corrupted documents will now be reprocessed immediately after applying a code update
Added a new /maestro/do/status page that gives a plain text status of the system (useful for parsing by monitoring software)

5.2.44 -

Fix for "Colorspace not supported" and "Color depth not supported" corrupted PDF warnings

5.2.43 -

Fix for database index corruption issues (this update will trigger a rebuild of the queue metric tracker)
Optimize performance when doing bulk document state changes
Fix for 'Dictionary key is not a name' corrupted file issue
Adjusted HTML layout of UI screen so it renders properly on old versions of IE

5.2.42 -

Clicking on the backlog summary graph gives a larger image with all weeks (instead of just 104 weeks)
Fixed 'Unsupported color space' PDF corruption errors

5.2.40 -

Maintenance screen now refreshes once per second
Maintenance screen now has better progress messages
Added an animation to the maintenance screen

5.2.39 -

Adjusted graph y-axis so it automatically displays a more pleasing scale
Display a blank graph if no data is available

5.2.38 -

Fixed issue with 0 sized files showing ArrayIndexOutOfBoundsException: 0 error

5.2.35 -

Adjustments to Check for Updates to allow user to get latest production or pre-release updates

5.2.34 -

Fix 'Page N is corrupted - NNN' message on files in the corrupted document list

5.2.30 -

High performance support for huge files (>2GB)

5.2.29 -

Bug fix - Interruptions to pre-processing were causing the document to show as corrupted - the document will now be moved to Reprocess
Bug fix - exceptions during processing resulted in file being left in In Process queue


5.2.28 -

We have found a regression issue introduced in version 5.2.26 that can (in some rare cases) result in corrupted S-OCR database indexes.  This corruption can result in S-OCR refusing to launch.  We recommend that anyone currently running 5.2.26 or 5.2.27 update to 5.2.28 immediately.

  • Fix bug that could result in database index corruption
  • Force rebuild of database indexes for older sites (correct issues caused by the database index corruption bug) - this could cause 'Maintaining Indexes' screen to display for a little while during the first launch - just be patient!
  • Misc. bug fixes and tweaks

© 2012 Trumpet, Inc., All Rights Reserved