DataONE CN Index Task Processor
-------------------------------

This component contains the processing logic for building the main Solr/Lucene search index 
for DataONE-registered objects.  This includes: 
(a) parsing definitions for all object formats, 
(b) a task processing daemon with logic for pulling and processing index tasks, and
(c) a Solr client for communicating with Solr and managing conflicts.

The processor is a flexible, Spring-enabled framework that makes it easy to plug in parsing definitions
for new object formats and to execute multiple parsers ("subprocessors") for any object format.
It works by maintaining a collection of subprocessors, each of which declares
which objectFormats it can process. This logic is encapsulated in SolrIndexService
and SolrIndexServiceV2.
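
The subprocessor idea can be illustrated with a minimal sketch. This is not the project's
actual API: the names Subprocessor, FixedFieldSubprocessor, and SubprocessorRegistry, and the
method signatures, are hypothetical and chosen only to show how a collection of subprocessors
that each declare their supported objectFormats might be applied to one object.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical interface; not the project's actual subprocessor API.
interface Subprocessor {
    // True if this subprocessor handles the given objectFormat identifier.
    boolean canProcess(String formatId);

    // Returns the index fields this subprocessor extracts from the content.
    Map<String, String> process(String pid, String content);
}

// Illustrative subprocessor that declares its formats as a fixed set
// and copies the raw content into a single named field.
class FixedFieldSubprocessor implements Subprocessor {
    private final Set<String> formats;
    private final String fieldName;

    FixedFieldSubprocessor(Set<String> formats, String fieldName) {
        this.formats = formats;
        this.fieldName = fieldName;
    }

    @Override
    public boolean canProcess(String formatId) {
        return formats.contains(formatId);
    }

    @Override
    public Map<String, String> process(String pid, String content) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put(fieldName, content);
        return fields;
    }
}

// Holds the collection of subprocessors and runs every one that matches.
class SubprocessorRegistry {
    private final List<Subprocessor> subprocessors = new ArrayList<>();

    void register(Subprocessor subprocessor) {
        subprocessors.add(subprocessor);
    }

    // Builds one record by combining the output of all matching subprocessors.
    Map<String, String> index(String pid, String formatId, String content) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("id", pid);
        for (Subprocessor subprocessor : subprocessors) {
            if (subprocessor.canProcess(formatId)) {
                doc.putAll(subprocessor.process(pid, content));
            }
        }
        return doc;
    }
}
```

In this sketch, registering a second subprocessor for an existing format adds fields to the
same record without changing the first one, which is the property the plugin design relies on.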

This project is a consumer of the IndexTask repository provided by the cn_index_common component.
The generator's Spring configuration (application-context.xml) imports task-index-context.xml to
provide repository access.

The production configuration context file (classpath:processor-daemon-context.xml) adds production
references to external configuration, including jdbc.properties and solr.properties.


See the test classes under src/test for example usage and test-context.xml for test runtime configuration.
The integration test, IndexTaskProcessingIntegrationTest, makes use of the generator daemon and the processor
daemon classes.  Since these classes load configuration from a production environment, configuration should
be present in /etc/dataone/index (deployed via cn-buildout/dataone-cn-index).  This includes jdbc.properties, hazelcast.xml, and solr.properties.


Package design:

The index processor has four major components:
1. task prioritization and execution:  org.dataone.cn.index.processor

2. document merging logic for processing updates:  org.dataone.cn.indexer.SolrIndexService, SolrIndexServiceV2

3. format-specific parsers that convert metadata, systemMetadata, and resourceMap files
   (as well as annotation and provenance files) into Solr record format:  org.dataone.cn.indexer.parser and
   Spring bean configurations in src/main/resources and cn-buildout

4. Solr communication utilities:  classes for creating, sending, and receiving Solr records:
   org.dataone.cn.indexer.solrhttp

Logic in the processor package calls the SolrIndexService, which calls the parsers, merges changes,
and sends the updates with the Solr communication client.
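
The call flow described above can be sketched roughly as follows. Every name here (SketchTask,
InMemorySolrClient, SketchIndexService, SketchProcessor) is hypothetical and stands in for the
real classes; the priority ordering (lower value first) and the merge rule (newly parsed fields
override existing ones) are assumptions made for illustration, not the project's documented behavior.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.function.Function;

// Hypothetical task; the real IndexTask is provided by cn_index_common.
class SketchTask {
    final String pid;
    final int priority;   // assumption: lower value is processed first
    final String content;

    SketchTask(String pid, int priority, String content) {
        this.pid = pid;
        this.priority = priority;
        this.content = content;
    }
}

// Stand-in for the Solr communication client: an in-memory map keyed by id.
class InMemorySolrClient {
    private final Map<String, Map<String, String>> records = new HashMap<>();

    Map<String, String> fetch(String id) {
        return records.get(id);
    }

    void send(String id, Map<String, String> doc) {
        records.put(id, doc);
    }
}

// Stand-in for the index service: parse, merge with the existing record, send.
class SketchIndexService {
    private final Function<String, Map<String, String>> parser;
    private final InMemorySolrClient client;

    SketchIndexService(Function<String, Map<String, String>> parser,
                       InMemorySolrClient client) {
        this.parser = parser;
        this.client = client;
    }

    void index(SketchTask task) {
        Map<String, String> merged = new HashMap<>();
        Map<String, String> existing = client.fetch(task.pid);
        if (existing != null) {
            merged.putAll(existing);               // keep fields already indexed
        }
        merged.putAll(parser.apply(task.content)); // assumption: new values win
        merged.put("id", task.pid);
        client.send(task.pid, merged);
    }
}

// Stand-in for the processor package: drains queued tasks in priority order.
class SketchProcessor {
    private final PriorityQueue<SketchTask> queue =
        new PriorityQueue<>(Comparator.comparingInt((SketchTask t) -> t.priority));
    private final SketchIndexService service;

    SketchProcessor(SketchIndexService service) {
        this.service = service;
    }

    void submit(SketchTask task) {
        queue.add(task);
    }

    // Processes every queued task and returns how many were handled.
    int processAll() {
        int handled = 0;
        SketchTask task;
        while ((task = queue.poll()) != null) {
            service.index(task);
            handled++;
        }
        return handled;
    }
}
```

The parser is passed in as a function so that the sketch stays independent of any particular
format; in the real component that role is played by the format-specific parsers.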


See LICENSE.txt for the details of distributing this software.