DataONE CN Index Task Processor
-------------------------------

This component contains the processing logic for building the main Solr/Lucene search index for DataONE-registered objects. This includes:

(a) parsing definitions for all object formats,
(b) a task processing daemon with logic for pulling and processing index tasks,
(c) a Solr client for communicating with Solr and managing conflicts.

The processor is a flexible, Spring-enabled framework that makes it easy to plug in parsing definitions for new object formats and to execute multiple parsers ("subprocessors") for any object format. It works by maintaining a collection of subprocessors, each of which declares which objectFormats it can process. This logic is encapsulated in SolrIndexService and SolrIndexServiceV2.

This project is a consumer of the IndexTask repository provided by the cn_index_common component. The generator's Spring configuration (application-context.xml) imports task-index-context.xml to provide repository access.

The production configuration context file (classpath:processor-daemon-context.xml) adds production references to external configuration, including jdbc.properties and solr.properties.

See the test classes under src/test for example usage and test-context.xml for test runtime configuration.

The integration test, IndexTaskProcessingIntegrationTest, makes use of the generator daemon and the processor daemon classes. Since these classes load configuration from a production environment, configuration should be present in /etc/dataone/index (deployed via cn-buildout/dataone-cn-index). This includes jdbc.properties, hazelcast.xml, and solr.properties.

Package design: the index processor has 4 major components:

1. task prioritization and execution: org.dataone.cn.index.processor
2. document merging logic for processing updates: org.dataone.cn.indexer.SolrIndexService, SolrIndexServiceV2
3. format-specific parsers that convert metadata, systemMetadata, resourceMap (and annotation and provenance) files into Solr record format: org.dataone.cn.indexer.parser, plus Spring bean configurations in src/main/resources and cn-buildout
4. Solr communication utilities (classes for creating, sending, and receiving Solr records): org.dataone.cn.indexer.solrhttp

Logic in the processor package calls the solrIndexService, which calls the parsers, merges changes, and sends the updates with the Solr communication client.

See LICENSE.txt for the details of distributing this software.
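
Illustrative sketch of subprocessor dispatch: the following is a minimal, self-contained Java sketch of the idea described above, where each subprocessor declares which objectFormats it can process and the fields from every matching subprocessor are merged into one record. It is not the project's actual code. The names Subprocessor, SingleFieldSubprocessor, buildDocument, and the format identifiers "format-a" and "format-b" are all hypothetical, and the real SolrIndexService may differ substantially in structure and in how it merges updates.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class SubprocessorDispatchSketch {

    /** Hypothetical stand-in for a parser; not the project's actual interface. */
    public interface Subprocessor {
        boolean canProcess(String formatId);
        Map<String, String> process(String content);
    }

    /** Toy subprocessor: declares its formats up front and emits one named field. */
    public static final class SingleFieldSubprocessor implements Subprocessor {
        private final Set<String> formats;
        private final String fieldName;

        public SingleFieldSubprocessor(String fieldName, String... formats) {
            this.fieldName = fieldName;
            this.formats = new HashSet<>(Arrays.asList(formats));
        }

        @Override
        public boolean canProcess(String formatId) {
            return formats.contains(formatId);
        }

        @Override
        public Map<String, String> process(String content) {
            Map<String, String> fields = new LinkedHashMap<>();
            fields.put(fieldName, content);
            return fields;
        }
    }

    /** Runs every subprocessor that accepts the format and merges their fields. */
    public static Map<String, String> buildDocument(List<Subprocessor> subprocessors,
                                                    String id,
                                                    String formatId,
                                                    String content) {
        Map<String, String> merged = new LinkedHashMap<>();
        merged.put("id", id);
        for (Subprocessor subprocessor : subprocessors) {
            if (subprocessor.canProcess(formatId)) {
                // Later subprocessors overwrite earlier ones on a field-name clash.
                merged.putAll(subprocessor.process(content));
            }
        }
        return merged;
    }

    /** Two example subprocessors with overlapping (made-up) format identifiers. */
    public static List<Subprocessor> exampleSubprocessors() {
        List<Subprocessor> list = new ArrayList<>();
        list.add(new SingleFieldSubprocessor("title", "format-a"));
        list.add(new SingleFieldSubprocessor("text", "format-a", "format-b"));
        return list;
    }

    public static void main(String[] args) {
        System.out.println(
            buildDocument(exampleSubprocessors(), "doc.1", "format-a", "Example"));
    }
}
```

In this sketch an object of "format-a" is handled by both subprocessors, one of "format-b" by only the second, and an unrecognized format yields a record containing just the id. Keeping the format check inside each subprocessor means a new format can be supported by registering another subprocessor without changing the dispatch loop, which is the kind of plugin behavior the Spring bean configuration is described as providing.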