Harvester and Harvest List Editor
=================================

Metacat's Harvester is an optional feature that can be used to automatically retrieve EML documents from one or more custom data management systems (e.g., SRB or PostgreSQL) and to insert (or update) those documents in the home repository. The local sites control when they are harvested and which documents are harvested.

For example, the Long Term Ecological Research Network (LTER) uses the Metacat Harvester to create a centralized repository of data stored on twenty-six different sites that store EML metadata, but that use different data management systems. Once the data have been harvested and placed into a centralized repository, they are replicated to the KNB network, exposing the information to an even larger scientific community.

Once the Harvester is properly configured, listed documents are retrieved and uploaded on a regularly scheduled basis. You must configure both the home Metacat and the remote sites (also known as the "harvest sites") before using this feature. Local sites must also provide the Metacat server with a list of documents that should be harvested.

Configuring Harvester
---------------------

Before you can use the Harvester to retrieve documents, you must configure the feature using the settings in the metacat.properties file. Note that you must also configure each site that the Harvester will connect to and retrieve documents from (see section 7.2 for details).

The Harvester configuration information is managed in the metacat.properties file, which is located at::

  /WEB_INF/metacat.properties

The Harvester properties are grouped together and begin after the comment line::

  # Harvester properties

To configure Harvester, edit the metacat.properties file and set appropriate values for the harvesterAdministrator and smtpServer properties. You may also wish to customize the other Harvester parameters, each discussed in the table below.

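Before starting Harvester, it can be useful to confirm that the two properties named above actually have values. The following Python sketch is one way to do that, under stated assumptions: the property names are spelled as they appear on this page (the keys in a real metacat.properties file may be spelled or prefixed differently), the reader handles only plain ``key=value`` lines, and the sample text is hypothetical.

```python
# Property names as spelled on this page; the keys in a real
# metacat.properties file may be spelled or prefixed differently (assumption).
REQUIRED = ("harvesterAdministrator", "smtpServer")


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and '#' comments.

    This is a simplified reader, not a full Java .properties parser.
    """
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def missing_required(props):
    """Return the required property names that are absent or empty."""
    return [name for name in REQUIRED if not props.get(name)]


# Hypothetical file contents, used only for illustration.
sample = """\
# Harvester properties
harvesterAdministrator=name1@abc.edu
smtpServer=
"""

print(missing_required(parse_properties(sample)))  # prints ['smtpServer']
```

An empty result means both properties are set; any names returned still need a value.
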
Harvester Properties and their Functions
----------------------------------------

+------------------------+--------------------------------------------------------------------------+
| Property               | Description and Values                                                   |
+========================+==========================================================================+
| connectToMetacat       | Determines whether Harvester should connect to Metacat to upload         |
|                        | retrieved documents. Set to true (the default) under most                |
|                        | circumstances. To test whether Harvester can retrieve documents          |
|                        | from a site without actually connecting to Metacat to upload the         |
|                        | documents, set the value to false.                                       |
|                        |                                                                          |
|                        | Values: true/false                                                       |
+------------------------+--------------------------------------------------------------------------+
| delay                  | The number of hours that Harvester will wait before beginning its        |
|                        | first harvest. For example, if Harvester is run at 1:00 p.m. and the     |
|                        | delay is set to 12, Harvester will begin its first harvest at            |
|                        | 1:00 a.m.                                                                |
|                        |                                                                          |
|                        | Default: 0                                                               |
+------------------------+--------------------------------------------------------------------------+
| harvesterAdministrator | The email address of the Harvester Administrator. Harvester will send    |
|                        | email reports to this address after every harvest. Enter multiple        |
|                        | email addresses by separating each address with a comma or semicolon     |
|                        | (e.g., name1@abc.edu,name2@abc.edu).                                     |
|                        |                                                                          |
|                        | Values: An email address, or multiple email addresses separated by       |
|                        | commas or semicolons                                                     |
+------------------------+--------------------------------------------------------------------------+
| logPeriod              | The number of days to retain Harvester log entries. Harvester log        |
|                        | entries record information such as which documents were harvested,       |
|                        | from which sites, and whether any errors were encountered during the     |
|                        | harvest. Log entries older than logPeriod days are purged from the       |
|                        | database at the end of each harvest.                                     |
|                        |                                                                          |
|                        | Default: 90                                                              |
+------------------------+--------------------------------------------------------------------------+
| maxHarvests            | The maximum number of harvests that Harvester should execute before      |
|                        | shutting down. If the value of maxHarvests is set to 0 or a negative     |
|                        | number, Harvester will execute indefinitely.                             |
|                        |                                                                          |
|                        | Default: 0                                                               |
+------------------------+--------------------------------------------------------------------------+
| period                 | The number of hours between harvests. Harvester will run a new           |
|                        | harvest every specified period of hours (either indefinitely or until    |
|                        | the maximum number of harvests have run, depending on the value of       |
|                        | maxHarvests).                                                            |
|                        |                                                                          |
|                        | Default: 24                                                              |
+------------------------+--------------------------------------------------------------------------+
| smtpServer             | The SMTP server that Harvester uses for sending email messages to the    |
|                        | Harvester Administrator and Site Contacts (e.g.,                         |
|                        | somehost.institution.edu). Note that the default value only works if     |
|                        | the Harvester host machine is configured as an SMTP server.              |
|                        |                                                                          |
|                        | Default: localhost                                                       |
+------------------------+--------------------------------------------------------------------------+
| Harvester Operation    | The Harvester Operation properties are used by Harvester to report       |
| Properties             | information about performed operations for inclusion in log entries      |
| (GetDocError,          | and email messages. Under most circumstances the values of these         |
| GetDocSuccess, etc.)   | properties should not be modified.                                       |
+------------------------+--------------------------------------------------------------------------+

Configuring a Harvest Site (Instructions for Site Contact)
----------------------------------------------------------

After Metacat's Harvester has been configured, remote sites can register and send information about which files should be retrieved. Each remote site must have a site contact who is responsible for registering the site and creating a list of EML files to harvest (the "Harvest List"), as well as for reviewing harvest reports. The site contact can unregister the site from the Harvester at any time.

To use Harvester:

1. Register with Harvester
2. Compose a Harvest List (you will likely wish to use the Harvest List Editor)
3. Prepare your EML Documents for Harvest
4. Review the Harvester Reports

Register with Harvester
~~~~~~~~~~~~~~~~~~~~~~~

To register a remote site with Harvester, the Site Contact should log in to Metacat's Harvester Registration page and enter information about the site and how it should be harvested.

1. Using a Web browser, log in to Metacat's Harvester Registration page. The Harvester Registration page is inside the skins directory. For example, if the Metacat server that you wish to register with resides at the following URL:

   ::

     http://somehost.somelocation.edu:8080/metacat/index.jsp

   then the Harvester Registration page would be accessed at:

   ::

     http://somehost.somelocation.edu:8080/metacat/style/skins/default/harvesterRegistrationLogin.jsp

.. figure:: images/screenshots/image065.jpg
   :align: center

   Metacat's Harvester Registration page.

2. Enter your Metacat account information and click Submit to log in to your Metacat account from the Harvester Registration page.

   Note: In some cases, you may need to log in to an anonymous "site" account rather than your personal account so that the registered data will not appear to have been registered by a single user. For example, an information manager (jones) who is registering data created by a team of scientists (jones, smith, and barney) from the Georgia Coastal Ecosystems site might log in to a dedicated account (named with the site's acronym, "GCE") to indicate that the registered data is from the entire site rather than from "jones".

3. Enter information about your site and how often you want to schedule harvests and then click the Register button (Figure 7.2). The Harvest List URL should point to the location of the Harvest List, which is an XML file that lists the documents to harvest. If you do not yet have a Harvest List, please see the next section for more information about creating one.

.. figure:: images/screenshots/image067.jpg
   :align: center

   Enter information about your site and how often you want to schedule harvests.

The example settings in the previous figure instruct Harvester to harvest documents from the site once every two weeks. The Harvester will access the site's Harvest List at URL "http://somehost.institution.edu/~myname/harvestList.xml", and will send email reports to the Site Contact at email address "myname@institution.edu". Note that you can enter multiple email addresses by separating each address with a comma or a semicolon. For example, "myname@institution.edu,anothername@institution.edu".

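Two conventions from this section can be sketched in Python: where the registration page sits relative to a Metacat URL, and how a contact field holding several addresses is split. This is an illustrative sketch, not Metacat's own code. It assumes the ``default`` skin shown in the example URL and a Metacat URL that ends in a page name such as ``index.jsp``; both function names are hypothetical.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Path of the registration page below the Metacat context, taken from the
# example URL in this section; "default" is the skin name used there.
REGISTRATION_PATH = "style/skins/default/harvesterRegistrationLogin.jsp"


def registration_url(metacat_url):
    """Derive the registration page URL from a Metacat page URL."""
    parts = urlsplit(metacat_url)
    # Drop the final path segment (e.g. index.jsp) to get the context path.
    context = parts.path.rsplit("/", 1)[0]
    path = f"{context}/{REGISTRATION_PATH}"
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


def split_addresses(value):
    """Split a comma- or semicolon-separated list of email addresses."""
    return [addr.strip() for addr in re.split(r"[,;]", value) if addr.strip()]


print(registration_url("http://somehost.somelocation.edu:8080/metacat/index.jsp"))
print(split_addresses("myname@institution.edu,anothername@institution.edu"))
```

For the example server above, the first call yields the registration URL shown in step 1, and the second yields the two addresses as separate list items.
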
Compose a Harvest List (The Harvest List Editor)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The Harvest List is an XML file that contains a list of documents to be harvested. The list is created by the site contact and stored on the site contact's site at the location specified during the Harvester registration process (see previous section for details). The list can be generated by hand, or you can use Metacat's Harvest List Editor to automatically generate and structure the list to conform to the required XML schema (displayed in the figure at the end of this section). In this section we will look at what information is required when building a Harvest List, and how to configure and use the Harvest List Editor. Note that you must have a source distribution of Metacat in order to use the Harvest List Editor.

The Harvest List contains information that helps Metacat identify and retrieve each specified EML file. Each document in the list must be described with a docid, documentType, and documentURL (see table).

Table: Information that must be included in the Harvest List about each EML file

+--------------+--------------------------------------------------------------------------+
| Item         | Description                                                              |
+==============+==========================================================================+
| docid        | The docid uniquely identifies each EML document. Each docid consists     |
|              | of three elements:                                                       |
|              |                                                                          |
|              | - ``scope``: the document group to which the document belongs.           |
|              | - ``identifier``: a number that uniquely identifies the document         |
|              |   within the scope.                                                      |
|              | - ``revision``: a number that indicates the current revision.            |
|              |                                                                          |
|              | For example, a valid docid could be: demoDocument.1.5, where             |
|              | demoDocument represents the scope, 1 the identifier, and 5 the           |
|              | revision number.                                                         |
+--------------+--------------------------------------------------------------------------+
| documentType | The documentType identifies the type of document as EML, e.g.,           |
|              | "eml://ecoinformatics.org/eml-2.0.0".                                    |
+--------------+--------------------------------------------------------------------------+
| documentURL  | The documentURL specifies a place where Harvester can locate and         |
|              | retrieve the document via HTTP. The Metacat Harvester must be given      |
|              | read access to the contents at this URL, e.g.,                           |
|              | "http://www.lternet.edu/~dcosta/document1.xml".                          |
+--------------+--------------------------------------------------------------------------+

The example Harvest List below contains two elements that specify the information that Harvester needs to retrieve a pair of EML documents and upload them to Metacat.

::

  demoDocument 1 5 eml://ecoinformatics.org/eml-2.0.0 http://www.lternet.edu/~dcosta/document1.xml
  demoDocument 2 1 eml://ecoinformatics.org/eml-2.0.0 http://www.lternet.edu/~dcosta/document2.xml

Rather than formatting the list by hand, you may wish to use Metacat's Harvest List Editor to compose and edit it. The Harvest List Editor displays a Harvest List as a table of rows and fields. Each table row corresponds to a single element in the corresponding Harvest List file (i.e., one EML document). The row numbers are used only for visual reference and are not editable.

To add a new document to the Harvest List, enter values for all five editable fields (all fields except the "Row #" field). Partially filled-in rows will cause errors that will result in an invalid Harvest List.

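The same completeness rule applies when a Harvest List is produced by a script instead of the Editor. The Python sketch below checks that every entry has all five fields and then serializes the entries to XML. The field names ``docid``, ``scope``, ``identifier``, ``revision``, ``documentType``, and ``documentURL`` come from the table above. The root element name ``harvestList`` and the per-entry wrapper ``document`` are assumptions made for illustration, and the required schema may also call for an XML namespace that is omitted here, so compare the output with the schema before using it.

```python
import xml.etree.ElementTree as ET

# The five editable fields described in this section.
FIELDS = ("scope", "identifier", "revision", "documentType", "documentURL")


def parse_docid(docid):
    """Split a docid such as 'demoDocument.1.5' into its three parts."""
    scope, identifier, revision = docid.rsplit(".", 2)
    return scope, int(identifier), int(revision)


def incomplete_fields(row):
    """Return the names of fields that are missing or blank in a row."""
    return [name for name in FIELDS if not str(row.get(name, "")).strip()]


def build_harvest_list(rows):
    """Serialize complete rows to XML; reject partially filled-in rows."""
    root = ET.Element("harvestList")  # assumed root element name
    for row in rows:
        missing = incomplete_fields(row)
        if missing:
            raise ValueError(f"incomplete row, missing: {', '.join(missing)}")
        doc = ET.SubElement(root, "document")  # assumed wrapper element name
        docid = ET.SubElement(doc, "docid")
        for name in ("scope", "identifier", "revision"):
            ET.SubElement(docid, name).text = str(row[name])
        for name in ("documentType", "documentURL"):
            ET.SubElement(doc, name).text = str(row[name])
    return ET.tostring(root, encoding="unicode")


scope, identifier, revision = parse_docid("demoDocument.1.5")
rows = [{
    "scope": scope,
    "identifier": identifier,
    "revision": revision,
    "documentType": "eml://ecoinformatics.org/eml-2.0.0",
    "documentURL": "http://www.lternet.edu/~dcosta/document1.xml",
}]
print(build_harvest_list(rows))
```

Raising an error on an incomplete row mirrors the Editor's behavior described above: a partially filled-in entry is rejected before it can produce an invalid list.
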
The buttons at the bottom of the Editor can be used to Cut, Copy, and Paste rows from one location to another. Select a row and click the desired button, or paste the default values (which are specified in the Editor's configuration file, discussed later in this section) into the currently selected row by clicking the Paste Defaults button. Note: Only one row can be selected at any given time: all cut, copy, and paste operations work on only a single row rather than on a range of rows.

To run the Harvest List Editor on the machine where the Metacat source code is installed:

1. Open a system command window or terminal window.
2. Set the METACAT_HOME environment variable to the value of the Metacat installation directory. For example:

   ::

     export METACAT_HOME=/home/somePath/metacat

3. Change to the following directory:

   ::

     cd $METACAT_HOME/lib/harvester

4. Run the appropriate Harvester shell script, as determined by the operating system:

   ::

     sh runHarvestListEditor.sh

   The Harvest List Editor will open.

If you would like to customize the Harvest List Editor (e.g., specify a default list to open automatically whenever the editor is opened and/or default values), create a file called .harvestListEditor (note the leading dot character). Use a plain text editor to create the file and place the file in the Site Contact's home directory. To determine the home directory, open a system command window or terminal window and type the following:

To determine the home directory, open a system command window or terminal window and type the following:h]hXIf you would like to customize the Harvest List Editor (e.g., specify a default list to open automatically whenever the editor is opened and/or default values), create a file called .harvestListEditor (note the leading dot character). Use a plain text editor to create the file and place the file in the Site Contact’s home directory. To determine the home directory, open a system command window or terminal window and type the following:}(hjShjQhhhNhNubah}(h]h!]h#]h%]h']uh)h+hh*hMhjhhubh)}(h echo $HOMEh]h echo $HOME}(hhhj_ubah}(h]h!]h#]h%]h']hhuh)hhM%hjhhhh*ubh,)}(hThe configuration file contains a number of optional properties that can make using the Editor more convenient. A sample configure file is displayed below, and more information about each configuration property is contained in the table.h]hThe configuration file contains a number of optional properties that can make using the Editor more convenient. 
A sample configure file is displayed below, and more information about each configuration property is contained in the table.}(hjohjmhhhNhNubah}(h]h!]h#]h%]h']uh)h+hh*hM'hjhhubh,)}(h.A sample .harvestListEditor configuration fileh]h.A sample .harvestListEditor configuration file}(hj}hj{hhhNhNubah}(h]h!]h#]h%]h']uh)h+hh*hM+hjhhubh)}(hdefaultHarvestList=C:/temp/harvestList.xml defaultScope=demo_document defaultIdentifier=1 defaultRevision=1 defaultDocumentURL=http://www.lternet.edu/~dcosta/ defaultDocumentType=eml://ecoinformatics.org/eml-2.0.0h]hdefaultHarvestList=C:/temp/harvestList.xml defaultScope=demo_document defaultIdentifier=1 defaultRevision=1 defaultDocumentURL=http://www.lternet.edu/~dcosta/ defaultDocumentType=eml://ecoinformatics.org/eml-2.0.0}(hhhjubah}(h]h!]h#]h%]h']hhuh)hhM/hjhhhh*ubh,)}(h,Harvest List Editor Configuration Propertiesh]h,Harvest List Editor Configuration Properties}(hjhjhhhNhNubah}(h]h!]h#]h%]h']uh)h+hh*hM6hjhhubh)}(hhh]h)}(hhh](h)}(hhh]h}(h]h!]h#]h%]h']colwidthKuh)hhjubh)}(hhh]h}(h]h!]h#]h%]h']colwidthK^uh)hhjubj)}(hhh]j )}(hhh](j)}(hhh]h,)}(hPropertyh]hProperty}(hjhjubah}(h]h!]h#]h%]h']uh)h+hh*hM9hjubah}(h]h!]h#]h%]h']uh)jhjubj)}(hhh]h,)}(h Descriptionh]h Description}(hjhjubah}(h]h!]h#]h%]h']uh)h+hh*hM9hjubah}(h]h!]h#]h%]h']uh)jhjubeh}(h]h!]h#]h%]h']uh)j hjubah}(h]h!]h#]h%]h']uh)jhjubjU)}(hhh](j )}(hhh](j)}(hhh]h,)}(hdefaultHarvestListh]hdefaultHarvestList}(hj hjubah}(h]h!]h#]h%]h']uh)h+hh*hM;hjubah}(h]h!]h#]h%]h']uh)jhjubj)}(hhh](h,)}(hThe location of a Harvest List file that the Editor will automatically open for editing on startup. Set this property to the path to the Harvest List file that you expect to edit most frequently.h]hThe location of a Harvest List file that the Editor will automatically open for editing on startup. 
Set this property to the path to the Harvest List file that you expect to edit most frequently.}(hj!hjubah}(h]h!]h#]h%]h']uh)h+hh*hM;hjubh,)}(hPExamples: ``/home/jdoe/public_html/harvestList.xml`` ``C:/temp/harvestList.xml``h](h Examples: }(h Examples: hj-ubj)}(h*``/home/jdoe/public_html/harvestList.xml``h]h&/home/jdoe/public_html/harvestList.xml}(hhhj6ubah}(h]h!]h#]h%]h']uh)jhj-ubh }(h hj-ubj)}(h``C:/temp/harvestList.xml``h]hC:/temp/harvestList.xml}(hhhjIubah}(h]h!]h#]h%]h']uh)jhj-ubeh}(h]h!]h#]h%]h']uh)h+hh*hM?hjubeh}(h]h!]h#]h%]h']uh)jhjubeh}(h]h!]h#]h%]h']uh)j hjubj )}(hhh](j)}(hhh]h,)}(h defaultScopeh]h defaultScope}(hjqhjoubah}(h]h!]h#]h%]h']uh)h+hh*hMChjlubah}(h]h!]h#]h%]h']uh)jhjiubj)}(hhh](h,)}(hThe value pasted into the Editor's Scope field when the Paste Defaults button is clicked. The Scope field should contain a symbolic identifier that indicates the family of documents to which the EML document belongs.h]hThe value pasted into the Editor’s Scope field when the Paste Defaults button is clicked. The Scope field should contain a symbolic identifier that indicates the family of documents to which the EML document belongs.}(hjhjubah}(h]h!]h#]h%]h']uh)h+hh*hMChjubh,)}(h*Example: xyz_dataset Default: dataseth]h*Example: xyz_dataset Default: dataset}(hjhjubah}(h]h!]h#]h%]h']uh)h+hh*hMHhjubeh}(h]h!]h#]h%]h']uh)jhjiubeh}(h]h!]h#]h%]h']uh)j hjubj )}(hhh](j)}(hhh]h,)}(hdefaultIdentiferh]hdefaultIdentifer}(hjhjubah}(h]h!]h#]h%]h']uh)h+hh*hMKhjubah}(h]h!]h#]h%]h']uh)jhjubj)}(hhh]h,)}(hThe value pasted into the Editor's Identifier field when the Paste Defaults button is clicked. The Scope field should contain a numeric value indicating the identifier for this particular EML document within the Scope.h]hThe value pasted into the Editor’s Identifier field when the Paste Defaults button is clicked. 
The Scope field should contain a numeric value indicating the identifier for this particular EML document within the Scope.}(hjhjubah}(h]h!]h#]h%]h']uh)h+hh*hMKhjubah}(h]h!]h#]h%]h']uh)jhjubeh}(h]h!]h#]h%]h']uh)j hjubj )}(hhh](j)}(hhh]h,)}(hdefaultRevisionh]hdefaultRevision}(hjhjubah}(h]h!]h#]h%]h']uh)h+hh*hMOhjubah}(h]h!]h#]h%]h']uh)jhjubj)}(hhh](h,)}(hThe value pasted into the Editor's Revision field when the Paste Defaults button is clicked. The Scope field should contain a numeric value indicating the revision number of this EML document within the Scope and Identifier.h]hThe value pasted into the Editor’s Revision field when the Paste Defaults button is clicked. The Scope field should contain a numeric value indicating the revision number of this EML document within the Scope and Identifier.}(hj hj ubah}(h]h!]h#]h%]h']uh)h+hh*hMOhjubh,)}(hExample: 2 Default: 1h]hExample: 2 Default: 1}(hj hj ubah}(h]h!]h#]h%]h']uh)h+hh*hMShjubeh}(h]h!]h#]h%]h']uh)jhjubeh}(h]h!]h#]h%]h']uh)j hjubj )}(hhh](j)}(hhh]h,)}(hdefaultDocumentTypeh]hdefaultDocumentType}(hj2 hj0 ubah}(h]h!]h#]h%]h']uh)h+hh*hMVhj- ubah}(h]h!]h#]h%]h']uh)jhj* ubj)}(hhh](h,)}(hvThe document type specification pasted into the Editor's DocumentType field when the Paste Defaults button is clicked.h]hxThe document type specification pasted into the Editor’s DocumentType field when the Paste Defaults button is clicked.}(hjI hjG ubah}(h]h!]h#]h%]h']uh)h+hh*hMVhjD ubh,)}(h/Default: ``eml://ecoinformatics.org/eml-2.0.0``h](h Default: }(h Default: hjU ubj)}(h&``eml://ecoinformatics.org/eml-2.0.0``h]h"eml://ecoinformatics.org/eml-2.0.0}(hhhj^ ubah}(h]h!]h#]h%]h']uh)jhjU ubeh}(h]h!]h#]h%]h']uh)h+hh*hMYhjD ubeh}(h]h!]h#]h%]h']uh)jhj* ubeh}(h]h!]h#]h%]h']uh)j hjubj )}(hhh](j)}(hhh]h,)}(hdefaultDocumentURLh]hdefaultDocumentURL}(hj hj ubah}(h]h!]h#]h%]h']uh)h+hh*hM[hj ubah}(h]h!]h#]h%]h']uh)jhj~ ubj)}(hhh](h,)}(hThe URL or partial URL pasted into the Editor's URL field when the Paste Defaults button is clicked. 
Typically, this value is set to the portion of the URL shared by all harvested EML documents.h]hThe URL or partial URL pasted into the Editor’s URL field when the Paste Defaults button is clicked. Typically, this value is set to the portion of the URL shared by all harvested EML documents.}(hj hj ubah}(h]h!]h#]h%]h']uh)h+hh*hM[hj ubh,)}(hKExample: ``http://somehost.institution.edu/somepath/`` Default: ``http://``h](h Example: }(h Example: hj ubj)}(h-``http://somehost.institution.edu/somepath/``h]h)http://somehost.institution.edu/somepath/}(hhhj ubah}(h]h!]h#]h%]h']uh)jhj ubh Default: }(h Default: hj ubj)}(h ``http://``h]hhttp://}(hhhj ubah}(h]h!]h#]h%]h']uh)jhj ubeh}(h]h!]h#]h%]h']uh)h+hh*hM_hj ubeh}(h]h!]h#]h%]h']uh)jhj~ ubeh}(h]h!]h#]h%]h']uh)j hjubeh}(h]h!]h#]h%]h']uh)jThjubeh}(h]h!]h#]h%]h']colsKuh)hhjubah}(h]h!]h#]h%]h']uh)hhjhhhh*hNubh,)}(hXML Schema for Harvest Listsh]hXML Schema for Harvest Lists}(hj hj hhhNhNubah}(h]h!]h#]h%]h']uh)h+hh*hMehjhhubh)}(hX This module defines the required information for the harvester to collect documents from the local site. The local system containing this document must give the Metacat Harvester read access to this document. This represents the local document information that is used to inform the Harvester of the docid, document type, and location of the document to be harvested. The complete document identifier to be used by metacat. The docid is a compound element that gives a scope for the identifier, an integer local identifer that is unique within that scope, and a revision. Each revision is assumed to specify a unique, non-changing document, so once a particular revision is harvested, there is no need for it to be harvested again. To trigger a harvest of a document that has been updated, increment the revision number for that identifier. The system prefix of a metacat docid that defines the scope within which the identifier is unique. 
The local (site specific) portion of the identifier (docid) that is unique within the context of the scope. The revision identifier for this document, indicating a unique document version. The type of document to be harvested, indicated by a namespace string, formal public identifier, mime type, or other type indicator. The documentURL field contains the URL of the document to be harvested. The Metacat Harvester must be given read access to the contents at this URL. h]hX This module defines the required information for the harvester to collect documents from the local site. The local system containing this document must give the Metacat Harvester read access to this document. This represents the local document information that is used to inform the Harvester of the docid, document type, and location of the document to be harvested. The complete document identifier to be used by metacat. The docid is a compound element that gives a scope for the identifier, an integer local identifer that is unique within that scope, and a revision. Each revision is assumed to specify a unique, non-changing document, so once a particular revision is harvested, there is no need for it to be harvested again. To trigger a harvest of a document that has been updated, increment the revision number for that identifier. The system prefix of a metacat docid that defines the scope within which the identifier is unique. The local (site specific) portion of the identifier (docid) that is unique within the context of the scope. The revision identifier for this document, indicating a unique document version. The type of document to be harvested, indicated by a namespace string, formal public identifier, mime type, or other type indicator. The documentURL field contains the URL of the document to be harvested. The Metacat Harvester must be given read access to the contents at this URL. 
}(hhhj ubah}(h]h!]h#]h%]h']hhuh)hhMihjhhhh*ubeh}(h].compose-a-harvest-list-the-harvest-list-editorah!]h#]0compose a harvest list (the harvest list editor)ah%]h']uh)h hjhhhh*hKubh )}(hhh](h)}(h!Prepare EML Documents for Harvesth]h!Prepare EML Documents for Harvest}(hj! hj hhhNhNubah}(h]h!]h#]h%]h']uh)hhj hhhh*hMubh,)}(hcTo prepare a set of EML documents for harvest, ensure that the following is true for each document:h]hcTo prepare a set of EML documents for harvest, ensure that the following is true for each document:}(hj/ hj- hhhNhNubah}(h]h!]h#]h%]h']uh)h+hh*hMhj hhubh bullet_list)}(hhh](j!)}(hThe document contains valid EMLh]h,)}(hjB h]hThe document contains valid EML}(hjB hjD ubah}(h]h!]h#]h%]h']uh)h+hh*hMhj@ ubah}(h]h!]h#]h%]h']uh)j hj= hhhh*hNubj!)}(hPThe document is specified in a ```` element in the site's Harvest Listh]h,)}(hjY h](hThe document is specified in a }(hThe document is specified in a hj[ ubj)}(h````h]h }(hhhjc ubah}(h]h!]h#]h%]h']uh)jhj[ ubh% element in the site’s Harvest List}(h# element in the site's Harvest Listhj[ ubeh}(h]h!]h#]h%]h']uh)h+hh*hMhjW ubah}(h]h!]h#]h%]h']uh)j hj= hhhh*hNubj!)}(hJThe file resides at the location specified by its URL in the Harvest List h]h,)}(hIThe file resides at the location specified by its URL in the Harvest Listh]hIThe file resides at the location specified by its URL in the Harvest List}(hj hj ubah}(h]h!]h#]h%]h']uh)h+hh*hMhj ubah}(h]h!]h#]h%]h']uh)j hj= hhhh*hNubeh}(h]h!]h#]h%]h']bulletj uh)j; hh*hMhj hhubeh}(h]!prepare-eml-documents-for-harvestah!]h#]!prepare eml documents for harvestah%]h']uh)h hjhhhh*hMubh )}(hhh](h)}(hReview Harvester Reportsh]hReview Harvester Reports}(hj hj hhhNhNubah}(h]h!]h#]h%]h']uh)hhj hhhh*hMubh,)}(hXyHarvester sends an email report to the Site Contact after every scheduled site harvest. The report contains information about the performed operations, such as which EML documents were harvested and whether any errors were encountered. 
Errors are indicated by operations that display a status value of 1; a status value of 0 indicates that the operation completed successfully.h]hXyHarvester sends an email report to the Site Contact after every scheduled site harvest. The report contains information about the performed operations, such as which EML documents were harvested and whether any errors were encountered. Errors are indicated by operations that display a status value of 1; a status value of 0 indicates that the operation completed successfully.}(hj hj hhhNhNubah}(h]h!]h#]h%]h']uh)h+hh*hMhj hhubh,)}(hWhen errors are reported, the Site Contact should try to determine whether the source of the error is something that can be corrected at the site. Common causes of errors include:Vh]hWhen errors are reported, the Site Contact should try to determine whether the source of the error is something that can be corrected at the site. Common causes of errors include:}(hj hj hhhNhNubah}(h]h!]h#]h%]h']uh)h+hh*hMhj hhubj< )}(hhh](j!)}(hka document URL specified in the Harvest List does not match the location of the actual EML file on the diskh]h,)}(hj h]hka document URL specified in the Harvest List does not match the location of the actual EML file on the disk}(hj hj ubah}(h]h!]h#]h%]h']uh)h+hh*hMhj ubah}(h]h!]h#]h%]h']uh)j hj hhhh*hNubj!)}(hVthe Harvest List does not contain valid XML as specified in the harvestList.xsd schemah]h,)}(hj h]hVthe Harvest List does not contain valid XML as specified in the harvestList.xsd schema}(hj hj ubah}(h]h!]h#]h%]h']uh)h+hh*hMhj ubah}(h]h!]h#]h%]h']uh)j hj hhhh*hNubj!)}(h~the URL to the Harvest List (specified during registration) does not match the actual location of the Harvest List on the diskh]h,)}(hj h]h~the URL to the Harvest List (specified during registration) does not match the actual location of the Harvest List on the disk}(hj hj ubah}(h]h!]h#]h%]h']uh)h+hh*hMhj ubah}(h]h!]h#]h%]h']uh)j hj hhhh*hNubj!)}(hYan EML document that Harvester attempted to upload to 
Metacat does not contain valid EML h]h,)}(hXan EML document that Harvester attempted to upload to Metacat does not contain valid EMLh]hXan EML document that Harvester attempted to upload to Metacat does not contain valid EML}(hj$ hj" ubah}(h]h!]h#]h%]h']uh)h+hh*hMhj ubah}(h]h!]h#]h%]h']uh)j hj hhhh*hNubeh}(h]h!]h#]h%]h']j j uh)j; hh*hMhj hhubh,)}(hIf the Site Contact is unable to determine the cause of the error and its resolution, he or she should contact the Harvester Administrator for assistance.h]hIf the Site Contact is unable to determine the cause of the error and its resolution, he or she should contact the Harvester Administrator for assistance.}(hj> hj< hhhNhNubah}(h]h!]h#]h%]h']uh)h+hh*hMhj hhubeh}(h]review-harvester-reportsah!]h#]review harvester reportsah%]h']uh)h hjhhhh*hMubh )}(hhh](h)}(hUnregister with Harvesterh]hUnregister with Harvester}(hjW hjU hhhNhNubah}(h]h!]h#]h%]h']uh)hhjR hhhh*hMubh,)}(hXTo discontinue harvests, the Site Contact must unregister with Harvester. To unregister:h]hXTo discontinue harvests, the Site Contact must unregister with Harvester. To unregister:}(hje hjc hhhNhNubah}(h]h!]h#]h%]h']uh)h+hh*hMhjR hhubj)}(hhh](j!)}(hXUsing a Web browser, log in to Metacat's Harvester Registration page. The Harvester Registration page is inside the skins directory. For example, if the Metacat server that you wish to register with resides at the following URL: :: http://somehost.somelocation.edu:8080/metacat/index.jsp then the Harvester Registration page would be accessed at: :: http://somehost.somelocation.edu:8080/metacat/style/skins/default/harvesterRegistrationLogin.jsp h](h,)}(hUsing a Web browser, log in to Metacat's Harvester Registration page. The Harvester Registration page is inside the skins directory. For example, if the Metacat server that you wish to register with resides at the following URL:h]hUsing a Web browser, log in to Metacat’s Harvester Registration page. The Harvester Registration page is inside the skins directory. 
For example, if the Metacat server that you wish to register with resides at the following URL:}(hjz hjx ubah}(h]h!]h#]h%]h']uh)h+hh*hMhjt ubh)}(h7http://somehost.somelocation.edu:8080/metacat/index.jsph]h7http://somehost.somelocation.edu:8080/metacat/index.jsp}(hhhj ubah}(h]h!]h#]h%]h']hhuh)hhMhjt ubh,)}(h:then the Harvester Registration page would be accessed at:h]h:then the Harvester Registration page would be accessed at:}(hj hj ubah}(h]h!]h#]h%]h']uh)h+hh*hMhjt ubh)}(h`http://somehost.somelocation.edu:8080/metacat/style/skins/default/harvesterRegistrationLogin.jsph]h`http://somehost.somelocation.edu:8080/metacat/style/skins/default/harvesterRegistrationLogin.jsp}(hhhj ubah}(h]h!]h#]h%]h']hhuh)hhMhjt ubeh}(h]h!]h#]h%]h']uh)j hjq hhhh*hNubj!)}(hEnter and submit your Metacat account information. On the subsequent screen, click Unregister to remove your site and discontinue harvests. h]h,)}(hEnter and submit your Metacat account information. On the subsequent screen, click Unregister to remove your site and discontinue harvests.h]hEnter and submit your Metacat account information. On the subsequent screen, click Unregister to remove your site and discontinue harvests.}(hj hj ubah}(h]h!]h#]h%]h']uh)h+hh*hMhj ubah}(h]h!]h#]h%]h']uh)j hjq hhhh*hNubeh}(h]h!]h#]h%]h']jjjhjjuh)jhjR hhhh*hMubeh}(h]unregister-with-harvesterah!]h#]unregister with harvesterah%]h']uh)h hjhhhh*hMubeh}(h]8configuring-a-harvest-site-instructions-for-site-contactah!]h#]:configuring a harvest site (instructions for site contact)ah%]h']uh)h hh hhhh*hKdubh )}(hhh](h)}(hRunning Harvesterh]hRunning Harvester}(hj hj hhhNhNubah}(h]h!]h#]h%]h']uh)hhj hhhh*hMubh,)}(hX:The Harvester can be run as a servlet or in a command window. Under most circumstances, Harvester is best run continuously as a background servlet process. 
However, if you expect to use Harvester infrequently, or if wish only to test that Harvester is functioning, it may desirable to run it from a command window.h]hX:The Harvester can be run as a servlet or in a command window. Under most circumstances, Harvester is best run continuously as a background servlet process. However, if you expect to use Harvester infrequently, or if wish only to test that Harvester is functioning, it may desirable to run it from a command window.}(hj hj hhhNhNubah}(h]h!]h#]h%]h']uh)h+hh*hMhj hhubh )}(hhh](h)}(hRunning Harvester as a Servleth]hRunning Harvester as a Servlet}(hj hj hhhNhNubah}(h]h!]h#]h%]h']uh)hhj hhhh*hMubh,)}(hTo run Harvester as a servlet:h]hTo run Harvester as a servlet:}(hj hj hhhNhNubah}(h]h!]h#]h%]h']uh)h+hh*hMhj hhubj)}(hhh](j!)}(hXBRemove the comment symbols around the HarvesterServlet entry in the deployed Metacat web.xml ($TOMCAT_HOME/webapps//WEB-INF). :: h](hdefinition_list)}(hhh]hdefinition_list_item)}(hRemove the comment symbols around the HarvesterServlet entry in the deployed Metacat web.xml ($TOMCAT_HOME/webapps//WEB-INF). h](hterm)}(hCRemove the comment symbols around the HarvesterServlet entry in theh]hCRemove the comment symbols around the HarvesterServlet entry in the}(hj8 hj6 ubah}(h]h!]h#]h%]h']uh)j4 hh*hMhj0 ubh definition)}(hhh]h,)}(hBdeployed Metacat web.xml ($TOMCAT_HOME/webapps//WEB-INF).h]hBdeployed Metacat web.xml ($TOMCAT_HOME/webapps//WEB-INF).}(hjK hjI ubah}(h]h!]h#]h%]h']uh)h+hh*hMhjF ubah}(h]h!]h#]h%]h']uh)jD hj0 ubeh}(h]h!]h#]h%]h']uh)j. hh*hMhj+ ubah}(h]h!]h#]h%]h']uh)j) hj% ubh)}(hXh]hX}(hhhji ubah}(h]h!]h#]h%]h']hhuh)hhMhj% ubeh}(h]h!]h#]h%]h']uh)j hj" hhhh*hNubj!)}(hSave the edited file.h]h,)}(hj h]hSave the edited file.}(hj hj ubah}(h]h!]h#]h%]h']uh)h+hh*hMhj} ubah}(h]h!]h#]h%]h']uh)j hj" hhhh*hNubj!)}(hRestart Tomcat. 
h]h,)}(hRestart Tomcat.h]hRestart Tomcat.}(hj hj ubah}(h]h!]h#]h%]h']uh)h+hh*hMhj ubah}(h]h!]h#]h%]h']uh)j hj" hhhh*hNubeh}(h]h!]h#]h%]h']jjjhjjuh)jhj hhhh*hMubh,)}(hXAbout thirty seconds after you restart Tomcat, the Harvester servlet will start executing. The first harvest will occur after the number of hours specified in the metacat.properties file. The servlet will continue running new harvests until the maximum number of harvests have been completed, or until Tomcat shuts down (harvest frequency and maximum number of harvests are also set in the Harvester properties).h]hXAbout thirty seconds after you restart Tomcat, the Harvester servlet will start executing. The first harvest will occur after the number of hours specified in the metacat.properties file. The servlet will continue running new harvests until the maximum number of harvests have been completed, or until Tomcat shuts down (harvest frequency and maximum number of harvests are also set in the Harvester properties).}(hj hj hhhNhNubah}(h]h!]h#]h%]h']uh)h+hh*hMhj hhubeh}(h]running-harvester-as-a-servletah!]h#]running harvester as a servletah%]h']uh)h hj hhhh*hMubh )}(hhh](h)}(h%Running Harvester in a Command Windowh]h%Running Harvester in a Command Window}(hj hj hhhNhNubah}(h]h!]h#]h%]h']uh)hhj hhhh*hMubh,)}(h%To run Harvester in a Command Window:h]h%To run Harvester in a Command Window:}(hj hj hhhNhNubah}(h]h!]h#]h%]h']uh)h+hh*hM hj hhubj)}(hhh](j!)}(h0Open a system command window or terminal window.h]h,)}(hj h]h0Open a system command window or terminal window.}(hj hj ubah}(h]h!]h#]h%]h']uh)h+hh*hM hj ubah}(h]h!]h#]h%]h']uh)j hj hhhh*hNubj!)}(hSet the ``METACAT_HOME`` environment variable to the value of the Metacat webapp deployment directory. 
:: export METACAT_HOME=/home/somePath/metacat h](h,)}(hfSet the ``METACAT_HOME`` environment variable to the value of the Metacat webapp deployment directory.h](hSet the }(hSet the hj ubj)}(h``METACAT_HOME``h]h METACAT_HOME}(hhhj ubah}(h]h!]h#]h%]h']uh)jhj ubhN environment variable to the value of the Metacat webapp deployment directory.}(hN environment variable to the value of the Metacat webapp deployment directory.hj ubeh}(h]h!]h#]h%]h']uh)h+hh*hM hj ubh)}(h*export METACAT_HOME=/home/somePath/metacath]h*export METACAT_HOME=/home/somePath/metacat}(hhhj' ubah}(h]h!]h#]h%]h']hhuh)hhMhj ubeh}(h]h!]h#]h%]h']uh)j hj hhhh*hNubj!)}(hEcd to the following directory: :: cd $METACAT_HOME/lib/harvester h](h,)}(hcd to the following directory:h]hcd to the following directory:}(hjA hj? ubah}(h]h!]h#]h%]h']uh)h+hh*hMhj; ubh)}(hcd $METACAT_HOME/lib/harvesterh]hcd $METACAT_HOME/lib/harvester}(hhhjM ubah}(h]h!]h#]h%]h']hhuh)hhMhj; ubeh}(h]h!]h#]h%]h']uh)j hj hhhh*hNubj!)}(h{Run the appropriate Harvester shell script, as determined by the operating system: :: sh runHarvester.sh $METACAT_HOME h](h,)}(hRRun the appropriate Harvester shell script, as determined by the operating system:h]hRRun the appropriate Harvester shell script, as determined by the operating system:}(hjg hje ubah}(h]h!]h#]h%]h']uh)h+hh*hMhja ubh)}(h sh runHarvester.sh $METACAT_HOMEh]h sh runHarvester.sh $METACAT_HOME}(hhhjs ubah}(h]h!]h#]h%]h']hhuh)hhMhja ubeh}(h]h!]h#]h%]h']uh)j hj hhhh*hNubeh}(h]h!]h#]h%]h']jjjhjjuh)jhj hhhh*hM ubh,)}(hXThe Harvester application will start executing. The first harvest will occur after the number of hours specified in the ``metacat.properties file``. The servlet will continue running new harvests until the maximum number of harvests have been completed, or until you interrupt the process by hitting CTRL/C in the command window (harvest frequency and maximum number of harvests are also set in the Harvester properties).h](hxThe Harvester application will start executing. 
The first harvest will occur after the number of hours specified in the }(hxThe Harvester application will start executing. The first harvest will occur after the number of hours specified in the hj hhhNhNubj)}(h``metacat.properties file``h]hmetacat.properties file}(hhhj ubah}(h]h!]h#]h%]h']uh)jhj ubhX. The servlet will continue running new harvests until the maximum number of harvests have been completed, or until you interrupt the process by hitting CTRL/C in the command window (harvest frequency and maximum number of harvests are also set in the Harvester properties).}(hX. The servlet will continue running new harvests until the maximum number of harvests have been completed, or until you interrupt the process by hitting CTRL/C in the command window (harvest frequency and maximum number of harvests are also set in the Harvester properties).hj hhhNhNubeh}(h]h!]h#]h%]h']uh)h+hh*hM hj hhubeh}(h]%running-harvester-in-a-command-windowah!]h#]%running harvester in a command windowah%]h']uh)h hj hhhh*hMubeh}(h]running-harvesterah!]h#]running harvesterah%]h']uh)h hh hhhh*hMubh )}(hhh](h)}(hReviewing Harvest Reportsh]hReviewing Harvest Reports}(hj hj hhhNhNubah}(h]h!]h#]h%]h']uh)hhj hhhh*hM(ubh,)}(hXHarvester sends an email report to the Harvester Administrator after every harvest. The report contains information about the performed operations, such as which sites were harvested as well as which EML documents were harvested and whether any errors were encountered. Errors are indicated by operations that display a status value of 1; a status value of 0 indicates that the operation completed successfully.h]hXHarvester sends an email report to the Harvester Administrator after every harvest. The report contains information about the performed operations, such as which sites were harvested as well as which EML documents were harvested and whether any errors were encountered. 
Errors are indicated by operations that display a status value of 1; a status value of 0 indicates that the operation completed successfully.}(hj hj hhhNhNubah}(h]h!]h#]h%]h']uh)h+hh*hM)hj hhubh,)}(hXVThe Harvester Administrator should review the report, paying particularly close attention to any reported errors and accompanying error messages. When errors are reported at a particular site, the Harvester Administrator should contact the Site Contact to determine the source of the error and its resolution. Common causes of errors include:h]hXVThe Harvester Administrator should review the report, paying particularly close attention to any reported errors and accompanying error messages. When errors are reported at a particular site, the Harvester Administrator should contact the Site Contact to determine the source of the error and its resolution. Common causes of errors include:}(hj hj hhhNhNubah}(h]h!]h#]h%]h']uh)h+hh*hM0hj hhubj< )}(hhh](j!)}(hka document URL specified in the Harvest List does not match the location of the actual EML file on the diskh]h,)}(hj h]hka document URL specified in the Harvest List does not match the location of the actual EML file on the disk}(hj hj ubah}(h]h!]h#]h%]h']uh)h+hh*hM6hj ubah}(h]h!]h#]h%]h']uh)j hj hhhh*hNubj!)}(hVthe Harvest List does not contain valid XML as specified in the harvestList.xsd schemah]h,)}(hjh]hVthe Harvest List does not contain valid XML as specified in the harvestList.xsd schema}(hjhj ubah}(h]h!]h#]h%]h']uh)h+hh*hM7hjubah}(h]h!]h#]h%]h']uh)j hj hhhh*hNubj!)}(h~the URL to the Harvest List (specified during registration) does not match the actual location of the Harvest List on the diskh]h,)}(hjh]h~the URL to the Harvest List (specified during registration) does not match the actual location of the Harvest List on the disk}(hjhj!ubah}(h]h!]h#]h%]h']uh)h+hh*hM8hjubah}(h]h!]h#]h%]h']uh)j hj hhhh*hNubj!)}(hYan EML document that Harvester attempted to upload to Metacat does not contain valid EML h]h,)}(hXan EML 
document that Harvester attempted to upload to Metacat does not contain valid EMLh]hXan EML document that Harvester attempted to upload to Metacat does not contain valid EML}(hj:hj8ubah}(h]h!]h#]h%]h']uh)h+hh*hM9hj4ubah}(h]h!]h#]h%]h']uh)j hj hhhh*hNubeh}(h]h!]h#]h%]h']j j uh)j; hh*hM6hj hhubh,)}(hErrors that are independent of a particular site may indicate a problem with Harvester itself, Metacat, or the database connection. Refer to the error message to determine the source of the error and its resolution.h]hErrors that are independent of a particular site may indicate a problem with Harvester itself, Metacat, or the database connection. Refer to the error message to determine the source of the error and its resolution.}(hjThjRhhhNhNubah}(h]h!]h#]h%]h']uh)h+hh*hM;hj hhubeh}(h]reviewing-harvest-reportsah!]h#]reviewing harvest reportsah%]h']uh)h hh hhhh*hM(ubeh}(h]!harvester-and-harvest-list-editorah!]h#]!harvester and harvest list editorah%]h']uh)h hhhhhh*hKubah}(h]h!]h#]h%]h']sourceh*uh)hcurrent_sourceN current_lineNsettingsdocutils.frontendValues)}(hN generatorN datestampN source_linkN source_urlN toc_backlinksjfootnote_backlinksK sectnum_xformKstrip_commentsNstrip_elements_with_classesN strip_classesN report_levelK halt_levelKexit_status_levelKdebugNwarning_streamN tracebackinput_encoding utf-8-siginput_encoding_error_handlerstrictoutput_encodingutf-8output_encoding_error_handlerjerror_encodingUTF-8error_encoding_error_handlerbackslashreplace language_codeenrecord_dependenciesNconfigN id_prefixhauto_id_prefixid dump_settingsNdump_internalsNdump_transformsNdump_pseudo_xmlNexpose_internalsNstrict_visitorN_disable_configN_sourceh* _destinationN _config_files]pep_referencesN pep_base_url https://www.python.org/dev/peps/pep_file_url_templatepep-%04drfc_referencesN rfc_base_urlhttps://tools.ietf.org/html/ tab_widthKtrim_footnote_reference_spacefile_insertion_enabled raw_enabledKsyntax_highlightlong 
smart_quotessmartquotes_localesNcharacter_level_inline_markupdoctitle_xform docinfo_xformKsectsubtitle_xformembed_stylesheetcloak_email_addressesenvNgettext_compactubreporterNindirect_targets]substitution_defs}substitution_names}refnames}refids}nameids}(jmjjhhjjj j jjj j j j jO jL j j j j j j j j jejbu nametypes}(jmNhNjNj NjNj Nj NjO Nj Nj Nj Nj NjeNuh}(jjh hhWjhj jjjj jj j jL j j jR j j j j j j jbj jjjjku footnote_refs} citation_refs} autofootnotes]autofootnote_refs]symbol_footnotes]symbol_footnote_refs] footnotes] citations]autofootnote_startKsymbol_footnote_startKid_startKparse_messages]hsystem_message)}(hhh]h,)}(h:Enumerated list start value not ordinal-1: "2" (ordinal 2)h]h>Enumerated list start value not ordinal-1: “2” (ordinal 2)}(hhhjubah}(h]h!]h#]h%]h']uh)h+hjubah}(h]h!]h#]h%]h']levelKtypeINFOsourceh*lineKuh)jhjhhhh*hKubatransform_messages] transformerN decorationNhhub.