Manually Adding Object Formats
==============================

The DataONE Object Format list is maintained on the Coordinating Nodes for each environment. For a given environment, the object format list needs to be added to a single CN during a fresh install of the CN, and the Metacat application on each CN handles the replication of the list to the other CNs in the environment.

The production list is maintained in the dataone-cn-metacat buildout package and is named `objectFormatListV2.xml`_. The insertOrUpdateObjectFormatList.sh_ script is also maintained in the same directory, and provides a convenient way to insert or update the document in Metacat.

.. _objectFormatListV2.xml: https://repository.dataone.org/software/cicore/trunk/cn-buildout/dataone-cn-metacat/usr/share/metacat/debian/objectFormatListV2.xml
.. _insertOrUpdateObjectFormatList.sh: https://repository.dataone.org/software/cicore/trunk/cn-buildout/dataone-cn-metacat/usr/share/metacat/debian/insertOrUpdateObjectFormatList.sh

First time inserts in a new CN environment
------------------------------------------

When a Coordinating Node is first installed, the object format list needs to be inserted into the Metacat database. To do so, on one of the CNs in the environment, issue the following commands:

::

  $ cd /usr/share/metacat/debian
  $ sudo chmod +x insertOrUpdateObjectFormatList.sh
  $ sudo ./insertOrUpdateObjectFormatList.sh objectFormatListV2.xml

When prompted for the password, enter the password for the `uid=dataone_cn_metacat,o=DATAONE,dc=ecoinformatics,dc=org` user, which is stored in the SystemPW.txt.gpg file in subversion.

Note: We've changed the above DN in the production environment to `cn=dataone_cn_metacat,dc=dataone,dc=org`. Because of this, before executing the script, change the script to have:

::

  username="cn=dataone_cn_metacat,dc=dataone,dc=org";

Use the password for this DN found in the ProductionPW.txt.gpg file in subversion.

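The DN change described in the note above can also be applied without opening an editor. The following is a minimal sketch, assuming the script contains a single line that begins with `username=`; that layout is an assumption and should be confirmed against the actual script. The helper name `set_production_dn` is hypothetical, and it writes a modified copy rather than editing the original in place.

```shell
#!/bin/sh
# Hypothetical helper (not part of the buildout package).
# Writes a copy of SRC to DEST in which the line starting with "username="
# is replaced by the production DN given in the note above.
# Assumption: the script assigns the DN on a line that begins with "username=".
set_production_dn() {
    src=$1
    dest=$2
    sed 's|^username=.*|username="cn=dataone_cn_metacat,dc=dataone,dc=org";|' "$src" > "$dest"
}
```

Calling `set_production_dn insertOrUpdateObjectFormatList.sh insertOrUpdateObjectFormatList.prod.sh` and comparing the two files with diff before running the copy is one way to confirm that only the intended line changed.
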
Updating the object format list
-------------------------------

Before updating the list, consult the `Unified Digital Format Registry <http://udfr.org>`__ and search for the file format in that registry to help decide what the DataONE formatId should be for the format. It's important to ensure that the format id is unique, and that it is versioned in some manner in order to accommodate future iterations of the format. Also look through the existing objectFormatListV2.xml to ensure the format doesn't already exist, perhaps under a different formatId.

To update the list, do an svn checkout of the dataone-cn-metacat package:

::

  $ svn co https://repository.dataone.org/software/cicore/trunk/cn-buildout/dataone-cn-metacat

Modify the objectFormatListV2.xml file by adding new formats according to the `ObjectFormat Type`_. Never modify an existing format, and never delete an existing format. Update the `total` and `count` attributes of the `ObjectFormatList`_ element. It can be helpful to use xmlstarlet to count the formats in the file as a cross check:

::

  $ xmlstarlet sel -t -v "count(//objectFormat/formatId)" objectFormatListV2.xml

Commit the changes:

::

  $ svn commit objectFormatListV2.xml

Copy the new list to the CN you are modifying, and replace the file in /usr/share/metacat/debian/objectFormatListV2.xml.

::

  $ scp objectFormatListV2.xml cn-dev-ucsb-1.test.dataone.org:
  $ ssh cn-dev-ucsb-1.test.dataone.org
  $ sudo cp objectFormatListV2.xml /usr/share/metacat/debian/objectFormatListV2.xml

Lastly, run the update script against the new format list document:

::

  $ cd /usr/share/metacat/debian
  $ sudo chmod +x insertOrUpdateObjectFormatList.sh
  $ sudo ./insertOrUpdateObjectFormatList.sh objectFormatListV2.xml

After being prompted for the password, the list should be updated in Metacat.

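Where xmlstarlet is not installed, the `count` and `total` cross check described earlier in this section can be approximated with standard tools before the list is committed and loaded. This is a rough sketch, assuming each format is serialized with a bare `<objectFormat>` start tag and that `count` and `total` appear as quoted attributes, preceded by whitespace, on the root element. The function name `check_format_list` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: compare the declared count/total attributes with the
# number of <objectFormat> elements actually present in FILE.
# Assumptions: <objectFormat> start tags carry no attributes, and the first
# count="N" / total="N" found in the file belong to the root element.
check_format_list() {
    file=$1
    # Number of <objectFormat> start tags (grep -o prints one match per line).
    actual=$(grep -o '<objectFormat>' "$file" | wc -l | tr -d ' ')
    # Declared values, taken from the first matching attribute in the file.
    count=$(sed -n 's/.*[[:space:]]count="\([0-9]*\)".*/\1/p' "$file" | head -n 1)
    total=$(sed -n 's/.*[[:space:]]total="\([0-9]*\)".*/\1/p' "$file" | head -n 1)
    if [ "$actual" = "$count" ] && [ "$actual" = "$total" ]; then
        echo "OK: $actual formats"
    else
        echo "MISMATCH: elements=$actual count=$count total=$total"
        return 1
    fi
}
```

Running `check_format_list objectFormatListV2.xml` in the checkout returns a non-zero status when the attributes and the element count disagree, so it can gate the commit step in a wrapper script.
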
You can verify that each CN has the updated list by visiting the CN's formats REST endpoint:

::

  https://cn-dev-ucsb-1.test.dataone.org/cn/v2/formats
  https://cn-dev-unm-1.test.dataone.org/cn/v2/formats
  https://cn-dev-orc-1.test.dataone.org/cn/v2/formats

Maintenance of all format lists
-------------------------------

When updating the object format list, it's best to do so in all environments at once, because the list is only added when a CN is first installed. So, perform the above steps for the DEV, SANDBOX, SANDBOX2, STAGE, STAGE2, and PRODUCTION environments. Note that the identifier for the object format list XML document may differ across environments because some environments get wiped clean and re-installed. For instance, in DEV it might be `OBJECT_FORMAT_LIST.1.1` whereas in PRODUCTION it might be `OBJECT_FORMAT_LIST.1.8`.

.. _ObjectFormatList: https://releases.dataone.org/online/api-documentation-v1.2.0/apis/Types.html#Types.ObjectFormatList
.. _ObjectFormat Type: https://releases.dataone.org/online/api-documentation-v1.2.0/apis/Types.html#Types.ObjectFormat
.. _Unfied Digital Format Registry: http://udfr.org
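When the update is repeated across several environments, checking each CN's formats endpoint by hand becomes tedious. The sketch below prints the number of formats each CN reports so the values can be compared at a glance. It assumes that the `/cn/v2/formats` response contains one literal `<formatId>` tag per format and that curl is available; the function names `count_format_ids` and `report_cn_counts` are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers for comparing the format lists served by several CNs.

# Reads an object format list document on stdin and prints the number of
# <formatId> start tags it contains.
# Assumption: each format appears with a literal, unprefixed <formatId> tag.
count_format_ids() {
    grep -o '<formatId>' | wc -l | tr -d ' '
}

# For each host given as an argument, fetch its formats endpoint and print
# "<host> <number of formats>" on one line.
report_cn_counts() {
    for host in "$@"; do
        n=$(curl -fsS "https://${host}/cn/v2/formats" | count_format_ids)
        printf '%s %s\n' "$host" "$n"
    done
}
```

For the DEV environment this could be called as `report_cn_counts cn-dev-ucsb-1.test.dataone.org cn-dev-unm-1.test.dataone.org cn-dev-orc-1.test.dataone.org`. Equal counts on every CN suggest that replication has completed, although they do not prove that the individual entries are identical.
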