Use of DataONE Java Client Library
==================================

*DRAFT*

Introduction
------------

The purpose of the DataONE Java Client library is to simplify interaction with
DataONE services, primarily by handling the low-level details of message
translation, HTTP/S transmission, and security (X.509 certificate handling). In
a nutshell, it automatically creates HTTP connections (secure when appropriate)
to retrieve or upload information, knows how to talk to the service APIs, and
returns the information in the appropriate Java datatype or throws the correct
exception.

This library is general-purpose and multi-layered, meaning it can be used by
client tools and Member Nodes alike for communication within the DataONE
ecosystem. An application can use the high-level classes for simple
interactions, some of the mid-level classes for specific use cases, or even the
low-level RestClient for situations where a developer needs access to some of
the HTTP request / response internals.

Why Use libclient?
------------------

From a technical point of view, a developer using HTTP for data transmission
will encounter many subtle details, and must choose how to accomplish each
task. Libclient encapsulates not only all of the DataONE service API methods,
but also these implementation details and the HTTP standards DataONE follows.

A brief listing of libclient's scope:

#. correct and complete implementation of the current DataONE API
#. RFC 3986 implementation for URL encoding, applied to Unicode characters
#. request serialization (toXML) and packaging using MIME multipart message
   bodies
#. response deserialization of the message body to the return type
#. exception detection, deserialization, and throwing based on the HTTP
   response (this is an especially onerous task, due to how Java handles
   exceptions in overrides)
#. consistent (and informative) exception handling
#. interoperability with the CILogon security sub-system (auto-detection of
   installed CILogon client certificates)
#. trust manager configuration: packages DataONE trusted certificate
   authorities with the software, and provides hooks for certificate updates
#. API versioning support
#. local caching of objects
#. a high-level API to simplify common interactions with DataONE services
#. additional client tools for things like building DataONE data packages

Finally, libclient_java is a thoroughly tested product used by the core
development team for handling node-to-node communication between its
HTTP-based services.

Usage
-----

To use the DataONE Java Client Library in a project, include the jar file with
all dependencies. The easiest way to do this is to build a Maven project and
include the following dependency::

  <dependency>
    <groupId>org.dataone</groupId>
    <artifactId>d1_libclient_java</artifactId>
    <version>${d1_libclient_version}</version>
    <type>jar</type>
  </dependency>

Where ${d1_libclient_version} is the latest released version, for example
``1.0.4``.

Make certain that you have included the DataONE Maven repository in your Maven
repository settings::

  <repository>
    <id>dataone.org</id>
    <url>http://dev-testing.dataone.org/maven</url>
    <releases>
      <enabled>true</enabled>
    </releases>
    <snapshots>
      <enabled>true</enabled>
    </snapshots>
  </repository>

If you are not using Maven, you can also download the jar with dependencies
from the DataONE releases site, and choose the latest, or desired, version.

DataONE's Configuration Mechanism
---------------------------------

DataONE has implemented the class ``Settings`` to standardize how property
files are accessed from our Java packages. d1_libclient_java uses this class to
get configurations that control aspects of SSL (client trust) and HTTP (socket
and connection timeouts), as well as to provide a "bootstrap" CN base URL.

d1_libclient_java contains default property files that the Settings class
loads, organized under the package org.dataone.configuration. Depending on how
you package your application, these files from libclient may be accessible, or
may be overlooked or even overwritten.
Shaded jars in particular overwrite files with the same path when combining
packages, so developers using libclient_java need to be mindful of these
properties files.

Configuring d1_libclient_java
-----------------------------

Settings offers two ways to set properties for the client in your project. The
first and preferred way is to create a configuration with associated property
files; the second is to make overriding calls directly from your code. The
latter method is useful for debugging, but is discouraged for normal usage,
because it decentralizes the configuration and can cause unexpected behavior if
the property is set after the class that uses it has already loaded it.

To use the first method, make certain that you have included a config.xml
configuration file and a d1client.properties file in your project under the
package org.dataone.configuration:

config.xml:

d1client.properties::

  certificate.truststore.useDefault=true

If using the second method, ensure that the class
org.dataone.configuration.Settings has been imported into the class where the
method call is made::

  import org.dataone.configuration.Settings;
  .
  .
  .
  Settings.getConfiguration().setProperty("certificate.truststore.useDefault", true);

Use of CILogon and Client Certificates
--------------------------------------

DataONE uses the services of CILogon to provide client certificates.
d1_libclient_java interoperates with the CILogon service mechanisms to simplify
client-side certificate management. CILogon's workflow currently requires the
user to point their browser to the appropriate CILogon service to retrieve a
certificate.

For the testing environment, use the endpoint:
https://test.cilogon.org/?skin=DataONE

For production, use the endpoint: https://cilogon.org/?skin=DataONE

The certificate is downloaded to a standard location (``/tmp``) and is valid
for 18 hours.
d1_libclient_java looks for the certificate in this standard location. If you
wish to keep user certificates elsewhere, you will need to tell libclient's
``CertificateManager`` the new location, via CertificateManager.setLocation().
This may be useful, for example, if you want the certificate to be retained
after a machine reboot.

SSL Validation Requirements
---------------------------

SSL uses a two-way handshake where both client and service need to trust each
other's certificate for the connection to be made.

DataONE *services* trust CILogon-issued certificates and, for certain cases
(MN-CN interaction), DataONE-issued certificates. In general, they accept
anonymous clients (no client certificate) and allow the connection, but throw
exceptions when a client presents an expired or untrusted certificate.
d1_libclient_java calls fail with the exception message "peer not
authenticated" if the client does not trust the service's certificate.

d1_libclient_java *clients* partly delegate trust management to the JVM's Java
Security System (JSS), which, like most browsers, has a predefined list of
commercial certificate authorities that it trusts. Almost all of the DataONE
services (Member Nodes and Coordinating Nodes) use certificates signed by one
of these commercial CAs known to JSS. For the services that do not, DataONE
either signs their certificate or decides to trust the additional CA.

To simplify installation, d1_libclient_java "ships" with the known CAs that
DataONE trusts and includes them in the TrustManager implementation. Therefore,
your installation should not need to add any additional certificates to the
local system.
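
When diagnosing a "peer not authenticated" failure, it can help to see which certificate authorities the JVM itself trusts. The following is a minimal sketch that uses only standard JDK classes; it inspects the JVM's default trust store and does not reflect the supplemental CAs that d1_libclient_java adds in its own TrustManager. The class name ``DefaultTrustInspector`` and its methods are hypothetical and are not part of libclient.

```java
import java.security.GeneralSecurityException;
import java.security.KeyStore;
import java.security.cert.X509Certificate;
import java.util.Locale;
import javax.net.ssl.TrustManager;
import javax.net.ssl.TrustManagerFactory;
import javax.net.ssl.X509TrustManager;

// Hypothetical helper (not part of libclient): inspects the JVM default trust store.
public class DefaultTrustInspector {

    /** Returns the CA certificates accepted by the JVM's default trust manager. */
    public static X509Certificate[] defaultTrustedIssuers() {
        try {
            TrustManagerFactory tmf =
                    TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
            // Passing a null KeyStore asks the factory to use the JVM default (usually cacerts).
            tmf.init((KeyStore) null);
            for (TrustManager tm : tmf.getTrustManagers()) {
                if (tm instanceof X509TrustManager) {
                    return ((X509TrustManager) tm).getAcceptedIssuers();
                }
            }
            return new X509Certificate[0];
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Could not load the default trust store", e);
        }
    }

    /** True if any trusted CA subject name contains the given text (case-insensitive). */
    public static boolean trustsIssuerContaining(String fragment) {
        String needle = fragment.toLowerCase(Locale.ROOT);
        for (X509Certificate ca : defaultTrustedIssuers()) {
            String subject = ca.getSubjectX500Principal().getName().toLowerCase(Locale.ROOT);
            if (subject.contains(needle)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("CAs trusted by the JVM default store: "
                + defaultTrustedIssuers().length);
        System.out.println("Subject containing 'DataONE' present: "
                + trustsIssuerContaining("DataONE"));
    }
}
```

If the second line prints ``false``, that only means the JVM store has no entry whose subject mentions DataONE; connections made through libclient may still succeed because of the CAs it ships with.
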
If a certificate does need to be installed manually, add the public
certificates to the local Java key store by way of ``keytool``.

Using keytool
-------------

The locations of these public certificates are as follows:

Dev and Testing:

If you are testing out a client, retrieve the certificates for testing:

https://repository.dataone.org/software/tools/trunk/ca/DataONETestCAChain.crt

and

https://repository.dataone.org/software/cicore/trunk/cn-buildout/dataone-cn-os-core/usr/share/dataone-cn-os-core/debian/_.test.dataone.org.crt

You will need to determine where the Java key store is located for your
installation. It should be named cacerts and sit in the ``jre/lib/security/``
subdirectory of your Java install, for instance
``/usr/lib/jvm/java-6-sun/jre/lib/security/cacerts``.

To install the two certificates above, issue these commands from the directory
where the downloaded certificates are located::

  keytool -import -noprompt -alias DataONETestCAChain -file DataONETestCAChain.crt -keystore /usr/lib/jvm/java-6-sun/jre/lib/security/cacerts -storepass changeit

  keytool -import -noprompt -alias DataONEStarTestCert -file _.test.dataone.org.crt -keystore /usr/lib/jvm/java-6-sun/jre/lib/security/cacerts -storepass changeit

(The storepass 'changeit' is the default Java key store password; you may
change that password for a more secure system.)

Production:

If you are using the tools against production servers, retrieve the following
certificates:

https://repository.dataone.org/software/tools/trunk/ca/DataONECAChain.crt

and

https://repository.dataone.org/software/cicore/trunk/cn-buildout/dataone-cn-os-core/usr/share/dataone-cn-os-core/debian/_.dataone.org.crt

Install them using the commands::

  keytool -import -noprompt -alias DataONECAChain -file DataONECAChain.crt -keystore /usr/lib/jvm/java-6-sun/jre/lib/security/cacerts -storepass changeit

  keytool -import -noprompt -alias DataONEStarCert -file _.dataone.org.crt -keystore /usr/lib/jvm/java-6-sun/jre/lib/security/cacerts -storepass changeit

Configuring Client Trust Management
-----------------------------------

You can manage CA certificates yourself by excluding the supplemental CA
certificates that libclient_java includes in the trust manager. To do so, set
the property ``certificate.truststore.includeD1CAs`` to false in your
auth.properties file::

  certificate.truststore.includeD1CAs=false

Updating DataONE supplemental CA certificates
---------------------------------------------

The shipped DataONE-trusted CA certificates that augment the TrustManager can
become out of date over time. To handle updates to these certificates,
d1_libclient_java first looks for certificates at an auxiliary location.
Failing to find certificates there, it loads the ones from the
d1_libclient_java jar. In this way, both additions to and removals from the
DataONE trusted set can be made. The property that defines this location, and
its default value, is::

  certificate.truststore.aux.location=/etc/dataone/truststore

Other Client Configuration Properties
-------------------------------------

Other important properties to set:

D1Client.CN_URL - defines which environment you will be using::

  D1Client.CN_URL=https://cn-dev.dataone.org/cn

D1Client.useLocalCache - directs the client to pull an object from a local
filesystem cache instead of making the service API call::

  D1Client.useLocalCache=true/false

Servers may respond slowly to your request. The default timeout period is 30
seconds.
Timeouts may be increased with the following properties, all in milliseconds::

  # default client wide timeout
  D1Client.D1Node.get.timeout=60000
  # timeout for listObjects method on Member Nodes and Coordinating Nodes
  D1Client.D1Node.listObjects.timeout=60000
  # timeout for getLogRecords method on Member Nodes and Coordinating Nodes
  D1Client.D1Node.getLogRecords.timeout=60000
  # timeout for get method on Member Nodes and Coordinating Nodes
  D1Client.D1Node.get.timeout=60000
  # timeout for getSystemMetadata method on Member Nodes and Coordinating Nodes
  D1Client.D1Node.getSystemMetadata.timeout=60000
  # timeout for all replication methods on Coordinating Nodes
  D1Client.CNode.replication.timeout=60000
  # timeout for create method on Coordinating Nodes
  D1Client.CNode.create.timeout=60000
  # timeout for registerSystemMetadata method on Coordinating Nodes
  D1Client.CNode.registerSystemMetadata.timeout=60000
  # timeout for search method on Coordinating Nodes
  D1Client.CNode.search.timeout=60000
  # timeout for create method on Member Nodes
  D1Client.MNode.create.timeout=60000
  # timeout for update method on Member Nodes
  D1Client.MNode.update.timeout=60000
  # timeout for replicate method on Member Nodes
  D1Client.MNode.replicate.timeout=60000
  # timeout for getReplica method on Member Nodes
  D1Client.MNode.getReplica.timeout=60000

Miscellaneous settings::

  CNode.nodemap.cache.refresh.interval.seconds
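
One practical note when copying timeout entries into a properties file: the Java properties format has no trailing-comment syntax, so any ``//`` annotation written after a value becomes part of that value. The sketch below uses only ``java.util.Properties`` to illustrate this. The class ``TimeoutPropertiesCheck``, its methods, and the fallback behavior are hypothetical illustrations; how libclient itself reacts to a malformed timeout value is not shown here.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper (not part of libclient): checks how timeout entries parse.
public class TimeoutPropertiesCheck {

    /** Parses properties-format text and returns the raw value stored under key. */
    public static String rawValue(String propertiesText, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    /** Returns the value as milliseconds, or fallbackMillis if missing or not a number. */
    public static int timeoutMillis(String propertiesText, String key, int fallbackMillis) {
        String value = rawValue(propertiesText, key);
        if (value == null) {
            return fallbackMillis;
        }
        try {
            return Integer.parseInt(value.trim());
        } catch (NumberFormatException e) {
            return fallbackMillis;
        }
    }

    public static void main(String[] args) {
        String key = "D1Client.MNode.create.timeout";
        // Annotation on the same line: it is read as part of the value.
        String trailing = key + "=60000 // timeout for create\n";
        // Annotation on its own line starting with '#': ignored by the parser.
        String ownLine = "# timeout for create\n" + key + "=60000\n";

        System.out.println("[" + rawValue(trailing, key) + "]"); // [60000 // timeout for create]
        System.out.println(timeoutMillis(trailing, key, 30000)); // 30000 (not a number, fallback)
        System.out.println(timeoutMillis(ownLine, key, 30000));  // 60000
    }
}
```

Keeping each annotation on its own line beginning with ``#`` avoids the problem.
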