=========================================
Troubleshooting CN Cluster Communications
=========================================

Use the following steps to verify that communication
between CN instances is working.


Note that ports need to be restored before any of the tests can be performed.

This is done with the following command:

::

    $ sudo /usr/local/bin/togglePortsAndReplication.sh enable


You can also make a rough check of the current port status with the following.

::

    $ # get the ports that are toggled
    $ grep PORT /usr/local/bin/togglePortsAndReplication.sh
    PORTS=(5701 5702 5703 389 5432)
    $ sudo ufw status

and look for rules for the ports listed.
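
The manual comparison above can also be scripted. The following is a minimal sketch and
not part of the DataONE tooling: ports_without_rules is a hypothetical helper, and it
assumes that ufw lists each rule with the port number at the start of the line (for
example 5701 or 5701/tcp). Rules written as port ranges or application profiles are not
detected.

::

    # Hypothetical helper: print every port that has no matching rule
    # in the supplied `ufw status` output.
    #   $1      the text printed by `ufw status`
    #   $2...   the port numbers to look for
    ports_without_rules() {
        status=$1
        shift
        for port in "$@"; do
            # Match the port at the start of a rule line, followed by
            # "/proto", a space, or the end of the line.
            if ! printf '%s\n' "$status" | grep -Eq "^${port}([/ ]|\$)"; then
                echo "$port"
            fi
        done
    }

A possible invocation, using the port list shown above. Any port that is printed has no
rule that this simple match could find:

::

    $ ports_without_rules "$(sudo ufw status)" 5701 5702 5703 389 5432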


1. Confirm Certificate Configuration
=====================================
All DataONE CN components use the same two certificates for inter-CN communications.
If you have already confirmed that LDAP synchronization works by resetting a Node
attribute, you can assume that the certificates are installed correctly.

Otherwise: 

1.1 Check the Java trust manager
--------------------------------

Find the JVM that Tomcat is running (shown in the ps output) and check its keystore with the following:

::

     $ ps -ef | grep tomcat
     tomcat7  12522    1  9 Mar12 ?        02:13:00 /usr/lib/jvm/java-7-openjdk-amd64/bin/java -Djava.util.logging.config.file=/var/lib/tomcat …
     $ cd /usr/lib/jvm/java-7-openjdk-amd64/jre/lib/security

     $ sudo keytool -list -v  -keystore cacerts | grep -A 10 -B 10 DataONE
     *******************************************
     *******************************************
     Alias name: dataoneca
     Creation date: Jun 3, 2014
     Entry type: trustedCertEntry
     Owner: CN=DataONE Test CA, DC=dataone, DC=org
     Issuer: CN=DataONE Test CA, DC=dataone, DC=org
     Serial number: da3263a2a12d0000
     Valid from: Thu Mar 08 03:01:13 UTC 2012 until: Sat Feb 13 03:01:13 UTC 2112
     Certificate fingerprints:
     MD5:  A4:85:56:5D:F2:B3:C7:2D:13:BA:63:24:AA:E2:90:D5
     SHA1: 61:0D:A7:B9:11:AB:BB:0F:6D:B4:47:17:39:C6:53:53:C9:1B:5D:39
     SHA256: B6:AD:7F:13:1D:56:EF:D9:5C:E6:27:3E:2E:4C:D3:ED:39:68:0D:59:CC:CE:82:34:93:DD:83:F2:09:4C:83:D7
     Signature algorithm name: SHA1withRSA
     Version: 3
     Extensions:
     --
     *******************************************
     *******************************************
     Alias name: debian:dataonetestintca.pem
     Creation date: May 30, 2014
     Entry type: trustedCertEntry
     Owner: CN=DataONE Test Intermediate CA, DC=dataone, DC=org
     Issuer: CN=DataONE Test CA, DC=dataone, DC=org
     Serial number: da3263a2a12d0049
     Valid from: Tue Jul 24 03:24:46 UTC 2012 until: Thu Jun 30 03:24:46 UTC 2112
     Certificate fingerprints:
     MD5:  3F:52:FC:44:99:DA:7C:7F:9C:9A:90:95:2B:07:9B:4B
     SHA1: 97:5B:F3:E8:57:89:9D:B1:3D:FA:64:36:FC:23:C4:4F:46:E8:B5:DC
     SHA256: 79:07:78:4B:44:AD:9D:48:16:83:F5:F1:34:29:41:68:3A:EC:E3:0D:0E:AB:C2:3A:C7:9F:B8:6A:8C:8C:94:A9
     Signature algorithm name: SHA1withRSA
     Version: 3
     Extensions:

The Alias name and Creation date values do not matter.  Look for:

  -  The presence of two entries: the root certificate and the intermediate certificate (see the Owner and Issuer attributes).
  -  Certificate expiration: check the Valid from: and until: dates.
  -  Fingerprint consistency between CNs: if the fingerprints differ, something might be amiss.
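
To compare fingerprints between CNs, it can help to reduce the keytool listing to one
line per DataONE entry. The following is a minimal sketch: dataone_fingerprints is a
hypothetical helper, and it assumes the listing format shown above, where each Owner:
line precedes the SHA256: line of the same entry.

::

    # Hypothetical helper: read `keytool -list -v` output on stdin and print
    # "Owner | SHA256 fingerprint" for each entry whose Owner mentions DataONE.
    dataone_fingerprints() {
        awk '
            /Owner:/  { sub(/^.*Owner: */, "");  owner = $0 }
            /SHA256:/ { sub(/^.*SHA256: */, ""); if (owner ~ /DataONE/) print owner " | " $0 }
        '
    }

One way to use it is to save the result on each CN and compare the files with diff. The
output file name here is only an example:

::

    $ sudo keytool -list -v -keystore cacerts | dataone_fingerprints | sort > /tmp/fingerprints.txt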


1.2 Check the Certificate
-------------------------
The CNs maintain both a client certificate and a server certificate in a standard location
that all CN subcomponents use for communication. Check that the certificates are in the
expected location and that the basic information (Issuer, Validity, and Subject) is correct.
Use the following example as a guide to locating and inspecting the certificates.

::

    $ sudo su
    $ cd /etc/dataone/client
    $ # inspect the server cert
    $ openssl x509 -in certs/cn-dev-orc-1.test.dataone.org.pem -text -noout
    Certificate:
      Data:
        Version: 3 (0x2)
        Serial Number: 15722738799243755648 (0xda3263a2a12d0080)
      Signature Algorithm: sha1WithRSAEncryption
        Issuer: DC=org, DC=dataone, CN=DataONE Test Intermediate CA
        Validity
            Not Before: Mar 11 18:05:21 2014 GMT
            Not After : Mar 10 18:05:21 2017 GMT
        Subject: DC=org, DC=dataone, CN=cn-dev-orc-1.test.dataone.org
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
                Public-Key: (2048 bit)
                Modulus:
                    00:aa:8c:96:a6:fa:91:73:c7:6d:e7:43:bf:2a:a4:
                    ... etc ...
                Exponent: 65537 (0x10001)
        X509v3 extensions:
            X509v3 Basic Constraints: 
                CA:FALSE
            Netscape Comment: 
                OpenSSL Generated Certificate
            X509v3 Subject Key Identifier: 
                9A:C1:CD:1C:34:7F:30:C3:F4:9E:DC:E0:A9 ... etc ...
            X509v3 Authority Key Identifier: 
                keyid:EF:2E:C1:27:6C:2A:8A:09 ... etc ...

            X509v3 CRL Distribution Points: 

                Full Name:
                  URI:http://releases.dataone.org/crl/DataONETestInt_CRL.pem

                Full Name:
                  URI:http://cn-ucsb-1.dataone.org/crl/DataONETestInt_CRL.pem

                Full Name:
                  URI:http://cn-unm-1.dataone.org/crl/DataONETest_CRL.pem

                Full Name:
                  URI:http://cn-orc-1.dataone.org/crl/DataONETestInt_CRL.pem

      Signature Algorithm: sha1WithRSAEncryption
         23:90:cc:05:a0:e5:b1:2b:11:dc:ee:9a:9b:4d:27:1d:e1:54:
         a9:9e:16:11:9d:64:cf:a6:7d:fd:7d:7d:0f:d0:d9:56:81:33:
         ... etc ...

    $ # inspect the client cert
    $ sudo openssl x509 -in private/urn_node_cnDevORC1.pem -text -noout
    Certificate:
      Data:
        Version: 3 (0x2)
        Serial Number: 15722738799243755649 (0xda3263a2a12d0081)
      Signature Algorithm: sha1WithRSAEncryption
        Issuer: DC=org, DC=dataone, CN=DataONE Test Intermediate CA
        Validity
            Not Before: Mar 11 18:06:14 2014 GMT
            Not After : Mar 10 18:06:14 2017 GMT
        Subject: DC=org, DC=dataone, CN=urn:node:cnDevORC1
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
                Public-Key: (2048 bit)
                Modulus:
                    00:b6:db:aa:63:33:74:3d:1c:8d:1e:ec:1d:e4:3e:
                    71:11:e8:f8:0d:ce:fe:32:87:c3:f0:07:d2:b1:4d:
                    ... etc ...
                Exponent: 65537 (0x10001)
        X509v3 extensions:
            X509v3 Basic Constraints: 
                CA:FALSE
            Netscape Comment: 
                OpenSSL Generated Certificate
            X509v3 Subject Key Identifier: 
                BB:63:8E:63:80:2F:15:3E:42:F8:06:2F:F0:DC:9C:45:32:28:32:70
            X509v3 Authority Key Identifier: 
                keyid:EF:2E:C1:27:6C:2A:8A:09:AB:6C:C3:45:7F:3B:F9:57:D5:16:A9:B3

            X509v3 CRL Distribution Points: 

                Full Name:
                  URI:http://releases.dataone.org/crl/DataONETestInt_CRL.pem

                Full Name:
                  URI:http://cn-ucsb-1.dataone.org/crl/DataONETestInt_CRL.pem

                Full Name:
                  URI:http://cn-unm-1.dataone.org/crl/DataONETest_CRL.pem

                Full Name:
                  URI:http://cn-orc-1.dataone.org/crl/DataONETestInt_CRL.pem

      Signature Algorithm: sha1WithRSAEncryption
         46:06:23:fa:97:b6:8e:8e:ee:b0:c4:78:d0:dd:3f:d8:9f:c1:
         7b:38:4c:af:8a:ea:33:43:20:dd:41:b6:3f:63:08:62:4f:12:
         ... etc ...
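
When only the Issuer, Subject, and Validity fields are of interest, the full text dump
can be replaced with a shorter check. The following is a minimal sketch: cert_summary is
a hypothetical helper, and the default 30-day warning window is an arbitrary choice. It
relies on the -subject, -issuer, -dates, and -checkend options of openssl x509.

::

    # Hypothetical helper: print subject, issuer and validity dates of a PEM
    # certificate, then warn if it expires within the given number of days.
    #   $1  path to the certificate
    #   $2  warning window in days (optional, defaults to 30)
    cert_summary() {
        cert=$1
        days=${2:-30}
        openssl x509 -in "$cert" -noout -subject -issuer -dates || return 2
        # -checkend takes seconds and exits non-zero if the certificate
        # will have expired by then.
        if openssl x509 -in "$cert" -noout -checkend $((days * 86400)) >/dev/null; then
            echo "OK: valid for at least $days more days"
        else
            echo "WARNING: expires within $days days (or has already expired)"
            return 1
        fi
    }

For example, using the certificate locations from the listing above:

::

    $ cd /etc/dataone/client
    $ cert_summary certs/cn-dev-orc-1.test.dataone.org.pem
    $ cert_summary private/urn_node_cnDevORC1.pem 60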



2. Debug SSL Connection Issues
================================
If the certificates seem OK, you might need to observe the connection negotiation
in action.
To get a better idea of where the SSL handshake is failing, start by adding the
following to /usr/share/tomcat7/bin/catalina.sh:

::

    $ sudo pico /usr/share/tomcat7/bin/catalina.sh

    # add this line:
    JAVA_OPTS="$JAVA_OPTS -Djavax.net.debug=ssl:handshake"

and restart Tomcat.

This produces verbose output for every step of the SSL handshake, but omits the
listing of all certificates registered with the trust manager, which is likely to
be very long (on the order of 500 CAs at 20 lines each).

To include that listing as well, omit the :handshake specifier in catalina.sh and use this instead:

::

    JAVA_OPTS="$JAVA_OPTS -Djavax.net.debug=ssl"


- A good resource for analyzing the output is: http://www.smartjava.org/content/how-analyze-java-ssl-errors
- SSL debug output is written to the log files of the affected applications.
- Be sure to turn off SSL debugging when you are done, since it makes the log files very verbose.
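
Because the debug output is so verbose, it can help to filter the log down to the lines
that report a failure. The following is a minimal sketch: ssl_failures is a hypothetical
helper, and the search terms are the Java SSL exception class names plus some standard
TLS alert names. The exact wording of the debug output varies between Java versions, so
treat the pattern as a starting point.

::

    # Hypothetical helper: read log text on stdin and print, with line numbers,
    # the lines that mention an SSL exception or a TLS alert.
    ssl_failures() {
        grep -nE 'SSLHandshakeException|SSLException|fatal alert|handshake_failure|bad_certificate|certificate_unknown|certificate_expired|unknown_ca'
    }

For example, assuming Tomcat writes its output to /var/log/tomcat7/catalina.out (this
path is an assumption, so adjust it to the log file of the affected application):

::

    $ ssl_failures < /var/log/tomcat7/catalina.out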