Software Development Guidelines
===============================

Consistently designed, structured, and formatted code is easier to read,
reuse, and maintain, which matters given the effort (and expense) that goes
into composing software. This document provides some basic guidelines for the
development of code for the DataONE project.


General
-------

- Document the code.  No exceptions.

- It is generally a good idea to indicate the authors of the code (usually in
  the comment block for the file). This is especially helpful a couple years
  from now when trying to determine why something is implemented a particular
  way or to find the expert on the issue being examined.

- Set your editor to use UTF-8 by default.

- Always at least make a list of external dependencies (libraries, tools,
  applications) and where to find them (if not included in version control).

- Avoid OS-specific dependencies. If one is really necessary, always make a
  note of any OS-specific dependencies introduced by a block of code, and
  always implement such a block so that replacing it with another block that
  might be developed in the future is a relatively trivial matter (e.g. don't
  embed a bunch of win32-specific calls throughout the code; write an
  abstraction layer instead).

- As a general development note, all code should conform to the separation of
  concerns principle by way of modular design.  Code should be grouped together
  into modules that act together to provide specific functionality or take on
  limited responsibilities.

- Write `unit tests`_. Unit testing should be included in any code development
  where feasible. Unit tests should not rely on external services. Unit tests
  should only test a single module of code rather than an entire software
  package. The aim of a unit test is to evaluate the smallest unit of a
  component, such as a method, function or data structure. The best way to
  achieve unit testing is through mock interfaces of other modules.

.. _unit tests: http://en.wikipedia.org/wiki/Test-driven_development
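
As a minimal sketch of a unit test that isolates one function behind a mock
interface, the example below uses the standard ``unittest`` and
``unittest.mock`` modules. The function ``count_objects`` and the client method
``list_objects`` are hypothetical names chosen for illustration; they are not
part of any DataONE library.

```python
import unittest
from unittest import mock


def count_objects(client):
  """Return the number of objects reported by ``client.list_objects()``."""
  return len(client.list_objects())


class TestCountObjects(unittest.TestCase):
  """Exercise count_objects() without contacting any external service."""

  def test_counts_items_returned_by_client(self):
    # The mock stands in for a module that would normally call a remote
    # service, so the test depends only on the unit under test.
    client = mock.Mock()
    client.list_objects.return_value = ["a", "b", "c"]

    self.assertEqual(count_objects(client), 3)
    client.list_objects.assert_called_once_with()
```

Saved in a test module, this can be run with ``python -m unittest``. Because
the client is a mock, the test exercises a single function and needs no
network access.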


Java
----

Software Methodology
====================
Modularization
--------------
In Java, Design by Interface provides a modular design strategy. The approach 
divides software between interfaces and implementations. Modules should only
interact with one another through interfaces.

Programming Style
=================
The `Java Programming Style Guidelines`_ by GeoSoft provides a very readable,
concise summary of some good conventions to follow while producing Java code.
It is quite similar (identical?) to the document `Code Conventions for the
Java Programming Language`_, though I find the single page layout of the
former to be more convenient.

In general, these conventions should be followed for all new Java code
developed specifically for DataONE. Where code is extending an existing
application, it may be more appropriate to follow the conventions used by
other developers working on that project.

Use `JavaDoc conventions`_ for documenting code.

The important aspect is consistency with a goal towards readability. It's easy
for a compiler to interpret code, generally much harder for people.

.. _Java Programming Style Guidelines: http://geosoft.no/development/javastyle.html

.. _Code Conventions for the Java Programming Language: http://java.sun.com/docs/codeconv/

.. _JavaDoc Conventions: http://java.sun.com/j2se/javadoc/writingdoccomments/


Python
------

The `Style Guide for Python Code`_ (PEP-0008) should be followed for all
Python code, with the following exceptions:

- Indentation should **use 2 spaces**, not 4 as suggested by Guido and Barry.
  One space is too short, and four leads to a greater likelihood of having to
  split lines.

- Never use tabs for indenting. Set your editor to use soft tabs or replace
  tabs with spaces.

- Leave two blank lines between method definitions inside a class.

- If adding multiple classes in a file, separate them with a commented row of
  "=" characters.

- To be more consistent with the Java conventions, use the suffix "Exception"
  (rather than "Error" as suggested in PEP-0008) when naming exceptions.

- Follow DocString_ conventions to document code. Use either the EpyDoc_ or
  Sphinx_ approaches, but do not mix them within the same application. In
  general, using Sphinx_ autodoc_ is the preferred approach and will produce
  cleaner documentation than JavaDoc. This implies using reStructuredText
  formatting conventions in the docstrings. `Documenting Python`_ provides an
  overview of using Sphinx for documenting Python.
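
The exceptions listed above can be sketched together in one short module. The
class and method names here are hypothetical and exist only to illustrate
2-space indentation, the "Exception" suffix, two blank lines between methods,
a commented row of "=" between classes, and Sphinx-style docstring fields.

```python
import hashlib


class ChecksumMismatchException(Exception):
  """Raised when a computed checksum differs from the expected value."""


# ==============================================================================

class ChecksumValidator(object):
  """Compare hex digests of byte strings against expected values.

  :param algorithm: Name of a hash algorithm known to :mod:`hashlib`.
  :type algorithm: str
  """

  def __init__(self, algorithm="sha256"):
    self.algorithm = algorithm


  def digest(self, data):
    """Return the hexadecimal digest of ``data``.

    :param data: Content to hash.
    :type data: bytes
    :rtype: str
    """
    return hashlib.new(self.algorithm, data).hexdigest()


  def validate(self, data, expected):
    """Check ``data`` against the ``expected`` hexadecimal digest.

    :raises ChecksumMismatchException: If the digests differ.
    :returns: ``True`` when the digests match.
    """
    actual = self.digest(data)
    if actual != expected:
      raise ChecksumMismatchException(
        "expected %s, got %s" % (expected, actual))
    return True
```

With Sphinx autodoc enabled, the ``:param:``, ``:type:`` and ``:raises:``
fields in these docstrings are rendered as structured API documentation.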


.. _Style Guide for Python Code: http://www.python.org/dev/peps/pep-0008/

.. _EpyDoc: http://epydoc.sourceforge.net/

.. _DocString: http://www.python.org/dev/peps/pep-0257/

.. _Sphinx: http://sphinx.pocoo.org/index.html

.. _autodoc: http://sphinx.pocoo.org/ext/autodoc.html#module-sphinx.ext.autodoc

.. _Documenting Python: http://docs.python.org/documenting/

Java and Maven Setup
====================

Java code is currently compiled with OpenJDK 7, the reference implementation
of Java SE 7.  Our coding standards, as of July 2014, however, must conform to
Java 6.  Any software developed for DataONE and compiled with OpenJDK 7 must
be executable with a Java 6 runtime.

On a Debian Linux system, run the following apt-get commands to install all
needed Java components:

::

    sudo apt-get update
    sudo apt-get --no-install-recommends install openjdk-7-jdk
    sudo update-java-alternatives -s java-1.7.0-openjdk-amd64
    sudo apt-get --no-install-recommends install libjaxp1.3-java
    sudo apt-get --no-install-recommends install libxerces2-java
    sudo apt-get --no-install-recommends install ant
    sudo apt-get --no-install-recommends install maven

As of July 2014, DataONE uses Maven 3 as its dependency management and build
tool. Each DataONE Java project should follow a base Maven project structure
that, at a minimum, contains a pom.xml file that builds a complete component.

The following instructions are for Ubuntu or an Ubuntu derivative.
Mac or Windows development environments will need different instructions.
If you are using Ubuntu or an Ubuntu derivative, edit /etc/environment to
include the following settings:

::

    JAVA_HOME="/usr/lib/jvm/java-1.7.0-openjdk-amd64"
    MAVEN_OPTS="-Xms128m -Xmx512m"
    M2_HOME="/usr/share/maven"
    M2="/usr/share/maven/bin"

Paths are written out in full because /etc/environment is not a shell script
and does not expand variables such as $M2_HOME.

Developing Java Projects with Maven
-----------------------------------

From the development perspective, a developer will need to install Maven. If
the developer is using Eclipse and/or NetBeans, there are helpful plug-ins
that remove the need to download and install Maven separately. For Eclipse,
the m2eclipse plug-in helps the developer maintain the Maven configuration
files and dependencies needed for a project. An open source book,
`Developing with Eclipse and Maven`_, together with the documentation on the
Maven website, should be all that is required to get started.

.. _Developing with Eclipse and Maven: http://books.sonatype.com/m2eclipse-book/reference/

However, once a project uses Maven, it is expected to conform to certain
standards. The directory structure for a Maven project should conform to
the standard Maven template. If that is not possible, the structure may be
altered in the project descriptor.

::

  pom.xml               Maven configuration file
  build.xml             Ant configuration file
  src/                  Application and Testing structure
  src/main/java         Application/Library sources
  src/main/resources    Application/Library resources
  src/main/filters      Resource filter files
  src/main/assembly     Assembly descriptors
  src/main/config       Configuration files
  src/main/webapp       Web application sources
  src/test/java         Test sources
  src/test/resources    Test resources
  src/test/filters      Test resource filter files
  src/site              Site
  target/               Build directory
  target/classes        Classes for Build


It is unlikely that all these folders will be needed for every project, but
where they are needed, please follow the above structure.

It should also be noted that Maven will create a local repository of all
downloaded artifacts in a .m2 directory in the developer's home directory. If
a developer's local repository needs a unique configuration, it is saved in a
settings.xml file.

Product Version Control
=======================
DataONE source control is managed by various applications in the DataONE
software environment.  Jenkins, Maven, Subversion, PyPI, Debian packaging,
and Ant are used in building and controlling the release of
DataONE products.

DataONE follows a standard incremental versioning approach to its software
products whereby new features under development are assigned a version number.
Products are made of multiple components and the components supported by
DataONE are placed under version control as well. 

This document describes in some detail the process to follow when we want
to release a version and move mainline development to the next increment.

    Note: DataONE also tracks versions of its RESTful web services interface,
    or API, which indicate communication compatibility between the
    independently managed nodes of DataONE.  These versions, while not
    divorced from the software product versioning process, are not ruled by
    it either.

Baseline assumptions
--------------------
Any software project will eventually result in a software product.
A DataONE product may be composed of one or many software components.
Components may be combined to form one of five types of products:
a datatype schema, a software library (such as dataone-common-java),
an executable (such as dataone-cn-daemon), a web service (such as metacat),
or a software distribution package (for use by a software package management
system, such as the Debian package d1-cn-os-core).

Currently, software components are written in Java, Python, Bash script,
XML Schema, R, and Perl.  Other languages are not precluded.
The following baseline assumptions apply to any software produced, at any
level or type of product composition (component, schema, library,
web application, executable, package).

Software configuration management (SCM) practices
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Subversion (SVN) is the revision control management software of choice.

The SVN trunk is where new feature development takes place. New
features of existing software are written in the trunk. New software
components are written in the trunk. New API releases are developed
in the trunk.  Bug fixes of product releases must be applied to the
trunk when necessary. Dead development trees may be deleted from
the trunk.

Once a new feature has undergone sufficient development and testing, an
SVN branch of the software component is created. Branches that
have been created from the trunk may never be deleted.

The only code changes made in branches are patches that fix functionality.

SVN tags are used to mark versions of stable code that meet the DataONE
standard of a Project Deliverable.

SVN tags are not modified after they are created. They are considered
static milestones of the software stack.

Tags are never to be deleted.

In rare circumstances, branches may be reserved for divergent or
exploratory development; such a branch typically requires a merge back into
the trunk when complete.  The divergent or exploratory
branch may be deleted after the merge is complete.

Build practices
~~~~~~~~~~~~~~~

DataONE's build automation technology for Java components is Maven.

Maven's dependency management, using a DataONE local Maven
repository, is used to integrate different DataONE components together.

The DataONE local Maven repository is located on maven.dataone.org and is
populated by Jenkins.

Testing practices and Continuous Integration
~~~~~~~~~~~~~~~~~

Unit tests are written as part of the component under development,
and should be passing locally, in the developer's own environment,
before code is committed to the SVN repository.

Jenkins is the software product used as our continuous integration
environment.  All components should be built by Jenkins and all tests must
pass on Jenkins before a component is considered verified.

Jenkins jobs are created for each component under development,
for continuous unit testing of committed code.

Jenkins deploys verified components (those that pass all tests)
to the DataONE local Maven repository.

Integration testing is primarily accomplished through the dedicated
software product d1_integration.

    Test environments, each consisting of a distinct set of nodes,
    are maintained to accomplish integration testing.

    The current DataONE testing environments are dev, sandbox, stage2,
    and stage. It is possible for a developer to establish their own
    locally hosted environment.

        The Dev environment should pull and test software from
        the trunk.  The Beta and Stage2 environments should pull and test
        software from the branches. The Stage environment should pull and
        test software from the tags.

Product Deliverables
---------------------
DataONE product deliverables may be acquired through one of two mechanisms.
The site releases.dataone.org provides schemas, libraries and web services.
A Debian repository on jenkins-1.dataone.org
(http://jenkins-1.dataone.org/ubuntu precise universe) may be used for
distribution of Debian software packages.

Software versioning
~~~~~~~~~~~~~~~~~~~
Software versions are applied as strings in various files depending
upon the component build strategy.  For Java products, the software revision
number is set in the pom.xml file. For Debian packages, the revision
number is set in the control file.  For XML schema, the revision number
is set on the version attribute of the xsd element.  Each kind of software
component has its own way of recording revision numbers, and
should conform to the best community-wide standard.

DataONE follows a sequence-based naming scheme for its
software revisions. Its sequence-based identifiers take
the form x.y.z[-qualifier], where 'x' is a 'major' revision, 'y' is a
'minor' revision, 'z' is a 'maintenance' patch and the qualifier refers to its
level of stability.  A valid identifier is one that matches the Perl regular
expression ``/^\d+\.\d+\.\d+(?:-(?:SNAPSHOT|BETA|RC\d+))?$/``.
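
As a rough sketch of how such an identifier could be checked in Python, the
helper below assumes that the qualifier, when present, follows a single hyphen
and is one of SNAPSHOT, BETA or RC followed by a number. The name
``parse_version`` is hypothetical.

```python
import re

# Assumed pattern: x.y.z with an optional -SNAPSHOT, -BETA or -RC<n> qualifier.
VERSION_PATTERN = re.compile(
  r"(\d+)\.(\d+)\.(\d+)(?:-(SNAPSHOT|BETA|RC\d+))?")


def parse_version(text):
  """Return (major, minor, maintenance, qualifier), or None if invalid.

  The qualifier is None for an unqualified release.
  """
  # fullmatch() requires the whole string to match, not just a prefix.
  match = VERSION_PATTERN.fullmatch(text)
  if match is None:
    return None
  major, minor, maintenance, qualifier = match.groups()
  return int(major), int(minor), int(maintenance), qualifier
```

For example, ``parse_version("1.2.0-SNAPSHOT")`` returns
``(1, 2, 0, "SNAPSHOT")``, while ``parse_version("1.2")`` returns ``None``
because the maintenance number is missing.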

The 'major' revision number is only increased when a significant change
to the Service Level APIs has taken place such that communication
between differing major revisions (such as a revision 1 and revision 2)
of the software stack is incompatible. The corollary is that within any major
revision (v1), all API calls are compatible between software implementations.

The 'minor' revision number is only increased when new functionality
is added to a release and is backward compatible with
the major release of which it is a part. The new release revision number
will be x.(y+1).0.

When only bugfixes are part of the new release there will be a maintenance
release, otherwise known as a patch release. The revision number of the new 
release in this case will be of the form x.y.(z+1).
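
The two increment rules above can be sketched as small helpers. The function
names are hypothetical, and the sketch assumes an unqualified 'x.y.z' string
as input.

```python
def next_minor(version):
  """Return x.(y+1).0 for an unqualified 'x.y.z' version string."""
  major, minor, _ = (int(part) for part in version.split("."))
  return "%d.%d.0" % (major, minor + 1)


def next_maintenance(version):
  """Return x.y.(z+1) for an unqualified 'x.y.z' version string."""
  major, minor, maintenance = (int(part) for part in version.split("."))
  return "%d.%d.%d" % (major, minor, maintenance + 1)
```

A minor release resets the maintenance number to zero, so
``next_minor("1.2.3")`` gives ``"1.3.0"``, whereas
``next_maintenance("1.2.3")`` gives ``"1.2.4"``.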

Qualifiers are added to indicate the stability of the release.
SNAPSHOTs are constantly changing and are only applied in the trunk. SNAPSHOTs
are considered to be in an ALPHA stage of development. BETA releases are set
when all development activity is feature complete and all tests are passing.
The BETA stage of development is aimed at working out bugs and/or performance
issues. BETA software should be run through a series of integration tests on
DataONE nodes established for BETA testing. 'Release Candidate' (RC) is for
software components that are tagged but still in a testing phase. Software at
the RC stage has completed BETA integration testing and has its code frozen.
An RC will not undergo further code changes. If an RC component fails in the
final integration tests on Staging nodes, then software development must
revert to BETA phase development and testing. A RELEASE without a qualifier is
a finalized product and is considered to be deliverable to the public.


Cutting a stable software release from development
==================================================

DataONE follows a structured process of product lifecycle management.
Each product is evaluated for need and prioritized in the
development cycle.  Before any coding begins, requirements are gathered
and a product is designed.  Once a design has been approved, formal
development begins by creating software projects. Development is initially
tested in the development environment.  After coding is
complete, the code goes through BETA testing in the
sandbox and staging environments.

One staging environment is for release candidate testing, while the other is a
simulated production environment.

Only software that has been tagged for production release is issued
into the simulated production environment. The last environment is
production.

Development or Maintenance of a Software Component
--------------------------------------------------

DataONE follows a typical controlled-iteration process for developing
its software products. All initial development should occur in the trunk.
Development in the trunk is strictly new features and
backported bug fixes.

Unit tests should be written to test fulfillment of the requirements
of the software component.
After the unit tests pass and the software compiles, the code
may be checked in to Subversion.

A Jenkins job will compile a version of the trunk with the new code. The
code may then be installed on a development environment.

Build Control Files
~~~~~~~~~~~~~~~~~~~

Each level of software release has a properties file that describes the
location of the software package.  Development, Staging and Production
properties files are maintained in separate directories in the trunk
of the Subversion repository:

Development

    current_dev_control.properties_

.. _current_dev_control.properties: https://repository.dataone.org/software/cicore/trunk/d1_dev_build_control/current_dev_control.properties

Beta

    current_beta_control.properties_

.. _current_beta_control.properties: https://repository.dataone.org/software/cicore/trunk/d1_beta_build_control/current_beta_control.properties

Production

    current_release_control.properties_

.. _current_release_control.properties: https://repository.dataone.org/software/cicore/trunk/d1_stable_build_control/current_release_control.properties

The format of the control properties file is a set of key-value pairs in
which each key is a named identifier for a component.  The named identifier
must be identical in each environment control file for a component.
The value of the named identifier is a relative path that indicates the
current working Subversion directory for the component.  In the Development
control properties file, the relative path begins at 'trunk', such that
the named identifier 'D1_COMMON_JAVA', representing the DataONE common Java
library, has a value of 'trunk/d1_common_java', a relative path which, when
combined with the string 'https://repository.dataone.org/software/cicore/',
provides a full path to the software project.

Similarly, for the staging and production control properties files, the key
is a named identifier that must be present in the development control
properties file (assuming it is still under active development).  The key
must be identical to the one in the other files. However, the value
differs because the relative path is different for each environment.

The relative path to the staging resources begins with the 'branches' token,
such that if the active major.minor release of the DataONE common Java
library, denoted by the named identifier 'D1_COMMON_JAVA', is 1.2, then
the relative path would be 'branches/D1_COMMON_JAVA_v1.2'.  Similarly,
the relative path to the production resources begins with the 'tags'
token.  However, tags are made with a full revision number conforming
to the scheme major.minor.maintenance. For example, the 1.2.1 release
of the DataONE common Java library would be denoted by the relative
path 'tags/D1_COMMON_JAVA_v1.2.1'. When the relative paths are combined
with the absolute path to the cicore Subversion repository location,

    https://repository.dataone.org/software/cicore/

the component is discoverable.
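
A minimal sketch of this lookup follows. It assumes the control files use
plain ``KEY=value`` lines with ``#`` comments, which is not confirmed here;
the function names are hypothetical, and the sample entries are taken from the
examples in the text rather than from the real files.

```python
CICORE_ROOT = "https://repository.dataone.org/software/cicore/"


def parse_control_properties(text):
  """Parse 'KEY=relative/path' lines, skipping blanks and '#' comments."""
  entries = {}
  for line in text.splitlines():
    line = line.strip()
    if not line or line.startswith("#"):
      continue
    key, _, value = line.partition("=")
    entries[key.strip()] = value.strip()
  return entries


def component_url(entries, key):
  """Join the repository root and the component's relative path."""
  return CICORE_ROOT + entries[key]


# The same key resolves to a different location in each environment.
dev = parse_control_properties("D1_COMMON_JAVA=trunk/d1_common_java\n")
release = parse_control_properties(
  "# production\nD1_COMMON_JAVA=tags/D1_COMMON_JAVA_v1.2.1\n")
```

Calling ``component_url(dev, "D1_COMMON_JAVA")`` yields the trunk location,
while the same key looked up in ``release`` yields the tagged location.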

Edit the Development build control file
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Any new component will need an entry in the development control file. The
key should be an uppercase version of the component's name. For example,
the directory that contains the DataONE coordinating nodes' common library
project is 'd1_cn_common', so the key should be named
'D1_CN_COMMON'.  The value should reflect the relative path to the
software project starting from the absolute path of the cicore
Subversion repository:

    https://repository.dataone.org/software/cicore/

Thus for d1_cn_common, the relative path would be 'trunk/cn/d1_cn_common'.


Determine the components and products to release
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
You will need to communicate with other developers, look at Redmine
tickets and review Subversion logs to determine the extent of
changes that will need to be included in a production release.  These
changes should be noted in a Redmine ticket. Ideally, every file that has
been modified, added or deleted will be noted in Redmine tickets that relate
to the release.

Release Control Tools
---------------------

In order to make branching and tagging easier, I have committed some Perl
scripts. If you check out

    https://repository.dataone.org/software/tools/trunk/control_release/

you will find two Perl scripts.  They are both command-line executables.


**controlStageAndRelease.pl**

This executable will branch or tag DataONE components as listed in
the build control property files maintained in trunk, either singly or in
batch.

To branch development for staging, the script will pull from the trunk
components as noted in

    https://repository.dataone.org/software/cicore/trunk/d1_dev_build_control/current_dev_control.properties

and create a Subversion copy at the branch location as listed in

    https://repository.dataone.org/software/cicore/trunk/d1_staging_build_control/current_staging_control.properties

To tag a release, the script will pull from repository branch components
as noted in

    https://repository.dataone.org/software/cicore/trunk/d1_staging_build_control/current_staging_control.properties

and create a Subversion copy at the tag location as listed in

    https://repository.dataone.org/software/cicore/trunk/d1_stable_build_control/current_release_control.properties

Execute as:

::

        controlStageAndRelease.pl --tag || --branch
             --component componentName || --all
               [--revision svnRevisionNumber]

             --tag        create a tag from a branch (may not be used in
                          combination with --branch)
             --branch     create a branch from the trunk (may not be used in
                          combination with --tag)
             --component  a specific DataONE software component name, derived
                          from the keys in the properties files, to either
                          branch or tag (may not be used in combination with
                          --all)
             --all        tag or branch all the components as listed in a
                          properties file, ignoring those already tagged or
                          branched (may not be used in combination with
                          --component)
             --revision   optional parameter that pulls the component to be
                          branched or tagged from an SVN revision

A second command is

**mergeDiffProjects.pl**

Use this program to find differences between two copies of a single project
that are checked out in two different directories on your system, and to
merge them.

For example, suppose you have checked out the trunk and a branch on
your computer and you wish to find all the files that are missing
or have been altered.  You would issue a command similar
to the one below:

::

    mergeDiffProjects.pl --trunk /home/rwaltz/Projects/AllSoftware/d1_common_java \
        --branch /home/rwaltz/Projects/branches/D1_COMMON_JAVA_v1.1

The program assumes you have checked out both the trunk and the branches
of the projects you will diff and then merge.

To check out the trunk, issue the svn command in a directory of your
choice:

::

    svn checkout https://repository.dataone.org/software/cicore/trunk AllSoftware

Checking out branches is a bit more difficult considering the large number
of subdirectories in the branches and the fact that there are full dumps
of the trunk in singular branch lineages.  It is best to check out all the
folders without their contents and then allow mergeDiffProjects.pl to
update the contents upon execution (with svn update --set-depth infinity):

::

    svn checkout --depth immediates https://repository.dataone.org/software/cicore/branches branches

If you need to compare all the projects from trunk and branch, specify
the --all switch and then provide the paths to the checked-out
trunk and branches.  The directory holding trunk must have all the
subdirectories of the trunk, and most importantly the trunk subdirectories
d1_dev_build_control and d1_staging_build_control.

The program will investigate the properties files in both subdirectories and
come up with a series of projects to diff.

Once the diff is complete on a project, any files with the same name
that have differences will be shown in a visual diff editor.  You
should verify that all the changes from the trunk that are needed
in the branch are present.  Also, any files that are missing from the
trunk will be examined and you will be prompted whether they should be
copied to the branch.

Execute as:

::

    mergeDiffProjects.pl [--all]
            --trunk path_to_a_trunk_project | path_to_the_full_trunk
            --branch path_to_a_branched_project | path_to_the_full_branch

            --all     (optional) compare all the components in the trunk
                      to the components in branches;
                      use full paths to the trunk and branches
            --trunk   the full path to the component in the trunk
                      to compare, or if used with the --all toggle, the path
                      to the checked-out trunk
            --branch  the full path to the component in the branch to
                      compare, or if used with the --all toggle, the path to
                      the checked-out branches

The visual diff program used on Linux environments is meld.
For Mac, it is gvimdiff.  If a better visual diff tool is available,
change the code to point to it.


Testing and Staging of a Software Component
-------------------------------------------

Once product development is complete, code should be migrated to a
staging environment.  The staging environment is built from branches
in the Subversion repository.  The staging environment also follows a
controlled-iteration process for maintenance or new releases.  Branches
of code are tested with integration tests on staging machines.  There
are two staging environments. One is a pure staging
environment that pulls proposed releases from branches for testing.
Once the branched code passes all tests, the code is tagged, and
the tagged version of the code is tested in a production simulation
environment, the second staging environment. If a tag does not pass
testing in the simulated production environment, a new round of edits
will occur in the branch and a new revision level will be released as a
tag.

Configure Staging Build Control file
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The staging build is controlled through a configuration file.
If you have not done so, check out the d1_staging_build_control directory:

::

    svn checkout https://repository.dataone.org/software/cicore/trunk/d1_staging_build_control d1_staging_build_control

The properties file that builds the current staging environment is named
current_staging_control.properties.

You are then ready to modify the current_staging_control.properties file.

The Staging build control file only needs to be updated upon a new Major 
or Minor release of the products.  For maintenance or bugfix releases, 
modifications are performed in the same branch and thus the staging 
control file may point to the same branch.

Any new component will need an entry in the staging control file. The
key should be an uppercase version of the component's name. It should
be identical to the key in the development control file. For example,
the DataONE coordinating nodes' common library's named identifier
is 'D1_CN_COMMON' and the key should be named the same.
The value should reflect the relative path to the
software project starting from the absolute path of the cicore
Subversion repository:

    https://repository.dataone.org/software/cicore/

Since staging allows for maintenance fixes in its projects, the name
of the branch should conform to the scheme major.minor. The formal
naming scheme is the uppercase project name + '_v' + major.minor.
For example, a relative path for DataONE's coordinating nodes' common
library would be 'branches/D1_CN_COMMON_v1.0'.
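
The naming scheme can be sketched as two small helpers. The function names
are hypothetical; the tag form follows the major.minor.maintenance scheme that
this document uses for production tags.

```python
def branch_path(component, major, minor):
  """Return the staging path 'branches/<NAME>_v<major>.<minor>'."""
  return "branches/%s_v%d.%d" % (component.upper(), major, minor)


def tag_path(component, major, minor, maintenance):
  """Return the production path 'tags/<NAME>_v<major>.<minor>.<maint>'."""
  return "tags/%s_v%d.%d.%d" % (component.upper(), major, minor, maintenance)
```

Here ``branch_path("d1_cn_common", 1, 0)`` gives
``"branches/D1_CN_COMMON_v1.0"``. Only the tag carries a maintenance number,
because maintenance fixes accumulate within a single major.minor branch.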

Check Out or Create Subversion Branches
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~--
If you do not have have the subversion branches checked out, then do so:

>       svn checkout --depth immediates https://repository.dataone.org/software/cicore/branches branches

        (Please note the use of --depth immediates. This option allows only the
        top-level directory, branches, and its immediate children to be checked
        out. To retrieve the entire contents of a branch, change directories
        into the branch and perform an update command with the --depth infinity
        option.)
        
The subversion branches will meet the criteria for Major.Minor releases.
All maintenance patches between minor/major releases will occur within
the branches checked out for that release.  For example, when development
in the trunk has ended upon the 1.1 release of DataONE, branches for
every component will need to be created to support maintenance releases.
The format of the branch names is:

        <UPPERCASE PRODUCT NAME>_v<MAJOR>.<MINOR>

If the subversion branch you need has to be created, then you should run
the controlStageAndRelease.pl tool described above.

If you execute controlStageAndRelease.pl with the --branch switch,
the code will create a new branch, named from the newly committed
value in the staging control properties file for the component that
needs to be tested.  You can specifically name the component with the
--component <NAMED_IDENTIFIER_KEY> option, or you can provide the --all
switch. The --all switch causes the program to evaluate all the paths
in the staging control properties file to ensure they exist and, if not,
to create them.
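
The internals of the Perl tool are not shown in this document. The
following Python sketch only illustrates the kind of check the --all switch
is described as performing. It assumes the control file uses simple
``KEY=value`` lines; the ``D1_CN_REST`` entry and the set of existing paths
are hypothetical, and in practice the existence check would be answered by
Subversion.

```python
def parse_control_properties(text: str) -> dict[str, str]:
    """Parse KEY=value lines, skipping blank lines and '#' comments."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        entries[key.strip()] = value.strip()
    return entries


def missing_paths(entries: dict[str, str], existing: set[str]) -> list[str]:
    """Return the component keys whose paths are not in `existing`."""
    return sorted(key for key, path in entries.items() if path not in existing)


# Hypothetical control file content, for illustration only.
SAMPLE = """
# staging control entries
D1_CN_COMMON=branches/D1_CN_COMMON_v1.0
D1_CN_REST=branches/D1_CN_REST_v1.0
"""

entries = parse_control_properties(SAMPLE)
# Stand-in for a repository query: only one branch exists so far.
existing = {"branches/D1_CN_COMMON_v1.0"}
print(missing_paths(entries, existing))
```

The keys reported as missing are the components for which a branch would
still have to be created.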


Compare the Trunk to the Branches
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
To ensure that all changes have been merged, visually inspect the
differences between the modifications made in trunk and the files in
the branch.

To assist with the process, you should run the mergeDiffProjects.pl 
program.

After ensuring that all the files in the projects have been merged,
increment the version number in the pom.xml file for Java projects,
or the control file for Debian package builds.

Make certain to test building the newly changed Java components. Ensure
any dependencies in the pom have been updated to point to the appropriate
version of the dependency.
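
One part of the dependency check can be automated. The sketch below lists
the entries in a pom's ``<properties>`` element that still end in
``-SNAPSHOT``. It assumes that a branch pom should reference released
dependency versions; the embedded pom fragment is a made-up example, and
the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Made-up pom fragment used only to exercise the check.
POM = """<project>
  <artifactId>d1_common_java</artifactId>
  <version>1.3.1</version>
  <properties>
    <org.jibx.version>1.2.3</org.jibx.version>
    <d1_jibx_extensions_release>1.0.0-RC5-SNAPSHOT</d1_jibx_extensions_release>
    <d1_test_resources_release>1.0.0-RC4</d1_test_resources_release>
  </properties>
</project>"""


def local_name(tag: str) -> str:
    """Strip an XML namespace prefix such as '{http://...}version'."""
    return tag.rsplit("}", 1)[-1]


def snapshot_properties(pom_xml: str) -> dict[str, str]:
    """Return the <properties> entries whose value ends in -SNAPSHOT."""
    root = ET.fromstring(pom_xml)
    found = {}
    for child in root:
        if local_name(child.tag) != "properties":
            continue
        for prop in child:
            value = (prop.text or "").strip()
            if value.endswith("-SNAPSHOT"):
                found[local_name(prop.tag)] = value
    return found


print(snapshot_properties(POM))
```

An empty result means no property in the pom points at a SNAPSHOT. It does
not replace a test build of the component.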


Run Staging Jenkins Build Process & Upgrade
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Log in to Jenkins at the default 'Jenkins' screen as admin (trying from any
other screen will cause a failure).

Click on the tab 'Staging Builds'.

In the project dataone-cn-metacat-deb-stage, edit one place:

>        under Build -> Execute -> Command edit the string 'JARS=( Metacat_stable/workspace/METACAT_X_X_X/dist/knb.war );' (substitute release numbers for X_X_X )

In the project Metacat_Stage, edit four places:

>        Source Code Management -> Subversion -> Modules -> Repository URL edit the string to be https://code.ecoinformatics.org/code/metacat/tags/METACAT_X_X_X (substitute release numbers for X_X_X )

>        Execute shell -> Command edit the string 'cd "$WORKSPACE/METACAT_X_X_X/docs/user/metacat"' (substitute release numbers for X_X_X )

>        Invoke Ant -> Targets -> Press Advanced... Button -> edit the string 'metacat.dir=/home/jenkins/work/jobs/Metacat_stable/workspace/METACAT_X_X_X' (substitute release numbers for X_X_X )

>        Javadoc directory -> edit the string 'METACAT_X_X_X/build/docs'  (substitute release numbers for X_X_X )

Once these edits have been made, click on the job 'Build_Stage' in
the 'Staging Builds' tab.  In the left sidebar of the page, click
'Build Now' to begin the build process.

The build process is complete when 'Build_Stage_Level_8' has completed
running its last job, currently 'dataone-mercury-deb-stage'.
Log in to the staging machines, turn off all processing and indexing
daemons, then execute the 'sudo apt-get update' and
'sudo apt-get upgrade' commands.


Production Release of a Software Component
-------------------------------------------

Configure Stable Build Control file
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The stable build is controlled through a configuration file.
If you have not done so, then check out the d1_stable_build_control directory:

>       svn checkout https://repository.dataone.org/software/cicore/trunk/d1_stable_build_control  d1_stable_build_control
        
Each historical release and release candidate of the coordinating 
node stack should have its own tagged d1_stable_build_control properties 
file. Before modifying the  current_release_control.properties file for 
a new build, confirm that the previous version was tagged, and if not 
then tag it.

For example, if the previous release was 1.0.4 for the CN stack and the
next release is 1.1.0, then the following command would be appropriate
to execute:

>       svn copy -m"Preparing for 1.1.0 tagging by backing up current stable build properties" 
>               https://repository.dataone.org/software/cicore/trunk/d1_stable_build_control 
>               https://repository.dataone.org/software/cicore/tags/D1_STABLE_BUILD_CONTROL_v1.0.4
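
If this backup step is scripted, the same command can be assembled from the
two release numbers. The following is a minimal sketch; the function name
``backup_tag_command`` is hypothetical, and the block only builds and
prints the argument list rather than running Subversion.

```python
REPO = "https://repository.dataone.org/software/cicore"


def backup_tag_command(previous: str, upcoming: str) -> list[str]:
    """Build the svn copy arguments that tag the stable control directory."""
    message = (
        f"Preparing for {upcoming} tagging by backing up "
        "current stable build properties"
    )
    return [
        "svn", "copy", "-m", message,
        f"{REPO}/trunk/d1_stable_build_control",
        f"{REPO}/tags/D1_STABLE_BUILD_CONTROL_v{previous}",
    ]


cmd = backup_tag_command("1.0.4", "1.1.0")
print(" ".join(cmd))
# To execute it, the list could be passed to subprocess.run(cmd, check=True).
```

Keeping the arguments in a list avoids shell quoting problems with the
commit message.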

Now, you are ready to modify the current_release_control.properties file.

Any new release will need an entry in the stable control file. The
key should be an uppercase version of the component's name. It should
be identical to the value in the staging control file. For example,
the DataONE coordinating nodes' common library's named identifier
is 'D1_CN_COMMON' and the key should be named the same.
The value should reflect the relative path to the
software project starting from the absolute path of the cicore
subversion repository:

    https://repository.dataone.org/software/cicore/
  
Since stable is a final production release, the name
of the tag should conform to the schema of major.minor.maintenance.
The formal naming scheme is the uppercase project name + '_v' +
major.minor.maintenance.

For example, the relative path of DataONE's coordinating nodes' common
library would be 'tags/D1_CN_COMMON_v1.0.0'. Therefore,
'tags/D1_CN_COMMON_v1.0.0' would be the value for the key 'D1_CN_COMMON'
in the d1_stable_build_control properties file.
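
A value in the stable control file can be checked against this scheme
before it is committed. The sketch below is an illustration only: the
pattern and the function name ``is_stable_tag_path`` are assumptions, and
hyphens are allowed in the name because tag names such as
DATAONE-CN-METACAT contain them.

```python
import re

# tags/<UPPERCASE NAME>_v<MAJOR>.<MINOR>.<MAINTENANCE>
TAG_VALUE = re.compile(r"tags/[A-Z0-9_-]+_v\d+\.\d+\.\d+")


def is_stable_tag_path(value: str) -> bool:
    """Return True if the value names a major.minor.maintenance tag."""
    return TAG_VALUE.fullmatch(value) is not None


print(is_stable_tag_path("tags/D1_CN_COMMON_v1.0.0"))    # True
print(is_stable_tag_path("tags/D1_CN_COMMON_v1.0"))      # False, no maintenance level
print(is_stable_tag_path("branches/D1_CN_COMMON_v1.0"))  # False, a branch path
```

The check only validates the form of the value. Whether the tag exists in
the repository still has to be confirmed separately.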

Tag a Component
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The subversion tags will meet the criteria for Major.Minor.Maintenance 
releases.  No maintenance is to be performed on tags. They mark the
end of a line of development.

For example, when revisions in a branch have ended and a production
release of DataONE is ready, a tag for a production component will
need to be created.  The format of the tag names is:

        <UPPERCASE PRODUCT NAME>_v<MAJOR>.<MINOR>.<MAINTENANCE>

The subversion tag will have to be created. You should run
the controlStageAndRelease.pl tool described above.

If you execute controlStageAndRelease.pl with the --tag switch,
the code will create a new tag, named from the newly committed
value in the stable control properties file for the component.
You can specifically name the component with the
--component <NAMED_IDENTIFIER_KEY> option, or you can provide the --all
switch. The --all switch causes the program to evaluate all the paths
in the stable control properties file to ensure they exist and, if not,
to create them.


Confirm that the tag exists in subversion by reviewing the list of 
components in https://repository.dataone.org/software/cicore/tags/ .
You will then edit the current_release_control.properties file. 

Update Trunk to new Snapshot level
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1) Maven Projects

Lastly, edit the pom.xml in the trunk and branches of the project to
increase the revision numbers.  Modify the version element of the
project to increase the revision number, and modify the d1 component
property elements to point to the latest SNAPSHOTs of those
dependencies. For example, if dataone_common_java had just been updated
from release 1.2.3 to 1.3.0, then the new trunk
pom.xml would look like:

::

    <artifactId>d1_common_java</artifactId>
    <packaging>jar</packaging>
    <version>1.4.0-SNAPSHOT</version>
    <name>DataONE_Common_Java</name>
    <url>http://dataone.org</url>
    <description>DataONE Common Code with Service Interface Definitions</description>
    <properties>
        <org.jibx.version>1.2.3</org.jibx.version>
        <d1_jibx_extensions_release>1.0.0-RC5-SNAPSHOT</d1_jibx_extensions_release>
        <d1_test_resources_release>1.0.0-RC5-SNAPSHOT</d1_test_resources_release>
    </properties>

while the pom.xml in the branch would look like:

::

    <artifactId>d1_common_java</artifactId>
    <packaging>jar</packaging>
    <version>1.3.1</version>
    <name>DataONE_Common_Java</name>
    <url>http://dataone.org</url>
    <description>DataONE Common Code with Service Interface Definitions</description>
    <properties>
        <org.jibx.version>1.2.3</org.jibx.version>
        <d1_jibx_extensions_release>1.0.0-RC4</d1_jibx_extensions_release>
        <d1_test_resources_release>1.0.0-RC4</d1_test_resources_release>
    </properties>

and the pom.xml in the tag would look like:

::

    <artifactId>d1_common_java</artifactId>
    <packaging>jar</packaging>
    <version>1.3.0</version>
    <name>DataONE_Common_Java</name>
    <url>http://dataone.org</url>
    <description>DataONE Common Code with Service Interface Definitions</description>
    <properties>
        <org.jibx.version>1.2.3</org.jibx.version>
        <d1_jibx_extensions_release>1.0.0-RC4</d1_jibx_extensions_release>
        <d1_test_resources_release>1.0.0-RC4</d1_test_resources_release>
    </properties>

Commit the pom with the latest SNAPSHOT.
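
The version numbers in the three pom.xml examples follow a simple pattern,
sketched below. The rule that trunk moves to the next minor SNAPSHOT is
inferred from the 1.3.0 example only; a major release would need a
different rule, and the function name ``next_versions`` is hypothetical.

```python
def next_versions(release: str) -> dict[str, str]:
    """Given a tagged release 'major.minor.maintenance', suggest pom versions."""
    major, minor, maintenance = (int(part) for part in release.split("."))
    return {
        "tag": release,                                  # frozen release
        "branch": f"{major}.{minor}.{maintenance + 1}",  # next maintenance level
        "trunk": f"{major}.{minor + 1}.0-SNAPSHOT",      # next minor SNAPSHOT
    }


print(next_versions("1.3.0"))
```

For release 1.3.0 this yields 1.3.0 for the tag, 1.3.1 for the branch and
1.4.0-SNAPSHOT for trunk.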

2) Debian Packages

  Similarly, edit the control files in the trunk and branch to increase 
  to the next major.minor.maintenance level and then commit.  

Commit Control File
~~~~~~~~~~~~~~~~~~~

Once all the components have been tagged and branched for a release, 
then you will need to commit the current_release_control.properties 
into svn.

Jenkins Stable Build Processes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The build process on Jenkins is as follows:

Log in to Jenkins at the default 'Jenkins' screen as admin (trying from any
other screen will cause a failure).

Click on the tab 'Stable Builds'.

In the project dataone-cn-metacat-deb-stable, edit two places:

        under String -> Parameter edit 'DATAONE_CN_METACAT_DEB' to be represented by the default value tags/DATAONE-CN-METACAT_vx.x.x (where x.x.x is the tag being released)

        under Build -> Execute -> Command edit the string 'JARS=( Metacat_stable/workspace/METACAT_X_X_X/dist/knb.war );' (substitute release numbers for X_X_X )

In the project Metacat_Stage, edit four places:

        Source Code Management -> Subversion -> Modules -> Repository URL edit the string to be https://code.ecoinformatics.org/code/metacat/tags/METACAT_X_X_X (substitute release numbers for X_X_X )

        Execute shell -> Command edit the string 'cd "$WORKSPACE/METACAT_X_X_X/docs/user/metacat"' (substitute release numbers for X_X_X )

        Invoke Ant -> Targets -> Press Advanced... Button -> edit the string 'metacat.dir=/home/jenkins/work/jobs/Metacat_stable/workspace/METACAT_X_X_X' (substitute release numbers for X_X_X )

        Javadoc directory -> edit the string 'METACAT_X_X_X/build/docs' (substitute release numbers for X_X_X )

Deliver Product to the Public
-----------------------------
Copy the product over to releases.dataone.org.  Write an announcement
and post it.


Miscellaneous Notes
-------------------

Notes about Jenkins builds:
~~~~~~~~~~~~~~~~~~~~~~~~~~~

There are three different apt repositories built by Jenkins:

  http://jenkins-1.dataone.org/ubuntu-unstable/
  
and

  http://jenkins-1.dataone.org/ubuntu-beta/
  
and

  http://jenkins-1.dataone.org/ubuntu-stable/
  
The unstable packages are built from the subversion trunk. Jenkins will 
take the existing version number in the package and append RYYYYMMDD.nnn 
to the version number, where YYYY=year, MM=month, DD=day, 
nnn = build number for the day. So for example, the package 
"dataone-cn-portal" currently has Version: "1.0.0-RC2~unstable" in the 
DEBIAN/control file. The resulting file name and package version as 
generated by Jenkins are:

  dataone-cn-portal_1.0.0R20120303.002~unstable_all.deb
  
for the file name and the version information contained in Packages.gz is:

  1.0.0R20120303.002~unstable
  
The auto-generated version number means that any build of the package is
equivalent to a snapshot, and apt will always install the new version
during upgrade or install operations. The stable packages, built by the
Stable_Build job on Jenkins, use the version information exactly as
recorded in the DEBIAN/control file. These packages are built from
subversion tags, and so will not change between compilations.
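
The shape of the unstable version string from the example above can be
sketched as follows. This is not the Jenkins job itself: the function names
are hypothetical, and the base version "1.0.0" is passed in directly
because how Jenkins derives it from the DEBIAN/control file is not
described here.

```python
from datetime import date


def unstable_version(base: str, build_date: date, build_number: int) -> str:
    """Compose <base>RYYYYMMDD.nnn~unstable from its parts."""
    return f"{base}R{build_date:%Y%m%d}.{build_number:03d}~unstable"


def package_file_name(package: str, version: str, arch: str = "all") -> str:
    """Compose a <package>_<version>_<arch>.deb file name."""
    return f"{package}_{version}_{arch}.deb"


version = unstable_version("1.0.0", date(2012, 3, 3), 2)
print(version)
print(package_file_name("dataone-cn-portal", version))
```

With these inputs the sketch reproduces the version and file name shown in
the example for dataone-cn-portal.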

The result is that the ubuntu-unstable repository always contains only
unstable (trunk) builds, and they always have a modified version tag.
The ubuntu-stable repository always contains only the stable builds from
tags, and these have the version as recorded in the DEBIAN/control file.
As a result, there should never be a mix-up between unstable and stable
packages. However, it is of course critical that the version numbers in
the DEBIAN/control files are updated when a new release is being made.
Apt uses that information to determine if it needs to install the
package over an existing installation.


Detail on Metacat versioning and integration testing:
+++++++++++++++++++++++++++++++++++++++++++++++++++++

Much of the CN interface is implemented by Metacat, whose code is 
maintained in a separate code repository (https://code.ecoinformatics.org/code/metacat/).  
It also has its own set of tests that need to pass before being 
released to DataONE.  Still, metacat needs to pass d1_integration tests, 
so changes in metacat code need to be deployed into a testing environment 
for this to happen. 

Some relevant detail:

    1) The deployable unit of Metacat is the metacat.war, and it is
       deployed as a separate webapp into a Tomcat container.

    2) DataONE uses Metacat in two contexts:

       a) As a Member Node: the metacat.war is deployed as metacat on
          the target node.

       b) As a component of the Coordinating Node: the metacat.war is
          copied to the DataONE SVN repository under the
          cn-buildout/dataone-cn-metacat project.

It may not be practical in most cases for the Metacat developer to deploy
often to the DEV environment, so integration testing against a
local Metacat deployment is preferable, and has been facilitated.
In this way DataONE integration tests (at least for member node
functionality) can be run against the same locally deployed Metacat
instance used for Metacat tests.