Software Development Guidelines

Consistently designed, structured and formatted code is easier to read, reuse and maintain, which matters given the effort (and expense) that goes into writing software. This document provides some basic guidelines for the development of code for the DataONE project.

General

  • Document the code. No exceptions.
  • It is generally a good idea to indicate the authors of the code (usually in the comment block for the file). This is especially helpful a couple years from now when trying to determine why something is implemented a particular way or to find the expert on the issue being examined.
  • Set your editor to use UTF-8 by default.
  • Always at least make a list of external dependencies (libraries, tools, applications) and where to find them (if not included in version control).
  • Avoid OS specific dependencies. If one is really necessary, always make a note of any OS specific dependencies introduced by a block of code, and always implement such a block so that replacing it with another block developed in the future would be relatively trivial (e.g. don’t embed a bunch of win32 specific calls throughout the code; write an abstraction layer instead).
  • As a general development note, all code should conform to the separation of concerns principle by way of modular design. Code should be grouped together into modules that act together to provide specific functionality or take on limited responsibilities.
  • Write unit tests. Unit testing should be included in any code development where feasible. Unit tests should not rely on external services. Unit tests should only test a single module of code rather than an entire software package. The aim of a unit test is to evaluate the smallest unit of a component, such as a method, function or data structure. The best way to achieve unit testing is through mock interfaces of other modules.
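As a minimal sketch of the mock-based approach described in the last guideline, the following Python test exercises a single function without contacting any external service. The names `count_objects` and `list_objects` are hypothetical and exist only for this illustration.

```python
import unittest
from unittest import mock


def count_objects(client):
  """Return the number of objects reported by ``client``.

  ``client`` is any object with a ``list_objects()`` method. In production it
  would talk to a remote service (hypothetical interface, for illustration).
  """
  return len(client.list_objects())


class CountObjectsTest(unittest.TestCase):
  def test_counts_items_from_client(self):
    # Replace the external dependency with a mock so no network is needed.
    fake_client = mock.Mock()
    fake_client.list_objects.return_value = ['a', 'b', 'c']
    self.assertEqual(count_objects(fake_client), 3)
    # The unit under test should call the dependency exactly once.
    fake_client.list_objects.assert_called_once_with()
```

Saved in a module, the test can be run with `python -m unittest`. Because the dependency is passed in as an argument, the test covers only the one function and never reaches a real service.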

Java

Software Methodology

Modularization

In Java, Design by Interface provides a modular design strategy. The approach divides software between interfaces and implementations. Modules should only interact with one another through interfaces.

Programming Style

The Java Programming Style Guidelines by GeoSoft provides a very readable, concise summary of some good conventions to follow while producing Java code. It is quite similar (identical?) to the document Code Conventions for the Java Programming Language, though I find the single page layout of the former to be more convenient.

In general, these conventions should be followed for all new Java code developed specifically for DataONE. Where code is extending an existing application, it may be more appropriate to follow the conventions used by other developers working with that project.

Use JavaDoc conventions for documenting code.

The important aspect is consistency with a goal towards readability. It’s easy
for a compiler to interpret code, generally much harder for people.

Python

The Style Guide for Python Code (PEP-0008) should be followed for all Python code, with the following exceptions:

  • Indentation should use 2 spaces, not 4 as suggested by Guido and Barry. One space is too short, and four leads to a greater likelihood of having to split lines.
  • Never use tabs for indenting. Set your editor to use soft tabs or replace tabs with spaces.
  • Leave two blank lines between method definitions inside a class.
  • If adding multiple classes in a file, separate them with a commented row of “=” characters.
  • To be more consistent with the Java conventions, use the suffix “Exception” (rather than “Error” as suggested in PEP-0008) when naming exceptions.
  • Follow DocString conventions to document code. Use either the EpyDoc or Sphinx approaches, but do not mix them within the same application. In general, using Sphinx autodoc is the preferred approach and will produce cleaner documentation than JavaDoc. This implies using reStructuredText formatting conventions in the docstrings. Documenting Python provides an overview of using Sphinx for documenting Python.
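The exceptions listed above can be seen together in a short sketch. The class names `NotFoundException` and `ObjectStore` are invented for this example. It uses 2-space indentation, two blank lines between methods, a commented row of “=” between classes, the “Exception” suffix, and reStructuredText field lists in the docstrings.

```python
class NotFoundException(Exception):
  """Raised when a requested object cannot be located.

  :param identifier: Identifier of the missing object.
  """

  def __init__(self, identifier):
    super().__init__('Object not found: {}'.format(identifier))
    self.identifier = identifier


# ==============================================================================


class ObjectStore(object):
  """In-memory store used only to illustrate the layout conventions."""

  def __init__(self):
    self._objects = {}


  def add(self, identifier, content):
    """Store ``content`` under ``identifier``.

    :param identifier: Key for the object.
    :param content: Value to store.
    """
    self._objects[identifier] = content


  def get(self, identifier):
    """Return the content stored under ``identifier``.

    :raises NotFoundException: If ``identifier`` is unknown.
    """
    try:
      return self._objects[identifier]
    except KeyError:
      raise NotFoundException(identifier)
```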

Java and Maven Setup

Java code is currently compiled with OpenJDK 7, the open-source reference implementation of Java SE 7. Our coding standards, as of July 2014, however, must conform to Java 6. Any software developed for use and compiled with OpenJDK 7 must be executable with a Java 6 runtime.

On a Debian Linux system, run the following apt-get commands to install all needed Java components:

> sudo apt-get update
> sudo apt-get --no-install-recommends install openjdk-7-jdk
> sudo update-java-alternatives -s java-1.7.0-openjdk-amd64
> sudo apt-get --no-install-recommends install libjaxp1.3-java
> sudo apt-get --no-install-recommends install libxerces2-java
> sudo apt-get --no-install-recommends install ant
> sudo apt-get --no-install-recommends install mvn

As of July 2014, DataONE uses Maven 3 as its dependency management and build tool. Each DataONE Java project should follow a base Maven project structure that, at a minimum, contains a pom.xml file that builds a complete component.

The following instructions are for Ubuntu or an Ubuntu derivative. Mac or Windows development environments will need different instructions. On Ubuntu or an Ubuntu derivative, edit /etc/environment to set these variables:

::

    JAVA_HOME="/usr/lib/jvm/java-1.7.0-openjdk-amd64"
    MAVEN_OPTS="-Xms128m -Xmx512m"
    M2_HOME="/usr/share/maven"
    M2="$M2_HOME/bin"

Developing Java Projects with Maven

From the development perspective, a developer will need to install Maven. If the developer is using Eclipse and/or NetBeans, there are helpful plug-ins that remove the need to download and install Maven itself. For Eclipse, the m2eclipse plug-in helps the developer maintain the Maven configuration files and dependencies needed for a project. An open source book, Developing with Eclipse and Maven, together with the documentation on the Maven website, should be all that is required to get started.

Once you are using Maven, conformance to certain standards is expected. The directory structure for a Maven project should conform to the standard Maven template. If that is not possible, the structure may be altered in a project descriptor.

::

    pom.xml               Maven configuration file
    build.xml             Ant configuration file
    src/                  Application and Testing structure
    src/main/java         Application/Library sources
    src/main/resources    Application/Library resources
    src/main/filters      Resource filter files
    src/main/assembly     Assembly descriptors
    src/main/config       Configuration files
    src/main/webapp       Web application sources
    src/test/java         Test sources
    src/test/resources    Test resources
    src/test/filters      Test resource filter files
    src/site              Site
    target/               Build directory
    target/classes        Classes for Build

It is unlikely that every project will need all of these folders, but any that are used should follow the structure above.

It should also be noted that Maven will create a local repository of all downloaded artifacts in a .m2 directory of a developer’s home directory. If a developer’s local repository needs a unique configuration, it is saved in a settings.xml file.

Product Version Control

DataONE source control is managed by various applications in the DataONE software environment. Jenkins, maven, subversion, PyPI, debian packaging, and ant are used in building and controlling the release of DataONE products.

DataONE follows a standard incremental versioning approach to its software products whereby new features under development are assigned a version number. Products are made of multiple components and the components supported by DataONE are placed under version control as well.

This document describes in some detail the process to follow when we want to release a version and move mainline development to the next increment.

Note: DataONE also tracks versions of its RESTful web services interface, or API,
which indicate communication compatibility between the independently managed
nodes of DataONE. These versions, while not divorced from the
software product versioning process, are not ruled by it either.

Baseline assumptions

Any software project will eventually result in a software product. A DataONE product may be composed of one or many software components. Components may be combined to form one of five types of products: a datatype schema, a software library (such as dataone-common-java), an executable (such as dataone-cn-daemon), a web service (such as metacat), or a software distribution package (for use by a software package management system, such as the debian package d1-cn-os-core).

Currently, software components are written in Java, Python, Bash Script, XML Schema, R, and Perl. Other software languages are not precluded. The following baseline assumptions apply to any software produced, and at any level or type of product composition (component, schema, library, web application, executable, package).

Software configuration management (SCM) practices

Subversion (SVN) is the revision control management software of choice.

The SVN trunk is where new feature development takes place. New features of existing software are written in the trunk. New software components are written in the trunk. New API releases are developed in the trunk. Bug fixes of product releases must be applied to the trunk when necessary. Dead development trees may be deleted from the trunk.

Once a new feature has undergone sufficient development and testing, an SVN branch is created of the software component. Branches that have been created from the trunk may never be deleted.

The only code changes allowed in branches are patches that fix functionality.

SVN tags are used to mark versions of stable code that meet the DataONE standard of a Project Deliverable.

SVN tags are not modified after they are created. They are considered static milestones of the software stack.

Tags are never to be deleted.

In rare circumstances, branches may be reserved for divergent or exploratory development; these typically require a merge back into the trunk when complete. The divergent or exploratory branch may be deleted after the merge is complete.

Build practices

DataONE’s build automation technology for Java components is Maven.

Maven’s dependency management, using a DataONE local maven repository, is used to integrate different DataONE components together.

The DataONE local maven repository is located on maven.dataone.org, and populated by Jenkins.

Testing practices and Continuous Integration

Unit tests are written as part of the component under development, and should be passing locally (in the developer’s own environment) before committing to the SVN repository.

Jenkins is the software product used as our continuous integration environment. All components should be built by Jenkins and all tests must pass on Jenkins before a component is considered verified.

Jenkins jobs are created for each component under development, for continuous unit testing of committed code.

Jenkins deploys verified components (those that pass all tests) to the DataONE local maven repository.

Integration Testing is primarily accomplished through the dedicated software product d1_integration.

Test Environments, consisting of distinct sets of nodes, are maintained to accomplish integration testing.

The current DataONE testing environments are dev, sandbox, stage2, and stage. It is possible for a developer to establish their own locally hosted environment.

The Dev environment should pull and test software from the trunk. The Beta and Stage2 environments should pull and test software from the branches. The Stage environment should pull and test software from the tags.

Product Deliverables

DataONE product deliverables may be acquired through one of two mechanisms. releases.dataone.org provides schemas, libraries and web services. A debian repository on jenkins-1.dataone.org (http://jenkins-1.dataone.org/ubuntu precise universe) may be used for distribution of debian software packages.

Software versioning

Software versions are applied as strings in various files depending upon the component build strategy. For Java Products, the software revision number is set in the pom.xml file. For Debian packages, the revision number is set in the control file. For XML schema, the revision number is set on the version attribute of the xsd element. All software components will have a unique manner in which to apply revision numbers, and should conform to the best community wide standard.

DataONE follows a sequence-based naming scheme for its software revisions. Its sequence-based identifiers take the form x.y.z[-qualifier], where x is the ‘major’ revision, y is the ‘minor’ revision, z is the ‘maintenance’ patch and the qualifier refers to its level of stability. A valid identifier matches the Perl regular expression /\d+\.\d+\.\d+(?:-(?:SNAPSHOT|BETA|RC\d+))?/.
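As a rough illustration, the scheme can be checked in Python. This is a sketch and not part of the DataONE tooling. The function name `parse_version` is hypothetical, and the pattern assumes that the hyphen applies to every qualifier alternative and that the whole string must match.

```python
import re

# Assumed reading of the scheme: x.y.z with an optional -SNAPSHOT, -BETA or
# -RC<number> qualifier. The alternatives are grouped so the hyphen applies
# to each of them.
VERSION_PATTERN = re.compile(r'(\d+)\.(\d+)\.(\d+)(?:-(SNAPSHOT|BETA|RC\d+))?')


def parse_version(version):
  """Split a version string into (major, minor, maintenance, qualifier).

  The qualifier is ``None`` for an unqualified release.

  :raises ValueError: If ``version`` does not match the scheme.
  """
  match = VERSION_PATTERN.fullmatch(version)
  if match is None:
    raise ValueError('Not a valid version string: {!r}'.format(version))
  major, minor, maintenance, qualifier = match.groups()
  return int(major), int(minor), int(maintenance), qualifier
```

For example, `parse_version('1.4.0-SNAPSHOT')` returns `(1, 4, 0, 'SNAPSHOT')`, while `parse_version('1.2')` raises `ValueError` because the maintenance number is missing.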

The ‘major’ revision number is only increased when a significant change to the Service Level APIs has taken place such that communication between differing major revisions (such as a revision 1 and revision 2) of the software stack is incompatible. The corollary is that within any major revision (v1), all API calls are compatible between software implementations.

The ‘minor’ revision number is only increased when new functionality is added to a release and is backward compatible with the major release of which it is a part. The new release revision number will be x.(y+1).0.

When only bugfixes are part of the new release there will be a maintenance release, otherwise known as a patch release. The revision number of the new release in this case will be of the form x.y.(z+1).
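The increment rules can be sketched as a small Python helper. The function name `next_release` and the change labels are hypothetical. Resetting the minor and maintenance numbers to 0 on a major increase is an assumption, since the text does not state the resulting number for that case.

```python
def next_release(version, change):
  """Return the next release number for an unqualified ``x.y.z`` version.

  ``change`` is 'major', 'minor' or 'maintenance' (labels chosen for this
  sketch).

  :raises ValueError: If ``change`` is not one of the three labels.
  """
  major, minor, maintenance = (int(part) for part in version.split('.'))
  if change == 'major':
    # Assumption: lower-order numbers restart at 0 after a major increase.
    return '{}.0.0'.format(major + 1)
  if change == 'minor':
    # New backward compatible functionality: x.(y+1).0
    return '{}.{}.0'.format(major, minor + 1)
  if change == 'maintenance':
    # Bugfix-only release: x.y.(z+1)
    return '{}.{}.{}'.format(major, minor, maintenance + 1)
  raise ValueError('Unknown change type: {!r}'.format(change))
```

With this helper, `next_release('1.2.3', 'minor')` gives `'1.3.0'` and `next_release('1.0.4', 'maintenance')` gives `'1.0.5'`.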

Qualifiers are added to indicate the stability of the release. SNAPSHOTs are constantly changing and are only applied in the trunk. SNAPSHOTs are considered to be in an ALPHA stage of development. BETA releases are set when development is feature complete and all tests are passing. The BETA stage of development is aimed at working out bugs and/or performance issues. BETA software should be run through a series of integration tests on DataONE nodes established for BETA testing. ‘Release Candidate’ (RC) is for software components that are tagged but still in a testing phase. Software at the RC stage has completed BETA integration testing and has its code frozen. An RC will not undergo further code changes. If an RC component fails in the final integration tests on Staging nodes, then software development must revert to beta phase development and testing. A RELEASE without qualification is a finalized product and is considered to be deliverable to the public.

Cutting a stable software release from development

DataONE follows a structured process of product lifecycle management. Each product is evaluated for need and prioritized in the development cycle. Before any coding begins, requirements are gathered and a product is designed. Once a design has been approved, formal development begins by creating software projects. Development is initially tested in the development environment. After coding is complete, the code goes through BETA testing in the sandbox and staging environments.

One staging environment is for release candidate testing while the other is a simulated production environment.

Only software that has been tagged for production release is issued into the simulated production environment. The last environment is production.

Development or Maintenance of a Software Component

DataONE follows a typical controlled-iteration process for developing its software products. All initial development should occur in the trunk. Development in the trunk is strictly new features and backporting bug fixes.

Unit Tests should be written in order to test fulfillment of the requirements of the software component. After unit tests pass and the software builds with a compiler, the code may be checked in to subversion.

A Jenkins job will compile a version of the trunk with the new code. The code may then be installed on a development environment.

Build Control Files

Each level of software release has a properties file that describes the location of the software package. Development, Staging and Production properties files are maintained in separate directories in the trunk of the Subversion repository:

Development

Beta

https://repository.dataone.org/software/cicore/trunk/d1_beta_build_control/current_beta_control.properties

Production
current_release_control.properties

The format of the control property file is a key value pair in which the key indicates a named identifier for a component. The named identifier must be identical in each environment control file for a component. The value of the named identifier is a relative path that indicates the current working subversion directory for the component. In the Development control properties file, the relative path begins at the ‘trunk’ such that the named identifier ‘D1_COMMON_JAVA’, representing the dataone common java library, has a value of ‘trunk/d1_common_java’, a relative path, which when combined with the string ‘https://repository.dataone.org/software/cicore/’ provides a full path to the software project.

Similarly for the staging and production control property file, the key is a named identifier that must be present in the development control property file (assuming it is still under active development). The key must be identical to the one in the other files. However, the value is different due to differences in the relative path for each environment.

The relative path to the staging resources begins with the ‘branches’ token such that if the active major.minor release of the dataone common java library, denoted by the named identifier ‘D1_COMMON_JAVA’, is 1.2, then the relative path would be ‘branches/D1_COMMON_JAVA_v1.2’. Similarly, the relative path to the production resources begins with the ‘tags’ token. However, tags are made with a full revision number conforming to the scheme major.minor.maintenance. For example, the 1.2.1 release of the dataone common java library would be denoted by the relative path ‘tags/D1_COMMON_JAVA_v1.2.1’. When the relative paths are combined with the absolute path to the cicore subversion repository location, the component is discoverable.
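The lookup described above can be sketched in Python as follows. This is an illustration, not the code used by the DataONE build. It assumes the control files use `KEY=value` lines with `#` comments in the style of Java properties files, and the function names are hypothetical.

```python
# Repository root quoted in the text above.
REPOSITORY_ROOT = 'https://repository.dataone.org/software/cicore/'


def parse_control_properties(text):
  """Parse ``KEY=relative/path`` lines into a dictionary.

  Blank lines and lines starting with '#' are skipped (assumed syntax).
  """
  components = {}
  for line in text.splitlines():
    line = line.strip()
    if not line or line.startswith('#'):
      continue
    key, _, value = line.partition('=')
    components[key.strip()] = value.strip()
  return components


def component_url(components, key):
  """Return the full repository URL for the component named ``key``."""
  return REPOSITORY_ROOT + components[key]
```

Given a development control file containing the line `D1_COMMON_JAVA=trunk/d1_common_java`, `component_url` returns `https://repository.dataone.org/software/cicore/trunk/d1_common_java`. Because the key is the same in every control file, the same call resolves the branch or tag location when given the staging or production file instead.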

Edit the Development build control file

Any new component will need an entry in the development control file. The key should be an uppercase version of the component’s name. For example, the directory that contains the DataONE coordinating nodes’ common library project is ‘d1_cn_common’, so the key should be named ‘D1_CN_COMMON’. The value should reflect the relative path to the software project starting from the absolute path of the cicore subversion repository.

Thus for d1_cn_common, the relative path would be ‘trunk/cn/d1_cn_common’.

Determine the components and products to release

You will need to communicate with other developers, look at redmine tickets and review subversion logs to determine the extent of changes that will need to be included in a production release. These changes should be noted in a redmine ticket. Ideally every file that has been modified, added or deleted will be noted in redmine tickets that relate to the release.

Release Control Tools

In order to make branching and tagging easier, I have committed some perl scripts. If you checkout

you will find two perl scripts. They are both command line executables.

::

controlStageAndRelease.pl

This executable will branch or tag dataone components as listed in the build control property files maintained in trunk, either singly or in batch.

To Branch development for staging, the script will pull from the trunk components as noted in

https://repository.dataone.org/software/cicore/trunk/d1_dev_build_control/current_dev_control.properties

To Tag a release, the script will pull from repository branch components as noted in

https://repository.dataone.org/software/cicore/trunk/d1_staging_build_control/current_staging_control.properties

execute as

controlStageAndRelease.pl --tag || --branch
     --component componentName || --all
       [--revision svnRevisionNumber]

     --tag        create a tag from a branch (may not be used in
                  combination with --branch)
     --branch     create a branch from the trunk (may not be used in
                  combination with --tag)
     --component  a specific dataone software component name, derived
                  from the keys in the properties files, to either
                  branch or tag (may not be used in combination with
                  --all)
     --all        tag or branch all the components as listed in a
                  properties file, ignoring those already tagged or
                  branched (may not be used in combination with
                  --component)
     --revision   optional parameter that pulls the component to be
                  branched or tagged from an SVN revision
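The option rules above can be written down as a Python `argparse` sketch. This is not the Perl script and makes no claim about how it parses its arguments. It assumes that exactly one of --tag/--branch and exactly one of --component/--all must be given, and that the revision is an integer.

```python
import argparse


def build_parser():
  """Build a parser that mirrors the documented option rules (sketch only)."""
  parser = argparse.ArgumentParser(prog='controlStageAndRelease.pl')

  # --tag and --branch may not be combined; one of them is assumed required.
  action = parser.add_mutually_exclusive_group(required=True)
  action.add_argument('--tag', action='store_true',
                      help='create a tag from a branch')
  action.add_argument('--branch', action='store_true',
                      help='create a branch from the trunk')

  # --component and --all may not be combined; one is assumed required.
  target = parser.add_mutually_exclusive_group(required=True)
  target.add_argument('--component', metavar='componentName',
                      help='key of a single component in the properties file')
  target.add_argument('--all', action='store_true',
                      help='process every component in the properties file')

  parser.add_argument('--revision', type=int, metavar='svnRevisionNumber',
                      help='optional SVN revision to branch or tag from')
  return parser
```

Parsing `['--branch', '--component', 'D1_COMMON_JAVA']` yields `branch=True` and `component='D1_COMMON_JAVA'`, while combining `--tag` with `--branch` is rejected by the parser.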

A second command is

::

mergeDiffProjects.pl

Use this program to find and merge differences between two copies of a single project that are checked out in two different directories on your system.

For example, suppose you have checked out the trunk and a branch on your computer and you wish to find all the files that are missing or have been altered. You would issue a command similar to the one below:

> mergeDiffProjects.pl --trunk /home/rwaltz/Projects/AllSoftware/d1_common_java \
>     --branch /home/rwaltz/Projects/branches/D1_COMMON_JAVA_v1.1

The program assumes you have checked out both the trunk and branches of the projects you will diff and then merge.

To check out the trunk, issue the svn command in a directory of your choice:

> svn checkout https://repository.dataone.org/software/cicore/trunk AllSoftware

Checking out branches is a bit more difficult, considering the large number of subdirectories in the branches and the fact that there are full dumps of the trunk in singular branch lineages. It is best to check out all the folders without their contents and then allow mergeDiffProjects.pl to update the contents upon execution (with svn update --set-depth infinity):

> svn checkout --depth immediates https://repository.dataone.org/software/cicore/branches branches

If you need to compare all the projects from trunk and branch, specify the --all switch and then provide the paths to the checked out trunk and branches. The directory holding the trunk must have all the subdirectories of the trunk, most importantly the trunk subdirectories d1_dev_build_control and d1_staging_build_control.

The program will investigate the properties files in both subdirectories and come up with a series of projects to diff.

Once the diff is complete on the project, any files with the same name that have differences will be shown in a visual diff editor. You should verify that all the changes from the trunk that are needed in the branch are present. Also, any files that are missing from the trunk will be examined and you will be prompted whether they should be copied to the branch.

Execute as:

::
mergeDiffProjects.pl [--all]
     --trunk path_to_a_trunk_project | path_to_the_full_trunk
     --branch path_to_a_branched_project | path_to_the_full_branch

     --all      (optional) compare all the components in the trunk to
                the components in branches; use full paths to the
                trunk and branches
     --trunk    the full path to the component in the trunk to compare,
                or, if used with the --all toggle, the path to the
                checked out trunk
     --branch   the full path to the component in the branch to
                compare, or, if used with the --all toggle, the path to
                the checked out branches

The visual diff program used on linux environments is meld. For mac, it is gvimdiff. If a better visual diff tool is available, change the code to point to it.

Testing and Staging of a Software Component

Once product development is complete, code should be migrated to a staging environment. The staging environment is built from branches in the subversion repository. The staging environment also follows a controlled-iteration process for maintenance or new releases. Branches of code are tested with integration tests on staging machines. There are two staging environments. One environment is a pure staging environment that pulls proposed releases from branches for testing. Once the branched code passes all tests, the code is tagged, and the tagged version of the code is tested in a production simulation environment, the second staging environment. If a tag fails in the simulated production environment, a new round of edits will occur in the branch and a new revision level will be released as a tag.

Configure Staging Build Control file

The staging build is controlled through a configuration file. If you have not done so, then check out the d1_staging_build_control directory:

> svn checkout https://repository.dataone.org/software/cicore/trunk/d1_staging_build_control d1_staging_build_control

The properties file that builds the current staging environment is named current_staging_control.properties.

You are then ready to modify the current_staging_control.properties file.

The Staging build control file only needs to be updated upon a new Major or Minor release of the products. For maintenance or bugfix releases, modifications are performed in the same branch and thus the staging control file may point to the same branch.

Any new component will need an entry in the staging control file. The key should be an uppercase version of the component’s name. It should be identical to the key in the development control file. For example, the named identifier of the DataONE coordinating nodes’ common library is ‘D1_CN_COMMON’ and the key should be named the same.

The value should reflect the relative path to the software project starting from the absolute path of the cicore subversion repository:

Since staging allows for maintenance fixes in its projects, the name of the branch should conform to the scheme major.minor. The formal naming scheme is uppercase project name + ‘_v’ + major.minor. For example, the relative path of DataONE’s coordinating nodes’ common library would be ‘branches/D1_CN_COMMON_v1.0’.

Check Out or Create Subversion Branches

If you do not have the subversion branches checked out, then do so:

> svn checkout --depth immediates https://repository.dataone.org/software/cicore/branches branches

(Please note the use of --depth immediates. This option allows only the top-level directory, branches, and its immediate children to be checked out. To retrieve the entire contents of a branch, change directories into the branch and perform an update command with the --set-depth infinity option.)

The subversion branches will meet the criteria for Major.Minor releases. All maintenance patches between minor/major releases will occur within the branches checked out for that release. For example, when development in the trunk has ended upon the 1.1 release of DataONE, branches for every component will need to be created to support maintenance releases. The format of the branch names is as follows:

<UPPERCASE PRODUCT NAME>_v<MAJOR>.<MINOR>
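The naming rule can be sketched as two small Python helpers, one for staging branches and one for the production tags that use the full major.minor.maintenance number. The function names `branch_path` and `tag_path` are hypothetical and are not part of controlStageAndRelease.pl.

```python
def branch_path(component, major, minor):
  """Return the relative staging path, e.g. 'branches/D1_CN_COMMON_v1.0'."""
  return 'branches/{}_v{}.{}'.format(component.upper(), major, minor)


def tag_path(component, major, minor, maintenance):
  """Return the relative production path, e.g. 'tags/D1_COMMON_JAVA_v1.2.1'."""
  return 'tags/{}_v{}.{}.{}'.format(component.upper(), major, minor, maintenance)
```

For instance, `branch_path('d1_cn_common', 1, 0)` gives `'branches/D1_CN_COMMON_v1.0'`, which is the value that would be stored under the key `D1_CN_COMMON` in the staging control file.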

If the subversion branch you need has to be created, then you should run the tool described above, controlStageAndRelease.pl.

If you execute controlStageAndRelease.pl with the --branch switch, then the code will create a new branch, named from the newly committed value in the staging control properties file for the component that needs to be tested. You can specifically name the component with the --component <NAMED_IDENTIFIER_KEY> option, or you can provide the --all switch. The --all switch will cause the program to evaluate all the paths in the staging control properties file to ensure they exist, and if not to create them.

Compare the Trunk to the Branches

In order to ensure that all changes have been merged, a visual inspection of the differences should be made of all the modifications made in trunk to the files in the branch.

To assist with the process, you should run the mergeDiffProjects.pl program.

After ensuring that all the files in the projects have been merged, increment the version number in the pom.xml file for java projects, or the control file for debian package builds.

Make certain to test building the newly changed java components. Ensure any dependencies in the pom have been updated to point to the appropriate version of the dependency.

Run Staging Jenkins Build Process & Upgrade

Log in to Jenkins at the default ‘Jenkins’ screen as admin (trying from any other screen will cause a failure). Click on the tab ‘Staging Builds’. In the project dataone-cn-metacat-deb-stage, edit one place:

> under Build -> Execute -> Command edit the string ‘JARS=( Metacat_stable/workspace/METACAT_X_X_X/dist/knb.war );’ (substitute release numbers for X_X_X )

In the project Metacat_Stage, edit four places:

> Source Code Management -> Subversion -> Modules -> Repository URL edit the string to be https://code.ecoinformatics.org/code/metacat/tags/METACAT_X_X_X (substitute release numbers for X_X_X )

> Execute shell -> Command edit the string ‘cd “$WORKSPACE/METACAT_X_X_X/docs/user/metacat”’ (substitute release numbers for X_X_X )

> Invoke Ant -> Targets -> Press Advanced... Button -> edit the string ‘metacat.dir=/home/jenkins/work/jobs/Metacat_stable/workspace/METACAT_X_X_X’ (substitute release numbers for X_X_X )

> Javadoc directory -> edit the string ‘METACAT_X_X_X/build/docs’ (substitute release numbers for X_X_X )

Once these edits have been made, click on the job ‘Build_Stage’ in the ‘Staging Builds’ tab. In the left sidebar of the page, click ‘Build Now’ to begin the build process.

The build process is complete when ‘Build_Stage_Level_8’ has completed running its last job, currently ‘dataone-mercury-deb-stage’. Log in to the staging machines, turn off all processing and indexing daemons, then execute the ‘sudo apt-get update’ and ‘sudo apt-get upgrade’ commands.

Production Release of a Software Component

Configure Stable Build Control file

The stable build is controlled through a configuration file. If you have not done so, then checkout the d1_stable_build_control directory:

> svn checkout https://repository.dataone.org/software/cicore/trunk/d1_stable_build_control d1_stable_build_control

Each historical release and release candidate of the coordinating node stack should have its own tagged d1_stable_build_control properties file. Before modifying the current_release_control.properties file for a new build, confirm that the previous version was tagged, and if not then tag it.

For example, if the previous release was 1.0.4 for the CN stack and the next release is 1.1.0, then the following command would be appropriate to execute:

> svn copy -m "Preparing for 1.1.0 tagging by backing up current stable build properties" \
>     https://repository.dataone.org/software/cicore/trunk/d1_stable_build_control \
>     https://repository.dataone.org/software/cicore/tags/D1_STABLE_BUILD_CONTROL_v1.0.4

Now, you are ready to modify the current_release_control.properties file.

Any new release will need an entry in the stable control file. The key should be an uppercase version of the component’s name. It should be identical to the key in the staging control file. For example, the named identifier of the DataONE coordinating nodes’ common library is ‘D1_CN_COMMON’ and the key should be named the same.

The value should reflect the relative path to the software project starting from the absolute path of the cicore subversion repository:

Since stable is a final production release, the name of the tag should conform to the scheme major.minor.maintenance. The formal naming scheme is uppercase project name + ‘_v’ + major.minor.maintenance.

For example, the relative path of DataONE’s coordinating nodes’ common library would be ‘tags/D1_CN_COMMON_v1.0.0’. Therefore, ‘tags/D1_CN_COMMON_v1.0.0’ would be the value for the key ‘D1_CN_COMMON’ in the d1_stable_build_control properties file.

Tag a Component

The subversion tags will meet the criteria for Major.Minor.Maintenance releases. No maintenance is to be performed on tags. They mark the end of a line of development.

For example, when revisions in a branch have ended and a production release of DataONE is ready, a tag for a production component will need to be created. Tag names have the following format:

<UPPERCASE PRODUCT NAME>_v<MAJOR>.<MINOR>.<MAINTENANCE>

Then the subversion tag will have to be created. To do this, run the tool mentioned above, controlStageAndRelease.pl.

If you execute controlStageAndRelease.pl with the --tag switch, the code will create a new tag, named from the newly committed value in the stable control properties file for the component. You can name a specific component with the --component <NAMED_IDENTIFER_KEY> option, or you can provide the --all switch. The --all switch will cause the program to evaluate all the paths in the stable control properties file to ensure they exist and, if they do not, to create them.

Confirm that the tag exists in subversion by reviewing the list of components in https://repository.dataone.org/software/cicore/tags/ . You will then edit the current_release_control.properties file.
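As an alternative to browsing the tags directory by hand, the check could be scripted. The following is a minimal sketch, assuming a Subversion command-line client is installed and on the PATH; the helper names `tag_url` and `tag_exists` are hypothetical and are not part of controlStageAndRelease.pl.

```python
import subprocess

TAGS_URL = "https://repository.dataone.org/software/cicore/tags/"


def tag_url(tag_name):
    """Return the repository URL under which a tag would be found."""
    return TAGS_URL + tag_name


def tag_exists(tag_name):
    """Return True if `svn info` succeeds for the tag URL.

    A non-zero exit status is treated as "not found", although it can
    also indicate a network or authentication problem.
    """
    result = subprocess.run(
        ["svn", "info", "--non-interactive", tag_url(tag_name)],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

For example, `tag_exists("D1_CN_COMMON_v1.0.0")` would query the repository for that tag. Because a failed lookup and a failed connection look the same here, a negative result is worth confirming manually.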

Update Trunk to new Snapshot level

  1. Maven Projects

Lastly, edit the pom.xml in the trunk and branches of the project to increase the revision numbers. Modify the version element of the project to increase the revision number, and modify the d1 component property elements to point to the latest SNAPSHOTs of those dependencies. For example, if dataone_common_java had just been updated from release 1.2.3 to 1.3.0, then the new trunk pom.xml would look like:

::

  <artifactId>d1_common_java</artifactId>
  <packaging>jar</packaging>
  <version>1.4.0-SNAPSHOT</version>
  <name>DataONE_Common_Java</name>
  <url>http://dataone.org</url>
  <description>DataONE Common Code with Service Interface Definitions</description>
  <properties>
    <org.jibx.version>1.2.3</org.jibx.version>
    <d1_jibx_extensions_release>1.0.0-RC5-SNAPSHOT</d1_jibx_extensions_release>
    <d1_test_resources_release>1.0.0-RC5-SNAPSHOT</d1_test_resources_release>
  </properties>

while the pom.xml in the branch would look like:

::

  <artifactId>d1_common_java</artifactId>
  <packaging>jar</packaging>
  <version>1.3.1</version>
  <name>DataONE_Common_Java</name>
  <url>http://dataone.org</url>
  <description>DataONE Common Code with Service Interface Definitions</description>
  <properties>
    <org.jibx.version>1.2.3</org.jibx.version>
    <d1_jibx_extensions_release>1.0.0-RC4</d1_jibx_extensions_release>
    <d1_test_resources_release>1.0.0-RC4</d1_test_resources_release>
  </properties>

and the pom.xml in the tag would look like:

::

  <artifactId>d1_common_java</artifactId>
  <packaging>jar</packaging>
  <version>1.3.0</version>
  <name>DataONE_Common_Java</name>
  <url>http://dataone.org</url>
  <description>DataONE Common Code with Service Interface Definitions</description>
  <properties>
    <org.jibx.version>1.2.3</org.jibx.version>
    <d1_jibx_extensions_release>1.0.0-RC4</d1_jibx_extensions_release>
    <d1_test_resources_release>1.0.0-RC4</d1_test_resources_release>
  </properties>

Commit the updated pom.xml with the latest SNAPSHOT version.
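The version numbers in the three pom.xml examples follow a pattern that can be sketched as below. This generalizes from the single 1.3.0 example in the text, so treat it as an assumption rather than a rule: the function name is hypothetical, and a major release would presumably change the trunk version differently.

```python
def versions_after_release(release):
    """Derive trunk, branch and tag versions from a just-released version.

    Mirrors the example in the text: releasing 1.3.0 leaves the tag at
    1.3.0, moves the branch to 1.3.1 and the trunk to 1.4.0-SNAPSHOT.
    """
    major, minor, maintenance = (int(part) for part in release.split("."))
    return {
        # trunk moves on to the next minor version as a snapshot
        "trunk": "%d.%d.0-SNAPSHOT" % (major, minor + 1),
        # the branch receives maintenance fixes for this minor version
        "branch": "%d.%d.%d" % (major, minor, maintenance + 1),
        # the tag is frozen at the released version
        "tag": release,
    }


if __name__ == "__main__":
    print(versions_after_release("1.3.0"))
    # prints: {'trunk': '1.4.0-SNAPSHOT', 'branch': '1.3.1', 'tag': '1.3.0'}
```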

  2. Debian Packages

Similarly, edit the control files in the trunk and branch to increase to the next major.minor.maintenance level and then commit.

Commit Control File

Once all the components have been tagged and branched for a release, commit the current_release_control.properties file into svn.

Jenkins Stable Build Processes

The build process on Jenkins is as follows:

Log in to Jenkins at the default ‘Jenkins’ screen as admin (trying from any other screen will cause a failure).

Click on the tab ‘Stable Builds’.

In the project dataone-cn-metacat-deb-stable, edit two places:

  • Under String -> Parameter, edit ‘DATAONE_CN_METACAT_DEB’ so that its default value is tags/DATAONE-CN-METACAT_vx.x.x (where x.x.x is the tag being released).
  • Under Build -> Execute -> Command, edit the string ‘JARS=( Metacat_stable/workspace/METACAT_X_X_X/dist/knb.war );’ (substitute release numbers for X_X_X).

In the project Metacat_Stage, edit four places:

  • Source Code Management -> Subversion -> Modules -> Repository URL: edit the string to be https://code.ecoinformatics.org/code/metacat/tags/METACAT_X_X_X (substitute release numbers for X_X_X).
  • Execute shell -> Command: edit the string ‘cd “$WORKSPACE/METACAT_X_X_X/docs/user/metacat”’ (substitute release numbers for X_X_X).
  • Invoke Ant -> Targets -> Press Advanced... Button: edit the string ‘metacat.dir=/home/jenkins/work/jobs/Metacat_stable/workspace/METACAT_X_X_X’ (substitute release numbers for X_X_X).
  • Javadoc directory: edit the string ‘METACAT_X_X_X/build/docs’ (substitute release numbers for X_X_X).
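Because the same release number has to be substituted in several places, it may help to generate the strings once and paste them into Jenkins. The sketch below assumes that X_X_X stands for the release number with dots replaced by underscores (for example 2.0.1 becomes 2_0_1). The function names and dictionary keys are hypothetical labels, not Jenkins settings.

```python
def metacat_tag(version):
    """Turn a release number such as "2.0.1" into "METACAT_2_0_1".

    Assumes X_X_X is the version with dots replaced by underscores.
    """
    return "METACAT_" + version.replace(".", "_")


def metacat_stage_settings(version):
    """Return the four strings edited in the Metacat_Stage project."""
    tag = metacat_tag(version)
    return {
        "repository_url": "https://code.ecoinformatics.org/code/metacat/tags/" + tag,
        "shell_command": 'cd "$WORKSPACE/%s/docs/user/metacat"' % tag,
        "ant_property": "metacat.dir=/home/jenkins/work/jobs/Metacat_stable/workspace/" + tag,
        "javadoc_directory": tag + "/build/docs",
    }


if __name__ == "__main__":
    for label, value in metacat_stage_settings("2.0.1").items():
        print("%s: %s" % (label, value))
```

Generating all four values from one input reduces the chance that one field is left pointing at the previous release.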

Deliver Product to the Public

Copy the product over to releases.dataone.org. Write an announcement and post it.

Miscellaneous Notes

Notes about Jenkins builds:

There are three different apt repositories built by Jenkins, including the ubuntu-unstable and ubuntu-stable repositories described below.

The unstable packages are built from the subversion trunk. Jenkins will take the existing version number in the package and append RYYYYMMDD.nnn to the version number, where YYYY=year, MM=month, DD=day, nnn = build number for the day. So for example, the package “dataone-cn-portal” currently has Version: “1.0.0-RC2~unstable” in the DEBIAN/control file. The resulting file name and package version as generated by Jenkins are:

dataone-cn-portal_1.0.0R20120303.002~unstable_all.deb

for the file name and the version information contained in Packages.gz is:

1.0.0R20120303.002~unstable

The auto-generated version number means that any build of the package is equivalent to a snapshot, and apt will always install the new version during upgrade or install operations. The stable packages, built by the Stable_Build job on Jenkins, use the version information exactly as recorded in the DEBIAN/control file. These packages are built from subversion tags, and so will not change between compilations.

The result is that the ubuntu-unstable repository always contains only unstable (trunk) builds, and they always have a modified version tag. The ubuntu-stable repository always contains only the stable builds from tags, and these have the version as recorded in the DEBIAN/control file. As a result, there should never be a mix-up between unstable and stable packages. However, it is of course critical that the version numbers in the DEBIAN/control files are updated when a new release is being made. Apt uses that information to determine if it needs to install the package over an existing installation.
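The unstable version scheme described above can be sketched as follows. This is not the Jenkins job itself; it is a reconstruction from the single dataone-cn-portal example. In particular, dropping the ‘-RC2’ qualifier before appending the build stamp is an assumption inferred from that example, and the function names are hypothetical.

```python
import re
from datetime import date


def unstable_version(control_version, build_date, build_number):
    """Build an unstable package version from the DEBIAN/control version.

    Appends RYYYYMMDD.nnn to the base version and keeps the "~unstable"
    suffix. Stripping a trailing "-RC<n>" qualifier is an assumption
    inferred from the example in the text.
    """
    base, sep, suffix = control_version.partition("~")
    base = re.sub(r"-RC\d+$", "", base)
    stamp = "R%s.%03d" % (build_date.strftime("%Y%m%d"), build_number)
    return base + stamp + sep + suffix


def deb_file_name(package, version, architecture="all"):
    """Return the package file name in <name>_<version>_<arch>.deb form."""
    return "%s_%s_%s.deb" % (package, version, architecture)


if __name__ == "__main__":
    version = unstable_version("1.0.0-RC2~unstable", date(2012, 3, 3), 2)
    print(version)
    # prints: 1.0.0R20120303.002~unstable
    print(deb_file_name("dataone-cn-portal", version))
    # prints: dataone-cn-portal_1.0.0R20120303.002~unstable_all.deb
```

Since the date and the daily build number are part of the version, each later build has a version string that differs from the previous one, which is what lets apt treat it as a new package.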

Detail on Metacat versioning and integration testing:

Much of the CN interface is implemented by Metacat, whose code is maintained in a separate code repository (https://code.ecoinformatics.org/code/metacat/). It also has its own set of tests that need to pass before being released to DataONE. Still, metacat needs to pass d1_integration tests, so changes in metacat code need to be deployed into a testing environment for this to happen.

Some relevant detail:

  1. The deployable unit of metacat is the metacat.war, and it is deployed as a separate webapp into a tomcat container.

  2. DataONE uses metacat in 2 contexts:

    1. as a Member Node: the metacat.war is deployed as metacat on the target node.
    2. as a component of the Coordinating Node: the metacat.war is copied to the DataONE SVN repository under the cn-buildout/dataone-cn-metacat project.

It may not be practical in most cases for the metacat developer to deploy often to the DEV environment, so integration testing against a local metacat deployment is preferable, and has been facilitated. In this way DataONE integration tests (at least for member node functionality) can be run against the same locally deployed metacat instance used for metacat tests.