Warning: These documents are under active 
      development and subject to change (version 2.1.0-beta).
      The latest release documents are at:
      https://purl.dataone.org/architecture
      
While each of the individual components of the DataONE system are designed and tested independently, we also need to design effective tests of the full system functionality. To fulfill many of the use cases, many of the components in the DataONE infrastructure need to interact in a sequenced manner with particular inputs and outputs. The integration tests are designed to allow individual components to be exercised in these particular use case scenarios with known inputs in order to verify that the correct output is produced, and that adequate performance, reliability, and scalability metrics can be met.
In order to run these tests effectively, we need a testing framework that can be used to encode and execute the tests in an orderly fashion and to capture the results of the tests over time. This will allow us to trace changes in the test results over time. It would be useful to provide a synopsis view of all of the tests, showing which are passing and which are failing. This testing framework must be able to execute processes across several different languages, allow a multitude of clients to be configured to load test Member Nodes, and allow a multitude of Member Nodes to be configured to load test Coordinating Nodes.
Note
This list needs to be segmented to show what tests need to be passed by July 31st, versus what tests can be deferred to later.
Note
The metrics need to be parts of the tests. Are there things in the metrics that we’re not testing?
Note
Codes: Integration tests were ranked as High (H) or Low (L) priority, where each person had 10 high and 10 low votes to assign. Unranked tests were considered Medium (M) priority.
| # | High | Due July 31? | Low | Description | 
|---|---|---|---|---|
| 1 | 8 | July 31 | Use case 36 (resolve) | |
| 2 | 8 | July 31 | Use case 2 (query) | |
| 3 | 8 | July 31 | Completing the loop: publish data set, be sure it is retrievable exactly as submitted | |
| 4 | 6 | July 31 | Use case 1 (get). Note: need to test for non-existant ID’s, test access control, test for malicious content | |
| 5 | 6 | Can a downed CN be revived/repopulated? | ||
| 6 | 6 | Testing for invalid input and known problems such as XSS and SQL Injection | ||
| 7 | 5 | Use case 6 (synchronize). Note: what is needed for reschronize and recovery from outage. | ||
| 8 | 5 | July 31 | 1 | Do the Java Stacks and Python stacks return the same thing for the same object | 
| 9 | 4 | July 31 | Does CN metadata replication work | |
| 10 | 3 | Broadly, test node failures, including CN and MN. Mostly concerned with read access at this point. MN1 submits data, gets replicated to MN2, MN1 goes down, test access to data even in the absence of the owning Member Node. | ||
| 11 | 2 | Use case 16 (log CRUD operations) | ||
| 12 | 2 | Test that invalid documents are properly logged on harvest and that logs are machine parseable | ||
| 13 | 2 | July 31 | Consistency Check on CN’s ( Test that the checksums at CN’s match) | |
| 14 | 2 | Use case 9 (MN-MN replication): includes testing replication policies | ||
| 15 | 2 | Load testing to determime the break points (how far can we scale). Relates to risk register issue that CN’s don’t scale to handle the load. What is the target that we should be able to handle. Suggestion is that we want to be able to handle 100 Member Node traffic. This may not be a big issue until we get to handling high frequency updates (e.g. sensor data). Thought: look at the milestone 3 metrics and test at 5x that number. | ||
| 16 | 1 | Does an insertion via a MN replication request work? | ||
| 17 | 1 | Serve up data from a replicated MN when home MN is inavailable | ||
| 18 | 1 | Data and Metadata re-synchronization when MN and CN come back from outage | ||
| 19 | 1 | July 31 | Testing authentication and access control. Includes testing actions that should be rejected. | |
| 20 | 1 | Testing that system metadata is properly validated. Resilience against a “loco” node with malformed metadata (both operator error and malicious users) | ||
| 21 | Use case 12 (User Authentication) | |||
| 22 | Use case 13 (User authorization) | |||
| 23 | Use case 14 (system authenication and authorization) | |||
| 24 | Integration tests across multiple Member Nodes and Coordinating Nodes. For example, inserting documents at the same time but in different places to ensure it works and we don’t have deadlocks showing up. | |||
| 25 | 1 | 1 | Threshold tests for heartbeats and other status testing. | |
| 26 | 1 | 1 | General penetration testing (WebInspect as a specific example) | |
| 27 | 1 | Use case 17 (CRUD logs aggregated at CN) Not impmemented yet | ||
| 28 | 1 | Use case 24 (MN and CN’s support transactions) | ||
| 29 | 1 | Does a MN data/metadata insertion prevent a race condition, even if insertion deviuosly tries to trigger a race condition | ||
| 30 | 2 | 2 | Test that the ITK libraries work properly on all supported (advertised) OS platforms, hardware, and software combinations | |
| 31 | 1 | 2 | Race condition: What happens if two MN’s submit the same thing? | |
| 32 | 1 | 3 | Testing the logging and auditing functions. Ditto on the reporting functions. There should be unit tests for this within the software stacks, but this needs to verify that, for example, a Member Node action is logged at the MN and at the CN. | |
| 33 | 3 | Performance testing (particularly as a function of load) | ||
| 34 | 3 | Accessibility – 508 compliance as a floor, but going beyond this to appropriate level of accessibility. | ||
| 35 | 3 | Test replication policies. (duplicative test of above?) Side issue: unit testing to see if replication policy is inconsistent. More an issue for rule-based replication policies. Precedence of operations may fix this, but what happens when the replication policy resolves to zero replication nodes allowed (or replication nodes less than required number of replication nodes)? | ||
| 36 | 3 | Usability testing (generally) – push off to a different category from Integration testing. | ||
| 37 | 4 | Response time test (both API and GUI). Relates back to performance testing. | ||
| 38 | 4 | Testing validity of science metadata (compliance with standards). See issue in parking lot. | ||
| 39 | 4 | Use case 3 (register – manual operation) | ||
| 40 | 4 | Deployment testing – can we test that dependancies have been resolved, perhaps a particular issue for the ITK, but also for the various Member Node stacks. | ||
| 41 | 5 | Test that the libraries can check for dependancies, so that when the library intializes it checks that the necessary dependancies have been installed. Some of this is handled with egg installations. ??what happens if something changes after the installation? Library needs to manage these issues, and emit appropriate error codes. | ||
| 42 | 5 | Use case 10 (MN status reports) | 
Note
How many failure modes do we want to test? Need a Failure Mode Effect Analysis (FMEA) to prioritize the test cases to write against. Comment on FMEA: Assign, for a given failure mode a score of 1-10 for each of a) how likely is this to occur, b) what is the cost if this occurs, c) how early can the failure mode be detected, d) how much does it cost to mitigate the failure mode. Idea is that you multiply across the scores and prioritize accordingly. A failure mode that is very likely, is catastrophic, cannot be detected until it occurs, and is easy to mitigate prioritizes to the top of the list.