Äcdocutils.nodes
document
q)Åq}q(U	nametypesq}q(X���transaction ratesqNX���time and bandwidth constraintsqNX���cn - cn transfer ratesqNuUsubstitution_defsq	}q
Uparse_messagesq]qUcurrent_sourceq
NU
decorationqNUautofootnote_startqKUnameidsq}q(hUtransaction-ratesqhUtime-and-bandwidth-constraintsqhUcn-cn-transfer-ratesquUchildrenq]qcdocutils.nodes
section
q)Åq}q(U	rawsourceqU�UparentqhUsourceqXu���/var/lib/jenkins/jobs/API_Documentation_trunk/workspace/api-documentation/source/notes/time_bandwidth_constraints.txtqUtagnameqUsectionqU
attributesq }q!(Udupnamesq"]Uclassesq#]Ubackrefsq$]Uidsq%]q&haUnamesq']q(hauUlineq)KUdocumentq*hh]q+(cdocutils.nodes
title
q,)Åq-}q.(hX���Time and Bandwidth Constraintsq/hhhhhUtitleq0h }q1(h"]h#]h$]h%]h']uh)Kh*hh]q2cdocutils.nodes
Text
q3X���Time and Bandwidth Constraintsq4ÖÅq5}q6(hh/hh-ubaubcdocutils.nodes
paragraph
q7)Åq8}q9(hXÏ���Given the DataONE architecture, estimate the constraints on rates of data
acquisition, the size of data objects, and the number of simultaneous users
that may be supported. There are of course, interactions between each of these
metricsq:hhhhhU	paragraphq;h }q<(h"]h#]h$]h%]h']uh)Kh*hh]q=h3X���Given the DataONE architecture, estimate the constraints on rates of data
acquisition, the size of data objects, and the number of simultaneous users
that may be supported. There are of course, interactions between each of these
metricsq>ÖÅq?}q@(hh:hh8ubaubh)ÅqA}qB(hU�hhhhhhh }qC(h"]h#]h$]h%]qDhah']qEhauh)K
h*hh]qF(h,)ÅqG}qH(hX���CN - CN Transfer RatesqIhhAhhhh0h }qJ(h"]h#]h$]h%]h']uh)K
h*hh]qKh3X���CN - CN Transfer RatesqLÖÅqM}qN(hhIhhGubaubh7)ÅqO}qP(hXI���Goal - what is the average rate of data transfer between each of the CNs.qQhhAhhhh;h }qR(h"]h#]h$]h%]h']uh)Kh*hh]qSh3XI���Goal - what is the average rate of data transfer between each of the CNs.qTÖÅqU}qV(hhQhhOubaubh7)ÅqW}qX(hXb���Four random files of sizes 1MB, 10MB, 100MB and 1GB were generated using
variants of the command::hhAhhhh;h }qY(h"]h#]h$]h%]h']uh)Kh*hh]qZh3Xa���Four random files of sizes 1MB, 10MB, 100MB and 1GB were generated using
variants of the command:q[ÖÅq\}q](hXa���Four random files of sizes 1MB, 10MB, 100MB and 1GB were generated using
variants of the command:hhWubaubcdocutils.nodes
literal_block
q^)Åq_}q`(hX8���dd if=/dev/urandom of=test_100M.bin bs=1048576 count=100hhAhhhU
literal_blockqah }qb(U	xml:spaceqcUpreserveqdh%]h$]h"]h#]h']uh)Kh*hh]qeh3X8���dd if=/dev/urandom of=test_100M.bin bs=1048576 count=100qfÖÅqg}qh(hU�hh_ubaubh7)Åqi}qj(hX¿���These were placed in a location (/var/www/test) that can be served by the apache
web server running on each of the CNs, and a script to time retrieval of the
documents from each node executed.qkhhAhhhh;h }ql(h"]h#]h$]h%]h']uh)Kh*hh]qmh3X¿���These were placed in a location (/var/www/test) that can be served by the apache
web server running on each of the CNs, and a script to time retrieval of the
documents from each node executed.qnÖÅqo}qp(hhkhhiubaubcsphinx.ext.graphviz
graphviz
qq)Åqr}qs(hU�hhAhhhUgraphvizqth }qu(UcodeqvXc��graph {

  fontname = "Courier";
  fontsize = 9;


  edge [
    fontname = "Courier"
    fontsize = 9
    color = "#333333"
    arrowhead = "open"
    arrowsize = 0.5
    len = 0.2
    dir = forward
    ljust = "l"
    ];

  node [
    fontname = "Courier"
    fontsize = 9
    fontcolor = "black"
    ljust = "l"];


UNM -- UCSB [label="1.1 (0.89)\n5.4 (1.84)\n30 (3.29)\n284 (3.51)"]
UCSB -- UNM [label="1.0 (1.00)\n5.6 (1.76)\n25 (3.89)\n232 (4.30)"];
UNM -- ORC [label="9.2 (0.11)\n14.2 (0.71)\n62 (1.61)\n553 (1.81)"]
ORC -- UNM [label="0.9 (0.54)\n2.1 (1.4)\n19.2 (5.2)\n144 (6.93)"]
UCSB -- ORC [label="9.2 (0.11)\n14.2 (0.7)\n40 (2.5)\n255 (3.91)"]
ORC -- UCSB [label="1.1 (0.86)\n5.7 (1.74)\n26 (3.77)\n268 (3.72)"]
UNM -- Home [label="2.2 (0.44)\n14.3 (0.70)"]
UCSB -- Home  [label="2.4 (0.40)\n14.5 (0.69)"]
ORC -- Home  [label="1.4 (0.70)\n11.7 (0.86)"]
}h%]h$]h"]h#]h']Uoptionsqw}uh)K;h*hh]ubh7)Åqx}qy(hXã��Preliminary results are shown in diagram above. Numbers on left are seconds,
numbers in parentheses are MB/sec. Each row represents average of three
transfers for each of the four file sizes of 1MB, 10MB, 100MB, and 1GB
respectively. For example, the time taken to transfer 100MB from UCSB to ORC
was 40 seconds. Only first two values are shown for transfers to Home (Verizon
FIOS in Annapolis).qzhhAhhhh;h }q{(h"]h#]h$]h%]h']uh)K<h*hh]q|h3X�Preliminary results are shown in diagram above. Numbers on left are seconds,
numbers in parentheses are MB/sec. Each row represents average of three
transfers for each of the four file sizes of 1MB, 10MB, 100MB, and 1GB
respectively. For example, the time taken to transfer 100MB from UCSB to ORC
was 40 seconds. Only first two values are shown for transfers to Home (Verizon
FIOS in Annapolis).q}ÖÅq~}q(hhzhhxubaubeubh)ÅqÄ}qÅ(hU�hhhhhhh }qÇ(h"]h#]h$]h%]qÉhah']qÑhauh)KEh*hh]qÖ(h,)ÅqÜ}qá(hX���Transaction RatesqàhhÄhhhh0h }qâ(h"]h#]h$]h%]h']uh)KEh*hh]qäh3X���Transaction RatesqãÖÅqå}qç(hhàhhÜubaubh^)Åqé}qè(hX‰��nCN = # of coordinating nodes
nD = # of data objects
nM = # of science metadata objects
nY = # of system metadata objects
nr = # of replicas of each data object
n0 = total number of objects before synchronization or replication
n1 = total number of objects after synchronization
n2 = total number of objects after replication
D = difference in object count between start and steady state

nY = nM + nD

n0 = nY + nM + nD

n1 = nY*nCN + nM*nCN + n0

n2 = nY + nr * nD + n1

D = n2 - n0hhÄhhhhah }qê(hchdh%]h$]h"]h#]h']uh)KIh*hh]qëh3X‰��nCN = # of coordinating nodes
nD = # of data objects
nM = # of science metadata objects
nY = # of system metadata objects
nr = # of replicas of each data object
n0 = total number of objects before synchronization or replication
n1 = total number of objects after synchronization
n2 = total number of objects after replication
D = difference in object count between start and steady state

nY = nM + nD

n0 = nY + nM + nD

n1 = nY*nCN + nM*nCN + n0

n2 = nY + nr * nD + n1

D = n2 - n0qíÖÅqì}qî(hU�hhéubaubh7)Åqï}qñ(hX���So, if::qóhhÄhhhh;h }qò(h"]h#]h$]h%]h']uh)K]h*hh]qôh3X���So, if:qöÖÅqõ}qú(hX���So, if:hhïubaubh^)Åqù}qû(hX-���nD = nM = 1, n0 = 4, n1 = 13, n2 = 18, D = 14hhÄhhhhah }qü(hchdh%]h$]h"]h#]h']uh)K_h*hh]q†h3X-���nD = nM = 1, n0 = 4, n1 = 13, n2 = 18, D = 14q°ÖÅq¢}q£(hU�hhùubaubh7)Åq§}q•(hXñ���If nD = 100,000 D = 1.4e6. The approximate (actually minimum) transaction rate
(t) to reach steady state after d days for this number of new objects::hhÄhhhh;h }q¶(h"]h#]h$]h%]h']uh)Kah*hh]qßh3Xï���If nD = 100,000 D = 1.4e6. The approximate (actually minimum) transaction rate
(t) to reach steady state after d days for this number of new objects:q®ÖÅq©}q™(hXï���If nD = 100,000 D = 1.4e6. The approximate (actually minimum) transaction rate
(t) to reach steady state after d days for this number of new objects:hh§ubaubh^)Åq´}q¨(hXB���d = 1   t = 16.2
d = 7   t = 2.3
d = 30  t = 0.54
d = 365 t = 0.04hhÄhhhhah }q≠(hchdh%]h$]h"]h#]h']uh)Kdh*hh]qÆh3XB���d = 1   t = 16.2
d = 7   t = 2.3
d = 30  t = 0.54
d = 365 t = 0.04qØÖÅq∞}q±(hU�hh´ubaubh7)Åq≤}q≥(hX���if nD = 1,000,000::q¥hhÄhhhh;h }qµ(h"]h#]h$]h%]h']uh)Kih*hh]q∂h3X���if nD = 1,000,000:q∑ÖÅq∏}qπ(hX���if nD = 1,000,000:hh≤ubaubh^)Åq∫}qª(hX?���d = 1   t = 162
d = 7   t = 23
d = 30  t = 5.4
d = 365 t = 0.44hhÄhhhhah }qº(hchdh%]h$]h"]h#]h']uh)Kkh*hh]qΩh3X?���d = 1   t = 162
d = 7   t = 23
d = 30  t = 5.4
d = 365 t = 0.44qæÖÅqø}q¿(hU�hh∫ubaubh7)Åq¡}q¬(hX
���if nD = 1e9::q√hhÄhhhh;h }qƒ(h"]h#]h$]h%]h']uh)Kph*hh]q≈h3X���if nD = 1e9:q∆ÖÅq«}q»(hX���if nD = 1e9:hh¡ubaubh^)Åq…}q (hXE���d = 1   t = 162000
d = 7   t = 23000
d = 30  t = 5400
d = 365 t = 443hhÄhhhhah }qÀ(hchdh%]h$]h"]h#]h']uh)Krh*hh]qÃh3XE���d = 1   t = 162000
d = 7   t = 23000
d = 30  t = 5400
d = 365 t = 443qÕÖÅqŒ}qœ(hU�hh…ubaubh7)Åq–}q—(hX ��Note that there will be many small additions of content, not necessarily a
single large chunk except in the case where a total rebuild is required. These
figures provide a quantitative basis for some indication as to what sort of
capacity can be handled by the infrastructure given the fundamental constraint
of the performance of the Coordinating Node replicated object store and the
overall latency of operations across the network. A few key observations:q“hhÄhhhh;h }q”(h"]h#]h$]h%]h']uh)Kxh*hh]q‘h3X ��Note that there will be many small additions of content, not necessarily a
single large chunk except in the case where a total rebuild is required. These
figures provide a quantitative basis for some indication as to what sort of
capacity can be handled by the infrastructure given the fundamental constraint
of the performance of the Coordinating Node replicated object store and the
overall latency of operations across the network. A few key observations:q’ÖÅq÷}q◊(hh“hh–ubaubcdocutils.nodes
bullet_list
qÿ)ÅqŸ}q⁄(hU�hhÄhhhUbullet_listq€h }q‹(Ubulletq›X���-h%]h$]h"]h#]h']uh)Kh*hh]qfi(cdocutils.nodes
list_item
qfl)Åq‡}q·(hXs���Adding 1 data set along with its science and system metadata causes creation
of 14 new data objects in the system.
hhŸhhhU	list_itemq‚h }q„(h"]h#]h$]h%]h']uh)Nh*hh]q‰h7)ÅqÂ}qÊ(hXr���Adding 1 data set along with its science and system metadata causes creation
of 14 new data objects in the system.qÁhh‡hhhh;h }qË(h"]h#]h$]h%]h']uh)Kh]qÈh3Xr���Adding 1 data set along with its science and system metadata causes creation
of 14 new data objects in the system.qÍÖÅqÎ}qÏ(hhÁhhÂubaubaubhfl)ÅqÌ}qÓ(hXO���Refactoring the data store, system metadata can be a very expensive
operation.
hhŸhhhh‚h }qÔ(h"]h#]h$]h%]h']uh)Nh*hh]qh7)ÅqÒ}qÚ(hXN���Refactoring the data store, system metadata can be a very expensive
operation.qÛhhÌhhhh;h }qÙ(h"]h#]h$]h%]h']uh)KÇh]qıh3XN���Refactoring the data store, system metadata can be a very expensive
operation.qˆÖÅq˜}q¯(hhÛhhÒubaubaubhfl)Åq˘}q˙(hXî���Overall network impact must be taken into consideration when bringing on a
new Member Node or when a Member Node adds a significant volume of data.
hhŸhhhh‚h }q˚(h"]h#]h$]h%]h']uh)Nh*hh]q¸h7)Åq˝}q˛(hXì���Overall network impact must be taken into consideration when bringing on a
new Member Node or when a Member Node adds a significant volume of data.qˇhh˘hhhh;h }r���(h"]h#]h$]h%]h']uh)KÖh]r��h3Xì���Overall network impact must be taken into consideration when bringing on a
new Member Node or when a Member Node adds a significant volume of data.r��ÖÅr��}r��(hhˇhh˝ubaubaubhfl)År��}r��(hXÛ���Preference should be towards less granularity of data. For example, a single
natural history collection alone may have several million records. These
should be contributed to DataONE as a collection not as individual data
objects per specimen.hhŸhhhh‚h }r��(h"]h#]h$]h%]h']uh)Nh*hh]r��h7)År	��}r
��(hX���Preference should be towards less granularity of data. For example, a single
natural history collection alone may have several million records. These
should be contributed to DataONE as a collection not as individual data
objects per specimen.r��hj��hhhh;h }r��(h"]h#]h$]h%]h']uh)Kàh]r
��h3X���Preference should be towards less granularity of data. For example, a single
natural history collection alone may have several million records. These
should be contributed to DataONE as a collection not as individual data
objects per specimen.r��ÖÅr��}r��(hj��hj	��ubaubaubeubeubeubahU�Utransformerr��NU
footnote_refsr��}r��Urefnamesr��}r��Usymbol_footnotesr��]r��Uautofootnote_refsr��]r��Usymbol_footnote_refsr��]r��U	citationsr��]r��h*hUcurrent_liner��NUtransform_messagesr��]r ��Ureporterr!��NUid_startr"��KU
autofootnotesr#��]r$��U
citation_refsr%��}r&��Uindirect_targetsr'��]r(��Usettingsr)��(cdocutils.frontend
Values
r*��or+��}r,��(Ufootnote_backlinksr-��KUrecord_dependenciesr.��NUrfc_base_urlr/��Uhttps://tools.ietf.org/html/r0��U	tracebackr1��àUpep_referencesr2��NUstrip_commentsr3��NU
toc_backlinksr4��Uentryr5��U
language_coder6��Uenr7��U	datestampr8��NUreport_levelr9��KU_destinationr:��NU
halt_levelr;��KU
strip_classesr<��Nh0NUerror_encoding_error_handlerr=��Ubackslashreplacer>��Udebugr?��NUembed_stylesheetr@��âUoutput_encoding_error_handlerrA��UstrictrB��U
sectnum_xformrC��KUdump_transformsrD��NU
docinfo_xformrE��KUwarning_streamrF��NUpep_file_url_templaterG��Upep-%04drH��Uexit_status_levelrI��KUconfigrJ��NUstrict_visitorrK��NUcloak_email_addressesrL��àUtrim_footnote_reference_spacerM��âUenvrN��NUdump_pseudo_xmlrO��NUexpose_internalsrP��NUsectsubtitle_xformrQ��âUsource_linkrR��NUrfc_referencesrS��NUoutput_encodingrT��Uutf-8rU��U
source_urlrV��NUinput_encodingrW��U	utf-8-sigrX��U_disable_configrY��NU	id_prefixrZ��U�U	tab_widthr[��KUerror_encodingr\��UUTF-8r]��U_sourcer^��hUgettext_compactr_��àU	generatorr`��NUdump_internalsra��NUsmart_quotesrb��âUpep_base_urlrc��U https://www.python.org/dev/peps/rd��Usyntax_highlightre��Ulongrf��Uinput_encoding_error_handlerrg��jB��Uauto_id_prefixrh��Uidri��Udoctitle_xformrj��âUstrip_elements_with_classesrk��NU
_config_filesrl��]Ufile_insertion_enabledrm��àUraw_enabledrn��KU
dump_settingsro��NubUsymbol_footnote_startrp��K�Uidsrq��}rr��(hhÄhhhhAuUsubstitution_namesrs��}rt��hh*h }ru��(h"]h%]h$]Usourcehh#]h']uU	footnotesrv��]rw��Urefidsrx��}ry��ub.