€cdocutils.nodes document q)q}q(U nametypesq}q(Xprivacy concernsqNXlogging and privacy concernsqNXimplications and issuesqNXpotential designsq NuUsubstitution_defsq }q Uparse_messagesq ]q Ucurrent_sourceqNU decorationqNUautofootnote_startqKUnameidsq}q(hUprivacy-concernsqhUlogging-and-privacy-concernsqhUimplications-and-issuesqh Upotential-designsquUchildrenq]qcdocutils.nodes section q)q}q(U rawsourceqUUparentqhUsourceqXl/var/lib/jenkins/jobs/API_Documentation_trunk/workspace/api-documentation/source/notes/LoggingAndPrivacy.txtqUtagnameq Usectionq!U attributesq"}q#(Udupnamesq$]Uclassesq%]Ubackrefsq&]Uidsq']q(haUnamesq)]q*hauUlineq+KUdocumentq,hh]q-(cdocutils.nodes title q.)q/}q0(hXLogging and Privacy concernsq1hhhhh Utitleq2h"}q3(h$]h%]h&]h']h)]uh+Kh,hh]q4cdocutils.nodes Text q5XLogging and Privacy concernsq6…q7}q8(hh1hh/ubaubcdocutils.nodes paragraph q9)q:}q;(hXBDesign decisions for DataONE have until now been focused on comprehensive and universal logging for all operations performed on Member Nodes and Coordinating Nodes. One rationale for this is that data providers have traditionally been unwilling to replicate their data for distribution by other parties because they have been unable to get usage metrics for these data. The current DataONE design for logging is based on 5 use cases that generally outline the need to provide log information to data providers (see :ref:`logging-use-case-synopsis` for summary of Use Cases 16, 17, 18, 19, and 20). Under the current :doc:`../design/LoggingSchema`, all operations are logged, recording the user's IP address, browser agent, the date and time and type of the operation, and the user's identity if they have authenticated to the system.hhhhh U paragraphq(h5XDesign decisions for DataONE have until now been focused on comprehensive and universal logging for all operations performed on Member Nodes and Coordinating Nodes. 
One rationale for this is that data providers have traditionally been unwilling to replicate their data for distribution by other parties because they have been unable to get usage metrics for these data. The current DataONE design for logging is based on 5 use cases that generally outline the need to provide log information to data providers (see q?…q@}qA(hXDesign decisions for DataONE have until now been focused on comprehensive and universal logging for all operations performed on Member Nodes and Coordinating Nodes. One rationale for this is that data providers have traditionally been unwilling to replicate their data for distribution by other parties because they have been unable to get usage metrics for these data. The current DataONE design for logging is based on 5 use cases that generally outline the need to provide log information to data providers (see hh:ubcsphinx.addnodes pending_xref qB)qC}qD(hX :ref:`logging-use-case-synopsis`qEhh:hhh U pending_xrefqFh"}qG(UreftypeXrefUrefwarnqHˆU reftargetqIXlogging-use-case-synopsisU refdomainXstdqJh']h&]U refexplicit‰h$]h%]h)]UrefdocqKXnotes/LoggingAndPrivacyqLuh+Kh]qMcdocutils.nodes inline qN)qO}qP(hhEh"}qQ(h$]h%]qR(UxrefqShJXstd-refqTeh&]h']h)]uhhCh]qUh5Xlogging-use-case-synopsisqV…qW}qX(hUhhOubah UinlineqYubaubh5XE for summary of Use Cases 16, 17, 18, 19, and 20). Under the current qZ…q[}q\(hXE for summary of Use Cases 16, 17, 18, 19, and 20). 
Under the current hh:ubhB)q]}q^(hX:doc:`../design/LoggingSchema`q_hh:hhh hFh"}q`(UreftypeXdocqahHˆhIX../design/LoggingSchemaU refdomainUh']h&]U refexplicit‰h$]h%]h)]hKhLuh+Kh]qbhN)qc}qd(hh_h"}qe(h$]h%]qf(hShaeh&]h']h)]uhh]h]qgh5X../design/LoggingSchemaqh…qi}qj(hUhhcubah hYubaubh5X», all operations are logged, recording the user's IP address, browser agent, the date and time and type of the operation, and the user's identity if they have authenticated to the system.qk…ql}qm(hX», all operations are logged, recording the user's IP address, browser agent, the date and time and type of the operation, and the user's identity if they have authenticated to the system.hh:ubeubh)qn}qo(hUhhhhh h!h"}qp(h$]h%]h&]h']qqhah)]qrhauh+Kh,hh]qs(h.)qt}qu(hXPrivacy concernsqvhhnhhh h2h"}qw(h$]h%]h&]h']h)]uh+Kh,hh]qxh5XPrivacy concernsqy…qz}q{(hhvhhtubaubh9)q|}q}(hXBRecently, discussions have pointed out that there are potential privacy concerns for data users associated with these logging policies, and that DataONE should consider cases where truly anonymous access to resources may be warranted. A comparison has been made to libraries, whereby patron access to resources is not recorded in order to avoid having to expose these records to third parties. A similar situation may exist where a data user does not want a data provider or other third parties to know that they accessed data in DataONE. Some example scenarios might include:q~hhnhhh hh„)r?}r@(hUh"}rA(h‰X-h']h&]h$]h%]h)]uhj;h]rBh‹)rC}rD(hXþUnder this scenario, data consumers would not authenticate against DataONE, and thus their identifying information would not be logged at MN or CN. However, under the current specification, their IP number would still be recorded, which may be sufficient to identify the user. 
The specification could be modified to eliminate the collection of IP numbers for the non-authenticated users, but this would significantly compromise our ability to analyze anonymous download statistics (e.g., geographic breakdown, differentiating web-crawler accesses versus user accesses, etc.). An alternative would be to create a mechanism to differentiate typical non-authenticated access (where IP numbers are recorded) from 'anonymous' access (where IP numbers are not recorded).h"}rE(h$]h%]h&]h']h)]uhj?h]rFh9)rG}rH(hXþUnder this scenario, data consumers would not authenticate against DataONE, and thus their identifying information would not be logged at MN or CN. However, under the current specification, their IP number would still be recorded, which may be sufficient to identify the user. The specification could be modified to eliminate the collection of IP numbers for the non-authenticated users, but this would significantly compromise our ability to analyze anonymous download statistics (e.g., geographic breakdown, differentiating web-crawler accesses versus user accesses, etc.). An alternative would be to create a mechanism to differentiate typical non-authenticated access (where IP numbers are recorded) from 'anonymous' access (where IP numbers are not recorded).rIhjChhh h