Wdocutils.nodesdocument)}( rawsourcechildren]hsection)}(hhh](htitle)}(hEnabling Web Searches: Sitemapsh]hTextEnabling Web Searches: Sitemaps}(hhparenthhhsourceNlineNuba attributes}(ids]classes]names]dupnames]backrefs]utagnamehhh hhhZ/var/lib/jenkins/jobs/metacat_beta/workspace/metacat/docs/user/metacat/source/sitemaps.rsthKubh paragraph)}(hXSitemaps are XML files that tell search engines - such as Google, which is discussed in this section - which URLs on your websites are available for crawling. Currently, the only way for a search engine to crawl and index Metacat so that individual metadata entries are available via Web searches is with a sitemap. Metacat automatically creates sitemaps for all public documents in the repository that meet these criteria:h]hXSitemaps are XML files that tell search engines - such as Google, which is discussed in this section - which URLs on your websites are available for crawling. Currently, the only way for a search engine to crawl and index Metacat so that individual metadata entries are available via Web searches is with a sitemap. 
Metacat automatically creates sitemaps for all public documents in the repository that meet these criteria:}(hh/hh-hhhNhNubah}(h]h!]h#]h%]h']uh)h+hh*hKhh hhubh bullet_list)}(hhh](h list_item)}(hIs publicly readableh]h,)}(hhDh]hIs publicly readable}(hhDhhFubah}(h]h!]h#]h%]h']uh)h+hh*hK hhBubah}(h]h!]h#]h%]h']uh)h@hh=hhhh*hNubhA)}(h Is metadatah]h,)}(hh[h]h Is metadata}(hh[hh]ubah}(h]h!]h#]h%]h']uh)h+hh*hK hhYubah}(h]h!]h#]h%]h']uh)h@hh=hhhh*hNubhA)}(h(Is the newest version in a version chainh]h,)}(hhrh]h(Is the newest version in a version chain}(hhrhhtubah}(h]h!]h#]h%]h']uh)h+hh*hK hhpubah}(h]h!]h#]h%]h']uh)h@hh=hhhh*hNubhA)}(hIs not archived h]h,)}(hIs not archivedh]hIs not archived}(hhhhubah}(h]h!]h#]h%]h']uh)h+hh*hKhhubah}(h]h!]h#]h%]h']uh)h@hh=hhhh*hNubeh}(h]h!]h#]h%]h']bullet-uh)h;hh*hK hh hhubh,)}(hZHowever, you must register the sitemaps with the search engine before it will take effect.h]hZHowever, you must register the sitemaps with the search engine before it will take effect.}(hhhhhhhNhNubah}(h]h!]h#]h%]h']uh)h+hh*hKhh hhubh )}(hhh](h)}(h Configurationh]h Configuration}(hhhhhhhNhNubah}(h]h!]h#]h%]h']uh)hhhhhhh*hKubh,)}(hXMetacat's sitemaps functionality is controlled by four properties in metacat.properties.h]hZMetacat’s sitemaps functionality is controlled by four properties in metacat.properties.}(hhhhhhhNhNubah}(h]h!]h#]h%]h']uh)h+hh*hKhhhhubh<)}(hhh](hA)}(hv``sitemap.enabled``: Controls whether sitemaps are automatically generated while Metacat is running. Defaults to true.h]h,)}(hv``sitemap.enabled``: Controls whether sitemaps are automatically generated while Metacat is running. Defaults to true.h](hliteral)}(h``sitemap.enabled``h]hsitemap.enabled}(hhhhubah}(h]h!]h#]h%]h']uh)hhhubhc: Controls whether sitemaps are automatically generated while Metacat is running. Defaults to true.}(hc: Controls whether sitemaps are automatically generated while Metacat is running. 
Defaults to true.hhubeh}(h]h!]h#]h%]h']uh)h+hh*hKhhubah}(h]h!]h#]h%]h']uh)h@hhhhhh*hNubhA)}(hu``sitemap.interval``: Controls the interval, in milliseconds, between rebuilding the sitemap index and sitemap files.h]h,)}(hu``sitemap.interval``: Controls the interval, in milliseconds, between rebuilding the sitemap index and sitemap files.h](h)}(h``sitemap.interval``h]hsitemap.interval}(hhhjubah}(h]h!]h#]h%]h']uh)hhjubha: Controls the interval, in milliseconds, between rebuilding the sitemap index and sitemap files.}(ha: Controls the interval, in milliseconds, between rebuilding the sitemap index and sitemap files.hjubeh}(h]h!]h#]h%]h']uh)h+hh*hKhjubah}(h]h!]h#]h%]h']uh)h@hhhhhh*hNubhA)}(hX-``sitemap.location.base``: Controls the URL pattern used in the ``sitemap_index.xml`` file. You can use either a full URL (e.g., ``https://example.com/some_path``) or a URL relative to your server (e.g., ``/some_path``). This is different than the ``sitemap.entry.base`` property (see directly below).h]h,)}(hX-``sitemap.location.base``: Controls the URL pattern used in the ``sitemap_index.xml`` file. You can use either a full URL (e.g., ``https://example.com/some_path``) or a URL relative to your server (e.g., ``/some_path``). This is different than the ``sitemap.entry.base`` property (see directly below).h](h)}(h``sitemap.location.base``h]hsitemap.location.base}(hhhj/ubah}(h]h!]h#]h%]h']uh)hhj+ubh': Controls the URL pattern used in the }(h': Controls the URL pattern used in the hj+ubh)}(h``sitemap_index.xml``h]hsitemap_index.xml}(hhhjBubah}(h]h!]h#]h%]h']uh)hhj+ubh, file. You can use either a full URL (e.g., }(h, file. You can use either a full URL (e.g., hj+ubh)}(h!``https://example.com/some_path``h]hhttps://example.com/some_path}(hhhjUubah}(h]h!]h#]h%]h']uh)hhj+ubh*) or a URL relative to your server (e.g., }(h*) or a URL relative to your server (e.g., hj+ubh)}(h``/some_path``h]h /some_path}(hhhjhubah}(h]h!]h#]h%]h']uh)hhj+ubh). This is different than the }(h). 
This is different than the hj+ubh)}(h``sitemap.entry.base``h]hsitemap.entry.base}(hhhj{ubah}(h]h!]h#]h%]h']uh)hhj+ubh property (see directly below).}(h property (see directly below).hj+ubeh}(h]h!]h#]h%]h']uh)h+hh*hKhj'ubah}(h]h!]h#]h%]h']uh)h@hhhhhh*hNubhA)}(hX``sitemap.entry.base``: Controls the URL pattern used for the entries in the individual sitemap files (e.g., ``sitemap1.xml``). You can use either a full URL (e.g., ``https://example.com/some_path``) or a URL relative to your server (e.g., ``/some_path``). h]h,)}(hX``sitemap.entry.base``: Controls the URL pattern used for the entries in the individual sitemap files (e.g., ``sitemap1.xml``). You can use either a full URL (e.g., ``https://example.com/some_path``) or a URL relative to your server (e.g., ``/some_path``).h](h)}(h``sitemap.entry.base``h]hsitemap.entry.base}(hhhjubah}(h]h!]h#]h%]h']uh)hhjubhW: Controls the URL pattern used for the entries in the individual sitemap files (e.g., }(hW: Controls the URL pattern used for the entries in the individual sitemap files (e.g., hjubh)}(h``sitemap1.xml``h]h sitemap1.xml}(hhhjubah}(h]h!]h#]h%]h']uh)hhjubh(). You can use either a full URL (e.g., }(h(). You can use either a full URL (e.g., hjubh)}(h!``https://example.com/some_path``h]hhttps://example.com/some_path}(hhhjubah}(h]h!]h#]h%]h']uh)hhjubh*) or a URL relative to your server (e.g., }(h*) or a URL relative to your server (e.g., hjubh)}(h``/some_path``h]h /some_path}(hhhjubah}(h]h!]h#]h%]h']uh)hhjubh).}(h).hjubeh}(h]h!]h#]h%]h']uh)h+hh*hK"hjubah}(h]h!]h#]h%]h']uh)h@hhhhhh*hNubeh}(h]h!]h#]h%]h']hhuh)h;hh*hKhhhhubeh}(h] configurationah!]h#] configurationah%]h']uh)h hh hhhh*hKubh )}(hhh](h)}(hCreating a Sitemaph]hCreating a Sitemap}(hj hj hhhNhNubah}(h]h!]h#]h%]h']uh)hhjhhhh*hK(ubh,)}(hX{Metacat automatically generates a sitemap file for all public documents in the repository on a daily basis. 
The sitemap file(s) must be available via the Web on your server, and must be registered with Google before they take effect. For information on the sitemap protocol, please refer to the Google page on using the sitemap protocol. You can view Metacat's sitemap files at::h]hX|Metacat automatically generates a sitemap file for all public documents in the repository on a daily basis. The sitemap file(s) must be available via the Web on your server, and must be registered with Google before they take effect. For information on the sitemap protocol, please refer to the Google page on using the sitemap protocol. You can view Metacat’s sitemap files at:}(hXzMetacat automatically generates a sitemap file for all public documents in the repository on a daily basis. The sitemap file(s) must be available via the Web on your server, and must be registered with Google before they take effect. For information on the sitemap protocol, please refer to the Google page on using the sitemap protocol. You can view Metacat's sitemap files at:hjhhhNhNubah}(h]h!]h#]h%]h']uh)h+hh*hK*hjhhubh literal_block)}(h/sitemapsh]h/sitemaps}(hhhj*ubah}(h]h!]h#]h%]h'] xml:spacepreserveuh)j(hK0hjhhhh*ubh,)}(h%The directory contains an index file:h]h%The directory contains an index file:}(hj<hj:hhhNhNubah}(h]h!]h#]h%]h']uh)h+hh*hK2hjhhubh block_quote)}(hhh]h,)}(hsitemap_index.xmlh]hsitemap_index.xml}(hjOhjMubah}(h]h!]h#]h%]h']uh)h+hh*hK4hjJubah}(h]h!]h#]h%]h']uh)jHhjhhhh*hNubh,)}(h(and one or more sitemap XML files named:h]h(and one or more sitemap XML files named:}(hjchjahhhNhNubah}(h]h!]h#]h%]h']uh)h+hh*hK6hjhhubjI)}(hhh]h,)}(hsitemap.xmlh]hsitemap.xml}(hjthjrubah}(h]h!]h#]h%]h']uh)h+hh*hK8hjoubah}(h]h!]h#]h%]h']uh)jHhjhhhh*hNubh,)}(hwhere ```` is a number (e.g., 1 or 2) used to increment each sitemap file. 
Because Metacat limits the number of sitemap entries in each sitemap file to 50,000, the servlet creates an additional sitemap file for each group of 50,000 entries.h](hwhere }(hwhere hjhhhNhNubh)}(h````h]h}(hhhjubah}(h]h!]h#]h%]h']uh)hhjubh is a number (e.g., 1 or 2) used to increment each sitemap file. Because Metacat limits the number of sitemap entries in each sitemap file to 50,000, the servlet creates an additional sitemap file for each group of 50,000 entries.}(h is a number (e.g., 1 or 2) used to increment each sitemap file. Because Metacat limits the number of sitemap entries in each sitemap file to 50,000, the servlet creates an additional sitemap file for each group of 50,000 entries.hjhhhNhNubeh}(h]h!]h#]h%]h']uh)h+hh*hK:hjhhubh,)}(hHVerify that your sitemap files are available to the Web by browsing to::h]hGVerify that your sitemap files are available to the Web by browsing to:}(hGVerify that your sitemap files are available to the Web by browsing to:hjhhhNhNubah}(h]h!]h#]h%]h']uh)h+hh*hK?hjhhubj))}(hd/sitemaps/sitemap.xml (e.g., https://example.org/metacat/sitemaps/sitemap1.xml)h]hd/sitemaps/sitemap.xml (e.g., https://example.org/metacat/sitemaps/sitemap1.xml)}(hhhjubah}(h]h!]h#]h%]h']j8j9uh)j(hKAhjhhhh*ubeh}(h]creating-a-sitemapah!]h#]creating a sitemapah%]h']uh)h hh hhhh*hK(ubh )}(hhh](h)}(hServing Your Sitemapsh]hServing Your Sitemaps}(hjhjhhhNhNubah}(h]h!]h#]h%]h']uh)hhjhhhh*hKEubh,)}(hXIn most scenarios, you'll want to take extra steps to make sure your sitemaps are served correctly so they're available and indexable by Google. Because Metacat places sitemap XML files in ``/sitemaps``, you'll need to configure your web server to serve these files.h](hIn most scenarios, you’ll want to take extra steps to make sure your sitemaps are served correctly so they’re available and indexable by Google. 
Because Metacat places sitemap XML files in }(hIn most scenarios, you'll want to take extra steps to make sure your sitemaps are served correctly so they're available and indexable by Google. Because Metacat places sitemap XML files in hjhhhNhNubh)}(h``/sitemaps``h]h/sitemaps}(hhhjubah}(h]h!]h#]h%]h']uh)hhjubhB, you’ll need to configure your web server to serve these files.}(h@, you'll need to configure your web server to serve these files.hjhhhNhNubeh}(h]h!]h#]h%]h']uh)h+hh*hKGhjhhubh,)}(hAs an example, a sample configuration is presented for the Apache 2 web server that uses `mod_rewrite` to redirect clients accessing your sitemaps from the top level of your website to their location under the Metacat deployment context:h](hYAs an example, a sample configuration is presented for the Apache 2 web server that uses }(hYAs an example, a sample configuration is presented for the Apache 2 web server that uses hjhhhNhNubhtitle_reference)}(h `mod_rewrite`h]h mod_rewrite}(hhhj ubah}(h]h!]h#]h%]h']uh)j hjubh to redirect clients accessing your sitemaps from the top level of your website to their location under the Metacat deployment context:}(h to redirect clients accessing your sitemaps from the top level of your website to their location under the Metacat deployment context:hjhhhNhNubeh}(h]h!]h#]h%]h']uh)h+hh*hKLhjhhubh,)}(h'(Note: Ensure `mod_rewrite` is enabled)h](h(Note: Ensure }(h(Note: Ensure hj$hhhNhNubj )}(h `mod_rewrite`h]h mod_rewrite}(hhhj-ubah}(h]h!]h#]h%]h']uh)j hj$ubh is enabled)}(h is enabled)hj$hhhNhNubeh}(h]h!]h#]h%]h']uh)h+hh*hKPhjhhubj))}(h6RewriteRule ^/(sitemap.+) /metacat/sitemaps/$1 [R=303]h]h6RewriteRule ^/(sitemap.+) /metacat/sitemaps/$1 [R=303]}(hhhjFubah}(h]h!]h#]h%]h']j8j9languagetextlinenoshighlight_args}uh)j(hh*hKRhjhhubh,)}(hYou should also ensure your ``robots.txt`` file correctly points to the location of the ``sitemap_index.xml``. 
e.g., for example.org:h](hYou should also ensure your }(hYou should also ensure your hjYhhhNhNubh)}(h``robots.txt``h]h robots.txt}(hhhjbubah}(h]h!]h#]h%]h']uh)hhjYubh. file correctly points to the location of the }(h. file correctly points to the location of the hjYhhhNhNubh)}(h``sitemap_index.xml``h]hsitemap_index.xml}(hhhjuubah}(h]h!]h#]h%]h']uh)hhjYubh. e.g., for example.org:}(h. e.g., for example.org:hjYhhhNhNubeh}(h]h!]h#]h%]h']uh)h+hh*hKVhjhhubh,)}(h``robots.txt``:h](h)}(h``robots.txt``h]h robots.txt}(hhhjubah}(h]h!]h#]h%]h']uh)hhjubh:}(h:hjhhhNhNubeh}(h]h!]h#]h%]h']uh)h+hh*hKYhjhhubj))}(hFUser-agent: * Allow: / sitemap: https://example.org/sitemap_index.xmlh]hFUser-agent: * Allow: / sitemap: https://example.org/sitemap_index.xml}(hhhjubah}(h]h!]h#]h%]h']j8j9jTtextjVjW}uh)j(hh*hK[hjhhubeh}(h]serving-your-sitemapsah!]h#]serving your sitemapsah%]h']uh)h hh hhhh*hKEubh )}(hhh](h)}(hRegistering a Sitemaph]hRegistering a Sitemap}(hjhjhhhNhNubah}(h]h!]h#]h%]h']uh)hhjhhhh*hKcubh,)}(hBefore Google will begin indexing the public files in your Metacat, you must register the sitemaps. To register your sitemaps and ensure that they are up to date:h]hBefore Google will begin indexing the public files in your Metacat, you must register the sitemaps. To register your sitemaps and ensure that they are up to date:}(hjhjhhhNhNubah}(h]h!]h#]h%]h']uh)h+hh*hKdhjhhubhenumerated_list)}(hhh](hA)}(hZRegister for a Google Webmaster Tools account, and add your Metacat site to the Dashboard.h]h,)}(hZRegister for a Google Webmaster Tools account, and add your Metacat site to the Dashboard.h]hZRegister for a Google Webmaster Tools account, and add your Metacat site to the Dashboard.}(hjhjubah}(h]h!]h#]h%]h']uh)h+hh*hKhhjubah}(h]h!]h#]h%]h']uh)h@hjhhhh*hNubhA)}(hFrom your Google Webmaster Tools site account, register your sitemaps. See the Google help site for more information about how to register sitemaps. 
Note: Register the full URL path to your sitemap files, including the http:// (or https://) headers. h]h,)}(hFrom your Google Webmaster Tools site account, register your sitemaps. See the Google help site for more information about how to register sitemaps. Note: Register the full URL path to your sitemap files, including the http:// (or https://) headers.h](hFrom your Google Webmaster Tools site account, register your sitemaps. See the Google help site for more information about how to register sitemaps. Note: Register the full URL path to your sitemap files, including the }(hFrom your Google Webmaster Tools site account, register your sitemaps. See the Google help site for more information about how to register sitemaps. Note: Register the full URL path to your sitemap files, including the hjubh reference)}(hhttp://h]hhttp://}(hhhjubah}(h]h!]h#]h%]h']refurijuh)j hjubh (or }(h (or hjubj )}(hhttps://h]hhttps://}(hhhj"ubah}(h]h!]h#]h%]h']refurij$uh)j hjubh ) headers.}(h ) headers.hjubeh}(h]h!]h#]h%]h']uh)h+hh*hKjhjubah}(h]h!]h#]h%]h']uh)h@hjhhhh*hNubeh}(h]h!]h#]h%]h']enumtypearabicprefixhsuffix.uh)jhjhhhh*hKhubh,)}(hmOnce the sitemaps are registered, Google will begin to index the public documents in your Metacat repository.h]hmOnce the sitemaps are registered, Google will begin to index the public documents in your Metacat repository.}(hjOhjMhhhNhNubah}(h]h!]h#]h%]h']uh)h+hh*hKohjhhubh,)}(hNOTE: As you add more publicly accessible data to Metacat, you will need to periodically revisit the Google Webmaster Tools utility to refresh your sitemap registration.h]hNOTE: As you add more publicly accessible data to Metacat, you will need to periodically revisit the Google Webmaster Tools utility to refresh your sitemap registration.}(hj]hj[hhhNhNubah}(h]h!]h#]h%]h']uh)h+hh*hKrhjhhubeh}(h]registering-a-sitemapah!]h#]registering a sitemapah%]h']uh)h hh hhhh*hKcubeh}(h]enabling-web-searches-sitemapsah!]h#]enabling web searches: sitemapsah%]h']uh)h 
hhhhhh*hKubah}(h]h!]h#]h%]h']sourceh*uh)hcurrent_sourceN current_lineNsettingsdocutils.frontendValues)}(hN generatorN datestampN source_linkN source_urlN toc_backlinksentryfootnote_backlinksK sectnum_xformKstrip_commentsNstrip_elements_with_classesN strip_classesN report_levelK halt_levelKexit_status_levelKdebugNwarning_streamN tracebackinput_encoding utf-8-siginput_encoding_error_handlerstrictoutput_encodingutf-8output_encoding_error_handlerjerror_encodingUTF-8error_encoding_error_handlerbackslashreplace language_codeenrecord_dependenciesNconfigN id_prefixhauto_id_prefixid dump_settingsNdump_internalsNdump_transformsNdump_pseudo_xmlNexpose_internalsNstrict_visitorN_disable_configN_sourceh* _destinationN _config_files]pep_referencesN pep_base_url https://www.python.org/dev/peps/pep_file_url_templatepep-%04drfc_referencesN rfc_base_urlhttps://tools.ietf.org/html/ tab_widthKtrim_footnote_reference_spacefile_insertion_enabled raw_enabledKsyntax_highlightlong smart_quotessmartquotes_localesNcharacter_level_inline_markupdoctitle_xform docinfo_xformKsectsubtitle_xformembed_stylesheetcloak_email_addressesenvNgettext_compactubreporterNindirect_targets]substitution_defs}substitution_names}refnames}refids}nameids}(jvjsjjjjjjjnjku nametypes}(jvNjNjNjNjnNuh}(jsh jhjjjjjkju footnote_refs} citation_refs} autofootnotes]autofootnote_refs]symbol_footnotes]symbol_footnote_refs] footnotes] citations]autofootnote_startKsymbol_footnote_startKid_startKparse_messages](hsystem_message)}(hhh](h,)}(hhh]hTitle underline too short.}(hhhjubah}(h]h!]h#]h%]h']uh)h+hjubj))}(h(Serving Your Sitemaps ------------------h]h(Serving Your Sitemaps ------------------}(hhhjubah}(h]h!]h#]h%]h']j8j9uh)j(hjubeh}(h]h!]h#]h%]h']levelKtypeWARNINGlineKEsourceh*uh)jubj)}(hhh](h,)}(hTitle underline too short.h]hTitle underline too short.}(hhhj!ubah}(h]h!]h#]h%]h']uh)h+hjubj))}(h(Serving Your Sitemaps ------------------h]h(Serving Your Sitemaps 
------------------}(hhhj/ubah}(h]h!]h#]h%]h']j8j9uh)j(hjubeh}(h]h!]h#]h%]h']levelKtypejlineKEsourceh*uh)jhjhhhh*hKEubetransform_messages] transformerN decorationNhhub.