XML Sitemaps and Robots.txt

Sitemaps help search engines navigate your website and index its content more effectively. This guide shows how to add an XML sitemap to your LemonStand store to tell search engines which content to index, and how to pair it with a robots.txt file that defines the content to exclude.

Adding an XML Sitemap Page

Sitemaps are an easy way for webmasters to inform search engines about pages on their sites that are available for crawling. In its simplest form, a Sitemap is an XML file that lists URLs for a site along with additional metadata about each URL (when it was last updated, how often it usually changes, and how important it is, relative to other URLs in the site) so that search engines can more intelligently crawl the site. Follow the steps below to create a Sitemap page for your online store.
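For reference, a single Sitemap entry carrying all of the optional metadata described above might look like the sketch below. The product URL and the values shown are hypothetical placeholders; only the element names and the namespace come from the sitemaps.org 0.9 protocol.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
   <url>
      <!-- Hypothetical product URL, used for illustration only -->
      <loc>https://storename.lemonstand.com/product/example-product</loc>
      <!-- When the page was last updated (W3C datetime format) -->
      <lastmod>2024-01-15T09:30:00+00:00</lastmod>
      <!-- How often the page usually changes: always, hourly, daily,
           weekly, monthly, yearly or never -->
      <changefreq>weekly</changefreq>
      <!-- Importance relative to other URLs on the same site, 0.0 to 1.0 -->
      <priority>0.8</priority>
   </url>
</urlset>
```

Only the loc element is required for each entry; lastmod, changefreq and priority are optional hints, and search engines may choose to ignore them.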

Begin by creating a new Template. Set the Code to sitemap and the Content Type to text/xml; charset=utf-8, and include the {{ page() }} function call in the Content field. Click the Save & Close button.

Next, create a new Page, setting the Name and Template fields to sitemap.

In this example we'll generate the URL of each product page, which has the structure site_url/product/product_name, along with the date the product was last modified.

Set the Page Action to shop:products and paste the following code into the Content field:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
   {# Emit one <url> entry per product #}
   {% for product in products %}
      {% set page_url = "https:#{site_url('/')}product/#{product.url_name}" %}
      <url>
         <loc>{{ page_url }}</loc>
         <lastmod>{{ product.updated_at|date("Y-m-d\\TH:i:sP") }}</lastmod>
      </url>
   {% endfor %}
</urlset>

Now, if you navigate to storename.lemonstand.com/sitemap in Google Chrome, you'll be able to view the Sitemap. To see the correctly formatted XML, view the page source.

As well as basic URL information, Sitemaps can contain detailed information about specific types of content on your site, including video, images and news.
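As an illustration of such content-specific information, the sketch below adds an image to a product entry using Google's image Sitemap extension. The product and image URLs are hypothetical; how you obtain image URLs for a product in your LemonStand theme is not covered here and would need to be adapted to your store.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
   <url>
      <!-- Hypothetical product page -->
      <loc>https://storename.lemonstand.com/product/example-product</loc>
      <!-- One image:image element per image that appears on the page -->
      <image:image>
         <!-- Hypothetical image URL -->
         <image:loc>https://storename.lemonstand.com/images/example-product.jpg</image:loc>
      </image:image>
   </url>
</urlset>
```

The extension is declared as an additional namespace on the urlset element, so the standard loc and lastmod elements can be used alongside it unchanged.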

Adding a Robots.txt Page

The robots exclusion standard, also known as the robots exclusion protocol or simply robots.txt, is a standard used by websites to communicate with web crawlers and other web robots. The standard specifies how to inform the web robot about which areas of the website should not be processed or scanned. It can also be used to inform search engines about the location of your Sitemap. Follow the steps below to create a robots.txt file for your online store.

In the "Editor" create a new template and set the "Content-Type" to "text/plain; charset=utf-8". The contents of your template should simply be:

 {{ page() }}

Now, create a page and set the URL to "/robots.txt". Make sure the page uses the template created in the first step. The contents of the page should be similar to:

# robots.txt for http://www.example.com/
User-agent: *
Disallow: /cyberworld/map/ # This is an infinite virtual URL space
Disallow: /tmp/ # these will soon disappear
Disallow: /foo.html

You can also add a line specifying the location of the Sitemap to the site's robots.txt file, as shown below:

Sitemap: https://storename.lemonstand.com/sitemap

Below is an example of how it would look to add a robots.txt page.