How to Extract XML Sitemap URLs in Google Sheets?

Google Sheet Formula to Extract Sitemap URLs

In this tutorial, you will learn how to extract URLs from a XML sitemap using a custom function of Google sheets.

Follow these steps to extract URLs of XML sitemap:

  1. Let’s start, first get the URL of XML sitemap from you want to extract URLs. eg: www.example.com/sitemap.xml.
  2. Now you need to create a new Google spreadsheet. Type sheets.new in the URL bar to create a new Google spreadsheet.
  3. Now, go to menu bar and select script editor under the tools option.

    Google Sheets Script Editor
  4. Now you will redirect to script page, in this script page, we will create a custom JavaScript function. For this copy below code and paste into your script editor and save it with your desire name.

    Google Sheets Script Editor
/**
* Returns URLs in sitemap.xml file
*
* @param {"https://www.google.com/gmail/sitemap.xml"} sitemapUrl REQUIRED The url of the sitemap
* @param {"http://www.sitemaps.org/schemas/sitemap/0.9"} namespace REQUIRED Look at the source of the xml sitemap, look for the xmlns value 
* @return Returns urls <loc> from an xml sitemap
* @customfunction
*/

function sitemap(sitemapUrl,namespace) {
  
  try {
    var xml = UrlFetchApp.fetch(sitemapUrl).getContentText();
    var document = XmlService.parse(xml);
    var root = document.getRootElement()
    var sitemapNameSpace = XmlService.getNamespace(namespace);
    
    var urls = root.getChildren('url', sitemapNameSpace)
    var locs = []
    
    for (var i=0;i <urls.length;i++) {
      locs.push(urls[i].getChild('loc', sitemapNameSpace).getText()) 
    }
    
    return locs  
  } catch (e) {
    return e 
  }
}

Now formula has been created, continue to next process:

  1. Get back to sheet and type =sitemapurl() which is the function name that we have created in the script editor.
    Sitemap Custom Function Google Sheets
  2. This function requires two parameters, first is sitemap URL (eg: http://www.countingcharacters.com/sitemap.xml) and second is namespace (eg: http://www.sitemaps.org/schemas/sitemap/0.9) of sitemap.
    Google Sheet Formula to Extract Sitemap URLs
  3. After putting the parameters, press enter to run a function.
  4. Now it’s done, you can see the list of URLs that are present in your sitemap.
extract urls from xml sitemap

I hope you understand the XML sitemap URLs extracting process by using the custom function of Google sheets.

If you are interested in Google sheets formulas, so you should also read this blog that about IMPORTXML function.

Still, you have any query, please let me know in the comment section.

If you learned something valuable, please share with others. ❤👍