Introducing Google’s New Sitemap Generator

Google Webmaster Central has released a brand-new sitemap generator for Web server administrators who want to automate the generation of XML sitemaps on their servers. Since I haven’t seen much written about it on the blogs I read, I decided to cover it here.

The new open-source sitemap generator, currently in beta, is a software application that makes it easier for search engines to find your Web sites’ content. (Yes, I said Web sites.) Whether your Web server hosts one Web site or hundreds, the Google Sitemap Generator can create industry-standard XML sitemaps and submit them automatically to your search engines of choice (I personally prefer Google, Yahoo, and MSN/Live; your mileage may vary). Here’s how it works.

What does the Google Sitemap Generator do?

Unlike other sitemap generators, which create XML sitemaps by crawling Web sites, Google’s Sitemap Generator monitors your Web server’s traffic and reads your server’s logs, files and other data, which it then uses to create and update a customized XML sitemap file for every Web site on your server. It can also assist in the creation of mobile sitemaps and Code Search sitemaps.
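To make the log-based approach a bit more concrete, here is a minimal Python sketch of the general idea: read access log lines, keep the URLs that were served successfully, and write them out as a sitemaps.org XML file. This is not Google's code. The log format (Common Log Format), the function names, and the example.com base URL are all my own assumptions for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Assumes Common Log Format: captures the requested path and the status code.
LOG_PATTERN = re.compile(r'"(?:GET|HEAD) (\S+) HTTP/[\d.]+" (\d{3})')


def urls_from_log(lines, base_url):
    """Collect the unique URLs that were served with a 200 status."""
    seen = []
    for line in lines:
        match = LOG_PATTERN.search(line)
        if match and match.group(2) == "200":
            url = base_url.rstrip("/") + match.group(1)
            if url not in seen:
                seen.append(url)
    return seen


def build_sitemap(urls):
    """Serialize a list of URLs as a sitemaps.org <urlset> document."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    sample_log = [
        '127.0.0.1 - - [10/Oct/2009:13:55:36 -0700] "GET /index.html HTTP/1.1" 200 2326',
        '127.0.0.1 - - [10/Oct/2009:13:56:01 -0700] "GET /missing.html HTTP/1.1" 404 512',
    ]
    print(build_sitemap(urls_from_log(sample_log, "http://example.com/")))
```

Skipping anything that did not return a 200 keeps broken links out of the sitemap, which is one advantage a log-driven tool has over hand-built files.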

The server plugin can also calculate relevant metadata (such as lastmod and changefreq) and automatically ping Google’s Blog Search (and other search engines that support the sitemaps.org standard) whenever you add or modify a URI. According to John Mueller of Google, combining these methods lets Google’s Sitemap Generator find these URLs and calculate their metadata very quickly, thereby making your Sitemap files as effective as possible.
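If you are wondering what "calculating" lastmod and changefreq might look like, here is a rough Python sketch. It assumes you already have a list of timestamps at which a page was observed to change. The thresholds and the fallback value are hypothetical choices of mine, not how Google's tool actually does it; only the changefreq names themselves come from the sitemaps.org protocol.

```python
from datetime import datetime

# Upper bounds (in days) on the average gap between changes.
# These cut-offs are illustrative guesses, not taken from Google's tool.
THRESHOLDS = [(1 / 24, "hourly"), (1, "daily"), (7, "weekly"), (31, "monthly")]


def lastmod(timestamps):
    """Return the most recent change as a YYYY-MM-DD date string."""
    return max(timestamps).strftime("%Y-%m-%d")


def changefreq(timestamps):
    """Guess a changefreq value from the average gap between changes."""
    if len(timestamps) < 2:
        return "yearly"  # hypothetical fallback when there is no history
    ordered = sorted(timestamps)
    span_seconds = (ordered[-1] - ordered[0]).total_seconds()
    avg_gap_days = span_seconds / (len(ordered) - 1) / 86400
    for limit_days, label in THRESHOLDS:
        if avg_gap_days <= limit_days:
            return label
    return "yearly"


if __name__ == "__main__":
    changes = [datetime(2009, 1, 1), datetime(2009, 1, 8), datetime(2009, 1, 15)]
    print(lastmod(changes), changefreq(changes))
```

Search engines treat changefreq as a hint rather than a command, so a rough estimate like this is usually good enough.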

So, what’s the catch?

For starters, you need root access to the Web server in order to install it. If you have a shared or reseller hosting account, this isn’t the product for you. Instead, I recommend creating your XML sitemap files with a program like Gsitemap (which requires the .NET framework to be installed on your Windows PC) or a Web-based application (such as this handy tool from AuditMyPC.com), or making them by hand. That being said, I do need to issue one word of caution before going any further. If you choose to participate in the beta testing, respect your users’ privacy. Google makes this point abundantly clear on the project home page:

PRIVACY WARNING: Any Sitemap information that you send to Google (including Sitemaps created using the Sitemap Generator) should be consistent with commitments you make to your users in your site’s privacy policy. If your site contains or generates URLs that contain user information, you must filter the user information out of the data that you send to Google. Instructions for filtering such information can be found at the Sitemap Generator configuration instructions here. In addition, you must add language to your privacy policy substantially similar to “This site uses a tool which collects your requests for pages and passes elements of them to search engines to assist them in indexing this site. We control the configuration of the tool and are responsible for any information sent to the search engines.”
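Google's own filtering is set up through the Sitemap Generator configuration, so follow their instructions for the real thing. Just to illustrate what "filtering user information out" means in practice, here is a small Python sketch that strips query parameters that might identify a user before a URL goes into a sitemap. The parameter names in SENSITIVE_PARAMS are hypothetical; you would need to list whatever your own site actually uses.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

# Hypothetical parameter names; replace with the ones your site really uses.
SENSITIVE_PARAMS = {"sessionid", "user", "email", "token"}


def strip_user_info(url):
    """Return the URL without query parameters that may carry user data."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SENSITIVE_PARAMS
    ]
    # Rebuild the URL with only the safe parameters and no fragment.
    return parts._replace(query=urlencode(kept), fragment="").geturl()


if __name__ == "__main__":
    print(strip_user_info("http://example.com/page?id=7&sessionid=abc123"))
```

A blocklist like this only catches the parameters you remember to name, so it is worth reviewing a generated sitemap by eye before submitting it anywhere.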

Where can I get the Google Sitemap Generator, and how do I use it?

You can download the Google Sitemap Generator directly from Google Code. Make sure you read the installation instructions and configure the sitemap generator before using it. You’ll need either Apache (1.3, 2.0, or 2.2) or IIS (version 6 or 7), between 100MB and 1GB, and administrative access to the Web server in order to install the software (Windows server administrators will need to use a console or remote desktop connection).

One Response to “Introducing Google’s New Sitemap Generator”

  1. Nice post, Dan, and good recommendation. I was thinking about using something similar a few months back but was concerned about the many files on my server that aren’t up-to-date or finished… I’d be doing a lot of filtering, but it would be worth it.

