active.com is an aggregator of data from disparate sites within the Active Network. Because the content comes from many distributed sources, and because we have so much of it, creating a sitemap.xml file for SEO purposes has become challenging. We're working on a new search solution (which you can access in beta at http://labs.active.com/search), and the engine behind it can generate a sitemap.xml file.
But there's only one problem: the file it generates is huge.
According to https://www.google.com/webmasters/tools/docs/en/protocol.html, a sitemap file can be no larger than 10MB and can contain no more than 50,000 URLs (whichever limit is reached first). The sitemap for active.com has over 220,000 URLs and is about 34MB in size.
We needed an application that would split the sitemap.xml file according to the constraints above. I searched all over the internet for something that would do this, and found only commercial applications that wanted to crawl my site before generating and splitting the sitemap.xml files.
So, my team developed a simple, .NET-based application that splits a large sitemap.xml file into smaller ones, and also creates the sitemap index file which references them.
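For readers who want to see the general idea, here is a minimal Python sketch of the same kind of split. It is not the code of our .NET tool, only an illustration. The function names `split_sitemap` and `build_index` are made up for this example, the input is assumed to use the sitemaps.org 0.9 namespace, and the URL and size limits are passed in as parameters rather than hard-coded.

```python
import xml.etree.ElementTree as ET

# Assumption: the input sitemap uses the sitemaps.org 0.9 namespace.
NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
ET.register_namespace("", NS)  # serialize without an "ns0:" prefix


def _serialize(root):
    """Serialize an element to UTF-8 bytes with an XML declaration."""
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def split_sitemap(xml_text, max_urls, max_bytes):
    """Split one sitemap (str or bytes) into a list of smaller sitemaps.

    Each returned item is a complete sitemap document as bytes, holding at
    most max_urls entries and staying within max_bytes, provided that a
    single entry plus the wrapper fits within max_bytes on its own.
    """
    # Size of the declaration plus an empty <urlset></urlset> wrapper.
    overhead = len(ET.tostring(ET.Element(f"{{{NS}}}urlset"), encoding="utf-8",
                               xml_declaration=True, short_empty_elements=False))
    parts = []
    current, size = ET.Element(f"{{{NS}}}urlset"), overhead
    for url in ET.fromstring(xml_text).findall(f"{{{NS}}}url"):
        # Serializing an entry on its own repeats the xmlns attribute, so
        # this slightly over-estimates its size, which errs on the safe side.
        entry_size = len(ET.tostring(url, encoding="utf-8"))
        full = len(current) >= max_urls or size + entry_size > max_bytes
        if len(current) > 0 and full:
            parts.append(_serialize(current))
            current, size = ET.Element(f"{{{NS}}}urlset"), overhead
        current.append(url)
        size += entry_size
    if len(current) > 0:
        parts.append(_serialize(current))
    return parts


def build_index(part_urls):
    """Build a sitemap index (bytes) that references each part by its URL."""
    index = ET.Element(f"{{{NS}}}sitemapindex")
    for part_url in part_urls:
        entry = ET.SubElement(index, f"{{{NS}}}sitemap")
        ET.SubElement(entry, f"{{{NS}}}loc").text = part_url
    return _serialize(index)
```

To use it, you would read the large sitemap.xml, call `split_sitemap` with the limits quoted above, write each returned part to its own file, and then pass the public URLs of those files to `build_index` to produce the index file. Because the whole document is parsed into memory, this sketch suits files of tens of megabytes; a much larger file would call for a streaming parser instead.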
Because I'm feeling philanthropic, I decided to give you access to this tool, free of charge. Download it here.
- Google+Sitemap+Split+Installer.zip (147.0 K)