Saturday, March 12, 2016

Don’t end your urls with .exe

✔ Youtuber 2020 SEO Tips 11:21 PM

Sometimes at a conference people will ask me “Does it matter what extension I use for my pages? Does Google prefer .php over .asp, or .html over .htm?” And my answer is “We’re happy to crawl all of these file extensions. It doesn’t matter what you choose between any of those.”

Usually I also try to insert a reminder at the end of my reply such as “But there are some file extensions that are mostly binary data, such as .exe, where the vast majority of the time the data would be meaningless blobs, so there are a few extensions to avoid. If your files are named example.dll or example.bin and you don’t see Google crawling pages with that file extension, I’d recommend changing your file extension to something else.”

There’s a simple way to check whether Google will crawl things with a certain filetype extension. If you do a query such as [filetype:exe] and you don’t see any urls that end directly in “.exe” then that means either 1) there are no such files on the web, which we know isn’t true for .exe, or 2) Google chooses not to crawl such pages at this time — usually because pages with that file extension have been unusually useless in the past. So for example, if you query for [filetype:tgz] or [filetype:tar], you’ll see urls such as “papers.ssrn.com/pape.tar?abstract_id” that contain “.tar” but no files that end directly in .tar. That means that you probably shouldn’t make your html pages end in .tar.
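The check described above can be sketched in a few lines of Python. This is only an illustration, assuming you have copied a handful of result urls out of the search results by hand; the function names are hypothetical and nothing here calls Google.

```python
def filetype_query(ext):
    """Build the query text for the filetype: operator, e.g. 'filetype:exe'."""
    return "filetype:" + ext.lstrip(".").lower()


def ends_directly_in(url, ext):
    """True only if the whole url string ends in '.<ext>'.

    A url that merely contains the extension, followed by a query string
    or anything else, does not count.
    """
    suffix = "." + ext.lstrip(".").lower()
    return url.lower().endswith(suffix)


def urls_ending_in(urls, ext):
    """Keep the result urls that end directly in the extension."""
    return [u for u in urls if ends_directly_in(u, ext)]


# The first url is the one quoted in the post; the second is a made-up example.
results = [
    "papers.ssrn.com/pape.tar?abstract_id",
    "http://example.com/archive.tar",
]
print(filetype_query("tar"))
print(urls_ending_in(results, "tar"))
```

If the filtered list comes back empty for real search results, that is the signal the post describes: urls with that extension are probably not being crawled.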

The SEOmoz folks stumbled across this when they had a url that ended with “/web2.0”. It looks like they previously had a url that looked like “/web2.0/” (note the trailing slash), which we were happy to crawl/index/rank. But when their linkage shifted enough that “/web2.0” became their preferred url, Google wouldn’t crawl urls ending in “.0”, so the page was no longer crawled.
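To see why the trailing slash matters, here is a small Python sketch of how a url's apparent extension can be read from its path. It is a guess at the general idea, not Google's actual logic; the set of extensions is just the ones named in this post, and example.com is a placeholder domain.

```python
import posixpath
from urllib.parse import urlsplit

# Illustrative only: extensions this post mentions as mostly binary.
# This is not Google's real list.
EXTENSIONS_TO_AVOID = {".exe", ".dll", ".bin", ".tar", ".tgz"}


def apparent_extension(url):
    """Return the extension of the last path segment, or '' if there is none."""
    path = urlsplit(url).path
    return posixpath.splitext(path)[1].lower()


def has_extension_to_avoid(url):
    """True if the url's path ends in one of the extensions listed above."""
    return apparent_extension(url) in EXTENSIONS_TO_AVOID


print(apparent_extension("http://example.com/web2.0"))   # looks like a ".0" file
print(apparent_extension("http://example.com/web2.0/"))  # trailing slash: no extension
print(has_extension_to_avoid("http://example.com/example.dll"))
```

With the trailing slash the last path segment is empty, so there is no extension to trip over; without it, the url looks like a file with a “.0” extension.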

Even though urls ending in “.0” are often binary and therefore end up getting dropped later in our indexing pipeline, it’s always good to revisit old decisions and respond to feedback by running new tests. So just in the last day or so, we switched it so that Google is willing to crawl pages that end in “.0”. This will help the small number of pages out on the web that want to serve up HTML pages with a “.0” extension.

You can see the results trickling into Google with a bunch of “X hours ago” fresh results.

So my quick takeaways would be:

– Why Google doesn’t crawl some filetype extensions (when we’ve seen good evidence that the extensions are mostly binary or otherwise not-very-indexable files).
– An easy way to use the filetype: operator, so that you can decide whether to avoid a particular filename extension yourself.
– Google is willing to revisit old decisions and test them again, which is what we’re doing with the “.0” filetype extension.

I hope that helps a few people who are considering unusual filetype extensions of their own.
