Duplicate content is one of the problems we regularly come across in the search engine optimization services we offer. If the search engines determine that your site contains duplicate or near-duplicate content, this may result in penalties and even exclusion from their indexes. Fortunately, it's a problem that is easily rectified.
Your primary weapon against duplicate content can be found within "The Robots Exclusion Protocol", which has now been adopted by all the major search engines.
There are two ways to control how the search engine spiders index your site:
1. The Robots Exclusion File, or "robots.txt", and
2. The Robots <meta> Tag
The Robots Exclusion File (robots.txt)
This is a simple text file that can be created in any plain text editor, such as Notepad. Once created, you must upload the file into the root directory of your website, e.g. www.yourwebsite.com/robots.txt. Before a search engine spider crawls your website, it looks for this file, which tells it which parts of your site's content it may and may not access.
The robots.txt file is best suited to static HTML sites, or to excluding certain files in dynamic sites. If the majority of your site is dynamically created, then consider using the Robots Tag.
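Because spiders look for the file only at the root of the site, the robots.txt address can be worked out from the URL of any page on that site. A minimal Python sketch, assuming the placeholder domain used above; the helper name robots_url and the page path are made up for illustration:

```python
from urllib.parse import urljoin

def robots_url(page_url):
    """Return the robots.txt address for the site that serves page_url."""
    # The leading slash makes urljoin drop the page's own path and keep
    # only the scheme and host, which is where spiders expect the file.
    return urljoin(page_url, "/robots.txt")

print(robots_url("http://www.yourwebsite.com/faq/index.html"))
# prints http://www.yourwebsite.com/robots.txt
```

A robots.txt file placed in a subdirectory would not be found this way, which is why it has to be uploaded to the root.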
Creating your robots.txt file
Example 1 Scenario
If you wanted to make the robots.txt file applicable to all search engine spiders and make the entire site available for indexing, the file would look like this:
User-agent: *
Disallow:
Explanation
The asterisk after "User-agent" means this robots.txt file applies to all search engine spiders. Because the "Disallow" line is left blank, nothing is excluded and all parts of the site are available for indexing.
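One way to confirm that a file like this really allows everything is to run it through a parser before uploading it. The sketch below uses Python's standard urllib.robotparser module; the helper name is_allowed, the spider name "ExampleBot" and the placeholder domain are assumptions for illustration only:

```python
from urllib.robotparser import RobotFileParser

# The Example 1 file: every spider, nothing disallowed.
ALLOW_ALL = """\
User-agent: *
Disallow:
"""

def is_allowed(robots_txt, agent, url):
    """Report whether the given robots.txt text lets agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)

print(is_allowed(ALLOW_ALL, "ExampleBot", "http://www.yourwebsite.com/faqs.html"))
# prints True
```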
Example 2 Scenario
If you wanted to make the robots.txt file applicable to all search engine spiders and stop the spiders from indexing the faq, cgi-bin and images directories and a specific page called faqs.html contained within the root directory, the file would look like this:
User-agent: *
Disallow: /faq/
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /faqs.html
Explanation
The asterisk after "User-agent" means this robots.txt file applies to all search engine spiders. Access to the directories is prevented by naming them, and the specific page is referenced directly. The named files and directories will not be crawled, and so their content will not be indexed, by any search engine spider that honors the protocol.
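The effect of these four "Disallow" lines can be checked in the same way with Python's standard urllib.robotparser module. This is only a sketch: the helper name blocked, the spider name "ExampleBot" and the individual page and image paths are hypothetical, chosen to exercise each rule:

```python
from urllib.robotparser import RobotFileParser

# The Example 2 file: applies to every spider.
RULES = """\
User-agent: *
Disallow: /faq/
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /faqs.html
"""

SITE = "http://www.yourwebsite.com"

rules_parser = RobotFileParser()
rules_parser.parse(RULES.splitlines())

def blocked(path, agent="ExampleBot"):
    """Return True when the rules stop agent from fetching path."""
    return not rules_parser.can_fetch(agent, SITE + path)

for path in ("/faq/index.html", "/images/logo.gif", "/faqs.html", "/index.html"):
    print(path, "blocked" if blocked(path) else "allowed")
```

Only the last path, which matches none of the rules, is reported as allowed.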
Example 3 Scenario
If you wanted to make the robots.txt file applicable only to the Google spider, Googlebot, and stop it from indexing the faq, cgi-bin and images directories and a specific HTML page called faqs.html contained within the root directory, the file would look like this:
User-agent: googlebot
Disallow: /faq/
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /faqs.html
Explanation
By naming a particular search spider after "User-agent", you prevent only that spider from indexing the content you specify. Access to the directories is prevented by simply naming them, and the specific page is referenced directly. The named files and directories will not be crawled by Google, while other spiders are unaffected by these rules.
That's all there is to it!
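The Googlebot-only rules from Example 3 can be checked with Python's standard urllib.robotparser module to see that they apply to one spider and no other. As before, this is a sketch: "ExampleBot" stands in for any other spider, and the helper name and page path are made up for illustration:

```python
from urllib.robotparser import RobotFileParser

# The Example 3 file: rules addressed to Googlebot only.
GOOGLE_ONLY = """\
User-agent: googlebot
Disallow: /faq/
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /faqs.html
"""

google_parser = RobotFileParser()
google_parser.parse(GOOGLE_ONLY.splitlines())

def may_fetch(agent, path):
    """Return True when agent is permitted to fetch path on the site."""
    return google_parser.can_fetch(agent, "http://www.yourwebsite.com" + path)

print(may_fetch("Googlebot", "/faq/index.html"))   # prints False
print(may_fetch("ExampleBot", "/faq/index.html"))  # prints True
```

The second result is True because the file contains no rules for any spider other than Googlebot.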
As mentioned earlier, the robots.txt file can be difficult to implement on dynamic sites. In that case it's probably necessary to use a combination of the robots.txt file and the robots tag.
The Robots Tag
This alternative way of telling the search engines what to do with site content appears in the <head> section of a web page. A simple example would be as follows:
<meta name="robots" content="noindex, nofollow">
In this example we are telling all search engines not to index the page or to follow any of the links contained within the page.
In this second example I don't want Google to cache the page, because the site contains time-sensitive information. This can be achieved simply by adding the "noarchive" directive:
<meta name="googlebot" content="noarchive">
What could be simpler?
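A robots tag conventionally takes the form of a <meta> element whose name is "robots" (or a specific spider such as "googlebot") and whose content lists the directives. To check which directives a page actually carries, the page can be parsed with Python's standard html.parser module. The sketch below is one possible approach; the class name, the function name and the sample page are all made up for illustration:

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collect the directives of robots-style <meta> tags, keyed by name."""

    def __init__(self, names=("robots", "googlebot")):
        super().__init__()
        self.names = names
        self.directives = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        if name in self.names:
            content = attributes.get("content") or ""
            # Directives are comma separated, e.g. "noindex, nofollow".
            self.directives[name] = [
                part.strip().lower() for part in content.split(",") if part.strip()
            ]

def robots_directives(html):
    """Return a dict mapping each robots meta name to its directive list."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives

# Hypothetical page carrying both kinds of tag discussed above.
PAGE = """<html><head>
<title>Example page</title>
<meta name="robots" content="noindex, nofollow">
<meta name="googlebot" content="noarchive">
</head><body></body></html>"""

print(robots_directives(PAGE))
# prints {'robots': ['noindex', 'nofollow'], 'googlebot': ['noarchive']}
```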
Although there are other ways of preventing duplicate content from appearing in the search engines, this is the simplest to implement, and all websites should use a robots.txt file, the robots tag, or a combination of the two.
Should you require further information about our search engine marketing or optimization services, please visit us at http://www.e-prominence.co.uk, the search marketing company.