The link rel \u003d “canonical” attribute is another effective way to deal with duplicates and more. What is rel canonical? How to set up canonical URLs rel \u003d "canonical"

Allan Scott (Allan Scott), engineer software The Google Indexing Team, listed on the Webmaster Central blog the five most common mistakes webmasters make when using the rel \u003d "canonical" attribute, and gave some important tips for using the tag.

First of all, a search spokesperson reminded the industry that the canonical page attribute clearly tells crawlers which instance from a set of pages with similar content is being referenced by duplicates. At the same time, additional properties of the address (for example, PageRank) and related signals (the quality of the incoming link mass) are also transferred from the duplicated pages to the canonical one. The rel \u003d "canonical" attribute is currently supported by all major Western search engines: Yahoo !, Bing, and Google.

However, the use of the rel \u003d "canonical" attribute often causes certain difficulties for webmasters. In turn, errors related to the indication of the canonical page can also affect the display of resource pages in search results.

To avoid such mistakes, Google experts recommend adhering to the following general rules when setting the rel \u003d "canonical" attribute:

  • Most of the duplicated pages should contain links to the canonical address.
  • It is important to ensure that the page referenced with the rel \u003d "canonical" attribute exists and that the URL is correct (check if the server returns a 404 error).
  • Make sure that canon page not closed for indexing by search robots.
  • It is important to clearly understand which page the webmaster wants to see in the search results, it is this page that should be indicated as canonical (for example, if the site contains a set of pages with the same product model, differing in color, it is advisable to specify the page with the most popular color).
  • Remember to include the rel \u003d "canonical" attribute in the HTML section of your document.
  • It is important to avoid using the rel \u003d "canonical" attribute more than once per page. Otherwise, the search engine will simply ignore the attribute specification.

Mistake 1. The rel \u003d "canonical" attribute was used for the first page of the pagination series:

Imagine an article on your site has several pages:

  • example.com/article?story\u003dcupcake-news&page\u003d1
  • example.com/article?story\u003dcupcake-news&page\u003d2
  • etc.

In this case, pages 2 and 3 are not duplicates, which means that using the rel \u003d "canonical" attribute to indicate the canonical first page of the document would be an error. This error may cause pages 2 and 3 to be dropped from the index.

It is also important to use the rel \u003d "next" and rel \u003d "prev" HTML attributes when paginating a document to indicate the relationship between individual URLs.

Of course, rel \u003d "canonical" can be used for both absolute and relative links, however Google recommends using absolute links to minimize possible mistakes... If a basic link is specified in the document, then all relative links will be calculated based on it.

However, in cases where absolute links to a canonical page are mistakenly written as relative ( instead http://example.com/example.com/cupcake.html), the algorithms can ignore the fact that this page was listed as canonical.

Mistake 3. If several pages from a set with similar content are assigned canonical, or the rel \u003d "canonical" attribute is used by mistake:

Google experts often observe the following situation: a webmaster copies a page template, forgetting to change the value of the rel \u003d "canonical" attribute

If you use templates, remember to check if the rel \u003d "canonical" attribute was accidentally copied.

Another mistake webmasters make is when several pages from a set with similar content are assigned canonical. This is often due to the use of various plugins on the page. Plugin code is mistakenly embedded in the rel \u003d "canonical" attribute.

It is important to understand that in both cases google algorithms recognize the use of the rel \u003d "canonical" attribute as invalid, and disregard its use in indexing.

Mistake 4. One of the categories of the landing page links with the rel \u003d "canonical" attribute to the featured article:

With this approach, only the page with the article will be included in the index, while the category page itself will not be indexed.

Mistake 5. The rel \u003d "canonical" attribute is used in the document section :

As mentioned above, the rel \u003d "canonical" attribute must be included in the document section of the HTML code, and it must not be included in the document section. ... Otherwise, Google's algorithms will not take into account the use of this tag, especially if it will be displayed in plain text, or tags that we usually present in the section .

Additional Informationregarding the application of the rel \u003d "canonical" attribute is available on the form

In this article, we will analyze how and for what to use the rel \u003d "canonical" attribute, as well as on specific examples we will describe when it is better to use it.

What is rel canonical and what is it for?

The rel \u003d "canonical" attribute is one way to deal with duplicate content. It is placed on any HTML page between tags . Search robots begin to consider the page specified in the rel \u003d "canonical" attribute as priority (canonical). The canonical page will be displayed in the search, link weight and other characteristics of pages with the same content will go to it.

Thus, if your site has identical or very similar content available at different URLs, using the rel \u003d “canonical” attribute, you can specify the URL that is preferred for indexing.

When to use canonical links

1. To prevent the appearance of various duplicates. For instance:

  • sort pages: / * sort, asc, desc, list \u003d *;
  • duplicates due to UTM tags: * utm_source \u003d, / * utm_campaign \u003d, / * utm_content \u003d, / * utm_term \u003d, / * utm_medium \u003d;
  • other pages with GET parameters in the URL;
  • duplicates as a result of the peculiarities of the CMS (engine).

In this case, you need to add the rel \u003d “canonical” attribute to all static pages of the site. For example, for the page https://site.ru/category-1/page-2, rel \u003d “canonical” will look like this:

href \u003d “https://site.ru/category-1/page-2” /\u003e

2. For pages with very similar content available at different URLs.

For example, it can be pages of one series of a product that differs only in color or pages of a product that is located in several categories at once.

In this case, you need to point from all pages rel \u003d “canonical” to the main, priority page.

In this case, on each of the pagination pages, you need to specify the canonical "Show all" page.

For example, for the page https://site.ru/category-1/page-2, you need to write the canonical URL:

ru / category-1 / show-all ”/\u003e

How do I specify the main URL using the rel \u003d “canonical” attribute?

Prescribe between any HTML page tags

This is the main way. To specify a canonical link, place between tags on the page, the full URL of the page to be indexed.

For example, for the page https://site.ru/*utm_content\u003d the canonical will be https://site.ru/.

To get such a result, on the page https://site.ru/*utm_content\u003d we specified the tag:

ru /” />

Important!
To reduce the likelihood of error in link elements, use absolute links after the rel \u003d "canonical" attribute, not relative links.

Sitemap

In an XML sitemap, you can write the canonical (main) URL for any page.

Important!
The rel \u003d “canonical” attribute is a search engine recommendation, not a rule. In this case, the PS can ignore them.

In the HTTP header

Best used for non-HTML documents. For example, for PDF files.

In this case, the server, when requesting a duplicate file, must give a link to the original file:

Link: ; rel \u003d "canonical"

Important!
This method is suitable if you have access to the server settings. Not recommended for HTML documents.

Using a plugin

There are various plugins for the CMS that allow you to customize the canonical URL. For instance:
- canonical can be customized for WordPress using Yoast SEO;
- in OpenCart - implemented in the CMS settings (you need to go to the product settings and set the SEO URL parameter);
- to configure the canonical attribute in Joomla (version 3.x and higher), you need to enable the SEF function in the CMS settings. Once enabled, the rel \u003d “canonical” attribute will be added for technical pages of the /index.php?option type (indicating the URL to the page with the configured CNC).

How to check if rel \u003d “canonical” is configured correctly?

You can analyze special program for SEO site analysis -.

With this program, you will see:
- what pages on the site without rel \u003d “canonical” attribute;
- which pages have the rel \u003d "canonical" attribute, and which pages are canonical for them;

The main mistakes of using rel \u003d "canonical"

- Canonical URL gives 404 error.
- The specified canonical URL is on a different domain or subdomain.
- The canonical link is not indexable.
- Using rel \u003d “canonical” from pagination pages to the first page.

For all pagination pages it is wrong to write the canonical first page. This makes indexing of all pagination pages impossible.

For pagination pages, the same pages must be specified as canonical pages.

For example, the page https://site.ru/category-1/page-2 should contain a canonical link:

.

If your site has identical or very similar content available at different URLs, then the new format will allow you to specify the URL that should be returned to search engine... You can also be sure that all characteristics such as link weight, etc. will be transferred to the version you want addresses.

Now you can add this tag, to indicate your version of the address, inside the tag on pages with duplicate content:

In this way, Google will understandthat all duplicates refer to the canonical address specified in the tag. Additional address properties such as PageRank and related signals will also carry over from the duplicated pages to the specified one.

Such a tag will be useful mainly when using various engines (phpBB, IPB, WordPress, etc., for example, ipbskins.ru - website design development on IPB, you have to use a long robots.txt in order to avoid duplicate content), which create lots of similar pages, for example, these can be pages:

printed version of the article:
http://site.ru/article01.html?print\u003dtrue
text version of articles for mob. phones:
http://site.ru/lofiversion/article01.html
duplicated due to lack of engine:
http://site.ru/articles/?id\u003d1&category\u003dnew
http://site.ru/articles/?id\u003d1&tag\u003dkeyword
and a number of others ...

This standard can be adapted by any search engine when indexing a site.

For the popular blogging engine WordPress, a canonical plugin has already been developed that inserts a tag on required pages... Other popular engines for blogs, forums, online stores, etc. will also expand their functionality in the near future (stay tuned).

Answers to some popular questions about the tag:

Is rel \u003d “canonical” a hint or directive?
This is a hint that we take into account and, in interaction with other signals, calculate the most relevant page to display in search results.

Can I use a relative path to indicate canonical, like this: ?
Yes, relative paths are recognized in the same way as in a regular tag ... Even if you enter the tag with a link to a document, then relative paths will be considered in accordance with the base URL.

Is it ok if the canonical URLs contain incompletely duplicated content?
We allow minor differences such as the sort order in the product table. We also understand that canonical addresses can be parsed by the robot at different times, so this is all normal.

What if rel \u003d “canonical” returns a 404 error?
We will continue to index your content and use a heuristic approach to determine the canonical URL, however, we recommend that you use existing URLs as canonical.

What if rel \u003d “canonical” hasn't been indexed yet?
We try to reach the canonical URL quickly. As soon as we index it, then we immediately re-examine the rel \u003d "canonical" hint.

Can a canonical url contain a redirect?
Yes, you can specify a redirect, in this case search engine will process the redirect process as usual and try to index the new address.

What if I have conflicting signals for rel \u003d “canonical”?
Our algorithms are soft: we can follow canonical chains, however, we strongly recommend that you specify a single canonical address on the pages to ensure the optimal canonicalization result.

Could this link tag suggest a canonical address on a completely different domain?
No. To migrate to another domain is more suitable. Google currently supports canonicalization within subdomains or within a single domain. This way, site owners can specify www.example.com instead example.com or help.example.comhowever cannot indicate example.com instead example-widgets.com.

Sounds interesting, but can I see an example?
Yes, wikia.com helped us as a trust tester. For example, you will notice that source at http://starwars.wikia.com/wiki/Nelvana_Limited contains rel \u003d canonical http://starwars.wikia.com/wiki/Nelvana.

The two URLs are nearly identical, except that Nelvana_Limited, the first URL, contains a short message near the header. it good example use of the tag in the future. With rel \u003d canonical, the properties of the two URLs are merged and the search results display the correct version.

If you have any questions about using the new tag, you can ask them in the comments on the official Google Webmaster Blog.

1. In addition to getting rid of natural duplicate content (due to a lack of an engine), we also get rid of artificial duplicate content when competitors are trying to annoy us by adding to pages with arbitrary parameters in the URL.

2. Now there is no need to use robots.txt to prohibit indexing of pages such as the “print version” and other duplicates (for example, in WordPress, you had to close the / teg / path) and please each search engine separately (there is general standards for robots.txt, but there are also a number of features of how each search engine works with this file, so we previously could not foresee the prohibition of indexing some pages for all bots at once).

3. We now have good tool to speed up the indexing of the site 🙂

Quite often, you can see on different sites that visitors come to the same content from different addresses (URL). The reason for this phenomenon is duplication of content on the site. The right way - it happens when used different systems content management (cms) on the site. In order to avoid the problem, by Google back in 2009 it was suggested to use the tag rel \u003d ”canonical”, for a page with a specific url, which will participate in search engine results. A little later, all search engines supported the idea.

Rel \u003d "canonical" attribute

How to use the rel \u003d "canonical" attribute.

Let's say you found a page to which visitors come from different urls and want to solve the problem using an attribute. To do this, select the main url, for example: https: // site / kak-samomu-raskrutit / page-one-1 / and now to inform the search engine about this, you need to:

  • register attribute rel \u003d ”canonical” for the main page and add a tag to the page in body , here's an example:

The search engine will highlight this address as the main one and it will be used in search results. In order to avoid problems, include absolute links, not relative ones.

You need to use:

https: // site / kak-samomu-raskrutit / page-one-1 /

Do not use:

/ kak-samomu-raskrutit / page-one-1 /

Now it may appear before us main question, and whether characteristics such as link weight, pr are transferred to the canonical page? I can say for sure, all page characteristics, such as link weight, etc., are transmitted, tested in practice.

To make life easier for webmasters and SEOs, I recommend using plugins and modules for CMS that will track links with duplicate content and automatically prescribe canonical attributes. For WordPress, I can recommend the plugin, it does the job perfectly. It is enough to tick the “Canonical URLs” box and canonical pages will be generated automatically.


You just have to check correctly and correct the plugin robot.

Examples where to userel \u003d "canonical".

1. Server shows same content for https protocol and www subdomain, example:

http://lonbo.com/page-one
https://loknbol.com/page-one
http://www.lonbo.com/page-one

So, for this case, you can use.

2. For sites that use engineswhen saving content to different sections (categories).

https: // site / category-1 / page-one1 /
https: // site / category-2 / page-one1 /

3. Dynamic URLs... As a rule, it is typical for products of online stores that are created in different sessions or for different search queries... Example:

https: // site / products? category \u003d shapka & color \u003d gray
https: // site / head / gray? gclid \u003d ABCD
https: //site/shapka/grey/shapkaGrey.html

4. Distribution of site content (resource) to other sites, fully or partially.

General rules for using the rel \u003d "canonical" attribute.

  1. Do not use an attribute more than once for one page. The search engine can simply ignore his instructions.
  2. Remember to include rel \u003d ”canonical” in your HTML code section. Check especially when using plugins or modules.
  3. Make sure the canonical page is open for indexing, otherwise the use is useless.
  4. Pages

Email marketing is quite popular on the Internet today. This is especially true in the field of SEO news. Looking through one of the regular mailing lists dedicated to eliminating page duplicates, I noticed the following:

It seems like a trifle, but it makes you doubt. Based on these words, rel \u003d "canonical" tag, or rather an attribute, must be written on the duplicate page and indicate a link to itself!

How to correctly register and use rel canonical

Let's clarify this controversial issue. Why an attribute and not a tag? Because rel \u003d "canonical" is exactly the attribute (part) of the link, and not an independent tag. So here CORRECT use of rel \u003d "canonical" attribute: a canonical link is placed from the take page to the original page. It looks like this: on the duplicate page, which is located at http://yoursite.com/dubl, create an element of the following kind:

And for dessert - Matt Cutts' opinion on rel \u003d "canonical" and its applications:

Did you like the article? To share with friends: