The rel="canonical" tag for removing duplicate pages. Should you link pages on different domains with rel="canonical"? How to check that rel="canonical" is configured correctly

The canonical tag (rel="canonical") is an important search engine optimization tool. When dealing with duplicate content, it is often an even better fit than a 301 redirect.

Let's take a closer look at this tag.

What's wrong with duplicate content

Duplicate content means two or more documents with the same content. When Google sees a duplicate, it tries to exclude it from the index: in theory, users do not want to see the same document repeated in search results. Besides, the search engine does not want to spend its capacity constantly processing thousands or even millions of duplicate pages.

The problem for the webmaster is that if the same information is on different pages, only one of these pages will appear in the search results. But the URL that Google chooses is not always the best one for the user, and is not always the primary source.

Until the search engine identifies the original source, the search results will not meet the goals of the original author of the content. In this case, the canonical tag is of limited use, because those who steal your content will most likely not add the tag.

On the other hand, if the duplication occurs on your own site, the tag will come in handy. Even if your site contains links to duplicate content, only the original page will appear in the SERP. Thus, even if there are multiple duplicate links, Google will consider only one of these pages meaningful, and that page will not suffer.

Naturally, this is not the best solution in terms of SEO. But this will not lead to any sanctions from Google.

What is the rel="canonical" tag

The tag has the following syntax: <link rel="canonical" href="URL of the canonical page" />. This way Google and Bing will understand that all duplicates point to the canonical URL specified in the tag. Google is clear about how this tag should be used:

Yes, rel="canonical" should only be used to select the preferred page among duplicates (slight differences in content are allowed).

In other words, only use it to deal with duplicate content. If you use it for other purposes, search engines may treat it as spam.

Duplicate content issues

Implement the tag appropriately on your site. This will save you from duplicate content problems, some of which are caused by content management systems (CMS):

  1. Tracking codes. Some systems require you to add variables at the end of the URL for backlinks to your site. The format can be www.example.com?tracking-variable or www.example.com/example.htm?tracking-code. The problem is that search engines treat addresses as separate pages even if they differ by a single character. Although Google and Bing both have technologies to help identify such addresses, there are still many errors in how they are processed. Note also that some sites link to yours in this way in order to get a reciprocal link. The canonical tag will protect you from this.
  2. URL prefixes. As noted above, any two addresses that differ by at least one character are considered separate pages, and there are a few cases where URL prefixes lead to duplicate content. These include, for example, an additional language version of the site (Russian and English versions of a page) or extra pages created by the engine (especially noticeable in WordPress).
  3. Pagination. This is when content is automatically split across several pages that repeat the same content. For example, in an online store where products can be sorted by color or by price, each sort order produces a separate page with the same product descriptions.
  4. WWW. For the most part, this is not a problem: Google usually handles URLs with and without www correctly. But it still happens that the search engine indexes two versions of the site (example.com and www.example.com). As a result, half of your content is indexed with www and the other half without. Usually this case is handled in robots.txt, but the canonical tag can also help here.
  5. A 301 redirect cannot be implemented. Oddly enough, in some cases the publisher is unable to implement a 301 redirect, for example because of limited access to the server. The canonical tag is an alternative in this case, the only difference being that the original page continues to exist.

Still, according to Google policy, the canonical tag is a recommendation, not a mandatory rule. It lets site owners indicate for themselves which page Google should consider canonical, which makes it easier for Google to decide which page to include in the index when there is duplicate content.

The rel="canonical" attribute is one way to deal with duplicate content. It is placed on any HTML page between the <head> and </head> tags. Search robots then treat the page specified in the rel="canonical" attribute as the priority (canonical) page. The canonical page is the one shown in search, and the link weight and other characteristics of pages with the same content are passed to it.

Thus, if your site has identical or very similar content available at different URLs, you can use the rel="canonical" attribute to specify the URL that is preferred for indexing.

When to use canonical links

1. To prevent the appearance of various duplicates. For instance:

  • sorting pages: /*sort, asc, desc, list=*;
  • duplicates due to UTM tags: /*utm_source=, /*utm_campaign=, /*utm_content=, /*utm_term=, /*utm_medium=;
  • other pages with GET parameters in the URL;
  • duplicates that result from the peculiarities of the CMS (engine).

In this case, you need to add the rel="canonical" attribute to all static pages of the site. For example, for the page https://site.ru/category-1/page-2, rel="canonical" will look like this:

<link rel="canonical" href="https://site.ru/category-1/page-2" />
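As a rough illustration of the self-referencing tag, the sketch below builds the link element for a URL after dropping its query string. The helper name canonical_link_tag is hypothetical, and the sketch assumes that every GET parameter on the site is non-essential (tracking or sorting only); a site where a parameter selects the content would need a whitelist instead.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_link_tag(url: str) -> str:
    """Build a self-referencing canonical tag for a URL.

    Assumption: all GET parameters (utm_*, sort, ...) are non-essential,
    so the query string and fragment are simply dropped.
    """
    parts = urlsplit(url)
    clean = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    return f'<link rel="canonical" href="{escape(clean, quote=True)}" />'


print(canonical_link_tag("https://site.ru/category-1/page-2?utm_source=news&sort=asc"))
# <link rel="canonical" href="https://site.ru/category-1/page-2" />
```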

2. For pages with very similar content available at different URLs.

For example, it can be pages of one series of a product that differs only in color or pages of a product that is located in several categories at once.

In this case, rel="canonical" on all of these pages should point to the main, priority page.

3. For pagination pages, when the site has a "Show all" page.

In this case, each of the pagination pages should specify the "Show all" page as canonical.

For example, for the page https://site.ru/category-1/page-2, you need to write the canonical URL:

<link rel="canonical" href="https://site.ru/category-1/show-all" />

How do I specify the main URL using the rel="canonical" attribute?

Place it between the <head> tags of any HTML page

This is the main way. To specify a canonical link, place a link element with the full URL of the page to be indexed between the <head> and </head> tags of the page.

For example, for the page https://site.ru/*utm_content= the canonical will be https://site.ru/.

To get this result, we specified the following tag on the page https://site.ru/*utm_content=:

<link rel="canonical" href="https://site.ru/" />

Important!
To reduce the likelihood of errors, use absolute URLs, not relative ones, in the href of rel="canonical" link elements.

Sitemap

In an XML sitemap, you can write the canonical (main) URL for any page.

Important!
The rel="canonical" attribute is a recommendation to search engines, not a rule, so a search engine may ignore it.

In the HTTP header

Best used for non-HTML documents. For example, for PDF files.

In this case, the server, when requesting a duplicate file, must give a link to the original file:

Link: <URL of the original file>; rel="canonical"
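A minimal sketch of how such a header could be assembled on the server side. The function name canonical_link_header and the PDF address are hypothetical; how the header is actually attached to the response depends on the server or framework in use.

```python
def canonical_link_header(original_url: str) -> tuple[str, str]:
    """Return the (name, value) pair of an HTTP Link header
    that marks original_url as the canonical version."""
    return ("Link", f'<{original_url}>; rel="canonical"')


# Hypothetical address of the original PDF file.
name, value = canonical_link_header("https://site.ru/files/price-list.pdf")
print(f"{name}: {value}")
# Link: <https://site.ru/files/price-list.pdf>; rel="canonical"
```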

Important!
This method is suitable if you have access to the server settings. Not recommended for HTML documents.

Using a plugin

There are various CMS plugins that allow you to customize the canonical URL. For instance:
- in WordPress, canonical can be configured with Yoast SEO;
- in OpenCart, it is implemented in the CMS settings (go to the product settings and set the SEO URL parameter);
- in Joomla (version 3.x and higher), you need to enable the SEF function in the CMS settings. Once it is enabled, the rel="canonical" attribute will be added to technical pages of the /index.php?option type (pointing to the URL of the page with the configured human-readable address).

How to check if rel="canonical" is configured correctly?

You can check this with a dedicated program for SEO site analysis.

With such a program, you will see:
- which pages on the site have no rel="canonical" attribute;
- which pages have the rel="canonical" attribute, and which pages are canonical for them.
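The same two checks can be sketched in a few lines with Python's standard html.parser module. This is only an illustration, not the program the article refers to: the names CanonicalCollector and canonical_report are hypothetical, and the sketch works on HTML that has already been downloaded.

```python
from html.parser import HTMLParser


class CanonicalCollector(HTMLParser):
    """Collect the href of every <link rel="canonical"> element on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.hrefs.append(attr.get("href"))


def canonical_report(html_text: str) -> str:
    """Summarize a page: canonical missing, set exactly once, or set several times."""
    parser = CanonicalCollector()
    parser.feed(html_text)
    if not parser.hrefs:
        return "missing"
    if len(parser.hrefs) > 1:
        return "multiple: " + ", ".join(str(h) for h in parser.hrefs)
    return "ok: " + str(parser.hrefs[0])


page = '<html><head><link rel="canonical" href="https://site.ru/category-1/page-2" /></head></html>'
print(canonical_report(page))
# ok: https://site.ru/category-1/page-2
```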

The main mistakes in using rel="canonical"

- The canonical URL returns a 404 error.
- The specified canonical URL is on a different domain or subdomain.
- The canonical URL is not indexable.
- rel="canonical" points from pagination pages to the first page.

It is wrong to specify the first page as canonical for all pagination pages. This makes it impossible to index the other pagination pages.

Each pagination page should specify itself as the canonical page.

For example, the page https://site.ru/category-1/page-2 should contain the canonical link:

<link rel="canonical" href="https://site.ru/category-1/page-2" />

- Multiple rel="canonical" links on one page.

Each page must specify only one canonical URL; otherwise only the first URL will be taken into account.

- Different canonical URLs.

Specify the same canonical page in every place where the attribute is implemented (for example, in the XML sitemap and in rel="canonical" on the page itself).

Conclusion

The rel="canonical" attribute is a convenient and useful tool for search engine promotion. Used correctly, it makes indexing of the site more efficient and faster, which in turn can noticeably affect its ranking.

Natalia Bondarenko

SEO optimizer

I have been optimizing sites since 2009. I love complex cases that were too tough for specialists from other companies. I do very detailed audits.

I am writing instructional articles for the SiteClinic blog on SEO tools and analytics.

Favorite Quote - To be successful, you must truly love what you do.

How to specify the canonical page among identical or similar pages

If you have one page accessible from multiple URLs, or different pages with similar content (for example, versions for mobile devices and computers), Google will treat one URL as canonical and the rest as its copies. The canonical URL will be crawled much more often than its copies.

Tell us which URL is canonical. Otherwise, we will choose it ourselves or treat both addresses as equal, which may lead to undesirable consequences. Additional information is given in the section below that explains why you should choose a canonical URL.

Use the rel="canonical" attribute

Use the <link> tag in the <head> section of the page. It indicates that the page is a copy of another page.

Suppose you want to specify the page https://example.com/dresses/green-dresses, whose content is reproduced on other pages, as canonical. Follow these steps:

    Place a rel="canonical" link on all duplicate pages. Add a <link> element with the rel="canonical" attribute, pointing to the canonical page, to the <head> section of these pages: <link rel="canonical" href="https://example.com/dresses/green-dresses" />

    If the canonical page has a mobile version, add a rel="alternate" link pointing to the mobile version.

    Add hreflang attributes or any other redirects that are appropriate for the page.

Use absolute URLs such as https://www.example.com/dresses/green/greendress.html
Do not use relative paths such as /dresses/green/greendress.html.

Use the rel="canonical" HTTP header

If you have access to server settings, you can specify the canonical URL for non-HTML documents (such as PDF files) using the rel="canonical" attribute in HTTP headers (rather than using HTML tags).

For example, if a PDF file on your site is available at several different URLs, you can return a rel="canonical" HTTP header to tell Googlebot which of these URLs is canonical:

Link: <canonical URL of the PDF file>; rel="canonical"

This method is currently only supported for web searches.

In the rel="canonical" link element, specify paths as absolute, not relative.

Suppose the same page is available at several addresses:

  • https://example.com/home
  • https://home.example.com
  • https://www.example.com

Choose one of these addresses as canonical and use server-side 301 redirects to send traffic from the other URLs to it. This is one of the most reliable ways to make sure that users and search engines end up on the desired page. A 301 status code means that the requested page has permanently moved to a different address.
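The decision itself can be sketched as a small lookup. The choice of https://www.example.com as canonical, the DUPLICATES set, and the function name redirect_for are assumptions made for illustration; in practice the redirect is usually configured in the web server rather than written by hand.

```python
# Assumed choice of the canonical address among the three listed above.
CANONICAL_URL = "https://www.example.com"
DUPLICATES = {"https://example.com/home", "https://home.example.com"}


def redirect_for(url: str):
    """Return (status, location) for a duplicate URL, or None if no redirect is needed."""
    normalized = url.rstrip("/")
    if normalized in DUPLICATES:
        return 301, CANONICAL_URL
    return None


print(redirect_for("https://example.com/home"))   # (301, 'https://www.example.com')
print(redirect_for("https://www.example.com"))    # None
```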

If you have access to a web hosting service, try searching the reference materials on that service for documentation on setting up 301 redirects.

Quite often, visitors reach the same content on a site from different addresses (URLs). The reason is duplicated content on the site. As a rule, this happens when a content management system (CMS) is used. To avoid the problem, back in 2009 Google proposed the rel="canonical" tag, which marks the page with the specific URL that should appear in search results. A little later, other search engines supported the idea.

The rel="canonical" attribute

How to use the rel="canonical" attribute.

Let's say you found a page that visitors reach from different URLs and want to solve the problem with the attribute. To do this, select the main URL, for example https://site/kak-samomu-raskrutit/page-one-1/, and then, to inform the search engine about it, you need to:

  • set the rel="canonical" attribute for the main page by adding a tag to the <head> section of the page, for example: <link rel="canonical" href="https://site/kak-samomu-raskrutit/page-one-1/" />

The search engine will treat this address as the main one and use it in search results. To avoid problems, use absolute links, not relative ones.

You need to use:

https://site/kak-samomu-raskrutit/page-one-1/

Do not use:

/kak-samomu-raskrutit/page-one-1/

Now the main question may arise: are characteristics such as link weight and PR transferred to the canonical page? I can say for sure that all page characteristics, such as link weight, are transferred; this has been tested in practice.

To make life easier for webmasters and SEOs, I recommend using CMS plugins and modules that track links with duplicate content and add canonical attributes automatically. For WordPress, I can recommend a plugin that does the job perfectly: it is enough to tick the "Canonical URLs" box and canonical pages will be generated automatically.

You only need to check that the plugin works correctly and fix anything it gets wrong.

Examples of where to use rel="canonical".

1. The server shows the same content over the http and https protocols and on the www subdomain, for example:

http://lonbo.com/page-one
https://lonbo.com/page-one
http://www.lonbo.com/page-one

In this case, you can use rel="canonical".

2. Sites whose engine saves the same content in different sections (categories).

https://site/category-1/page-one1/
https://site/category-2/page-one1/

3. Dynamic URLs. As a rule, this is typical for online store products whose pages are created in different sessions or for different search queries. Example:

https://site/products?category=shapka&color=gray
https://site/head/gray?gclid=ABCD
https://site/shapka/grey/shapkaGrey.html

4. Distribution of the site's content to other sites, in full or in part.

General rules for using the rel="canonical" attribute.

  1. Do not use the attribute more than once on one page. The search engine may simply ignore it.
  2. Remember to place rel="canonical" in the <head> section of your HTML code. Check this especially carefully when using plugins or modules.
  3. Make sure the canonical page is open for indexing; otherwise the attribute is useless.
  4. Pages

Allan Scott, a software engineer on the Google Indexing Team, listed on the Webmaster Central blog the five most common mistakes webmasters make when using the rel="canonical" attribute, and gave some important tips for using the tag.

First of all, the Google representative reminded the industry that the canonical page attribute clearly tells search robots which instance, from a set of pages with similar content, the duplicates refer to. At the same time, additional properties of the address (for example, PageRank) and related signals (the quality of the incoming links) are also transferred from the duplicate pages to the canonical one. The rel="canonical" attribute is currently supported by all major Western search engines: Yahoo!, Bing, and Google.

However, using the rel="canonical" attribute often causes difficulties for webmasters. In turn, errors in specifying the canonical page can affect how the site's pages appear in search results.

To avoid such mistakes, Google experts recommend following these general rules when setting the rel="canonical" attribute:

  • Most of the duplicated pages should contain links to the canonical address.
  • It is important to make sure that the page referenced with the rel="canonical" attribute exists and that the URL is correct (check that the server does not return a 404 error).
  • You need to make sure that the canonical page is not closed to indexing by search robots.
  • It is important to understand clearly which page the webmaster wants to see in the search results; that is the page that should be specified as canonical (for example, if the site contains a set of pages for the same product model that differ only in color, it is advisable to specify the page with the most popular color).
  • Remember to place the rel="canonical" attribute in the <head> section of your HTML document.
  • It is important to avoid using the rel="canonical" attribute more than once per page. Otherwise, the search engine will simply ignore the attribute.

Mistake 1. The rel="canonical" attribute points to the first page of a pagination series:

Imagine that an article on your site spans several pages:

  • example.com/article?story=cupcake-news&page=1
  • example.com/article?story=cupcake-news&page=2
  • etc.

In this case, pages 2 and 3 are not duplicates of page 1, which means that using the rel="canonical" attribute to declare the first page canonical would be an error. This error may cause pages 2 and 3 to be dropped from the index.

It is also important to use the rel="next" and rel="prev" HTML attributes when paginating a document, to indicate the relationship between the individual URLs.

Mistake 2. Absolute URLs are mistakenly written as relative ones:

Of course, rel="canonical" can be used with both absolute and relative links; however, Google recommends using absolute links to minimize possible mistakes. If a base link is specified in the document, all relative links are resolved against it.

However, when an absolute link to a canonical page is mistakenly written as a relative one (for example, href="example.com/cupcake.html", which resolves to http://example.com/example.com/cupcake.html), the algorithms may ignore the canonical page specified in this way.
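The resolution rule can be reproduced with urljoin from Python's standard library, which follows the usual rules for resolving relative references. The page address http://example.com/page.html and the helper name resolve_canonical are hypothetical.

```python
from urllib.parse import urljoin


def resolve_canonical(page_url: str, href: str) -> str:
    """Resolve the href of a rel="canonical" link against the page's own URL."""
    return urljoin(page_url, href)


page_url = "http://example.com/page.html"  # hypothetical page carrying the tag

# Scheme omitted: the href is treated as a relative path.
print(resolve_canonical(page_url, "example.com/cupcake.html"))
# http://example.com/example.com/cupcake.html

# Full absolute URL: resolved as intended.
print(resolve_canonical(page_url, "http://example.com/cupcake.html"))
# http://example.com/cupcake.html
```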

Mistake 3. Several pages from a set with similar content are declared canonical, or the rel="canonical" attribute is used by mistake:

Google experts often observe the following situation: a webmaster copies a page template and forgets to change the value of the rel="canonical" attribute.

If you use templates, remember to check whether the rel="canonical" attribute was accidentally copied.

Another mistake webmasters make is declaring several pages from a set with similar content canonical. This is often caused by plugins used on the page: a plugin may mistakenly insert its own rel="canonical" link into the page code.

It is important to understand that in both cases Google's algorithms treat the rel="canonical" attribute as invalid and disregard it during indexing.

Mistake 4. A category or landing page points with the rel="canonical" attribute to a featured article:

With this approach, only the article page will be included in the index, while the category page itself will not be indexed.

Mistake 5. The rel="canonical" attribute is used in the <body> section of the document:

As mentioned above, the rel="canonical" attribute must be placed in the <head> section of the HTML document, not in the <body> section. Otherwise, Google's algorithms will ignore the tag, especially if it appears among plain text or tags that normally belong in the <body> section.
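A rough way to see where the tag ended up is to record the section that was open when each canonical link was encountered. The names CanonicalPlacement and canonical_sections are hypothetical, and the sketch assumes the page writes its <head> and <body> tags explicitly; it does not reproduce how a search engine actually parses the page.

```python
from html.parser import HTMLParser


class CanonicalPlacement(HTMLParser):
    """Record the section ("head" or "body") in which each canonical link appears."""

    def __init__(self):
        super().__init__()
        self.section = None   # last opened section; None before <head>/<body>
        self.found_in = []

    def handle_starttag(self, tag, attrs):
        if tag in ("head", "body"):
            self.section = tag
        elif tag == "link":
            rel = (dict(attrs).get("rel") or "").lower().split()
            if "canonical" in rel:
                self.found_in.append(self.section)


def canonical_sections(html_text: str) -> list:
    parser = CanonicalPlacement()
    parser.feed(html_text)
    return parser.found_in


bad_page = ('<html><head><title>Cupcakes</title></head>'
            '<body><link rel="canonical" href="http://example.com/cupcake.html"></body></html>')
print(canonical_sections(bad_page))
# ['body']
```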

Additional information regarding the application of the rel="canonical" attribute is available on the form
