Robots file: search engine spiders visit the robots.txt file before anything else on a site. The file must sit in the root directory and its name must be lowercase. It controls which parts of the site spiders may or may not access — for example, it can keep the backend administration area, script directories, duplicate content, or other non-public pages from being crawled. Pages blocked in this file will not be crawled by spiders, but they also pass on no weight whatsoever.
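As a sketch, a robots.txt along the lines described above might look like the following. The paths are hypothetical — substitute your site's actual backend, script, and duplicate-content directories:

```
# robots.txt — must live at the site root, filename in lowercase
User-agent: *          # applies to all spiders
Disallow: /admin/      # block the backend administration area
Disallow: /scripts/    # block script directories
Disallow: /print/      # block duplicate (printable) versions of pages
```

Each `Disallow` line blocks one path prefix for the spiders matched by the preceding `User-agent` line.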
Disallow: the Disallow directive in the robots file blocks search engines from accessing parts of the site. Before a site officially launches, webmasters can use Disallow to keep spiders off the whole site, avoiding premature crawling that could later result in a penalty. Note that the directive prevents crawling but not indexing: if other sites link to a blocked page, its URL may still appear in search results.
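A minimal pre-launch robots.txt that blocks every spider from the entire site could be as simple as:

```
User-agent: *
Disallow: /
```

`Disallow: /` matches every path on the site; remember to remove or relax it once the site goes live.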
Canonical tag: this tag was first introduced by Google and Yahoo to identify and handle duplicate content and to standardize a site's URL versions. It is simple to use, and its effect resembles a 301 redirect, but the essence is different: it tells search engines to attribute the weight of all other versions to the defined URL, without actually redirecting those versions to it. Baidu's search engine has also begun to support the Canonical tag. The tag is easy to use, but the syntax must be written correctly, or search engines may misjudge the site's content.
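The tag itself is a single line placed in the `<head>` of each duplicate version, pointing at the preferred URL (example.com here is a placeholder domain):

```html
<head>
  <!-- all duplicate versions point their weight at this one URL -->
  <link rel="canonical" href="https://www.example.com/product" />
</head>
```

Unlike a 301, visitors who load a duplicate version stay on that page; only search engines consolidate the versions.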
Different SEO tag directives play different roles. Through them, webmasters can show different content to search engines and to users; used properly, each directive helps a webmaster run the site better, but a directive that is not clearly understood will often harm the site instead. Here are some common tag directives.
Noindex tag: a page marked Noindex will be excluded when search engine spiders encounter it. With the Noindex directive, the page will still be crawled and can still pass weight to other pages, but the crawled content will not appear in the search engine's index. The difference from the nofollow attribute is that a Noindex page still passes weight through its links; it simply does not rank itself. A sitemap page is a typical case: you may not want it to rank, but you do want it to pass weight to the internal links it lists.
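For the sitemap case described above, the behavior — stay out of the index, but let weight flow through the links — corresponds to a robots meta tag like this in the page's `<head>`:

```html
<!-- exclude this page from the index, but still follow and pass weight to its links -->
<meta name="robots" content="noindex, follow">
```

Pairing `noindex` with `nofollow` instead would also cut off the weight passed to the linked pages.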
Allow: the Allow directive is supported only by Google, Yahoo, and Ask. It explicitly tells those search engines which directories and pages of the site may be crawled. If a site needs to block a large number of pages, rather than listing every one of them in the robots file, it can be more convenient to block broadly and use Allow to open up the exceptions, letting spiders crawl the site faster.
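A sketch of the block-broadly-then-allow pattern, assuming a hypothetical /public/ directory that should stay crawlable; spiders that honor Allow (such as Googlebot) give the more specific rule precedence:

```
User-agent: *
Allow: /public/    # exception: this directory may be crawled
Disallow: /        # everything else is blocked
```

Spiders that do not support Allow will see only the `Disallow: /` rule, so test this pattern against the crawlers that matter to your site.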