Has Google cancelled support for robots.txt noindex? Recently, the search engine giant announced that it will no longer support the noindex directive in robots.txt files. Website owners who rely on it have until September 1, 2019, when the change takes effect, to switch to an alternative.
What is a robots.txt file, actually?
The basic function of the robots.txt file is to tell web robots which pages of a website they may crawl and which they may not. robots.txt is part of the Robots Exclusion Protocol (REP), a group of standards that govern how robots crawl the web, index content, and serve that content to users.
In brief, robots.txt tells user agents which parts of a website to crawl and which parts to stay away from.
The basic format of the robots.txt file
User-agent: [user-agent name]
Disallow: [URL string not to be crawled]
Together, these two lines make up a complete robots.txt rule.
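For example, a minimal sketch of a robots.txt file that tells every crawler to skip a hypothetical /private/ directory (the directory name is illustrative) could look like this:

```
# Applies to all user agents; blocks crawling of /private/ only
User-agent: *
Disallow: /private/
```

Note that Disallow only controls crawling, not indexing, which is exactly why the noindex question matters.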
Previously, Google unofficially supported a noindex directive inside robots.txt, but it has now announced that this support is ending. Website owners should take note and apply one of the alternatives as soon as possible.
So, how to control crawling?
We all know the robots.txt file is used to control crawling; the question now is how website owners can control indexing without the noindex directive. If you are a website owner and you are worried about this change, Google's official announcement lists the following alternatives:
- Use noindex in your robots meta tags
- Return 404 or 410 HTTP status codes
- Password-protect your pages
- Use the Remove URLs tool in Search Console
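Of the options above, the robots meta tag is the most direct replacement for robots.txt noindex. As a minimal sketch, a page can opt out of Google's index with a single tag in its head:

```html
<!-- Placed inside the page's <head>; tells all crawlers not to index this page -->
<meta name="robots" content="noindex">
```

For non-HTML resources such as PDFs, the same directive can instead be sent as an X-Robots-Tag HTTP response header.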
You can read Google's official announcement on their blog here: