What Is a robots.txt File and How Can It Be Created?

The robots.txt file’s job is to stop search engine robots from accessing certain content on your website. Many of you may ask why we would ever block Google from some of our site’s pages — shouldn’t we want Google to crawl our pages, show them in its search results, and send us more clicks?

In this post, before explaining what robots.txt is and what it is used for, we will first look at why you should use it at all.

Why should I use a robots.txt file?

The Google search engine uses a variety of robots, known as crawlers, to find and index web pages.

These crawlers move through web pages, collect information about them, and send it back to the search engine.

However, some webmasters have pages they do not care to see in Google’s index, and they need a way to tell Google’s robots not to index those pages.

This is where the robots.txt file comes in: a plain text file into which you enter a series of directives so that crawlers understand which of our site’s pages should not be indexed. The directives refer to the pages that must stay out of the index.

At what address is the robots.txt file visible?

Most of the well-known sites you see on the Internet use a robots.txt file and benefit from its features — including our own website, Porteghaliha!

To view the robots.txt file of any site and analyze it, simply add “/robots.txt” to the site’s domain name.

For example, our robots.txt file at our site is accessible from this address: https://porteghaliha.com/robots.txt
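Building that address can even be done programmatically; here is a minimal Python sketch (the domain is just our example from above):

```python
from urllib.parse import urljoin

# Appending "/robots.txt" to a site's root gives the file's public address.
domain = "https://porteghaliha.com/"
robots_url = urljoin(domain, "/robots.txt")
print(robots_url)  # https://porteghaliha.com/robots.txt
```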

Getting familiar with robots.txt file commands and their meanings


You do not need coding expertise or sophisticated code to use a robots.txt file on your website.

We will teach you the directives you need, and by reading this post you will be able to create your own robots.txt file in an optimal way.

The robots.txt file commands fall into three general categories:

  1. User-agent
  2. Disallow
  3. Allow

User-agent command

As we mentioned earlier, Google uses different crawler robots to find and index the pages of different sites. Before writing the rules in a robots.txt file, you need to specify which robot those rules are addressed to.

However, most websites simply target all robots and give all of them the same instructions.

We suggest that you do the same, especially if you are a beginner in this field.

If you want to address all crawler robots at once, just use the “*” sign after the User-agent statement. For example:

User-agent: *

But if you want to address only one of Google’s crawlers, do the following:

User-agent: Googlebot-Image

The above command addresses Google’s image crawler only.
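Putting the two forms together, a single robots.txt file can contain separate groups of rules for different robots (the paths here are hypothetical examples):

```
User-agent: *
Disallow: /private

User-agent: Googlebot-Image
Disallow: /photos
```

Each robot follows only the most specific group that matches its own user-agent: Googlebot-Image would obey the second group here, while all other robots obey the first.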

You can view the list of all Google Agents and Robots from this link.

Disallow command

Use the Disallow command to restrict crawler access to specific pages of the website. If some of your site’s pages are not valuable enough to appear in Google’s index, this is the command to use.

Suppose your site has a download section that you do not want Google to index. In this case, you would use code like the following.

User-agent: *
Disallow: /dl

This rule means that all pages of your site may be indexed except those whose addresses start with /dl.
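To see how crawlers interpret such a rule, you can run it through Python’s standard-library robots.txt parser as a quick sanity check (the URLs are hypothetical examples):

```python
from urllib.robotparser import RobotFileParser

# Parse the same rules the article describes.
rp = RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /dl",
])

# Any path starting with /dl is blocked...
print(rp.can_fetch("*", "https://example.com/dl/file.zip"))  # False
# ...while all other paths remain allowed.
print(rp.can_fetch("*", "https://example.com/blog/post"))    # True
```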

 

Allow command

The Allow command is used to give Google’s robots access to a file inside a folder that has been disallowed. To explain, let’s look at an example:

Suppose you have restricted the dl folder so that Google’s crawler has no permission to access it, but inside the dl folder there is a file called porteghal that you would like indexed. In that case you would use code like this:

User-agent: *
Disallow: /dl
Allow: /dl/porteghal
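When both rules match a path, Google applies the most specific (longest) matching rule, with ties going to Allow. A minimal Python sketch of that precedence (the helper name and paths are hypothetical):

```python
def google_allowed(path, rules):
    """rules: list of ("allow" | "disallow", prefix) pairs.

    Google-style precedence: the longest matching prefix wins,
    ties go to Allow, and an unmatched path is allowed.
    """
    best = None  # (prefix_length, allowed)
    for kind, prefix in rules:
        if path.startswith(prefix):
            cand = (len(prefix), kind == "allow")
            if best is None or cand[0] > best[0] or (cand[0] == best[0] and cand[1]):
                best = cand
    return True if best is None else best[1]

rules = [("disallow", "/dl"), ("allow", "/dl/porteghal")]
print(google_allowed("/dl/other.zip", rules))  # False: only /dl matches
print(google_allowed("/dl/porteghal", rules))  # True: the longer Allow rule wins
print(google_allowed("/blog", rules))          # True: nothing matches
```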

 

How to create a robots.txt file

To create a robots.txt file, first type the directives you want into a plain text editor such as Notepad, then save the file under the name “robots.txt”.
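This step can also be scripted; here is a minimal sketch that writes the rules from the earlier examples into a robots.txt file in the current directory (the paths are the hypothetical ones from above):

```python
from pathlib import Path

# The directives from the earlier examples.
rules = "User-agent: *\nDisallow: /dl\nAllow: /dl/porteghal\n"

# Save them under the exact name robots.txt.
Path("robots.txt").write_text(rules, encoding="utf-8")
print(Path("robots.txt").read_text(encoding="utf-8"))
```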

Now you need to upload this file to the root of your site via your web host. To do this, go to the File Manager section of cPanel.

The File Manager section in cPanel hosting

Then enter the public_html folder.

The public_html folder in cPanel hosting

Inside this folder you will see the files located in the root path of your site.

Now upload the robots.txt file into this folder.


In the upload dialog, select the robots.txt file from wherever you saved it on your hard drive. Once the upload finishes, you have successfully created your robots.txt file.

To summarize the above: one thing you should do on your site is keep Google away from the pages that are not important enough to be indexed and shown in search results.

Google prefers the pages in its search results to be useful and valuable. If you have pages that are not worth indexing or not useful enough, you should tell Google not to index them — and the robots.txt file is one way to do that.

Of course, you can also use the noindex tag, which we will cover in an upcoming tutorial on our site.
