Thread: robots.txt file

  1. #1
    casionmark is offline Private Member

    Default robots.txt file

    Hi -

    Just setting one of these up for a new site. I have seen the following in another site I have access to. I assume this code is blocking the google bots from crawling the WP admin back end. Is this something that should always be done? Should I include it in my robots.txt?

    User-agent: *
    Disallow: /wp-admin*
    Disallow: /wp-login.php
    Disallow: /wp-content*

    Also, someone has set up a robots.txt for me and it has the following code:

    Crawl-Delay: 20

    In Search Console this is marked as an error. Any ideas why?

    Thanks,

  2. #2
    Progger is offline Public Member

    Not important.

    The bot gets stuck on the login page. You can block it, or simply ignore the "crawling error" reports in GWT.


    User-agent: *

    Disallow: /wp-admin/
    Disallow: /wp-includes/



    Regards
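A quick way to see what the two rules above block is to run them through Python's standard-library parser. This is only a minimal sketch: `example.com`, the sample paths, and the helper name `is_allowed` are assumptions for illustration, and `urllib.robotparser` does plain prefix matching, so it will not behave like Googlebot for rules that contain `*` wildcards.

```python
from urllib.robotparser import RobotFileParser

# Rules from the post above, written without a blank line after User-agent,
# because this particular parser treats a blank line as the end of a group.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(path, agent="Googlebot"):
    """Return True if `agent` may fetch `path` under the rules above."""
    # example.com is a placeholder host; only the path is compared to the rules.
    return parser.can_fetch(agent, "https://example.com" + path)


if __name__ == "__main__":
    for path in (
        "/wp-admin/options.php",
        "/wp-includes/js/jquery.js",
        "/wp-login.php",
        "/wp-content/uploads/logo.png",
    ):
        print(path, "allowed" if is_allowed(path) else "blocked")
```

With these two rules, the admin and includes directories are blocked, while `/wp-login.php` and everything under `/wp-content/` stay crawlable.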

  3. The Following User Says Thank You to Progger For This Useful Post:

    GaryTheScubaGuy (23 January 2018)

  4. #3
    schegolev100 is offline Private Member

    Hi,

    For a WP site, it is recommended to keep duplicate pages and the admin panel out of the index.

    Example robots.txt

    User-agent: *

    Disallow: /page/*
    Disallow: /uncategorized/*
    Disallow: /wp-admin/*
    Disallow: /wp-admin/admin-ajax.php*
    Disallow: /?p=*
    Disallow: /?*
    Disallow: */page/*
    Disallow: */attachment/*
    Disallow: /?s=
    Disallow: /?feed=
    Disallow: */feed/
    Disallow: */trackback
    Disallow: /xmlrpc.php
    Disallow: /wp-register.php
    Disallow: /wp-login.php
    Disallow: /wp-includes
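The wildcard patterns in the list above can be checked with a small matcher. What follows is a rough sketch, assuming Google-style pattern semantics: a rule matches from the start of the path, `*` matches any run of characters, and a trailing `$` pins the end of the URL. The helper names `rule_to_regex` and `matching_rules` and the sample paths are made up for illustration, and Allow rules and longest-match precedence are not modeled.

```python
import re

# A subset of the Disallow patterns from the example above.
RULES = [
    "/page/*",
    "/wp-admin/*",
    "/wp-admin/admin-ajax.php*",
    "/?*",
    "*/page/*",
    "/?s=",
    "*/feed/",
    "/wp-includes",
]


def rule_to_regex(rule):
    """Translate one robots.txt path pattern into a compiled regex."""
    anchored = rule.endswith("$")  # a trailing '$' pins the end of the URL
    if anchored:
        rule = rule[:-1]
    # Escape each literal piece, then join the pieces with '.*' for every '*'.
    body = ".*".join(re.escape(piece) for piece in rule.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def matching_rules(path, rules=RULES):
    """Return every pattern that would block `path` (path plus query string)."""
    # re.match anchors at the start of the string, which gives prefix matching.
    return [rule for rule in rules if rule_to_regex(rule).match(path)]


if __name__ == "__main__":
    for path in ("/?s=casino", "/blog/page/2/", "/wp-admin/admin-ajax.php", "/about/"):
        print(path, "->", matching_rules(path))
```

Under these assumptions several of the rules overlap: `/?*` already covers `/?s=`, and `/wp-admin/*` already covers `/wp-admin/admin-ajax.php*`. A trailing `*` adds nothing, because matching is prefix-based anyway.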

  5. #4
    citizen42 is offline Private Member

    You may want to remove the wp-content line, as you may have files directly referenced in there (say, images).
    wp-includes/ and wp-admin/ should be disallowed, that's correct.

    As far as I know, Googlebot ignores the crawl-delay directive, so I'd remove that as well.

    Also, it's good practice to add the sitemap URL at the end of the robots.txt file, written as a full URL including the scheme:
    Sitemap: https://yoursite.com/path-to-sitemap.xml
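To illustrate the two directives mentioned above, here is a small sketch that reads a Crawl-delay and a Sitemap line with Python's standard-library parser (Python 3.8 or newer for `site_maps()`). The file contents and the `example.com` URL are assumptions. Whether a given crawler honours Crawl-delay is up to that crawler; the parser only reports the value.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: Crawl-delay sits inside the User-agent group,
# and the Sitemap line uses a full URL including the scheme.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Crawl-delay: 20

Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Delay (in seconds) requested for the catch-all group, or None if absent.
delay = parser.crawl_delay("*")

# List of sitemap URLs declared in the file, or None if there are none.
sitemaps = parser.site_maps()

if __name__ == "__main__":
    print("crawl delay:", delay)
    print("sitemaps:", sitemaps)
```

Crawlers that support the directive can read the value this way. A crawler that ignores it, as described above for Googlebot, would simply skip the line.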

  6. #5
    RichardV is offline Private Member

    Don't forget to disallow your affiliate links as well:
    Disallow: /visit/*
    or whatever other path you're using for them.
