http://www.whitehouse.gov/robots.txt Shocking What they DO NOT want Spidered

Discussion in 'General Chat' started by ! !, Jun 24, 2004.

  1. #1
    Yesterday they awarded the MEDAL OF FREEDOM
    http://www.whitehouse.gov/news/releases/2004/06/20040623-8.html

    to civilians. While looking at their press release, I decided to see how they construct their META tags and robots.txt file

    http://www.whitehouse.gov/robots.txt

    Boy - sometimes - you have to wonder???

    here is just a small sample:


     
    ! !, Jun 24, 2004 IP
  2. digitalpoint

    digitalpoint Overlord of no one Staff

    Messages:
    38,334
    Likes Received:
    2,613
    Best Answers:
    462
    Trophy Points:
    710
    Digital Goods:
    29
    #2
    What a waste of time... restricting a directory, then also disallowing every sub-directory within it. heh

    Although, maybe Bush is learning to be a webmaster.
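
The redundancy is easy to check. A minimal sketch, assuming plain prefix matching as implemented by Python's standard `urllib.robotparser`; the `/archive/` paths are hypothetical and are not taken from the White House file:

```python
from urllib.robotparser import RobotFileParser


def blocked(rules: str, path: str) -> bool:
    """Return True if a generic crawler ("*") may NOT fetch the path."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return not parser.can_fetch("*", path)


# One rule for the parent directory (hypothetical path).
SHORT_RULES = """\
User-agent: *
Disallow: /archive/
"""

# The same rule plus explicit rules for its sub-directories.
LONG_RULES = SHORT_RULES + (
    "Disallow: /archive/2004/\n"
    "Disallow: /archive/2004/06/\n"
)

PATHS = ["/archive/", "/archive/2004/", "/archive/2004/06/page.html", "/news/"]

short_result = [blocked(SHORT_RULES, p) for p in PATHS]
long_result = [blocked(LONG_RULES, p) for p in PATHS]

print(short_result)                 # [True, True, True, False]
print(short_result == long_result)  # True: the extra lines change nothing
```

Because a `Disallow` value matches every path that starts with it, the parent rule already covers everything underneath it.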
     
    digitalpoint, Jun 24, 2004 IP
  3. Lever

    Lever Deep Thought

    Messages:
    1,823
    Likes Received:
    94
    Best Answers:
    0
    Trophy Points:
    145
    #3
    LOL you'll have to let us know when he starts using the keyword tracker, Shawn - methinks he'll be watching how "miserable failure" does ;)
     
    Lever, Jun 25, 2004 IP
  4. Foxy

    Foxy Chief Natural Foodie

    Messages:
    1,614
    Likes Received:
    48
    Best Answers:
    0
    Trophy Points:
    0
    #4
    What happened to the WMD?
     
    Foxy, Jun 25, 2004 IP
  5. mcdar

    mcdar Peon

    Messages:
    1,831
    Likes Received:
    110
    Best Answers:
    0
    Trophy Points:
    0
    #5

    :D The Weapons of Mass Destruction (WMD) folder was EMPTY! :D
     
    mcdar, Jun 25, 2004 IP
  6. ! !

    ! ! Guest

    Messages:
    299
    Likes Received:
    25
    Best Answers:
    0
    Trophy Points:
    0
  7. Foxy

    Foxy Chief Natural Foodie

    Messages:
    1,614
    Likes Received:
    48
    Best Answers:
    0
    Trophy Points:
    0
    #7
    Hehehe ....Lol :D
     
    Foxy, Jun 25, 2004 IP
  8. schlottke

    schlottke Peon

    Messages:
    2,185
    Likes Received:
    63
    Best Answers:
    0
    Trophy Points:
    0
    #8
    I second the giggly laughter. ;)
     
    schlottke, Jun 25, 2004 IP
  9. austexdance

    austexdance Peon

    Messages:
    1
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #9
    wow, the sign up for these forums didn't take long at all. :) thank god, all the other ones take a whole day.

    anyway, I wanted to reply to this thread, as I have been actively seeking information on robots.txt and am waiting for the syntax to support an "allow" feature as well, or something similar, so that I don't have to have a crawler page and a robots page.
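
For what an "allow" rule can look like: the `Allow` directive began as a crawler extension and is now part of RFC 9309. A rough sketch of how one exception inside a blocked directory behaves, using Python's standard `urllib.robotparser`; the `/reports/` paths are made up for illustration:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: block a directory but open up one file inside it.
# The Allow line comes first because Python's parser applies the first
# rule that matches; longest-match parsers give the same answer here.
RULES = """\
User-agent: *
Allow: /reports/summary.html
Disallow: /reports/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def may_fetch(path: str) -> bool:
    """Return True if a generic crawler ("*") may fetch the path."""
    return parser.can_fetch("*", path)


print(may_fetch("/reports/summary.html"))  # True  (explicitly allowed)
print(may_fetch("/reports/raw.html"))      # False (inside the blocked directory)
print(may_fetch("/index.html"))            # True  (no rule applies)
```

Not every crawler honoured `Allow` at the time of this thread, so treat this as a description of parser behaviour, not a guarantee about any particular search engine.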

    In doing this, I read a really funny article
    http://www.theinquirer.net/?article=19357
    and I know, the source is the Inquirer, but it was so distorted in its thinking. Like where it says that the robots.txt files keep search engines from taking a snapshot of our history. Dumb statement there. Search engines are not designed to take snapshots...lol.. they are designed to let you search for the information that you are looking for, duh, hello.

    The fact of the matter is that they are using some type of include files on their system, so they have a simple, mostly text form of what they type, and then they do another page that calls the include and the text field. It's a large site, so they only want to have to change a little bit of code at a time.

    This is not in defense of the whitehouse...lol.. lord knows I wouldn't go that far, but, I doubt someone said "hey web programmer dude, let's be sneaky and not let search engines get certain information"... Look, if they were trying to be sneaky with their information, they simply wouldn't post it on the internet at all. It would be hidden somewhere for real.

    And as for that article saying that that is not the normal amount of disallows, again, it's not a normal sized site, so of course it has more directories. Try going to www.google.com/robots.txt
    that one has a lot as well, but guess what? its syntax is wrong. All the directories need to have trailing "/"
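
On the trailing "/" point, it may help to see what the slash actually changes. Under plain prefix matching, a rule without the slash is still valid; it just matches more paths. A small sketch with Python's standard `urllib.robotparser`, where the `/search` paths are illustrative only:

```python
from urllib.robotparser import RobotFileParser


def is_blocked(disallow_value: str, path: str) -> bool:
    """Return True if `Disallow: <disallow_value>` blocks the path for all crawlers."""
    parser = RobotFileParser()
    parser.parse(["User-agent: *", f"Disallow: {disallow_value}"])
    return not parser.can_fetch("*", path)


PATHS = ["/search", "/search/", "/search/results.html", "/searching.html"]

# Without the slash: every path that merely starts with "/search" is blocked.
without_slash = {p: is_blocked("/search", p) for p in PATHS}

# With the slash: only paths inside the /search/ directory are blocked.
with_slash = {p: is_blocked("/search/", p) for p in PATHS}

for p in PATHS:
    print(f"{p:24} no-slash={without_slash[p]!s:6} slash={with_slash[p]}")
```

So the trailing slash is a way to narrow a rule to a directory, and leaving it off widens the match to anything sharing that prefix.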

    OK, let me try this again, it wouldn't let me put in "live links" cause I am new... hello? administrator, I really am not trying to promote my own site, I promise...LOL..
    Anyway, thanks for lettin' me rant
    Mike
     
    austexdance, Dec 23, 2004 IP
  10. Blogmaster

    Blogmaster Blood Type Dating Affiliate Manager

    Messages:
    25,924
    Likes Received:
    1,354
    Best Answers:
    0
    Trophy Points:
    380
    #10
    it's being built in a secret location :)
     
    Blogmaster, Dec 23, 2004 IP
  11. minstrel

    minstrel Illustrious Member

    Messages:
    15,082
    Likes Received:
    1,243
    Best Answers:
    0
    Trophy Points:
    480
    #11
    If you are doing something to try to serve up different pages to spiders/bots than to human visitors, be advised that such practices may result in penalties by search engines.

    Anything on your site that you don't want a spider to "view" can be excluded with the Disallow: statement in robots.txt.
     
    minstrel, Dec 24, 2004 IP