Web Design Forum: Strange one for ya - Web Design Forum

Strange one for ya

#1 User is online   roothost 

  • Currently accepting new clients
  • PipPipPipPipPip
  • Group: Members
  • Posts: 1,463
  • Joined: 06-February 11
  • Reputation: 73
  • Gender:Male
  • Location:Lewes, East Sussex
  • Experience:Intermediate
  • Area of Expertise:Web Designer

Posted 31 August 2011 - 09:26 AM

I always thought that this..
User-agent: *
Disallow: /blog/


..would block the search engines from indexing the page/folder.
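
For anyone reading along, here is a minimal sketch of how that rule is matched, using Python's standard `urllib.robotparser`. The `example.com` URLs and the `is_allowed` helper are placeholders made up for illustration, not anything from the thread. The parser only reports what a compliant crawler would do; it says nothing about what ends up in a search index.

```python
from urllib.robotparser import RobotFileParser

# The rule from the post above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /blog/
"""

def is_allowed(url, user_agent="*"):
    """Report whether a compliant crawler may fetch `url` under ROBOTS_TXT."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)

# example.com is a placeholder host (assumption, not from the thread).
print(is_allowed("http://example.com/blog/news/"))  # under /blog/, so disallowed
print(is_allowed("http://example.com/about/"))      # no matching rule, so allowed
```

With this parser at least, the rule is a path-prefix match: it covers everything under `/blog/`, but not a URL such as `/blog` without the trailing slash.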

#2 User is offline   SniderDK 

  • Expert
  • PipPipPipPip
  • Group: Members
  • Posts: 697
  • Joined: 01-November 08
  • Reputation: 88
  • Gender:Male
  • Experience:Web Guru
  • Area of Expertise:Web Developer

Posted 31 August 2011 - 09:36 AM

roothost, on 31 August 2011 - 09:26 AM, said:

I always thought that this..
User-agent: *
Disallow: /blog/


..would block the search engines from indexing the page/folder.



It would request that spiders don't go there... it's not a "force", as a crawler can decide not to check the file, or to ignore it... although that's not what happens on the polite internet, there's no technical limitation stopping them from ignoring the file.


http://www.robotstxt.../robotstxt.html
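
To illustrate the point that compliance is voluntary, here is a rough sketch: the robots.txt check lives in the crawler's own code, so a crawler that skips it meets no barrier. The function `urls_to_crawl`, its `respect_robots` flag, and the `example.com` URLs are hypothetical names made up for this illustration.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /blog/
"""

def urls_to_crawl(urls, robots_txt, user_agent, respect_robots=True):
    """Return the URLs a crawler would fetch.

    Honouring robots.txt is a choice made by the crawler: with
    respect_robots=False the file is simply never consulted.
    """
    if not respect_robots:
        return list(urls)
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if parser.can_fetch(user_agent, url)]

# Placeholder URLs (assumption, not from the thread).
candidates = ["http://example.com/", "http://example.com/blog/news/"]
polite = urls_to_crawl(candidates, ROBOTS_TXT, "PoliteBot")
rude = urls_to_crawl(candidates, ROBOTS_TXT, "RudeBot", respect_robots=False)
print(polite)
print(rude)
```

The polite crawler drops the `/blog/` URL from its list; the rude one keeps both, and nothing on the server side prevents that.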

#3 User is online   roothost 

  • Currently accepting new clients
  • PipPipPipPipPip
  • Group: Members
  • Posts: 1,463
  • Joined: 06-February 11
  • Reputation: 73
  • Gender:Male
  • Location:Lewes, East Sussex
  • Experience:Intermediate
  • Area of Expertise:Web Designer

Posted 31 August 2011 - 09:41 AM

SniderDK, on 31 August 2011 - 09:36 AM, said:

It would request that spiders don't go there... it's not a "force", as a crawler can decide not to check the file, or to ignore it... although that's not what happens on the polite internet, there's no technical limitation stopping them from ignoring the file.


http://www.robotstxt.../robotstxt.html

OK, I just wondered, as I had blocked the /blog/ folder months ago (before I even added content to it) and noticed this morning that the big G had indexed /blog/ and /blog/news/.

#4 User is offline   SniderDK 

  • Expert
  • PipPipPipPip
  • Group: Members
  • Posts: 697
  • Joined: 01-November 08
  • Reputation: 88
  • Gender:Male
  • Experience:Web Guru
  • Area of Expertise:Web Developer

Posted 31 August 2011 - 10:11 AM

Well, that's a bit rude of the gman...

They do have a tool in Google Webmaster Tools to drop pages... and you could add meta tags... or check for Googlebot in the User-Agent header and 404 the page... that will get them to drop it fairly quickly... obviously don't do that if there's page history that could be saved :)

Thinking about it webmaster tools from google will tell you exactly what they think they should be doing in the diagnostics section....
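
A rough sketch of the User-Agent check mentioned above, written as a bare WSGI app so it runs without any framework. The `/blog/` prefix comes from the thread; the `app` function, the `BLOCKED_PREFIX` constant, and the response bodies are assumptions for illustration only. User-Agent strings are trivially spoofed, so this identifies a crawler by its own claim and nothing more.

```python
BLOCKED_PREFIX = "/blog/"  # path from the thread

def app(environ, start_response):
    """Minimal WSGI app: answer 404 to Googlebot under /blog/, 200 otherwise."""
    path = environ.get("PATH_INFO", "/")
    user_agent = environ.get("HTTP_USER_AGENT", "")
    # Case-insensitive substring match on the self-reported User-Agent.
    if path.startswith(BLOCKED_PREFIX) and "googlebot" in user_agent.lower():
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"Not Found"]
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Blog page"]
```

To try it locally, one option is the standard library's `wsgiref.simple_server.make_server("", 8000, app).serve_forever()`, then request a `/blog/` URL with and without a Googlebot User-Agent header.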

#5 User is online   rallport 

  • Web Guru
  • PipPipPipPipPip
  • Group: Members
  • Posts: 3,818
  • Joined: 03-January 10
  • Reputation: 266
  • Gender:Male
  • Location:England, UK
  • Experience:Advanced
  • Area of Expertise:Web Developer

Posted 01 September 2011 - 09:12 AM

I personally always use the meta noindex tag - I feel much more confident with it than with robots.txt.
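
The tag in question usually looks like `<meta name="robots" content="noindex">` in the page's `<head>`. As a quick way to confirm a page actually carries it, here is a small sketch using Python's standard `html.parser`. The `RobotsMetaParser` class and `has_noindex` helper are hypothetical names, and the sample HTML is made up for the example.

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Records whether a <meta name="robots"> tag contains 'noindex'."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (valueless attributes), so default to "".
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        directives = [d.strip().lower() for d in attr.get("content", "").split(",")]
        if "noindex" in directives:
            self.noindex = True

def has_noindex(html):
    """Return True if the HTML contains a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.noindex

sample = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
print(has_noindex(sample))
```

One thing worth keeping in mind: a crawler has to fetch the page to see this tag, so a robots.txt Disallow covering the same URL would stop a compliant crawler from ever reading it.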

#6 User is online   roothost 

  • Currently accepting new clients
  • PipPipPipPipPip
  • Group: Members
  • Posts: 1,463
  • Joined: 06-February 11
  • Reputation: 73
  • Gender:Male
  • Location:Lewes, East Sussex
  • Experience:Intermediate
  • Area of Expertise:Web Designer

Posted 01 September 2011 - 06:25 PM

SniderDK, on 31 August 2011 - 10:11 AM, said:

Well, that's a bit rude of the gman...

They do have a tool in Google Webmaster Tools to drop pages... and you could add meta tags... or check for Googlebot in the User-Agent header and 404 the page... that will get them to drop it fairly quickly... obviously don't do that if there's page history that could be saved :)

Thinking about it webmaster tools from google will tell you exactly what they think they should be doing in the diagnostics section....

Yeah, I had to get some pages for another site dropped, and Webmaster Tools was pretty handy. It's not a major issue really, more of an annoyance.

rallport, on 01 September 2011 - 09:12 AM, said:

I personally always use the meta noindex tag - I feel much more confident with it than with robots.txt.

I will certainly be doing that from now on.

#7 User is offline   handatravel 

  • Forum Newcomer
  • Pip
  • Group: Members
  • Posts: 20
  • Joined: 13-September 11
  • Reputation: 0
  • Gender:Male
  • Experience:Nothing
  • Area of Expertise:Designer

Posted 13 September 2011 - 07:51 AM

Yes, it tells search engine crawlers not to crawl the blog folder.

#8 User is online   roothost 

  • Currently accepting new clients
  • PipPipPipPipPip
  • Group: Members
  • Posts: 1,463
  • Joined: 06-February 11
  • Reputation: 73
  • Gender:Male
  • Location:Lewes, East Sussex
  • Experience:Intermediate
  • Area of Expertise:Web Designer

Posted 13 September 2011 - 07:09 PM

handatravel, on 13 September 2011 - 07:51 AM, said:

Yes, it tells search engine crawlers not to crawl the blog folder.

?

#9 User is offline   oliviasmith 

  • Forum Newcomer
  • Pip
  • Group: Members
  • Posts: 2
  • Joined: 14-September 11
  • Reputation: 0

Posted 14 September 2011 - 11:56 AM

roothost, on 31 August 2011 - 09:26 AM, said:

I always thought that this..
User-agent: *
Disallow: /blog/


..would block the search engines from indexing the page/folder.



Yes, your blog folder will not be indexed.

#10 User is online   roothost 

  • Currently accepting new clients
  • PipPipPipPipPip
  • Group: Members
  • Posts: 1,463
  • Joined: 06-February 11
  • Reputation: 73
  • Gender:Male
  • Location:Lewes, East Sussex
  • Experience:Intermediate
  • Area of Expertise:Web Designer

Posted 14 September 2011 - 01:56 PM

oliviasmith, on 14 September 2011 - 11:56 AM, said:

Yes, your blog folder will not be indexed.

That's not strictly correct, as some people have said on here. After some research, it appears the search engines don't always play nicely, and this will not 100% block them from crawling and indexing the page/folder.

#11 User is online   zed 

  • Web Guru
  • Group: Moderators
  • Posts: 4,941
  • Joined: 25-May 10
  • Reputation: 703
  • Gender:Male
  • Experience:Intermediate
  • Area of Expertise:Designer/Coder

Posted 14 September 2011 - 01:57 PM

Thread locked in agreement with roothost.