
6

May

2009

I’m Speaking at SMX London

By Chewie. Posted in Off Topic | View Comments

I recently got accepted to speak on the What’s New with Social Media panel, which is being held at SMX London on 19th May 2009. I am really looking forward to this panel because I will be speaking alongside some really good guys (and gals), including Ciaran Norris, Massimo Burgio, and Lucy Langdon.

I'm hotlinking this image because my uploader won't work

I am planning on speaking about utilising Facebook and Twitter; much of the presentation will be spent on Facebook Connect and on using the Twitter API. All this will be wrapped in some case studies, which should push people to really consider the power of using such technologies.

Come along and say hi if you are there; I really do think it will be a good panel :)

 

Back in December 2008, Microsoft Live Search announced that it would be releasing its new spider/crawler into the wild to crawl all those lovely websites out there. Now, 4 months on, it seems that MSNbot is being very naughty and completely disregarding robots.txt and noindex meta tags. Even worse, it could be crawling your site based on the robots.txt of a completely different domain!


So what exactly is going on? It seems that the problem first started in February 2009, when some users on WebmasterWorld noticed that the new MSNbot had been fetching their robots.txt files but not obeying the rules, and was grabbing pages which had been excluded. Discussion ensued, with people wondering if this was just some crawler spoofing MSNbot, but it turns out that it was the real MSNbot. So why would it be completely disregarding robots.txt?

Well, another discussion over at Webmaster talk confirmed that MSNbot was definitely disregarding the robots.txt instructions; in fact, one member posted the following information…

65.55.106.115 - [01.11] "GET /robots.txt "msnbot/2.0b (+http://search.msn.com/msnbot.htm)"
65.55.106.115 - [01.11] "GET /about.php "msnbot/2.0b"
65.55.106.172 - [01.16] "GET /forbidden/ "msnbot/2.0b"

Now for the non-technical out there, the above is basically three lines from a log file, which show that MSNbot came to the site from the IP address 65.55.106.115 and read the robots.txt file; the bot then requested the about.php page and left. However, shortly after, MSNbot came back from a different IP address (this time 65.55.106.172) and tried to crawl the /forbidden directory. What’s weird here is that the /forbidden directory is apparently not linked to from anywhere, so the only way the bot would know it existed is by reading robots.txt and then disregarding it. It might cross your mind that this is all a coincidence and that someone masquerading as MSNbot came along shortly after and tried to access /forbidden; however, both IP addresses belong to Microsoft.
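If you would rather not eyeball your logs for this pattern, here is a rough Python sketch of the same check. It assumes you have already pulled the IP, path and user agent out of each log line, and the robots.txt content below is a made-up stand-in, since the forum post never showed the real one. The function name `disallowed_hits` is also just mine.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for the site in the log excerpt
# (an assumption: the original forum post did not include it).
ROBOTS_TXT = """\
User-agent: *
Disallow: /forbidden/
"""

# (ip, path, user_agent) tuples transcribed from the three log lines above.
REQUESTS = [
    ("65.55.106.115", "/robots.txt", "msnbot/2.0b"),
    ("65.55.106.115", "/about.php", "msnbot/2.0b"),
    ("65.55.106.172", "/forbidden/", "msnbot/2.0b"),
]


def disallowed_hits(requests, robots_txt, bot_token="msnbot"):
    """Return (ip, path) for every bot request that robots.txt disallows."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    hits = []
    for ip, path, agent in requests:
        # Only look at requests whose user agent mentions the bot.
        if bot_token in agent.lower() and not parser.can_fetch(agent, path):
            hits.append((ip, path))
    return hits


if __name__ == "__main__":
    for ip, path in disallowed_hits(REQUESTS, ROBOTS_TXT):
        print(f"{ip} requested disallowed path {path}")
```

Swap in your own robots.txt and your own parsed log entries; anything this returns is a request the bot should never have made if it had been following your rules.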

As I said earlier, it seems a bit strange that Microsoft would start to ignore robots.txt files. After digging deeper, it seems that there is a bug in the new MSNbot which means that it is actually reading the robots.txt of a completely different domain and then trying to spider your site. Here is an example request from the spider…

GET /robots.txt HTTP/1.1
Accept: */*
Host: www.lumigan.com
User-Agent: msnbot/2.0b (+http://search.msn.com/msnbot.htm)
Connection: Keep-Alive
Cache-Control: no-cache
Pragma: no-cache

In this instance, the spider thinks that it is crawling www.lumigan.com but is in fact crawling a completely different website, thus disregarding its robots.txt and indexing pages that shouldn’t be indexed. It’s at this point that Microsoft seemed to get wind of it and stated that they were looking into the problem.


The final piece of the puzzle comes from a post on one of Microsoft’s own social boards, where a user basically confirms what everyone else has been speculating…

For some reason, msnbot/2.0b is visiting the wrong IP addresses to retrieve robots.txt. In other words, it THINKS it is getting robots.txt for www.yoursite.com, but it is really reading the robots.txt file that is served for the default host at the IP address for www.mysite.com (not necessarily www.mysite.com’s robots.txt). Clearly, msnbot/2.0b is using the wrong DNS lookup for its requests.

So, we get confirmation that MSNbot is using the wrong DNS lookup for its requests and as such is definitely crawling sites based on the wrong robots.txt information. This is very concerning, since areas of your website that you specifically do not want crawled are being crawled and could end up in the Live SERPs.

Thankfully, Brett from MSN yet again confirms that they are aware of the problem and are trying to fix it. The problem is, no one seems to know when the fix will be complete or whether the data gathered in the past 4 months has already been used in the SERPs.

If you want to check whether your site has been affected, then I offer you the following advice from the above forum post…

Search your web log for requests from msnbot/2.0b. Do you see requests for links that don’t exist on your site? That’s because they exist on a different site, the one msnbot/2.0b THINKS it’s crawling. If you log the requested server name, do you see unfamiliar hosts? Those are the ones msnbot/2.0b THINKS it’s visiting.
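The "unfamiliar hosts" check is easy to script if your server logs the requested host name. Below is a minimal Python sketch, assuming each log entry has already been reduced to a (host, user agent) pair. `unfamiliar_hosts` is a name I made up, and www.example.com is a placeholder for whatever host names your own site really answers to.

```python
def unfamiliar_hosts(entries, my_hosts, bot_token="msnbot/2.0b"):
    """Return the sorted Host values the bot asked for that are not ours."""
    known = {host.lower() for host in my_hosts}
    seen = set()
    for host, agent in entries:
        # Keep only bot requests aimed at a host name we do not serve.
        if bot_token in agent.lower() and host.lower() not in known:
            seen.add(host.lower())
    return sorted(seen)


# Sample entries: the lumigan.com request is the one quoted earlier in
# this post; the other two are invented for illustration.
MSNBOT = "msnbot/2.0b (+http://search.msn.com/msnbot.htm)"
ENTRIES = [
    ("www.lumigan.com", MSNBOT),
    ("www.example.com", MSNBOT),
    ("www.other-site.com", "Mozilla/5.0"),
]

if __name__ == "__main__":
    print(unfamiliar_hosts(ENTRIES, ["www.example.com"]))
```

Any host name this returns is a site that MSNbot thought it was visiting when it actually landed on your server.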

You could also just outright ban MSNbot using .htaccess rules similar to the following…

RewriteEngine On
# Match on the User-Agent header, which is where msnbot identifies itself
RewriteCond %{HTTP_USER_AGENT} ^msnbot/2\.0b [NC]
# Return a 403 Forbidden response and no content
RewriteRule .* - [F,L]

Hopefully Microsoft can get this issue resolved soon.

 

I’m a little late to the party regarding this, as both Mike Nott and Robert Kerry have already covered it. However, since my blog is read by a ton more people than both of theirs combined, I thought I should write a quick post about it too :)

I have been at Ayima for just over a year now. I came from sunny Lincoln to the Big Smoke to work with some of the best people in the industry. A lot has happened in the past year: I have been fortunate to travel to conferences and learn a lot of new things, and Ayima has been steadily growing all the time.

Ayima USA

Tony Spencer has long been friends with Mike N, and as such we have started on a new venture by opening an Ayima office in America, with Tony heading it up. It is an exciting chapter in the Ayima book and hopefully it will mean that we all get to visit America a lot more :)

I am sure that Tony will do a great job, and we are all looking forward to pulling in some big clients.

 

We all love Facebook; well, most people did before the new layout change. If you read my blog you know that I spend a lot of time writing about new Facebook features etc. I thought I would lighten the mood and create my first blog list, and what better place to start than with Facebook and the parody videos that people have made.

Let’s get straight into it…

5) Facebook Song

The Facebook Song, umm, song sounds like something from a country and western convention. That being said, it’s not without its charm and is good for a laugh in a couple of places.

4) Facebook Anthem

This is actually quite well put together and would be higher up my list if it wasn’t for the girl reminding me of Hannah Montana, who I literally cannot stand.

3) The Facebook Skit

This video involves some Indian guy (I think he is Indian) singing along to a parody of Enrique Iglesias’ song Hero. Need I say more?

2) Facebook Off

The guys from CollegeHumor create a brilliant parody of the Nicolas Cage film Face/Off. Really well done and full of laughs; it would have been 1st if it wasn’t for…

1) Facebook Gangsta

By far the best Facebook parody. These guys encapsulate pretty much all the white guys who sit behind their computers thinking they are gangsters, and yes, that includes me.

Super Bonus completely unrelated parody video

Totally unrelated, but a great example of how to parody something, from the guys at Cracked.com. If you actually watch Star Trek, this is 100 times better.

 

I was checking out the Hitwise blog and came across an interesting post centered around the traffic growth of *shudders* Perezhilton.com… and no, I am not linking to it!

Anyway, the interesting thing about the post is that, according to Hitwise numbers, Perezhilton.com now gets more traffic from Facebook than it does from Google searches.

Apparently…

Last week, 8.70% of visits to PerezHilton.com came from Facebook compared to 7.62% from Google. The switch happened the week ending December 27th, 2008.

Then, in the comments, someone else confirms that…

It’s a similar situation in the UK: 2.5x as many searches for Perez than Paris last week. 19.1% of Perez Hilton’s traffic from Facebook, just 12.0% from Google UK and 2.3% from Google.com

What else is interesting is that Facebook now accounts for about 3.3% of all traffic driven to video sites. Hitwise have provided the following handy dandy trend graph.

[Graph: Hitwise trend of Facebook traffic to video sites]

So to sum up the point of this post: whilst it may still be difficult to attract Facebook users to buy products from your site, Facebook should certainly not be overlooked as a means of generating traffic to information-based sites or blogs. I imagine that this trend will continue to rise, especially as Facebook looks to launch its new real-time homepage later today.

 

26

Feb

2009

Sphinn’s wacky redirects

By Chewie. Posted in General | View Comments

I was just talking with Jane Copland about the beloved Sphinn blog and how it is always up to date :o ) So I decided to type in http://sphinn.com/blog and found it rather amusing when it 301 redirected to http://blog.sphinn.com/blog and then returned a 404 error.

I then decided to type http://www.sphinn.com/blog and got redirected to http://sphinn.com/php-fastcgi.fcgi/blog/index.php
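If you want to check your own redirect chains rather than typing URLs into a browser, here is a small Python sketch. To keep it runnable without touching the network, the HTTP request is passed in as a function: `fake_fetch` below is a stand-in that simply replays the first chain described in this post, and `trace_redirects` is just a name I picked.

```python
from urllib.parse import urljoin

REDIRECT_CODES = (301, 302, 303, 307, 308)


def trace_redirects(url, fetch, max_hops=10):
    """Follow redirects one hop at a time and return [(url, status), ...]."""
    chain = []
    for _ in range(max_hops):
        status, location = fetch(url)
        chain.append((url, status))
        if status not in REDIRECT_CODES or not location:
            break  # final response reached (or a redirect with no target)
        url = urljoin(url, location)  # Location headers may be relative
    return chain


# Stand-in for real HTTP requests, replaying the responses described above.
FAKE_RESPONSES = {
    "http://sphinn.com/blog": (301, "http://blog.sphinn.com/blog"),
    "http://blog.sphinn.com/blog": (404, None),
}


def fake_fetch(url):
    return FAKE_RESPONSES[url]


if __name__ == "__main__":
    for hop_url, status in trace_redirects("http://sphinn.com/blog", fake_fetch):
        print(status, hop_url)
```

To run it against a live site you would swap `fake_fetch` for a function that issues a request without following redirects and returns the status code and Location header. A chain that ends in anything other than a 200 is worth a closer look.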


It made me smile that Sphinn, an SEO blog, haven’t got their redirects sorted correctly, although I do sympathise with them since they are using the rather useless Pligg platform.

Evil Green Donkey was unavailable for comment; when I leaned over the desk to ask him, he put his fingers in his ears and started singing YMCA :o )

 

20

Feb

2009

Facebook Connect – Comments widget

By Chewie. Posted in Facebook | View Comments

The Facebook Connect world just keeps on getting better and better. Not only do we have the wondrous Facebook Connect core, we now have widgets for nearly every blogging platform, and on top of that FB have released a new comments widget that can plug directly into your site with little effort and code.


With the Comments Box, Facebook users on your site can comment on your content, post those comments to their profiles, and share them with their friends on Facebook. The Comments Box allows non-Facebook users to make comments on your site as well. And via their APIs, you can access related comments made on Facebook as well to bring the conversation together.

The Comments Box comes with additional social features:

  • Fully customizable: Specify background color, text color, and other attributes by providing your own custom CSS to incorporate this best into your site.
  • Access to raw data: Query all comments via the comments.get API method or the comment FQL table.
  • Administration and moderation: Manage the privacy and permissions of your comment boxes on an individual or global basis.
  • Integrates seamlessly regardless of whether you do or don’t have Facebook Connect already on your site.

So as you can see, it is pretty powerful, and you can get more info, plus the code, from the Facebook Blog.

 

18

Feb

2009

I’m speaking at SES London

By Chewie. Posted in General | View Comments

The blog has been a bit quiet over the last month since I have been a very busy little bee.

I just wanted to write a quick post to say that I am speaking at SES London tomorrow, the 19th February, at 12:45 on the Successful Site Architecture panel.

We will pretty much be covering the following…

Learn how to successfully architect your site for search engines and how specific page elements and design technologies may impact your ability to gain good organic listings. Covers topics such as directory and file structure, server-side includes (SSIs), 404 error trapping, JavaScript, robots.txt use, frames, secure area usage, and much more. Toward the end of the session, volunteers from the audience will have their sites examined to see how changes could be made to their site architecture and design to increase search engine traffic, as time allows.

I am going to cover the more technical details, so if you are around, come and sit in on my session and cheer me on :)

Dean chew SES

 

Well now you can with Zembly :)


What the hell is Zembly, you may be asking? Well, according to their website…

Using just your browser and your creativity, and working collaboratively with others, you create and publish Facebook apps, Meebo apps, OpenSocial apps, iPhone apps, Google Gadgets, embeddable widgets, and other social applications.

At Zembly, you can easily and instantly…

  • author social applications using your browser
  • participate and collaborate with others around live, editable code
  • use the richness of popular web APIs to create your applications
  • publish your social applications to multiple social platforms with a single click

So as you can see, Zembly gives you a way to generate applications for a number of platforms without having to learn the code for each of those platforms. So how do you go about actually creating them?

Basically, you have an online editor which allows you to take sample code snippets from other projects that people have made and insert them into your application. For example, if you are trying to work out how to build a voting system, you would search for “voting” using the built-in search function, and it would return all the matching code snippets, which you could then plug straight into your application.

Once you have your application built, you can save it onto the Zembly server, so you don’t even have to sort out hosting or set up your own server to handle applications. Once done, you can submit your apps to, say, Facebook and watch the millions of $$$ roll on in :) You can already check out a Facebook application made using Zembly called Capitol Punishment; as you can see, it is pretty robust.

Zembly is currently in private beta, so keep an eye on their site for when it becomes available.

 

My fascination with Facebook Connect just grows and grows. I absolutely love it, and as such I have decided to add it to this blog so that users can log in and comment using their Facebook credentials. Obviously I realise that my blog doesn’t get a massive amount of traffic or comments, but I truly believe in FB Connect as a platform which can greatly increase the exposure of your site, so let’s see how it goes.

I have modified the WordPress plugin from Sociable.es, and it is so easy to configure and change. You just upload the entire directory into your WordPress plugins folder and then go to Facebook and add the Developer application. Once there, click ‘New Application’ and you will be given a key that you put in the settings section of WordPress. If you get stuck along the way, just check out the readme.txt file that comes with the plugin.

Save them both and away you go. It’s worth noting that there are two other WordPress plugins available for Facebook Connect, but they are nowhere near as good as the one from Sociable.

You can use the Facebook Connect form on the right of this page to log in and comment from now on :)