Monday, October 22, 2007

Jobster gets blocked

We're having performance issues on RHW lately, and one finding is that Jobster is scraping our jobs. No surprise there. They do attribute us, but I never liked their model of just scraping instead of asking for a feed.

We're seeing this in our error logs:

DateTime={ts '2007-10-22 14:47:22'}, Template=/BrowseAds/index.cfm, RemoteAddress=12.129.9.203, HTTPReferer=, Diagnostics=The request has exceeded the allowable time limit Tag: CFSTOREDPROC
The error occurred on line 223., QueryString=SN=103&M=100&RP=datedesc&D=summary&R=1, Cookie=, Browser=Mozilla/4.0 (compatible; +http://www.jobster.com/indexing.html)

But the link their robot reports, http://www.jobster.com/indexing.html, returns a 404. For shame.
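
For anyone wanting to see how often a given client trips that timeout, here is a rough Python sketch that pulls the remote address and browser string out of entries shaped like the one above. The field names come from that log excerpt; the helper names (`parse_entry`, `timeouts_by_ip`) are made up for illustration, and the sketch assumes each entry has been joined onto a single line.

```python
import re
from collections import Counter

# Sample entry copied from the log excerpt above, joined onto one line.
entry = (
    "DateTime={ts '2007-10-22 14:47:22'}, Template=/BrowseAds/index.cfm, "
    "RemoteAddress=12.129.9.203, HTTPReferer=, "
    "Diagnostics=The request has exceeded the allowable time limit Tag: CFSTOREDPROC "
    "The error occurred on line 223., "
    "QueryString=SN=103&M=100&RP=datedesc&D=summary&R=1, Cookie=, "
    "Browser=Mozilla/4.0 (compatible; +http://www.jobster.com/indexing.html)"
)

def parse_entry(line):
    """Return the remote IP, template and browser string of one log entry.

    Hypothetical helper: missing fields come back as None.
    """
    ip = re.search(r"RemoteAddress=([\d.]+)", line)
    template = re.search(r"Template=([^,]*)", line)
    browser = re.search(r"Browser=(.*)$", line)
    return {
        "ip": ip.group(1) if ip else None,
        "template": template.group(1) if template else None,
        "browser": browser.group(1) if browser else None,
    }

def timeouts_by_ip(lines):
    """Count time-limit errors per remote address (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        if "exceeded the allowable time limit" in line:
            ip = parse_entry(line)["ip"]
            if ip:
                counts[ip] += 1
    return counts

fields = parse_entry(entry)
print(fields["ip"], fields["browser"])
print(timeouts_by_ip([entry]).most_common(5))
```

Sorting the counts this way makes it easy to spot whether one address is responsible for most of the timeouts before reaching for the firewall.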

So, a quick ARIN lookup on the remote IP 12.129.9.203 gives this:

AT&T WorldNet Services ATT (NET-12-0-0-0-1)
12.0.0.0 - 12.255.255.255
CERFnet ATTENS-SEA1-1 (NET-12-129-0-0-1)
12.129.0.0 - 12.129.63.255
Jobster, Inc. ATTENS-011426-005621 (NET-12-129-9-192-1)
12.129.9.192 - 12.129.9.223

So I'm blocking them at the firewall, until they decide to play nice.
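
Most firewalls want a CIDR block rather than a start and end address. As a small sketch (not the actual firewall configuration, which isn't shown here), Python's standard `ipaddress` module can collapse the Jobster range from the ARIN output into CIDR form and confirm that the address from the error log falls inside it. The `is_blocked` helper is a made-up name for illustration.

```python
import ipaddress

# Range assigned to Jobster, Inc. according to the ARIN output above.
first = ipaddress.IPv4Address("12.129.9.192")
last = ipaddress.IPv4Address("12.129.9.223")

# Collapse the start/end pair into the smallest set of CIDR networks.
blocks = list(ipaddress.summarize_address_range(first, last))
print(blocks)  # [IPv4Network('12.129.9.192/27')]

def is_blocked(ip):
    """Return True if ip falls inside any of the blocked networks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in blocks)

# The address that showed up in the error log.
print(is_blocked("12.129.9.203"))  # True
# The first address past the end of the range is not covered.
print(is_blocked("12.129.9.224"))  # False
```

The range works out to a single /27 (32 addresses), so one rule covers all of it without touching the rest of the AT&T space.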

3 comments:

John said...

Why *wouldn't* they use a feed? Seems bizarre.

Steve Bywater said...

I think it's because, besides being an aggregator of jobs, they also sell help wanted ads on their site. So they want our content, and want to compete with us too.

Unknown said...

Hi Steve,

We would prefer to take a feed from you for your jobs instead of scraping your site to ensure the data is accurate and to avoid any further issues. Give me a call to discuss.

Kevin O'Donnell
Business Director, Jobster
206.428.1131
kevino@jobster.com