We're having performance issues on RHW lately, and one finding is that Jobster is scraping our jobs. No surprise there. They do attribute us, but I never liked their model of just scraping instead of asking for a feed.
We're seeing this in our error logs:
DateTime={ts '2007-10-22 14:47:22'}, Template=/BrowseAds/index.cfm, RemoteAddress=12.129.9.203, HTTPReferer=, Diagnostics=The request has exceeded the allowable time limit Tag: CFSTOREDPROC
The error occurred on line 223., QueryString=SN=103&M=100&RP=datedesc&D=summary&R=1, Cookie=, Browser=Mozilla/4.0 (compatible; +http://www.jobster.com/indexing.html)
But the link their robot reports, http://www.jobster.com/indexing.html, returns a 404. For shame.
So, quick ARIN lookup on the remote IP 12.129.9.203 gives this:
AT&T WorldNet Services ATT (NET-12-0-0-0-1)
12.0.0.0 - 12.255.255.255
CERFnet ATTENS-SEA1-1 (NET-12-129-0-0-1)
12.129.0.0 - 12.129.63.255
Jobster, Inc. ATTENS-011426-005621 (NET-12-129-9-192-1)
12.129.9.192 - 12.129.9.223
So I'm blocking them at the firewall, until they decide to play nice.
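Most firewalls want a CIDR block rather than a start/end range, so here is a minimal sketch of turning the ARIN range above into one and checking that the address from the error log falls inside it. It uses Python's standard `ipaddress` module. The variable names are made up for illustration, and this is not the firewall rule itself, just a way to sanity-check the range before writing one.

```python
import ipaddress

# Range ARIN reports for Jobster (NET-12-129-9-192-1), taken from the lookup above.
first = ipaddress.ip_address("12.129.9.192")
last = ipaddress.ip_address("12.129.9.223")

# Collapse the start/end range into the smallest list of CIDR blocks.
blocks = list(ipaddress.summarize_address_range(first, last))

# The RemoteAddress from the error log entry.
scraper = ipaddress.ip_address("12.129.9.203")

for block in blocks:
    # Print each block and whether the scraper's address is inside it.
    print(block, scraper in block)
```

The range collapses to the single block 12.129.9.192/27 (32 addresses), and the logged address is inside it, so one rule for that block should cover the whole allocation.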
Monday, October 22, 2007
Jobster gets blocked
3 comments:
Why *wouldn't* they use a feed? Seems bizarre.
I think it's because, besides being an aggregator of jobs, they also sell help wanted ads on their site. So they want our content, and want to compete with us too.
Hi Steve,
We would prefer to take a feed from you for your jobs instead of scraping your site to ensure the data is accurate and to avoid any further issues. Give me a call to discuss.
Kevin O'Donnell
Business Director, Jobster
206.428.1131
kevino@jobster.com