thom blake computer ethics

Using SSI in XML for RSS

Summary: As it turns out, it’s possible to include HTML files in your XML RSS feed using SSI.

(more…)

Hughes ISP ‘accelerator’ causing errors in my logs

After much investigation, I found that Hughes satellite Internet is the source of many of the “File does not exist” errors on our web server. Here I’ll detail the signs that this is happening, our best theories on why and how it happens, and what can be done about it. In what follows, paths have been changed to protect the innocent.

In our httpd error logs, I see entries like the following:

[Thu Aug 05 14:14:45 2010] [error] [client 67.142.130.26] File does not exist: /m/vhost/htdocs/file.swf, referer: http://thomblake.com/

Now, no one should be looking for file.swf in that folder, which corresponds to http://thomblake.com/file.swf. The file is actually at /m/vhost/htdocs/resources/file.swf and should be accessed via http://thomblake.com/resources/file.swf. Also, this affected only a tiny percentage of users; most people were not getting errors. The JavaScript, however, contains code like the following:

var location = path + "/file.swf";

I spent some time poking around before I got the idea of doing a whois on the offending IP addresses. It turns out they were all owned by Hughes, a satellite ISP. I had already guessed at this point that the errors might just be caused by a bad proxy or something, and this was pretty well confirmed here. As far as I can tell, Hughes has an ‘accelerator’ that prefetches content that dynamic pages might want at some point. One of the things this does is scrape the js file (without parsing or executing it) for anything that looks like a path. So it sees "/file.swf" and tries to retrieve the file at http://thomblake.com/file.swf.
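To make that theory concrete, here is a minimal sketch of the kind of naive scraping I’m guessing at. This is not Hughes’ actual code, which I have never seen; the function name guessPrefetchUrls and the regular expression are my own assumptions, chosen only to reproduce the behavior in the logs.

```javascript
// Hypothetical model of the accelerator: treat every quoted string in the
// script source that starts with "/" and ends in a file extension as a
// root-relative path, without parsing or executing the script.
function guessPrefetchUrls(scriptSource, siteRoot) {
  var pattern = /["'](\/[^"'\s]*\.[A-Za-z0-9]+)["']/g;
  var urls = [];
  var match;
  while ((match = pattern.exec(scriptSource)) !== null) {
    // match[1] is the quoted text, e.g. "/file.swf"
    urls.push(siteRoot + match[1]);
  }
  return urls;
}

var source = 'var location = path + "/file.swf";';
console.log(guessPrefetchUrls(source, "http://thomblake.com"));
// prints [ 'http://thomblake.com/file.swf' ]
```

A scraper like this never learns what path holds, so it can only guess that the string literal is the whole path.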

At this point I had a few options. The obvious choice was to ignore the problem. One obscure ISP has a buggy piece of software that, as far as I can tell, is not even ruining the UX of my website. However, it might be worth finding a work-around, so I plodded onward. There is no reasonable, general work-around for this: the bug is in someone else’s software, so I can’t truly fix it on my end. It is possible, however, to fix particular cases.

As it turns out, I was only using path in one place, so I was able to rewrite the script so that the ‘accelerator’ looks for the file where it actually is. In theory, this improves the performance of the site for the end-user (since the accelerator is now working properly) and it clears some of the junk out of the error logs. I fixed similar problems in other scripts and CSS files by changing some relative paths to absolute paths where it didn’t seem totally crazy to do so.
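The rewrite amounts to something like the following sketch. I’m assuming here that path held "/resources"; the variable name swfLocation is made up for the example (at the top level of a browser script, a variable called location would collide with window.location, so the sketch avoids it).

```javascript
// Before: the literal "/file.swf" on its own looks like a root-relative path.
// var location = path + "/file.swf";

// After: the full path is a single string literal, so anything scraping
// the source for path-like strings finds the file where it actually lives.
var swfLocation = "/resources/file.swf";
console.log(swfLocation);
```

The cost is that the path is now hard-coded, which is why I only did this where it didn’t seem totally crazy.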

I was going to include for reference here a list of all Hughes IP addresses so it would be easier for folks to find this site if they’re trying to debug this sort of error, but those seem hard to come by. Below the fold is a short list via http://ws.arin.net/whois/. If anyone has a complete list handy, feel free to send it along.
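In the meantime, if you want to check your own logs, a rough sketch like this pulls out the client addresses behind “File does not exist” errors so you can run whois on them. It assumes the log line format shown above and IPv4 addresses only; the function name countMissingFileClients is my own invention.

```javascript
// Count "File does not exist" errors per client IP in an httpd error log.
// Assumes lines shaped like the example earlier in this post.
function countMissingFileClients(logText) {
  var pattern = /\[client ([\d.]+)\] File does not exist:/;
  var counts = {};
  logText.split("\n").forEach(function (line) {
    var match = pattern.exec(line);
    if (match) {
      var ip = match[1];
      counts[ip] = (counts[ip] || 0) + 1;
    }
  });
  return counts;
}

var sampleLog =
  "[Thu Aug 05 14:14:45 2010] [error] [client 67.142.130.26] " +
  "File does not exist: /m/vhost/htdocs/file.swf, referer: http://thomblake.com/";
console.log(countMissingFileClients(sampleLog));
// prints { '67.142.130.26': 1 }
```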

(more…)

Twitter Script

congypsy.com until recently had a single validation error (XHTML 1.0 strict), and I can’t stand for that sort of thing on a site I work on. The source was the “Latest updates on Twitter” section; I was using Twitter’s pregenerated code bits with just a little modification, and their script expects the <ul> it uses to have a particular ID. The problem came in when I had two of them on a page; obviously, you can’t have two objects with the same ID on one page! But to my surprise, everything worked as expected in Firefox, so I left it alone for a while. (Fail)

So then I realized I could just download the script from the Twitter site, modify it slightly, and free myself from the horror of invalid XHTML. UnFail.
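The modification was along these lines. This is only a sketch of the idea, not Twitter’s script: the names makeUpdateRenderer and escapeHtml, the element IDs, and the shape of the updates array (objects with a text property) are all assumptions for illustration.

```javascript
// Instead of one callback that hard-codes a single list ID, build a
// callback per list, each writing into its own uniquely named <ul>.
function makeUpdateRenderer(listId, doc) {
  return function (updates) {
    var items = updates.map(function (update) {
      return "<li>" + escapeHtml(update.text) + "</li>";
    });
    doc.getElementById(listId).innerHTML = items.join("");
  };
}

// Escape characters that would otherwise break the markup.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// In a page, each list would get its own renderer, for example:
//   var renderSidebar = makeUpdateRenderer("sidebar_updates", document);
//   var renderFooter = makeUpdateRenderer("footer_updates", document);
```

With one renderer per list, no two elements need to share an ID, and the page validates.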

Why I Indent with Spaces

As promised, I’m weighing in on another burning issue of 10 years ago:

I recently overheard the utterance “@thomblake No one has any good reason to like \s indent. Either you like \t, or you don’t care. It’s not life-and-death, but \t is clear win” (via isaacschlueter on Twitter). I assented, as I really only started using space-indent because it’s a best practice at my workplace. But on reflection, I really do have good reasons for space-indent, based on the way that I use indentation in code.

(more…)