Sourcing the RSS Drain
Tuesday, November 25, 2003

It seems I've found the smoking gun draining my bandwidth through the RSS feed. Ironically, the airplay I get comes from a Radio Userland blog, which likely learned about it from a blogroll tuned to my feed:

There was a discussion around this topic about a year ago that seemed to lead to a satisfactory solution, but this muddies the waters. Is the problem confined to Drupal-served feeds?

The culprit is definitely not Drupal. That much is clear. Drupal may be strict about the spec, but that strictness isn't even being tested by what is really going awry. The culprit appears to be the vast majority of RSS-reading clients, which use the HTTP GET method instead of sending a HEAD first ... and the biggest offender appears to be Radio Userland! (It may not be them; they are just the most numerous agent fetching this feed, and their numbers approximately match the bytes served.)
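For illustration only, here is a rough Python sketch of what "HEAD first" polling could look like on the aggregator side. The function names `feed_changed` and `head_feed` are made up for this example and are not taken from Radio Userland or any other aggregator; the sketch assumes the server sends an ETag or Last-Modified header that the client can compare against what it saved on the previous fetch.

```python
import http.client
from urllib.parse import urlsplit


def feed_changed(head_headers, cached_etag=None, cached_last_modified=None):
    """Decide from HEAD response headers whether the cached copy looks stale.

    Hypothetical helper: returns True when a full GET seems necessary.
    """
    # Header names are case-insensitive, so normalise before looking them up.
    headers = {name.lower(): value for name, value in head_headers.items()}
    etag = headers.get("etag")
    last_modified = headers.get("last-modified")

    # Prefer the ETag when both sides have one.
    if cached_etag is not None and etag is not None:
        return etag != cached_etag
    # Otherwise fall back to comparing the Last-Modified strings.
    if cached_last_modified is not None and last_modified is not None:
        return last_modified != cached_last_modified
    # Nothing to compare against: assume the feed changed and fetch it.
    return True


def head_feed(url):
    """Send a HEAD request to url and return the response headers as a dict."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    try:
        conn.request("HEAD", target)
        response = conn.getresponse()
        return dict(response.getheaders())
    finally:
        conn.close()
```

A client would call `head_feed(url)`, pass the result to `feed_changed(...)` along with its saved values, and only issue the full GET when that returns True. Note that this costs two round trips whenever the feed has changed.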

UPDATE: drat ... I just refined my test procedure and, sure enough, 304 replies from my feed are devoid of a body, just as specified in RFC 2616. The world is going according to plan.

What about Conditional-GET?

According to RFC 2616:

The semantics of the GET method change to a "conditional GET" if the request message includes an If-Modified-Since, If-Unmodified-Since, If-Match, If-None-Match, or If-Range header field. A conditional GET method requests that the entity be transferred only under the circumstances described by the conditional header field(s). The conditional GET method is intended to reduce unnecessary network usage by allowing cached entities to be refreshed without requiring multiple requests or transferring data already held by the client.

This should mean that these requests are valid and that perhaps it is my webhosting service that is frustrating them? Or perhaps it means that these aggregators just aren't implementing conditional-GET?
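To make the quoted rule concrete, here is a minimal Python sketch of how a server might decide between a full 200 reply and a bodyless 304. It is not Drupal's code or my host's; the function name `conditional_get_status` is invented for the example, and it only handles If-None-Match and If-Modified-Since, with a simple exact-match comparison of entity tags (no weak-validator handling).

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_get_status(request_headers, etag, last_modified):
    """Return 304 if the client's cached copy is still good, else 200.

    Hypothetical helper. etag is the feed's current entity tag (with quotes),
    last_modified is a timezone-aware datetime of the feed's last change.
    """
    # Header names are case-insensitive.
    headers = {name.lower(): value for name, value in request_headers.items()}

    if_none_match = headers.get("if-none-match")
    if if_none_match is not None:
        # The header may carry several comma-separated tags, or "*".
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        if "*" in candidates or etag in candidates:
            return 304
        return 200

    if_modified_since = headers.get("if-modified-since")
    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            # Unparseable date: ignore the condition and send the full feed.
            return 200
        if since.tzinfo is None:
            # Treat a date without a usable zone as UTC (an assumption).
            since = since.replace(tzinfo=timezone.utc)
        if last_modified <= since:
            return 304

    return 200


if __name__ == "__main__":
    feed_time = datetime(2003, 11, 25, 10, 59, tzinfo=timezone.utc)
    request = {"If-Modified-Since": "Tue, 25 Nov 2003 10:59:00 GMT"}
    print(conditional_get_status(request, '"v1"', feed_time))
```

An aggregator that saves the ETag and Last-Modified values from its last fetch and sends them back on the next poll gets the 304 path in a single request. One that sends a plain GET with neither header always falls through to 200 and the full feed.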

No closer to a solution, actually; it is still a total mystery. My feed accounts for 25% of my daily bandwidth, so something is wrong, but whatever that something is, it's apparently not on my server side, and my evidence against the aggregators turns out to be a false alarm.

Submitted by mrG on Tue, 2003-11-25 10:59.

