TechnObituary: Trackback (2002-2005)
Thursday, February 3, 2005

How could we have not seen this coming? Ok, I did see it coming, but thought I'd hold on to the bitter end until the buggers saw yet another big gaping opportunity and then I'd close it up. I just closed it up.

After a long and protracted battle against an advancing cancer of the spam, Trackback is dead, at the age of three. It is survived by comment systems (also in ill-health from a similar cause) and by more reliable and secure cross-threading technosiblings such as the technorati cosmos bookmarklets and the younger meta-threader WhoLinks2Me.

I've had 500 trackback spams in the past 24 hours, and what's most amazing is how few of them there are. Think about it: We have a no-login, no-authentication, automated backdoor posting method that feeds directly into the content stream using only a small readily-automated XML message posted via a readily-available RPC method, and we open this open posting channel to any IP anywhere, to post whatever they like, whenever they like.
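To make the point concrete, here is a minimal sketch of what a ping looks like on the wire. As far as I understand the published TrackBack spec, the ping itself is a plain form-encoded HTTP POST (fields `url`, `title`, `excerpt`, `blog_name`) and only the reply is a scrap of XML; the helper name and the example.com/example.org addresses are made up for illustration, and the request is only built here, never sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_trackback_ping(ping_url, post_url, title="", excerpt="", blog_name=""):
    """Build (but do not send) a TrackBack-style ping request.

    Hypothetical helper: the field names follow the TrackBack spec as I
    understand it; everything else is illustrative.
    """
    fields = {
        "url": post_url,        # the page that claims to link to you
        "title": title,
        "excerpt": excerpt,
        "blog_name": blog_name,
    }
    body = urlencode(fields).encode("utf-8")
    return Request(
        ping_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded; charset=utf-8"},
        method="POST",
    )


req = build_trackback_ping(
    "http://example.com/trackback/123",   # made-up ping endpoint
    "http://example.org/my-post",         # made-up citing post
    title="Hello",
    blog_name="Example Blog",
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Note what is missing from the request: no login, no token, no proof that the citing page even exists. That is the whole problem in a dozen lines.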

And we're surprised when it gets spam?

My guess is, like my long ago post about robot comment spam, this week's bombardment is the work of one person. Every blog I own and every blogowner that I know has reported the exact same trackback spam flooding in over the past three days. The perpetrator is smart enough to pace the posts, so much of it is probably going undetected, but when I have so many blog-like sites on one webhost account, that frequency is quite conspicuous. For the first day, I just banned their IP at the Apache level, and the faucet was closed immediately. Next day, they'd shifted IP and shifted to a few different affiliate site URLs, and also learned to avoid using their keywords in the body of their message.

That's the thing about viral scum. They adapt.

I banned the next IP, they went away. Today, they were back, more affiliate links, more clever avoidance, and on a different IP ... it was clear, they had learned what TB was and how it worked and their business of selling that easy access broadcast channel was picking up new clients daily.

So why don't I just go with the in-crowd and testify my errant trackbacks into the greater pool of sinful patterns and let the future co-operative successors to Jay Allen's blacklistings pick and trim and prune me back into the nice world that was? I mean, besides that meaning putting my hand in the hand of the people who put us into these waters? Because it's time to face the facts: it won't work.

I had already surmised and now I have testimonial evidence that the link-spammers are motivated beyond anything we could inflict as a filtering. We can put up any impediment, and the link-spammers will simply cruise their Jag down to their scenic-view corner-office, push a few keys and escalate. You would too for that kind of money.

That's another thing about viral scum: When they find a nutritious host for their feast, they replicate ... exponentially.

I have removed our trackback module, thereby losing all trackback links on my stories and, I believe, also losing the ability to trackback to other sites (but they will all follow suit very shortly I'm sure). Trackback, so far as the teledyn.com suite of sites is concerned, is dead, gone and forgotten, another story now left only as tales to be told to the grandkids of how the old Net once was, once upon a time.

Submitted by mrG on Thu, 2005-02-03 11:22.



I miss TB already. Like the

I miss TB already. Like the bells that say when an angel gets their wings, it was those little pings that would show up in the margin saying some body loves you or hates you, as the case may be, but it was still a little bit of interactive feedback that stepped beyond the machine, beyond the next machine, out to some other person.

Trackback: TNG

I did mention Technorati cosmos bookmarklets and WhoLinks2Me, but there's a problem with these. It's a small problem, a solvable problem, but a problem where the solution is going to require someone with their sorts of resources to also grok the powers of webservices to do something really very similar to trackback to restore this line downed by the link-spam scum.

We could call it, Trackback, the Next Generation, but we won't.

Here's the model: QuickTopic.com -- this is my proof that it's doable.

  • the problem with technorati bookmarks or other cosmos-report blog-concordance services as they sit today is that they are out there, and not here, not inline, in the page where I am looking. Why do I want to know? This is the real crux of the issue: I want to know people are linking to the post I am reading because it tells me if there is a discussion going on. I don't want to walk the hallways opening rooms to see if there happens to be a discussion, I want to be told there is a discussion, how big it is, and then decide to go to that room -- sometimes just knowing there is a discussion and how big (or small) is all the additional metadata I need for my blog-reading purposes.
  • QuickTopics provides some javascript for every forum opened; this javascript is to be put inline into the subject article, and it will print itself as a link to the external forum thread and it reports on how many items are currently in the forum thread. It's real-time, computed for every page-read of the subject page, and it means the subject page might be slowed down in the rendering to wait for that external response, but nonetheless this is exactly the behaviour we want from TB:TNG
  • Trackback was already 'webservices' based, a simple aside call to some other site, but it was done at post time; TB:TNG must be done at read time, and that makes the efficiency and reliability of the server a bit of a problem. The late TB was also gloriously distributed with no central point of failure, and that's something to consider too. Centralized services are always doomed.
  • The late TB was killed by sociopaths. How do we protect against TB:TNG abuse? Because the concordance is computed by blog spidering done by one (or more?) of blogdex, technorati, daypop etc, link-spam cannot be blind-robot mass-injected, the link must actually exist somewhere. Spammers could, of course, create new sites that simply spider blogs and then link to all the links they find thereby polluting the cosmos-reports, but those pages would have to exist, on some server; perhaps we could retaliate (they will mask their IP, hole up with sympathetic webhosts but they will have to still serve non-infinite bandwidth) or we might blacklist/filter (back to square-one, always crashing in the same car?) or maybe, could there be some other means to contain/exclude them?
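The read-time idea in the bullets above can be sketched roughly like this: the page asks a link-concordance service how many inbound links the post has, but caches the answer for a few minutes so a slow or dead service doesn't stall every page-read. This is only a sketch under assumptions; `InboundLinkCounter` and the `fetch_count` callable are hypothetical names, and no real Technorati, Blogdex or QuickTopic API is being described.

```python
import time


class InboundLinkCounter:
    """Read-time 'how big is the discussion' counter with a small TTL cache.

    fetch_count is a caller-supplied function (hypothetical) that asks some
    external concordance service for the number of inbound links to a URL.
    """

    def __init__(self, fetch_count, ttl_seconds=300, clock=time.monotonic):
        self._fetch_count = fetch_count
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache = {}  # post_url -> (expires_at, count)

    def count_for(self, post_url):
        now = self._clock()
        cached = self._cache.get(post_url)
        if cached and cached[0] > now:
            return cached[1]              # still fresh, skip the remote call
        try:
            count = self._fetch_count(post_url)
        except Exception:
            # Service is down or slow: fall back to the stale value if any.
            return cached[1] if cached else None
        self._cache[post_url] = (now + self._ttl, count)
        return count

    def render(self, post_url):
        count = self.count_for(post_url)
        if count is None:
            return "discussion: unavailable"
        return f"discussion: {count} inbound link(s)"


calls = []


def fake_fetch(url):
    """Stand-in for the remote service; records each call it receives."""
    calls.append(url)
    return 7


counter = InboundLinkCounter(fake_fetch, ttl_seconds=60)
print(counter.render("http://example.org/post"))
print(counter.render("http://example.org/post"))  # second read hits the cache
print("remote calls made:", len(calls))
```

The cache softens the read-time cost, but it does nothing about the central point of failure or about polluted cosmos-reports; those worries from the last bullet stand.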

So what's the answer? I don't know. I started out this post pretty sure, but now, having thought about it? The above bullet-points do describe where TB has to go if it is to be re-instated, but that last item worries me, maybe even kills the whole idea before we get out the gate.

So maybe I don't know about resurrecting TB, but I do know this: I miss TB, and that little bell that rings whenever an angel gets their wings.

More bad news, discovered today

More bad news, discovered today on my own technical aggregator site, where the combination of automated aggregators rebroadcasting the automated Trackback (tbspam.png) on the popular Social Software page at TopicExchange proves deadly, and likely spells the death knell for the TE service -- while a trip to the page as it is now shows that someone has been quick on the draw to delete the offending remark, and I expect they are mulling right now over various Bayesian misfortunes.

The thing is, the damage is already done.

Post-hoc don't cut it with RSS. Already, the instant it was posted, like my own, there are now thousands of replications of that initial hair-trigger RSS broadcast

which maybe brings up a problem with RSS in that it's yet another blind-faith trust channel we all leave wide open without so much as a cuss-word filtre

net result: although Drupal filters comments and trackbacks (which are easy to defeat) it tacitly trusts RSS, and TopicExchange is the Loose Lip that Sinks us. Sure enough, now that I know about it, a quick check shows that this current shot isn't the first, and ironically there was one floating about my other site almost at the same moment that I posted the item that starts this thread.

And once they are in your system, they are in your system. My site's sidebars now all carry the item, my aggregator pages with that cross-sectional categorization all carry it. Perfectly integrated, walked right in the unguarded RSS door and sat down like it belonged there.

Spammers: 1
WebFolks: 0

Great googly-mooglies there was

Great googly-mooglies there was a lot of crap in there! Just a few of the common keywords put through the SQL to the aggregator item titles, links and descriptions, and be damned if'n it didn't turn up hundreds of the slimy buggers. I'll bet there's more too, but I'm just going to close the TopicExchange feeds and let them sluice out the pipe.
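For anyone wanting to run the same sweep, the query was roughly of this shape. This is a sketch only: the `feed_items` table and its columns are a made-up stand-in (check your own aggregator's schema before trusting it), and the sample rows are invented for the demo.

```python
import sqlite3


def find_suspect_items(conn, keywords):
    """Return (id, title) rows whose title, link or description mentions a keyword.

    Assumes a hypothetical table feed_items(id, title, link, description).
    Keywords are passed as bound parameters, never pasted into the SQL.
    """
    clauses, params = [], []
    for kw in keywords:
        pattern = f"%{kw.lower()}%"
        clauses.append(
            "(lower(title) LIKE ? OR lower(link) LIKE ? OR lower(description) LIKE ?)"
        )
        params.extend([pattern] * 3)
    if not clauses:
        return []
    sql = "SELECT id, title FROM feed_items WHERE " + " OR ".join(clauses) + " ORDER BY id"
    return conn.execute(sql, params).fetchall()


# In-memory demo data, invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE feed_items (id INTEGER PRIMARY KEY, title TEXT, link TEXT, description TEXT)"
)
conn.executemany(
    "INSERT INTO feed_items (title, link, description) VALUES (?, ?, ?)",
    [
        ("Social software roundup", "http://example.org/a", "notes on wikis"),
        ("Great deals", "http://example.net/poker-online", "play now"),
        ("Hello", "http://example.com/b", "cheap PILLS here"),
    ],
)
suspects = find_suspect_items(conn, ["poker", "pills"])
print(suspects)
```

It only finds what you already know to look for, which is exactly why I'd bet there's more in there.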

Bye-bye, TE. It was really nice, it really was. And who knows, maybe someday ...

More trouble in blog city: Turns

More trouble in blog city: Turns out, it only takes four inbound links to get you placed into the 22nd spot on Blogdex these days, and just 7 to get into the top-10 -- and once you do, you've successfully cracked into yet another direct RSS channel bound for auto-sidebar space on a host of blogs.

At the very least, I should think Blogdex might discount links coming from the same base domain name. Hard to imagine any useful scenario where such links would have any semantic meaning of importance -- except maybe Salon/Typepad bloggers citing each other.
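A rough sketch of that discounting, assuming the ranking service already has the list of inbound link URLs for an item: count distinct base domains instead of raw links. The "last two labels" rule for finding a base domain is a naive assumption of mine (it mishandles things like .co.uk), and it would also lump together all those Salon/Typepad bloggers, so treat it as an illustration and not how Blogdex does or should do it.

```python
from urllib.parse import urlparse


def base_domain(url):
    """Naive base domain: the last two labels of the hostname (an assumption)."""
    host = (urlparse(url).hostname or "").lower()
    labels = host.split(".")
    return ".".join(labels[-2:]) if len(labels) >= 2 else host


def discounted_link_count(inbound_urls):
    """Count inbound links once per base domain instead of once per link."""
    domains = {base_domain(u) for u in inbound_urls}
    domains.discard("")  # ignore URLs with no usable hostname
    return len(domains)


links = [
    "http://a.spamfarm.example/1",
    "http://b.spamfarm.example/2",
    "http://c.spamfarm.example/3",
    "http://blog.example.org/post",
]
print(len(links), "raw links,", discounted_link_count(links), "after discounting")
```

With the four-links-gets-you-ranked threshold, the three same-domain links here would count as one, and the item stays out of the RSS feed.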

A manual cosmos tracking tells

A manual cosmos tracking tells us HNBC also throws in the towel on the trackbacks:

Seeing as how I want to avoid the drudgery of installing additional filtering, throttling, moderation, and other hackage, and since it's only once in a blue moon that I get an actual trackback ping, I've opted to go the path of least resistance and turn off trackback -- utterly.

The post includes a good summary of others who've come to much the same conclusion. Meanwhile SEB is still hoping the unfunded amateurs can outrun the Jaguars...

This can be tedious, though, and a lot of folks understandably don’t want to be bothered with it. Most folks value comments enough to work at keeping those clear of spam, but trackbacks aren’t considered as important so the trend appears to be to just abandon them rather than fight with the spammers.

You gotta know when to hold 'em, know when to fold 'em. For my own money, yes, I loved the little bell of the TB and the Cosmos thing is also tedious, but I have other things to do with my time than trying to outguess the latest spam content. From what's happening over on the email channel, we can already see how the only reaction to even the cleverest filters is simply an explosion in more creative v4r!4ti0n5 on how to spell a certain erection drug.

FWIW, back in 1995 when we were putting up the Ontario Science Centre's internet cafe, there was a call for blacklisting porn sites through the proxy. I pointed out then that it was possible, yes, but would require a full-time staff of two or three to keep track of and implement the filters for it all, but added that the resulting list might have commercial value on the blackmarket ...

SEB also notes how 6Apart have been curiously silent on the whole tb-spam debate. Best they've offered is extending the same filters used for comments, and we already know how effective that was.

And finally, filed under Misery loves Company, After Gutenburg assures us we are not alone, and that all blog platforms have woken up to the realities of tb in the Real World -- I haven't heard anything from the Drupal crews on any proposed solutions, but just so you know, yes, Drupal already applies bayesian+regex comment/content spam filters to its trackbacks and no, it doesn't really work because the filtering isn't applied to name links and the body-filter isn't as clever as the spammers (although it is much easier to delete spam in Drupal as compared to MT, but you need to manually fold in the special fast-delete patch for the spam module) -- and if you already run TB in drupal, you don't need a manual to quit the game; turning it off is as simple as disengaging the module.

Another manual-mode pseudo-tb

Another manual-mode pseudo-tb replacement, for those with the hardware that can run Firefox (I can't, not enough RAM, so I'm stuck in Mozilla 1.7) there's apparently a sidebar About... page that brings together all sorts of web-refs on the current URL, including the Technorati Cosmos and the Internet Archive page-history.

The Shape of TB to Come

First inklings stir of the Shape of TB to Come ... I have a process now, not perfect, takes a while to anneal to stable but it's working, secure, accountable and quasi-reliable, where I can get that instant gratifying feedback that we bloggers so enjoy, and have a CC on the ego massage arrive via cellphone.

Nothin' quite compares to looking up from the tiny screen and telling the next person in the cash-out line, "I've just been blogged."

yeah, right, ok bud. can you move it? your teller's waiting ...

Technically pretty simple to do, but operationally there's some puzzles left to solve, learning left to happen and some new habits to hone.

  • the cosmos discovery is patchy and sporadic; this morning I received notifications for citations made months ago, only just discovered. This tells me the coverage is more complete than I'd expected, but also tells me the alerts may not be precisely real-time
  • not sure if it's good or bad, but I can't (yet) directly share these citations with you. This may be a Good Thing because
    1. you don't get bothered with useless or repetitive repetitive repetitive TB pings as were so common back in MT MT MT days. If someone rebuilds their pages or repings my page, I'll get the phone alert, but otherwise no harm done
    2. fewer false positives would mean the TB count per article might actually mean something.
  • for now, the process to promote these pings to published is manual, so the TB is automatically gated.

That latter point is probably the inescapable future for all TB; it's maybe acceptable collateral damage to have the spam trackbacks flood the author (up to a point of a few dozen a day maybe) but unacceptable to have that same level of damage auto-promoted to public view, and as we see with the easy defeat of the Bayesians by simple random anecdote-prose noise, there's no robot filter that's going to do the task.

The parallel from the Real World is probably like those, egad, three full freakin' pages? of one-liner critical reviews pumping the pre-intro of William Gibson's new book (can you say 'media whore'?) -- anyone can and will say anything they like, even irrelevant or self-serving like a link-spamming, and the author/publisher will, by this new method I've got going now, get a clipping of each and every one, but it's only a raw material, a pool from which to draw what we will.

So if a trackback is good (or delectably bad) we might lift the lanyard and let it pass, and hold off our approval stamps for anything remotely spammish. Unlike the honesty, robotic as it was, of the auto-trackback, it is true this new gated and edited version is deliberately and selectively re-inforcing our own statement, like Gibson's publishers, in the service of our making that rhetorical 'sale' and you're right to lament that loss of stark reality -- for those rightfully suspect of any blog-author's trackback pruning impartiality, I suppose there will always be a Technorati or Bloglines to show up the full gory list ... although no doubt, as the scourge of TB spam overtakes them, even these sites will grow to be as selectively deliberate as we're seeing on the TopicExchange, but presumably, being apart from any vested interest in the discussions, we'd hope they'll be less biased to the actual politics of whatever it is being discussed in the cross-blog thread.

Still more bad news, this one

Still more bad news, this one is for those of you who think, "So what if a little spam seeps into my website ..." and as they say, the picture is worth more than words:

(image: googlespam.jpg)

I'm not saying you should click-through the buggers into bankruptcy, but it is true that the payouts are what keep many blogs going. I'll leave any conclusions there as an exercise for the reader ...

White is the new Black

White is the new Black -- there is one other way to resuscitate TrackBack, but it's neither easy nor likely to find its way into widespread use because it requires a fundamental re-engineering of every TB-enabled website (although it could interoperate with un-fixed TB). The key is to flip the access rule, and switch our approach from Blacklisting to Whitelisting.

Here's how it works:

  1. When a blog posts about another blog, the same auto-discovery occurs to extract the XML/RDF instructions for posting the TB notice.
  2. On the receiving site, the incoming TB is identified by the source address (probably IP)
    • if the IP is known to be trustworthy, the post appears, no problem.
    • if the IP is unknown, the new item goes into a holding bin awaiting approval, thereafter whitelisting the IP for future comments.
    • if the post is a known spammer, according to the patterns popularly used in the current fleet of filters, the post is silently accepted and discarded, although it's much easier on your server if that source IP is added to the .htaccess DENY rules (impractical for many IPs using dialup pools)
  3. Whitelisting could even offer shades of permissions, allowing links and HTML for some, only text for others
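The triage in step 2 can be sketched as below. It is a minimal, in-memory illustration under assumptions: `TrackbackGate` and its method names are hypothetical, the ping is taken to be a plain dict of the usual TB fields, and the spam patterns are whatever regexes your current filter already carries. The shaded permissions of step 3 and the .htaccess DENY bookkeeping are left out.

```python
import re
from dataclasses import dataclass, field


@dataclass
class TrackbackGate:
    """Whitelist-first triage for incoming pings (hypothetical sketch)."""

    spam_patterns: list                      # regex strings from your existing filters
    trusted_ips: set = field(default_factory=set)
    held: list = field(default_factory=list)        # (source_ip, ping) awaiting approval
    published: list = field(default_factory=list)

    def receive(self, source_ip, ping):
        # Known-good source: the post appears, no problem.
        if source_ip in self.trusted_ips:
            self.published.append(ping)
            return "published"
        text = " ".join(
            str(ping.get(k, "")) for k in ("url", "title", "excerpt", "blog_name")
        )
        # Matches a known spam pattern: accept silently and throw it away.
        if any(re.search(p, text, re.IGNORECASE) for p in self.spam_patterns):
            return "discarded"
        # Unknown source: park it in the holding bin.
        self.held.append((source_ip, ping))
        return "held"

    def approve(self, index):
        """Publish a held ping and whitelist its source IP from now on."""
        source_ip, ping = self.held.pop(index)
        self.trusted_ips.add(source_ip)
        self.published.append(ping)
        return ping


gate = TrackbackGate(spam_patterns=[r"\bpoker\b"])
first = gate.receive("192.0.2.10", {"url": "http://example.org/p1", "title": "Re: Trackback"})
gate.approve(0)
second = gate.receive("192.0.2.10", {"url": "http://example.org/p2", "title": "More on TB"})
third = gate.receive("198.51.100.7", {"url": "http://example.net/x", "title": "online poker"})
print(first, second, third)
```

The first ping from a new source costs one manual approval; everything after that from the same source goes straight through, which is the whole maintenance argument for the Whitelist.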

I'm going to assert that the general case for Trackbacks is the same as I observe on the blogs I frequent, where the same few people tend to contribute many trackbacks, with only a minority of posts from new unknown sources; unlike Email, where the incoming remarks are more often from previously unseen and unfamiliar potential-contact correspondents, the TrackBack case is far better suited to the Whitelist paradigm -- we are already faced with the nontrivial routine task of updating our Blacklists as it stands now, and as I see on TopicExchange, post hoc disqualifications are unacceptable because the RSS has already gone out spamming the subscriber sites.

With the Whitelist paradigm, your blog maintenance reduces to one-time approvals for each Trackback source, and I'll wager that's a far lighter burden.

Interesting ... quite remarkable

Interesting ... quite remarkable really, how little traction I've had with this post and the subsequent thread of footnotes. Maybe it's a Chomsky-Nader Effect that says you only get to blow the whistle once and thereafter they all just say, "Oh, but that's just him, y'know ..." and move along folks, nothin' to see here, you've all seen technology run off the rails before, you know the drill, nothin' to see here folks, just move along ...

Ah, nostalgia ... I remember those conversations, "Whaddaya mean 'comment spam'?" and "Oh, but BAYESIAN filters will catch all that ..." -- aye thems were the days of our innocent youth, whiled away in thoughtful posting into the wind of this or that of whatever our fancies'd.

If it's any matter for the consideration set, I still think the days of RSS are numbered, at least unless the ship turns around, which don't seem included in the venn-space of the possible futures, but then, maybe what this all says, apart from it all saying what our Internet's co-founder said about the friendly among-friends colleagular society of the early net, is that maybe there really isn't a vector of expansion taking this 'Net thing into all spaces. The laissez-faire of the status quo isn't our cruising towards a Close Encounter with Catastrophe Theory, it is only a sign of an inevitable and possibly impending fragmentation!

breaking the net apart

Don't laugh, it's already happening. Every day I hear tell of new dark-galaxies of the net that I haven't the foggiest clue how to find let alone how to cobble the connection, and I doubt I'd be welcome if I did. My kids have theirs as do my parents, my colleagues have theirs, my co-workers and neighbours have their own and I dare-say I have mine that wouldn't interest you; could it be possible that rather than our bridging these bubbles as was blog-discussed so much some time ago the real and actual trend is to carve up the expanding space and fortify the boundaries?

back to 'trust'

I never understood why Forums were so popular; awkward web-stations you had to awkwardly tune in and navigate with the most clumsy of interfaces made worse being rendered in the laughable edit-box of the 1995 browser ... and yet there they were, and still are, flourishing in their isolated ecosystems. Why there and not a pollutant-purged USENET?

It never made sense to me, back then.

Maybe it does now: The networking trend is not to progressively include, not to ever more inter-networking; the trend is to identify, isolate, thread and exclude the rest! The trend is a return to the median conversion reaction to staggering size with progressive divisions back to the manageable. It's as Bucky Fuller said of any closest-packed sphere-event topology, you start with one (being at minimum two) which is four (being one plus one), but all that aside, you pack other tetral minimum-tunable events round it and find that twelve fit nicely, and forty-two around those, and ninety-two around those.
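For the record, the twelve, forty-two, ninety-two run is Fuller's closest-packing shell count, which (as I understand it) follows 10 × layer² + 2. A tiny sketch to check the numbers, with a function name of my own invention:

```python
def shell_size(layer):
    """Spheres in the given closest-packed shell around one nucleus: 10*n^2 + 2."""
    if layer < 1:
        raise ValueError("layer must be 1 or greater")
    return 10 * layer * layer + 2


sizes = [shell_size(n) for n in range(1, 5)]
print(sizes)
```

The first three shells give the 12, 42 and 92 quoted above, and the fourth comes out at 162.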

But on the subsequent fourth layer a strange thing happens, we discover our first-envelope layer 12 sphere-events are now themselves each and every last man jack of them surrounded by a smooth complete layer of twelve, having come into their own, as it were, a dozen new focal hubs.

Failure of the linked-up

This is why the Artificial Social Network sites fail! This is the fatal flaw in the Linked-Incademrizer strategies that seek only to expand the rolodex of the traveling spam salesman; my own collection of business cards here on my desk is not a flat address space, it is partitioned!

The motivational tapes are wrong: My network isn't "larger than I imagine" -- I belong to many networks. And each of them small and manageable, trustworthy, safe, a colleagulum of workable but walled and gated, communities of trusted connections.

The network is therefore wrongly portrayed string-mandala like a 70's wall decoration, one-dimensional like a perceptron network -- the Net should be portrayed as an omni-directional grid of close-packed participants and as such, the network progressively divides ... by multiplying.

Out goes powerlaw, out goes every issue of large-scale connectionisms. Google fills with noise -- It's not an ever expanding network, it's an ever expanding loose-coupled aggregate of locally distinct newly self-discovered tight-coupled networks.

And that is a very different sort of animal.

Hmmm ... sounds like fodder for a whole new blog thread I can pitch before the Chomsky-Nader Effect ...

Only 6 months later, and now

Only 6 months later, and now the blogdex is abuzz with the latest TB defector, Jeremy Zawodny comes to his senses citing several others all basically saying what was said here last February ...

I'm convinced now more than ever. But I'll spare you all long rant about why it's dead, since others have already written this for me
[ Trackback is Dead ]

and I notice I'm not one of the three he cites, but hey, that's OK, I understand, can't read everything, y'know. I also notice this old item of mine hasn't been detectably cited through the new IceRocket blog citations service; to salve my ego a bit, I do note there are three citations found on Blogdigger, all of them contemporary to the original post, and who knows, maybe Technorati has more except they are collapsing under request pressure at the moment ...

Seems the traction hasn't improved

Seems the traction hasn't improved a whole lot. I finally got a signal and Technorati says 1 site links here, but shows seven links, all but one from way back last year.

Ok, hindsight is 20-20, 50% of the time, and maybe it wasn't such a great idea to expose the Emperor's very favourite Trackback Gown as the branch-point for the thesis, but I thought it might turn a head my way, not least because it is very true. Ah, but in the blogs today we have Mena calling for 'civility' in not diss'ing things or people just because they could use a diss, but be polite, dare say Victorian, keep on the sunny side, always on the sunny side, and that's a nice sentiment and well meant in her context of backstabbers at her conference of the in-crowd fraternity of bloggerites.

I would have thought at least one of the sharper industrial engineering types or even a keen paleoanthropologist or two might have clicked to my side-story embedded revelation of our Internet as a spontaneously fragmenting system of progressive societal dis-integration, that the loosely fitting broken subchannel pieces spontaneously move to fix themselves by fitting ever more loosely.

But no, as I write this here today, Yahoo! has just bought del.icio.us to fold in with Flickr, pulling more pieces together, centralizing services and bridging more bits, oblivious to the impending structurally inevitable close-packed namespace collapse that has plagued such all together now projects since Babylonian times
