Keith Devens .com

Archive: August 12, 2003

Tuesday, August 12, 2003

Namespaces don't "make a damn bit of difference"

This is awesome. Tim Bray echoes exactly what I've been saying about Namespaces.

Right now, in the context of the Pie/Echo/Atom/whatever project, people assert that crystallizing the meaning of embedded namespaces is the key to interoperability, the central problem, and so on. Huh? When someone proposes markup from another namespace for inclusion in a syndication feed, there are three possible outcomes:

1. Nobody pays attention and it isn't much adopted.
2. It gets widely adopted, with semantics along the lines originally proposed.
3. It gets widely adopted, with some semantic drift away from the original proposal becoming evident in the implementations. (Note that this has already happened with some RSS 2.0 markup).

Oddly enough, this is exactly what will happen with proposed tags and attributes that aren't in a different namespace.

Earlier he makes the point that I've made repeatedly, that the meaning of something is only present in someone's mind, not in some text labels inside angle brackets. And it's great to see him make exactly the same point I've made, that the evolution of markup proceeds the same regardless of whether tags are segmented into different namespaces.

At the end of the day, markup is just a bunch of labels. We should be grateful that XML makes them (somewhat) human-readable and internationalized, and try to write down what we want them to mean as clearly and cleanly as we can, with a view to the needs of the downstream implementors and users.

But we shouldn't try to kid ourselves that meaning is inherent in those pointy brackets, and we really shouldn't pretend that namespaces make a damn bit of difference.

Software has to be written to deal with the markup, and putting something in a namespace doesn't do anything to make writing the software easier or more automatic, or encapsulate any more meaning in the document than is present in the community understanding of a given tag and codified in source code meant to deal with those tags. In addition, besides being useless "semantic bloat", namespaces just plain make XML harder to use.

What it comes down to is that if you normalize every tag, say, in the Dublin Core namespace and change how it appears in a document from "dc:tagname" to "http://purl.org/dc/elements/1.1/:tagname", the "namespace" goes away. It's all just strings that you're trying to keep unique.
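To make that concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree. The two sample documents and the prefix "dublin" are made up for illustration. Note that ElementTree writes the expanded name as "{uri}tagname" rather than the "uri:tagname" form used above, but the idea is the same: once the prefix is expanded, all that is left is one long string.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"

# Two hypothetical documents that bind different prefixes to the same URI.
doc_a = f'<item xmlns:dc="{DC}"><dc:creator>Keith</dc:creator></item>'
doc_b = f'<item xmlns:dublin="{DC}"><dublin:creator>Keith</dublin:creator></item>'


def expanded_names(xml_text):
    """Return the tag of every element exactly as the parser reports it."""
    return [element.tag for element in ET.fromstring(xml_text).iter()]


names_a = expanded_names(doc_a)
names_b = expanded_names(doc_b)

# The prefix is gone after parsing; only the expanded string remains.
print(names_a)
print(names_a == names_b)
```

Software that wants to handle the creator element still has to compare against that one string and know what to do with it, which is the same work it would do for a tag with no namespace at all.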

Dave: "Namespaces create elements with names with colons in them." - exactly!

Filters that fight back

Paul Graham has a new article out, Filters that Fight Back. In the first part of the article he basically does a "State of the Union" on spam, focusing on the techniques spammers use to try to get around spam filters, and concludes that none of them are really effective at concealing the fact that a message is spam. None of this was new to me, since I've been using POPFile [1] for a while and have been watching what goes on, but it's a great summary with actual statistics :)

The focus of the article is this:

As I mentioned in Will Filters Kill Spam?, following all the urls in a spam would have an amusing side-effect. If popular email clients did this in order to filter spam, the spammer's servers would take a serious pounding. The more I think about this, the better an idea it seems. This isn't just amusing; it would be hard to imagine a more perfectly targeted counterattack on spammers.

So I'd like to suggest an additional feature to those working on spam filters: a "punish" mode which, if turned on, would retrieve whatever's at the end of every url in a suspected spam n times, where n could be set by the user.

If widely used, auto-retrieving spam filters would make the email system rebound. The huge volume of the spam, which has so far worked in the spammer's favor, will now work against him, like a branch snapping back in his face. Auto-retrieving spam filters will drive the spammer's costs up, and his sales down: his bandwidth usage will go through the roof, and his servers will grind to a halt under the load, which will make them unavailable to the people who would have responded to the spam.

The whole point of spam fighting is to raise spammers' costs until it is no longer economically viable for anyone to spam. Unless you do that, spam will continue. This sounds like it might be an effective way to punish spammers, but I'm a little concerned about the potential for abuse. We'll see what happens...
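For a rough feel of what a "punish" mode would have to do, here is a hedged sketch of the bookkeeping only. It is not Graham's implementation: the function name, the regular expression, the sample message, and the choice to count each distinct URL once are all assumptions of mine. Given the potential for abuse, it is a dry run that plans retrievals and never touches the network.

```python
import re

# Deliberately simple URL matcher (an assumption; real mail needs MIME and HTML parsing).
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def plan_retrievals(message_text, n):
    """Map each distinct URL found in the message to the number of planned fetches."""
    urls = set(URL_PATTERN.findall(message_text))
    return {url: n for url in urls}


# Hypothetical suspected-spam body; example domains only.
sample = (
    "Buy now at http://spam.example/offer or http://spam.example/offer "
    "and https://other.example/x"
)

plan = plan_retrievals(sample, 3)
total_requests = sum(plan.values())

for url, count in sorted(plan.items()):
    print(f"would fetch {url} {count} times")
print("total planned requests:", total_requests)
```

Multiply the total by the number of people who receive the same message and you can see both why the spammer's servers would take a pounding and why a forged "spam" pointing at an innocent site is worrying.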

What's great, however, is that we're beginning to win the war on spam, even if you don't realize it yet. Spammers are changing their behavior, which means Bayesian filtering is having an effect. If you're not using a Bayesian filter yet, why not? Graham has a huge list of them for you to check out, as well as a link to an article comparing POPFile and SpamBayes. Finally, check out this article about how SpamBayes is about to be rolled out at Cornell. The more widely installed anti-spam software is, the more we all benefit. Go Cornell.

Footnotes:
[1]: I'm now at 99.2% accuracy, though POPFile chokes on some messages containing Chinese, and I have to change my account settings to bypass POPFile to get my mail. I'm not sure whether that can be fixed easily, or whether this is a problem that Perl (or just the version of Perl POPFile ships with) has dealing with Unicode, for example.

Anti gun-rights people

This is what I don't understand about people who are anti gun-rights... maybe someone can tell me.

When in an argument with someone who's for gun control, against individuals owning and being able to carry guns, etc., I want to ask "So you're against people being able to defend themselves?"

If you're pro gun-control, how would you answer?
