Keith Devens .com

Saturday, July 5, 2008 Flag waving
The direct pursuit of happiness is a recipe for an unhappy life. – Donald Campbell

Archive: November 26, 2003

← November 25, 2003 | November 27, 2003 →

Wednesday, November 26, 2003

Pi Calculus

Manageability: Quick Take on the Pi Calculus

One of the best things about Bayesian spam filtering

One of the best things about Bayesian spam filtering is that any domain that's ever used in a spam e-mail immediately gets a high spam probability. So, once a domain is used, it becomes essentially useless (really, a hindrance) to the spammer. That's one of the ways in which spam filtering helps increase spammers' costs. Of course, they can try just using the IP address, but IP address octets wind up getting pretty high spam ratings themselves. Plus, the URL of any page they ever use on their site gets a high spam rating, so they have to keep creating new URLs, not to mention, of course, that their message still has to pass a spam filter's muster. I love it when I see messages that are mangled beyond recognition in an effort to avoid spam detection that still rate like a 97% spam. All of this increases their cost of doing business.
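
To make the "one burned domain sinks the message" effect concrete, here's a rough sketch of how a naive-Bayes-style filter might score tokens and combine them. It isn't taken from any particular filter: the function names, the 0.4 prior for unseen tokens, and the 0.01/0.99 clamps are all assumptions chosen for illustration.

```python
from math import prod

def token_spam_probability(spam_count, ham_count, n_spam, n_ham):
    """Estimate P(spam | token) from how often the token shows up in each corpus."""
    spam_freq = spam_count / max(n_spam, 1)
    ham_freq = ham_count / max(n_ham, 1)
    if spam_freq + ham_freq == 0:
        return 0.4  # never-seen token: assumed prior, slightly innocent
    p = spam_freq / (spam_freq + ham_freq)
    return min(0.99, max(0.01, p))  # clamp so no single token is absolute

def combined_spam_probability(token_probs):
    """Combine per-token probabilities, treating tokens as independent."""
    spam = prod(token_probs)
    ham = prod(1 - p for p in token_probs)
    return spam / (spam + ham)

# A domain seen in 5 spam messages and no legitimate ones gets pinned at the cap.
domain_p = token_spam_probability(5, 0, n_spam=100, n_ham=100)
# Even next to two fairly innocent-looking tokens, the message still scores high.
score = combined_spam_probability([domain_p, 0.2, 0.2])
```

With these made-up numbers the domain token alone drags an otherwise innocent message well past 0.8, which is why a spammer can't keep reusing it.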

Everyone gets real doomsayer-ey real fast. Spam isn't that old, neither is e-mail, and neither are the methods for dealing with spam. It all will get better and work itself out over time.

Cox & Forkum are very talented

Cox & Forkum are very talented. I really enjoy their work.

Spidering Hacks, or Template::Extract

O'Reilly Network: Spidering Hacks

From the "Holy crap, I never knew this existed, but it's awesome!" department, check out Template::Extract:

One day, I was fiddling about with the Template Toolkit (http://www.template-toolkit.com/) and it dawned on me that all these sites were, at some level, generated with some templating engine. The Template Toolkit takes a template and some data and produces HTML output.

Okay, you might think, very interesting, but how does this relate to scraping web pages for RSS? Well, we know what the HTML looks like, and we can make a reasonable guess at what the template ought to look like, but we want only the data. If only I could apply the Template Toolkit backward somehow. Taking HTML output and a template that could conceivably generate the output, I could retrieve the original data structure and, from then on, generating RSS from the data structure would be a piece of cake.

Like most brilliant ideas, this is hardly original, and an equally brilliant man named Autrijus Tang not only had the idea a long time before me, but—and this is the hard part—actually worked out how to implement it. His Template::Extract Perl module (http://search.cpan.org/author/AUTRIJUS/Template-Extract/) does precisely this: extract a data structure from its template and output.

This tip was an excerpt from O'Reilly's new book, Spidering Hacks by Kevin Hemenway, author of AmphetaDesk, and Tara Calishain (of ResearchBuzz fame!)

Also check out Autrijus Tang's Template::Generate, which completes the trifecta of Perl template parsing tools:

 Template:           ($template + $data) ==> $document   # normal
 Template::Extract:  ($document + $template) ==> $data   # tricky
 Template::Generate: ($data + $document) ==> $template   # very tricky
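
For a feel of what the "tricky" direction involves, here's a toy sketch in Python rather than Perl. It is not how Template::Extract is implemented and doesn't use its API: the extract function is hypothetical, and it only handles flat [% name %] placeholders by turning the template into a regular expression.

```python
import re

def extract(template, document):
    """Recover {name: value} from a document rendered from a simple template.

    Only flat [% name %] placeholders are handled, and each name may
    appear only once in the template.
    """
    # re.split with a capture group alternates: literal, name, literal, ...
    parts = re.split(r"\[%\s*(\w+)\s*%\]", template)
    pattern = ""
    for i, part in enumerate(parts):
        if i % 2 == 0:
            pattern += re.escape(part)          # literal template text
        else:
            pattern += f"(?P<{part}>.*?)"       # placeholder becomes a named group
    match = re.fullmatch(pattern, document, flags=re.DOTALL)
    return match.groupdict() if match else None

template = "<h1>[% title %]</h1><p>[% body %]</p>"
document = "<h1>Hello</h1><p>World</p>"
data = extract(template, document)
```

Here data comes back as a dict with "title" and "body" keys, and a document that doesn't fit the template yields None.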

You've got to be kidding me

Super Mario Brothers 3... in 11 minutes

Unbelievable: http://www.lowqualitycomics.com/video/moSMB3.wmv

Mirrored here in the short term to save these nice folks some bandwidth: http://keithdevens.com/files/moSMB3.wmv

Update: Here's a fun review of the video, with another download location:
http://www.people.virginia.edu/~scy8y/moSMB3.wmv

Update: That video's down. Here are some more mirrors. Please try these first as I'm getting close to exceeding my bandwidth for the month:

http://home.megapass.co.kr/~kys706700/moSMB3.wmv
http://www.lowqualitycomics.com/video/moSMB3.wmv (yes, same as above)
http://home.megapass.co.kr/~ipoet76/moSMB3.wmv
http://www.morecooler.com/plugins/uploads/moSMB3.wmv
http://club.nate.com/cfiles/952/cgipeople/moSMB3.wmv
http://digitalinkart.rmrlabz.com/attachments/1069801732-moSMB3.wmv
http://ebaumsworld.com/smb3beat.html
http://mywebpages.comcast.net/80mike/mario3.wmv

(more coming as I find them)

Update (Jan 23, '04):
http://www.ebaumsworld.com/smb3beat-r.html
http://www.albinoblacksheep.com/video/smb3.php -- streaming version!


I just came across the Metroid Prime speed run. You may be interested in that too.

Update: Oops, too much bandwidth. I've removed the download from my site. You'll have to use one of the others.

Update: Seems there's also a similar Super Mario Brothers 1 video (Mario Brothers.avi). It's 100 megs... I'm downloading now. Update: Funny video. He focuses mostly on glitches in the game.

Update: Also see Contra in 15 minutes. It's actually about 14. The video gets choppy sometimes, but it's still entertaining. Update: Here's Contra in 13 minutes by the same person, Peter Yang.

Update: This forum thread has links to other similar feats of game-playing mastery. I've linked to most of them already, but they have a link to the Megaman 2 video:
http://batman.jypoly.fi/~c2236/mock.wmv (44 megs)

"I think he’s very, very bright"

General Tommy Franks on George Bush:

“As I look at President Bush, I think he will ultimately be judged as a man of extremely high character. A very thoughtful man, not having been appraised properly by those who would say he’s not very smart. I find the contrary. I think he’s very, very bright. And I suspect that he’ll be judged as a man who led this country through a crease in history effectively. Probably we’ll think of him in years to come as an American hero.”

I really like Tommy Franks. He's a very wise man:

Franks ended his interview with a less-than-optimistic note. “It’s not in the history of civilization for peace ever to reign. Never has in the history of man. ... I doubt that we’ll ever have a time when the world will actually be at peace.”

I heard about this a while ago, and I think it fits here too: Real Bush 'At Odds with Media Caricature'

US President George Bush is “totally at odds” with his media image, Liberal Democrat foreign affairs spokesman Menzies Campbell said today.

Mr Campbell, an opponent of the war with Iraq, spoke out on the ePolitix website about his discussions with the President during the state visit.

He said that they discussed directly issues such as Iraq, the Middle East, Guantanamo Bay, Kyoto and trade sanctions.

“He is personally extremely engaging. He has a well-developed sense of humour, is self-deprecating and when he engages in a discussion with you he is warm and concentrates directly on you.

“He looks you straight in the eye and tells you exactly what he thinks.”

Mr Campbell, stressing that the President was “totally at odds” with his media image, went on: “I was not persuaded by what he said, but I was most certainly surprised at the extent to which the caricature of him was inaccurate.”

RSS math

The End of RSS

RSS may have the potential to be a saver on bandwidth, but when you are getting hit once an hour or more by thousands of sites, 24,000 extra hits adds up, and it's all the worse when so many are using broken clients that ignore the caching rules.

I stopped using an RSS reader a while ago, and now just have a bunch of bookmarks. I think it's crazy that we use software that checks a site automatically when you're not even actively looking to read anything right then. Eventually, I'll get around to finishing my software that lets me keep track of site changes (which will respect ETags, etc.), and will only check when I hit the "check" button.
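
"Respecting ETags" boils down to a conditional GET: send back the validators the server gave you last time, and let it answer 304 Not Modified instead of resending the whole feed. Here's a minimal sketch using Python's standard urllib. It isn't the software mentioned above; build_request, check, and the cache dict are names I made up for the example.

```python
import urllib.error
import urllib.request

def build_request(url, etag=None, last_modified=None):
    """Build a GET request that lets the server answer 304 Not Modified."""
    req = urllib.request.Request(url)
    if etag:
        req.add_header("If-None-Match", etag)
    if last_modified:
        req.add_header("If-Modified-Since", last_modified)
    return req

def check(url, cache):
    """Fetch url only if it changed; cache is a dict holding the validators.

    Returns the new body as bytes, or None when the server replies 304.
    """
    req = build_request(url, cache.get("etag"), cache.get("last_modified"))
    try:
        with urllib.request.urlopen(req) as resp:
            # Remember the validators for the next press of the "check" button.
            cache["etag"] = resp.headers.get("ETag")
            cache["last_modified"] = resp.headers.get("Last-Modified")
            return resp.read()
    except urllib.error.HTTPError as err:
        if err.code == 304:
            return None  # unchanged; the body was not re-downloaded
        raise
```

Calling check only when you actually press a button, and keeping the cache dict around between calls, gives a well-behaved server the chance to answer most checks with a bodiless 304.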

After reading Gary's section on an "RSS network", I was reminded of something I read a while ago: that RSS is Usenet reinvented, and done poorly.

Why I recommend that people don't use IE...

The incessant stream of security holes. When my dad got his new computer recently, I told him never to use IE for security reasons. He liked Mozilla (Firebird) better anyway. IE is still plugged into Outlook, etc., though. But he has the preview pane turned off in his spam folder, and I'm not sure Outlook, with security patches applied, will execute JavaScript.

Anyway, the moral of the story is, for your own good, stay far away from IE.

Update: Lots more info, including links to 14 of Microsoft's "cumulative security update"s from the past two years. Each, of course, contains fixes for multiple security vulnerabilities.

Ruby2

Matz: Visions for the Future, or "How Ruby Sucks" (see slide 3). (Via Why)

Looks like Matz wants to do a slightly backwards-incompatible rewrite of Ruby to take care of some of the rough spots and add some new features. Some of the new features are straight out of Python, some are new. He wants to base Ruby2 on a new VM called Rite. That page has most of the information on the slides, plus it addresses the question I was wondering as I read through the whole thing: "What about Parrot?":

Parrot doesn't affect matz's plans at all: he will make his own VM (Rite), which will be the reference implementation if Parrot is ever able to run Ruby code and becomes more popular than Rite.

Parrot promises top-notch speed and the ability to share libraries written in several languages; many people remain rather sceptical in that regard. It is widely believed that matz's independent creation of a VM specific for Ruby could be completed faster and better than Parrot's all-encompassing goal.

Could be. It's a shame that there's such duplication of effort, but we'll see what happens. I suppose if Rite has good ideas, they could benefit everybody. Or, if Parrot winds up not doing what they need when it's finished, then they're certainly right to be going down the path they are. Or, if they wind up finishing first, even though Parrot has a two or three year head start, then they're totally justified. Of course, I don't want to judge... they're not "wrong" even if the only reason for them not using Parrot is "we didn't feel like it, and feel like building our own VM".

In any case, the scripting language scene just got a little more interesting in the long term.
