June 30, 2004

One in five children is solicited online?

The Ad Council is running some "public service" ads claiming that one in five children is sexually solicited online (see here for the claim and here for the advertisements). This claim is based on this report from the National Center for Missing and Exploited Children.

Unfortunately, while this claim is literally true, phrasing it this way is extremely misleading. First, at least 48% of the solicitations are from other people under 18. I know that parents aren't exactly crazy about their kids having sex with their peers, but it's not exactly what you'd call predatory behavior either. I imagine that your average teenager gets sexual solicitations in school on a fairly regular basis. Most of the remaining solicitations were either by individuals of unknown age (27%) or under 25. Only 4% of the solicitations were known to be from individuals over 25. Again, I'm sure that parents think the idea of their 17-year old kid being solicited by 20-year olds is a bit creepy, but that's the kind of thing that happened in my high school fairly regularly as well. I'm not sure it's that big a cause for concern.

Similarly, almost 60% of the solicitations were of kids 15-17. I know parents don't like to think of kids that age as being sexually active, but they have to know that it happens. It's not surprising that they'd be sexually solicited, online or otherwise. It would be interesting to know how often younger kids were solicited by adults over 25, but this report doesn't say.

Bottom line, this report isn't exactly convincing me that online cyberpredation is a major problem--at least no more than the general distaste adults seem to have that children are interested in sexual activity. It's too bad that certain constituencies think that they can make political hay by telling parents that "The Internet" is going to get their kids molested, regardless of whether it happens to be true.

Posted by ekr at 09:16 PM | Comments (3) | TrackBack

June 29, 2004

Possibly the worst database in the world

AP is reporting that the Justice Department has denied an FOIA request for its database on foreign lobbyists because fulfilling the request could cause their computers to crash:
"Implementing such a request risks a crash that cannot be fixed and could result in a major loss of data, which would be devastating," wrote Thomas J. McIntyre, chief in the Justice Department's office for information requests.

Advocates for open government said the government's assertion that it could not copy data from its computers was unprecedented but representative of generally negative responses to Freedom of Information Act requests.

"This was a new one on us. We weren't aware there were databases that could be destroyed just by copying them," Bob Williams of the Center for Public Integrity said Tuesday. The watchdog group in Washington made the request in January. He said the group expects to appeal the Justice Department's decision.

Many Justice Department computer systems, especially at the FBI, are considered outdated. The FBI is spending nearly $600 million to modernize its antiquated systems.

The Center for Public Integrity sought information about lobbying activities available under the U.S. Foreign Agents Registration Act, a 1938 law passed in response to German propaganda before World War II. Database records describe details of meetings among foreign lobbyists, the administration and Congress, and payments by foreign governments and some overseas groups for political advertisements and other campaigns.

"What they're asking for is a lot, and it's not something at this particular point in time we have the technical ability to do," Justice Department spokesman Bryan Sierra said Tuesday.

McIntyre explained in a May 24 letter that the computer system - operated in the counterespionage section of the Justice Department's criminal division - "was not designed for mass export of all stored images" and said the system experiences "substantial problems."

See the Justice Department's letter here.

Ok, I'm certainly familiar with the concept that data can't be exported from a database. Databases are practically roach motels for data. But this risking a major crash stuff really sounds bogus. Surely the Justice Department keeps backups? If so, they should be able to bring up a new machine restored from backup and try the export there. Maybe it will crash it, maybe it won't, but no harm done either way.

It makes one suspect that Justice isn't that interested in fulfilling this request.

Posted by ekr at 04:39 PM | Comments (18) | TrackBack

June 28, 2004

DEET not quite as repellent as I would like

This weekend, Lisa and I discovered that the Ventana Wilderness has a lot of bugs. Between about 10 AM and 7 PM, if we stopped moving for more than about 10 minutes we were quickly surrounded by a cloud of bugs--all this despite heavy use of 95% DEET (the current gold standard in repellent chemicals) insect repellents. We weren't bitten much--at least I wasn't--but we were repeatedly dive bombed by gnats, mosquitos, etc. Upon investigation, it turns out that DEET has a very short range effect, no more than 4 cm. So, it will protect you from being bitten, but not pestered.
Posted by ekr at 10:16 PM | Comments (2) | TrackBack

Camping gear capsule reviews

Mountain Safety Research Whisperlite Shaker Jet
I've had this stove for about five years and it's generally performed fairly well for me. It's light, easy to set up, and relatively easy to light. Unfortunately, the Whisperlite has some problems that are making me consider replacing it. First, it can be flaky. The first time I used it in the field, I spent about 25 minutes wondering why my water wasn't boiling before I realized I needed to increase the pressure in the gas bottle. Since then it's performed fairly well but this weekend it seemed not to be burning as hot as it should. As recommended, I shook it to clean the jet, but that didn't seem to help as much as I expected. More pumping seemed to help some but the flame was clearly erratic. The Whisperlite is famously field serviceable but I don't really want to be debugging or servicing my stove in the field at 8:30 PM.

It's filthy. The way you prime the Whisperlite is to let some fuel in the priming cup below the stove. You then light the fuel and the flame heats up the generator tube. Once the flame dies down you open the gas valve and the stove lights. Unfortunately, this procedure gets soot all over the bottom of the stove. Unless you're pretty good about keeping it clean, whenever you set it up and tear it down you get soot all over your hands--just what you want when you're cooking dinner. Supposedly, it helps to use high-end white gas rather than the Coleman stuff I've been using, but again I'd prefer that things just worked. I've been thinking of replacing the Whisperlite with a JetBoil. A little pricy, but I've heard good things about it.

Primus Trek Kettle
This is the only piece of cookware we brought. It's a simple one liter aluminum pot with a foldable handle. Light, simple. Ideal.

GSI Outdoors Fairshare Mug
Another simple item: a 32-oz lexan mug that seals with a screw top. Perfect for in-the-field eating. Definitely better than eating out of the ziploc bag your freeze-dried dinner came in. Speaking of food, I recommend the Backpacker's Pantry Jamaican BBQ Chicken and Katmandu Curry. Very palatable.

Katadyn Hiker water filter
The Hiker is the standard field water filter. It's light and generally considered to be reliable. This is actually my second Hiker--I used to own one back when it was made by PUR. One problem the Hiker has always had is that the hoses are springy and so you've got to be careful to avoid having them fall into the water you're trying to purify--at which point you're hosed. It's not too hard once you get the trick, especially if you have two people.

This filter, however, gave me some trouble. As soon as we started pumping the retaining clip that holds the pump side of the filter together popped off and fell into the stream. Luckily Lisa noticed it and we fished it out before it floated away from us and we were able to reassemble the pump. However, one nice feature of the design is that the pump assembly is on the intake side of the filter, so getting the pump contaminated doesn't screw things up.

For the rest of the trip I kept my finger over the retaining clip, which worked fine, but doesn't seem like something I should have to do. I don't remember my other Hiker behaving this way, so I'll be contacting Katadyn to see if maybe this particular unit is just defective.

Nalgene Bottles
The standard field drinking bottle. Light, long-lasting, and so tough you can use them to pound tent stakes into hard ground. They also mate up with the output from your water filter.

Gregory Forester
For this trip, I replaced my old Camp Trails external frame pack with the Gregory Forester internal frame. Generally, the Forester seems to be a pretty nice pack. It definitely hugs the body a lot better than an external frame, which is nice on technical terrain. I do have one complaint, however. The foam in the waist belt is extremely stiff and with the pack seated properly it rides right on the anterior spine on the front of my pelvis. Each time I put my pack on, this region of my hips would get quite sore. After I'd worn the pack for a half hour or so, the soreness would go away, so it doesn't seem to be too big a problem in practice, though I do worry that it would be bad on a long trip.

Ex Officio Amphi convertible pants
These pants are definitely much better than jeans. Cool and fairly hard wearing. I think I would like some pants that had a slightly heavier weave to prevent brush poking through to my legs. Otherwise, quite nice.

Petzl Tikka Plus headlamp
Headlamps are vastly better than flashlights since they keep your hands free and the Tikka Plus uses LEDs instead of bulbs for long battery life. The Tikka Plus does have one problem: the battery door is hard to get closed and if you only get it half-closed, the light will work but turn off when jostled, which can be very annoying. It seems to be ok if you get the door closed properly, however.

Posted by ekr at 07:23 PM | Comments (37) | TrackBack

June 25, 2004

No blogging this weekend

There won't be any blogging this weekend. I'll be camping in Los Padres National Forest. Assuming I don't get eaten by a bear, blogging will resume on Monday.
Posted by ekr at 11:12 AM | Comments (2) | TrackBack

June 24, 2004

Online topographic maps

I recently had occasion to desire some topographic maps. After a little bit of web searching, I found Topozone, which provides a fairly nice interface to the USGS digital topographic maps. The maps are reasonably nice looking and come in a variety of resolutions. Although there are some rasterization artifacts which can cause problems if you try to read at an inappropriate resolution, the maps are generally quite readable, especially when printed, and it's certainly pretty nice for a free service.
Posted by ekr at 10:57 PM | Comments (2) | TrackBack

June 23, 2004

Unconfirmed rumor department

One of my anonymous sources claims that some Google employees have been fired for selling Gmail accounts--whether for money or for gmailswap.com favors is unknown. It may not be true, but you (at least possibly) heard it here first.
Posted by ekr at 09:38 PM | Comments (54) | TrackBack

War, what is it good for?

On my run today I passed by two cars with bumper stickers reading:
Except for ending slavery, fascism, Nazism and communism, war has never solved anything.

Now, I'm no pacifist. Indeed, I'm somewhat sympathetic to the position being expressed here--though not to a number of the other positions that the company selling these stickers espouses. However, I think this is overstating the case a bit. Yes, I know that I'm arguing with a bumper sticker, but this kind of--rather self-righteous--sentiment seems common enough that it's worth explicitly debunking, so here goes.

Certainly, it's true that war ended slavery in the US, though of course other countries (such as the UK) managed to abolish it without a war. Moreover, slavery still exists, especially in Africa. So, in order to be entirely accurate, we need to revise our claim to read:

Except for ending slavery in the United States (but practically nowhere else), fascism, Nazism and communism, war has never solved anything.

It's also quite true that WWII ended fascism in a number of countries in Europe and Japan--though of course Franco remained in power through his death in 1975, so the claim that war ended fascism strikes me as fairly weak. So, revising again, we get:

Except for ending slavery in the United States (but practically nowhere else), fascism (except in Spain), Nazism and communism, war has never solved anything.

Even if we concede the point about fascism, the Nazis, of course, were fascists, so calling out both fascism and Nazism seems rather like double counting. And, of course, there are still neo-Nazi movements all over the world. Still, this is probably the strongest subclaim and only merits a minor revision.

Except for ending slavery in the United States (but practically nowhere else), fascism (except in Spain), Nazism (except where there are still Nazis) and communism, war has never solved anything.

The claim that war ended communism is just false. After all, the Soviet Union basically just collapsed without any war at all. I suppose if you subscribe to Dan Simon's thesis about the end of the Soviet Union you could argue that the various proxy wars in Central America gave the Soviet Union a bit of a push, but that's really the strongest case you can credibly make. And of course China, Cuba and Viet Nam remain nominally communist. Even if you claim that they're no longer communist--and that's certainly an arguable point in the case of China--it's hard to see how you could attribute that change to war.

Except for ending slavery in the United States (but practically nowhere else), fascism (except in Spain), Nazism (except where there are still Nazis) and maybe slightly pushing communism in the Soviet Union closer to its eventual collapse while leaving over a billion people under communist rule in China, war has never solved anything.

Not quite as satisfying, but rather more accurate.

Credit where it's due: I stole this style of argument from Allan Schiffman.

Posted by ekr at 09:11 PM | Comments (12) | TrackBack

June 22, 2004

Hey, Maybe *I* can get a Pap smear

Here's an interesting article in the New York Times. It seems that a lot of women who have had their cervixes removed are still getting Pap smears:
The 10 million women having unnecessary Pap tests constitute about 12 percent of the 85 million women currently being screened, Dr. Sirovich said.

No one is suggesting fraud or mendacity on the part of the doctors or laboratories. Instead, Dr. Sirovich and others say, the situation seems to reflect doctors' habits and women's expectations.

In their paper, published today in The Journal of the American Medical Association, Dr. Sirovich and her colleague, Dr. H. Gilbert Welch, analyzed national data on Pap testing and on hysterectomies over 10 years.

Not only are most women who have had hysterectomies having Pap tests, they found, but the proportion having them also held steady, at 68 percent, from 1992 to 2002. No professional organization recommends Pap tests for most women without a cervix.

The screening guidelines "either have not been heard or have been ignored," the investigators wrote.

Outstanding!

Posted by ekr at 09:21 PM | Comments (44) | TrackBack

Curse you, Firebird DNS cache!

Ever since the big Crooked Timber move I've been unable to get to the new CT site. I keep getting bounced to the old one. I originally assumed that it was just a DNS screwup on their end, but after a couple of days it should have resolved, and it's clear from all the comments appearing on the CT web site that others are quite able to get there.

Once I realized the problem was on my end, the answer was obvious: Firebird (at least the older version I use) does caching of DNS results as a performance optimization (resolution can be slow). However, apparently the cache never expires and so I was stuck with the old IP address for CT. Exiting Firebird and restarting it fixed the problem. I was all set to write a long flaming rant about how DNS data was dynamic and how people need to expire their caches and then I realized that the last program I wrote that cached DNS results didn't expire them either. I guess we'll skip the rant.
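
For the record, expiring a DNS cache isn't much code. Here's a minimal sketch in Python. It is not Firebird's code (or mine); the class name, the resolver callback, and the injectable clock are all made up for illustration, and the resolver is assumed to hand back a TTL in seconds along with the address.

```python
import time


class ExpiringDNSCache:
    """Toy hostname -> address cache whose entries expire after their TTL."""

    def __init__(self, resolve, clock=time.monotonic):
        # resolve is a hypothetical callback: name -> (address, ttl_seconds)
        self._resolve = resolve
        self._clock = clock
        self._entries = {}  # name -> (address, expiry_time)

    def lookup(self, name):
        now = self._clock()
        entry = self._entries.get(name)
        if entry is not None and now < entry[1]:
            # Still fresh: answer from the cache.
            return entry[0]
        # Missing or stale: ask the resolver again and remember the new answer.
        address, ttl = self._resolve(name)
        self._entries[name] = (address, now + ttl)
        return address
```

The bug described above corresponds to dropping the freshness check, so that whatever address was seen first is returned until the program restarts.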

Posted by ekr at 09:38 AM | Comments (55) | TrackBack

On to tier 2!

No blogging yesterday due to spending half my day on the phone with Intuit trying to get QuickBooks payroll working again. You see, it needed to upgrade to the latest tax table to run payroll but for some reason the (automatic) upgrade procedure wasn't working. Oh, it said it was working but then you'd go check and it wouldn't have it and it absolutely refuses to run payroll without the latest tax table.

So, I called tech support. Since then, I've:

  • Spent about 4 hours on the phone with them spread out over five calls.
  • Reinstalled QuickBooks twice. This takes like an hour because you need to download 25 M of updates...
  • Deleted every trace of QuickBooks on the filesystem.
  • Asked them if they wanted me to remove it from the registry only to be asked "what's the registry".
  • Installed it again.

And it still doesn't stinking work. Now, don't get me wrong, everyone has been very nice, but nice doesn't get payroll run. So, I call in this morning and finally get the answer I've been waiting for: there's nothing else we can do. We're escalating it to Tier 2. More as the situation develops.

Posted by ekr at 09:08 AM | Comments (2) | TrackBack

June 20, 2004

What you need to know about tapping VoIP

As you may have heard, the Justice Department wants to require Internet Telephony (VoIP) providers to make accommodations to allow them to tap calls made with VoIP systems [*]. This is actually a sort of complicated issue, due to the diverse nature, both of the requirements and of the systems that would potentially be tapped.

The Requirements
Typically, the feds want to be able to do one of two things:

  • Capture origin (often called "trap and trace") and destination (often called "pen register") information.
  • Capture the contents of the call itself.

Which one they want to do depends, of course, on the circumstances and the kind of investigation they're running.

Types of VoIP
Roughly speaking, there are three kinds of VoIP call:

  • VoIP to POTS (Plain Old Telephone Service)
  • VoIP to VoIP through a central server
  • VoIP to VoIP direct

In a VoIP-POTS call, someone with VoIP service is connected to someone with ordinary telephony service. In order for this to work properly, the VoIP provider needs to translate between VoIP and the analog voice network. Effectively, I make a VoIP call to the provider's server and then the provider makes a regular phone call to the person I'm calling.

Tapping this kind of call is technically straightforward. The VoIP server has to have access to the analog traffic in order to service the POTS half of the connection, so it's easy to give the feds access to it, provided that they have the right equipment. Of course, not all such central servers are built with this kind of tapping in mind--though the feds of course would like to require them to be. Note that it doesn't matter at all whether encryption is used for the VoIP half of the connection. As long as the server has access to the analog half, the connection is easy to tap.

The second type of call is a VoIP-VoIP call through a central server. One reason why you might have such a call is if it were between two incompatible VoIP systems, say SIP and Skype. In this case, you would need a central server to translate the traffic from one format to another, just as with a VoIP-POTS call. As with the previous case, tapping this kind of call is straightforward, provided the server in question is set up for that kind of tapping. Again, it doesn't matter whether encryption is used--the central server has access to the data and can therefore easily provide a copy to the feds.

The interesting case, however, is when you have a direct VoIP connection between point A and point B. Then the two ends can directly encrypt all traffic between them. This works even if there isn't a direct connection between A and B. For instance, sometimes one of the machines is behind a firewall and so they need to have a central server which connects them. As I understand it, in Skype this is done by forwarding the calls through the machines of other Skype users. This kind of forwarding is purely a network issue and doesn't interfere with the encryption at all.

In an architecture with end-to-end encryption, there's not much that the VoIP provider can do to tap your calls. Sure, they can divert a copy of the connection to the feds, but since it's encrypted, that's not very useful. At most the feds can see addressing information, but they can't actually listen in on your calls--no matter how much they may want to. In order to tap calls where end-to-end encryption is being used, the feds would need to change the VoIP programs so that they could get at the keying material for the connection. Obviously, this is technically possible, but tappability isn't exactly a feature that most users are going to find desirable, and so actually getting wide deployment would probably require a law mandating it. While I'm sure the feds would like that, it doesn't seem very likely to get passed in the US, and even less likely to work if it were passed.
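
To make the point concrete, here's a toy sketch in Python of what a tap at the relay gets in the end-to-end case. Everything here is an assumption for illustration: the packet layout, the relay function, and especially the cipher, which is a one-time-pad XOR standing in for whatever a real VoIP system would negotiate (something like SRTP with per-call keys).

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Toy one-time-pad cipher: XOR each data byte with a key byte."""
    if len(key) < len(data):
        raise ValueError("key must be at least as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


def relay(packet: dict, tap_log: list) -> dict:
    """Hypothetical central server: forwards the packet and copies it to a tap."""
    tap_log.append(dict(packet))
    return packet


# The two endpoints share a key that the relay never sees (assumed to be
# agreed out of band here; a real system would run a key exchange).
voice_frame = b"hello from point A"
key = secrets.token_bytes(len(voice_frame))

packet = {"src": "A", "dst": "B", "payload": xor_bytes(voice_frame, key)}
tap_log = []
delivered = relay(packet, tap_log)

# Point B recovers the audio because it holds the key.
received = xor_bytes(delivered["payload"], key)

# The tap sees who is talking to whom, but only ciphertext for the call itself.
tapped = tap_log[0]
```

The tapped copy has the addressing fields in the clear and an unreadable payload, which is exactly the split between the two requirements listed earlier.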

Posted by ekr at 10:40 PM | Comments (48) | TrackBack

June 18, 2004

The BBC on Iran's codes

The BBC has an article about the Iranian code break and quotes Ross Anderson coming to more or less the same conclusions I did a couple of weeks ago:
Ross Anderson of the Computer Laboratory at Cambridge University pointed to some of them: "As the former chief scientist of the NSA once remarked at one of our security workshops, almost all breaks of cipher systems are due to implementation errors, operational failures, burglary, blackmail and bribery."

As Glenn Reynolds would say: Advantage Educated Guesswork!

UPDATE:
Doh. I forgot a link to the BBC article. Here.

Posted by ekr at 09:33 PM | Comments (30) | TrackBack

What should I listen to while walking the dog?

I've recently acquired a loaner dog, who I will have for the next week or so. As I stepped out for this morning's walk, I thought to grab my iPod and enjoyed randomly (or at least pseudo-randomly) chosen music for the next twenty minutes or so. That's all well and good, but I'm not an enormous fan of music while I'm walking around. I started wondering whether there were some prerecorded lectures, books, etc. that I could download for my iPod and would make good listening. Oh, and free would be nice.... Any suggestions?
Posted by ekr at 08:06 AM | Comments (53) | TrackBack

June 17, 2004

How does Joe Viagra send e-mail?

As I've mentioned before, one of the nasty side effects of spam is that all the filtering you need to make e-mail usable results in false positives. Obviously, some people have it worse than others. No doubt whoever at Pfizer manages the Viagra product line gets a lot of his mail filtered. But at least for him it's a cost of doing business. The people I really feel sorry for are Jose Viagra, Julia Cialis, and Josh Ambien.
Posted by ekr at 10:08 PM | Comments (47) | TrackBack

Cisco to buy Procket

Fulfilling some rumors that had been floating around the networking community, Cisco has announced that they're going to buy Procket Networks for $89 million. Not a bad sounding score until you realize that as of March, Procket had received $277 million in funding. Now, I know that EG Enterprises doesn't have quite as crack a networking team as Procket, but if any VCs have a spare couple of hundred million they'd like to invest, I'm fairly confident I can arrange to sell the company for no less than $150 million.
Posted by ekr at 05:36 PM | Comments (2) | TrackBack

June 16, 2004

In which our hero inspires a poem

I recently posted a copy of my paper on bug finding to the Cryptography mailing list and a fair amount of discussion ensued. This prompted Jason Holt to post the following poem:
"Hiawatha's Research"
Jason Holt
June, 2004, released into the public domain.
Dedicated to Eric Rescorla, with apologies to Longfellow.
("E. Rescorla" may be substituted for "Hiawatha" throughout.)

Hiawatha, academic,
he could start ten research papers,
start them with such mighty study,
that the last had left his printer,
ere the first deadline extended.

Then, to serve the greater purpose,
he would post these master papers,
post them with such speed and swiftness,
to gain feedback from his cohorts,
for their mighty learned comments.

from his printer, Hiawatha
took his publication paper,
sent it to the preprint archive,
sent it out to all the newsgroups

Then he waited, watching, listening,
for the erudite discussion,
for the kudos and the errors,
that the others soon would send him.

But in this my Hiawatha
was most cruelly mistaken,
for not one did read his papers,
not one got past the simple abstract.

Still did they all grab their keyboards,
writing with great flaming fury
of the folly of his venture,
of his paper's great misgiving.
Of his obvious omissions,
of his great misunderstandings,
of his utter lack of vision,
of his blatant plagiarism.

(This last point he found most galling,
found it really quite dumbfounding,
since for prior art, he'd listed
ninety-three related papers.)

Now the mighty Hiawatha,
in his office still is sitting,
contemplating on his research,
thinking on his chosen topic.
Wondering, in idle moments,
if he had not chosen wrongly,
the position he had taken
as a research paper author

And he thinks, my Hiawatha,
if he might not have been better

served by a more lowly station,
as a cashier at McDonalds,
as a washer at the car wash,
as a cleaner of the bathrooms.
Thus departs my Hiawatha.

Pretty cool, eh?

Posted by ekr at 02:27 PM | Comments (9) | TrackBack

June 15, 2004

OQO

This is the coolest gadget I've seen in a long time. It's a full PC only a little bigger than a Hiptop and weighing in at < 1lb:

This thing is substantially more powerful than my Vaio! Not available yet, though.

Posted by ekr at 02:26 PM | Comments (2) | TrackBack

Eyewitness testimony under stress

Check out this article on a controlled trial of eyewitness testimony under stress:
The study, carried out on 509 military personnel participating in a survival training camp, found that stressful conditions impaired the accuracy of making an identification.

Participants were subjected to mock prisoner of war interrogations. Half were exposed to a high-stress 30-minute interrogation that included the threat of physical violence.

Twenty-four hours later, they were asked to identify the interrogator in a line-up, a photo spread or a series of photos.

The results: 30 per cent accuracy in the live line-up, 38 per cent in the photo spread and 49 per cent in the photo series.

By comparison, the other half of trainees, in less threatening interrogations, were able to pick out their interrogators with greater accuracy, from 62 per cent in the live line-up to 76 per cent in the series of photos.

If anyone has a reference to the original paper I'd love to see it.

Posted by ekr at 06:38 AM | Comments (44) | TrackBack

June 14, 2004

Please report to torture duty

In response to my suggestion that John Ashcroft bears the cost of torturing people, Dan Simon points out that it's difficult to determine how much to incentivize John Ashcroft. The problem is that Ashcroft only gains a small fraction of the purported benefit of any terrorist action. I'm not sure that it wouldn't be possible to design an appropriate incentive, but the objection suggests an even fairer system: a torture lottery.

The way that this works is that whenever we want to torture some prisoner, we also randomly select some American off the street and torture him in the exact same way. This scheme has the nice property that the people bearing the cost are (on average) the same people obtaining the benefit--the citizens of the US. They can then use the democratic process to determine exactly how much torture they want to permit.

Now, I'm sure Dan will object that this is unfair or unworkable. If you like, then, think of it as purely a thought experiment: would you be willing to accept this kind of risk in favor of giving the government latitude to torture people? If not, what does that suggest about the validity of the cost/benefit-type arguments that are typically adduced for allowing torture?

Posted by ekr at 12:18 PM | Comments (53) | TrackBack

How much is it worth to you to torture someone?

Assuming you're not dead, you've no doubt heard about the Justice Department's memo arguing that it's OK to torture suspected terrorists. What concerns me here isn't so much that they're justifying torture. I'm familiar with the usual ticking nuclear bomb-type arguments for torture and while I'm not entirely persuaded, they're not crazy either. What concerns me is that, as Mark Kleiman observes, the memo endorses "the supposed 'inherent power' of the President as Commander-in-Chief to ignore the law (both statute and treaty) whenever in his sole and unreviewable judgment the public safety requires it."

This seems to me to be the important point. Whether it's torture, locking up the detainees at Guantanamo, Jose Padilla, or military tribunals, the Bush administration insists that they have the basically unreviewable discretion to treat enemy combatants (or whatever they're called this week) however they deem appropriate. Even if you think that the US should be in the business of military tribunals or torture or whatever, this position should give you pause. If these techniques are legal and unreviewable, government officials have every incentive to overuse them. After all, they're rewarded for catching terrorists but not for protecting their civil liberties.

As I said, I'm sensitive to, though not necessarily persuaded by, the claim that we occasionally need to employ these techniques, and it's clear that these decisions are tricky to make and even trickier to review, especially when they have to be reviewed months later by some judge back in the states. The fix, of course, is to make the decision about whether to use any given technique self-enforcing. When I previously suggested that the government should pay off people for various kinds of civil liberties violations, some EG readers suggested that people would invite abuses in order to get the payoff.

I'm not entirely convinced by that argument, but I think it's a legitimate point. Accordingly, here's a revised plan designed solely to deal with torture: The US can torture as many people as it wants (and no nitpicking about exactly what constitutes torture at this point, please) but for each victim, John Ashcroft has to spend 3 days in solitary confinement. After all, surely it's worth 3 days in solitary to prevent a big terrorist strike on the US. On the other hand, if we're using it for less important reasons, then that would quickly become apparent. Surely Attorney General Ashcroft is willing to make sacrifices for the good of America, right?

Posted by ekr at 08:53 AM | Comments (46) | TrackBack

June 13, 2004

The pros and cons of copyable media

Kevin Dick pointed out to me that it might be nice to do a more thorough discussion of the economic implications of media copying. What follows is a sketch for such a discussion. Unfortunately, first you have to sit through a digression on the economics of monopolies.

BEGINNING OF EXPLANATION OF THE ECONOMICS OF MONOPOLIES

The Price of Wheat
Let's start with the standard economic analysis of price for a commodity. Commodities are things like wheat, gold, oil, etc. where every producer sells essentially the same thing. In a competitive market, the price P of a commodity gets driven down to the cost of production C. If any supplier tries to charge a price C' which is much greater than C, then another supplier can charge C'' such that C < C'' < C' and undercut the first supplier. So, if it takes $5 to produce a bushel of wheat, wheat will sell for approximately $5 a bushel. Obviously, life is a little more complicated than this, but it's a useful first approximation.

The Price of a Can of Coke
Now let's look at a monopoly situation: the market for Coke. Coca-Cola isn't the only supplier of beverages but it is the only supplier of Coke, and for many soda drinkers, Pepsi isn't an acceptable substitute. Accordingly, Coca-Cola has some freedom to set its price. For simplicity, we'll treat this for the moment as a monopoly situation. In this situation, the producer's optimal price is set primarily by the demand curve. I.e., the producer chooses a price P which produces the maximal amount of profit. More formally: at price P, the producer will sell N units. At any other price P', the producer would sell some different number of units N' and would make a smaller profit.

Now, note that the price P that Coke sells for isn't really related to the cost C of manufacture. Coca-Cola just chooses an optimal P and as long as it's greater than C they stay in business. The difference between the optimal price P in a monopoly situation and the cost of goods C is called a monopoly rent.

Now let's take a closer look at the demand curve: Imagine that the market consists of I consumers who value a can of Coke at V[1], V[2], ... V[I]. Without loss of generality, let's say that these values are strictly ordered so consumer 1 values Coke the most and consumer I values it the least. Now, when the producer sets the price P, this splits the market between people who value Coke more and less than P. People who value Coke more than P will buy it and people who value it less than P will not. If we imagine that customer p is willing to pay exactly P for a Coke (i.e., he is indifferent), then we get the following outcome:

  • Coca-Cola sells p units [0], for a gross sale price of p * P. Coke's profit is p * (P-C). This profit is almost all monopoly rent.
  • Each customer who buys a can of Coke makes a profit of V[i] - P (equivalent to the difference between the price he would have paid and what he actually paid). The total customer profit is (V[1] + V[2] ... V[p])-(P*p).

But here's the interesting bit: there are customers who would be willing to buy Coke at price C but not at price P. In other words, Coca-Cola could theoretically sell to them at a profit but it doesn't because that would require reducing the price for everyone else and so Coca-Cola would make less money overall (remember that P is the optimal price). If we assume that customer c would be willing to pay exactly C for a can, then there are c-p such customers that Coca-Cola has priced out of the market. The amount of forgone profit for consumers is (V[p+1] + V[p+2] ... V[c]) - ((c-p)*C). This value is called deadweight loss. Deadweight loss is a more or less inevitable feature of any monopoly situation because the monopolist's attempt to increase their profits reduces the number of sales below the efficient level [1].
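To make the bookkeeping concrete, here's a minimal sketch in Python. The valuations and the cost are made-up numbers (in cents, to keep the arithmetic exact), and the function names are just for illustration. It tries every candidate price, picks the one with the largest producer profit, and then computes the consumer profit and the deadweight loss using the formulas above.

```python
# Hypothetical valuations V[1..I] in cents, sorted from highest to lowest.
valuations = [300, 250, 200, 150, 100, 75, 50, 25]
COST = 40  # assumed cost C of producing one can, in cents


def units_sold(price):
    """Number of consumers who value a can at least at the price (customer p buys)."""
    return sum(1 for v in valuations if v >= price)


def producer_profit(price):
    """p * (P - C): units sold times the margin on each unit."""
    return units_sold(price) * (price - COST)


def consumer_profit(price):
    """Sum of V[i] - P over the consumers who buy."""
    return sum(v - price for v in valuations if v >= price)


def deadweight_loss(price):
    """Value above cost lost on the consumers priced out: C <= V[i] < P."""
    return sum(v - COST for v in valuations if COST <= v < price)


# With a finite list of consumers, the best price is always one of the valuations.
P = max(valuations, key=producer_profit)

print("optimal price P:", P)
print("units sold p:", units_sold(P))
print("producer profit:", producer_profit(P))
print("consumer profit:", consumer_profit(P))
print("deadweight loss:", deadweight_loss(P))
```

With these particular numbers the producer settles on a price of 200 cents and sells 3 cans, leaving four consumers who value a can at more than it costs to make but who don't get one.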

END EXPLANATION OF ECONOMICS OF MONOPOLIES

The Price of Content
If copying were impossible, the economics of tapes, CDs, DVDs, etc. would be basically the same as the economics of Coke. The labels would set prices to the level that would yield them the maximum profit and there would be a substantial deadweight loss from all the people who would be willing to pay $5.00 but not $10.99 for The Full Custom Gospel Sounds of the Reverend Horton Heat. It's true that C is very small, but that doesn't actually change anything important. It just means that almost all of the sale price of the content is monopoly rent and so the deadweight loss is correspondingly bigger.

So far so good. Now, imagine what happens if copying suddenly becomes easy and legal and anyone who wants an album or song just downloads a copy (note that this isn't the situation we have now). Two things happen:

  1. Everyone who would originally have bought a copy of the album now downloads one instead.
  2. All the people who wanted the album at all but didn't buy it because it was too expensive now download a copy.

From an economic perspective, (1) is neither good nor bad. Sure, it's bad from the perspective of the label because they stop making money. However, it's good from the perspective of the consumers because that money stays in their pocket. Net cost/benefit effect: zero. However (2) is definitely good. Remember that before we had people who didn't have a copy of The Rev, and now they do--the deadweight loss has gone to zero. So, the net profit here for society as a whole is equal to the previous deadweight loss. All in all a good thing.

The bottom line, then, is that in the short term downloading is good for society as a whole. The loss to the media companies is more than cancelled out by the gain for consumers. That may sound unfair, but if it does then it's only because you've forgotten that economics isn't about fair--it's about efficient. And in the short term free content is more efficient: it provides more benefit for consumers than it costs producers.
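The short-term accounting can be sketched the same way. Everything here is an assumption for illustration: the valuations are invented, and the cost of one more copy is taken to be zero. The point is only that the label's loss shows up as an equal gain for the old buyers, while the people who were priced out account for the net gain.

```python
# Hypothetical valuations of one album, in cents, highest first.
valuations = [2000, 1500, 1099, 800, 500, 300, 100]
COPY_COST = 0  # assumption: the cost C of one more copy is roughly zero


def label_profit(price):
    """Label's profit at a given price: number of buyers times (P - C)."""
    return sum(1 for v in valuations if v >= price) * (price - COPY_COST)


# Before copying: the label picks its profit-maximizing price.
price = max(valuations, key=label_profit)
buyers = [v for v in valuations if v >= price]
priced_out = [v for v in valuations if v < price]

profit_before = label_profit(price)
consumer_gain_before = sum(v - price for v in buyers)
deadweight_before = sum(v - COPY_COST for v in priced_out)
welfare_before = profit_before + consumer_gain_before

# After copying: everyone who wants the album downloads it and nobody pays.
profit_after = 0
consumer_gain_after = sum(v - COPY_COST for v in valuations)
welfare_after = profit_after + consumer_gain_after

print("price before copying:", price)
print("label loses:", profit_before - profit_after)
print("consumers gain:", consumer_gain_after - consumer_gain_before)
print("net gain to society:", welfare_after - welfare_before)
print("old deadweight loss:", deadweight_before)
```

In this example the label loses 3297 cents, consumers gain 4997, and the difference of 1700 is exactly the old deadweight loss.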

The Long Run
The problem, of course, is the long term. If there's no money to be made in media, then the supply of new media will likely go down. It may not go to zero--after all, I'm not charging you for reading this blog--but it's likely to decline some. That decline is a real cost to society--remember that most of the consumers of any given piece of content are paying less than they would be willing to pay and so are making a profit on the transaction.

This, of course, is also a form of deadweight loss [2]. The producer would be willing to produce the content and the consumer would be willing to pay, but because the producer expects the consumer to not pay and the consumer can't credibly commit to paying, the content never gets produced. We don't know exactly how large this deadweight loss is, however.

If we're interested in the question of what's most efficient (i.e. maximizes social welfare) then these choices are pretty stark. Either we accept the deadweight loss of people not enjoying content that's out there or the deadweight loss of content not getting produced. It's an empirical economics question which deadweight loss is bigger and since we don't know the answer it's hard to know what the best policy is, which is always a bad position to be in when big money is involved. It would be nice if there were a way to minimize both deadweight losses instead of just trading one for the other. I'll be talking about that some in future posts.

 

[0] We're arbitrarily assuming here that customer p buys a can. We could assume the opposite without substantially affecting the analysis.
[1] Assuming that the consumer demand curve is more or less smooth.
[2] Thanks to Kevin Dick for reminding me that this is also a form of deadweight loss.

Posted by ekr at 08:43 AM | Comments (56) | TrackBack

June 12, 2004

What's with the credit card ID thing?

About half the time when I go to use my credit card somewhere, they ask for my ID. When I ask why, they typically tell me--after some pressing--that it's because I haven't signed the back. Actually, I have, but it's worn off, and I really hate showing ID, just on principle; yes, I'm one of those credit card cranks that store clerks hate. Today, I responded by signing the card in front of them, which they grudgingly accepted.

As far as I can tell, this practice doesn't correspond to any rational threat model. It's true that it's fairly likely that a thief would have an unsigned credit card, either because he stole it out of someone's mail or because he erased the signature, but it's not like it's hard for the thief to sign the strip before trying to use the card. Even if they don't already know that stores will ask for ID, they'll find out when they try to use it and can then just apologize for not having ID and sign it for the next time.

Do any EG readers understand the background here?

Posted by ekr at 07:34 PM | Comments (40) | TrackBack

June 11, 2004

Can I get that burrito delivered?

Here's a question that's been bothering me: why can you get some kinds of food delivered but not others? In nearly every town you can get pizza delivery. A few low-end Chinese places will deliver and World Wrapps claims to deliver. Now, I get why fancy restaurants don't deliver--ambiance is a large part of their value proposition--but why don't In-n-Out Burger or my local burrito joint deliver?

I don't have anything approaching a grand unified theory of food delivery, but I have some observations:

  1. The primary audience for food delivery is people unable or too lazy to leave home. That means that it has to appeal to drunks, stoners and/or children, pretty much ruling out your average Fresh Choice.
  2. Food which is delivered has to keep well. It seems to take about 15-30 minutes for food to get from a restaurant to my house. Burgers tend to deteriorate pretty fast.
  3. The menu for delivered food has to be fairly simple, otherwise phone ordering is difficult. It also helps if the menu is reasonably standardized. I can pretty much call any pizza place and order some kind of pizza even if I've never seen a menu. Burritos are simple in theory but each place has their own style of ordering.

Any other theories?

Posted by ekr at 10:14 PM | Comments (36) | TrackBack

The curse of the missing remote

It's really great to have a remote control for your AV equipment, but I've noticed an alarming trend: AV equipment which can only be controlled by remote. My DVD player and my VCR (both Panasonic products) have only a small subset of their features available on the console. If you can't find your remote--which happens to me on a regular basis--and you want to, say, turn on subtitles, you seem to be pretty much out of luck. The king of these featureless devices, however, is the TiVo, which has no front-panel controls of any kind. Without the remote it just sits there like a brick.

Now, I understand that the controls on a modern piece of AV equipment are pretty complicated and it would be hard to cram all of them onto the front console. Accordingly, I hereby present Lisa's suggestion for dealing with this problem: put a radio receiver keyed to the base station in each remote. When you press a button on the base station the remote beeps and you can find it. You can, of course, get gizmos like this on the aftermarket, but wouldn't it be nice to just have it built into your electronics?

Posted by ekr at 08:48 PM | Comments (46) | TrackBack

June 10, 2004

Cognitive dissonance and a solution

Conservatives today face a difficult choice. Now that Reagan is dead, Grover Norquist wants to put Reagan on the $10 bill. Unfortunately, the $10 currently has Alexander Hamilton--the first Secretary of the Treasury and a conservative hero--on it. The internal conflict must be excruciating. Luckily, we here at EG have anticipated the problem and our resident financial wizards have been working on it night and day. After literally 20 minutes of thought, we are proud to present the following proposal:

In commemoration of our fallen leader, the US Treasury will mint a special $8 trillion note bearing the face of Ronald Reagan on one side and an MX missile on the other. This note will be used to pay down the national debt, thus fulfilling Reagan's lifelong dream to put our great nation back in the black. I just hope the bank of Japan can make change.

Posted by ekr at 10:04 PM | Comments (36) | TrackBack

Speaking ill of the dead

Allan Schiffman isn't real fond of Ronald Reagan.
Posted by ekr at 09:43 PM | Comments (2) | TrackBack

June 09, 2004

Well that was clever

One of the minor perks of being on the Internet Architecture Board is that the IETF Secretariat makes your hotel reservations for you. This has the advantage that you don't need to race the teeming millions for your hotel reservation, which is nice since the hotels often fill up.

So, that's the good news. The bad news is that the secretariat sends an e-mail asking you whether you want a room, like so:

X-Mailer: QUALCOMM Windows Eudora Version 6.1.0.6
Date: Wed, 26 May 2004 16:19:19 -0400
To: Rob Austein <sra@hactrn.net>,
	Steve Bellovin <smb@research.att.com>,
	Michelle Cotton <cotton@icann.org>,
	Leslie Daigle <leslie@thinkingcat.com>,
	Patrik Falstrom <paf@cisco.com>,
	Bill Fenner <fenner@research.att.com>,
	Mark Handley <m.handley@cs.ucl.ac.uk>,
	Scott Hollenbeck <sah@428cobrajet.net>,
	Allison Mankin <mankin@psg.com>, Thomas Narten <narten@us.ibm.com>,
	Jon Peterson <jon.peterson@neustar.biz>,
	Eric Rescorla <ekr@rtfm.com>, Lynn St.Amour <st.amour@isoc.org>,
	Margaret Wasserman <margaret@thingmagic.com>,
	Bert Wijnen <bwijnen@lucent.com>
From: Marcia Beaulieu <mbeaulie@foretec.com>
Subject: 60th IEFT - Hotel Accommodations - REMINDER
Cc: agenda@ietf.org
X-CRM114-Status: Good  ( pR: 3.9661 )

If you do not want your reservations made through the secretariat,
please let us know.

[hotel details deleted...]

Please send in the following information by June 18 to agenda@ietf.org to
guarantee a room under this block. If you do not want your reservations
made through the secretariat, please let us know.
Check-in date:
Check-out date:
Room type:
traditional, deluxe or club
single or double
smoking or non-smoking
Credit card number:
Expiration date:

That's right, they want you to reply with your credit card number--in e-mail! Now, normally I just mutter something under my breath, remind myself that there's a $50 limit and send the card along. However, this time, as they say in the government, "mistakes were made". If you studied the above message carefully, it shouldn't be hard to figure out what happened: I replied to the mail and didn't check the recipient lines and my mailer helpfully sent a copy of my credit card # to everyone who had gotten the original message. Outstanding.

And yes, before you ask, I've cancelled that card. I mostly trust my colleagues but not necessarily their system administrators.

Posted by ekr at 08:43 AM | Comments (39) | TrackBack

June 08, 2004

More on obesity and voting

Apparently I'm not the only one who noticed a potential connection between obesity and voting. Ken Hirsch pointed me to this article by Kevin Drum making a similar argument. Kevin points to the same figure which Craig pointed out in his comments on the original post:

Note: I have neither verified the figure nor attempted to actually compute correlations on the data. I'm just making a qualitative observation. See, however, Ken Hirsch's report of a .238 correlation in the original post.

Posted by ekr at 09:59 PM | Comments (32) | TrackBack

Where are the audits of VoteHere?

A number of the people who responded to my post on Open Source and e-voting (see both the comments section and Ed Felten's post at Freedom to Tinker) argued that voting was a potential special case and might see more audit than your average piece of Open Source software. This morning, I happened to notice that VoteHere published their source code back in April. It's not quite Open Source--there are some license restrictions, but they don't look particularly restrictive. And yet, I haven't seen any published audits of this code.

As usual, Bruce Schneier delivers the money quote in the above MSNBC article:

Schneier said he didn't plan to analyze the source code -- and wondered whether any serious security experts would take on the challenge. "That would take 80 to 100 hours of my time, and no one's going to pay me," he said.

I suspect a lot of people feel that way.

Posted by ekr at 07:18 AM | Comments (35) | TrackBack

80% of spam from Trojans?

I know I've mentioned the problem of spam zombies before. I just saw this Register article claiming that 80% of spam now comes from spam zombie networks. Unfortunately, the article just points to this one-pager that doesn't describe any of the details so it's impossible for me to assess whether it's true or not. It sure would be nice for companies who publish this kind of stuff to actually document their methods.
Posted by ekr at 07:03 AM | Comments (34) | TrackBack

June 07, 2004

Pimp My Ride

I am currently watching what may be the first great television show of the 21st century: MTV's Pimp My Ride. Xzibit and a bunch of insane car customizers are let loose on some loser and their old beater. They then proceed to completely remodel the vehicle, starting by fixing everything that's wrong and then going way over the top with some set of modifications that no sane person could want in their vehicle, generally involving a neon system and installing a video screen somewhere. It's kind of like Monster Garage meets Queer Eye for the Straight Guy.

In the edition of Pimp My Ride that I just finished watching, they install a whole new stereo system but leave it all under plexiglass so you can see the inner workings. I actually find this fairly disturbing because I figure that we're only about two years away from having a case modding show hosted by Commander Taco and called "Pimp My Computer".

Posted by ekr at 07:54 AM | Comments (14) | TrackBack

June 06, 2004

Open source won't save e-voting

In a NYT Magazine article published May 30th, Clive Thompson argues for Open Source code for electronic voting machines:
First off, the government should ditch the private-sector software makers. Then it should hire a crack team of programmers to write new code. Then -- and this is the crucial part -- it should put the source code online publicly, where anyone can critique or debug it. This honors the genius of the open-source movement. If you show something to a large enough group of critics, they'll notice (and find a way to remove) almost any possible flaw. If tens of thousands of programmers are scrutinizing the country's voting software, it's highly unlikely a serious bug will go uncaught.

It may very well be a good thing to have Open Source software for voting, but the assumption that underlies Thompson's argument--that Open Source is somehow a magic engine for producing bug-free software--is transparently false. Open Source software, like all software, is riddled with bugs. Many of these bugs have security implications. Moreover, these bugs can persist for long periods of time. For instance, the Linux mremap() problem (CAN-2004-0077, described here) has been in every Linux kernel since at least 2.2 (released in 1999) and was only discovered in March of 2004. Alternately, consider the OpenSSL buffer overflows. These had been around since at least 1998 and probably earlier but were only found in 2002. So much for tens of thousands of programmers finding any serious bug.

The truth of the matter is that--contrary to popular myth--practically nobody bothers to audit any Open Source code. Auditing code is a mind-destroyingly boring exercise and it's not even clear what percentage of vulnerabilities a good audit actually finds (practically no research has been done on this point). I'm probably one of the 50-100 people most qualified to audit OpenSSL: I'm a security guy specializing in SSL who uses OpenSSL on a regular basis and I've spent substantial amounts of time groveling through the source code. But the only time I look for security holes is if I happen to run into something that looks fishy. No one I know seriously believes that we've found the last security hole in OpenSSL or Linux.

This isn't to say that auditing source code isn't worthwhile. It's just that the idea that we can audit the bugs out of a piece of code is in my view fundamentally misguided. What audits are useful for is getting an idea of how good the overall code quality is. If you audit selected pieces of a piece of software and you find a bunch of serious errors, it's safe to conclude that the company needs to shoot the programmers and start over. So, for instance, when Kohno et al. found a bunch of problems in Diebold's e-voting system, the conclusion to draw wasn't that these were all the problems that there were but rather Diebold didn't have the first clue how to write secure software. As Avi Rubin's page on e-voting points out:

To help mitigate the risks identified in the security analyses, Maryland proposed a set of technological changes to Diebold's voting machines as well as procedural changes to the election process. While this may help "raise the bar," it is impossible to know whether any security analysis identifies all the possible vulnerabilities present in an analyzed system. By only patching the known vulnerabilities, Maryland is not actually ensuring that the voting system will be secure. Rather, Maryland should follow security engineering best practices, which state that security can only be assured through a rigorous design process that considers security from a project's conception, not through a set of patches applied after the fact.

So, if we're going to have e-voting, what we really need is a procedure that allows for routinized and systematic review of voting systems. The purpose of this review is not to find vulnerabilities--though of course it would be nice to fix any that are found--but rather to assess whether the vendor is following good software engineering practices, with serious consequences if they are not. Open Source might or might not help us get that sort of review--personally, I expect that to get really thorough review you'll need to pay people to do it--but just making the code Open Source doesn't improve the situation much at all.

Posted by ekr at 02:58 PM | Comments (26) | TrackBack

June 05, 2004

Reagan is dead

So, Reagan has finally died. It feels strange. My first real political memory is of Reagan being shot by John Hinckley. I'm not looking forward to having half the federal buildings in the country being named for Reagan, though...
Posted by ekr at 08:01 PM | Comments (44) | TrackBack

June 04, 2004

Buying DDoS attacks

It's been common knowledge in the security community that you could buy DDoS attacks on third parties, but generally the Black Hats don't make a big deal about it in public. However, Alexei Roudnev just received this advertisement which he posted to NANOG (the North American Network Operators Group):
Dear sirs.

We are glad to you to give qualitative service, on elimination of sites. We can kill any site by our attack, which have name 'DDos attack'. We have already killed hundreds Russian and foreign sites. If you have enemies, and It is necessary to get rid of them, ask us and we will help with pleasure.

The prices at us low, 60 dollars for 6 hours. 150 dollars day. Destroy any project on the Internet with the help of ours DDos service. Payment prinimaetsja in sisteme WebMoney.

Contacts: ICQ 783603.

(translated from Russian by Alexei).

Outstanding. Now I can finally get rid of my enemies over at I Could Be Wrong.

Posted by ekr at 08:17 AM | Comments (4) | TrackBack

June 03, 2004

Doh!

For the past month or two I've been regularly scanning the spam folder for false positives and deleting everything else. I always knew that this was a lousy, error-prone procedure but it just cost me something. I'd been waiting for a response from someone and noticed that I never got one so I went back to my mail logs and saw that it had been incorrectly filed as spam and I'd apparently deleted it. Luckily, I was able to extract it from my backups, but all in all kind of a close call. So, Command Decision: instead of deleting the spam, I now shove it into a second spam folder which I can keep around for a month or so for precisely this eventuality. Outstanding.
Posted by ekr at 07:47 AM | Comments (47) | TrackBack

June 02, 2004

How did we break the Iranian code?

There's a really surprising report in today's New York Times. According to the report, Ahmad Chalabi told the Iranians that we'd broken their secret code. What's surprising isn't that Chalabi told them--I've figured he was a weasel for quite some time now--but that we were able to break their code at all.

Cryptographic systems generally consist of two pieces:

  • An algorithm--the cryptographic function itself.
  • A key--a short piece of secret information that is used to control the operation of the algorithm.

Now, here's the important bit: the systems are designed so that you can't read transmissions for which you don't have the key even if you know the algorithm. I and my worst enemy can use the same algorithm and he can't read my messages unless he knows my key--at least in principle.

So, with this in mind, here are some possibilities for what might have happened:

  1. The Iranians used a standard algorithm such as AES, 3DES, etc. but it turns out that the NSA (or whoever) knows how to cryptanalyze that algorithm. This would be an incredibly big deal, since at least in the public sector we have no good idea how to attack these algorithms. And AES and 3DES in particular have been vetted by the NSA, so they should be secure. If the NSA had broken these algorithms, I doubt they'd tell anyone who would even think of telling Chalabi. So, I don't think this explanation is that likely.
  2. The Iranians are using standard algorithms but have bungled the key management. For good compromise containment, each facility should use a different key and you should change those keys frequently. If the Iranians didn't do one of these things, then a single black bag job on their embassy might be sufficient to compromise a good fraction of their traffic. This seems pretty likely: bad key management is a frequent problem with cryptographic systems.
  3. The Iranians are using non-standard algorithms--they don't trust the NSA not to have broken the standard ones--and the NSA broke the Iranian algorithms. If you're a small country, designing your own algorithms doesn't sound crazy initially, but actually it is. Algorithm design is hard and it's easy to design something that's badly insecure. There are a lot of believed-to-be-good algorithms that the NSA had no hand in designing, and in fact were designed by people who considered the NSA part of their threat model. So, if you don't trust AES you can always use Blowfish or IDEA. Even if for some reason (national pride, paranoia) you feel you need to design your own algorithm, the safe thing to do is to superencrypt--encrypt the data first using your own algorithm and then encrypt the ciphertext with AES. This protects you against either algorithm being broken. Finally, if you really don't trust the NSA and you're just encrypting small amounts of diplomatic traffic, just use a One-Time Pad, which is provably secure. You can move the keying material around on DVDs in your diplomatic bags. Nevertheless, this seems like a fairly likely case. For some reason having a homegrown cipher algorithm seems to be a point of national pride, and superencrypting would be pretty much admitting you don't trust your own cryptographers.
  4. It's all a hoax and the NSA is reading the traffic some other way, perhaps by bugging the Iranian embassy. In this case, the NSA might actually want to have it spread around that they've broken the Iranian codes since it makes them look extremely competent and there's a good chance that the Iranians will change codes and then be confident that their communications are secure.
  5. It's all a complete hoax and the NSA never could read the Iranian communications.
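For the curious, here's roughly what the One-Time Pad option from the third possibility looks like in code. This is a minimal sketch using only the Python standard library, not a description of anyone's actual system; the function names and the messages are made up. The pad has to be truly random, at least as long as the message, and never reused.

```python
import secrets


def make_pad(length):
    """Generate `length` bytes of random keying material (the pad)."""
    return secrets.token_bytes(length)


def otp_apply(data, pad):
    """XOR the data with the pad. Encryption and decryption are the same operation."""
    if len(pad) < len(data):
        raise ValueError("pad must be at least as long as the message")
    return bytes(d ^ k for d, k in zip(data, pad))


message = b"MEET AT THE EMBASSY AT NOON"
pad = make_pad(len(message))  # shipped ahead of time, e.g. in the diplomatic bag

ciphertext = otp_apply(message, pad)
recovered = otp_apply(ciphertext, pad)
print(recovered)

# Why it's provably secure: for ANY other message of the same length there is
# some pad that decrypts the ciphertext to it, so the ciphertext alone tells an
# eavesdropper nothing about which message was actually sent.
decoy = b"NOTHING TO SEE HERE, REALLY"
decoy_pad = otp_apply(ciphertext, decoy)
print(otp_apply(ciphertext, decoy_pad))
```

The catch, of course, is the key management: you need as much pad as you have traffic, and reusing any of it throws the security guarantee away.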

It would be interesting to know what really happened.

UPDATE: Justin Mason suggests a variant of the fourth possibility: the Iranians were sold crypto equipment with backdoors. Given the easy availability of modern crypto software, you would think it would be pretty easy to avoid this kind of situation.

Posted by ekr at 03:07 PM | Comments (63) | TrackBack

Stupid disclaimers

Slate has a nice piece debunking those ubiquitous e-mail disclaimers that companies often require employees to put at the end of their messages. Unsurprisingly, the conclusion is that they're basically legally unenforceable. I know that I generally ignore them.
Posted by ekr at 09:11 AM | Comments (43) | TrackBack

A stuffed shirt full of hot air

This Spanish company has invented the Dressman, a gizmo that lets you press a shirt without an iron. Basically, it's a human upper-body form that inflates with hot air. You put the wet shirt on the dummy and turn it on and then it dries and presses the shirt in one operation. Kind of clever, actually. Would be more useful if I wore shirts that needed pressing more than once every 3 months.
Posted by ekr at 07:38 AM | Comments (37) | TrackBack

June 01, 2004

NA's anti-spam patent

NA has just been granted a patent on a bunch of spam filtering technologies. Claim 1 is:
A method for filtering unwanted electronic mail messages, comprising: receiving electronic mail messages; filtering the electronic mail messages that are unwanted utilizing: compound filters, paragraph hashing by hashing a plurality of paragraphs and utilizing a database of hashes of paragraphs, wherein the paragraph hashing excludes at least one of a first paragraph and a last paragraph of content of the electronic mail messages wherein a plurality of hashes each has a level associated therewith, and the hashes having a higher level associated therewith are applied to the electronic mail messages prior to the hashes having a lower level associated therewith, and Bayes rules; and categorizing the electronic mail messages that are filtered as being unwanted; wherein the utilization of the Bayes rules includes identifying words of the electronic mail messages; wherein the utilization of the Bayes rules further includes identifying a probability associated with each of the words; wherein the probability associated with each of the words is identified using a Bayes rules database; wherein the electronic mail messages are filtered as being unwanted based on a comparison involving the probability and a Bayes rules threshold; wherein the threshold is user-defined.

The strange thing is: this patent was filed December 2002. As far as I can tell, this is just a standard message hashing technique (a la Vipul's Razor) glued together with the Bayesian spam filtering described in Paul Graham's A Plan For Spam article. And that's just the article that popularized the technique. There were articles being published on the topic as early as 1998. I guess the composition of two known techniques isn't always obvious, but considering that SpamAssassin combined multiple filtering techniques at least as far back as May 2002 (probably earlier but that's the oldest version I have), it's hard to see what's new here.
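To show how little glue is involved, here's a rough sketch of the two ingredients stuck together. This is emphatically not the patented method, Vipul's Razor, or SpamAssassin's code: the hash database, the word probabilities, and the function names are all hypothetical, and I've ignored the claim's business about hash "levels". The paragraph hashing skips the first and last paragraphs, and the Bayes part combines per-word probabilities and compares the result against a user-settable threshold.

```python
import hashlib
import math
import re

# Hypothetical databases; a real filter would build these from a mail corpus.
KNOWN_SPAM_PARAGRAPH_HASHES = set()
WORD_SPAM_PROBABILITY = {
    "viagra": 0.99, "mortgage": 0.95, "free": 0.80,
    "meeting": 0.05, "ietf": 0.01,
}


def paragraph_hashes(body):
    """Hash every paragraph except the first and the last."""
    paragraphs = [p for p in re.split(r"\n\s*\n", body) if p.strip()]
    middle = paragraphs[1:-1]
    # Normalize case and whitespace so trivial edits don't change the hash.
    return [hashlib.sha1(" ".join(p.lower().split()).encode("utf-8")).hexdigest()
            for p in middle]


def bayes_spam_probability(body):
    """Combine per-word spam probabilities, naive-Bayes style.

    Words missing from the database are simply ignored in this sketch.
    """
    words = set(re.findall(r"[a-z']+", body.lower()))
    probs = [WORD_SPAM_PROBABILITY[w] for w in words if w in WORD_SPAM_PROBABILITY]
    spam = math.prod(probs)
    ham = math.prod(1 - p for p in probs)
    return spam / (spam + ham)


def is_spam(body, threshold=0.9):
    """Flag on a known paragraph hash first, then fall back to the Bayes score."""
    if any(h in KNOWN_SPAM_PARAGRAPH_HASHES for h in paragraph_hashes(body)):
        return True
    return bayes_spam_probability(body) > threshold


# Record the middle paragraph of one known spam message.
known_spam = "Hi there\n\nBuy cheap watches today\n\nUnsubscribe here"
KNOWN_SPAM_PARAGRAPH_HASHES.update(paragraph_hashes(known_spam))

# Same middle paragraph, different greeting and sign-off: caught by the hash.
variant = "Dear friend\n\nBuy  cheap watches TODAY\n\nBest wishes"
print(is_spam(variant))
print(is_spam("free viagra and a free mortgage"))
print(is_spam("ietf meeting agenda"))
```

The threshold of 0.9 is just an assumed default here; the point is that the caller can change it.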

Posted by ekr at 10:08 PM | Comments (46) | TrackBack