May 31, 2003

Scientific suffering

The major change in endurance training over the past two decades has been the wide availability of technology allowing you to finely calibrate how much you're suffering. For years, everyone used more or less the same tools: stopwatches and bicycle computers for measuring speed. The problem with these tools is that unless you are on a totally flat surface, speed doesn't correlate very well with how hard you're actually working. Since most endurance training is directed towards training your cardiovascular system, that means you don't necessarily get the optimal training load.

Heart Rate Monitors
The first piece of consumer performance technology to appear was the heart rate monitor (HRM). An HRM typically consists of a sensor built into a chest strap. It communicates via radio with a receiver built into a watch. The watch displays your current heart rate. With an HRM, you can target specific heart rates instead of specific speeds. Thus, no matter what course you're working out on, you can get the exact desired training effort.

What is this good for? Three things. The first use is obvious: It's really easy to fool yourself into thinking you're working hard when you're really not. The HRM tells the true story. Thus, as you get better over time the HRM automatically makes you go faster to stay at a specific effort level. Second, the HRM keeps you in check. Your long distance days should be really really easy, but people feel guilty and often go fast. If you never exceed your target easy HR, you know you're going easy enough.

The final use for an HRM is for tracking day to day training status. When you're sick or overtrained your resting heart rate goes up. So, if you take your HR every morning and you notice an elevation, it's probably time to take it easy for a couple days.
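The morning check is simple enough to sketch in a few lines. This is only an illustration: the 5 bpm threshold and the one-week baseline are my assumptions, not numbers from any training manual, and the function name is made up.

```python
from statistics import mean

def resting_hr_elevated(history_bpm, today_bpm, threshold_bpm=5):
    """Return True if today's resting HR is well above the recent baseline.

    threshold_bpm is an assumed cutoff; pick your own based on how noisy
    your morning readings are.
    """
    baseline = mean(history_bpm)
    return today_bpm - baseline >= threshold_bpm

# Hypothetical week of morning readings, in beats per minute.
last_week = [52, 51, 53, 52, 50, 52, 51]

print(resting_hr_elevated(last_week, 52))  # an ordinary morning
print(resting_hr_elevated(last_week, 60))  # probably time to take it easy
```

Comparing against a running average rather than yesterday's number keeps one odd reading from triggering a rest day.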

CompuTrainer
The CompuTrainer is a computerized bicycle trainer. Essentially, it's a stand you put your bicycle on and it provides a programmable load at the back wheel. Bicycle trainers have been around for a long time, but they used to just generate resistance mechanically, which made them hard to control. The CompuTrainer brings two things to the party:

  • The load is generated electronically so it's tightly controllable in resolutions of 10 watts.
  • There is a computer interface allowing you to integrate with specialized training software.

The CompuTrainer is perfect for doing interval training on, because it allows you to set your load precisely. In combination with the HRM, you can hit exactly the desired training load. Best of all, you can use your own bike, so the ergonomics are the same as they would be in a real workout.

Power meters
The big problem with the CompuTrainer is that it's fixed in place. This makes it really boring and you don't get to train in different terrain. It's also fiendishly hot in the summer since there's no airflow unless you have a very good fan blowing directly on you. In the past 5 years or so, you've been able to buy power meters that you can attach directly to your bicycle. This lets you measure your power output in field conditions. Power meters are big with the pros and lately have been moving into the amateur ranks.

So what?
So, what's all this technology buying you? At least theoretically, it's helping you improve faster. There's some modest amount of research showing that training at the right effort levels leads to faster improvement. More importantly, it's making it easier to train yourself. If you don't have a coach--most people don't--it can be very hard to select your own training loads based on time because people's strengths and weaknesses are very individual. However, the appropriate effort levels are much more standardized. You can take a few relatively simple measurements and know exactly what heart rates you need to work at. So, amateurs can get better training and flail less.
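As a rough illustration of "a few simple measurements", here is one common way to turn a resting and a maximum heart rate into target rates: the heart rate reserve (Karvonen) method. I'm not claiming this is the only or best method, and the intensity fractions and the example athlete below are assumptions for the sake of the example.

```python
def target_hr(resting_bpm, max_bpm, intensity):
    """Heart rate reserve (Karvonen) method.

    target = resting + intensity * (max - resting)
    intensity is a fraction between 0 and 1.
    """
    reserve = max_bpm - resting_bpm
    return round(resting_bpm + intensity * reserve)

# Hypothetical athlete: resting HR 50, measured max HR 190.
easy_ceiling = target_hr(50, 190, 0.6)   # assumed cap for long easy days
interval_hr = target_hr(50, 190, 0.8)    # assumed target for hard intervals

print(easy_ceiling, interval_hr)
```

The point is that the inputs are two numbers you can measure yourself, which is a lot easier than guessing what pace you ought to be holding.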

Really, though, what it's buying you is a lot of pain. When you train alone, it's easy to tell yourself that you're working hard enough. The technology keeps you honest, which means you suffer more. But in a good way, or at least that's what we tell ourselves.

Posted by ekr at 10:24 AM | Comments (12) | TrackBack

May 30, 2003

What Matrix Reloaded should have been

This really is a lot better than Matrix Reloaded. Between this and Troops I'm starting to wonder if the world wouldn't be better if directors stopped making sequels and just let the fans do it.
Posted by ekr at 10:17 AM | Comments (5) | TrackBack

NIH isn't a compliment

Nullsoft (the people behind Winamp and Gnutella) have designed a new package called WASTE (something seems to be wrong with this URL; try the mirror) for encrypted communications within small groups. For some unknown reason they have decided to invent their own security protocol rather than using the standard tool for this job--SSL. Here's what they say:
Note: It might be worth implementing WASTE using a subset of SSL, to avoid any concern of flaws in this protocol. Feedback is gladly accepted on any potential weaknesses of the negotiation. We have spent a decent amount of time analyzing this, and although we have found a few things that are not ideal (i.e. if you know public keys from a network, you can sniff some traffic and do an offline dictionary attack on the network name/ID), but overall it seems decent. The current implementation probably needs work, too.

Huh? SSL would have done the job, but we decided not to use it, for no particular reason.

Worse yet, they don't have a protocol spec, just annotated source code and a sort of overview doc. So far, I've been too lazy to read the code, but what's in their description doesn't look really promising.

WASTE secures the links of the WASTE network by using RSA to exchange session keys and authenticate the other end of the connection. Once the hosts have authenticated each other and both have the correct session keys, the connection is encrypted using Blowfish in PCBC mode (using different IVs for each direction of the connection). The oversimplified process for bringing a link up is (see comments in the code and the code itself for a more in depth view):
  • Both sides exchange public key hashes, and verify that they know that hash
  • Both sides exchange session keys and challenge-response tokens encrypted with each others public keys.
  • Both sides decrypt and verify the challenge-response tokens, and begin encrypted communication (a stream of messages, each message is verified using an MD5).

There's a lot more to it than that, but that's the basic idea. The reality of it is that there is also a "Network ID/Name" feature that allows you to easily keep networks from colliding, as well as efforts to obfuscate the whole process (to make WASTE connections difficult to detect). Another unique feature is the way session keys are exchanged and combined so that in order to decrypt past (recorded) traffic, both private keys of a connection need to be recovered.

This property may be unique, but it's not really an improvement over SSL. SSL includes modes which offer Perfect Forward Secrecy, in which even knowing both private keys isn't sufficient to recover recorded traffic.

So far, I don't see any way in which WASTE is better than SSL, and some design choices that look questionable. Why do people insist on designing their own thing when the standard tools will do the job perfectly well? This is an even worse idea than usual when designing communications security protocols, which are hard to get right and often have subtle bugs. The fantasy that you have to have all your own stuff is called NIH (Not Invented Here) and it's not a good thing.

Extra for security guys:
Why do people who design their own protocols always seem to use Blowfish? There's a reason we have AES. For bonus points, the geniuses at Nullsoft use PCBC mode. That's not exactly a sure sign of massive crypto expertise.
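For anyone who hasn't run into PCBC, here is a minimal sketch of the chaining and of its best-known quirk: swap two adjacent ciphertext blocks and only those two blocks decrypt to garbage, while everything after them comes out fine. The "cipher" below is a toy byte-wise addition standing in for Blowfish; it is not secure and is not WASTE's code. All the names are mine, and there is no padding, so inputs must be a multiple of 8 bytes.

```python
BLOCK = 8  # Blowfish also uses 8-byte blocks

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def toy_encrypt_block(block, key):
    # Stand-in for a real block cipher: byte-wise addition. NOT secure.
    return bytes((b + k) % 256 for b, k in zip(block, key))

def toy_decrypt_block(block, key):
    return bytes((b - k) % 256 for b, k in zip(block, key))

def split_blocks(data):
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]

def pcbc_encrypt(plaintext, key, iv):
    prev, out = iv, []
    for p in split_blocks(plaintext):
        c = toy_encrypt_block(xor(p, prev), key)
        out.append(c)
        prev = xor(p, c)  # PCBC chains on plaintext XOR ciphertext
    return b"".join(out)

def pcbc_decrypt(ciphertext, key, iv):
    prev, out = iv, []
    for c in split_blocks(ciphertext):
        p = xor(toy_decrypt_block(c, key), prev)
        out.append(p)
        prev = xor(p, c)
    return b"".join(out)

key = bytes(range(1, 9))   # toy key, one byte per block position
iv = bytes(BLOCK)          # all-zero IV, for illustration only
plaintext = b"AAAAAAAABBBBBBBBCCCCCCCCDDDDDDDD"
ciphertext = pcbc_encrypt(plaintext, key, iv)

# An attacker swaps the first two ciphertext blocks in transit.
blocks = split_blocks(ciphertext)
blocks[0], blocks[1] = blocks[1], blocks[0]
tampered = pcbc_decrypt(b"".join(blocks), key, iv)

print(tampered[:16] != plaintext[:16])  # the swapped blocks are garbled
print(tampered[16:] == plaintext[16:])  # later blocks are untouched
```

The swap property comes from the mode, not from the toy cipher, so it holds with any block cipher plugged in. That means you can't count on the chaining alone to catch tampering.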

Posted by ekr at 08:22 AM | Comments (10) | TrackBack

And on the other hand...

Matthew Yglesias complains that journalists often present subjects where there actually is a right answer as just a "he said, she said" story:
Quite so. I'll be the first to conclude that the facts of the matter regarding economics are often non-obvious. If there's something unclear about the facts, however, then that is what a reporter ought to report, just as when it was unclear whether or not Saddam Hussein had been killed in the "decapitation" strike reporters reported that it was unclear whether or not Saddam had been killed. The whole purpose of reporters, after all, is to try and figure out what's going on and then convey that information to the reading public. When you just quote two people at random the reader still has to do all the work himself if he wants to know what the truth is.

In Matthew's comments section, Ikraem Saaed gets pretty close to what I suspect the real answer is:

In the field of economics, as in many other fields, they are clueless. Many are innumerate. They attempt to turn factual matters into horseraces. Economic issues should be covered like science issues are, but instead they are covered like political issues are.

I think this is definitely part of the equation. Lots of people (myself included) have had the experience of reading media coverage in their field and seeing the writer totally botch it, even when the subject is totally non-controversial. So, it's pretty clear that there are lots of situations where the media really doesn't understand the field.

However, I think that calling reporters stupid isn't necessarily fair. I think of it more like boundedly rational. In most social situations, the best way of discovering objective truth--even when there's objective truth to be had--isn't by taking measurements and doing statistics but rather by asking around and trying to figure out who's telling the truth. The techniques of science are very powerful but also heavyweight to employ and have a limited scope. They also take a long time to get into the habit of using. So, it's not surprising that reporters try to employ social techniques to figure out what's going on, even when they're inappropriate.

To make things worse, figuring out what objective truth is turns out to be rather a messy business. Take something like evolution, which I consider to be about as much of a scientific fact as we have. Nevertheless, you can definitely find someone who will make scientific-sounding explanations as to why it's not true. If you give me two months and are willing to learn a lot of physics, biology, and chemistry, I can show you that all of those objections are false, but reporters don't have that kind of time. How, then, are reporters supposed to distinguish such issues from ones where there is genuine controversy? Easier not even to try.

Posted by ekr at 07:55 AM | Comments (41) | TrackBack

May 29, 2003

New paradigm, huh?

A company called Titan Key has just started talking about their anti-spam technology, which they say is a "new paradigm" and "superior to EVERY other anti-spam product". This sort of hype reeks of snake oil and close inspection seems to bear out that impression.

Titan Key's "new paradigm" technology is an amalgam of a number of previous anti-spam ideas:

  • A filtering proxy.
  • Challenge response.
  • Bounces.
  • Custom-built addresses.

The Titan Key challenge response system differs from most other challenge response systems in that it's intended to be implemented in the mail server instead of by the user. If a message arrives which was not sent by a known sender, the system generates an SMTP bounce and then initiates the challenge response handshake. The reason that this is supposedly cool is that the spammer processes the bounce and then takes you off their list. In practice, however, that's not the case--spammers generally ignore these messages.

The downside of this strategy, of course, is that instead of just getting a challenge, anyone who tries to send mail to you for the first time gets both a challenge and a rather disconcerting bounce.
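To make the flow concrete, here is a sketch of server-side challenge response as I understand it from the description above. It is not Titan Key's implementation; the function names, the action strings, and the in-memory sets are all hypothetical.

```python
def handle_incoming(sender, known_senders, pending):
    """Decide what a hypothetical server-side filter does with a message."""
    if sender in known_senders:
        return ["deliver"]
    # Unknown sender: hold the mail, send a bounce AND a challenge.
    pending.add(sender)
    return ["bounce", "challenge"]

def handle_challenge_reply(sender, known_senders, pending):
    """A correct reply to the challenge promotes the sender to 'known'."""
    if sender in pending:
        pending.discard(sender)
        known_senders.add(sender)
        return True
    return False

known = {"friend@example.com"}
waiting = set()

print(handle_incoming("friend@example.com", known, waiting))
print(handle_incoming("stranger@example.net", known, waiting))
print(handle_challenge_reply("stranger@example.net", known, waiting))
print(handle_incoming("stranger@example.net", known, waiting))
```

Note that the legitimate first-time sender is the one who sees both the bounce and the challenge. A spammer who never reads replies sees neither.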

The other thing that the Titan Key guys do is generate custom addresses so you can hand them out for mailing lists. They also use sender filtering to supposedly protect you from those custom addresses being leaked to spammers. Unfortunately, spammers often forge their "From" addresses, in which case they might be able to use custom addresses, making this trick not very attractive.
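The weakness is easy to see if you write the sender filter down. The sketch below is my guess at the general shape of such a filter, with made-up addresses; the important detail is that the only thing it can check is the "From" header, which the sender gets to choose.

```python
def sender_filter_accepts(custom_address, from_header, allowed_senders):
    """Accept mail to a custom address only from its registered senders.

    allowed_senders maps each custom address to the set of From addresses
    permitted to use it (a hypothetical data layout).
    """
    return from_header in allowed_senders.get(custom_address, set())

allowed = {
    "me-somelist@example.com": {"somelist@lists.example.org"},
}

# An honest spammer using a leaked custom address is blocked...
print(sender_filter_accepts(
    "me-somelist@example.com", "spammer@example.net", allowed))

# ...but one who forges the list's From address sails through, because
# the filter cannot tell a forged header from a real one.
print(sender_filter_accepts(
    "me-somelist@example.com", "somelist@lists.example.org", allowed))
```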

The bottom line here is that the "new paradigm" looks a lot like the old paradigm. I don't expect to buy a Titan Key any time soon.

Posted by ekr at 04:40 PM | Comments (63) | TrackBack

The beginning of the end for Mozilla?

News.com is reporting that AOL is going to license Internet Explorer. As part of the agreement, Microsoft will pay AOL $750 million (pocket money for MS) to settle their antitrust case and AOL will get a 7 year royalty-free license for IE.

Recall that AOL bought Netscape a while back and has been continuing to maintain the open source browser Mozilla as well as ship versions of Netscape Communicator based on Mozilla. If AOL now has an IE license it's hard to understand why they would want to continue to pay to develop Mozilla. As an outsider, it never seemed to me that Mozilla became self-sustaining the way that, say, Linux did, so I'm not sure how long it can survive under these circumstances.

Posted by ekr at 02:35 PM | Comments (12) | TrackBack

May 28, 2003

Knowing your place

J. of the Silver Rights blog takes me to task for my post about Annika Sorenstam:
I think it's reasonable to say that you're outclassed when it's really clear that you are and that you have no chance of making up the difference. Now, it's no doubt easier to say that if you just won, but it's certainly arguable that that makes you more of a wimp, not less, since you've proved that you can compete, at least at some level, if not the highest possible level. That's not my view, however. It's critical to know your limitations.

But, would he apply the same reasoning to a man? It seems to me that women are too often told to know their limitations and what those limitations are by men who want to limit competition -- for education, for jobs and for the PGA tour.

Not only would I apply the same reasoning to a man, I did apply the same reasoning to a man a few paragraphs before, where I said:

It seems to me that a higher weight class is more or less analogous to playing in the men's tour instead of the women. Bigger guys are just stronger and will pretty much always beat smaller guys of equivalent skill level. Actually, it understates the difference, since bigger guys are typically slower than smaller guys and men are not slower than women.

I've got absolutely no problem with women competing against men in any environment. In fact, I've had my ass handed to me by women in any number of triathlons. However, it's also fairly clear that in pretty much any athletic endeavor, the top men have a more or less insurmountable advantage over the top women. There's nothing shameful about women admitting that and demanding that they have a protected (i.e. female-only) environment in which to compete. If women would prefer to have no gender divisions in sports--with the likely result that women almost never win anything--I'm fine with that too.

Posted by ekr at 09:40 PM | Comments (37) | TrackBack

More threatening noises from SCO

This Marketwatch story on the SCO/Linux kerfuffle has an interesting bit at the end:
McBride added that unless more companies start licensing SCO's property, he may also sue Linus Torvalds, who is credited with inventing the Linux operating system, for patent infringement.

Huh? Remember, folks, Unix was invented in 1969, BSD 4.3 dates from 1986, and System V dates from 1987. Pretty much all those patents are expired now or will be in a year or two. Of course, they could punish Linus for his past infringement, but what would that buy them? It would be interesting to know what patents they claim he's infringing.

Posted by ekr at 09:08 PM | Comments (13) | TrackBack

How much should you pay for software?

Last night I discussed my first area of disagreement with Dave Winer's postings on software economics. This post discusses my second and third areas of disagreement.

First, Dave's fundamental premise--that people aren't willing to pay for software--seems to me to be basically wrong. Dave writes:

When I say there's no money for software, that's not a literal statement, of course. Sure there is some money. When you buy a new computer you probably pay a hundred dollars for software, most of it going to Microsoft. So they've figured out how to get money to flow.

This understates the case a bit. The total revenues of the US software industry in 2001 (the last year for which we have data) were $91 billion. That's a lot of money, more than the size of the movie and sound recording industries combined and more than 50% more than the size of the professional art and sports industries. So, people certainly seem pretty willing to pay for software to me.

Now, it's possible that Dave just thinks that people aren't willing to pay enough for software. However, making that evaluation requires deciding what "enough" means, which is always a tricky proposition. Based on this article, it seems to me that Dave is making several arguments that we're not willing to pay enough. I'll take them in turn.

The labor theory of value
Dave's first argument is basically the labor theory of value:

A professional software organization for a well-supported product has 10-20 people, maybe as many as 30 to 40. So when you hear yourself complaining about software quality, think about how much money the developer of the product has to fully support it. Could you run a car in the Indy 500 with no money? You could try, and that's what a lot of software developers do, to no avail. Sooner or later you have to pay the bills. It costs money to live. That's as true of software as it is of people.

The refutation of the labor theory of value is pretty much the same now as it was when it was first proposed. So what if you've spent thousands of dollars digging a hole in the ground? It's still just a hole. The value of something is determined by what people are willing to pay for it, not by what it costs to produce. That doesn't mean that the cost is irrelevant, of course, since it represents the lowest price at which a commodity can be sold (ignore the difference between fixed cost and marginal cost for the moment). If the cost of producing something is more than people value it at, then it shouldn't be produced. That's the market at work.

Reverse labor theory of value
Dave's second argument seems to be that the amount of time using something should control how much you pay:

Let's say you spend 100 hours a year using a piece of software and assume your time is worth $50 per hour. So that's $5000 of your time flowing through the software. How much self-respect is there in paying nothing for software that leverages so much of your time?

Maybe I'm missing something important, but I don't understand this argument at all. What does the amount of my time I spend using a product have to do with how much I ought to value that product? I've probably spent hundreds of hours wearing the free t-shirt I got from Wired magazine, but I don't feel at all guilty for not paying for it. By contrast, I've got friends who go to restaurants and pay $50 for an after-dinner glass of Port, which takes them 5 minutes to drink.

The relevant question when deciding how much something is worth to you is how much you would be willing to pay rather than do without it. Since I have hundreds of t-shirts, and get a couple of free t-shirts every time I attend a conference, the marginal value of the Wired t-shirt is pretty much zero. Similarly, the relevant question when I value a software product isn't how much I use it but rather how much better having it has made my situation. So, if this hypothetical piece of software had saved me hundreds of hours, then we'd be talking some significant value. As it is, we have no idea.
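The difference between the two ways of counting is easy to show with Dave's own numbers. The "hours without the software" figures below are invented purely for illustration, since that's exactly the number we don't have.

```python
def time_flowing_through(hours_used, hourly_rate):
    """Dave's measure: the value of the time you spend in the product."""
    return hours_used * hourly_rate

def marginal_value(hours_with, hours_without, hourly_rate):
    """My measure: the value of the time saved versus the next-best option."""
    return (hours_without - hours_with) * hourly_rate

# Dave's example: 100 hours a year at $50 per hour.
print(time_flowing_through(100, 50))

# Hypothetical alternatives: the next-best tool takes 102 hours, or 300.
print(marginal_value(100, 102, 50))
print(marginal_value(100, 300, 50))
```

The first number is the same no matter how good the alternatives are, which is why it tells you nothing about what the software is worth.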

You'll be sorry later
Finally, Dave argues that if we don't pay for software we'll get bad software and therefore we ought to:

If you don't pay, the bottom-line is that you lose. It may look like you're not losing, but you are. If you paid nothing for health care, you'd likely die sooner. If you pay nothing for software, you probably won't die from it, but you may lose data, you're virtually certain to waste time, and at some point, money.

This, at least, is a recognizable, modern-looking economic argument. However, I don't think it's a correct one. There are four possibilities here:

  1. People are correctly estimating the value of software and it's less than the price. Thus, they're choosing not to buy it. Obviously, this is distressing if your business is software and it's your software that's not being bought, but otherwise is nothing to be alarmed about.
  2. People are incorrectly estimating the value of software and need Dave to tell them. This is certainly possible, but ordinarily in economics we assume that people know their own preferences.
  3. Software is somehow a public good and so people are (rationally) not paying their fair share. This is, of course, possible, but I don't think particularly likely. There certainly are software problems that are public goods (or rather, public bads) such as vulnerability to worms, but I suspect that the majority of problems people experience affect them alone.
  4. Customers can't tell whether or not software is good and so they're not willing to pay for it. This is what's known as a "lemons market".

Only options (1) and (4) are particularly plausible. In the case of option (1), the market is working correctly, so there's nothing to worry about. The possibility that software is a lemons market, however, is more interesting.

The market for bad software
The term "lemons market" comes from George Akerlof's paper "The Market for Lemons", which introduced the concept in the context of cars. Any given software manufacturer has a choice of making good software or bad software. Customers are willing to pay more for good software than bad. However, bad software is cheaper to produce. If customers have no way of telling whether a given piece of software is good or bad, then manufacturers have an incentive to sell only bad software. Even if one manufacturer is honest, some dishonest manufacturer will undercut him and drive him out of business. Thus, everyone makes bad software. Even though the customer would be willing to pay for good software, and the manufacturer would be willing to produce it, the market has no way to provide it.
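Here is a toy version of that unraveling, in the spirit of Akerlof's argument rather than a faithful model of it. Buyers can't tell good from bad, so they pay the average value of what's on offer. Whenever that price is below the cost of making good software, some of the good producers give up. Every number here (values, costs, exit rate) is an assumption chosen to make the effect visible.

```python
def willing_to_pay(share_good, value_good=100, value_bad=40):
    """Buyers can't tell quality apart, so they pay the expected value."""
    return share_good * value_good + (1 - share_good) * value_bad

def simulate(share_good, cost_good=70, rounds=5, exit_rate=0.5):
    """Track (share of good producers, market price) over several rounds.

    If the price doesn't cover the cost of good software, a fraction
    exit_rate of the remaining good producers leaves the market.
    """
    history = []
    for _ in range(rounds):
        price = willing_to_pay(share_good)
        history.append((share_good, price))
        if price < cost_good:
            share_good *= (1 - exit_rate)
    return history

for share, price in simulate(0.4):
    print(f"good share {share:.3f} -> buyers pay {price:.1f}")
```

Start with enough good producers (say 80%) and the price covers their costs, so nothing unravels. Start below the tipping point and each exit lowers the price, which drives out more good producers.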

The standard solution to this problem, of course, is to have some third party review the product and vouch for its quality. Unfortunately, it turns out to be very hard to really do a solid job of this on software because a lot of the problems that people encounter are unpredictable interactions with other parts of their environment, and it's precisely such effects that are hard to test for.

The Bottom Line
Dave's basic argument seems to be that if customers just realized what the true value of software is, they would pony up and all would be well. That argument just doesn't stand up under analysis. It's simply not at all clear that the value of software is in fact high enough to justify a price higher than we in fact pay now. That may be bad news for software manufacturers, who would like to extract more money from consumers, but it's not at all clear it's bad for consumers. Even if we in fact do have a lemons market (which, based on my experience seems quite likely), the problem isn't that consumers are cheap but rather that they insist on getting value for their money, a position I consider an eminently reasonable one.

Posted by ekr at 03:15 PM | Comments (33) | TrackBack

Wait, who owns Unix?

Oh, this is fun. Now Novell claims that they retained the intellectual property rights to Unix and merely licensed them to SCO. Moreover, they claim that SCO knows this:
"To Novell's knowledge, the 1995 agreement governing SCO's purchase of Unix from Novell does not convey to SCO the associated copyrights," Novell Chief Executive Jack Messman said in the letter to SCO Chief Executive Darl McBride. He said that SCO evidently realizes this because "over the last few months you have repeatedly asked Novell to transfer the copyrights to SCO, requests that Novell has rejected."

Moreover, Novell is making noises about suing SCO.

SCO's response is here.

SCO owns the contract rights to the UNIX® operating system. SCO has the contractual right to prevent improper donations of UNIX code, methods or concepts into Linux by any UNIX vendor.

Copyrights and patents are protection against strangers. Contracts are what you use against parties you have relationships with. From a legal standpoint, contracts end up being far stronger than anything you could do with copyrights.

SCO's lawsuit against IBM does not involve patents or copyrights. SCO's complaint specifically alleges breach of contract, and SCO intends to protect and enforce all of the contracts that the company has with more than 6,000 licensees.

We formed SCOsource in January 2003 to enforce our UNIX rights and we intend to aggressively continue in this successful path of operation.

This is now pretty messy and I'm not lawyer enough to work out who's right. From a game theoretic perspective, it's hard to believe that SCO doesn't have some leg to stand on here, since surely they knew that this information would come out eventually and likely before IBM settled.

Posted by ekr at 09:27 AM | Comments (47) | TrackBack

May 27, 2003

Computer programming goes commodity

Dave Winer has a couple of posts complaining about people not being willing to pay for software. I disagree with these posts in a number of ways, but I'll just tackle one for now. Dave writes:

In the NY Times on Thursday, a stirring op-ed piece by Ellen Ullman, about what we've lost in software. In the 90s it was common for two or three generations of software developers to work in the same organization. There was a handing-down of ideas, practices, tradition -- the verbal history of how things came to be as they are, Ullman says. After the dotcom bust software is becoming a detail, again, something that workmen do, not artists.

This strikes me as a perfectly natural turn of events, not something to be bemoaned. Software has simply become commoditized. This is what happens to all industries as they mature. In the beginning, only wizards can make anything work, but over the years the technology gets standardized and eventually anyone can do it. That's why the United States needed the smartest people from (at least) 6 countries to build the first atomic bomb and six months ago people were worried about Saddam Hussein building one in his basement.

Consider the following developments:

  • The first FORTRAN compiler, built in the 1950s, took 3 years to design and implement. Today, CS undergrads write compilers as class projects.
  • Ken Thompson won the Turing Award for UNIX, which dates from 1969. Today, lots of people use an operating system designed by a Finnish college student.
  • Fifteen years ago, getting an Internet connection was a nightmare of configuration and organization. Today, AOL will send you CDs giving you hundreds of hours of Internet service for free.

Now, don't get me wrong, it's a lot more fun to work in an environment where your skills are valued and you can't be replaced by some guy in Bangalore or Hyderabad, but that's life. I'm sure that the people who build cars for General Motors don't much like having their jobs outsourced to Mexico, either. It's silly to act as if it's somehow some great and unique tragedy that the computer hacker culture is turning into an industry like any other. There will always be room for smart technicians in some industry. At the moment, bioinformatics and nanotech are looking like pretty good bets.

What's especially weird about Dave's nostalgia for the supposed golden days of software development is that, at least for the 10+ years I've been in the field, simple programming has always been viewed as an essentially mechanical activity, mostly suitable for junior personnel. This is exactly why so many people aspired to be architects--a title which as far as I could tell was intended to signify that the holder did more thinking than programming. Rather than complaining, we should rejoice that we've managed to farm off the mundane tasks to people who will do them cheaply, leaving us to do the high-level big picture thinking while avoiding the messy details.

Posted by ekr at 09:02 PM | Comments (10) | TrackBack

Another run at the college student filesharing wall?

Looks like the RIAA and MPAA are taking another run at the wall of stopping college students from sharing files. A friend at Stanford just received the following letter and reports that one of his friends from Davis received a similar letter. I'll post that if I get it. If other EG readers have received letters of this type, please forward them to me and I'll put them up (with your name removed, if you wish).

Update 20030527 12:34:
I have just received a copy of the Davis letter, dated a week ago.

Posted by ekr at 08:25 AM | Comments (30) | TrackBack

The ephedra witch hunt

So, Illinois is the first state to ban ephedra. Ephedra is an herbal "supplement" containing ephedrine, which is a mild stimulant (stronger than caffeine but much weaker than amphetamines).

From the press coverage, you might get the idea that people are dying left and right from ephedra, but the truth is a little different. There have been a fair number of adverse reactions reported, but it's very hard to determine how many of them are actually related to ephedra/ephedrine and how many were just coincidence. After all, healthy-appearing people do occasionally die without warning; it's just unlikely. In any case, the numbers we're talking about are very small. Here's the relevant section from the summary of the RAND report commissioned by the FDA:

Evidence from controlled trials was sufficient to conclude that the use of ephedrine and/or the use of ephedra-containing dietary supplements or ephedrine plus caffeine is associated with two to three times the risk of nausea, vomiting, psychiatric symptoms such as anxiety and change in mood, autonomic hyperactivity, and palpitations.

The majority of case reports are insufficiently documented to make an informed judgment about a relationship between the use of ephedrine or ephedra-containing dietary supplements and the adverse event in question. For prior consumption of ephedra-containing products, we identified two deaths, three myocardial infarctions, nine cerebrovascular accidents, three seizures, and five psychiatric cases as sentinel events; for prior consumption of ephedrine, we identified three deaths, two myocardial infarctions, two cerebrovascular accidents, one seizure, and three psychiatric cases as sentinel events. We identified 43 additional cases as possible sentinel events with prior ephedra consumption and seven additional cases as possible sentinel events for prior ephedrine consumption. About half the sentinel events occurred in persons aged 30 years or younger. Classification as a sentinel event does not imply a proven cause and effect relationship.

In other words, the total number of people who we might reasonably suspect to have died as a result of ephedra is less than 50. And that includes the "possibles". There is some evidence that the lack of standardization of ephedra products is a source of risk because it makes it hard to know how much you're taking. But of course, that could be easily fixed by requiring standardization, not a total ban.

So, why all the fuss? Attribute it to the usual human urge to find a scapegoat. In Illinois, the law was passed after a 16-year-old football player died. Sure, he was taking ephedra, but people do just occasionally die and there's not always someone to blame. That's life.

Posted by ekr at 07:50 AM | Comments (116) | TrackBack

May 26, 2003

Outlook viruses, etc.

Steve over at PM Style argues that the market doesn't want Microsoft to do what would be necessary to stop e-mail viruses:
So a friend of mine and I were talking, and he says to me, "Isn't it ironic that Microsoft wants to stop spam and they can't. However, Microsoft has the power to stop Outlook viruses and they won't. What's up with that?" My only answer -- the market doesn't want to stop Outlook mail viruses. Why would I make such a claim? Because Microsoft and experts in the field have known how to stop Outlook email viruses and worms for a long time and Microsoft hasn't done it. Even though their customers lose a lot of money fighting malware outbreaks, Microsoft references market pressures for failing to change the default behavior of Outlook such that the software would prevent the propagation of viruses and worms.

So who is really to blame for Outlook mail viruses? You! You now know the truth and what are you going to do about it? Probably nothing. You'll keep buying Windows and Office. You'll keep using the existing software on your PC. You'll keep supporting the Microsoft products which don't take care of the problems you think are important. Or you would rather keep HTML mail and executable attachments than protect yourself from Outlook malware. Whatever your reasons, just realize that every day you make a choice to fail to prevent Outlook malware. So don't whine about it.

One might be led to think from this description that the problem was a simple feature/security issue. That's not so. It's true that the most recent viruses (Klez, W32.Yaha, W32.Nimda) use executable attachments to spread, but they're not ordinary attachments. [0] Rather, they're attachments with special MIME types designed to exploit a bug in IE. Instead of prompting you like it's supposed to, IE just executes the attachment. Not good. Once Microsoft fixed this problem, the vulnerability went away with no loss in functionality. The reason that this vulnerability still exists in the wild is that people haven't deployed the fix. But then, there's no reason to believe they'd be any more diligent with some hypothetical no-HTML fix. On the contrary, they'd probably be less so since it would remove functionality.

Don't get me wrong--executable content is a problem and stupid users do occasionally click on .EXEs they get in e-mail. But that's not what's going on here. It just so happens that there's a bug in IE that lets attackers convince it to execute executable content. So, it's true that Microsoft could control e-mail viruses to some extent by choking off their HTML and executable content support, but that's just because the particular piece of buggy software in question happens to be in that component. There's no guarantee that the next exploitable bug would be in that component, so it doesn't really make any sense to suggest that that's some sort of general fix for viruses--unless, of course, that component is especially badly written, in which case MS should rewrite it.

None of this is to say that it's not true that customers are responsible for the sorry state of security. If they had held Microsoft's feet to the fire, Microsoft would have been forced to actually put out non-buggy code. It's even somewhat fair to characterize it as a feature/security tradeoff since presumably what MS was doing with the resources it didn't spend on QA was adding features. But, it's also not the case that if Microsoft had a fix, customers wouldn't take it--albeit in the usual slow way they take all new software. The problem isn't one specific feature but rather Microsoft's general sloppiness.

[0] BTW, this isn't the only kind of MS Outlook virus. For instance, BubbleBoy exploited a hole in ActiveX. Now, ActiveX and similar scripting-type things are notoriously susceptible to bugs, but it's also possible to get that kind of stuff right if you're exceedingly careful, which MS has not been.

Posted by ekr at 04:57 PM | Comments (52) | TrackBack

How to issue Top Level Domains

Karl Manheim and Lawrence Solum have a new article analyzing the economics of issuing domain names. I've certainly read other articles on this topic before, and although Manheim and Solum cover a lot of the same territory, this is the most complete exposition I've seen.

Manheim and Solum enumerate five alternatives for managing the top level domain (TLD) space:

  1. Static root--no more new TLDs are created
  2. First come first served.
  3. Case-by-Case Public Interest Assessment--this is what ICANN does now and the authors rightly label it a "beauty contest."
  4. Lotteries
  5. Auctions

After a bunch of analysis, the authors conclude that an auction is the best approach. They propose an interesting format where people bid not only on individual domain names but also on which domain names will be allocated. Essentially, bidders could bid on any number of domain names and the top N domains would get issued and allocated to the winners. One nice feature of this format is that it avoids the need for ICANN (or a successor) to decide which domain names will be created. The auction could be held periodically to allow the introduction of new domains at a controlled rate.
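
To make the format concrete, here is a minimal sketch of how such an auction might be resolved. It is not the authors' actual mechanism: the function name, the bid tuples, and the rule that domains are ranked by their highest single bid are all assumptions made for illustration, and ties and payment rules are ignored.

```python
def run_tld_auction(bids, n):
    """Resolve a hypothetical TLD auction.

    bids: iterable of (bidder, tld, amount) tuples.
    n:    number of new TLDs to issue in this round.
    Returns a list of (tld, winning_bidder, amount), highest bid first.
    Assumption: a TLD is ranked by its single highest bid.
    """
    best = {}  # tld -> (bidder, amount) for the highest bid seen so far
    for bidder, tld, amount in bids:
        if tld not in best or amount > best[tld][1]:
            best[tld] = (bidder, amount)
    # Rank the TLDs by their winning bid and issue only the top n.
    ranked = sorted(best.items(), key=lambda kv: kv[1][1], reverse=True)
    return [(tld, bidder, amount) for tld, (bidder, amount) in ranked[:n]]


if __name__ == "__main__":
    example_bids = [
        ("alice", ".web", 50),
        ("bob", ".web", 70),
        ("carol", ".shop", 60),
        ("alice", ".blog", 10),
    ]
    for row in run_tld_auction(example_bids, n=2):
        print(row)
```

The point of the sketch is that nobody has to decide in advance that ".web" deserves to exist: the bids themselves select which names get created, and n controls the rate of introduction.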

Posted by ekr at 09:18 AM | Comments (10) | TrackBack

May 25, 2003

Book Recommendations: Alastair Reynolds

Alastair Reynolds is responsible for my more or less complete failure to work Friday and Saturday. Revelation Space, Chasm City, and Redemption Ark (UK only, I got my copy from Kevin Dick) are set 500 years or so in the future. Humanity has settled most of the local star systems, but due to the impassable light speed barrier has broken up into a large number of fragmented factions, each with their own distinct culture and aims. I haven't seen this richly rendered an SF universe since Hyperion.

Reynolds takes the physics seriously so it never feels like he's just making up the rules as he goes along. However, unlike so many books where the science is right but the writing is terrible, the plot is tremendously gripping. Highly recommended if you like this sort of thing.

Posted by ekr at 08:07 PM | Comments (41) | TrackBack

I'm going to change my name to Eric Securityguy

Since Terence is falling down on his poker commentary duties, I'm going to have to pick up the slack. The guy who just won the World Series of Poker is named "Chris Moneymaker".
Posted by ekr at 07:43 AM | Comments (28) | TrackBack

Annika Sorenstam's response to losing

Dan Simon points out something interesting about Annika Sorenstam's response to missing the cut at the Colonial:
"It was a great week but I've got to go back to my tour, where I belong," Sorenstam said. "I'm glad I did it, but this is way over my head"

and

"I wasn't as tough as I thought I was," she said. "I was so nervous."

Dan's response to this is:

I simply can't imagine any top-level male professional athlete responding this way after failing in his first attempt to break into a new, higher tier of competition. More likely, he'd declare that he'd learned a lot from his defeat, and was looking forward to doing much better in his next try. If he announced that he was way out of his depth and planned in the future just to stick to his proper level, where he "belonged", he'd be derided as a wimp, a quitter, a loser.

My initial reaction was to agree with Dan, but now I'm not so sure. Here's a passage from the latest edition of Full Contact Fighter:

A the end of the 10 minute fight, Milton [Vieira] was declared the winner by unanimous decision and evened up the game again (2-2). "I'll never fight in this categority again. I won because my technique was superior, but I felt he was stronger than me all the time. Definitely my category is 75 kg," said Vieira.

It seems to me that a higher weight class is more or less analogous to playing in the men's tour instead of the women's. Bigger guys are just stronger and will pretty much always beat smaller guys of equivalent skill level. Actually, it understates the difference, since bigger guys are typically slower than smaller guys and men are not slower than women.

I think it's reasonable to say that you're outclassed when it's really clear that you are and that you have no chance of making up the difference. Now, it's no doubt easier to say that if you just won, but it's certainly arguable that that makes you more of a wimp, not less, since you've proved that you can compete, at least at some level, if not the highest possible level. That's not my view, however. It's critical to know your limitations.

Posted by ekr at 07:29 AM | Comments (46) | TrackBack

May 24, 2003

More on the Ultimatum Game

Thomas Schelling famously observed that the way to win at chicken was to throw your steering wheel out the window. When your opponent sees you do that, he knows you can't possibly swerve and so he has to. The point here is that committing yourself to behaving irrationally can be a rational strategy.

In a similar game theoretic vein, we have the Ultimatum Game. Bob McGrew over at the Cardinal Collective has anticipated my next post on the ultimatum game.

To a game theorist, this looks like "repeated game effects" - if the game were played over and over again, you'd expect the Responder to reject anything much less than 50/50 to train the Proposer to over him as close to half as possible. But this can't be repeated game effects because the game is played only once, right? Well, maybe not.

There's two possibilities for why people might play a one-shot game like this as a repeated game. The first is bounded rationality - the Ultimatum Game is actually a good model for a lot of human interaction, and so people have developed strategies for dealing with them. These strategies often involve (as Eric points out) a notion of "fairness." Whether this is genetic or cultural is of course impossible to determine, but in either case, people may just play the strategies that are familiar to them instead of wasting time thinking carefully about just how much they should value that $1.

Another, perhaps more persuasive, reason for repeated game effects is a criticism of experimental economics in general. That is, in real-life, there are no one-shot games. After you play the Ultimatum Game with the attractive woman across from you, you might run into her around campus and talk to her. You wouldn't want her to think that you are selfish - it'll make things harder to you in the future. In the real world, every action someone takes affects his reputation, and it's very difficult to design experiments respecting total secrecy. Participants see each other later and talk about the experiment, usually despite the best efforts of the experimenter.

I suspect that the answer is some combination of Bob's answers above. It turns out that if you do evolutionary modelling of the Ultimatum Game as an iterated game, fairer strategies tend to dominate. Interestingly, there's no requirement that the same players re-encounter each other, merely that there be a reputation effect.

What reputation allows you to do is to advertise to other players that you're not going to accept their low offer--by not accepting other people's low offers. If they are rational they will then offer you more. The problem, as with the chicken game, is that you might not follow through on your implied threat and instead decide to do the rational thing on your next game. Again, you want to commit yourself to behaving irrationally in the short term because it's rational to do so in the long term. If one is of the evolutionary psychology bent, it's popular to speculate that the reason you get annoyed at perceived unfairness is that it causes you to reject an unfair offer even though it would be good for you in the short term. Now, of course, this mechanism serves you poorly in this specific instance, but then again, the environment in which it evolved wasn't full of behavioral economists but rather of other primates trying to cheat you.

Posted by ekr at 03:23 PM | Comments (34) | TrackBack

The Ultimatum Game

One of the classic demonstrations of human irrationality is the Ultimatum Game. It works like this: There are two players, the Proposer and the Responder. The Proposer is offered a sum of money (conventionally $1) by the experimenter. He has to offer the Responder a fraction of that. If the Responder accepts, they each get their agreed-upon share. Otherwise, neither gets anything. That's it. Most importantly, there is no negotiation. Either the Responder takes the deal or he doesn't.

The rational analysis of the Ultimatum Game is simple. The Responder has the choice of whatever the Proposer gave him or $0. Clearly, anything at all is better than $0. Thus, we would expect the Responder to take whatever the Proposer offers him. Knowing this, we would expect the Proposer to propose low values.

But in real life, this isn't what happens. When people play the Ultimatum Game in the lab, Proposers generally offer around 40-50% of the total amount and Responders will frequently reject numbers less than 30%. Now, it's not irrational for Proposers to offer higher amounts. After all, if they know that Responders will reject lower amounts, that's very rational. However, as shown above, it is irrational for the Responder to reject the proposed division. So, why do they do it?
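
The Proposer's side of this reasoning can be sketched in a few lines. This is only an illustration under stated assumptions: each Responder is modeled as having a minimum acceptable offer (a threshold, in whole percent of the pot), and the example populations are made up rather than taken from any experiment. The function names are hypothetical.

```python
def expected_proposer_share(offer_pct, thresholds_pct):
    """Expected percent of the pot the Proposer keeps when offering offer_pct.

    thresholds_pct: one minimum acceptable offer (in percent) per Responder
    in a hypothetical population; a Responder accepts iff offer >= threshold.
    """
    accepted = sum(1 for t in thresholds_pct if offer_pct >= t)
    return (100 - offer_pct) * accepted / len(thresholds_pct)


def best_offer(thresholds_pct):
    """Smallest whole-percent offer that maximizes the Proposer's expected share."""
    return max(range(101), key=lambda o: expected_proposer_share(o, thresholds_pct))


if __name__ == "__main__":
    # Textbook-rational Responders accept anything, so the Proposer offers nothing.
    print(best_offer([0, 0, 0, 0, 0]))
    # Made-up population in which most Responders reject offers below 30-40%.
    print(best_offer([0, 30, 30, 30, 40]))
```

Against Responders who accept anything, the best offer is the lowest one; once enough Responders reject low offers, a perfectly self-interested Proposer is pushed toward much higher offers, which is why the high offers seen in the lab don't need any appeal to generosity.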

The general consensus is that Responders reject the division because they think it's "unfair". Now, don't ask me to define what fair and unfair is in this situation, but I guess people think they know it when they see it. Bottom line, though, is that people will behave in ways that hurt them in order to avoid feeling like they've been taken advantage of. Any practical use of economic theory has to take this kind of effect into account.

Posted by ekr at 10:29 AM | Comments (112) | TrackBack

Everybody's a winner!

Everybody likes to win stuff. Unfortunately, the whole point of competitions like races, tournaments, playoffs, etc. is to narrow down the field until you have exactly one winner, which, of course, leaves everyone else less satisfied. This is probably fine when it's the NCAA tournament, but when you're running a local road race and people are paying you to participate, you're probably a little more worried about customer satisfaction and so less interested in branding everyone else a loser.

Luckily, you don't have to. After all, you want people who can't walk to compete but there's no way that runners can compete with people in wheelchairs (Yes, you heard that right, the wheelchair division in marathons is much faster than even the elite running division. Rolling is better than running). So, you need a separate wheelchair division. Similarly, women just aren't as fast as men, so you should probably have a separate women's division.

So far so good, but you've probably got a thousand or so people in your race and still only about 30 people who think they're in the running to win anything. The next clever innovation is to say that old people can't run as fast as young people. I know, we need age groups! Generally these are about 5 years wide, which is a little silly because--let's face it--people in 25-29 aren't really much faster or slower as a group than 30-34 year olds. Also, this leads to the funny phenomenon that triathletes call "aging up". Once you get into the older age brackets, each age group is significantly slower than the next younger group. So, for instance, when one of the top guys in 40-44 finally turns 45, he's quite likely to dominate the 45-49 group--until, of course, the best guy in 40-44 ages up as well.

But wait, fat people are slower too! So, let's have a division of fat people. I'm not joking here. The men's division for heavy people is called Clydesdale and the women's is Athena.

You know what the end game for all this is: everyone's a winner. In fact, I've heard that said at races more than once and there are lots of races that give out "finisher's medals" to anyone who crosses the line.

Don't get me wrong here: It's a real achievement to get to the point where you can do a triathlon or a half-marathon (not so much for a 10k). I'm not knocking anyone who does that, but isn't the satisfaction of completing enough? Does it really make people feel better to get that medal?

Posted by ekr at 07:51 AM | Comments (59) | TrackBack

May 23, 2003

Detecting gender from writing style

Was just leafing through an old (April 25) issue of Science and noticed a little article about detecting gender differences from writing. It turns out that by analyzing word frequency, you can gain a fair amount of information about whether a document was written by a man or a woman. Shlomo Argamon and his team report an 80% identification rate on a corpus of various British works.

Based on the article in Science, it appears that they're using some sort of learning-based classifier. Of course, such classifiers just produce a model and don't tell you how to interpret it. These guys seem to be computer scientists, so it's not too surprising that they come up with a pretty amateurish explanation:

Women use words such as "for," "with," and "and" more often then men, signifying their more commual tendencies. Men are more quantitative and use more determiners," such as "an," "a," and "no."

Luckily, it's not a requirement to understand how your classifier works in order to use one. In fact, that's sort of the point.

Still, it's definitely an interesting result, if it can be replicated. I don't have access to the original article, so I do worry a bit about the statistical significance of the result. Certainly, machine learning classifiers are a powerful technology, but it can be easy to fool yourself that your classifier works when it doesn't unless you're careful to design your experiment right.

Posted by ekr at 09:14 AM | Comments (87) | TrackBack

Why would the PGA want to ban women?

Ben Maller reports a rumor that in the wake of Annika Sorenstam, the PGA is considering banning women from the tour entirely. Other than sheer sexism, I can't imagine why they would want to do such a thing. As far as I can tell, the PGA has nothing to lose from a woman on the tour, provided that she can make the cuts on her own and isn't being given special treatment.

If there's anyone who would want to ban women from the PGA tour it should be the LPGA. The whole premise of the LPGA is that it's the best women golfers. If women regularly compete in the men's division, eventually the very best women will compete only in the PGA, and the LPGA will be seen as more of a farm team. That would not be good for them.

Posted by ekr at 08:44 AM | Comments (61) | TrackBack

May 22, 2003

God, that sucked

A bunch of friends and I just saw Matrix Reloaded. It seems to be de rigueur to give your opinion, so here's mine: a debacle.

At the end, Hovav and I had this conversation:

Me: that was appalling.
Hovav: I was going to say "ass".
Me: I can live with that assessment.
Hovav: I don't ever think I've seen that boring a fight scene.

The basic problem is that it was completely soulless. Sure, the effects are impressive in some technical sense but they're completely uninspired. I was already bored with slo-mo bullet-cam 5 minutes in. Combine that with the inane freshman philosophy musing and the pompous "You disobeyed a direct order" dialogue and you've got something more like an episode of Deep Space Nine than a reasonable movie. There aren't many movies that would be improved by adding Jerry Bruckheimer to the mix but I suspect this was one of them.

Oh, and then I sat through 10 minutes of credits to watch a boring trailer for Matrix Revolutions.

Posted by ekr at 05:36 PM | Comments (16) | TrackBack

Interpreting the New Atkins Diet Studies

Today's papers have lots of coverage of the two Atkins Diet studies published in this week's New England Journal of Medicine. The headlines in the mainstream press mostly say things like "Atkins Diet Works" but what these studies show is a little more nuanced.

There were two studies, one by Samaha et al. and one by Foster et al. Both studies were controlled comparisons of the Atkins diet to conventional low-calorie diets. The Samaha study lasted for six months and the Foster study for a year. Here are the high points:

  • Dropout rates are very high, in the area of 40% for six months. This makes it hard to analyze the data. Dropout rates aren't significantly different between the two diets, however, so there's no reason to believe it's easier to stay on one than the other.
  • Both studies showed the Atkins group to have lost significantly more weight after 6 months, but the longer study showed no significant difference at 12 months. It's not clear whether there really is no difference or whether dropout rates were just so high that the remaining group had insufficient statistical power.
  • At best, the differences are pretty small, on the order of 10 lbs after 6 months.
  • Triglyceride levels were better in the low carbohydrate group. This is good news for people worried that the low carbohydrate diet (and therefore probably high fat) would lead to heart disease.
While it looks like low-carb diets are somewhat useful, the data certainly doesn't suggest that they're a magic bullet.
Posted by ekr at 07:48 AM | Comments (66) | TrackBack

May 21, 2003

Statistical screening for bad lawsuits

A lot of the problem in large class-action medical lawsuits is figuring out whether or not people were really harmed. The issue, as I indicated earlier, is that a lot of people have bad outcomes in the normal course of events and so working out whether this is just bad luck or the fault of the defendant is the kind of thing that keeps statisticians and epidemiologists up nights.

My friend Kevin Dick suggests that when people bring this kind of lawsuit, they should first have to go before a panel of statisticians. If the statisticians rule it bogus then the suit is automatically dismissed and the judge has an opportunity to fine the plaintiff and/or lawyers if there's really no evidence.

Obviously, this kind of approach circumvents the jury system, but I think of this as the judge basically dismissing the case. On the other hand, the jury has no real ability to weigh the statistical evidence and people have a natural bias to believe confident people, which is pretty much the one thing that careful statisticians are not. Still, I think we'd want to set the bar fairly low so that the jury got to hear the questionable cases. The idea isn't to have experts decide every case but just to provide an initial screen to deter clearly frivolous suits. It would also incentivize drug companies to do a broader range of studies in order to provide themselves with statistical cover.

Posted by ekr at 09:02 PM | Comments (10) | TrackBack

Could SD-DVDs fail cleanly?

Ed Felten argues that Self-Destructing DVDs will fail uncleanly:
Worse yet (and despite a claim to the contrary in FlexPlay's press release), the nature of a chemical process like oxidation seems to imply that the disk's decay will be gradual. Since DVDs use error correction, FlexPlay's engineers can make the disk reliable for any desired period; but after that there will be an inevitable period of intermittent glitches as the disk gets worse and worse, until it becomes unusable. Seeing the decay, even if it lasts only for a short time, will only make consumers angrier.

This may be the case in practice, but it need not necessarily be so. I'm no expert on DVD formatting, but consider what happens if there's some section on the disk that absolutely must be read error free in order for the disk to be readable (maybe a table of contents?). If you arranged for that to be the section of the disk that got destroyed by the chemical process (or at least destroyed first), then the disk would either work or not work depending on the level of the destruction. You could put a checksum over the critical section to ensure that if it had errors the disk would refuse to play rather than generate glitches.

Of course, you would want to use error correction to make the disk reliable against small amounts of damage, so in the early stages of the process the disk would play fine, but eventually enough damage would accumulate that the critical section would be unreadable and then the disk would stop playing. Again, this may not be the way that SD-DVDs actually behave--Flexplay is pretty cagy about this--but it's not theoretically impossible to design a system that worked that way.
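
Here's a toy sketch of the all-or-nothing behavior described above. It says nothing about how real DVDs or Flexplay disks are laid out: the dictionary "disc", the function names, and the choice of CRC-32 are all assumptions for illustration, and the error-correction layer is left out entirely.

```python
import zlib


def make_disc(critical, content):
    """Build a toy 'disc': a critical section, its CRC-32, and the movie data."""
    return {"critical": critical, "crc": zlib.crc32(critical), "content": content}


def can_play(disc):
    """The disc is playable only if the critical section still matches its checksum."""
    return zlib.crc32(disc["critical"]) == disc["crc"]


def play(disc):
    """Return the content, or refuse outright instead of playing with glitches."""
    if not can_play(disc):
        raise IOError("critical section damaged: disc will not play")
    return disc["content"]


if __name__ == "__main__":
    disc = make_disc(b"table of contents", b"movie data")
    print(play(disc))
    # Simulate chemical decay reaching the critical section.
    disc["critical"] = b"table of c0ntents"
    print(can_play(disc))
```

The design choice is that the player never looks at partially damaged movie data: either the critical section verifies and everything plays, or it fails the check and nothing does.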

Posted by ekr at 09:09 AM | Comments (12) | TrackBack

What's a fair price for a DVD?

Ed Felten makes a really important psychological point about pricing of information goods:
The underlying problem is that because SD-DVDs will be sold for less than ordinary DVDs, they will draw consumers' attention to the fact that ordinary DVDs are priced well above the marginal cost of producing them. That seems unfair to many consumers.

At this point, readers who are armchair economists (or real ones, for that matter) are raising their hands and bouncing in their seats, eager to point out that marginal-cost pricing isn't sustainable in the movie business, given the high fixed cost of making a movie and the very low marginal cost of distributing a copy of it. That's true, but I think consumers' sense of fairness is based on a different kind of market in which variable costs of production dominate fixed costs.

As long as it seemed inherently expensive to manufacture and distribute a copy of a recorded movie, consumers tended not to notice that the copy was priced above marginal cost. As marginal cost approaches zero, the gap between marginal cost and price becomes much more apparent, and consumers increasingly conclude that the studios are ripping them off.

In fact, one of the major effects of Napster and other music sharing services has been to hammer home to customers just how low the marginal cost of content really is. And as Ed says, this pisses them off.

Self-Destructing DVDs (SD-DVDs) are actually an interesting example of just how differently economists and the rest of the population think. To an economist, this seems like a great idea, since it allows people who want to watch the movie only once to pay less. To a consumer, it seems like a total ripoff.

The problem here is that we're rapidly homing in on the lower bound. When the marginal cost of production is moderately high, content producers don't need market segmentation to survive (though of course it's nice to make a little extra cash) and so end up selling closer to marginal cost than they really want to in order to avoid offending consumers. They still make enough to survive and so the market works. However, if the marginal cost of production is zero, then economic realities start to kick in: it is no longer possible to sell close to marginal cost, and customers will just have to get used to that.

However, customers' acceptance of above-marginal cost pricing is contingent on not feeling cheated. And the fact that the content industry is so obviously working hard to extract every last dollar from them doesn't exactly lead to that trusting feeling.

Posted by ekr at 09:01 AM | Comments (47) | TrackBack

Incentives for drug side effects

The New York Times reports that some attorneys are suing drug companies claiming that patients have been injured by side effects of their drugs. Medpundit argues that this is a bad thing:
That's one sure way of stifling innovation. Sue the pharmaceutical companies into bankruptcy or into paralysing fear. God save us from the lawyers.
and I'm sympathetic to that argument, but I think it's a little more complicated.

It seems to me that we need to ask ourselves two questions:

  1. Who do we want to bear the risk of side effects from drugs?
  2. Are lawsuits the right way to assign incentives?
Who bears the risk?
So, you take some drug, which (like all drugs) has a risk of side effects. Now, we can assign the risk one of two basic ways.
  1. It's your lookout, in which case if you suffer the side effects, too bad.
  2. It's the drug company's problem, so they pay you for the negative outcome.

Both assignments have problems. The problem with the first is moral hazard. The drug company is always going to know more about side effects than you do. Since the drug company doesn't pay when people have bad side effects, they're incentivized to shade the public information a bit to make their drugs look more attractive than they really are.

We can fix the moral hazard problem by making the drug company pay when you suffer side effects. However, most drugs have a fair number of side effects. For example, here's the side effect list for Clarinex, which is pretty innocuous as these things go.

Adverse Experience         Clarinex (n=659)   Placebo (n=661)
Pharyngitis                5%                 2%
Dry Mouth                  4%                 2%
Somnolence                 3%                 2%
Fatigue                    3%                 2%
Influenza-Like Symptoms    2%                 1%
Myalgia                    2%                 <1%
Nausea                     2%                 1%
Dizziness                  2%                 1%
Dry Throat                 2%                 1%

So, probably the drug company would be paying out claims to about 10-20% of people. This is a lot of damages to be paying. Worse yet, as you can see from the table, the incidence of "side effects" isn't that much higher than placebo. As a consequence, the drug company is likely paying a lot of people who really haven't been harmed by the drug at all. All this insurance has to get built into the price of the drug, which makes drugs inefficiently expensive (because part of the cost is headache insurance, which you would never buy if it weren't bundled into the price of the drug!).
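
To put rough numbers on that, here's a back-of-the-envelope sketch using the Clarinex figures above. It only uses the rows whose drug and placebo values are unambiguous, and it makes two simplifying assumptions that are mine, not the label's: that every reported side effect becomes a claim, and that no patient reports more than one side effect (so the percentages can simply be added).

```python
# (drug %, placebo %) incidence for rows taken from the Clarinex table above.
INCIDENCE = {
    "Pharyngitis": (5, 2),
    "Dry Mouth": (4, 2),
    "Somnolence": (3, 2),
    "Fatigue": (3, 2),
    "Influenza-Like Symptoms": (2, 1),
    "Dry Throat": (2, 1),
}


def attributable_fraction(drug_pct, placebo_pct):
    """Share of drug-arm cases in excess of what placebo alone would produce."""
    return (drug_pct - placebo_pct) / drug_pct


def summarize(incidence):
    """Total claim rates, assuming (simplistically) that side effects don't overlap."""
    total_drug = sum(d for d, _ in incidence.values())
    total_placebo = sum(p for _, p in incidence.values())
    return {
        "total_drug_pct": total_drug,
        "total_placebo_pct": total_placebo,
        "share_of_claims_not_due_to_drug": total_placebo / total_drug,
    }


if __name__ == "__main__":
    for name, (drug, placebo) in INCIDENCE.items():
        share = attributable_fraction(drug, placebo)
        print(f"{name}: {share:.0%} of cases plausibly due to the drug")
    print(summarize(INCIDENCE))
```

Under these assumptions, roughly half of the people collecting would have had the same symptom on placebo, which is the headache-insurance problem in miniature.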

We have sort of a weird hybrid system:

  1. Drug companies have a duty to disclose all the side effects they know about, but only to look for certain kinds of side effects. (One of the complaints in the lawsuit is that Bayer deliberately avoided looking for a specific kind of side effect.)
  2. The FDA decides whether the side effects are too high.
  3. Patients are responsible for bearing the risk of side effects.
  4. Because most patients have medical insurance, any serious risks are mostly borne by the insurance company.
  5. If people think that a drug company withheld material information, they can sue them (which is what's happening now). The FDA can also punish the drug company.

This system is naturally rife with incentive problems. For instance, there's still a big moral hazard problem with the drug companies. However, as we've seen, it's not easy to design a system without incentive problems. (This is the same kind of problem we saw with malpractice insurance).

Are lawsuits the right way to assign incentives?
As we just saw, the system we currently have has a big moral hazard problem. The drug companies are incentivized not to look too hard for side effects. One way to deal with this is to punish drug companies for not finding serious side effects that they probably should have. This could be done either by the FDA or by private lawsuit. In our system, it's a little of both, but let's focus on the lawsuit issue.

In principle, lawsuits aren't a crazy way to allocate responsibility. Certainly, the fear of being sued can incentivize drug companies to behave responsibly, since making mistakes can be very expensive. However, there are two serious practical problems.

First, lawsuits are a very expensive way to assign responsibility. A lot of money gets paid to lawyers in order to attempt to increase the probability that your side will prevail. This sort of expenditure is just wasted and can end up being a substantial fraction of the liability. Essentially, suing a drug company is a form of rent seeking, and so we get the usual deadweight loss as the drug companies and the customers compete for the money.

Second, it's not clear that the responsibility gets assigned correctly that often. Because it's so expensive to defend a lawsuit--and one of the strategies that attorneys use is to try to increase the cost to the other side in hopes of forcing a settlement--companies often incur substantial expenses regardless of the merit of the lawsuit, even if they ultimately prevail! (Dow-Corning, for instance, was forced into bankruptcy due to breast implant litigation even though there's no real evidence that breast implants made people sick.) This provides a perverse incentive to the company not to develop drugs at all, as well as the desired incentive not to cheat. Either drug prices will go up to compensate or companies will just stop developing marginal drugs. Either outcome is inefficient.

What to do?
It's not clear to me exactly how to solve this problem. Clearly, we need to have some way of dealing with the situation where some drug has really serious side effects and the drug company withholds that information. Lawsuits seem like a reasonable way of dealing with that problem. However, to the extent that drug companies are being forced to expend large amounts of money defending lawsuits where no one has actually been harmed, that's obviously a bad thing. Remember that the base rate of side effects when people take placebos is high enough that it's very hard to distinguish these two cases. I haven't seen anyone propose a system that would clearly allow most of the first type of litigation but disincentivize most of the second.

Posted by ekr at 07:14 AM | Comments (10) | TrackBack

Can Palladium stop zombies?

Steve over at PM Style points out (sorry, broken Blogger permalink) that you almost never see cell phones being turned into zombies for DDoS and spam and argues that this is because cell phones more tightly control what software you can run:
Simply put, cell phones are not general purpose computers which allow users to make substantial decisions about the software which runs on them. The general purpose computer market (including the home market) does allow users to make the choice. And that is the fundamental weakness.

When Microsoft talks about their new Palladium platform, they dance around the real shift in the computing market. Palladium is about two things: (1) protecting Microsoft's monopoly position and (2) limiting the choices of the software which can install and run on the system. It's only #2 that can make a real dent in the number of zombies.

Actually, I think this misses the mark fairly widely:
First, even if cell phones were designed to run arbitrary user software, they would be harder to zombify:

  1. They're not a monoculture. Although there are a lot of cell phones, they're made by a lot of different manufacturers and use a lot of different versions of the firmware. If you have to individually write attack scripts for a whole bunch of different platforms it makes an environment less attractive as a whole for viruses, worms, etc. (these are generically called malware). That's why you see a lot more malware for Windows than for *BSD and Linux.
  2. More central control makes it easier to stamp out zombies. The cell network is more centralized than the Internet, so if a DDoS is in progress it's easier for the provider to detect what's going on and stop it.
  3. Their data channels are more tightly restricted. Since the kinds of data a cell phone is expected to receive (phone calls and SMS messages mainly) are more stylized, the implementations can be less complicated and are probably harder to remotely compromise. Expect this to change some with the trend towards IP-capable cell devices.

Second, it's not at all clear that Palladium or something like it will solve the problem. This argument for why Palladium or something like it is a good thing gets made fairly often by the Trusted Computing guys, but I'm not convinced. Limiting the choices of packages that people will run could potentially improve the state of security in two ways:

  1. Stop them from installing software with known holes.
  2. Stop malware from installing itself.

The problem for the first theory is that most of the vulnerabilities that have led to serious malware are vulnerabilities in standard Microsoft programs such as IIS and Outlook. This shouldn't be surprising since Microsoft controls so much of the software market and is therefore an attractive target for attack. So, limiting people's choices to the software Microsoft wants them to run would improve the situation only very slightly.

The second theory is a little more convincing, but only a little. It's true that malware often installs itself as a separate program on your computer, but that's only because that's the most convenient way to do things. It's quite practical to have the malware take over an existing program, at which point limiting the software you can run helps not at all.

The key fact about Palladium and Trusted Computing in general is that it takes away the user's control of his own machine. It allows a third party to restrict the space of things that the user can do. If all you want to do is protect against malware, this is mostly unnecessary. All the same security techniques that you would use to protect against malware can be deployed to separate user-authorized from user-unauthorized software, while still giving the user the full range of choices. Controlling what kinds of software the user can run is neither necessary nor sufficient.

Posted by ekr at 06:00 AM | Comments (65) | TrackBack

May 20, 2003

Security and Distributed Computing

Most computers are idle most of the time. By and large, when you're sitting at your computer, it's waiting for you to type something, move the mouse, or something like that. It seems kind of silly to have all that computing power just sitting around doing nothing, and in the past 20 years or so, a lot of effort has gone into figuring out how to somehow tap those resources.

The general idea is that you would have a sort of big distributed computation network. If you had some large computational task you needed done, you could break it into small pieces and have it executed on the various computers (cycle servers) of the network. This sort of distributed computing has seen a fair amount of public interest in the last 5 years or so, including such free projects as SETI@home (processing radio signals for evidence of extraterrestrial intelligence), Distributed.net (cryptographic key cracking) and Folding@Home (protein folding simulation). The Global Grid Forum is trying to design protocols that would allow generic applications of this type.

There are three major obstacles to building this kind of system:

  1. Many computational tasks cannot be parallelized. I.e., they consist of a sequence of operations where operation N+1 depends on operation N. Such tasks pretty much need to be done on one fast computer rather than a bunch of slower ones. This is basically a pure computer science problem and there's been some forward progress on designing parallelized versions of previously serial tasks. Nevertheless, many tasks are still inherently serial.

  2. Security needs to be provided for the cycle server. So, if you're executing some computational task for someone else, you need to be sure that that task doesn't interfere with your use of the computer or have access to your private data. Some of these problems have been solved via virtual machine-type technology, but arranging for payment and ensuring that only authorized sources dole out the work are still somewhat open problems.

  3. Security needs to be provided to the person whose computation is being performed. There are actually two problems here. The first is that they need to be sure that the cycle server isn't lying. For instance, if the customer is paying the cycle server to do work, the cycle server might try to cheat by not doing the work and generating garbage results.

    The second problem is that the cycle server gets access to the raw data. This currently limits the kinds of tasks to those where the customer doesn't mind their data being publicly exposed. There's work being done on this problem but it's still unsolved except for a few special cases.
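
One way to attack the first of these two problems (a cycle server that returns garbage) is plain redundancy: hand the same work unit to several independent cycle servers and only accept an answer that enough of them agree on. The following is a minimal sketch of that idea, not a description of how any particular project does it. The function name accept_result and the quorum parameter are made up for illustration, and it assumes results can be compared for exact equality.

```python
from collections import Counter


def accept_result(results, quorum):
    """Return the answer reported by at least `quorum` servers, else None.

    `results` is the list of answers that independent cycle servers
    returned for the same work unit (hypothetical setup).
    """
    if not results:
        return None
    # Find the single most frequently reported answer and its vote count.
    value, votes = Counter(results).most_common(1)[0]
    return value if votes >= quorum else None


# Three servers ran the same work unit; one of them returned garbage.
print(accept_result([42, 42, 7], quorum=2))
```

This multiplies the cost of the computation by the number of replicas, it doesn't help if the cheating servers collude, and it does nothing for the second problem: the cycle servers still see the raw data.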

Interestingly, there's one special case distributed computing application where none of these problems exist: DDoS and spam zombies. Spam forwarding and DDoS are eminently parallelizable. Because the attacker has broken into and controls your machine, they're probably not worried that you're going to cheat them, and they're certainly not worried about cheating you, since that's the whole purpose of the exercise. It's kind of depressing to realize that after all these years we finally have distributed computing but it's useful primarily for criminals.

Posted by ekr at 05:47 PM | Comments (10) | TrackBack

How to fight zombies

Today's New York Times has an article about how spammers use other people's computers to forward their messages. The spammer compromises a group of victim computers and installs a piece of software on them. That software works for the spammer and will relay whatever e-mail messages he wants. The compromised computers are known as zombies.

This kind of attack is getting more common. Almost exactly the same thing happens with Distributed Denial of Service (DDoS) attacks. The attacker's motivation in both cases is twofold:

  • To leverage his resources. An attacker can use a zombie network to generate much more spam or attack traffic than he could generate on his own.
  • To hide his tracks. The attacker can send a single message to each zombie to kick things off (or even have a single zombie which controls all the other zombies). Since almost no traffic is coming from the attacker he is very hard to track down.

The reason this attack is so easy to perform is that many computers have known security holes that allow an attacker to break in. Generally, fixes are available for these problems, but users often haven't installed them. For instance, the Code-Red worm exploited a 6-month old vulnerability in the Internet Information Server and yet was able to compromise over 300,000 machines. The situation is similar for other vulnerabilities. I'm presenting a paper at USENIX Security 2003 that discusses a specific incident in 2002 (the OpenSSL buffer overflows). I found that only about 1/3 of the hosts surveyed had been fixed after a month and even worse, only about 2/3 had been fixed even after a worm that exploited the bug was released. (See here for the preprint version of this paper).

So, what can we do to get people to apply fixes? One suggestion, made by my friend Kevin Dick, is that a consortium of large vendors who get a lot of spam (e.g. AOL, Hotmail, etc.) should pay people to upgrade their machines when bugs come out. The way this would work would be that the consortium would periodically scan your machine to see if you were up to date and pay you if you were. Ordinarily, this kind of approach would be really susceptible to the Free Rider problem, but the number of really big e-mail providers is small enough that they could probably manage to collaborate.

Another possibility, of course, is for ISPs to cut people who haven't upgraded off the network. However, that seems like a much harder selling proposition since it's going to lead to some really unhappy customers.

Posted by ekr at 02:49 PM | Comments (16) | TrackBack

Another reason why Challenge/Response won't work for SPAM

Ed Felten makes an excellent point about Challenge/Response approaches to spam. Since the challenges are manually processed, a spammer can disguise their spam as a "challenge", thus forcing someone to receive their message.

There are, of course, countermeasures to this attack, but none of them are very good. One thing you might think of is to have the user's software respond to the challenge automatically. But this defeats the whole point of CR, which is to force a human to enter the loop: if the human's software can process it, then so can the spammer's software. Another is to make the challenge stylized so it can't carry spam, but Felten points out that this makes it easy for spammers to make challenge recognizers.

The only solution I've thought of that actually works is for the sender to remember all the mails they've sent out and screen out challenges that don't match them. Unfortunately, that means that CR can't work unless everyone changes their mail clients to accommodate it, which seems extremely undesirable from a deployment perspective.
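
To make the bookkeeping concrete, here is a rough sketch of the screening step. It assumes that every outgoing mail carries a unique Message-ID and that a legitimate challenge quotes that ID in its In-Reply-To header. The SentLog class and the dict-of-headers message representation are illustrative inventions, not part of any real mail client or CR system.

```python
class SentLog:
    """Remembers the Message-ID of every mail this user has sent."""

    def __init__(self):
        self._sent_ids = set()

    def record(self, outgoing_headers):
        # outgoing_headers is a plain dict of header name -> value.
        self._sent_ids.add(outgoing_headers["Message-ID"])

    def is_plausible_challenge(self, challenge_headers):
        # A genuine challenge answers something we actually sent, so it
        # must reference one of our recorded Message-IDs.
        referenced = challenge_headers.get("In-Reply-To")
        return referenced is not None and referenced in self._sent_ids


log = SentLog()
log.record({"Message-ID": "<1234@sender.example>", "To": "bob@example.com"})

genuine = {"In-Reply-To": "<1234@sender.example>", "Subject": "Please confirm"}
disguised_spam = {"In-Reply-To": "<made-up@spam.example>", "Subject": "Please confirm"}

print(log.is_plausible_challenge(genuine))
print(log.is_plausible_challenge(disguised_spam))
```

A spammer who doesn't know what you've recently sent can't produce a matching ID, so the disguised spam gets dropped before a human ever sees it. The catch is the one noted above: the sender's mail client has to keep this log, which is exactly the deployment problem.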

Posted by ekr at 09:04 AM | Comments (16) | TrackBack

And now Bruce Bartlett chimes in

It's one thing when Warren Buffett, who seems like he might be a bit of a liberal, doesn't like your tax cut, but when Bruce Bartlett (via Matthew Yglesias) says it's nuts, it may be time to rethink the whole thing.
Posted by ekr at 08:39 AM | Comments (21) | TrackBack

Warren Buffett says the tax cut is stupid

In an opinion piece in the Washington Post, Buffett says that the tax cut is a stupid idea in theory, but even worse in practice. But the most important message, the one that people really need to hear, is the one at the end:
When you listen to tax-cut rhetoric, remember that giving one class of taxpayer a "break" requires -- now or down the line -- that an equivalent burden be imposed on other parties. In other words, if I get a break, someone else pays. Government can't deliver a free lunch to the country as a whole. It can, however, determine who pays for lunch. And last week the Senate handed the bill to the wrong party.

Look, I hate paying taxes too. But if you want America as a whole to pay less taxes then we need to cut spending, not taxes. If all you're trying to do is reduce the tax burden to yourself while increasing it on someone else, well that's perfectly understandable rational behavior, but it's pretty disingenuous to call it a tax cut.

Posted by ekr at 08:27 AM | Comments (38) | TrackBack

May 19, 2003

Is Microsoft bribing SCO?

Bruce Perens thinks Microsoft has essentially bribed SCO to threaten Linux users. I'm not sure I believe this, but it's not totally crazy. Of course, if you've been reading the comments section, you're already way ahead of the curve--EG reader Allan Schiffman assessed the situation thus on May 2nd:
What are they up to?

Rather ask, cui bono?

Therefore, don't think Lindon, think Redmond.

Educated Guesswork: if we can't figure it out, our readers will!

Posted by ekr at 02:02 PM | Comments (18) | TrackBack

Liquor now available on Sundays!

After living in Pennsylvania for my early years, one of the things I like most about living in California is the ready availability of alcohol. Thus, I read with pleasure in the New York Times that a bunch of states are repealing their ridiculous laws banning Sunday sales of liquor. It seems that they've got a bit of a budget deficit problem and hope to rake in more money via taxes on liquor sales.

It's always gratifying to see basic human greed win out over prudery.

Posted by ekr at 05:54 AM | Comments (52) | TrackBack

John Gilmore vs. the TSA

John Gilmore is suing the FAA over being required to show ID when he flies. I don't always agree with John Gilmore and I didn't initially agree with this, but I'm starting to come to the conclusion that he is right on this one--not necessarily as a matter of law, since I'm no constitutional scholar, but as a matter of what's a good policy. The basic problem is that a photo identification is a lousy form of identification. It's just too easy to fake them. ID printers are easy to get your hands on and I've got friends with very convincing-looking fake IDs.

One of the fundamental principles of security is that you have to first figure out your threat model and then design a set of security mechanisms that works in the face of that threat model. Driver's licenses and similar IDs work well when your threat model is unsophisticated attackers trying to buy beer. They work a lot less well when you have a dedicated attacker who can plan their attack well in advance. TSA needs to get its threat model straight.

Oh, one more thing: you can fly without showing ID if you submit to a more intense search. While I suppose that this is appealing from a compliance perspective it's relatively silly from a security perspective. The last thing a terrorist is going to want to do is call attention to himself by refusing to show ID.

Posted by ekr at 05:03 AM | Comments (63) | TrackBack

More on the Watch List

Via John Gilmore, here is a pointer to the Electronic Privacy Information Center (EPIC) page on the TSA's No-Fly watchlist.
Posted by ekr at 04:51 AM | Comments (32) | TrackBack

Just glad my name isn't David Nelson

There's an article in the Oregonian about the Transportation Security Administration's watch list. The TSA maintains two lists:
  • A "no-fly" list of people who aren't allowed to fly.
  • A "selectee" list of people who get more carefully examined by security personnel before being allowed to fly.

Now, don't get me wrong. This isn't a completely stupid idea in principle. But what is bad about it is the way it's being executed. Apparently the list just contains names without sufficient disambiguating information. Thus, there's some guy named David Nelson who is on the list and so now anybody named David Nelson gets extra treatment. Worse yet, there's no way to get off:

Dave Nelson, the Salem lobbyist, spent a lot of time making phone calls after his trip to Atlanta, trying to learn how he could avoid the security hassles. "I thought I'd seen something on the news that you could get a pre-clearance, a photo I.D. We called the Port, and they knew nothing. I called the FBI and went up the ranks, and there's nothing like that. You're just stuck. I said, 'What if I used my full name, or just an initial?' They said, 'None of that would make a difference. You're on the list.' "

Aside from the obvious badness of hassling a bunch of innocent people, this is bad from a security perspective. The more false positives you have, the less likely that any particular suspect is a terrorist. So, the probability that the David Nelson who is actually on the list will be detected when he shows up gets lower. Worse yet, all this attention paid to a specific group of non-terrorists is attention that can't be paid to random people whose names aren't on the watch list--terrorists with fake IDs, for instance.
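
The false positive point is just base-rate arithmetic. As a back-of-the-envelope sketch, assume everyone with a listed name is equally likely to show up at the airport. The function name and the numbers below are made up for illustration; they are not real watch list figures.

```python
def chance_match_is_suspect(real_suspects, innocent_namesakes):
    """Fraction of name matches that are actually the person being sought."""
    total_matches = real_suspects + innocent_namesakes
    if total_matches == 0:
        raise ValueError("nobody matches the listed name")
    return real_suspects / total_matches


# Hypothetical: one wanted David Nelson and 999 innocent ones.
print(chance_match_is_suspect(1, 999))
```

With those made-up numbers, a screener who stops a David Nelson has a one in a thousand chance of having the right one, which is a good way to train screeners to wave David Nelsons through.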

Posted by ekr at 04:47 AM | Comments (34) | TrackBack

So, Microsoft will be the only company who can sell Linux?

Ok, now this is getting weird. Microsoft has licensed SCO's patents and source code. My previous comments about IBM potentially being the only vendor able to sell Linux of course apply equally well to Microsoft.

Update 0538:
One does wonder, however, what the antitrust implications of Microsoft owning the Linux market as well as the Windows market would be.

Posted by ekr at 04:31 AM | Comments (18) | TrackBack

May 18, 2003

I'd like highways that work better than my network

Wonko the Sane from PM Style responds to my commentary on his "AutoBGP" proposal for self-driving cars. I still think he's misunderestimating [0] the difficulty of the task he's proposed.

I wrote:

I wonder if Wonko knows how Internet packet routing deals with congestion: packets get dropped (i.e. discarded). I don't know about you, but I don't want my car discarded and I certainly don't want to be in it when it happens.
To which Wonko responds:
Of course I'm not suggesting that we literally apply an existing network routing protocol to cars. Cars are not packets. Individual cars are important; individual packets are not. I am also not suggesting that we use a CSMA/CD algorithm for determining how many cars we can get on to a given highway (despite the fact that a friend of mine insists this is how it is done in his homeland of the Philippines.) My point was that once a system knows about all the cars, congestion, accidents, construction, etc. decisions can be made that improve the throughput of the entire system.

The point I was trying to make here is that real networks have congestion. Even in situations where we have vastly more control over the behavior of the individual network elements and can do things we never could do with people (e.g. drop them) we still have network congestion. So, it's not at all clear that we know how to build the kind of switching system Wonko wants. Real networks have all sorts of weird instabilities and outages, which is why you can't always get to Amazon.com. Now, it's true that we have those things now, but my point is that it's not clear we can make something better. The keywords to look up here are: route flap and congestion collapse. Traffic engineering is very tricky.

This is a valid point. Pattern recognition is non-trivial. Maybe the solution here needs to be that "AutoDrive" functionality is only available on major highways, and select interconnections between the highways. Part of the catchall "necessary infrastructure" I referred to earlier would need to be a solution to keep deer, rocks, etc. off the highways. I'd be willing to bet that the majority of fatal accidents occur on a highway where the cars are going 60+ as opposed to residential areas where the speed limit is < 35 mph. (Maybe include a governor in vehicles that limits the speed to < 35 when not in an automatic mode?)

NHTSA has lots of statistics on this and it's not as clear-cut as you suggest. Roughly 25% of fatal crashes occur on roads with speed limits of 40 mph or less and over 75% occur on roads with speed limits of 55 mph or less. So, at best you're only going to improve mortality by about 50%. And it's not clear how much of that is "major" highways as opposed to minor highways with high speed limits. At least in CA, most major highways have speed limits of 65 or above and about 50% of fatalities are on two lane undivided roads. Hardly what I'd call "major highways". Worse yet, most of the high speed crashes happen in rural areas, so that's a lot of area to cover. I'd be surprised if there was any situation in which fencing off all rural highways so deer couldn't get on was cost effective.

Curing cancer, etc. may never be possible because we may never understand enough about how the human body works. Now imagine there was a disease that killed 40,000+ people a year, injured probably 10 times that many, was the leading cause of death among young people, and curing it did not require any scientific breakthroughs? Wouldn't it make sense to invest significant resources to combat this plague?

It's a common mistake to think that just because something you'd like to see doesn't require any new science that it's easier than something that does. There are lots of pure engineering problems that are basically intractable or at least incredibly difficult. You may have noticed that we don't have artificially intelligent computers yet, for instance.

Ok, I'll grant that "self driving cars" is not necessarily the only logical conclusion to this argument. Maybe research in this area "only" provides a set of technologies that significantly assists a driver to operate a vehicle safely. I'd be pretty happy with that.

This sort of work is already being done extensively in the private sector. The approaches include things like traffic radar, intervehicle communications, braking indicators, alertness sensors, etc.

[0] Best thing George W. Bush ever said.

Posted by ekr at 01:10 PM | Comments (62) | TrackBack

James M. Buchanan told you so

It's pretty well understood that democracy depends on a degree of interparty cooperation. Matthew Yglesias thinks that we're getting beyond that point now.
A democratic government depends on the observance of certain norms marking the boundaries between what officials do in their role as makers and implementors of public policy and what they do as leaders of political parties. When substantial groups of people start regarding their entire job as a nihilistic pursuit of partisan advantage, the whole thing breaks down. I wouldn't want to say that the Democrats are virgin-pure on this mark, but I can't help but look back at the '98 impeachment, the 2000 election, and the cynical timing of the Iraq war vote before the '02 elections and think that the Republicans are in many ways out of control.

After reading this, it occurred to me that James M. Buchanan must be very pleased with this turn of events.

One of the favorite topics of economists--especially Public Choice economists--is the inherent failings of democracy. If you treat representative democracy in a game theoretical sense, you quickly come to appreciate that many of the failings of our system such as log rolling, pork, etc. are really pretty much how you'd expect any collection of rational actors to behave. It's the occasional times when a politician acts in some way that really seems not to be in his self-interest at all that need explanation. From this perspective, it's not distressing that our government is so bad, but rather a bit puzzling that it works at all.

So, if Matthew is right and the Republicans are now behaving purely like vote seeking machines, then we might just get a system as bad as the Public Choice guys always said it should be. There's nothing as satisfying as predicting disaster and then having it happen.

Posted by ekr at 07:20 AM | Comments (11) | TrackBack

May 17, 2003

Surprise! Drug testing doesn't prevent student drug use

The New York Times reports that an extensive epidemiological study finds no evidence that student drug testing reduces student drug use. I've read the study and it looks pretty reasonable. This is actually a somewhat surprising result, which I suspect indicates that really aggressive testing is needed to make a dent (this wasn't controlled for in the study).

Of course, the Times article has the usual "I still think it's a good idea" quote from the man on the ground:

The study would not have swayed Randall Aultman, former principal of tiny Vernonia High School in Oregon whose decision to screen its athletes led to the Supreme Court's 1995 ruling. Drug use was so rampant among his students that he says "we had to do something drastic," without even knowing whether it was legal, much less effective.

"I don't think that drug testing works all the time, in all situations," Mr. Aultman said. "And the truth is there were many kids who said, `Yeah, we quit while we were in season and once the season was over we went back to using drugs.' "

Even so, Mr. Aultman added, other students quit for life, and "at that time, it really worked."

Note to Mr. Aultman: The plural of anecdote is not data.

Update 0713:
I've long wondered who originated that fantastic quip about anecdotes. A little googling turns up that it's Roger Brinner.

Posted by ekr at 07:05 AM | Comments (59) | TrackBack

That black box in your car

You know those black boxes they have in airplanes that they use to try to figure out what happened in crashes? Turns out that the airbag-control computers used in your car often do the same kind of data recording and they're starting to be used in accident reconstruction.
* In January in Fort Myers, Fla., a black box caused jurors to question the prosecution's argument that John Robert Walker was speeding recklessly before a head-on crash with another vehicle. Two people died. Walker was found not guilty after a defense expert testified his truck's black box showed he was driving about 60 mph at the time -- not above 90 mph, as a witness said.

* In April, Charles Tiedje, a police officer in Arlington Heights, Ill., won a $10 million settlement for severe injuries he suffered when a hearse struck his squad car on Oct. 13, 2000. The hearse driver, Aleksandr Babayev, claimed a medical condition caused him to black out before he hit Tiedje's car. But the hearse's black box showed he had been an active driver who accelerated to 63 mph -- about 20 miles over the posted limit -- seconds before he approached the intersection, then slammed his brakes one second before impact. Tiedje's attorney, Robert Clifford, says the black-box information ''was an unbiased witness to the crash.''

Naturally, the usual suspects (Insurance Institute for Highway Safety, EPIC) are lining up for and against the technology. Unfortunately for the anti side, USA Today quotes EPIC's Marc Rotenberg as saying:

''It's only partly about privacy. It's mostly about fairness,'' says Marc Rotenberg, executive director of the Electronic Privacy Information Center in Washington, D.C. ''Invariably, the information is used against the driver.''

This looks a bit uninformed, seeing as the first incident above is of the recording being used to a driver's advantage.

In any case, I suspect that barring some regulation to the contrary, these things are pretty much inevitable. If you're in the right it's tremendously advantageous to have the recording to show it. This gives consumers who are good drivers an incentive to have the boxes installed (assuming they have a choice, which of course, they don't really). And since most people believe that they are good drivers...

Posted by ekr at 06:54 AM | Comments (11) | TrackBack

May 16, 2003

What's wrong with certified virtual child pornography?

A couple of weeks ago, President Bush signed the PROTECT Act. The aspect of the act that's gotten the most attention is the creation of a nationwide "Amber Alert" system, which strikes me as silly but probably harmless. However, it also includes Congress's second pass at banning pornography that appears to be of children, whether real children were involved in its production or not. The first attempt, the Child Pornography Prevention Act, was struck down by the Supreme Court in 2002 (Ashcroft v. Free Speech Coalition).

What's the rationale for this? Apparently defendants have been claiming that their child pornography is virtual, and this is making it difficult to prosecute them:

(10) Since the Supreme Court's decision in Free Speech Coalition, defendants in child pornography cases have almost universally raised the contention that the images in question could be virtual, thereby requiring the government, in nearly every child pornography prosecution, to find proof that the child is real. Some of these defense efforts have already been successful. In addition, the number of prosecutions being brought has been significantly and adversely affected as the resources required to be dedicated to each child pornography case now are significantly higher than ever before.

(11) Leading experts agree that, to the extent that the technology exists to computer generate realistic images of child pornography, the cost in terms of time, money, and expertise is--and for the foreseeable future will remain--prohibitively expensive. As a result, for the foreseeable future, it will be more cost-effective to produce child pornography using real children. It will not, however, be difficult or expensive to use readily available technology to disguise those depictions of real children to make them unidentifiable or to make them appear computer-generated.

Now, I have no independent knowledge of point (10), though I'm prepared to believe it. It's certainly the defense I would use if I were a child pornographer. However, I'm quite certain that at least part of (11) is wrong. The kind of Photoshop work that would be required to make virtual child pornography--or more likely make adults appear younger--is well within reach of current technology and will be easier in the future. I invite you to check out http://www.photoshopcontest.com to get an idea of the kind of quality output that people can produce with commodity equipment in their spare time. (Don't worry, there's no child porn here, just lots of people playing with Photoshop.) Imagine what professionals can do.

Now, I'm not convinced that the burden of proving that a real child existed is sufficiently compelling to justify the First Amendment burden of banning potentially virtual child pornography, but there's something wrong with this law that doesn't require that analysis: There's a simpler remedy that doesn't infringe on free speech as much. Just move the burden of proof so that demonstrating that the pornography in question didn't involve an actual child is an affirmative defense. Then you can still prosecute anyone who can't prove that no minors were harmed.

If "no minors were used" is an affirmative defense then we could have a market in "certified virtual child pornography"--child pornography which came with a certificate of non-authenticity. It's true that this stuff would be more expensive to create and buy but it would come with a guarantee of non-prosecutability, which would surely justify the additional expense to people who considered it an adequate substitute.

Unfortunately, the PROTECT Act explicitly considers this possibility and rejects it:

(c) Section 2256 is amended by inserting at the end the following new paragraphs:

`(10) `graphic', when used with respect to a depiction of sexually explicit conduct, means that a viewer can observe any part of the genitals or pubic area of any depicted person or animal during any part of the time that the sexually explicit conduct is being depicted; and

`(11) the term `indistinguishable' used with respect to a depiction, means virtually indistinguishable, in that the depiction is such that an ordinary person viewing the depiction would conclude that the depiction is of an actual minor engaged in sexually explicit conduct. This definition does not apply to depictions that are drawings, cartoons, sculptures, or paintings depicting minors or adults.'.

(d) Section 2252A(c) of title 18, United States Code, is amended to read as follows:

`(c) It shall be an affirmative defense to a charge of violating paragraph (1), (2), (3)(A), (4), or (5) of subsection (a) that--

`(1)(A) the alleged child pornography was produced using an actual person or persons engaging in sexually explicit conduct; and

`(B) each such person was an adult at the time the material was produced; or

`(2) the alleged child pornography was not produced using any actual minor or minors.

No affirmative defense under subsection (c)(2) shall be available in any prosecution that involves child pornography as described in section 2256(8)(C).

and
(c) NONREQUIRED ELEMENT OF OFFENSE- It is not a required element of any offense under this section that the minor depicted actually exist.

The fact that Congress explicitly considered and rejected this possibility strongly suggests that the whole "harmful to children" rationale for banning child pornography is a sham. If lawmakers were really concerned about harm to children they'd welcome non-child-harming substitutes. Heck, they might even provide government funding to develop them. What's really going on, of course, is that they object to child pornography on principle, independently of whether children are harmed or not. However, since--like it or not--the rationale under which the Supreme Court (in Ferber) allowed restriction of child pornography was that its production was harmful to children, legislators can't offer their objection on principle as their rationale and are instead stuck with rather disingenuously hiding behind the "harmful to children" rationale.

Update 19:25:
Added disclaimer after photoshopcontest.com, cleaned up last paragraph and added reference to Ferber.

Posted by ekr at 06:26 PM | Comments (15) | TrackBack

Back on the air

I suffered a major database failure but I think I have things
mostly rebuilt. If you encounter weird behaviors, that's why. Please
let me know if something seems broken.

Posted by ekr at 06:23 PM | Comments (15)

The redistricting rathole

Josh Marshall has been following the Great Texas Redistricting Controversy. The story is complicated but the background is pretty simple: The Republican-controlled Texas legislature decided to redistrict (they're required to every ten years but this is being done earlier). Naturally, the plan they chose tended to favor Republicans. The Democrats didn't have the votes to block the plan so they fled the state in order to stop the legislature from having a quorum to vote on it at all. The Republicans sent the Texas Rangers after them. At this point we get into a confusing mess of claims about whether Homeland Security got involved, but I'm not here to talk about that. I'm here to talk about redistricting.

Districting
As all Americans learned in civics class, each State sends representatives to the U.S. House of Representatives. The number sent by each State is supposed to be roughly proportional to the percentage of the U.S. Population that lives in that state. (Deciding on exactly how many representatives each State gets is a rathole I won't get into now).

Now, once we know the number of representatives a State gets, we need to decide how the State chooses them. In general, the State gets divided up into a number of roughly equal-sized districts and each district gets to elect one representative. This division is generally done by the State legislature.

The problem is that exactly how the State is divided into districts is very important. This is easy to see with a simple example. Imagine that we live in the newly created State of Educated Guesswork (EG), which has 6 voters: 4 Republicans (denoted R1-R4) and 2 Democrats (denoted D1 and D2). EG is entitled to send 2 representatives to the House.

Consider the following two allocations:

Allocation 1
                District 1    District 2
Voters          R1, R2, R3    R4, D1, D2
Representative  Republican    Democrat

Allocation 2
                District 1    District 2
Voters          R1, R2, D1    R3, R4, D2
Representative  Republican    Republican

In Allocation 1, EG sends one Republican and one Democrat to the House. In Allocation 2, EG sends two Republicans. It's pretty standard practice for the party in power to attempt to choose an allocation that favors them as much as possible. Obviously, if Republicans control the district allocation they will choose Allocation 2 and if Democrats control it, they will choose Allocation 1. This practice is called gerrymandering. It's sort of expected that some gerrymandering will go on but that allocations that are too obviously unnatural are considered inappropriate.
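To see how much the choice of allocation matters, here's a minimal sketch that enumerates every way of splitting the six EG voters into two three-voter districts. The function names (winner, all_allocations, tally) are my own inventions for illustration, and the simple majority-wins rule is an assumption carried over from the example.

```python
from itertools import combinations

# The six-voter roster of the State of EG from the example.
VOTERS = ["R1", "R2", "R3", "R4", "D1", "D2"]


def winner(district):
    """Return the party holding a majority in a district (assumes no ties)."""
    republicans = sum(1 for v in district if v.startswith("R"))
    return "Republican" if republicans * 2 > len(district) else "Democrat"


def all_allocations(voters):
    """Yield every split of the voters into two equal-sized districts."""
    first = voters[0]
    rest = voters[1:]
    half = len(voters) // 2
    # Pin the first voter to District 1 so each split is counted only once.
    for others in combinations(rest, half - 1):
        d1 = (first,) + others
        d2 = tuple(v for v in voters if v not in d1)
        yield d1, d2


def tally(voters):
    """Count how many allocations produce each pair of winning parties."""
    counts = {}
    for d1, d2 in all_allocations(voters):
        outcome = tuple(sorted((winner(d1), winner(d2))))
        counts[outcome] = counts.get(outcome, 0) + 1
    return counts


if __name__ == "__main__":
    for outcome, n in sorted(tally(VOTERS).items()):
        print(outcome, n)
```

Of the ten possible splits, six elect two Republicans and four elect one of each party, so whoever draws the lines has both outcomes available just by choosing which voters go where.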

Unfortunately, figuring out what's fair or natural can be a tricky business. One could, of course, argue that Democrats represent a third of the State, so they should get one of the two representatives rather than none. On the other hand, we haven't taken geography into account. What if there are two "large" cities, each with two Republicans and one Democrat? Having districts that divide each city just so you can elect a Democrat seems pretty strange.

Redistricting
In the past, roughly three kinds of events have led to the allocations being changed (generally called redistricting).

  1. The Constitution requires it be done after every census (every 10 years).
  2. After court decisions ruling that the current districting was discriminatory (typically against blacks).
  3. When a new party comes into power and decides to consolidate their gains.

The third kind of redistricting is of course discretionary. Since the party in power this year may be the party out of power next year, it's easier on everyone if the redistricting is just done when necessary. Josh Marshall argues that there has been more or less a truce on this kind of redistricting in the past half-century or so, but that the Republicans have broken that truce.

What about automation?
The basic problem here, of course, is that as long as redistricting is done by politicians they will attempt to cheat to further their own ends. The natural reaction that most people have at this point is to automate the process so that there's no way for humans to interfere. This turns out to be much more difficult than it would at first appear.

There are a number of considerations to be taken into account:

  1. Roughly equal sized districts
  2. Geographical considerations
  3. Representation of minority interests
  4. Not overly biased towards one party.

Now, here comes the bad news. Even if we could all agree on the relative importance of each of these criteria, Micah Altman has shown that finding the best allocation is impractically computationally expensive (NP-complete). So, at the very best, we can only find approximate solutions.

Now, more or less accommodating goal (1) alone is pretty straightforward. Give everyone a randomly assigned number from 1 to the number of representatives your state has. People who have number 1 vote for representative slot 1, people with number 2 vote for representative slot 2, etc. This doesn't get absolute equality but it's pretty close. Unfortunately, your interests are probably most aligned with the people who live near you but most of them are voting for a different representative. So, this allocation scheme isn't very desirable.
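The random-number scheme is simple enough to sketch in a few lines. This is only an illustration: assign_slots and slot_sizes are hypothetical names, and I'm assuming each voter's number is drawn independently and uniformly.

```python
import random
from collections import Counter


def assign_slots(voter_ids, num_representatives, seed=None):
    """Give each voter a uniformly random slot from 1..num_representatives."""
    rng = random.Random(seed)
    return {v: rng.randint(1, num_representatives) for v in voter_ids}


def slot_sizes(assignment):
    """Count how many voters ended up voting for each representative slot."""
    return Counter(assignment.values())


if __name__ == "__main__":
    # Hypothetical state with 10,000 voters and 4 representatives.
    sizes = slot_sizes(assign_slots(range(10_000), 4, seed=2003))
    for slot in sorted(sizes):
        print(slot, sizes[slot])
```

The slot sizes come out close to equal but not identical, which is the "pretty close" above. Note that nothing in the assignment looks at where a voter lives, which is exactly the drawback.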

There are deterministic (though not optimal) schemes for doing roughly proportional geographic allocation, but they often don't have desirable properties with respect to the other two properties--minority representation in particular. It turns out to be difficult even to determine whether a given allocation was the result of gerrymandering.

What is to be done?
So, we know that neither computers nor humans can do a perfect job, so we're going to have to settle for a less than perfect job. This gives us at least two options:

  • Have an essentially computer-based process that is statistically fair. While we may not be able to design a system that does an optimal job, we may be able to design a system which typically does an OK job and on average does a more or less fair job. This is obviously easier the simpler the technique is. For instance, you might use one of the geographic/proportional allocators with a random starting point. The main obstacle to this kind of approach is convincing people to accept a definitely suboptimal outcome. For a number of psychological reasons, people seem to be willing to accept human error on a scale that they'd never accept from machines.
  • Have an essentially human based process that is difficult to game. I'm not really sure how to design such a process but you might imagine taking turns doing allocations or something like that. Obviously, this would result in some weird allocations but it might at least avoid the dominant party having as much control as it does now.

Neither of these answers is very good, however. We're pretty much looking at the messy underside of democracy here. Once both sides decide they don't want to cooperate at all, democracy starts to pretty much disintegrate. That's what we've seen in the judicial confirmation meltdown and, if some sort of truce isn't reestablished, it may be the endgame for redistricting as well.

Posted by ekr at 12:15 PM | Comments (51) | TrackBack

Peer to Peer Security

Caught Dan Wallach's Stanford Security Seminar talk on security for Peer To Peer networks yesterday. Some interesting ideas but the bottom line is that it's an incredibly hard problem. In order to make any headway at all, Wallach had to assume that there was some central authority which assigned peer to peer node IDs. The difficulty of this, though, is that a central authority is what made Napster so easily attackable. This authority is somewhat less vulnerable, but not completely safe. Still, an interesting engineering problem if you think P2P networks are interesting.
Posted by ekr at 08:01 AM | Comments (49) | TrackBack

May 15, 2003

Just because it's public knowledge doesn't mean it's not classified

Mark Kleiman has an interesting post about how often things are classified merely because they're embarrassing. Another funny property of the classification system is that things can stay classified long after there's any useful purpose for them to be so, even to avoid embarrassment. A friend of mine used to work for the National Security Agency. At some time during his employment the topic of spy satellites came up and he was instructed that their existence was classified. But, he countered, everyone knows about them. The response: "Just because it's public knowledge doesn't mean it's not classified."
Posted by ekr at 02:23 PM | Comments (1) | TrackBack

Welcome to Iraq, have some red tape

Perry Metzger pointed me at this New York Times article talking about how the interim Iraqi authorities are refusing to approve all sorts of useful services.
Many other entrepreneurs are having the same experience. With ordinary telephone service all but nonexistent in much of the country, and likely to remain so for months, many companies want to roll out cellular phone systems that they say could be operating within weeks.

But American administrators say they will not license any commercial mobile telephone services, even temporarily, until they have a comprehensive regulatory system to govern issues from what they call "interoperability" to telephone numbers.

Meanwhile, Iraq's biggest pharmaceutical company, Dofar Pharmaceuticals, says it has large inventories of antibiotics and other medicines but remains unable to sell them because it cannot figure out whom to deal with in the American-led office of reconstruction.

That office, which is supposed to oversee the rebuilding of Iraq, is in the midst of an overhaul. Jay Garner, the retired lieutenant general who has been in charge, is being replaced by L. Paul Bremer III, a presidential envoy.

"It's been very difficult," said one exasperated American business consultant. "You can waste three hours a day walking around the halls trying to talk to people."

It's not clear what "interoperability" means. I imagine it's the ability to call from system A to system B with the same number space. But that's not necessary to have something working. The only scarce resource for telephony is frequency spectrum and that could be quickly auctioned off. Sure, it would be nice to have the systems be able to cross-connect, but it's not some sort of showstopping requirement.

Decent telecommunications are pretty much key to having a functioning business environment. Either we have to build up the infrastructure fast or get out of the way and let someone else do it.

Posted by ekr at 07:37 AM | Comments (85) | TrackBack

Hair, tumors, whatever

Scientists at the University of Michigan have discovered that beta-catenin, a protein which seems to be involved in some cancers, can also be used to trigger massive hair growth. In tests in mice, they've managed to produce a lot of additional hair growth.

It's not clear if this can actually be used to produce hair growth in bald humans, or worse yet, if it would cause cancer as a side effect. Still, considering what bald men seem to be willing to go through to get their hair back, that's a risk a lot of them would probably be willing to take.

Posted by ekr at 06:52 AM | Comments (18) | TrackBack

Brad DeLong on Cuba

Brad DeLong has an interesting post on Cuba. He makes the interesting point that Cuba was actually in extremely good economic shape in 1957 (pre-Castro):
You take a look at the standard Human Development Indicator variables--GDP per capita, infant mortality, education--and you try to throw together an HDI for Cuba in the late 1950s, and you come out in the range of Japan, Ireland, Italy, Spain, Israel. Today? Today the UN puts Cuba's HDI in the range of Lithuania, Trinidad, and Mexico. (And Carmelo Mesa-Lago thinks the UN's calculations are seriously flawed: that Cuba's right HDI peers today are places like China, Tunisia, Iran, and South Africa.)
Posted by ekr at 06:20 AM | Comments (22) | TrackBack

Showing the copied code under NDA???

Linux Journal is reporting that SCO is going to show the allegedly copied Linux code to an "independent panel of experts" under NDA. Remind me again, why all the secrecy? It's not like Bruce Perens is going to get in his time machine and retroactively remove it all from every shipping Linux distro.
Posted by ekr at 06:04 AM | Comments (47) | TrackBack

May 14, 2003

IBM's response to SCO

You can find a copy of the response here. Essentially, it says "Did not!".

Also, IBM has filed to have the case moved to federal court.

Posted by ekr at 02:47 PM | Comments (13) | TrackBack

Reading SCO's complaint

I've now read SCO's complaint against IBM. Here are some relevant sections:
84. Prior to IBM's involvement, Linux was the software equivalent of a bicycle. UNIX was the software equivalent of a luxury car. To make Linux of necessary quality for use by enterprise customers, it must be re-designed so that Linux also becomes the software equivalent of a luxury car. This re-design is not technologically feasible or even possible at the enterprise level without (1) a high degree of design coordination, (2) access to expensive and sophisticated design and testing equipment; (3) access to UNIX code, methods and concepts; (4) UNIX architectural experience; and (5) a very significant financial investment.

85. For example, Linux is currently capable of coordinating the simultaneous performance of 4 computer processors. UNIX, on the other hand, commonly links 16 processors and can successfully link up to 32 processors for simultaneous operation. This difference in memory management performance is very significant to enterprise customers who need extremely high computing capabilities for complex tasks. The ability to accomplish this task successfully has taken AT&T, Novell and SCO at least 20 years, with access to expensive equipment for design and testing, well-trained UNIX engineers and a wealth of experience in UNIX methods and concepts.

86. It is not possible for Linux to rapidly reach UNIX performance standards for complete enterprise functionality without the misappropriation of UNIX code, methods or concepts to achieve such performance, and coordination by a larger developer, such as IBM.

This is to some extent true, but very misleading. That 20 years includes 20 years of academic research in high end symmetric multiprocessing. It's difficult, but it's not complete rocket science either. It's absolutely not the case that it couldn't be done without IBM's support or without misappropriating UNIX source code. In fact, the FreeBSD project is SMP-enabling FreeBSD without either of these things.

Then there are a bunch of what I suspect are out of context quotes from various IBM people in an attempt to suggest that IBM has behaved improperly. None of these are very convincing. Mainly it's IBMers saying that Linux is a natural extension of AIX. This is pretty standard marketing-speak and need not have anything to do with the technology.

What's totally missing from this complaint is any actual evidence that IBM misappropriated pieces of UNIX. The exhibits consist solely of copies of the various license agreements. It's really unclear to me what tactical advantage SCO hopes to obtain against IBM by refusing to show whatever evidence they have. On the other hand, if their goal is to spread fear among Linux users, as I suggested previously, then this makes a lot of sense.

Posted by ekr at 02:44 PM | Comments (18) | TrackBack

SCO round 2: SCO versus the world

In our last episode, SCO was merely making threatening noises about suing random Linux users. Now they've gone beyond that and published an open letter essentially threatening all commercial Linux users. The court filings are now also available.
Posted by ekr at 02:27 PM | Comments (46) | TrackBack

The MPA vs. Lyric Servers

One of the long-standing network services (predating the Web, in fact) is transcriptions of song lyrics. In the past, a number of lyrics servers have been shut down by the original copyright holders and today the BBC reports that the Music Publishers' Association has forced LyricFind to shut down (it's still up, but most of the lyrics aren't).

While it seems pretty clear that the copyright holders have the right to forbid Internet publication of their lyrics, is this a good idea? As far as I know, the copyright on lyrics allows songwriters to make money in two ways:

  1. Royalties whenever a song is performed.
  2. By selling printed versions of the lyrics.

Clearly, lyrics servers don't cut into business (1) at all. On the contrary, they make it very slightly easier to perform songs and so probably help it to some insignificant extent. So, the question is whether they cut into (2) much. I doubt it. The only people I know who have ever bought copies of lyrics are people who bought the sheet music so that they could perform the piece. To the extent that that's true, lyrics servers don't really represent lost revenue to the music publishers at all. So, why bother to make a big stink?

It's of course possible that (like the RIAA) the MPA really believes that every downloaded file is a lost sale. I don't see how anyone who thinks about it for a minute could believe that. The only other alternative I see is that they really think of the copyright as an inherent property right and so they're suffering some psychic harm even if there's no actual harm. Can my readers do any better?

Posted by ekr at 07:55 AM | Comments (54) | TrackBack

Does Andy Kessler understand insurance?

I'm thinking the answer is no. In Andy's anti-Warren Buffett screed he criticizes Buffett for owning insurance companies:
I'm not just picking on Buffett here, I think all insurance companies are a drain on the U.S. economy, taking valuable capital out of circulation for the risk of a "rainy day" and investing them in "low risk" securities.

This is of course completely backwards. Let's imagine that you foresee the possibility of some large loss (say your death) but that insurance companies don't exist. So, you want to provide for that loss. What do you do? Well, you take a bunch of money and you stash it somewhere. Assuming you're risk averse (most people are) you stash it somewhere safe like your bank account or in T-bills. These are the classic low-risk securities.

Now, imagine that you instead take out that money and buy an insurance policy. Of course, lots of other people have done the same thing and so the insurance company has a big pile of money. Some of that money gets paid off to people who died but what does the insurance company do with the rest of it? They could just keep it under their mattress, but probably they invest it. Now, they could invest it in something really safe but generally they're less risk averse than you (that's why they're willing to sell you insurance anyway) and so they can afford to put some of it in the market. In fact, two paragraphs before, Kessler tells you that that's exactly what Buffett's insurance companies do. So, what's all this nonsense about taking capital out of circulation? Quite the contrary! Buffett is allowing more capital to be in circulation.

Even if insurance companies just invested in T-bills, Kessler would still be wrong. Insurance companies enable a broad class of transactions that otherwise could not occur. For example, say you're buying a house. Unless you're unusual, you almost certainly don't have enough money to pay for it, so you have to borrow the money from a bank. The bank uses the house itself as collateral for the loan. If you stop paying, the bank takes the house. Now, what happens if the house burns down? There's nothing for the bank to take and you might just decide to stop paying.

Obviously, the bank doesn't want this to happen. So, they require you to carry insurance on the house. Therefore, they're guaranteed that if the house burns down they can still recover their money. Without that insurance, the bank would be much less happy to lend you the money. At best, they'd have to charge you a lot higher interest--in effect, self-insuring against the loss. At worst, they wouldn't lend you the money at all.

Of course, no one likes paying insurance, and it's doubly annoying when you realize that on average you're losing money. However, saying it's a drain on the U.S. economy is just silly. I guess that's why TCS has Andy Kessler writing about the stock market and someone else writing about economics.

Update 14:33
Kevin Dick points out another substantial advantage of insurance. If you don't have insurance and you need to protect yourself from a given risk you need to have a large pool of cash sitting around to cover that risk. Insurance allows risk pooling and so reduces the amount of capital that's sitting around doing nothing but covering risk.
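Here's a rough sketch of the risk-pooling arithmetic. Everything in it is an assumption made for illustration: the losses are independent, every loss is the same size, each person self-insuring holds the full loss amount in cash, and the pool holds enough to cover claims with a chosen confidence level. The names claims_to_cover and capital_needed are hypothetical.

```python
from math import comb


def claims_to_cover(n, p, confidence):
    """Smallest k with P(at most k of n independent losses) >= confidence."""
    cumulative = 0.0
    for k in range(n + 1):
        # Binomial probability of exactly k losses among n people.
        cumulative += comb(n, k) * p**k * (1 - p) ** (n - k)
        if cumulative >= confidence:
            return k
    return n


def capital_needed(n, p, loss, confidence=0.999):
    """Return (total cash held if everyone self-insures, cash the pool holds)."""
    self_insured = n * loss
    pooled = claims_to_cover(n, p, confidence) * loss
    return self_insured, pooled


if __name__ == "__main__":
    # Hypothetical numbers: 1000 people, 1% chance each of a $100,000 loss.
    alone, pooled = capital_needed(1000, 0.01, 100_000)
    print("self-insured:", alone)
    print("pooled:", pooled)
```

Under these assumptions the pool ties up only a small fraction of what the same people would hold individually. The independence assumption matters a lot: if one event can hit many policyholders at once, the pool needs considerably more.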

Posted by ekr at 07:20 AM | Comments (56) | TrackBack

May 13, 2003

You can't tell a song by its title

As part of their campaign against song swapping the RIAA has been sending DMCA "takedown" notices to people suspected of swapping music copyrighted by RIAA members. Unfortunately, the tools that they've been using are apparently quite primitive and so generate false positives, for instance when someone is offering a file with the same name as a copyrighted song. CNET reports that the RIAA has withdrawn 24 such letters, blaming a temporary employee.

Given the number of possible song titles there are going to be a lot of such duplicates. The RIAA might want to develop some technology to actually verify whether two files are the same song.
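As a very crude first step, one could at least compare file contents instead of file names. The sketch below hashes each file and compares digests; content_digest and same_content are hypothetical names, and I have no idea what tools the RIAA actually uses. It only catches byte-for-byte identical files. Two different encodings of the same song would hash differently, and handling that needs real audio fingerprinting, which this doesn't attempt.

```python
import hashlib


def content_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file's contents."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large files don't have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True if the two files are byte-for-byte identical, whatever their names."""
    return content_digest(path_a) == content_digest(path_b)
```

Even this much would avoid the false positive described above, where an unrelated file merely shares a name with a copyrighted song.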

Posted by ekr at 03:44 PM | Comments (10) | TrackBack

Infectious diseases getting worse?

Kevin Dick pointed me to this rather overheated story in the San Jose Mercury News about how infectious disease is getting much worse:
The nation's top scientists say that environmental, economic, social and scientific changes have helped to trigger an unprecedented explosion of more than 35 new infectious diseases that have burst upon the world in the past 30 years. The U.S. death rate from infectious disease, which dropped in the first part of the 20th century and then stabilized, is now double what it was in 1980.

It's not clear that this is really a general trend. The diseases they mention: SARS, Ebola, E. coli, cryptosporidium, Legionnaires, Hep C, and AIDS, are really a mixed bag. The only one of these diseases that causes a significant number of deaths is AIDS. (Six diseases cause 90% of infectious disease deaths: respiratory infections such as flu and pneumonia, AIDS, diarrhoeal diseases, TB, malaria, and measles). The other 5 are of course very old.

So, to what extent do we have a generic infectious disease problem? SARS has probably killed fewer than 1000 people. The remainder have small outbreaks every so often but aren't really major killers. More likely, then, as our medical technology gets better it becomes easier for us to identify new diseases. That combined with our society's relatively new fascination with infectious diseases means that we have a lot more attention being paid to new diseases of all kinds.

Bottom line: we've got a serious AIDS problem and a raft of other not very important diseases that people are nevertheless unduly concerned with.

Posted by ekr at 01:26 PM | Comments (17) | TrackBack

Self-driving cars--maybe not

Wonko the Sane over at PM Style thinks it would be a good idea to spend a lot of money developing self-driving cars:
Steve's posting regarding fatalities from driving related accidents is one of my hot-buttons. According to the Centers for Disease Control and Prevention web-based Injury Statistics Query and Reporting System (WISQARS), in the year 2000 there were 43,604 motor vehicle related deaths, and motor vehicle traffic was the leading cause of death in persons aged 1 to 24. (Assuming I am reading the data correctly. Check it out if you are in a morbid mood.)

Compared to curing cancer or heart disease, this is an easily solvable problem; just really expensive. We should be spending a significant amount of money (e.g. relative to our national defense budget) on self-driving cars and associated necessary infrastructure. Then network and route 'em like packets. In addition to significantly reducing accidents, there would be no more traffic, decreased fuel consumption, and I could nap / eat / watch TV on my way into work.

There are a number of reasons why this isn't a good idea. Let's start with the analogy to packet routing. I wonder if Wonko knows how Internet packet routing deals with congestion: packets get dropped (i.e. discarded). I don't know about you, but I don't want my car discarded and I certainly don't want to be in it when it happens. Worse yet, the Internet protocols only control congestion well when all the agents cooperate. What does this say about transition strategies?
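For anyone who hasn't looked at how a router queue behaves under load, here's a minimal sketch of the simplest policy, usually called drop-tail: when the buffer is full, new arrivals are thrown away. The class name and its methods are my own, and real routers use a variety of queue management schemes, so treat this as an illustration of the idea only.

```python
from collections import deque


class DropTailQueue:
    """A fixed-size queue that discards arrivals once it is full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = deque()
        self.dropped = 0

    def enqueue(self, packet):
        """Accept the packet if there is room; otherwise drop it."""
        if len(self.queue) >= self.capacity:
            self.dropped += 1
            return False
        self.queue.append(packet)
        return True

    def dequeue(self):
        """Forward the oldest queued packet, or return None if empty."""
        return self.queue.popleft() if self.queue else None


if __name__ == "__main__":
    q = DropTailQueue(capacity=2)
    for car in ["car-1", "car-2", "car-3"]:
        print(car, "queued" if q.enqueue(car) else "dropped")
```

With a capacity of two, the third arrival is the one that gets discarded, which is fine for a packet that can be resent and rather less fine for a car.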

Second, we don't really have the first clue how to do this. It's one thing to design a system that works on the freeway, where traffic behavior is (relatively) predictable. It's quite another to build a car smart enough to avoid cats, deer, and small children in residential areas. That involves having pattern recognition systems we're not even close to knowing how to build.

Third, large automated systems are inherently very tricky. Anyone who reads the RISKS digest is naturally pretty wary of designing a large centralized automated system. Ask yourself how often you can't get to a web site of your choice and whether this is the level of reliability you want from your car. Moreover, we haven't even mentioned the potential threats from someone taking over the routing system. The mind boggles.

It's worth pointing out at this juncture that even though airplanes mostly have computerized autopilots, the air traffic control system isn't 100% automated. Let's try this idea on a nice "simple" system like that before we try to convert the 140 million or so cars on the road.

Posted by ekr at 08:03 AM | Comments (55) | TrackBack

May 12, 2003

More on smallpox

Medpundit responds to my post on smallpox in the comments section. It's nice to get a medical opinion, so even though I disagree with her, I wanted to respond to this in the main blog, rather than the comments section.
1) An "unsophisticated" smallpox attack would be just as deadly as a "sophisticated" one. The disease is highly contagious. One person can infect scores of others, especially in a population with no immunity. (Everyone under thirty here.) This is exactly what happened to Native Americans in North and Central America in the 16th to 18th centuries. Whole villages were wiped out by the disease.

It's true that smallpox is quite contagious, but I can't agree that an unsophisticated attack would be as deadly as a sophisticated attack. The RAND study I referenced explicitly modelled a number of different kinds of attack, ranging from "a few infected people cross the border from Mexico" to "spraying virus into airport terminals". The death rates, even with no prior vaccination, span over three orders of magnitude (from about 20 to 50,000). Now, of course, it's possible that the RAND people are wrong, but I don't think they're likely to be so wrong as for it to make no difference how the virus is introduced into the population.

2) The smallpox virus is a very large virus. It has a lot of different proteins on its surface which stimulate our immune response. It would be difficult to alter them enough to dodge the immune system while still maintaining the viruses's virulence. Anything's possible, but this is particularly unlikely to have happened.

Would that this were so. Unfortunately, the news on this front looks pretty bad. In 2001, a bunch of Australian scientists engineered a mousepox virus that produced large amounts of interleukin 4. It ripped through mouse populations with something like 100% mortality and was specifically resistant to attempts to vaccinate. Only about half the vaccinated mice actually seemed to have immunity. Mousepox and smallpox are closely related so it seems likely the same techniques could be applied to smallpox.

3) Infected healthcare workers, unless they get the vaccine within hours of exposure, will be very sick - too sick to care for their patients and too sick to vaccinate the rest of the population (i.e. everyone) who would then need to be vaccinated after an attack. It makes more sense to vaccinate healthcare workers now to avoid just such a scenario. The risks of doing so now are nothing compared to the risks of not vaccinating. Unless there is absolutely no smallpox virus left in the world. And we know that's not true.

According to Mortimer (referenced in the original post) the window of effectiveness from exposure to primary vaccination is something like 4 days. That's really plenty of time.

As you can see from my response, I'm not convinced that the risks of not vaccinating are in fact high enough to motivate vaccination. And even if they are, I think we need to find some way to compensate health care workers for the risk they're being asked to take.

Posted by ekr at 08:11 PM | Comments (9) | TrackBack

The incredible non-shrinking sheep

Scientists at Australia's Commonwealth Scientific and Industrial Research Organization (CSIRO) think they can breed sheep which produce wool that doesn't shrink in the wash, or at least shrinks less. They found that some sheep produce wool that shrinks a lot less than other wool, which suggests that they can breed for this trait. Lisa will be so pleased.
Posted by ekr at 05:29 PM | Comments (61) | TrackBack

Should health care workers be vaccinated for smallpox?

Medpundit observes that health care workers don't seem to want to get vaccinated for smallpox. I'm not sure this is actually such a bad thing.

How much (if any) smallpox vaccination to use is a difficult public health decision theory problem. We have a pretty good idea of the negative consequences of widespread smallpox vaccination of healthy people but don't have any real idea of the likely number of deaths from smallpox if we don't vaccinate. The difficulty here is that it's very hard to estimate the likelihood or potential impact of an attack using smallpox with any certainty at all. The standard procedure in these circumstances is to invent some scenarios and try to figure out whether the policy being considered pays off under those circumstances. When RAND studied this issue back in January, they concluded that vaccinating health care workers probably made sense but vaccinating the public did not.

However, the RAND modellers don't seem to have considered the possibility that a potential smallpox outbreak might involve a strain against which the vaccine is ineffective (which it's believed the Soviets were preparing). Kevin Dick pointed out to me that the situation is bimodal. Either the attacker is sophisticated, in which case we would expect both a sophisticated attack and a vaccine-resistant smallpox strain, or the attacker is unsophisticated, in which case the attack will be unsophisticated and unlikely to involve a resistant smallpox strain. If we believe this analysis, then prior vaccination starts to look like a really bad bet, since the vaccine wouldn't help in the sophisticated case and the number of deaths in the unsophisticated case would be small anyway.

Moreover, health care workers are being asked to bear a really disproportionate amount of risk. Post-exposure prophylaxis of smallpox appears to be very effective [0] and although health care workers are more likely to come in contact with smallpox they're also more likely to be in an environment where it is properly recognized and therefore most likely to get the vaccine. Therefore, it's not clear how substantial the benefit of prior vaccination is for health workers themselves, as opposed to the general public.

Kevin's analysis has made me quite unsure as to whether it's worth doing any prior vaccination at all. We certainly shouldn't be surprised if health care workers balk. Moreover, if we're to ask them to bear most of the risk, we should find some way to compensate them for bearing it.

[0] Mortimer, P.P., "Can postexposure vaccination against smallpox succeed?", Clinical Infectious Diseases, March 1, 2003. (not on the web).

Posted by ekr at 05:08 PM | Comments (16) | TrackBack

Gas station under new management?

The Shell gas station down the street from my house is flying a big balloon that says "under new management". I'm trying to figure out what's up with that. Gas is pretty much a commodity item and it's still a Shell, it's just owned by someone else now. How is this relevant to me at all? To make things even more silly, it's right off the freeway so probably a very substantial amount of its business is from people who aren't local and therefore wouldn't notice a management change anyway.

Baffling.

Posted by ekr at 03:26 PM | Comments (66) | TrackBack

Google to start a separate blog index

The Register reports that Google is going to start a new index for blogs and (presumably) remove them from the main search index. The problem, for people who aren't familiar with Google, is that Google weights pages by the number of pages that link to them. Since bloggers do a lot of linking to other blogs, this means that Google tends to weight them very heavily, which isn't necessarily what you want.

It remains to be seen how well this works. The obvious danger is that as blogging software is becoming an increasingly popular way to self-publish, more and more web pages will become blogs and it won't be possible to make a useful distinction between blogs and other pages.

Posted by ekr at 08:28 AM | Comments (10) | TrackBack

May 11, 2003

Doctors vs. Pharma companies: a category error

DB of DB's Med Rants writes
The pharmaceutical companies have a goal - profits. Physicians generally (I will admit to some exceptions) have the joint goals of making money and helping their patients. Once one understands and accepts the pharmaceutical companies goals as a given, then one understands that they need not do studies unless they think those studies will result in a marketing edge. That is why I want a pharmaceutical tax which would fund the important studies.

This comparison is a serious category error. Pharma companies, of course, don't have desires at all--they're just organizations. I'm not convinced that individual doctors are any more or less altruistic than drug researchers (DB doesn't seem to me to be any more altruistic than Derek Lowe) nor that pharma companies (organizations of people who make and sell drugs) are any more or less altruistic than hospitals (organizations of people who deliver medical care). Ordinarily, it's pretty common to talk about companies as if they have monolithic desires but it's pretty misleading here, in a way that is designed to make doctors look better than they necessarily are. If we're to think rationally about pharmaceutical pricing and supply, it would help to start by talking rationally about the players.

As for the suggestion of funding studies by a tax on pharmaceuticals--why? The usual effect of taxing some good is to decrease the quantity demanded. Does DB think that too many pharmaceuticals are consumed? I'm not saying that these studies aren't worth doing--they probably are--but I wish that people would think about the economics a bit before proposing distortionary taxes.

Posted by ekr at 05:23 PM | Comments (20) | TrackBack

Why the SCO suit could be good for IBM

I've mentioned previously that SCO is suing IBM for allegedly putting sections of System V Unix (to which SCO holds the copyright) into Linux. Seeing as IBM can deploy an army of lawyers, I never thought that this represented much of a threat to IBM, but I just realized that it might actually be good for them.

IBM sells hardware that runs Linux. The problem is that both hardware and Linux are commodities. There's a glut of PCs and any schmuck with a commodity PC can download Linux (or buy it from RedHat) and put it on their machine. This limits the amount of money that IBM can charge for their Linux-based machines. The SCO lawsuit represents an opportunity for IBM to decommodify Linux.

Consider what happens if the threat of a big lawsuit by SCO (or anyone else who thinks that Linux infringes their copyrights) frightens consumers who would otherwise use Linux. The number of people downloading free copies of Linux or using the commercial distros drops dramatically. But IBM can immunize themselves from this by simply indemnifying anyone who buys Linux from IBM.

Of course, SuSE and Red Hat could follow suit and indemnify their customers, but would this be credible? The Linux companies--as opposed to IBM--aren't particularly rich, so it's not clear that they could really indemnify big customers. IBM, however, easily could. Thus, even though Linux is basically free, IBM could achieve a substantial edge in the critical business market because they would be the only company who could offer you an ironclad guarantee that no one would come suing you for copyright or patent infringement just for using Linux.

Just a thought...

Posted by ekr at 02:57 PM | Comments (66) | TrackBack

Spam gets weird

I just got the following piece of e-mail:
Subject: Has your life been Ruined by Evil? putjhpckd jn cu a z

Have you really, really, really been hurt to the point where your live is a living hell?

~ Has somebody or something drastically altered your life? ~
~ Would you give anything to take back your stolen life? ~
~ What if there was a way to undo all done to you for $100,000? ~

What I am referring to is something which is well covered up from the general public! I have access to the way, and need just one single person to work with.

Who I pick will be determined on the severity of their situation. This is your one and only chance to live life over, and take control over what was stolen from you. Mentally stable open minded individuals a must! Someone close to the Boston area is preferred.

If you want your life back and would like for me to consider you, email a brief description of your situation to me at powercrystals@firemail.de

Please do not reply directly back to this email as it will only be bounced back to you

I must say, I'm pretty curious as to what this guy is offering. It sure looks like it's a contract killing, so probably he's looking to suck you in and then blackmail you. I'm thinking maybe it's better not to answer...

Posted by ekr at 10:11 AM | Comments (29) | TrackBack

Three cheers for the green revolution

R.E. Evenson and D. Gollin have a nice review article on the green revolution in the May 2nd Science. Unsurprisingly, they conclude that the green revolution has been a big success. Without the Green Revolution, caloric intake for people in developing countries would have been 13.3-14.4% less and 6.1-7.9% more children would have been malnourished. Other observations:
  • There wasn't a big one time jump in the 1960s, but rather there has been relatively steady growth over the last 40 years.
  • Sub Saharan Africa improved a lot more in the 1980s and 1990s than in the 1960s and 1970s. Their conclusion is that the modern crops available in the early years weren't suitable for Sub-Saharan Africa.
  • A lot of the improvement (about 60%) is due to fertilizers and other modern cultivation methods, not to modern crop varieties. This was more true in the early years than the later years, especially in Africa. This suggests that we're really learning how to develop better crops.
  • The green revolution has produced a big improvement for consumers but not so big for farmers, since prices have dropped a lot to keep pace with improved production. This is about what you would expect from increased production of commodities.
The prognosis for the future is mixed. On the one hand, Evenson and Gollin say the pipeline for future research is full so we ought to expect even better crops. However, the big government funded research centers that did a lot of the new crop development are having funding problems and it's not clear if the private sector will pick up the slack.
Posted by ekr at 10:02 AM | Comments (29) | TrackBack

I guess eggs are good for you

Last week's Science has an interesting article on growing human proteins inside of chicken eggs. Say you have some drug that's based on a protein. How do you make it? The traditional way would be to use cultured and engineered cells, but this requires a lot of maintenance. The obvious solution is to use transgenic animals and some forward progress has been made on using goats and cows. Chickens make a better production system since they're easier to ramp up and keep, but over the past 20 years it's been very difficult to actually make the appropriate transgenic chickens.

The problem is that the standard way to make transgenic animals is to work on a newly fertilized egg. But the chicken egg is already 60,000 cells by the time it's been laid. In one of those wonderful "idea whose time has come" moments, three research groups have come up with three independent techniques for solving this problem:

  • Helen Sang at the Roslin institute (where Dolly was made, you'll recall) has managed to do gene therapy using a lentivirus on a freshly laid egg. Her chickens express a green fluorescent protein.
  • Jin Qian at BioAgra does gene therapy on the sperm to make chickens which produce alpha and beta interferon.
  • Jim Petitte at NC Raleigh used some other method that frustratingly isn't mentioned in the article. Harvesting from the oviduct, perhaps?

Word is we can look for transgenic chickens producing pharmaceuticals in the near future.

Posted by ekr at 07:53 AM | Comments (17) | TrackBack

May 10, 2003

I don't like paying taxes any more than the next guy

But apparently the next guy isn't Gene Austin, who is fasting out in front of the Austin IRS office (for 25 days now) until he gets an answer to the question "Where is MY tax liability in the law?"
Posted by ekr at 07:52 AM | Comments (1) | TrackBack

I've paid my taxes!

Just happened to be looking at We The People's 97-page anti-tax manifesto (look at your own risk, it will just make your head hurt) and noticed that they claim that "the average Wisconsin citizen had to work until May 9th this year to pay all alleged tax obligations". Now, I'm sure that I'm not paying exactly the same income tax as the average Wisconsin resident, but I figure I ought to have spent my 5-6 months of the year working for the government real soon now.

I'm utterly baffled by groups like We The People who try to "prove" that we don't have to pay taxes. What the hell do they hope to accomplish? The law's not like mathematics where everyone just has to (or so they may fantasize) accept the force of your superior reason. It's enforced by guys with guns. Now, I'm not saying that sometimes a really good argument doesn't convince the courts to do something they don't want to do (though it's pretty rare). But pretty much any conceivable anti-tax argument has already been tried out in the courts with no success. Give it up, people, you lost.

Posted by ekr at 07:50 AM | Comments (13) | TrackBack

May 09, 2003

If you don't know how to pronounce a word, say it loud

Reuters reports that WHO Director-General Brundtland wants the heads of food companies to "play a key role in shifting public taste in its campaign for healthier diets worldwide". That sounds nice, but what does WHO mean by healthier diets? Well, let's see...
A recent report by WHO and sister U.N. agency, the Food and Agriculture Organization (FAO) recommended cutting down on fats, sugars and salt, and increasing consumption of vegetables and fruit, saying such changes would have a major impact on the death toll from the so-called "lifestyle" diseases.

The recommendation to eat more fruits and vegetables sounds healthy, but as I've blogged about previously the data that suggest that we ought to eat any particular macronutrient composition are pretty inconclusive. Given that, maybe the guys from WHO could stick to the scientific evidence and concentrate on the health problems we really do understand rather than hectoring us about the ones we don't. I know that a lot of people die from lifestyle diseases, and that produces a lot of pressure to do something but better to just admit we don't know than to confidently try to get people to change their behavior when we're just guessing.

Posted by ekr at 06:02 PM | Comments (28) | TrackBack

How we got the Space Shuttle

Space Daily has a pretty devastating article about Robert F. Thompson (the Space Shuttle program's manager between 1970 and 1981) testifying before the Columbia Accident Investigation Board. In particular, someone finally admitted in public that NASA always knew that their cost numbers were absurd:
This, however, is a good deal less shocking than his next statement, on the origins of that $118-per-payload-pound operating cost estimate:

"At the time that we were selling the program at the start of Phase B, the people in Washington got a company called Mathematica to come in and do an analysis of operating costs. Mathematica discovered that the more you flew, the cheaper it got per flight."

"Fabulous... So they added as many flights as they could. They got up to 40 or 50 flights a year. Hell, anyone reasonable knew you weren't going to fly 50 times a year.

"The most capability we EVER put in the program is when we built the facilities for the [External] Tank at Michoud -- we left growth capability to where you could get up to 24 flights a year by producing tanks, if you really wanted to get that high. We never thought you'd ever get above 10 or 12 flights a year.

"So when you say, 'Could you fly it for X million dollars?', some of the charts of the document I sent you today look ridiculous in today's world...Those costs per flight were not the cost of ownership... We didn't try to throw the cost of ownership into that. It would have made it look much bigger. So that's where those very low cost-per-flight numbers came from. They were never real."

This is no surprise to anyone who's been following the program, which has pretty much been a botch from start to finish. It certainly shouldn't be a surprise post-Challenger. After reading Feynman's report it should have been clear that NASA never felt any obligation to tell the truth, just whatever they thought was most likely to get them money.

If a reasonable launch schedule is to be maintained, engineering often cannot be done fast enough to keep up with the expectations of originally conservative certification criteria designed to guarantee a very safe vehicle. In these situations, subtly, and often with apparently logical arguments, the criteria are altered so that flights may still be certified in time. They therefore fly in a relatively unsafe condition, with a chance of failure of the order of a percent (it is difficult to be more accurate).

Official management, on the other hand, claims to believe the probability of failure is a thousand times less. One reason for this may be an attempt to assure the government of NASA perfection and success in order to ensure the supply of funds. The other may be that they sincerely believed it to be true, demonstrating an almost incredible lack of communication between themselves and their working engineers.

But of course these recommendations weren't adopted as the main report and nothing was ever done. Instead, we got the Space Station, which is in some sense even more of an achievement for NASA, being almost totally useless instead of just inferior to other existing technologies, as the Shuttle was.

It's pretty clear at this point that the Space Shuttle is a botch and we ought to give up. The only question is: what should replace it? The short term answer: cheap disposable rockets. The long term answer: a Space Elevator.

Posted by ekr at 07:13 AM | Comments (20) | TrackBack

Russians and everyone else

Eugene Volokh writes:
The Russians -- in this post, I include under that name all the other nations of the Russian Empire, such as the Ukrainians -- did tremendous evil in Eastern Europe (though probably less than they did to their own people, in part because the absolute worst of the Soviet regime was mostly over by 1945); and many of their devastating losses in World War II (the number that I'd heard was 20 million Russians killed) flowed from Stalin's crimes and folly, both before the war and during, as well as from Hitler's invasion. Nonetheless, had the Soviet Union surrendered, and spared themselves some of that blood, Hitler would likely have been able to retain Europe, and kill and enslave who knows how many more people.

This is of course true and generally forgotten by Americans, who like to take credit for winning WWII. Still, two comments in response to Eugene:

  • I'm not sure the Ukrainians would be the first people I would have included with the Russians in doing tremendous evil. The Ukrainians were generally treated pretty badly by Lenin and Stalin. Recommended reading on this topic: Robert Conquest's The Harvest of Sorrow.

  • The worst of the Soviet regime was over by 1945 mainly because Stalin died in 1953. If we're to engage in counter-factuals and imagine he died later, it's not entirely clear that being killed and enslaved by Hitler was much (if at all) worse than being killed and enslaved by Stalin. Had the Americans not gotten involved and the Russians somehow survived, it's quite possible that Stalin would have taken over all of Europe, with just as evil consequences. Recommended reading on this topic: Martin Amis's Koba the Dread.
Posted by ekr at 06:19 AM | Comments (19) | TrackBack

May 08, 2003

Crypto topic generator

Want to write a cryptography paper? Need an idea? Nagendra Modadugu pointed at the Crypto Research Topic Generator.

My problem, unfortunately, is too many ideas and not enough hands. If only they had the security grad student generator....

Posted by ekr at 04:45 PM | Comments (27) | TrackBack

PFIR's Tripoli Project

People For Internet Responsibility founders Lauren Weinstein and Peter G. Neumann have come up with their own plan for preventing spam, called Tripoli. Basically, they want to use cryptography to authenticate the senders of given messages. Senders would be certified by CAs who enforce no-spamming rules. This is a variant of the "requiring cryptographic authentication" approach that I mentioned earlier.

For reasons that aren't entirely clear, Weinstein and Neumann want to at the same time prevent ISPs from blocking mail transmission by users and have generic e-mail encryption. While these may be good things, it's not clear to me that they're much related to preventing spam. On the other hand, they pretty much mandate a completely new infrastructure, rather than enhancing existing protocols--and indeed that's what Weinstein and Neumann propose.

The problem here is the usual "network effect" problem. Especially at the beginning, rolling out the Tripoli services offers essentially no value to the user (since they have no one to talk to, and the service being offered is more or less the same as the one they have now) so they have no incentive to do it. Color me skeptical.

Posted by ekr at 03:52 PM | Comments (15) | TrackBack

How about doctors stick to medical opinions?

DB of DB's Med Rants points to a New York Times article on the topic of cost-effectiveness for health care:
Implanted under the skin of the upper chest, biventricular pacemakers have been shown to relieve heart-failure symptoms, like breathlessness, and to decrease the frequency of hospitalizations. A study reported last month suggests they may even prolong life.

But the devices cost about $20,000 each.

In the United States, more than five million patients have heart failure, and half a million new cases are diagnosed each year.

If even a small fraction of these patients received this implantable device, the costs could reach billions of dollars. Cardiologists are beginning to ask, Is this a sensible way to spend health care resources?

As you can imagine, I think this way of looking at things is completely confused. Who asked doctors to decide who lives and who dies?

Pay for service
As usual, it's easiest to see what the problem is by simplifying. So, let's pretend there's a world where there's no insurance. I walk into the doctor's office and he discovers that I have some kind of heart problem. That problem can be fixed by installing one of these pacemakers at a total cost of $30,000 (I added some padding for the cost of the surgery itself). I'm willing to pay up to $40,000 for the surgery. Now, there are two possibilities:

  1. The surgery happens. I pay $30,000. I'm $10,000 happier. Net benefit to me: $10,000. Net benefit to the hospital, doctors, etc: whatever profit they make.
  2. The doctor decides that this isn't a "sensible way to spend health care dollars" so I don't get the surgery. I keep my money. Net benefit to me, $0. Net benefit to the hospital: $0.
Clearly, having me have the surgery is in every respect superior to me not having it. In the jargon of economics, it's Pareto-dominant. Now, you may be asking at this point: "what about society's resources?" What about them? If I didn't spend the money on surgery, it would just be sitting in my bank account, not benefitting anyone else.

Health Insurance
Health insurance complicates the matter somewhat because it's the insurance company's money I'm spending, not my own, but the essential analysis is the same. Imagine for the moment that instead of buying one big health insurance policy I instead buy a series of policies, one for each condition. Accordingly, I've either bought the "biventricular pacemaker" policy or I haven't. If I haven't, I don't have any grounds for complaint if the insurance company doesn't want to pick up the tab, but if I have, I sure as hell want them to pay for it, not decide that it's not "cost-effective." Remember that at this point in the game they've got my money. Spending anything isn't cost effective for them. What sane person would buy insurance if insurance companies could repudiate contracts on these grounds?

Now, life is way too complicated to purchase insurance this way, and although insurance contracts are festooned with restrictions and details, plenty of stuff is of necessity left unspecified, which leads to the kinds of discretion we've been discussing. However, the basic point remains: I've paid for the service upfront. The doctor's job isn't to decide whether or not it's to society's benefit to provide it to me but (at most) whether it's the service that I contracted for. The fact that I was willing to pay for the insurance in the first place indicates that it was worth it to me--exactly as in the fee for service case. That's all the information we need to know that the procedure is efficient.

It's certainly arguable that there would be some societal benefit in having more flexible insurance contracts so that people could decide for themselves the cost-benefit ratio of each individual procedure and contract for it or not. On the other hand, it's also possible that the information and contracting costs would make this inefficient. However, even in this case it would be patients deciding, not doctors.

Special Cases
There are two special cases where this analysis doesn't apply. The first is when the taxpayer is picking up the tab. In that case, cost/benefit analysis is certainly appropriate since that money is supposedly being spent for the good of society.

The second special case is where there is some really scarce resource like donated organs. Since there's no market for organs (though of course some people, notably Richard Epstein, have argued that there should be), we really need to have rationing and it's arguable that doctors should be the people to do it.

Posted by ekr at 12:21 PM | Comments (72) | TrackBack

More paying to send e-mail

Declan McCullagh climbs onto the "charging to receive e-mail" anti-spam bandwagon. Unfortunately, he doesn't seem to know how to get past the inevitable deployment problems either. Without a coherent answer to that question, this proposal, intuitively attractive as it might be, is little better than proposing that Santa Claus give us all magic spam-filters for Christmas.
Posted by ekr at 10:10 AM | Comments (19) | TrackBack

May 07, 2003

Hyponatremia and the endurance athlete

DB correctly suggests that people should use fluid replacement drinks such as Gatorade instead of water in long races. The salt in Gatorade reduces the risk of hyponatremia. However, it doesn't prevent it entirely because there's not enough salt in the Gatorade.

Gatorade has about 110 mg of sodium/cup = 458 mg/l. However, sweat contains 2.25-3.4 grams of salt (.9-1.28 grams of sodium) per liter and in a hot race you can sweat 1 liter/hr. In order to replenish this amount you'd need to drink roughly 2 to 3 liters of Gatorade per hour. This is pretty impractical since most people can't handle much more than a liter an hour when racing. As a consequence, during long races (3+ hours) many people attempt to supplement sodium either by eating salty foods or using salt tablets. This isn't a serious problem for half marathon and under but is important for marathoners and really important for triathletes.
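
To make the arithmetic explicit, here is a small Python sketch using the figures quoted above. The function name is made up for illustration, and the 240 ml cup size is an assumption chosen because it reproduces the 458 mg/l figure.

```python
# Sodium figures are the ones quoted in the post.
# The 240 ml cup is an assumption: it is the size that yields ~458 mg/l.
CUP_LITERS = 0.240
GATORADE_SODIUM_MG_PER_CUP = 110

def gatorade_liters_per_hour(sweat_sodium_g_per_liter, sweat_liters_per_hour=1.0):
    """Liters of Gatorade per hour needed to replace the sodium lost in sweat."""
    gatorade_mg_per_liter = GATORADE_SODIUM_MG_PER_CUP / CUP_LITERS
    sodium_lost_mg = sweat_sodium_g_per_liter * 1000 * sweat_liters_per_hour
    return sodium_lost_mg / gatorade_mg_per_liter

low = gatorade_liters_per_hour(0.9)    # low end of the sweat-sodium range
high = gatorade_liters_per_hour(1.28)  # high end of the range
print(f"{low:.1f} to {high:.1f} liters of Gatorade per hour")
```

With these inputs the answer comes out at roughly 2 to 2.8 liters per hour, well above the liter or so most people can drink while racing.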

For more on hyponatremia, check out Dr. Mark Jenkins's SportsMed web page, from which most of the information in this post is drawn.

Posted by ekr at 02:14 PM | Comments (10) | TrackBack

More on comps

In a comment on my Bill Bennett item, Kevin points out the value of comps:
One quick thought. Some gamblers include their "comps" (room, cars, golf, meals, drinks) when they talk about "breaking even". It is definitely possible to lose money on gambling at blackjack but come out ahead with comps if you know how to play blackjack better than average.

This is a good point. Since the casino knows almost exactly how much you're losing, they are in a position to tightly calibrate how much they comp you. Naturally, they have pretty complicated algorithms. Terence says:

Patron activity is tracked into a computer, noting hours of play, and play style (optimal play at blackjack, taking sucker bets at craps, etc.) to estimate the negative EV. Actual losses are also noted for bigger gamblers.

This goes into a very non-linear decision about how much to kick back. For small time guys, it's like 10-25%. For big gamblers on bad losing streaks, it can go over 100% for short periods of time, because it's crucial to get those guys back. (large gamblers typically have agents that deal with casinos, and may negotiate things up to and including cash givebacks)

Booking 8MM dollar losses is going to get you some pretty royal treatment. Including probably a private room for gambling.

Losing $8 million (even if it's just the total loss, not the net) is enough to make Bennett what's called a "whale" which means he would be able to negotiate comps on a case-by-case basis. But remember, the casino's objective is always to take your money, so over the long term he's not going to get comped more than he is giving the casino.

Still, if you believe Bennett values the comps (i.e. he'd be willing to pay something like what the casino is "charging" him for them) then it's possible that he's not really hemorrhaging money on this--he's just wasting a lot of money to stay in fancy hotel rooms and experience the other charms of Vegas.

Posted by ekr at 10:57 AM | Comments (30) | TrackBack

What's the deal with audio formats? Part 2

When we last talked about digital music, we learned how to record it and play it back. The resulting files were just big lists of times and numbers as shown below. However, they're also prohibitively large. In this part, we'll talk about how to make them smaller.

Table 1: Some recorded audio
Time Intensity
0.0 1.00
.01 1.21
.02 1.43
.03 -.413
... ...

Now, the first thing we can do is really simple. We don't need all these stinking time values. If we're sampling at a constant rate (say 44.1 kHz for CDs) then we know that consecutive samples are 1/44100 (.0000227) seconds apart. So, what we write down can just be a list of intensities and then the player can reproduce the time offsets by adding .0000227 seconds each time. Congratulations, we've just reduced the size of the data by about 50%.
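
Here's a minimal Python sketch of what the player does, assuming the 44.1 kHz CD rate. The function and variable names are invented for illustration, and the intensities are just the sample values from Table 1.

```python
SAMPLE_RATE_HZ = 44100  # CD sampling rate

def sample_times(num_samples, sample_rate_hz=SAMPLE_RATE_HZ):
    """Rebuild the time offset of each sample from its position in the list."""
    return [i / sample_rate_hz for i in range(num_samples)]

intensities = [1.00, 1.21, 1.43, -0.413]  # only these need to be stored
times = sample_times(len(intensities))    # the player recomputes these
for t, x in zip(times, intensities):
    print(f"{t:.7f} {x}")
```

Computing each time as index/rate, rather than repeatedly adding a rounded step, avoids accumulating rounding error over a long recording.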

To do any more we need to understand a little bit about how the data is actually structured. We've been treating the samples as simple decimal numbers but actually they're coded as numbers from 0 to 65535. That doesn't really limit us. It's just a unit change. It's easy to map these recorded values onto true intensities and back by a little multiplication. [0] To go from intensities to recorded values we do:
Recorded_Value=(Intensity*10923)+32767

And to go back we do:
Intensity = (Recorded_Value-32767)*.0000915

This gives us the following table:

Table 2: Conversion of intensities to integers
Time Intensity Recorded Value
0.0 1.00 43690
.01 1.21 45983
.02 1.43 48386
.03 -.413 28256
... ... ...
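Here's a small Python sketch of the two conversions, using the scaling numbers from the formulas above. One assumption on my part: the formulas don't say what happens to the fractional part, so this sketch drops it (truncation toward zero), which happens to reproduce the values in Table 2.

```python
SCALE = 10923    # recorded units per unit of intensity
OFFSET = 32767   # recorded value that stands for intensity 0

def to_recorded(intensity):
    """Map an intensity onto the 0-65535 scale (fraction dropped)."""
    return int(intensity * SCALE) + OFFSET

def to_intensity(recorded):
    """Map a recorded value back to an approximate intensity."""
    return (recorded - OFFSET) * 0.0000915

for x in (1.00, 1.21, 1.43, -0.413):
    r = to_recorded(x)
    print(x, r, round(to_intensity(r), 3))
```

The round trip isn't exact: going to integers throws away a little precision, which is why the recovered intensities are only approximately the originals.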

Of course, what gets recorded on disk is just the sequence 43690, 45983, 48386, 28256, .... Actually, that's not quite true either. What gets recorded is the binary representations of these numbers. So, each sample takes up 2 bytes and each two byte value is called a code word. (Actually, CDs are recorded in stereo, so each sample takes up 4 bytes, 2 for each channel. But for the moment we can ignore this fact.) So, why the number 65535? Because that's the largest number we can stuff in two bytes! Anything bigger and we'd need more bits. (If you don't know what bits and bytes are, click here). Two bytes gives us the minimum number of intensity levels to get acceptable recording quality so that's what CDs use.

Compression
So, say we want to decrease the size of the data some more. At this point we need to do some data compression. Data compression makes use of the fact that not all sample values are equally likely. For example, if you look at the sequence above, you can see that the values are all hovering in the region 20000-50000. Why? Because most music isn't that loud. So, in order to accommodate really loud music we need intensity levels that go up to 65000 (this is called headroom), but we don't use these values very often. We can take advantage of this fact to compress the data.

If we ordinarily only need to handle numbers in the range 20000-50000, why bother wasting space on larger numbers? Instead of using 16-bit encoding we can use 15-bit encoding. This lets us encode 32768 values, which is enough to get the important central range (the transformation is the same as before just with different scaling numbers) and we've just shrunk the size of the data by about 6%. Not bad for 30 seconds' work.

Of course, what happens if we have a quiet piece of music with one loud bit? No problem! We just have a special 15-bit value that means: the next sample is a 16-bit sample. Let's sacrifice the highest value, 32767, for this purpose (this is called an escape character). So, imagine we need to encode the 16-bit data stream 43690, 45983, 48386, 28256, 60211... We do that as: 27306, 29599, 32002, 11872, 32767, 60211. Note that the final 60211 takes up 16 bits of storage, not 15 bits.

Now, it's important to note that this kind of compression isn't always worth doing. In fact, in the case of the little stanza we just showed it's not efficient. We consumed 91 bits to store five 16-bit samples. We would have been better off just encoding everything in 16-bit mode (consuming 80 bits). However, in situations where almost all of the music is quiet this kind of transformation can pay off handsomely.
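For the curious, here's a rough Python sketch of the escape scheme. The offset of 16384 is my inference from the example numbers (43690 becomes 27306); the text only says the scaling is "different". The names `encode`, `decode`, `ESCAPE` and `SHIFT` are made up for illustration.

```python
ESCAPE = 32767   # reserved 15-bit value: "next word is a raw 16-bit sample"
SHIFT = 16384    # assumed offset that recenters the common range

def encode(samples):
    """Return a list of (value, bit_width) code words."""
    words = []
    for s in samples:
        shifted = s - SHIFT
        if 0 <= shifted < ESCAPE:
            words.append((shifted, 15))
        else:
            words.append((ESCAPE, 15))  # escape marker
            words.append((s, 16))       # followed by the untouched sample
    return words

def decode(words):
    """Invert encode()."""
    samples = []
    it = iter(words)
    for value, _bits in it:
        if value == ESCAPE:
            raw, _ = next(it)           # the 16-bit sample after the escape
            samples.append(raw)
        else:
            samples.append(value + SHIFT)
    return samples

stream = [43690, 45983, 48386, 28256, 60211]
coded = encode(stream)
print([v for v, _ in coded])
print(sum(bits for _, bits in coded), "bits versus", 16 * len(stream))
```

Every escaped sample costs 31 bits instead of 16, so the scheme only wins when fewer than about one sample in sixteen needs escaping.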

We can imagine a number of similar types of compression, all taking advantage of various observations we make about music. However, each little tweak we do makes the system more complicated and requires making assumptions about what's being recorded. If those assumptions go wrong, the compression can go very bad very quickly. In fact, as we saw above, it can take up more space than the uncompressed data! Conveniently, there's a better way. It turns out to be possible to design a generic compression system that performs better than any of these ad hoc systems. We'll talk about how to do that next.

The World's Simplest Compression Code
Our previous ad hoc compression scheme took advantage of the fact that some code-words are more common than others. However, we can exploit this general property in a more systematic way.

This will be a lot easier to see if we start with a simpler case. So, let's start with the simplest case that allows any compression at all: 4 values. Now, as before we could talk about the integers 0-3, but it's less confusing if we assume we want to represent the letters A, B, C, D, since then code words like 100 don't get confused with the values they represent. If you like, mentally map A to 0 and so forth. Here are the code words we'll be using:

Table 3: The world's second-simplest code
Value Code word
A 00
B 01
C 10
D 11

Now, as I mentioned previously, not all the code words are equally common. So, let's assume the following frequency table:

Table 4: Code-word frequencies
Value Code word Frequency (fraction)
A 00 .10
B 01 .40
C 10 .20
D 11 .30

So, how do we use this information to design our compression scheme? It turns out that there's a simple method for constructing compression schemes from frequency tables. It was invented by D. A. Huffman in 1952 and is called a Huffman Code. The key insight here is that we can have a variable-length code word. So far, all of our code words have been the same length, e.g. 15 bits, 16 bits, or in the above table, 2 bits. In the one case where we wanted to use two different lengths we had to use an escape character to indicate that we were changing word length. But as long as we're careful designing our code we can have each code word be a different length.

Obviously, if some code words are going to be shorter than others, we want to assign the shortest code words to the values we want to represent most frequently. In the table above, B is the most common value and so it should have the shortest code word. Since there's only one positive integer smaller than 2, that means that B has to have a 1-bit code word. It doesn't matter whether we choose 0 or 1, but it's traditional to choose 0. If we just make this replacement, we get the following code.

Table 5: A broken code
Value Code word Frequency
A 00 .10
B 0 .40
C 10 .20
D 11 .30

There's just one problem, which is that when we write them down or transmit them there are no breaks between code words. This causes a problem with the previous code. [1] Consider the string of bits 000. Does this represent the characters BBB, AB or BA? All are plausible interpretations. This is obviously no good so we need a code that doesn't have this problem.

In order to avoid this kind of confusion, we're going to adopt the following rule: we can't have two code words such that code word i can be formed by adding bits onto the end of code word j. The technical term for this is that the code is prefix-free or instantaneous. If we follow this rule then it is always possible to tell when a code word is finished as soon as you've read it, which is obviously a very nice property. There are unambiguous (but non-instantaneous) codes which require you to read the entire message before you know what the code words were, but they're much less useful. Designing one is left as an exercise for the reader.
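Both the problem and the rule are easy to check mechanically. Here's a small Python sketch (the helper names `is_prefix_free` and `parses` are mine, not standard functions) that lists every way to read a bit string under a given code and tests whether a code obeys the prefix rule.

```python
def is_prefix_free(code):
    """True if no code word is a prefix of a different symbol's code word."""
    words = list(code.values())
    return not any(
        i != j and words[j].startswith(words[i])
        for i in range(len(words))
        for j in range(len(words))
    )

def parses(bits, code):
    """Return every way to split bits into a sequence of symbols."""
    if not bits:
        return [""]
    results = []
    for symbol, word in code.items():
        if bits.startswith(word):
            rest = parses(bits[len(word):], code)
            results += [symbol + tail for tail in rest]
    return results

broken = {"A": "00", "B": "0", "C": "10", "D": "11"}    # Table 5
fixed = {"A": "00", "B": "01", "C": "10", "D": "11"}    # Table 3

print(is_prefix_free(broken), sorted(parses("000", broken)))
print(is_prefix_free(fixed), parses("0001", fixed))
```

The brute-force `parses` is only for illustration; with a prefix-free code a real decoder never needs to backtrack, since it can emit a symbol the moment a code word matches.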

So, in this case, code-word 0 (representing B) is a prefix of code-word 00 (representing A). In order to do something about it, we could give A code-word 110 or 100. However, in either case one of the existing code words would be a prefix of our new code word. So, we need to move one of those as well. But which one? Remember that we want the most common code-words to be the shortest. Thus, it makes more sense for C to have a long code word than it does for D. If we move C we get the new code:

Table 6: A sort-of Huffman code
Value Code word Frequency
A 101 .10
B 0 .40
C 100 .20
D 11 .30

Again, note that we could swap the code words for A and C. In fact, an actual Huffman code would have a different arrangement of code-words. However, this arrangement is just as good.

So, have we improved the situation? Say we want to encode 100 characters and they meet the frequency characteristics listed above. In the original code we need two bits per character so it consumes 200 bits. In the new code we will have 10 As (30 bits), 40 Bs (40 bits), 20 Cs (60 bits), and 30 Ds (60 bits) for a total of 190 bits. So, we have indeed compressed the data.
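If you want to check the arithmetic, here's a tiny Python sketch that totals the bits for a 100-character message with the frequencies from Table 4. The function name `total_bits` is just for illustration.

```python
counts = {"A": 10, "B": 40, "C": 20, "D": 30}            # out of 100 characters

fixed = {"A": "00", "B": "01", "C": "10", "D": "11"}     # Table 3
variable = {"A": "101", "B": "0", "C": "100", "D": "11"} # Table 6

def total_bits(counts, code):
    """Bits needed to encode a message with the given symbol counts."""
    return sum(n * len(code[symbol]) for symbol, n in counts.items())

print(total_bits(counts, fixed))     # 200
print(total_bits(counts, variable))  # 190
```

Dividing by the 100 characters gives 2.0 versus 1.9 bits per character, a 5% saving.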

Before we proceed, let's take a moment to review what's happened. We started with a code with all the code-words being equal size and then did two things:

  1. The most frequent code word got shortened.
  2. The two least frequent code words got lengthened.
Huffman coding is simply a systematized version of this process extended to more bits.

Huffman Codes
Huffman codes are built the opposite way from the procedure we just followed. Instead of identifying the most frequent word we start by identifying the two least frequent, in this case A and C. As we saw in the previous example, whatever coding scheme we come up with is going to have equal-length code-words for A and C. In fact, they're going to share a common prefix x, which of course can't be assigned to any symbol. We don't know what x is but we don't care right now. All we need to know is how to distinguish A and C once we've read x. No problem: assign A bit 0 and C bit 1.

Table 7: A partial code
Value Bit Frequency
A x0 .10
C x1 .20

But what is the value of x? Ok, now here comes the really clever bit: forget about the fact that A and C are two separate symbols. Instead, think of them as a new hybrid symbol A|C, which is represented either x0 or x1. We can now ignore the trailing bit and just worry about x.

At this point we have only 3 symbols remaining: B, D and A|C and we can do the same trick. The two least common symbols are D and A|C. Again, they're going to share a common prefix, y. We give D bit 0 and A|C bit 1. This gives us Table 8. Note that we've been able to do away with prefix x because we now know that it's y1. And, of course, we have a new hybrid symbol A|C|D represented by y.

Table 8: A less partial code
Value Bit Frequency
A y10 .10
C y11 .20
D y0 .30

We're now left with two symbols, B and A|C|D. We assign bit 0 to B and bit 1 to A|C|D, thus making y = 1. This gives our final code:

Table 9: A Huffman Code
Value Bit Frequency
A 110 .10
B 0 .40
C 111 .20
D 10 .30

Nowhere in this process have we made use of the fact that there were only 4 code words to start out with. In fact, we've followed the same procedure at every stage, just focusing on two code-words at a time. Using the procedure we've just followed we can build a Huffman code for an alphabet of any size.
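The whole procedure fits in a few lines of Python. This is a minimal sketch rather than a production encoder: it uses the standard `heapq` module to repeatedly pull out the two least frequent (possibly hybrid) symbols, and the function name `huffman_code` and the tie-breaking counter are my own choices. I've used whole-number counts out of 100 instead of fractions to sidestep floating-point surprises.

```python
import heapq
from itertools import count

def huffman_code(freqs):
    """Build a Huffman code from a {symbol: frequency} table."""
    order = count()  # tie-breaker: earlier-created entries come out first
    # Each heap entry: (frequency, tie-breaker, {symbol: code bits so far})
    heap = [(f, next(order), {sym: ""}) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f0, _, group0 = heapq.heappop(heap)  # least frequent gets bit 0
        f1, _, group1 = heapq.heappop(heap)  # next least frequent gets bit 1
        merged = {s: "0" + bits for s, bits in group0.items()}
        merged.update({s: "1" + bits for s, bits in group1.items()})
        heapq.heappush(heap, (f0 + f1, next(order), merged))
    return heap[0][2]

code = huffman_code({"A": 10, "B": 40, "C": 20, "D": 30})
for symbol in sorted(code):
    print(symbol, code[symbol])
```

With this particular tie-breaking rule the result matches Table 9 (A=110, B=0, C=111, D=10). A different rule for ties or for which side gets bit 0 would give different code words of the same lengths, which compress equally well. Note that an alphabet with a single symbol gets an empty code word here, so a real encoder would need to handle that case specially.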

Other Compression Schemes
Huffman coding isn't necessarily very fast and has the serious disadvantage that for maximal efficiency you need to actually read the entire data you are trying to compress (in order to get the frequencies). This is pretty inconvenient if you have really big files. There are a number of other compression schemes that perform better under real-world conditions. The most common real-world compression algorithms are based on Lempel-Ziv and include Lempel-Ziv-Welch (used in the UNIX compress program) and LZ77 (used in GZIP).

How low can you go?
The performance of compression techniques varies a lot depending on the data you're compressing. The less random the data, the better compression ratio you can get. In general, general purpose compressors on real-world data get you a size reduction of between 50% and 75%, so you can shrink the data by a factor of 2-4.
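You can see the effect of randomness for yourself with Python's standard `zlib` module, which implements the same DEFLATE format that gzip uses. This is just a quick experiment with made-up test data, not a benchmark; the helper name `ratio` is mine.

```python
import random
import zlib

rng = random.Random(0)  # fixed seed so the experiment is repeatable

repetitive = b"la la la " * 1000
noise = bytes(rng.randrange(256) for _ in range(len(repetitive)))

def ratio(data):
    """Compressed size as a fraction of the original size."""
    return len(zlib.compress(data)) / len(data)

print(f"repetitive: {ratio(repetitive):.3f}")
print(f"noise:      {ratio(noise):.3f}")
```

The repetitive input shrinks to a tiny fraction of its size, while the random bytes don't shrink at all (they actually grow slightly because of the format's overhead). Real music sits somewhere in between.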

However, in many cases, this isn't enough. A CD is about 650 MB, so each song is about 50 MB. Even a 12 MB file is a pretty big file to download. So, this sort of technique is almost never used for real audio compression. It's possible to do a lot better if you make some compromises. Throughout this entire discussion we've been implicitly assuming that we wanted to compress the data without damaging it--so that we could reproduce the original data at will. If you're willing to compromise on that a little bit and have the data not be completely accurate, you can get much better compression ratios. I'll be discussing that topic in the next part.

Further reading:
Check out the comp.compression FAQ for more than you ever wanted to know about compression.

[0] The cognoscenti will note that I'm cheating a bit here. In some applications it pays to measure intensities on a log scale instead of a linear scale. However, we can assume that that transformation happens somewhere along the line and it's irrelevant to the basic point.
[1] Incidentally, the space between words is a relatively late introduction in human writing. Really old languages don't have it whichmakesthemmuchhardertoread.

Update: 13:51
Added a link to Part 1.

Update 20030508 12:29:
Fixed the encoding for "B" in Table 6. It should be "0", not "01". Thanks to Nagendra Modadugu for catching this.

Posted by ekr at 10:27 AM | Comments (10) | TrackBack

I'm breaking even

My friend Terence, a pretty serious poker player, points out that you hear gamblers say "I pretty much break even" quite a lot. It generally means "I'm losing my shirt".
Posted by ekr at 10:00 AM | Comments (13) | TrackBack

Bill Bennett's statistics

Time to join the free-for-all over whether or not Bill Bennett lost money. Kieran Healy is siding with Brad DeLong against Eugene Volokh.

Eugene's argument:

Well, let's take a close look at the argument. First, the argument simply assumes that Bennett's expected loss was $8 million. Where does that assumption come from? The alleged casino accounts are that Bennett's estimated loss was $8 million, but I highly doubt that the casinos are talking about his expected loss. They don't know how many times he pulled the lever; they're presumably operating based on their accounting of his purchases, debts, and cash-outs, not based on statistical estimates. Nothing in the story hints that the casino sources deduced this information using expected values. So if the loss estimated from various sources was $8 million, then the chances of his actually having lost any less than $6 million have nothing to do with statistics -- they have to do with how reliable the sources are. If the estimates are correct, then obviously he lost more than $6 million. If they're mistaken, then he may have lost a good deal less.

I think Kieran's assessment is about accurate:

But Bennett didn't just say "That estimate of my losses is incorrect" he said he had come out pretty close to even over the last 10 years. I find this claim absurd for the same reasons as Brad. No-one disputes that Bennett was a high-rolling, high-stakes slot-machine player for years. We know how slot machines work. They are set up to make a fixed profit over time. There is no uncertainty about their profitability. The house cannot lose in the long-run. It defies belief that a single player routinely playing the house's slots for 10 years could break even. I can't see a way around this, unless Bennett wants to say that by "breaking about even" he means "losing no more than the 2%-8% set by the machines," and that seems a little disingenuous for such a virtuous guy.

Merely from Bennett's alleged losses we can't figure out precisely what his net profit (loss) was, because we don't know what strategy he followed. If, for instance, he followed the strategy of "keep playing until you lose a lot of money" then his total loss could in fact be $8 million. That's pretty much the upper bound.

We can also compute an expected lower bound. Assume that the casino has a 2% edge per play. So, the best case (if Bennett ran his money through the machine just once) is that he wins about 49% of the time and loses about 51% of the time, so his wins are about 96% of his losses. In that case, we expect that Bennett has won about $7.7 million over the same time period for a net loss of $300,000. Unless you're Bill Gates, I don't see how you can say with a straight face that you've broken about even when you're $300k in the hole.
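Here's one way to reproduce that estimate in Python. It rests on a simplifying assumption that real slot machines don't satisfy: every bet is even-money, so a 2% house edge means winning 49% of the time and losing 51% of the time. The function name `expected_net_loss` is made up for illustration.

```python
def expected_net_loss(total_losses, house_edge):
    """Expected winnings and net loss, assuming even-money bets."""
    p_win = (1 - house_edge) / 2    # 0.49 for a 2% edge
    p_lose = (1 + house_edge) / 2   # 0.51 for a 2% edge
    expected_wins = total_losses * p_win / p_lose
    return expected_wins, total_losses - expected_wins

wins, net = expected_net_loss(8_000_000, 0.02)
print(f"expected wins: ${wins:,.0f}")
print(f"net loss:      ${net:,.0f}")
```

Under those assumptions the wins come out a bit under $7.7 million and the net loss a bit over $300,000, in line with the figures above.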

Incidentally, Eugene's most likely wrong when he says that the casino doesn't know how many times Bennett pulled the lever. Most casinos provide affinity cards to their preferred gamblers so they can keep track of them. When you walk up to a machine you stick in your card and it knows who you are and records your transactions. The casinos use this information as the basis for comping their good customers (free hotel rooms, tickets to shows, etc.) If you're burning through a substantial fraction of a million bucks, the casino probably has a pretty good idea of what you're doing.

Update 10:36
Terence points out that since the casinos likely have very precise data about Bennett's spending habits it's a little surprising that the data that was leaked was so vague.

Posted by ekr at 09:53 AM | Comments (12) | TrackBack

May 06, 2003

Yeah, wrestling!

Just got back from wrestling class. I've been taking Brazilian Jiu-Jitsu for about a year now but really wanted to take something a bit different. My friend Kevin has been wrestling with Tim Lajcik (web site in process, not very exciting yet) and finally convinced him to offer wrestling classes. The first one was tonight. As expected, it was great.

Tim is a former AAU National Wrestling Champion, a Golden Gloves Boxing Champion and has fought in the Ultimate Fighting Championship. He teaches something called "Fighter's Wrestling", which is the application of wrestling techniques to real world fighting situations. To quote from his brochure:

In a fight, there are three essential phases: striking, stand-up grappling, and groundfighting. Boxing and kickboxing are great striking arts while Brazilian Jiu-Jitsu and Sambo are outstanding groundfighting arts. FW provides the pivotal component linking stand-up striking and groundfighting, allowing you to neutralize strikes through clinches and takedowns or shut down a groundfighter by preventing the takedown.

Now, I have no interest in or intention of getting into a real fight, but if I'm going to take a martial art I want to learn something that's practical, and Fighter's Wrestling is eminently practical.

More importantly, it's a lot of fun. Despite being huge and scary-looking, Tim is actually a great teacher and all-around nice guy. Moreover, his techniques are really smooth and he's got a system for teaching them that leads you through a natural progression so you're never lost or confused. I've taken classes from a bunch of different instructors and I can tell you that that's very unusual. If you live in the Bay Area and you're looking to take a martial art, FW is highly recommended.

Posted by ekr at 11:11 PM | Comments (10) | TrackBack

More good news for mice with cancer

Kevin Dick pointed me to the following article on CNN. Scientists at the University of Texas took mice that they had injected with human tumor cells, treated them with a modified version of the adenovirus (which causes colds), and effected what looks like cures in 60% of the mice. This particular cancer, glioma, is a particularly nasty form of brain tumor which kills about 9,000 people a year and has a 1-year survival rate of just about zero.

I don't have access to the journal this was published in, so I'm just working from the CNN article here. Apparently they engineered the adenovirus to make it less virulent with healthy cells and improved its ability to enter tumor cells. People have been working along these lines for a while now but this is the first time I've heard of it really working well in a whole organism. It's so promising, in fact, that they're rushing it to human trials.

Posted by ekr at 01:37 PM | Comments (42) | TrackBack

What was all the fuss about Kyoto again?

Ever since the Bush administration withdrew from Kyoto there's been no end of complaining from Europeans, American lefties and environmentalists about how the US didn't care about the environment, international treaties, etc. Turns out that the EU isn't going to meet its Kyoto targets anyway.

This really isn't any surprise. If you've been paying attention, you will have seen this was coming for a year or two now. Still, projections are one thing. Actually missing your targets is a more concrete kind of failure. I suppose one could argue that the Euros would have met their targets if the US had stayed in, but that's starting to look pretty weak both from a practical and moral standpoint.

Posted by ekr at 08:47 AM | Comments (36) | TrackBack

More on B.C.E. etc.

Following some comments by Colby Cosh, I've rethunk my position a bit, and, as the politicians say, would like to revise and extend my remarks.

I had originally offered two defenses of the B.C.E/C.E convention:

  1. Politeness to non-Christians.
  2. A reminder to academics that Christianity was just something else to study.

After some thought, I've concluded that the first rationale is bogus. I doubt that most non-Christians care very much or even think about it, and it certainly does irritate some Christians--and apparently Colby as well. So, it's hard to call it polite.

So, let's try again. The primary rationale to my mind for this convention is to remind people interested in religious or cultural history that Christianity shouldn't be considered anything special in that context. There's a long tradition of this kind of jargonification, and I think it's useful. I wouldn't limit the use to Religious Studies but also think it makes sense in archaeology, history, etc. where you're trying to give the students a sense of the culture of the time and it's therefore helpful to remind them periodically that Christianity didn't occupy the special status it does now. As I said earlier, it's a hard notion to shake. However, it's probably not very useful if we're talking about the Jurassic or writing a history of mathematics or whatever.

As Cosh points out, this probably accounts for only a small fraction of widespread usage. I think the rest of that usage can be accounted for by two factors:

  1. Using academic jargon makes you look cool.
  2. It annoys others (Christians, your elders, etc.)
While these are certainly normal motives for undergraduates, once one gets into the professional world it's probably time to grow up a bit.
Posted by ekr at 06:06 AM | Comments (16) | TrackBack

May 05, 2003

An embargo? That's your big plan?!??

As anyone who's paying the slightest attention to the news knows, we've pretty much given up on stopping North Korea from having nukes, and instead we're going to uh... stop them from selling nukes to other people or something.

I've been trying to think of something clever to write about this but so far all I've got to say is "Uh... This sucks.".

Posted by ekr at 10:17 PM | Comments (72) | TrackBack

Authorized dealers and warranty insurance

Manufacturers of high-end audio and video equipment often have a network of "authorized dealers". Consumers are heavily discouraged from buying from anyone else. Here's a typical warning from Denon:
WARNING TO VALUED CONSUMERS:
The warranty on DENON Electronics products is NOT VALID if the products have been purchased from an unauthorized dealer/on-line E-tailer or if the original factory serial number has been removed, defaced or replaced in any way. Recently DENON Electronics has become aware of numerous instances in which such serial number tampering has occurred. Unauthorized dealers/on-line E-tailers and/or their suppliers frequently alter the serial numbers in an effort to prevent manufacturers from tracing the supplier source. DENON Electronics sell products through authorized retail and on-line channels to insure that consumers obtain quality pre-sale and after-sale support and service. PROTECT YOUR WARRANTY. Buy from an authorized DENON Electronics dealer/E-tailer. Check the unit and its packaging to determine whether the factory serial number may have been altered. If in doubt, call DENON Electronics at (973)396-0810.

It's reasonable to ask what's going on here. These are consumer products, typically in sealed boxes. The quality of the product isn't materially affected by whether it went through an authorized channel or not. So, why does Denon care?

It's actually pretty simple: the manufacturers think that their products are more attractive when customers get a high quality of service and support. However, in an open market, some dealers will naturally compete by offering inferior support and a lower price. Since the products are essentially a commodity, this tends to push down the price of the commodity overall, leading to a generally lower level of support. This effect is particularly strong if an expensive dealer and a cheap dealer are physically close, since the customers can go to the expensive dealer to shop but then buy from the cheap dealer. The existence of Internet sales makes this even easier, of course. The cost savings to the customer can be substantial. For instance, compare the Denon DVD1600, $449 at Tweeter and $369 at Etronics.

The manufacturers have responded by requiring their authorized dealers to sell at a minimum price level. If the dealers can't compete on price, they now have to compete on quality, and because the product is a commodity this generally means offering better support and service. Of course, none of this works if customers can just buy from unauthorized dealers instead, so the manufacturers have to stop that somehow. The easiest way would simply be to sell only to authorized dealers. Unfortunately, when you're selling a lot of units controlling your channel becomes difficult and so unauthorized dealers will sometimes arrange to buy from wholesalers independently. The manufacturers have instead adopted the strategy of refusing to honor warranties for goods purchased from unauthorized retailers.

It should be noted at this point that it's not clear that the manufacturers are trying that hard to block unauthorized sales. After all, they reflect a useful form of price discrimination. Customers who are interested in getting the best possible price and have proved they are willing to forgo extras like the warranty can get a cheaper price without there being too much cannibalization of the main market.

However, it turns out that there's an interesting counter-strategy. Through the same unauthorized retailer you buy your equipment from you can also buy a third party warranty (essentially insurance) which covers repair or replacement in case of equipment failure. The cost of 2 years of insurance (equivalent to the manufacturer's warranty) on the aforementioned unit is $30, so you're still $50 ahead of the game. This sort of insurance makes the price discrimination tactic work much less well since the customer can save money without forgoing the warranty.

Further reading:
While looking up the legal status of minimum pricing, I came upon the following paper by Carlton and Chevalier, which actually has some price measurements. Good reading.

Posted by ekr at 03:19 PM | Comments (11) | TrackBack

Spam-control ideas people have already thought of

As a public service, I will now name, in no particular order, all the ideas (that I know of) people have already had for controlling spam.
  • Charging for the receipt of e-mail.
  • Requiring the sender to perform some expensive computation to send e-mail.
  • Challenge-response: The first time I get mail from you I make you do some liveness test.
  • Whitelists (of people who I want to get mail from).
  • Blacklists (of known spammers).
  • Requiring cryptographic authentication (typically used as a whitelist).
  • Explicit filtering based on message characteristics (e.g. SpamAssassin).
  • Collaborative filtering based on global databases of messages.
  • Adaptive learning-based filters (a.k.a. Bayesian filtering).
  • Requiring distinctive headers in the message that can then be filtered on.
  • Opt-out lists.
  • Opt-in lists.
  • Requiring correct headers in the message (typically combined with opt-in or opt-out lists).
  • Banning spam.
  • Suing spammers.
  • Denial of service attacks on spammers.
  • Taxing e-mail.
  • Whitelists of servers that refuse to host or relay spam.

I've probably missed a few here, and if readers want to suggest any, I'll add them to the list.

Anyway, if you happen to think of one of these ideas, before announcing your brainstorm to the world, take a minute to do a little research and see if:

  1. Any analysis has been done on its pros and cons.
  2. Anyone has tried it and what the results were.
Once you've done those two things you'll be in a position to tell the world about your new brilliant idea without looking like someone who hasn't done their homework.

Update 20030508 22:23
Added whitelists of servers which refuse to serve spammers. Suggested by Dan Simon.

Posted by ekr at 07:15 AM | Comments (14) | TrackBack

An even worse idea--a tax on Internet mail

Charging for e-mail in order to prevent SPAM is the idea that will not die. Every few months someone comes up with this same brainstorm, apparently oblivious to the fact that the reason such an obvious idea has never been done before is that there are a few minor technical obstacles. Therefore I've got to hand it to Christopher Caldwell, who has come up with a truly new--and even dumber--version of the same idea. Instead of charging for Internet mail, we'll tax it.

What's really brilliant about this scheme is that it's in every way worse than the simple charging for mail idea:

  • It requires replacing pretty much every mail server in the country and probably a fair number of the clients more or less simultaneously. By contrast, charging for e-mail receipt can be rolled out slowly.
  • What do we do about foreign spam? A lot of spam comes from Asia and Africa. If those guys don't tax e-mail, then that spam won't be deterred at all. And don't tell me it's going to be taxed when it enters the US, since that means that the poor guy who gets it has to pay for it! (And if he wants to transfer the cost we're back to charging for e-mail).
  • The people who are spamming are often criminals anyway, since the spam is fraudulent. So, what's to make them pay the tax?

You know, it's not everyone who can come up with an idea this stupid. It takes talent, hard work, and complete ignorance of how things actually work.

Posted by ekr at 06:50 AM | Comments (18) | TrackBack

May 04, 2003

A.D., C.E., B.C., and B.C.E.

Colby Cosh complains about the use of B.C.E. and C.E. (Before the Common Era and Common Era) instead of B.C. and A.D. (Before Christ and Anno Domini, respectively).
Political correctness is well and good, but "B.C.E." has a number of modest problems: there is no universal agreement on what it even stands for, it impedes legibility by being far too similar to its opposite "C.E.", and it seems as though it does a pretty poor job of placating non-Christians, since year one of the "common era" is still, inconveniently enough, the year of Jesus Christ's birth.

Unfortunately for Mr. Cosh's argument, year one of the common era isn't the year of Jesus Christ's birth because Jesus was most likely born somewhere in the range 4-6 B.C.E. In fact, this is a pretty common point of confusion which the use of B.C.E. and C.E. goes some small distance to remedy.

It's certainly true that one of the purposes of B.C.E/C.E. is to be polite to people who aren't Christians. It's not clear to me that there's anything wrong with that. Moreover, in the fields where this convention is most common (religious studies, archaeology, etc.) it serves as a useful reminder that in that environment Christianity should be treated as a subject of academic study, not as something special that must be treated with kid gloves.

Posted by ekr at 10:27 AM | Comments (62) | TrackBack

Why fingerprint your children?

Yesterday I heard an advertisement on the radio for a child fingerprint kit. Apparently the idea is that you fingerprint your kid and then if they're ever abducted you'll be glad you have the fingerprints on file. A little Google searching reveals that these things are all over the place.

Somebody help me out here. What's the point of this again? Say your kid is abducted and you've got their fingerprints. Then what?

  • The probability that it's going to help you get your children back if they're abducted is very low. Children aren't fingerprinted on a regular basis and the fingerprints certainly aren't run against any national missing child database.
  • You don't need a fingerprint to identify your own child. DNA testing is good enough now that we can figure out if a child is yours from your own DNA.

I talked to my friend Jennifer about this and she suggests two purposes.

  • If your kids are adopted, then you would want some form of positive ID.
  • Fingerprints can be checked faster than DNA so you might get your kids back faster rather than having them held by protective services for a week or so.

Of course, these rationales only really apply to very young children who can't identify you themselves, and only if you don't have lots of pictures. Otherwise, you shouldn't have any problem establishing positive ID.

So, my question is: "Is this really it?" We've got national advertising campaigns for something with the value proposition "If your children are abducted (which is pretty unlikely), we'll be able to identify them a couple of days faster, especially if they were adopted?" Can someone explain this to me?

Posted by ekr at 10:00 AM | Comments (49) | TrackBack

Well, that's a load off my mind

I hate releasing software. That's a funny thing for someone who writes software as part of his job to say, but there it is. I like writing it, but I'm never sure it's completely baked and so the actual release and testing process takes me forever. The fact that it's security software and so there's some natural paranoia about bugs just makes it worse.

Anyway, I just released a new version of my Java SSL/TLS implementation PureTLS. So, that's one less thing to procrastinate on.

Posted by ekr at 08:14 AM | Comments (25) | TrackBack

Why having your options bought back is a good deal

There's been a fair amount of discussion in the comments section about whether it's good for Microsoft to buy back people's options at their Black-Scholes value. That's not entirely clear. However, it's almost certainly good for employees, because it makes them liquid.

Employee stock options aren't ordinarily transferrable. They just sit there. If you had an ordinary stock option like you can buy on the market, then you could typically sell it for pretty close to the Black-Scholes value. However, employee options can't be sold. Therefore, if (for some reason) you value them at less than the Black-Scholes value, you're hosed because you can't sell them.

Why would employees value their options at less than the Black-Scholes value? There are two main reasons: risk aversion and option expiration.

Risk Aversion
The Black-Scholes model assumes that investors are risk-neutral. That is, they're indifferent between any two bets with the same expectation value (for instance, $500 or a lottery ticket with a payoff of $5000 and only a 10% chance of winning). However, many individual investors are risk-averse and would prefer to take less risk. As a consequence, many individual investors value options at less than the Black-Scholes value. There's nothing irrational about this, especially as we start to get into interesting-sized chunks of money. [0]
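To make the risk-aversion point concrete, here is a minimal sketch comparing the two bets above. The logarithmic utility function and the $10,000 of starting wealth are my assumptions for illustration, not anything implied by Black-Scholes; any concave utility function would tell the same qualitative story.

```python
import math

def expected_value(outcomes):
    """Expected payoff of a bet given as (probability, payoff) pairs."""
    return sum(p * x for p, x in outcomes)

def certainty_equivalent_log(wealth, outcomes):
    """Sure payoff that a log-utility investor values the same as the bet.

    Log utility and the starting wealth are illustrative assumptions.
    """
    expected_utility = sum(p * math.log(wealth + x) for p, x in outcomes)
    return math.exp(expected_utility) - wealth

wealth = 10_000.0                        # assumed starting wealth
sure_thing = [(1.0, 500.0)]              # $500 for certain
lottery = [(0.1, 5000.0), (0.9, 0.0)]    # 10% chance of $5000

ev_sure = expected_value(sure_thing)
ev_lottery = expected_value(lottery)
ce_sure = certainty_equivalent_log(wealth, sure_thing)
ce_lottery = certainty_equivalent_log(wealth, lottery)

print(f"expected values:       {ev_sure:.2f} vs {ev_lottery:.2f}")
print(f"certainty equivalents: {ce_sure:.2f} vs {ce_lottery:.2f}")
```

Both bets have the same expected value, so a risk-neutral investor is indifferent between them. Under these assumptions the risk-averse investor's certainty equivalent for the lottery comes out below $500, which is the sense in which such an investor values the risky asset at less than the model price.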

Option Expiration
One of the funny things about employee options is the exercise rules. If you have options in a company (Microsoft, say) and you leave Microsoft, you have to exercise the option (buy the stock) shortly after you quit. If you don't, the option expires and you lose it. This is obviously good for the company since it gives you an incentive to stay with the company, but it's not good for you, because you're giving up something of value if you leave. [1]

However, when you compute the Black-Scholes price of an option, it doesn't include the possibility that the option might just disappear--or, alternately, that you might be forced to exercise it at some arbitrary time. Obviously, this possibility reduces the value of the option. As a consequence, the Black-Scholes price over-estimates the value of the option to the employee.

What if this were standard practice?
So, it should be clear that getting bought out at the Black-Scholes value is almost certainly good for an individual employee. Interestingly, though, it's almost certainly bad for employees as a whole. Recall that I said earlier that option grants aren't taxable events. The rationale for that is that options have no actual value but just this theoretical Black-Scholes value. However, if you can sell your options back to the company at the Black-Scholes value (or indeed, any significant value) then an option grant is really income (since it's liquid) and the government will surely find some way to tax it. I don't know how regular a practice Microsoft would need to make this buyback before the rules change but I imagine it wouldn't have to happen too often before the IRS got interested.

[0] Incidentally, this is how insurance companies make money: by exchanging certain losses (the insurance premium) for uncertain but higher losses (the payout in case of negative outcome).

[1] Of course, if the company wants to get rid of you and you don't want to leave, this incentive to stay becomes a problem, in which case the company may want to buy you out. Underware Gnome CTO suggests in the comments that that's exactly what happened here.

Posted by ekr at 07:40 AM | Comments (62) | TrackBack

May 03, 2003

Guns do kill people, at least in movies

Just finished watching Snatch, which, like Lock, Stock, and Two Smoking Barrels (also by Guy Ritchie), is a caper film that takes place in the UK. In both movies, substantial parts of the plot are driven by the need to obtain guns. Can you imagine an American film where the characters had to use replica guns because they didn't know where to find real ones?
Posted by ekr at 05:35 PM | Comments (3) | TrackBack

Wait, now options do have value?

Steve at PM Style digs up an interesting tidbit: Microsoft has allowed an employee to use underwater options to repay a large outstanding loan. Incidentally, this isn't the first time this has happened. (See here).

The theory under which this was done isn't particularly complicated: the stock is at $26 and the strike price is at $28, so even though the options are out of the money, they have some value since they extend pretty far into the future. Why? Well, there's some chance that the stock will be above $28 at some time in the future and in that case you'd want to exercise the option and take the money. You can use the Black-Scholes model to tell you what the fair value of that option is.
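As a rough illustration, here is a sketch of the standard Black-Scholes formula for a European call with no dividends, applied to the $26 stock price and $28 strike. The time to expiration, volatility, and interest rate below are placeholders I made up for the example; they are not the actual terms of the Microsoft options.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_scholes_call(spot, strike, years, rate, vol):
    """Black-Scholes value of a European call on a non-dividend-paying stock."""
    sigma_root_t = vol * math.sqrt(years)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * years) / sigma_root_t
    d2 = d1 - sigma_root_t
    return spot * norm_cdf(d1) - strike * math.exp(-rate * years) * norm_cdf(d2)

spot, strike = 26.0, 28.0     # from the post: the option is $2 underwater
years = 5.0                   # assumed time to expiration
rate = 0.03                   # assumed risk-free interest rate
vol = 0.35                    # assumed annualized volatility

intrinsic = max(spot - strike, 0.0)
long_dated = black_scholes_call(spot, strike, years, rate, vol)
short_dated = black_scholes_call(spot, strike, 1.0, rate, vol)

print(f"intrinsic value:      {intrinsic:.2f}")
print(f"model value, 5 years: {long_dated:.2f}")
print(f"model value, 1 year:  {short_dated:.2f}")
```

The intrinsic value is zero because the option is out of the money, but the model value is positive, and under these inputs it is larger the further out the expiration date is.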

Here's the thing, though: a number of financial-type activities are based on pretending that stock options don't actually have any value (because they don't have any fixed value). In particular:

  • Employees don't have to pay taxes on stock option grants.
  • Lots of companies don't treat option grants as an expense on their balance sheets. (In all fairness, Microsoft has come out in favor of changing this behavior).
Of course, it's not unreasonable for Microsoft to decide that they're going to treat options as something with real value, and if they believe that, there's no problem with buying them back. However, I do wonder if they're going to be willing to buy back all the underwater options held by other employees. I bet a lot of them would jump on this deal.
Posted by ekr at 08:49 AM | Comments (59) | TrackBack

Students settle with RIAA

The students that the RIAA was suing have settled for about $15,000 each. Remind me again why it made sense to sue them for five million times more than that.
Posted by ekr at 07:22 AM | Comments (26) | TrackBack

May 02, 2003

What the heck is SCO up to?

SCO (hey, do they still use the Santa Cruz Operation name at all?) has been making threatening noises about Linux for a while now. However, in early March they filed suit against IBM, claiming that IBM had put proprietary Unix code into Linux. They've also hinted at suing Red Hat and SuSE, but so far haven't done anything about that.

To quickly review: the original Unix source code was written by AT&T, which sold the IP to Novell, which sold it to SCO. Most of SCO, including the IP, was sold to Caldera, which then changed its name back to SCO. SCO has sold their own Unix (which was never that popular and was widely considered to suck) for years and also sells a version of Linux.

"But wait", I hear you cry, "Linux is a complete from-scratch rewrite of Unix, so what's the problem?" Well, a lot of development has gone into Linux since the original Linus Torvalds days. Some of that work was done by IBM, so it's certainly possible that IBM put some of SCO's IP into Linux at the time and that SCO has a case.

So, it's possible that SCO has some sort of case, and that their behavior towards IBM makes some sense. IBM's got a lot of money and so there's presumably some chance that they would be willing to pay off SCO rather than fight a really extended lawsuit. There's even some chance that SCO will eventually win and extract some substantial sum of money from IBM. So, suing IBM is probably rational, especially if you're desperate, as SCO presumably is.

What seems less rational, at least from where I'm sitting, is the threatening noises that SCO is making towards Linux in general, including:

"There's a point in time that has to be resolved with those guys [Red Hat and SuSE] too," McBride said, "but that's not currently what our legal approach is about."
and
Regardless of whether the issue is hashed out in the courts, Linux companies will have to grapple with it, McBride said. "For Linux to move forward in a wide-scale fashion, I believe the intellectual property issues have got to be resolved," he said.

"There is not an intellectual property policeman sitting in at the check-in counter saying this is OK, this is not OK. It is a free-for-all," McBride said. "At the end of the day, there's not a basis for making sure code is clean when it goes in there."

There's a simpler solution to the issue, Perens said. "They should show us what code they have problems with. We'll take a look at it or we'll just replace it. Keeping us in the dark is just silly," he said.

An impassioned McBride, however, said there's a matter of principle at stake. SCO Group can't just let its intellectual property be used willy-nilly.

"This is not about 10 lines of code, it's about 20 years of extremely valuable intellectual property we're trying to protect...Am I supposed to lie down and not say anything about it?" McBride said. "There's a certain point here where you stand up for what's right and let the chips fall where they will."

First, let's get this "extremely valuable intellectual property" thing out of the way. There's no particular rocket science in SCO's or AT&T's version of UNIX. Lots of people have gotten independently written Unix-type OSes working on PCs and Linux worked just fine before IBM got into the act, so the suggestion that SCO's code is somehow essential to Linux is just silly. It's certainly possible that some IBM programmer crammed some of SCO's code into Linux, but that code could easily be removed and the rest of the Linux vendors could go on with their lives. Unless Red Hat, SuSE, etc. knew about the alleged infringement, it's hard to see how it could actually be a matter of principle.

So, what's in it for SCO? Obviously, they can't hope to extract any money from the Linux hacker community, since there's not any money there. One possibility is that they're just hoping to shake down Red Hat and SuSE. There's not a lot of money to be had there, but there may be some. Another possibility is that they're planning to shake down some big companies who run Linux and are afraid of a big copyright infringement suit.

On the other hand, what could be happening is that SCO is hoping to push all the other Linux players out of the market so that they have a clear field to sell Unix and their own version of Linux. I suppose it's possible that they could force other companies out of the market, or at least delay them, but given that SCO has been selling PC Unices for years (long before Linux) with only very modest success, it's hard to see the sale of Unix or Linux turning into a big profit center for SCO. There's no reason to believe their execution would be any better this time around.

However, they could succeed in destabilizing the Linux market pretty badly. Concerns over AT&T's code in the BSD code base held back the release of free versions of BSD for years, and in fact are one reason for Linux's dominance over the BSD variants. Unfortunately, SCO's refusal to identify the allegedly infringing code suggests that this is exactly what they have in mind.

Posted by ekr at 07:34 PM | Comments (15) | TrackBack

More in Intellectual Property Politics

Ed Felten suggests better terminology for the three major copyright factions: "Big-IP" [0] for the copyright lobby, "Small-IP" for the people who want to keep copyright but have it reined in a bit and "No-IP" for the people who want to do away with it entirely. [1] Ed also asks the very important question:
So far, the Big-IPs have done a pretty effective job of cementing the alliance between the Small-IPs and the No-IPs, most notably by treating the Small-IPs as if they had taken No-IP positions. Perhaps this is because Big-IPs overestimate the numbers and influence of the No-IPs. Or perhaps it is because some Small-IPs are being cagey about their beliefs, so as not to alienate their No-IP allies.

If I were a Big-IP, I would be wanting all the allies I could get, and I would be looking for a way to pry apart the Small-IPs and the No-IPs. If the Big-IPs decide to do this, things could get very interesting.

Why should the Big-IPs care what the Small-IPs think, given that the Big-IPs can get their way in Congress without any help? Because Congress, by itself, can't solve their problem. To solve their problem, the Big-IPs need cooperation from their customers, most of whom are still Small-IPs.

My guess is that there are two reasons why the Big-IPs have been acting as if the Small-IPs were No-IPs, one political and one semi-technical.

The political reason is simple: there are (or at least the Big-IPs think there are) a lot of undecideds whose positions are closer to Small-IP than Big-IP but closer to Big-IP than No-IP. If these undecideds can be convinced that the choice is Big-IP or No-IP then they'll vote Big-IP and the Big-IP people will win.

The semi-technical reason is that the Big-IPs may actually believe that the choice is between Big-IP and No-IP. The conflict has basically been fought on two fronts: expanding legal protections for copyright (more restrictions, longer copyright terms, etc.) and attempting to restrict technologies which can be used to distribute media (Napster, anti-DRM tech, etc.). Now, the Small-IPs are probably willing to compromise a fair bit on the legal front, but are much less willing to compromise on the technical front. They see (rightly, in my view) the legal restrictions required to control the relevant technologies as incredibly destructive to other important technical work. Ed, of course, has done a lot of work showing just how damaging these restrictions are.

However, many of the Big-IPs see the existence of these technologies as the death of copyright, in practice if not in principle. As long as they believe that's the case and the Small-IPs aren't willing to compromise on that front, I don't think there's any way that the Big-IPs and Small-IPs can see eye-to-eye.

[0] To a networking guy like me, Big-IP makes me think of the BIG-IP mailing list where IPv6 (then IPng) was originally discussed, but I imagine that few others get confused. F5 also makes a load balancer called BigIP.
[1] I think it may actually be a mistake to talk about IP here, since I don't get the sense that the Big-IPs, at least, care much about patents, except to the extent to which they can use them as technical means to enforce copyrights. The Small-IPs and especially the No-IPs seem to think of the issues as more connected.

Posted by ekr at 08:28 AM | Comments (28) | TrackBack

May 01, 2003

Some good news on age-based mental decline

This week's Science reports on some promising work in reversing the damage that age does to neurological function. It turns out that a lot of the degradation occurs due to a reduction not of neuron activation but of inhibition. As a result there is more spontaneous neural activity which makes it harder to pick the signal out of the noise. Presto, functional degradation.

The disinhibition seems to be somehow related to gamma-aminobutyric acid (GABA). The researchers showed that adding GABA to senescent neurons in the visual cortex of old monkeys improved the visual response substantially and the GABA agonist muscimol worked even better. As further evidence that this hypothesis is correct, a GABA antagonist produced similar degradation in young monkeys.

It's not clear exactly what the problem is (decreased GABA production, decreased sensitivity, etc.) and it's also not clear that this technique will work in humans, but it's certainly an exciting step in the right direction. I'm not really excited about having my brain fall apart as I get older, and so I'm pretty positive about any research designed to do something about that.

Posted by ekr at 09:23 PM | Comments (50) | TrackBack

Engineer's pragmatism?

Steven Den Beste writes about the supposed hardheadedness of engineers:
Engineers cannot afford any kind of delusions; it costs too damned much. One of my readers referred once to my "ruthless engineer's pragmatism" and that's exactly right. We must be pragmatic, because any other view of the world leads to failure. So we have to be ruthless about pragmatism because we have no choice. That's why, for instance, we constantly check one another's work and are very free with criticism of it. It's why we don't mind (much) when someone shows that we're wrong.

I'm amazed that anyone who's actually worked in an engineering field--as I know Den Beste has--can possibly believe this. The world is full of engineers who have all sorts of ridiculous delusions and engineering artifacts which don't work. And I'm not talking about delusions about the world at large but delusions about their own field. Delusions that have real world consequences.

A few examples (mostly from my field):

  • WiFi shipped with a completely broken security system called Wired Equivalent Privacy (WEP). The fact that it was broken was obvious to all the security professionals who examined it but somehow the original designers convinced themselves that they were competent to design a system without consulting with experts. They weren't.
  • Millions of dollars have been spent engineering Quality of Service (QoS) capability into routers, despite the fact that there is no apparent market demand for this product. I assure you, the QoS engineers thought this was a fantastic idea.
  • Look at most large pieces of software and you'll find all sorts of weird design choices that were the result of some engineer's idea that they would improve performance. Generally, these choices were made without any real profiling to determine if they were a good idea or not. The engineer in charge just thought it was a good idea. This effect is captured in Donald Knuth's famous comment that "premature optimization is the root of all evil".
  • An enormous amount of software is written in C or C++. Almost inevitably, the engineers screw something up, with the result that some attacker can overrun some buffer and take control of the software (this is the source of enormous numbers of security holes in software). This problem can be easily solved by a number of techniques (using Java or even bounds-checking for C) but when this is suggested, programmers almost always insist that that would be too slow--generally without any evidence that the code in question is performance critical. Amazingly, against all evidence, those programmers are convinced that their code is just fine and it's just other people who have a problem.
  • Porsche's manufacturing was a total mess until they brought in Toyota engineers to fix their processes. Apparently the vaunted German engineering expertise didn't extend to knowing how to manufacture stuff--or to knowing they didn't know how.
My point here isn't that engineers are stupid or incompetent, just that they're people like anyone else and subject to all the usual human failings. To suggest otherwise is simply--dare I say it--delusional.
Posted by ekr at 11:32 AM | Comments (35) | TrackBack

Two views of Intellectual Property

I think Ed Felten is onto something important here:
In all this quibbling about numbers, we mustn't lose sight of the big picture, which Kerr sees clearly. If the revenue per song is zero, it doesn't matter what share of that zero goes to the artist. No matter what future you hope for, if you want to enjoy recorded music it had better involve some kind of payment.

Roughly speaking, there are two camps working to loosen (or at least prevent the tightening of) intellectual property protection. For lack of better words let's call them Idealists and Pragmatists. The Idealists really don't want to have any kind of intellectual property at all. They see copyrights and patents as evil and want to do away with them altogether. Richard Stallman and the Free Software Foundation are probably the best-known advocates of this point of view.

The Pragmatists tend to view Intellectual Property as a necessary evil. The traditional economic analysis of IP is that it's required to incentivize people to engage in the kinds of intellectual effort that produce content. (Boldrin and Levine argue against this, but it's still the majority point of view in economics). In the view of the Pragmatists, the amount of protection given to IP has gone beyond the point of efficiency and so should be reduced. See the "economist's brief" in the Eldred v. Ashcroft case for a pretty good exposition of this view.

Both sides agree that protection for Intellectual Property is too tight currently, and so they've temporarily made common cause for the purposes of fighting the IP lobby (the RIAA and the MPAA primarily) who want to see copyright strengthened. Despite that temporary alliance, there are still a lot of people who don't agree with Ed's statement above. These people are quite loud (go check out Slashdot) and they naturally scare the hell out of the content producers, who fear a complete loss of control.

This is both a danger and an opportunity for the Pragmatists. The danger is that the IP lobby will decide that it's an all or nothing situation and fight hard--the way that anyone does when their back is up against the wall. The opportunity is to use the existence of the Idealists to lever the IP lobby into a compromise that seems more attractive. There's quite a bit of leverage there because the RIAA and the MPAA are really and truly scared. However, in order for compromise to seem reasonable, more people will have to say the kinds of things that Ed is saying here.

Posted by ekr at 09:01 AM | Comments (36) | TrackBack

11% fewer airport screeners

Reuters reports that the Transportation Security Administration is planning to cut the number of baggage screeners. Key quote:
"While we still live in a dangerous world, it also is time to assess our workplace requirements in relation to budget realities," Loy told reporters at a news conference.

While it's probably too much to hope that this is a sign that we're actually starting to rationally evaluate the cost/benefit ratio of airport security, it may indicate that the panic response to 9/11 has started to die down and our natural cheapness has started to reassert itself. Given my view of the effectiveness of airport security, I think this is probably a good thing.

Posted by ekr at 07:38 AM | Comments (10) | TrackBack