September 30, 2003

MPAA closes one source of leaks

One of the little-known facts about file sharing of movies is that a major source of movies for piracy is leaks from insiders. So, it's interesting to see that the MPAA will be forbidding the sending out of "screener copies" of movies for Oscar voters. [*]
Posted by ekr at 09:46 PM | Comments (15) | TrackBack

Asthma and antibiotics

Now, here's something interesting. Children who are given antibiotics as infants are significantly more likely to develop asthma: [*]
Overall, children given antibiotics in their first half-year were 2.6 times more likely to develop allergic asthma, the team told a meeting of the European Respiratory Society on Tuesday. With broad-spectrum antibiotics, which kill a wide range of bacteria, the risk was far higher: children were 8.9 times more likely to suffer from asthma.

The New Scientist article doesn't have a link to the primary source, but just based on the summary, this looks like solid work. And the New Scientist reporter does a good job of bringing out the primary objection: that it's infections which cause asthma and that children who get infections naturally get antibiotics more often. These two explanations are obviously tough to disentangle.

Posted by ekr at 09:14 AM | Comments (14) | TrackBack

September 29, 2003

What kind of system is United using?

I did a lot of flying in July. One of those trips was to Vienna on some assortment of United partner airlines:

  • San Francisco to Frankfurt on Lufthansa
  • Frankfurt to Vienna on Austrian
  • Vienna to Toronto on Austrian
  • Toronto to San Francisco on Air Canada

Foolishly, I failed to give them my frequent flyer number, so I had to send them copies of my ticket stubs in order to get credit.

I just checked my United Mileage Plus status and somehow they've managed to credit me with the Frankfurt to Vienna flight but no others. Now, I understand that they have to check with the partner airline, so I can understand that not all the credits come at the same time, but the Vienna to Toronto flight was also on Austrian, so why didn't I get credit for that as well?

I guess one possibility is that "international" flights are handled differently from "domestic" ones and a flight inside the EU counts as domestic. Whatever it is, it's strange.

Posted by ekr at 08:41 PM | Comments (47) | TrackBack

Surprise, PATRIOT powers used for general law enforcement

Kevin Dick pointed me to this article in the Times about how the Justice Department is using the new powers it was granted under the PATRIOT Act for general crime fighting purposes. Unsurprisingly, there's a lot of complaining about it, including a bunch of "that's not what we had in mind" from Patrick Leahy:
Senator Patrick J. Leahy of Vermont, the ranking Democrat on the Judiciary Committee, said members of Congress expected some of the new powers granted to law enforcement to be used for nonterrorism investigations. But he said the Justice Department's secrecy and lack of cooperation in putting the legislation into effect made him question whether "the government is taking shortcuts around the criminal laws" by invoking intelligence powers -- with differing standards of evidence -- to conduct surveillance operations and demand access to records.

"We did not intend for the government to shed the traditional tools of criminal investigation, such as grand jury subpoenas governed by well-established precedent and wiretaps strictly monitored" by federal judges, he said.

This strikes me as fairly disingenuous. I didn't know that Justice was doing this but I can't say I'm surprised. What else would you expect?

Look, prosecutors basically have an incentive to prosecute people by whatever means necessary. They don't have much of an incentive to hold back out of some concern for the balance between civil liberties and law enforcement. If Joe Prosecutor is in a position where he can either let some probable scumbag walk or use the expanded PATRIOT powers against him, why would he ever not use them? If Leahy wanted that not to happen, then he should have made sure that the law forbade it, not counted on prosecutorial discretion.

Posted by ekr at 08:27 PM | Comments (56) | TrackBack

I guess it's back to wearing long sleeves and hats

So, some researchers in the UK have discovered [*] that sunscreens don't do as good a job of blocking UVA (the kind of UV light thought to cause melanoma) as UVB (the kind of UV which causes most sunburn). Typical SPF 30 sunscreens (which block about 97% of UVB) only block about 50% of UVA.

This is fairly bad news for athletes, who spend a lot of time out in the sun and can't practically wear long sleeves. Traditionally, I've worn a lot of sunscreen, which has done a pretty good job of protecting me from burns, but I guess hasn't done that good a job of protecting me from skin cancer. Decoupling sunburn from cancer risk actually makes things worse in some sense, since it means you no longer have a visible marker for when you're at risk.

Posted by ekr at 10:27 AM | Comments (15) | TrackBack

September 28, 2003

iPod + Beetle

Check it out [*]. Apple and VW have joined up. Now when you buy a Beetle, you get an iPod with it. These do seem like two cultures that have a lot in common. Check out the commercial.
Posted by ekr at 10:32 PM | Comments (13) | TrackBack

Tergat breaks marathon WR

Paul Tergat just broke the marathon WR by 43 seconds, bringing it below 2:05 for the first time [*]. The race sounds like it was really interesting. Sammy Korir was supposed to pace Tergat but ended up going for the win himself, finishing only 1s behind Tergat.
Posted by ekr at 12:15 PM | Comments (11) | TrackBack

September 27, 2003

Statistics

Chris Bertram points out that doctors don't seem to be able to handle the kind of statistics you need to estimate risk. [*] This could be one reason why doctors would mistrust evidence-based medicine--they don't understand it.
Posted by ekr at 09:36 PM | Comments (13) | TrackBack

September 26, 2003

What good is evidence-based medicine if you ignore the evidence?

Kevin Dick pointed me to a pretty disturbing article in today's Wall Street Journal. Go here and look for "Failing to Reap Benefits of Great Research". There's a lot of good work going on in evidence-based medicine--studies designed to determine which treatments really work and which do not--but that research is often not getting translated into actual physician practice. A large part of the problem seems to be resistance by doctors to changing their behavior based on this kind of research.

Some of this resistance is natural. If you've been doing things a certain way for a long time and they seem to work, it's hard to just abandon those procedures based on studies of other people's patients. People have a terrible time accepting statistical results that contradict the evidence of their experience. It's particularly difficult when some treatment seems like it ought to work but the data says that it doesn't, as in the case of arthroscopic surgery for osteoarthritis. [*]. However, the case described below falls more under the category of turf defense.

One infamous case, though, makes you wonder how many doctors even believe in science-based care. In the mid-1990s, a federal group now known as the Agency for Healthcare Research and Quality issued a clinical guideline for lower back pain, concluding that surgery (spinal fusion) usually does no good. Orthopedic surgeons did the modern equivalent of grabbing pitchforks and storming the castle: They lobbied congressmen to punish the agency and crippled for years the very idea of evidence-based medicine.

Makes you wonder indeed.

Posted by ekr at 02:06 PM | Comments (66) | TrackBack

More caveman DRM

I was surfing the Web [*] and accidentally hit the right-click button only to be greeted with a dialog that read "Copyright protected ER Headquarters". A little sniffing around revealed that the page had been equipped with a little Javascript program that activated on right click, thus stopping you from cutting and pasting the contents--at least using the right mouse button.

This has to be one of the most useless pieces of DRM I've ever seen. It's laughably easy to circumvent. First, you can use the "Save Page" menu option to just save the entire file. Second, you can cut and paste using a menu bar pick or a hot key. Third, Mozilla Firebird on UNIX doesn't even respect the override anyway. Finally, if you really wanted to steal all the content on the site you'd just use a mass copying tool, not a browser. If you're going to do something this useless, why bother with anything at all?

Posted by ekr at 12:28 PM | Comments (42) | TrackBack

Segway recall

Segway is recalling their scooters. [*]. Apparently, when the batteries get low riders can fall off [*].
Under certain operating conditions, particularly when the batteries are near the end of charge, some Segway HTs may not deliver enough power, allowing the rider to fall. This can happen if the rider speeds up abruptly, encounters an obstacle, or continues to ride after receiving a low-battery alert.

As far as I can figure it, here's what's going on: Segways are basically unstable. They depend on being actively stabilized under control of the onboard computer. If there's not enough power, the software doesn't handle it properly and you can fall off. Segway is delivering a new software release to fix the problem.

Computerized control systems are quickly replacing simple mechanical systems in all sorts of applications. This allows the designers to do all sorts of things that they couldn't do otherwise. In your car alone, computer control has enabled cool features like antilock braking, traction control, and electronic fuel injection. The problem, of course, is that all this software is complex and as we all know, software has bugs. The trick is designing that software so that when it fails, it fails in such a way that it's not dangerous. Looks like Segway isn't quite there yet.

Posted by ekr at 11:10 AM | Comments (10) | TrackBack

September 25, 2003

I've forgotten all my calculus

It's amazing how utterly you can lose stuff when you don't use it. I once knew calculus--as in really knew it. Unfortunately, that was 15 years ago and I've pretty much forgotten how to do anything but the simplest polynomial integration. Worse yet, I took calc in high school and in my HS we didn't buy our textbooks. Accordingly, unlike the classes I took in college, I don't even have an old book to refer to.

The good news is that technology has advanced quite a bit and so it's a lot less important to know how to integrate by hand. Most of the time I can just use R to numerically integrate whatever function I'm interested in. If you need symbolic integration, there's always Maxima.
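For the curious, numerical integration is not much code. Here's a minimal sketch of composite Simpson's rule in Python. It is only an illustration of the idea, not what R does internally (R's integrate() uses adaptive quadrature), and the function name and the default of 1000 subintervals are my own choices.

```python
import math

def simpson(f, a, b, n=1000):
    """Approximate the integral of f over [a, b] with composite Simpson's rule."""
    if n % 2:
        n += 1  # Simpson's rule needs an even number of subintervals
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Interior points alternate between weight 4 (odd i) and weight 2 (even i)
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

area_sin = simpson(math.sin, 0.0, math.pi)    # exact answer is 2
area_sq = simpson(lambda x: x * x, 0.0, 3.0)  # exact answer is 9
```

Both test integrals are ones I can still do by hand, which makes them a handy sanity check.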

Posted by ekr at 08:15 PM | Comments (17) | TrackBack

Why do we put up with OPEC?

Ok, so OPEC has cut production and prices are going up again. [*]. Why is it again that the United States allows OPEC countries to collude to control oil prices? If OPEC were a bunch of American companies rather than a bunch of third world countries, Justice would be all over them for antitrust violations.

You want to see something amazing? Look at their rationale for their behavior:

All eleven Members are developing countries, whose economies rely on oil export revenues. One of OPEC's primary missions is to achieve stable oil prices, which are fair and reasonable for oil producers and consumers.

Hmm... The members of OPEC are Algeria, Indonesia, Iran, Iraq, Kuwait, Libya, Nigeria, Qatar, Saudi Arabia, United Arab Emirates and Venezuela.

This is only true if we really stretch the definition of "developing". Qatar and UAE have per capita GDPs higher than Spain! Saudi Arabia's per capita GDP is about half that of Spain and higher than that of Russia! Now, it's true that a few of these countries (Algeria, Nigeria, Iraq) really are a mess, but most of the OPEC countries are raking in money hand over fist. If they're still "developing", it's because they've wasted the money rather than using it to develop. Why exactly should I feel obligated to pay higher gas prices so they can waste more of it?

UPDATE:
Michael Kinsley has a pretty negative piece about Iraq rejoining OPEC.

Posted by ekr at 08:04 AM | Comments (50) | TrackBack

September 24, 2003

Provigil for everyone

Kevin Dick pointed me to the news that the FDA is considering broadening the indications for Provigil. [*]. Provigil is a drug originally developed for narcolepsy, but it turns out to dramatically decrease symptoms of sleepiness even in normal people. However, unlike caffeine or amphetamines, it's not really a stimulant and doesn't make you jittery. It just keeps you alert.

I've taken Provigil a few times and it certainly does seem to dramatically improve alertness. It's not perfect. I still got somewhat tired, but it seemed to be a lot better than nothing or just caffeine. I do kind of worry about the long term effects of sleep deprivation. Sure, I didn't feel sleepy, but that doesn't mean that the rest of my body didn't need sleep. Still, it sure is nice to be able to put it off a little bit when you have something to do.

Posted by ekr at 03:51 PM | Comments (14) | TrackBack

September 23, 2003

HP indemnifies Linux users

Here we go. HP has said that they will indemnify [*] Linux users from being sued by SCO. Anyone want to bet that IBM will be following suit shortly?
Posted by ekr at 10:11 PM | Comments (26) | TrackBack

95% Confidence Interval

If there's nothing special about the 95% confidence interval, why is it so ubiquitous? My guess is that it's the result of the coincidence of two factors:
  • If your data is normally distributed, then 95% of the measurements will fall within 1.96 standard deviations of the mean. Obviously, 2 standard deviations is a special number and 1.96 is close enough to 2 standard deviations that 95% feels special too.
  • A 95% confidence interval means that you're wrong 1 time out of 20, which is about the kind of error rate people are willing to live with.

These two factors make 95% very psychologically compelling. The other commonly used confidence interval, 99%, has a similar argument going for it: it's one error out of 100.
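The 1.96 figure is easy to check. For a normal distribution, the fraction of values within z standard deviations of the mean is erf(z / sqrt(2)). A quick sketch in Python (the function name is mine):

```python
import math

def normal_coverage(z):
    """Fraction of a normal distribution lying within z standard deviations of the mean."""
    return math.erf(z / math.sqrt(2))

within_1_96 = normal_coverage(1.96)    # about 0.95
within_2 = normal_coverage(2.0)        # about 0.9545
within_2_576 = normal_coverage(2.576)  # about 0.99
```

So 2 standard deviations actually gets you a bit more than 95%, and the 99% interval needs roughly 2.58.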

Posted by ekr at 04:18 PM | Comments (11) | TrackBack

Confidence and line drawing

One classic case where essentially arbitrary lines have to be drawn is the confidence interval. Say your job is to estimate some quantity. For convenience, let's say that we're interested in estimating the number of voters who will vote for Arnold Schwarzenegger. As I discussed previously [*] our measurements always have some error associated with them.

It's customary in science to quote what's called a 95% confidence interval. Say we measure Arnold's support at 40% and the 95% confidence interval is 37-43%. This means that if we repeated the measurement 100 times, we would expect that 95 of those times the result would be between 37% and 43% and the rest of the measurements would be larger or smaller. However, there's nothing really special about 95%. One could just as well quote a 90% confidence interval or a 99% confidence interval. As with voting machine error rates, you have to draw the line somewhere and 95% is where it's customary to draw it. There are fields where much tighter confidence bounds (99% or 99.9%) are common as well.
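To make the numbers concrete, here's a rough sketch of where an interval like that could come from, using the usual normal approximation for a proportion. The sample size of 1024 is made up for illustration; I picked it because it gives roughly a 3-point margin at 40% support.

```python
import math

def poll_interval(p, n, z=1.96):
    """Normal-approximation confidence interval for a proportion p measured on n respondents."""
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin

# n = 1024 is a hypothetical sample size, not a figure from any real poll.
low, high = poll_interval(0.40, 1024)  # roughly (0.37, 0.43)
```

Swapping z for a larger value (say 2.576 for 99%) widens the interval, which is the line-drawing tradeoff in a nutshell: more confidence, less precision.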

Posted by ekr at 04:01 PM | Comments (10) | TrackBack

Drawing the line

I caught some of the oral argument in the 9th Circuit's en banc hearing of the California Recall case yesterday. One thing that struck me was how Judge Kozinski kept pressing Charles Diamond to say how high the error rate had to be before he would admit that there was a problem. Here's Slate's summary [*]:
And Kozinski, who dominates the argument today the way Scalia tends to do in the high court, similarly razzes Charles Diamond, who represents Ted Costa, the man who initiated the recall ballot initiative. Pressing Diamond on what kind of government-tampering with voting would represent an equal protection violation, he asks what would happen if Los Angeles County officials just decided to count every other vote. Later he asks what happens if they toss nine out of every 10 ballots, quipping, "I feel like Abraham here." High Holy Day humor that's lost on much of the crowd.

I'm not a lawyer, but it sure looked to me like everyone in the room understood that--like in many negotiations--being the first guy to name a number was a bad idea. You can just imagine the conversation:

Diamond: 5% error rate
Kozinski: Really? So, what about 4.9? That's ok? How do you distinguish those two?
Diamond: Uh...

Despite the seeming effectiveness of this line of argument, I don't think it holds water. The fact is that there's a continuum between "perfect" and "worthless" and there's really nothing special about any point along the continuum other than the end points. Nevertheless, there has to be some standard for how bad error rates are before we say "that's too much". Wherever we draw the line is arbitrary, but that doesn't change the fact that the line has to be drawn. Pretending that it doesn't is just a rhetorical trick.

Posted by ekr at 03:33 PM | Comments (12) | TrackBack

September 22, 2003

Didn't I tell you not to reinvent SSL?

Peter Gutmann has a nice post to the cryptography mailing list where he analyzes a number of Linux VPN packages and finds amateurish cryptographic errors. This won't come as much of a surprise to anyone who has ever looked at home-grown protocols. Such protocols nearly always contain simple mistakes. It's hard to get things right. If you can possibly reuse something preexisting (typically SSL or SSH), then you should.

Peter quite properly nails the excuse that such people almost always have for reinventing the wheel:

For all of these VPN apps, the authors state that they were motivated to create them as a reaction to the perceived complexity of protocols like SSL, SSH, and IPsec. The means of reducing the complexity was to strip out all those nasty security features that made the protocols complex (and secure). Now if you're Bruce Schneier or Niels Ferguson, you're allowed to reinvent SSL ("Practical Cryptography", John Wiley & Sons, 2003). Unfortunately the people who created these programs are no Bruce or Niels. The results are predictable.

In my experience, most of the complexity is in certificate handling. You can safely punt that (if you know what you're doing). After that, you have to be really careful what you cut because you're likely to cut something important. This isn't to say that SSL is perfect--it's not. It's just that the amount of complexity reduction you can achieve (after removing certs) while retaining security is actually quite small.

UPDATE: If you dare, scroll to the bottom of the link above for Peter's rather graphic suggestion for the appropriate punishment for people who try to reinvent SSL/SSH.

Posted by ekr at 04:21 PM | Comments (46) | TrackBack

A missing intellectual property right

If you have a TiVo you've probably noticed that it doesn't have a commercial skipping feature. This is technically possible; in fact, the ReplayTV used to do it. However, following their court case, ReplayTV is apparently taking it out [*].

Now, I don't know whether commercial skipping is copyright infringement [*], as Richard Posner has argued [*], but I do think there's a high probability that an effort to get it declared fair use--either by court decision or legislative action--would eventually prevail. There are lots of big players in the digital video recorder business and this would obviously be an attractive feature to customers, so why hasn't this been done?

One potential answer is that there is a missing intellectual property right here. Being the only company who has commercial skip is indeed a competitive advantage, but it's not one that's protectable. Say that TiVo puts commercial skip in their product and fights the court case and eventually wins. Well, nothing stops ReplayTV (or Microsoft or whoever) from turning around and putting the feature in their product and then the advantage is gone. So, none of the companies has any incentive to be the test case and we get the usual free rider situation.

Maybe we need a new kind of patent for this litigious age: the first company to establish the right to do something gets a temporary monopoly.

Posted by ekr at 02:43 PM | Comments (21) | TrackBack

September 21, 2003

Should RU-486 have been approved?

Medpundit argues that RU-486 should not have been approved on safety grounds. [*]:
But, in the case of mifepristone, it isn't at all clear that the benefits exceed the risks. There's already a safer alternative to the drug. It's called surgical abortion (complication rate less than one percent in the first trimester compared to 4 to 8% for mifepristone).

Imagine if there was a drug that could treat gallstones, but 4-8% of users required surgery to treat its complications, which include death. Is there any doubt that the FDA would deny it approval? They would correctly point out that gallbladder surgery is a safer alternative. But then, there is no National Right to Life Without Gallstones to pressure the FDA for approval.

Huh?

According to Medpundit, 4-8% of the people who take RU-486 will have to experience surgical abortion, whereas 100% of people who choose to have a surgical abortion will have to have one. And Medpundit says that makes RU-486 worse. I'm having a hard time following this cost/benefit analysis.

Now, it could be that her argument is that the cumulative death rate for RU-486 is higher as well, but that's not what the numbers she's quoting demonstrate.

Posted by ekr at 08:46 AM | Comments (69) | TrackBack

September 20, 2003

Recommendation: Wil McCarthy's "The Collapsium" and "The Wellstone"

Lately I finished reading Wil McCarthy's The Collapsium and The Wellstone. They're the first two books of a really interesting hard science fiction series.

Like Robert Forward, McCarthy is really interested in the implications of hypertechnology. Unlike Forward, McCarthy actually can write. McCarthy introduces a whole bunch of technological concepts, including:

  • Matter transmission/duplication by scanning and reassembly ("faxing")--including teleportation of humans.
  • Immortality (by correction during faxing)
  • "Semi-safe" black holes
  • Programmable matter that changes shape and chemical composition.

"The Collapsium" is pretty much a straight adventure story exploring the implications of the technology. The Wellstone is more an exploration of what it's like to be young in a world where you know your parent are never going to grow old and get out of the way--where all the respected positions in society are already filled.

McCarthy claims that this hypertechnology is technically possible and provides an appendix describing how semi-safe black holes and programmable matter might conceivably work. I don't understand the physics well enough to assess the semi-safe black holes, but the programmable matter doesn't sound totally insane. It certainly would be incredibly cool if it worked.

Posted by ekr at 07:38 PM | Comments (12) | TrackBack

September 19, 2003

Cheap beer means more drinking

Some researchers in Boston are reporting [*] that alcohol consumption is elastic.
The cheaper the beer and the larger the volume available, the more students reported drinking. When retail outlets sold discounted beer, the average number of drinks students consumed rose. The same was true when stores sold 24-pack, 36-pack, kegs and party balls, a form of mini-keg holding 2,5 cases.

Mark Kleiman [*] call your office [*].

Posted by ekr at 09:26 AM | Comments (18) | TrackBack

Mercury or viruses or what?

The funny thing about the vaccine-autism hypothesis is that it wasn't initially about mercury. The initial fuss about vaccines and autism was started by a British doctor named Andrew Wakefield [*], and it wasn't about vaccines in general but about a particular vaccine called MMR (Measles-Mumps-Rubella). Moreover, Wakefield's argument didn't have anything to do with mercury but was rather that the live measles virus in MMR was causing autism. Like the claim about mercury, this wasn't entirely crazy, but doesn't seem to be true either. [*].

Now, here's the point at which something fascinating happens, and I wish I knew exactly when it did, because instead of people concluding that vaccination was ok, the fear of live virus and MMR transferred itself to thimerosal and vaccines as a group, despite the fact that the only real connection between the two hypotheses is that they link vaccines and autism.

I'm speculating now, but here's what I think happened: humans are incredibly bad at accepting that some things are just coincidence. Since there appeared to be a temporal link between vaccines and autism, they figured there must also be a causal link. Since it wasn't the virus, they looked for another culprit and mercury is what stuck out.

The problem, of course, is that this isn't good science. There are any number of conceivable ways in which vaccination--or just about anything else in an infant's life--could cause autism. If there was a really clear epidemiological connection between the two, then it would be worth doing research to figure out what the biological cause actually was. As it is, though, the epidemiological link is, at best, inconclusive, so there's no particular reason to single out vaccines for study--unless, of course, you're mainly interested in finding someone to blame.

Posted by ekr at 09:08 AM | Comments (46) | TrackBack

What's it going to take to kill the vaccine-autism hypothesis?

Last week's Science magazine has an article about two epidemiological studies on the relationship between autism and vaccine administration. [*]. The relationship? There doesn't seem to be any.

For those of you who mercifully haven't been paying attention, vaccines used to contain a mercury-based preservative called thimerosal. Mercury is a known neurotoxin in high doses and so it has been suggested that mercury load was a cause of autism. However, in recent years countries have started to phase out thimerosal without finding any reduction in autism.

[Figure: 1454a-1-med.gif (from Science)]

Naturally, the proponents of a vaccine-autism link have a bunch of criticisms [*] of this Danish work. Some of these criticisms are valid and some are kind of silly. A lot of the criticisms are of the precise methods the researchers used to aggregate their data. I'm somewhat concerned by that but it would require a lot of investigation to know how seriously to take these arguments. When you're working from public records rather than gathering your own data, things are almost always pretty messy.

What's important to remember here, however, is that the evidence being presented by the Danish researchers is vastly more convincing than that presented by their opponents, which is to say effectively none. A read of Bernard et al's paper [*] reveals that the anti-vaccine argument is incredibly weak. Basically, it consists of the following four points:

  • Mercury is a known neurotoxin and in some respects autism looks like mercury poisoning.
  • The amount of thimerosal being delivered to children in vaccines exceeds the "safe" level.
  • Autism comes on at about the same time as the vaccine load is high.
  • Autism rates have been going up over the past 50 years or so, as has the number of vaccines children get.

However, what the authors don't mention is:

  • Mercury poisoning pretty much generally screws up your brain in a variety of ways, and so pretty much any symptoms of autism would be consistent with some form of mercury poisoning.
  • The instantaneous mercury loads are very high (because you get a single shot) but it's not known how that compares to what's considered safe as a cumulative load. Also, thimerosal is a different form of mercury from that for which the safety levels were derived and the exact form of mercury matters quite a bit.
  • 12-18 months is also when children do a lot of neurological development, so you'd expect autism to start showing up here no matter what.
  • The rise in autism also coincides with the rise in the stock market, but we're not blaming Richard Grasso.

The bottom line is that it wasn't insane to think that autism was related to mercury, and it's probably not insane to remove mercury from vaccines--which has pretty much already happened in the US. However, the actual evidence for a connection is at best suggestive. Moreover, despite a couple years of research it hasn't gotten any stronger. On the contrary, the studies that have been done, though not perfect, point in the other direction. While we certainly haven't proved beyond all doubt that autism isn't caused by vaccines, there's really no good reason to think that it is, either.

Posted by ekr at 08:38 AM | Comments (28) | TrackBack

September 18, 2003

Prilosec OTC

I didn't see any news announcements, but it looks like Prilosec has finally gone over the counter (shipping September 29th) [*]. It's cheap, too. $19.99 for a month's supply. This is less than a quarter the price of even the generic prescription version. I wonder how long it will be before there's an even cheaper generic OTC.
Posted by ekr at 07:00 PM | Comments (61) | TrackBack

What was that PGP key again?

So, I thought I'd install the new version of OpenSSH, 3.7.1, to fix the aforementioned buffer overflow. [*]. Naturally, I want to verify the PGP signature to make sure that the distribution hasn't been tampered with. No problem, I've got GPG. Unfortunately, I don't know the guy who signed the distribution, and his key isn't signed by anyone I know.

This is the problem with PGP, of course. It's incredibly flexible but actually building a chain of trust to someone you personally trust can be incredibly complicated. Effectively you need to do a recursive tree search. There may be some way to get GPG to do this automatically, but if there is I don't know it. And even if there is, chasing down all the links could be arbitrarily computationally expensive. In the worst case one could have to traverse the entire database of keys in order to find out that you couldn't build a chain.
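Here's a minimal sketch of what that search looks like, written in Python as a breadth-first search over a "who signed whom" graph. It's a toy: the key names are hypothetical, the signature map is assumed to be already downloaded and verified, and it ignores the different trust levels that a real PGP implementation tracks.

```python
from collections import deque

def find_trust_chain(signatures, trusted, target):
    """Breadth-first search for a chain of signatures from a key I trust to the target key.

    signatures maps a signer's key ID to the set of key IDs that signer has signed.
    Returns the shortest chain as a list of key IDs, or None if no chain exists.
    """
    queue = deque([k] for k in trusted)
    seen = set(trusted)
    while queue:
        chain = queue.popleft()
        if chain[-1] == target:
            return chain
        for signed in signatures.get(chain[-1], ()):
            if signed not in seen:
                seen.add(signed)
                queue.append(chain + [signed])
    return None

# Hypothetical key IDs, for illustration only.
signatures = {
    "me": {"alice", "bob"},
    "alice": {"carol"},
    "carol": {"release-signer"},
    "bob": set(),
}
chain = find_trust_chain(signatures, ["me"], "release-signer")
no_chain = find_trust_chain(signatures, ["me"], "stranger")
```

Note the failure case: to return None the search has to visit every key reachable from the ones I trust, which is exactly the worst case described above.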

Posted by ekr at 11:39 AM | Comments (26) | TrackBack

Oldsters to the rescue

Yesterday's Slate has an interesting article about conflict between public pension funds and VCs. Basically, what's going on here is that the VCs want to keep detailed information about their financials secret and the pension funds are insisting they be made public. I imagine that eventually the pension funds will win--they're the ones with the money.
Posted by ekr at 09:54 AM | Comments (10) | TrackBack

September 17, 2003

"I'm not going to quit"

I see that Richard Grasso has resigned [*] from the chairmanship of the NYSE. Now, I have no idea whether Grasso jumped or was pushed, but I see that only yesterday, he was saying that he would not resign [*]. Does this kind of ritual denial have any informational content whatsoever? I'm having trouble thinking of any public figure who resigned after public pressure who didn't initially claim that he would not.
Posted by ekr at 08:44 PM | Comments (12) | TrackBack

September 16, 2003

Oh, no, not again

Well, apparently OpenSSH 3.7 (just released today) also has a buffer management problem [*]. Excuse me while I go hide under my bed.
Posted by ekr at 09:48 PM | Comments (12) | TrackBack

What's wrong with OpenSSH

For the cognoscenti, here's a pointer [*] to the patch for this OpenSSH problem, as represented in FreeBSD's source tree. Here's the broken code.
/* Increase the size of the buffer and retry. */
buffer->alloc += len + 32768;
if (buffer->alloc > 0xa00000)
        fatal("buffer_append_space: alloc %u not supported",
            buffer->alloc);

The problem appears to be that fatal() does a bunch of cleanup stuff and this may include using the value of buffer->alloc, which no longer accurately reflects the size of the buffer.
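
Here's a toy model of that failure mode. To be clear, this is Python rather than C and it is not OpenSSH's code: the Buffer class, FatalError, and both grow functions are made up, and all the model tracks is whether the recorded size still matches the real one when the error path runs. The second function shows one conventional way to avoid the problem, which is to check a local value and only commit it after the resize.

```python
MAX_ALLOC = 0xA00000  # the limit from the snippet above


class FatalError(Exception):
    """Stands in for fatal(), which runs cleanup code before exiting."""


class Buffer:
    def __init__(self, size):
        self.data = bytearray(size)  # the real storage
        self.alloc = size            # what the bookkeeping claims

    def is_consistent(self):
        return self.alloc == len(self.data)


def grow_broken(buf, length):
    # Mirrors the snippet: alloc is bumped before the limit check.
    buf.alloc += length + 32768
    if buf.alloc > MAX_ALLOC:
        raise FatalError("alloc %d not supported" % buf.alloc)
    buf.data.extend(bytes(buf.alloc - len(buf.data)))


def grow_checked(buf, length):
    # Check a local value first; commit only after the resize happened.
    new_alloc = buf.alloc + length + 32768
    if new_alloc > MAX_ALLOC:
        raise FatalError("alloc %d not supported" % new_alloc)
    buf.data.extend(bytes(new_alloc - len(buf.data)))
    buf.alloc = new_alloc


def consistent_after(grow, length):
    """Grow a small buffer and report whether alloc still matches reality."""
    buf = Buffer(4096)
    try:
        grow(buf, length)
    except FatalError:
        pass  # this is the point where cleanup code would look at buf.alloc
    return buf.is_consistent()


print(consistent_after(grow_broken, MAX_ALLOC))
print(consistent_after(grow_checked, MAX_ALLOC))
```

With an oversized request the broken version leaves alloc claiming far more memory than the buffer really has, which is exactly the state you don't want cleanup code to trust.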

Posted by ekr at 09:32 AM | Comments (10) | TrackBack

Oh, this is good

So, there's another remotely exploitable hole in OpenSSH. [*] Details are slowly trickling out into the public community. Most of the advisories don't seem to have been released and we're already starting to see rumors of exploits. I've disconnected SSH until an update is available. Apparently, there are copies of the new OpenSSH 3.7 release that fixes this out there, but everything is so jammed up with people trying to figure out what's up that it's hard to make forward progress.

UPDATE:
OpenSSH 3.7 was just released, as well as a patch for the problem.

UPDATE:
My sources tell me that this bug has actually been circulating for a while in the underground community. Doh!

Posted by ekr at 09:28 AM | Comments (12) | TrackBack

September 15, 2003

Doing the math on voting error rates

The UPI article on the California recall election case [*] quotes Mark Rosenbaum of the ACLU saying:
"The other side has to answer the question, 'How can you hold an election when you know going in that because of the unacceptability of the machine, poor people and people of color are going to have a half or third of a chance of having their votes counted as white or more affluent individuals,'" Rosenbaum demanded to know. "That's a principle that every court ... has subscribed to."

If the information in the article is correct, this isn't an accurate characterization. It's true that the error rate of punch cards is about twice that of other voting systems, but even that error rate is only 3%. That means a punch-card vote has about a 97% chance of being counted, versus 98.5% or so on the other systems. An accurate characterization would be to say that "poor people and people of color are going to have about 98% of the chance of having their votes counted as white or more affluent individuals". Doesn't sound quite as serious that way, does it?
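
The arithmetic is simple enough to check by machine. This is just a sketch: the 3% figure is from the article, the 1.0% to 1.5% range for the other systems is the one quoted in the post below, and relative_chance is my own name for the ratio.

```python
def relative_chance(error_rate, baseline_error_rate):
    """Chance a vote is counted, relative to the chance under the baseline system."""
    return (1 - error_rate) / (1 - baseline_error_rate)


punch_card = 0.03
for other in (0.010, 0.015):
    ratio = relative_chance(punch_card, other)
    print("other systems at %.1f%% error: %.1f%% of the chance" % (100 * other, 100 * ratio))
```

Either way the answer comes out at roughly 98%, nowhere near "a half or third".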

Posted by ekr at 10:14 PM | Comments (13) | TrackBack

Did punch-cards suddenly get worse?

Ok, so the 9th Circuit Court of Appeals just delayed the recall election. [*] I haven't had a chance to read the decision, but apparently the issue is that some California counties use less accurate punch card voting.
The ACLU, a lead plaintiff in the case, contended that the six counties still switching over to electronic voting were among the most populated in the state and were home to a large number of minority voters. The lawsuit argued that going ahead with the Oct. 7 election would risk the votes of an estimated 40,000 voters based on an error rate of 3 percent for punch-card ballots; the new electronic devises have an error rate or 1.0 percent to 1.5 percent.

Ok, so punch-card ballots suck. But aren't they the same sucky punch-card ballots that we've been using for years? It's not like we suddenly discovered this stuff. How have things changed that we now can't hold an election using them?

Posted by ekr at 10:04 PM | Comments (46) | TrackBack

September 14, 2003

The marginal value of Britney Spears

As I've argued before, while Britney Spears is obviously very popular, there are likely to be lots of other people who are very nearly as good. [*] If Britney were hit by a bus tomorrow, it wouldn't be that hard for her label to produce another star. To the extent that that's true in general, the vast majority of the value add in the production of an album is provided by the label--presumably in the form of marketing of various kinds. In that case, not only would one expect that the record companies would extract almost all the surplus[*], but it's quite arguable that that's the fair outcome.
Posted by ekr at 10:01 PM | Comments (15) | TrackBack

What is it that will make Sun rise again?

Sun Microsystems has placed a number of billboards on 101 that read "Show the world what you showed me in your office in December, and Sun will rise again." The quote is by Rich Karlgaard from Forbes. [*] So, what is this brilliant product that Karlgaard is talking about, the one that's going to save the company? It turns out to be the Sun Ray, Sun's insane thin client.

I don't know whether to laugh or cry.

Posted by ekr at 04:18 PM | Comments (12) | TrackBack

How much should musicians make?

Tyler Cowen points to Steve Albini's rant about how musicians get screwed by the label. It's sufficiently illuminating that you should probably read it yourself, but the executive summary is that the band ends up paying all their production expenses--though the label fronts them their expenses in the form of an advance. At the end of the day, they're lucky to make back the expenses.

Now, what's interesting here is that the technical book publication process is totally different. When you write a book, the publisher pays all the production expenses--sometimes they'll even buy you the word processing software you need--and you start making royalties with the first copy sold. As a consequence, despite the fact that SSL and TLS sold a lot less than the 250,000 copies in Albini's example, I actually came out substantially in the black on the project.

Why the difference? Finding decent authors for books, especially technical books, is quite difficult, and the publishers believe that having a good author matters for having sales. Most of the people capable of writing your book have other things they want to do. [0] By contrast, there are any number of young bands falling all over themselves to sign with a major label. No doubt many of them are just as talented as the current crop of signed artists. Some of them would probably record for free just to get the fame and fortune. Why would you expect the labels to ever pay more than the market price?

Incidentally, book royalty rates are generally regarded as pretty meager. As far as I can tell, most people who write technical books do it for the fame, not the royalties. Perhaps that's why musicians do it as well.

Posted by ekr at 03:12 PM | Comments (11) | TrackBack

September 13, 2003

An accidental AIDS vaccine?

Is it really possible that vaccinating people against smallpox could protect them from AIDS? [*] If you'd asked me a week ago I would have said it was unlikely, but a team at GMU is reporting that it does. Their explanation for why this might work doesn't sound entirely implausible:
A study published in 1999 showed that a relative of smallpox, called the myxoma poxvirus, uses the same cellular doorway -- the CCR5 receptor -- to infect a cell as AIDS does.

And studies have noted that people with certain mutations in CCR5 are resistant to HIV infection.

I'm not enough of a virologist to know if this makes sense or not. As I understand it, vaccines work by sensitizing the body to particular antigens on the viral coat. If HIV and vaccinia use the same cellular receptor they must have one protein on their coat that's sort of similar (the one that docks with that receptor). It would be pretty lucky if the smallpox vaccine triggered an antibody that matched that common antigen.

Posted by ekr at 10:08 PM | Comments (54) | TrackBack

Anti-Spam Authentication + Spam Viruses = Mess

I've been doing some more thinking about the use of sender authentication to stop spam. [*] I said previously that I didn't think it would work, but there's actually a problem I didn't mention. Say you've got one of these systems in place. The basic elements are:
  1. All senders must authenticate.
  2. Senders caught sending spam get blacklisted somehow.

Now, consider what happens when you introduce a SoBig-style spam virus into the mix. Remember that SoBig takes over the machines of legitimate users, so it's going to be able to send mail as them. Following rule 2, we add all those users to the blacklist. Now, Joe CEO, who's not a spammer but just the owner of a badly maintained machine, can no longer send email. Considering the number of people infected with SoBig, there are going to be a lot of these "innocent" victims on the blacklist. Now, you could imagine some procedure for removing them from the blacklist, but you'd presumably want to at least make sure they weren't infected first. How happy do you think Joe CEO is going to be to be told he can't send mail for 4 hours--or 4 days--while he waits for the admin to disinfect his machine and get him off the blacklist? It's hard for me to see a system like this surviving the first such mass infection.

Posted by ekr at 09:59 PM | Comments (60) | TrackBack

GWB and the board of Carlyle

Brad De Long quotes [*] Suzan Mazur about how George Bush got his position on Carlyle Group's Board:
Carlyle Group Director David Rubenstein: ...But when we were putting the board together, somebody [Fred Malek] came to me and said, look there is a guy who would like to be on the board. He's kind of down on his luck a bit. Needs a job. Needs a board position. Needs some board positions. Could you put him on the board? Pay him a salary and he'll be a good board member and be a loyal vote for the management and so forth.

I said well we're not usually in that business. But okay, let me meet the guy. I met the guy. I said I don't think he adds that much value. We'll put him on the board because - you know - we'll do a favor for this guy; he's done a favor for us.

Now, De Long's point is that Bush got a job which he clearly wasn't competent for on the basis of his connections. I don't like that much, but what really annoys me is that part of the pitch was that Bush would be a "loyal vote for the management". The board's loyalty shouldn't be to management but to the stockholders--it's their job to oversee the management! How is it that so few people in business can remember this?

Posted by ekr at 04:41 PM | Comments (12) | TrackBack

September 12, 2003

Programming language inventor or serial killer

Programmers should check out the Programming Language Inventor or Serial Killer quiz. I got 7 out of 10. Mostly I was guessing, but I definitely recognized one of them as John Backus, inventor of FORTRAN. Unfortunately for me, it actually turned out to be serial killer Ed Gein.

[Photos: Backus and Gein]

As you can see, it was a mistake anyone could have made.

Posted by ekr at 06:02 PM | Comments (52) | TrackBack

September 11, 2003

Recreational ultrasound

In a nice example of medical technology going mainstream, companies are offering to give you a DVD recording of a 3-D ultrasound of your fetus, complete with soundtrack. [*] Officially, of course, these services aren't being advertised as a medical service, just a memento, but people are using them as a sort of secondary medical service:
Some doctors say, however, that the 3-D images have another value -- reassuring nervous parents that their child is all right.

Though Angie's second, two-dimensional ultrasound showed no problems, the Kruses said they weren't 100 percent sure things would be all right until they visited Sneak Peek.

"I could see she had a good chin and her hands were not clenched. It was for peace of mind," Barry Kruse said.

I know, it seems silly, but I can see the appeal here of being able to double check, especially if you know there's some better technique than the one that's been used on you. Moreover, it sure sounds like--unlike hospitals--the private services are making some attempt to actually give the customer what they want:

But the private services are selling more than just a better image of the fetus. Unlike with most hospital-provided ultrasounds, private businesses hold sessions in comfortable offices with dimmed lighting and smoothing music playing. Extended family members are invited to view exams. Angie Kruse's mother, Julie Erhart, was by her side. And while diagnostic exams usually last about 30 minutes, private sessions can be twice as

Though this does make me wonder why hospitals don't do some of the same things to make women comfortable. Is it too expensive, or is it that it somehow feels new agey and unscientific?

Anyway, there's no one in this story that comes out smelling that great. There's a lot of tut tutting from the medical professionals about how it "shouldn't be used as a substitute for seeing a doctor" and it might hurt the fetus, despite the fact that ob/gyns use ultrasound like it's going out of style and there have never been any reports of ultrasound hurting a fetus. On the other hand, the people offering these services appear to be painfully naive about how medical risk works:

"We had heard that there may be some risk with additional ultrasounds -- to use them as necessary," said Barry Kruse, a business consultant. "But we had a friend who had complications during her pregnancy who had 10 ultrasounds with her child, and the child is fine."

In this particular case, it seems quite likely that the procedure is harmless and that the naysaying by doctors is just the usual reflex behavior, but some day there might be some other procedure that wasn't quite as harmless and it would be nice to get some sense that the kind of people who would offer it had a better idea of how to determine what was safe.

Posted by ekr at 08:22 AM | Comments (13) | TrackBack

September 10, 2003

In the paper of record

Someone just pointed me to this New York Times article on worms. If you turn to page 2, there's a paragraph about my study on fixes in response to disclosed vulnerabilities in OpenSSL. I wasn't contacted for this article, so I'm as surprised as anyone that I get mentioned--pleased, but surprised.
Posted by ekr at 08:07 AM | Comments (51) | TrackBack

Mark Kleiman on the MDMA study

Mark Kleiman has added his more informed take on this issue. [*]
Yet it appears that the researchers failed to investigate the causes of those deaths. Moreover, they went on to draw inferences about the effects of MDMA on humans from the observed damage to the brains of the remaining animals. That didn't seem to trouble the reviewers for Science or the administrators at the National Institute on Drug Abuse who trumpeted the findings as evidence of the dangers of MDMA. (Science is published by the AAAS, whose president, Alan Leshner, was the Director of NIDA when the grant in question was awarded; he made MDMA his particular crusade.)

It is hard to escape the thought that many of the people involved were less cautious than they might have been because the results seemed to support their already strongly-held beliefs.

I hadn't realized (though I should have) that there were such strong incentives to get any particular result in this case--as opposed to the usual scientific incentives to get the most surprising and interesting result. Given that, this kind of work probably needs to be examined carefully.

Posted by ekr at 08:03 AM | Comments (11) | TrackBack

September 09, 2003

Why is there another hole in my street?

When last we checked in, Palo Alto had just completely resurfaced my street. Well, the Palo Alto construction gnomes are back and there's a big hole in the asphalt. Apparently they're "installing water blowoff valves"--or rather trying to. The construction gnome I just talked to tells me that they're having trouble finding the pipe. Here's my question: why didn't they install the valves before they repaved the street?

Update:
Did I mention that this water construction project seems to involve turning off my water?

Posted by ekr at 11:50 AM | Comments (11) | TrackBack

Airline boarding color codes?!?!?!

Let me see if I have this right. We're going to forbid 1% of the population of the US from flying?!?! [*].
According to the Washington Post, passengers will be assigned one of three codes, based in part on their travel plans, traveling companions and the date the ticket was purchased. Sources say those coded "green" will easily pass through security checkpoints. Others will be coded "yellow" and face additional screening. An estimated 1 to 2 percent who get "red" coding will be barred from boarding and face police questioning. They may be arrested.

Now, there are more or less two possibilities here:

  1. These will be people who have done something else and the government is just using the airline checkpoint as a convenient place to screen for known offenders.
  2. 1-2% of travelers are going to be denied boarding and questioned by police because some risk profiling system flags them.

I'm not sure which of these alternatives I find more disturbing.

Posted by ekr at 07:51 AM | Comments (49) | TrackBack

What Warren Zevon really died of

I very much liked Zevon's music and even though I knew it was coming, I was sorry to hear that he had died. Coincidentally, I'd just ordered a copy of The Wind. Like everyone else, I'd heard he died of lung cancer and figured that it was all those years of smoking. But Colby Cosh is really on the case here. It turns out that Zevon didn't die of lung cancer but rather mesothelioma--which is asbestos related, not smoking related. [*]

Posted by ekr at 07:28 AM | Comments (3) | TrackBack

Uniform non-unisex bathrooms

Ever been to a restaurant that has two single-occupant bathrooms, one labelled "Men" and one labelled "Women"? Ever found that the bathroom for your gender was full and with a slightly guilty feeling popped into the one for the other gender--and noticed that it was exactly the same?

But if they're the same, what's the advantage of labelling them for individual genders? There's certainly a real disadvantage: increased average wait time. Let's assume for convenience that men and women use the bathroom equally frequently and that it takes them equal amounts of time to use the facilities. Consider what happens when two people arrive simultaneously. There's a 50% chance that they'll be the same gender and therefore one will have to wait. If the bathrooms weren't gender segregated then there would be no waiting in this case.
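
If you don't trust the 50% figure, you can enumerate the cases. This is a rough sketch under the same simplifying assumptions (equal numbers of men and women, both bathrooms empty when everyone arrives); expected_waiters is a name I made up, and it simply counts how many people are left standing in the hall.

```python
from itertools import product


def expected_waiters(arrivals, segregated):
    """Average number of people left waiting when `arrivals` people show up
    at once at two empty single-occupant bathrooms."""
    outcomes = list(product("MW", repeat=arrivals))
    total = 0
    for genders in outcomes:
        if segregated:
            men = genders.count("M")
            women = arrivals - men
            # Each labelled bathroom serves one person of its gender.
            total += max(0, men - 1) + max(0, women - 1)
        else:
            # Either bathroom serves anyone, so two people get in.
            total += max(0, arrivals - 2)
    return total / len(outcomes)


for n in (2, 3):
    print(n, expected_waiters(n, segregated=True), expected_waiters(n, segregated=False))
```

For two arrivals the segregated setup leaves half a person waiting on average, which is the 50% chance above, and the unisex setup leaves nobody waiting.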

In real life the segregated arrangement is harder on women than men, since women generally take longer in the bathroom than men. Nevertheless, except in situations where there really aren't enough bathrooms anyway, it's probably a net loss for both sexes, since men end up waiting at times when the women's bathroom is empty.

Does anyone have any idea why places--especially restaurants--do this? It's not universal, but it is quite common.

Posted by ekr at 07:19 AM | Comments (51) | TrackBack

September 08, 2003

Crank not that bad for you either

If you read my previous article on the great MDMA/crank mixup, you're probably thinking "I better stay away from methamphetamine". The truth is that even methamphetamine isn't as insanely toxic as you might have guessed either. Actually, it's available by prescription as Desoxyn, and occasionally used for ADD treatment (though other stimulants such as Adderall and Ritalin are more common.)

My guess would be that the reason that Ricaurte et al. observed so much toxicity in their experiment is that they were giving the monkeys some enormous dose of methamphetamine. Typical doses of MDMA are on the order of 150mg for a human. Typical doses of methamphetamine are on the order of 5mg. The supplier, RTI, doesn't have their catalog but if the concentration of the methamphetamine solution was anything close to that of the MDMA solution then those poor squirrel monkeys would have been getting a pretty heavy load of methamphetamine.

Posted by ekr at 08:11 PM | Comments (56) | TrackBack

The great MDMA/crank mixup

Last year, there was a paper in Science reporting that MDMA (Ecstasy, X, etc.) damaged the dopamine system in primates, thus potentially leading to Parkinson's disease. It turns out that there was a mixup with the labels and they were actually giving the animals methamphetamine (known on the street as "crank"). [*]

There's one thing that is a little confusing about all this, though. They gave five monkeys the drug and one died and another started to have serious Parkinson's-type problems right away. Now, clearly MDMA doesn't have this kind of immediate impact on humans. The authors make a stab at answering this question in the original paper:

In light of the present findings, and given the fact that MDMA use is widespread and increasing, one might ask why more cases of MDMA-induced Parkinsonism (33) have not been reported. There are multiple potential explanations, but only two will be mentioned. First, Parkinsonism does not generally become clinically apparent until more than 70 to 80% of brain dopamine has been depleted. Therefore, substantial MDMA-induced dopaminergic neurotoxicity could occur yet remain occult until unmasked by other processes (such as drug-induced interference with dopaminergic neurotransmission or decline in brain dopamine with advancing age). Second, until now, the potential for MDMA to damage brain dopamine neurons in primates has not been appreciated and, therefore, MDMA neurotoxicity has not been considered in the differential diagnosis of Parkinsonism in young adults. It is possible that some of the more recent cases of suspected young-onset Parkinson's disease might be related to MDMA exposure but that this link has not been recognized.

Still, you'd think that if MDMA were this toxic, we'd be seeing a lot more reports of death by MDMA. The fact that we don't was always kind of suspicious. To tell you the truth, I feel for these guys. It's great to get some really surprising result and think "I've got this fantastic paper" but at least for me, that elation also comes with a nagging voice in the back of my head that says "maybe it's all wrong and you've made some obvious mistake". I've never had that voice be right yet, but I'm always worried that if I don't listen to it, it will be.

Posted by ekr at 07:52 PM | Comments (51) | TrackBack

Don't import your drugs from Nebraska

Tech Central Station is carrying an article by Conrad F. Meier that argues that buying drugs via the Internet isn't safe. What's confusing here is that Meier seems to think that this is an argument against reimportation of drugs from Canada:
The Food and Drug Administration (FDA) warns re-imported drugs raise serious safety concerns since they could be counterfeit, contaminated, expired, or mislabeled. They won't vouch for the quality of re-imported foreign drugs or those sold over the Internet since there is no way to tell the origin of the drugs, their quality, their effectiveness, or if they endanger our health.

He then goes on to list a bunch of examples of counterfeit drugs being sold in various locales, such as Turkey, Haiti, and Lebanon. This list would be a lot more convincing if it didn't also include Kansas City, Florida, and Nebraska. Moreover--at least as Meier describes it--none of the scams seem to have relied on the Internet at all.

As far as I know, there's no reason to believe that buying drugs over the Internet from any reputable provider is any less safe than buying them from your local drug store. Frankly, I trust Amazon.com more than I trust my local Longs Drugs. Similarly, I don't know of any evidence that drugs reimported from Canada are any less safe than those originally sold in the US. If Mr. Meier knows of any such evidence, he should raise it. If not, he should stop spreading FUD.

Posted by ekr at 06:26 PM | Comments (12) | TrackBack

Could we increase the information content of spoken language?

Spoken language is not a very efficient carrier of information. Try listening to someone over a lousy wireless telephone and you'll notice that you can still understand almost all of what they are saying despite the static or dropouts. The reason you can do this is that there is an enormous amount of redundancy in speech. So, even if you lose part of the audio stream there's still plenty to allow you to reconstruct what's being said.

But this very redundancy means that if there isn't any noise you should theoretically be able to communicate more information. This is a topic close to my heart, as I often wish that I could get my point across more quickly.

Roughly speaking, there are two ways to communicate more information over a given channel:

  1. Increase the symbol rate. In this case, the number of words per second.
  2. Increase the amount of information conveyed per symbol. In this case, increase the number of words that people regularly use.

Symbol Rate
Now, the easiest way to increase the symbol rate is simply to speak faster. Unfortunately, as anyone who has had the opportunity to listen to me speak when I get excited knows, just speaking faster generally means that the speaker starts to run words together and rapidly becomes unintelligible. So, while there's probably some headroom here, I don't think there's much.

If we're going to increase the symbol rate, then, we need to shorten the words so that less sound needs to actually be produced to generate a given word. There's no hope of doing real compression (our brains aren't built for it), but you can certainly imagine that we could shorten words to the extent that they were as short as they could be and still be understandable. In fact, this happens all the time in technical fields. Abbreviations and jargon are basically forms of symbol compression.

I don't know how much room there is here, though, either. There are already a lot of words which sound alike and I'm not sure that you could actually shorten that much without having a fair amount of speech start to be unintelligible.

Number of Symbols
The other way to increase the information content of a channel is to increase the number of distinguishable symbols. We do this all the time: whenever we invent a new word we increase the number of possible symbols by one. How long can this go on, though? We're already at the point where the number of short words is running out, so new words sound like old words (AIDS, aids, aides) and you need context in order to distinguish them. So, there's a limit--though I don't know exactly what it is--to how many new words we can create before we need to start making them longer, which of course affects the symbol rate and therefore the overall information rate.

There's more than one way to skin this cat, though. Any particular language only uses a subset of the number of possible sounds that the human vocal apparatus can produce and the human auditory system can process. For instance, English isn't a tonal language--whether a given syllable is on a rising or falling tone doesn't affect its meaning. But Thai and Cantonese are. Similarly, Xhosa has three "click" sounds as syllables. Introducing these sounds into our language would let us produce many more different sounding words that were still short. [0]
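
To get a feel for how the two levers compare, here's a back-of-the-envelope calculation. It assumes every word is equally likely, which real speech is nowhere near, so treat it as an upper bound and nothing more; the speaking rate and vocabulary sizes are invented numbers, and info_rate is my own helper.

```python
import math


def info_rate(words_per_second, vocabulary_size):
    """Upper bound on bits per second if every word were equally likely."""
    return words_per_second * math.log2(vocabulary_size)


baseline = info_rate(2.5, 2 ** 14)      # 2.5 words/sec, 16,384 words
faster = info_rate(3.0, 2 ** 14)        # speak 20% faster
bigger_vocab = info_rate(2.5, 2 ** 15)  # double the vocabulary

print(baseline, faster, bigger_vocab)
```

Under these made-up numbers, doubling the vocabulary only buys one extra bit per word, so it helps less than talking 20% faster. New sounds help mostly by keeping all those extra words short.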

Would it work?
Sounds great in theory, right? Unfortunately, I don't know if this could be made to work in practice. There are a number of possible objections:

  1. The new sounds aren't actually distinguishable from the old ones. Consider a simple analogy: if you have an electronic communication system with a 1 volt detector resolution you can use either levels of (0 volts, 1 volt) or (.5 volts, 1.5 volts) but if you mix them then the signals will be unintelligible. The same thing may be true on a more sophisticated level with phonetic features.
  2. There's an upper limit to the information processing capacity of the human brain and you wouldn't be able to think fast enough to process a faster rate of information. Or, it might simply be the case that your auditory system couldn't encode or decode it fast enough.

One way to try to figure this out would be to see if the information content of speech is constant across languages. If it is, it seems likely that we're fairly close to the upper limit. If not, then it seems less likely. I don't know the answer to any of these questions and a few queries to linguistically savvy types haven't turned anything up. If any EG readers know more about this topic and can make suggestions, the fast talkers of the world would be grateful.

[0] Those of you familiar with communication theory will recognize this tactic as increasing the number of levels that we use in our communication channel.

Posted by ekr at 06:12 PM | Comments (46) | TrackBack

September 07, 2003

Grading and systematic bias

One thing that you do have to worry about with automatic grading is systematic bias. Let's consider a simple problem: determining whether texts are in French or English (let's assume they must be in one or the other).

It's pretty clear how to do this with human graders. We take a bunch of graders who know both languages and ask them to identify them. Now, assume that the people are perfect at doing the identification but occasionally hit the wrong keyboard key. This gives us an error rate of (warning: entirely made up number here) 2%.

Now, we could write some really sophisticated computer algorithm that attempts to emulate the human process, but that would be a lot of work. It turns out that there's a really simple rule that will do a pretty reasonable job: look to see if the text has any accents in it. Most French texts of reasonable length have an accent or two in them and almost no English texts do. So, this rule will give us reasonable discrimination with almost no programming effort at all. (Detecting accents is incredibly easy).

Now, this rule isn't perfect. Some French texts don't have accents and some English texts do (some people write resume with an acute accent on the final e). But it's pretty good. I have no idea what the error rate is, but I wouldn't be surprised if it was down in the 2% range, so let's just pretend it's 2%.

So, the error rates of human and machine language discrimination are identical, so they're equally good, right? Well, maybe. As we've described it, the humans make random errors which aren't dependent on the text. So, if you gave the same text to a human again--even the same human--they'd most likely get it right. By contrast, the machines make systematic errors. If we feed the machine the same text again it will make the same mistake. Moreover, we can predict which texts the machine will get wrong.
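
Here's roughly what the two graders look like in code. It's a sketch of the toy problem above, not of any real grading system: the list of accented characters is a partial one I picked by hand, the sample sentences are mine, and the 2% slip rate is the same made-up number as before.

```python
import random

# A hand-picked, incomplete set of accented characters.
ACCENTED = set("àâäçéèêëîïôöùûüÿœÀÂÄÇÉÈÊËÎÏÔÖÙÛÜ")


def machine_guess(text):
    """The accent rule: any accented character means French."""
    return "French" if any(ch in ACCENTED for ch in text) else "English"


def human_guess(true_language, rng, slip_rate=0.02):
    """A perfect reader who hits the wrong key slip_rate of the time."""
    if rng.random() < slip_rate:
        return "English" if true_language == "French" else "French"
    return true_language


rng = random.Random(2003)
tricky = "Please find my résumé attached."  # English, but with accents

# The machine gives the same wrong answer every single time.
print([machine_guess(tricky) for _ in range(3)])
# The human is occasionally wrong, but not predictably on this text.
print([human_guess("English", rng) for _ in range(3)])
```

Note that the machine is also reliably wrong in the other direction: "Le chat dort sur le lit." has no accents, so the rule calls it English.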

What does this mean in the context of testing? Well, if there's systematic error in the machine scoring system then some people will likely get a consistently inaccurate score for having a given writing style. On the other hand, most everyone else will get an accurate score. By contrast, if we have a random error then everyone has to suffer some chance of getting an inaccurate score. Which is more fair? I don't know, but it bears some thinking about if we're going to have informed opinions about what kind of testing we want to have.

There is one important sense in which we should worry about systematic bias. If it turns out to be possible to exploit that bias to fool the grader without writing a good essay (to extend the example above, by adding a false accent mark), then the scoring machines will get increasingly inaccurate. That sort of systematic bias is important to watch out for since it corrodes the accuracy of the system. However, as I said in the previous post, how do we know that that's not already happening with human graders?

Posted by ekr at 08:25 PM | Comments (61) | TrackBack

So what if a computer grades your essay?

Just read this quite interesting article in the New York Times on computerized grading of essays for things like the GMAT. The way that these systems work is that they're trained on a sample corpus of essays which are also scored by human professionals.

"Our aim is that the system agrees with a human reader as often as two human readers agree with each other," said Jill Burstein, a computational linguist at the Educational Testing Service and the leader of the team that developed e-rater. "The goal is to simulate the human score."

Apparently they're getting pretty close. Current systems are about 97-98% accurate.

What's really interesting is how primitive these systems are. Basically, they look for a relatively small number of linguistic features:

For example, a high score almost always contains topically relevant vocabulary, a variety of sentence structures, and the use of cue terms like "in summary," for example, and "because" to organize an argument. By analyzing 50 of these features in a sampling of essays on a particular topic that were scored by human beings, the system can accurately predict how the same human readers would grade additional essays on the same topic.

This is actually quite a familiar story in what's called "machine learning". Rather simple approaches to text analysis turn out to be surprisingly powerful. For instance, the Bayesian Filters that have gotten quite popular for spam filtering basically depend on the fact that some words are more popular in spam than in non-spam email.
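
As a rough illustration of how little machinery this takes, here is a minimal naive Bayes sketch. The training messages are made up, the function names are hypothetical, and this is not the implementation of any particular spam filter; it only shows the idea of scoring a message by which words are more common in spam than in non-spam.

```python
import math
from collections import Counter

def train(messages):
    """Count how many messages contain each word."""
    counts = Counter()
    for text in messages:
        counts.update(set(text.lower().split()))
    return counts

def spam_log_odds(text, spam_counts, ham_counts, n_spam, n_ham):
    """Sum per-word log-likelihood ratios, with add-one smoothing.

    A positive result means the words look more like spam than non-spam.
    """
    score = math.log(n_spam / n_ham)  # prior odds from the training mix
    for word in set(text.lower().split()):
        p_spam = (spam_counts[word] + 1) / (n_spam + 2)
        p_ham = (ham_counts[word] + 1) / (n_ham + 2)
        score += math.log(p_spam / p_ham)
    return score

# Tiny made-up training corpus.
spam = ["cheap pills online", "cheap mortgage offer", "free pills offer"]
ham = ["lunch on friday", "draft of the protocol spec", "free for lunch"]
spam_counts, ham_counts = train(spam), train(ham)

print(spam_log_odds("cheap pills", spam_counts, ham_counts, len(spam), len(ham)))
print(spam_log_odds("lunch friday", spam_counts, ham_counts, len(spam), len(ham)))
```

The smoothing keeps a word seen in only one class from producing a zero probability, which would otherwise make the logarithm blow up.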

Naturally, there's a fair amount of complaining about this sort of grading from certain quarters:

Julie Cheville, an assistant professor of literacy education at Rutgers University and the local director for the National Writing Project, which promotes professional development for writing teachers, is among those skeptical of such an approach. "To be scored, writing needs to be formulaic, and formulaic writing has never been the trademark of effective writers," she said. "At the moment, what automated scoring technologies can do is scan, count and score. They orient students to errors, not to meaning. Vacuous student essays can receive high marks only because they are error-free."

While Cheville is in some sense correct, I think this is looking at the situation backward. If the software is able to get good agreement with human graders it's most likely because the human graders are already grading in a formulaic way. That's what she should be objecting to, not that we're using software to get the same results that human graders do.

Posted by ekr at 08:05 PM | Comments (21) | TrackBack

September 06, 2003

P2P == Piracy 2 Porn? You've gotta be kidding me

The NYT is reporting that the RIAA is attempting to get legislation to restrict peer to peer file sharing technology on the grounds that it's used to trade pornography--including, of course, the trump card in all such discussions, child pornography. I won't bother to list the ridiculous restrictions that the RIAA favors on P2P. I'll merely say that pornography is so pervasive on the ordinary Web that anyone who has the least bit of trouble finding it without using a P2P network is a complete moron. Restricting P2P will have zero effect on people's ability to consume pornography. However, it will have a big effect on people's ability to share music. Gee, you don't think that the RIAA is using pornography as a pretext to attack music sharing do you?
Posted by ekr at 04:01 PM | Comments (20) | TrackBack

A potential solution for tax ambiguity

Kevin Dick and I were discussing the complexity of tax law and the fact that the IRS can't seem to give people correct answers to tax questions. In the course of this conversation, Kevin pointed out that if it's possible to understand the tax law in an unambiguous way--which would seem to be a basic requirement of fairness--then it should be possible to write a piece of software that encodes that information.

This fact suggests a simple answer to the problem of complexity or ambiguity. The IRS simply writes a piece of software that encodes the rules. If you use that software to prep your taxes, then provided you've answered the questions honestly, its result is the correct interpretation. Alternatively, the IRS could bless TurboTax (or Taxcut or whatever) as the official tax software. Now, of course this software would be wrong some of the time, but at least it would be unambiguous, which is more than can be said for the current system.
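
A toy sketch of what "rules as software" means in practice follows. The field names and every number in it are hypothetical and are not actual tax law; the point is only that once the rules are code, the same answers always produce the same liability.

```python
from dataclasses import dataclass

@dataclass
class Answers:
    """Answers a filer gives to the software's questions (hypothetical fields)."""
    wages: int
    dependents: int

# Hypothetical rule values for illustration only; not actual tax law.
STANDARD_DEDUCTION = 5000
PER_DEPENDENT_CREDIT = 500
FLAT_RATE_PERCENT = 20

def tax_owed(a: Answers) -> int:
    """Deterministic: the same answers always yield the same liability."""
    taxable = max(0, a.wages - STANDARD_DEDUCTION)
    tax = taxable * FLAT_RATE_PERCENT // 100
    return max(0, tax - PER_DEPENDENT_CREDIT * a.dependents)

print(tax_owed(Answers(wages=30000, dependents=2)))
```

Whether the encoded rule is the right rule is a separate question, but there is no longer any room to argue about what the rule says.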

Now, it could be argued that the tax law is so complicated or ambiguous that it's not possible to write such a piece of software. However, if that's so, then I would think it's a pretty clear signal that things have gotten completely out of control. If a team of programmers and experts can't implement the tax code, how can any individual be expected to?

Posted by ekr at 02:47 PM | Comments (51) | TrackBack

September 05, 2003

ISDN back on the air!

The SBC tech just left. Turns out the box at the demarc was bad. He replaced it. Now everything works great--or at least as great as ISDN ever did. Now, with any luck the intermittent failures I was seeing before were a bad box and not a bad modem.

It's funny. All the ISDN field techs I've ever dealt with were competent and pleasant to deal with. The phone support techs were only so-so. What's really weird here is that back in the days when I had DSL, the DSL techs were generally pretty bad. I can't tell if it's just random chance, observer bias, or what.

But at least I have decent connectivity again.

Update:
I just reread this, and, you know, it's funny. The field tech was competent but nothing special. He just showed up and got the job done. On the other hand, SBC service is generally so terrible that I'm incredibly grateful to get even that.

Posted by ekr at 10:17 AM | Comments (15) | TrackBack

September 04, 2003

Progress with the ISDN line

A friend lent me a Cisco router and an NT-1 so I was able to independently test the ISDN line with that. After we tested it (thanks Rohan) and it didn't work we were able to conclude that the ISDN line was really hosed. I called Pac Bell and this time when they tested the line it showed in service even though the modem was unplugged--indicating that things are hosed somewhere on Pac Bell's side of the system. They're sending someone out to check the cables.

In retrospect I should have just insisted they send someone out when I talked to them on Friday but I let myself get scared off by the possibility they'd charge me if it really was my problem. I guess having to use a modem for the past week is my punishment for being cheap.

Posted by ekr at 09:48 PM | Comments (12) | TrackBack

Does anyone understand the tax law?

The Boston Globe is running an article on the Treasury's study of IRS accuracy.
IRS employees provided complete and correct answers to 45 percent of the questions asked by auditors, and correct but incomplete answers in 12 percent of the cases.

IRS employees told the auditors to do their own research in IRS publications to find the answers in response to 12 percent of the questions, despite an IRS policy banning the practice.

Incorrect answers were given to 28 percent of the questions. The questions most commonly answered incorrectly dealt with the earned income tax credit, education credit and dependents.

Whenever I read one of these articles, I get the impression that I'm supposed to come away thinking "Gee, those IRS people are really stupid." But I seem to remember hearing that similar surveys of tax attorneys result in a lot of wrong answers too. Maybe the problem is that the tax law is so complicated that no one can understand it.

Posted by ekr at 05:56 PM | Comments (53) | TrackBack

Kinsley on Schwarzenegger

Just read Michael Kinsley's article in Slate about the whole Schwarzenegger/Oui magazine fracas. Kinsley argues that the press should report politicians' sexual behavior because "Some sexual habits reflect an attitude toward other people, especially women, that is worth knowing about in the voting booth."

With respect to Schwarzenegger in particular, the interview in question has Schwarzenegger saying:

Once in Gold's gym there was a black girl who came out naked. Everybody jumped on her and took her upstairs, where we all got together. But not everybody, just the guys who can f*** in front of other guys.
Kinsley doesn't like this. He says:
Did this parody of a testosterone fantasy really happen? (Kaus quotes Mr. Gold himself saying that Gold's gym had no women members back then.) But if it did happen, exactly as Arnold described it in 1977, it's pretty disgusting. It's disgusting even if it was consensual all around. It's disgusting even though Arnold wasn't married at the time.

Maybe it's just me, but I don't see what's so disgusting here. Assume for the moment that it was consensual. If it wasn't, that's a totally different story. So, Arnold engaged in some sort of group sex experience. That's disgusting why? It might not be my taste, or Kinsley's, but so what?

Of course, Kinsley could just mean that he finds it personally yucky, like eating lutefisk or broccoli. But surely that's not a reason not to elect someone governor. Based on the beginning of the article, Kinsley presumably thinks that it reflects an attitude towards women that makes him potentially unfit to be governor. I'd be interested in hearing just what Kinsley thinks that attitude is. Strangely, Kinsley's article doesn't say. I wonder if he has an answer or if he just thinks it's yucky.

Posted by ekr at 12:55 PM | Comments (61) | TrackBack

September 03, 2003

A response to AMTP-guy's FAQ

I did check out the AMTP FAQ and it didn't leave me very excited. Here's his explanation for why he has to have a new protocol:
3. Why not add this capability to SMTP as an option?

This solution will only work if it is exclusive of existing practice. In order to solve the problem we must stop accepting traffic from non-trusted sources.

AMTP is based upon SMTP, so it can make use of existing SMTP extensions and existing code. It is designed to make the transition relatively painless for system operators and mail server/client authors.

AMTP will deploy on a different port number than SMTP, allowing existing servers to exchange traffic using both protocols for a period of time, and then, after a sufficient amount of time, simply stop listening on port 25.

This is just silly. All you would have to do to make it "exclusive of existing practices" is require the cryptographic authentication and policy transmission in your server's receive configuration. There's no need for a new port here.
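
A minimal sketch of that receive-side check is below. The `Session` fields and the `require_auth` switch are hypothetical names chosen for illustration; they are not part of SMTP, AMTP, or any real mail server's configuration. The point is just that exclusivity is a policy decision at the receiver, not a property of the port.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Session:
    """What a receiving server knows about an inbound connection (hypothetical)."""
    client_cert_verified: bool
    policy_marker: Optional[str]  # the sender's attested policy, if any

def accept_mail(session: Session, require_auth: bool) -> bool:
    """Receive-side policy: with require_auth set, refuse unauthenticated senders."""
    if not require_auth:
        return True  # legacy behavior: accept from anyone
    return session.client_cert_verified and session.policy_marker is not None

legacy_sender = Session(client_cert_verified=False, policy_marker=None)
extended_sender = Session(client_cert_verified=True, policy_marker="bulk")

print(accept_mail(legacy_sender, require_auth=False))
print(accept_mail(legacy_sender, require_auth=True))
print(accept_mail(extended_sender, require_auth=True))
```

A server operator who wants the exclusive behavior flips the one setting; everyone else keeps running the same protocol on the same port.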

Posted by ekr at 09:02 AM | Comments (36) | TrackBack

The 987th proposal for getting rid of spam

So, this thing called Authenticated Mail Transfer Protocol (AMTP) is getting some press, if you can call getting onto /. press (I do). Here's how the author describes it:
This is the home of the AMTP protocol. AMTP is being designed as a possible replacement for SMTP, with security features designed to reduce the impact of Unsolicited Bulk Email (UBE) and the cost of running mail servers.

Let's get the cheap shot out of the way: AMTP isn't really a new protocol. It's a simple extension to SMTP. I have no idea why the author thinks that it needs to get a new name and run on a separate port. One of the major design principles of the Internet is to reuse existing protocols whenever possible, and definitely not to go forking them when you don't have to. AMTP could perfectly well have been written up as "Anti-Spam Extensions to SMTP", but I guess that wouldn't have had the cachet of being able to say "Hey, I designed a new mail protocol".

Moreover, after some reading, it becomes clear that even though AMTP is gratuitously different, it's really nothing new. It's yet another shot at the ever-popular combination of two well known techniques:

  • Requiring cryptographic authentication.
  • Requiring distinctive headers in the message that can then be filtered on. [0]

"Ok", I can hear you saying... "Whether something is new or not may matter to academics, but I just care if it will stop me from getting mail offering to increase my penis size. Will it?" The answer here is "Yes, if". As in "Yes, if you are willing to assume a bunch of things that aren't true".

The way that AMTP is supposed to work is that all AMTP senders are required to use SSL and authenticate with a certificate. They're also required to attest to some set of policies that apply to the message. The idea here is that you filter on the policy indicator and then if someone lies you track them down and do... something nasty, I suppose. Maybe give them the finger or something. Or maybe you just tell all your friends that they're jerks.

There are a number of big assumptions here:

  1. Every mail server has to get a certificate. Not just any certificate mind you, but one that identifies them, or at least is expensive.
  2. There needs to be some way to punish people who attach false policies to their email.
  3. Mail servers need to actually treat AMTP mail better than SMTP mail in some way.

All of these assumptions are highly questionable. Let's start with (1) and (2). As I said earlier, the whole scheme depends on having some way to punish people who lie about their policies. Otherwise spammers will just lie all the time. But the only information you have about someone's actual identity is the certificate that they used. This gives you basically two choices:

  1. Find out their real name and punish them in the real world.
  2. Put their certificate on a black list of "policy-liars" so that no one will ever accept their mail.

But either of these approaches means that the certificates have to mean something. Approach (1) means that the certificate issuer needs to actually know your real name. They'll need to charge you to enforce that. Approach (2) requires that it not be easy to get a new certificate. Otherwise, people will just get a new cert as soon as they get blacklisted. In either case the kind of certificates that one needs are going to be relatively expensive to get (on the order of $50-200). If you look at the history of SSL/TLS so far, you'll see that that's a pretty substantial deterrent to getting certificates.
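
The blacklisting argument can be sketched as a toy loop. The function name, the certificate labels, and the prices are all hypothetical; this models the reasoning above, not any real certificate authority or blacklist service.

```python
def spam_run(cert_price, runs):
    """A sender who gets blacklisted after every run and simply re-certifies.

    Returns (runs that got through, total spent on certificates).
    """
    blacklist = set()
    delivered = 0
    spent = 0
    for n in range(runs):
        cert = f"cert-{n}"      # a fresh certificate for each run
        spent += cert_price
        if cert not in blacklist:
            delivered += 1      # receivers have never seen this cert
        blacklist.add(cert)     # receivers blacklist it afterwards
    return delivered, spent

print(spam_run(cert_price=0, runs=10))
print(spam_run(cert_price=100, runs=10))
```

In both cases every run gets through; the blacklist never actually blocks anything. The only thing that changes is what the sender pays, which is why the scheme stands or falls on the certificate price.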

So, if it's a pain to get certificates, why would a sender do it? The only reason they would is if it means that receivers are going to treat their mail better (assumption (3) above). Will they? It's highly doubtful. At the beginning of AMTP deployment, there are very few AMTP users and so almost all of your mail will come via SMTP. Thus, you can't give AMTP much preference or you risk missing important e-mail. Thus, the pressure on senders to use AMTP is low. Now, maybe when half the world is AMTP the situation will change, but how are we going to get there when it's so undesirable at the very beginning?
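
The bootstrapping problem can be illustrated with a toy tipping-point model. The model itself is an assumption made for illustration (the tipping point, step size, and function name are invented, and real adoption is messier): receivers only reward AMTP once enough of their mail already arrives that way, and senders only bother with certificates when receivers reward it.

```python
def final_adoption(start, tipping_point, steps=50, step_size=0.05):
    """Iterate a toy adoption model and return the final fraction of AMTP senders."""
    adoption = start
    for _ in range(steps):
        if adoption >= tipping_point:
            # Receivers can afford to prefer AMTP, so more senders join.
            adoption = min(1.0, adoption + step_size)
        else:
            # No preference from receivers, so paying senders drift away.
            adoption = max(0.0, adoption - step_size)
    return adoption

print(final_adoption(start=0.02, tipping_point=0.5))
print(final_adoption(start=0.60, tipping_point=0.5))
```

Starting from a small base the model collapses to zero, and it only takes off if adoption somehow begins above the tipping point, which is exactly the part nobody has an answer for.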

The answer, of course, is that we're not likely to. There have been a number of other attempts at this same general approach and they've all gone nowhere for the same reason: the startup cost is way too high for the value that they provide. I don't see that AMTP is any different.

[0] Strictly speaking what AMTP has is markers, not headers, since they go in the mail transport but not the mail message. The difference is pretty much irrelevant.

Posted by ekr at 08:50 AM | Comments (54) | TrackBack

September 02, 2003

Vaccinate for flu to help with SARS?

Here's an interesting article. The WHO is recommending increasing flu vaccination in order to make it easier to deal with SARS. The idea isn't that the flu vaccine prevents SARS--it doesn't--but that it reduces the confusion over whether people have the flu or SARS, since they're less likely to have flu if they're vaccinated.

I'm not sure I buy this rationale. The flu vaccine isn't 100% effective and so there will still be plenty of flu cases. On the other hand, even without SARS, vaccinating against flu would probably be a good idea. It's estimated that 36,000 people die of flu yearly in the US alone. Only about 1000 people died of SARS last year worldwide. I guess if fear of SARS makes people get vaccinated for flu, that's a good effect in and of itself.

Posted by ekr at 09:34 AM | Comments (55) | TrackBack

September 01, 2003

The cyborgs are coming

I think we're entering the age of the business cyborg. So, lots of people, nerds especially, carry cell phones or PDAs, but those are pretty clearly tools that are distinct from your body. When I was in DC I saw a guy wearing one of those Jabra Bluetooth headsets. You can pretty much wear one of these puppies all day long. When I talked to cyborg man, it appeared that that was exactly what he did. At this point, the cell phone is more of an extension of your body, especially if you have voice dialing and answer. Now if I could just get that portable heads-up display and EEG input device...
Posted by ekr at 04:12 PM | Comments (57) | TrackBack