December 31, 2003

Time for an archival filesystem?

Like many UNIX-heads, I use tape backup, but I'm starting to push the limits of my backup system. I've got about 30 GB of data on my system and the 8mm tapes I use have a capacity of about 8 GB. Now, most of this data doesn't change, but it's still a pain to back stuff up and pull it off backup. Here's the thing, though. Hard drives have gotten so cheap (< $1/GB) that I'm thinking of converting to them as a backup medium. Amanda, the backup system I use, can be set to just back up to disk. Even so, this isn't optimally convenient, since the data is stored in UNIX dump files and so getting the data off the backup is a pain.

What would be maximally convenient would be an archival filesystem like the one used in Plan 9 [*]. The really cool thing about it is that it presents nightly snapshots of the filesystem state. So, for instance, the entire filesystem state as of December 31, 2001 would be stored under /n/dump/2001/1231. This is much more convenient than conventional backup systems.

Unfortunately, I don't know of an implementation of a filesystem like this for FreeBSD. In theory, you could just do it with a copy, but that would consume way too much storage. Plan 9 uses some optimization to avoid duplicate data. You'd need to have something like that to make this work in practice.
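To make the "avoid duplicate data" idea concrete, here is a minimal sketch of one way to do it: store every block once, keyed by its hash, and let each nightly snapshot be nothing more than a list of block hashes per file. This is only an illustration of the general technique, not a description of how Plan 9 actually implements it. The block size, the class and function names, and the example paths are all made up for the sketch.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical block size, chosen only for illustration


class BlockStore:
    """Stores each distinct block once, keyed by its SHA-1 hash."""

    def __init__(self):
        self.blocks = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha1(data).hexdigest()
        self.blocks.setdefault(key, data)  # a duplicate block costs nothing extra
        return key

    def get(self, key: str) -> bytes:
        return self.blocks[key]


def snapshot(store, files):
    """Record a snapshot as {path: [block hashes]}; the data itself goes into the store."""
    snap = {}
    for path, content in files.items():
        snap[path] = [
            store.put(content[i:i + BLOCK_SIZE])
            for i in range(0, len(content), BLOCK_SIZE)
        ]
    return snap


def read_file(store, snap, path):
    """Reassemble a file from the block hashes recorded in a snapshot."""
    return b"".join(store.get(key) for key in snap[path])


store = BlockStore()
day1 = {"/home/user/paper.txt": b"a" * 10000, "/home/user/notes.txt": b"b" * 5000}
day2 = dict(day1)
day2["/home/user/notes.txt"] = b"b" * 4096 + b"c" * 904  # only the tail of one file changes

snapshots = {"2003/1230": snapshot(store, day1), "2003/1231": snapshot(store, day2)}
stored_bytes = sum(len(block) for block in store.blocks.values())
naive_bytes = sum(len(c) for day in (day1, day2) for c in day.values())
print(stored_bytes, "bytes stored, versus", naive_bytes, "for two full copies")
```

Both snapshots stay fully readable, but the second night only costs the one block that changed.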

Posted by ekr at 06:09 PM | Comments (111) | TrackBack

December 30, 2003

Not exactly the return of Black Flag

William Shatner will be making another album, [*] following up on the success of his smash hit album The Transformed Man. The guest performers include punk legend Henry Rollins. Does anyone else think this is strange?
Posted by ekr at 10:07 PM | Comments (19) | TrackBack

December 29, 2003

This really makes me want OnStar...

Perry Metzger pointed me to this article on people's concerns about privacy with the OnStar system. Now, I knew about the remote monitoring issue [*], but here's the part that I noticed
OnStar has said that its equipment was not involved in that case. An OnStar spokeswoman, Geri Lama, suggested that Mr. Dunnam's worries were overblown. The signals that the company sends to unlock car doors or track location-based information can be triggered only with a secure exchange of specific identifying data, which ought to deter all but the most determined hackers, she said.

I'm kind of curious what "secure exchange of specific identifying data" means. Sometimes when you hear "ought to deter all but the most determined hackers" it means "We know what we're doing. We designed our system securely but we're being careful about what we claim." Sometimes it means "we bungled things". It's easy to get these protocols wrong. If anyone has a pointer to documentation on their protocols I'd love to take a look.

Posted by ekr at 10:34 PM | Comments (10) | TrackBack

More on the externalities of gasoline

Interestingly, the CBO report cites an NRC report that estimates a very low externality cost of gasoline:
While acknowledging uncertainty, the NRC tentatively suggested an estimate of 12 cents to reflect the cost of carbon emissions resulting from a one gallon decrease in gasoline consumption (which corresponds to a cost of $50 per metric ton of carbon). Further, it suggested an energy-security cost associated with consuming one gallon of gasoline of 12 cents (which corresponds to a cost of $5 per barrel of oil). Finally, the NRC estimated a cost of 2 cents per gallon due to emissions of air pollutants associated with the production and distribution of gasoline, resulting in total external costs of 26 cents per gallon.

The CBO report says that the gas tax is currently $.41. If these estimates are correct, then the gas tax is too high right now.

Again, I'm not qualified to review any of this data in detail, and I haven't read the NRC report. However, if we accept these results, it sure looks like at worst it would take a modest increase in gas taxes to achieve an efficient outcome.

Posted by ekr at 09:22 AM | Comments (57) | TrackBack

The CBO on a gas tax

The Congressional Budget Office has just issued a report arguing that fuel economy standards are much less economically efficient than a tax on gasoline. (Link via Brad DeLong). They argue that a $.46/gallon tax increase would achieve the same long run efficiency improvement as an increase in CAFE standards to 31.3 mpg for cars and 24.5 for light trucks.

Here's the really interesting bit:

The advantage of a gasoline tax over CAFE standards is much greater in the short run. Neither the higher tax nor higher CAFE standards would achieve full effectiveness until all existing vehicles were replaced, or after about 14 years in CBO's analysis. But over the initial 14 years, the tax would save 42 percent more gasoline than would CAFE standards with trading, while costing 27 percent less (see Summary Figure 1). The gasoline tax would outperform the CAFE standards because, while both policies would improve the fuel economy of new vehicles, the tax would produce greater immediate gasoline savings by inducing owners of both new and existing vehicles to drive less. In contrast, by making new vehicles cheaper to operate, higher CAFE standards would encourage owners of new vehicles to drive more (and would not affect the driving incentives of existing-vehicle owners at all).

Given the uncertainty of predictions beyond 10 years, decreasing consumption now is much more interesting than decreasing it 14 years hence, so this is an even stronger argument in favor of gas taxes.

I'm not really qualified to review their economic modelling, but this argument sounds intuitively right to me.

Posted by ekr at 09:19 AM | Comments (11) | TrackBack

December 28, 2003

Prison finally too expensive?

It looks like America's enthusiasm for harsh prison sentences may finally give way to the American taxpayer's cheapness, at least in California [*]:
Shorter and alternate sentences and speedier release programs already were being implemented by the state's Department of Corrections - or were part of a recent lawsuit settlement over California's parole system.

But senior administration officials said last week Schwarzenegger has asked them to consider further steps. The state faces a budget deficit that could grow to between $12 billion to $24 billion by the middle of 2005 if current spending and revenues don't change.

The changes would reverse years of a get-tough policy on criminals under California's last three governors, and could face opposition from Republican lawmakers who make up a minority in the Legislature.

Seeing as about 11% of the CA general fund (not including bond-based expenditures, which are large) is spent on courts and corrections, this could be a pretty big deal.

Posted by ekr at 09:49 AM | Comments (50) | TrackBack

December 27, 2003

What is the deal with political Googlebombing?

I've noticed that some of the liberal blogs seem to be doing some political Googlebombing, linking, for instance, "optimistic" to Howard Dean or "miserable failure" to George Bush. The idea behind Googlebombing is that when people type the chosen search key into Google, they get the target. So, for instance, when you type "miserable failure" into Google, you get this page, the first hit on which is George Bush's biography. (In what's perhaps an amusing error, the second hit is the biography of Jimmy Carter).

Now, maybe I'm just radically naive about how political campaigning works, but I don't see the point of this. So, what? Random people are going to type "miserable failure" into Google, get Bush's biography and think "ah, what a loser, I'll vote for Dean". That can't be right, seeing as most people would never type "miserable failure" into Google if they hadn't heard of this little hack. Surely the purpose isn't that. Rather, I'm assuming that the ostensible purpose is PR: people will hear about it and will sort of subconsciously make the association. Frankly, I doubt it. Sure, people who already hate Bush (or love Dean or whatever) will feel all ra ra about it, but speaking as someone who just dislikes Bush, I find the whole thing rather juvenile, the Internet version of throwing a pie in Bush's face.

Like many such political stunts, I suspect that its real purpose is not to directly advance one's political agenda but rather to serve as a sort of act of group bonding for the faithful. In which case, I guess it doesn't matter if it looks silly to people like me who are not--provided of course that it doesn't alienate the outsiders more than it energizes the faithful.

Posted by ekr at 09:35 AM | Comments (54) | TrackBack

Who blew up France?

The following is idle speculation resulting from watching too much History Channel. The Germans took most of Europe, including France and Belgium without a fight. The result was that the physical infrastructure wasn't that damaged. Obviously, the Nazis did a lot of non-infrastructure damage, due to their policy of slaughtering large segments of the conquered population, but I'm just talking infrastructure here. But when you watch documentaries or movies, it's obvious that Europe was horribly damaged after WWII. Who is responsible for that?

My guess is that the Allies did most of the damage during our campaign to push the Nazis back into Germany. Now, no doubt the Nazis did some of the damage during that period as well, but by then the Allies had better air support and more materiel so we were able to blow more stuff up. Besides, when we shelled the Germans, they were mostly occupying territory that was more or less untouched. When they shelled us, we were occupying territory that we had taken from them and therefore had probably already been shelled by the Allies.

Just a thought.

Posted by ekr at 09:11 AM | Comments (22) | TrackBack

December 26, 2003

The marginal value of Christmas

Last night I watched R. Lee Ermey's Christmas special, which was naturally full of interviews with GIs talking about how much it meant to them to have a special meal on Xmas day. Here's the thing, though: in order to give you a special meal on Xmas day, the service must be giving you slightly less good supplies the rest of the year.

This isn't some isolated behavior, either. People all over the Western world save up the rest of the year so that they can buy presents, so we need to conclude that they really do prefer things this way. It's just rather counterintuitive, since we're used to economic situations where the marginal value of money decreases the more you have of it, but in this case it's actually increasing, at least up to a point.

Posted by ekr at 04:25 PM | Comments (20) | TrackBack

December 25, 2003

How am I supposed to charge my iPod?

Pre-loaded iPods seemed like such a good idea that I made Lisa one for Christmas based on our music collection. In the process I discovered that the iPod has a uh... unique charging system. Here are the components we have to work with:
  • The iPod has a big iPod specific connector at the bottom. For convenience, call it an I-jack.
  • Apple supplies us with a Firewire to I-jack connector.
  • The power supply has a Firewire jack.
  • You get a dock which has both a male and a female I-jack.

So, if you have a Mac, this works fine. The Mac supplies charging power to the iPod through the Firewire interface. If you don't have a Mac, things are a little different.

I was using my trusty Sony Vaio, which has the four-pin IEEE 1394 (iLink) connector, which doesn't provide power. Apple gives you a 4 to 6 pin converter, but it just changes the form factor, so you're still not supplying power to the iPod. To charge the iPod, you connect the I-jack to the power adaptor. This means that you can either charge the iPod or copy data to it, but not both. This is ok some of the time, but when you're moving a lot of data to the iPod (which runs the disc constantly) you're burning up battery like crazy. This means you have to be pretty careful to have your iPod charged before you do any big copying activity. [*].

There are a number of things you can do to work around this:

  1. Apple will sell you a three headed cable that has an I-jack on one end and splits into Firewire and USB on the other. You plug the firewire end into the power adaptor and the USB end into your computer. This is no good for me because my computer only has USB 1.1, so I need Firewire for fast transfer. [*].
  2. Get a PCMCIA Firewire card that supplies power (this requires that you use an external transformer to power the card). This will work, but it requires forking over about $50.
  3. Get a powered Firewire hub. This also requires forking over about $35.

All of these options seem kind of lame. I'm no Firewire expert, but it seems like Apple could have put another jack in the base and let you run a $10 Firewire cable from the base to the power adaptor. I'm not sure why they didn't, except that having things this way makes using an iPod with your Mac a much nicer experience than with a PC.

Posted by ekr at 02:31 PM | Comments (56) | TrackBack

December 24, 2003

The danger of Canadian drugs?

So, the FDA opposes importation of Canadian drugs into the United States on the grounds of safety [*] . Now, I don't seriously believe that Canadian drugs are less safe than those in the US--there's not exactly a rash of Canadians dying from unsafe drugs--but there's an interesting point lurking here:
McGinnis, the agency's pharmacy affairs director, said the FDA would not piggyback its inspections on the Canadian system because the United States inspects drug manufacturers around the world, while Health Canada relies on inspections done by the drug maker's host country.

If Canada's drugs are manufactured in the same factories as American drugs, then it's possible that the reason that Canada is getting safe drugs is that the manufacturers are complying with FDA regs, not because of the Canadian inspections. That's fine if most drugs are sold into the US, but if most people start buying their drugs from Canada, then it might be less important to comply with the FDA and quality could suffer. Now, I'm not convinced that the FDA's onsite inspection procedure is really any better, but if it is, then mass reimportation could in fact be a problem.

Posted by ekr at 09:33 PM | Comments (10) | TrackBack

December 23, 2003

What's wrong with this web server?

Here's a nice little networking problem. You're doing some stress testing of an Apache web server on your FreeBSD machine. Everything is going along swimmingly at 200 or so requests per second, and then after 40 seconds connections start failing, badly. Each request ends in a TCP RST from the server, something like this:
16:43:24.369144 ld1.rtfm.com.50469 > ld2.rtfm.com.8080: S 3615660492:3615660492(0) win 57344  (DF)
16:43:24.369241 ld2.rtfm.com.8080 > ld1.rtfm.com.50469: S 2150875758:2150875758(0) ack 3615660493 win 57344  (DF)
16:43:24.369257 ld1.rtfm.com.50469 > ld2.rtfm.com.8080: . ack 1 win 57920  (DF)
16:43:24.369307 ld1.rtfm.com.50469 > ld2.rtfm.com.8080: P 1:78(77) ack 1 win 57920  (DF)
16:43:24.369377 ld2.rtfm.com.8080 > ld1.rtfm.com.50469: R 2150875759:2150875759(0) win 0

Now, if you're a networking guy you immediately suspect that you're running into some kind of system limit, especially when nothing crops up in the Apache error logs. I'm also not running out of per-process file descriptors or mbufs. Here's the second clue: the stall isn't permanent. The graph below shows the number of requests served as a function of time.

The key here is the 60 second periodicity. This smells of TCP TIME_WAIT, the period after a connection is shut down during which its TCP host/port quartet can't be reused, which is 60 seconds on FreeBSD. One way that this can happen is if you run out of client TCP ports. Say you have a maximum client TCP port of 5000, then after you've used 5000 ports, you have to wait for those ports to exit TIME_WAIT before you can reconnect. But I've already expanded the client's port range up to 60,000 so that can't be it.

Here's the third clue: we're doing about 200 transactions per second and we get about 40 seconds of requests before everything stalls, for a total of about 8000 transactions. If we reduce the number of connections to 100 per second, things work fine. This points the finger at the source of the problem: by default the maximum total number of open sockets (across all processes) on this BSD system is 8008.

These three pieces of information are enough to work out what's going on. We're hitting the server at 200 requests per second. After each request finishes, the socket sits in the TIME_WAIT state for 60 seconds. So, after 40 or so seconds, we've consumed all the available sockets on the system. 20 seconds later, the first of the sockets in TIME_WAIT starts to time out and so we can create sockets again, and the cycle repeats.

The way to fix this problem is to increase the maximum number of open sockets. BSD has a setting for this called kern.ipc.maxsockets. When this is set to 32000, things work fine.
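The arithmetic above is easy to check with a toy simulation. This is a rough sketch, not a model of the real TCP stack: it assumes time moves in whole seconds, that every request finishes instantly and then holds its socket in TIME_WAIT for the full 60 seconds, and that the socket limit is the only constraint. The function name and parameters are made up for the example; the 200 requests/second, 8008 and 32000 figures come from the discussion above.

```python
from collections import deque


def simulate(rate, max_sockets, time_wait=60, duration=180):
    """Return the number of requests served in each one-second tick."""
    expiries = deque()  # expiry time of every socket currently in TIME_WAIT
    served = []
    for now in range(duration):
        # Sockets whose TIME_WAIT has elapsed become available again.
        while expiries and expiries[0] <= now:
            expiries.popleft()
        free = max_sockets - len(expiries)
        ok = min(rate, free)  # requests beyond the free sockets fail
        expiries.extend([now + time_wait] * ok)
        served.append(ok)
    return served


default_limit = simulate(rate=200, max_sockets=8008)
print("first second with no requests served:", default_limit.index(0))
print("served at t=60, when the first sockets time out:", default_limit[60])

print("100 req/s, limit 8008, worst second:", min(simulate(100, 8008)))
print("200 req/s, limit 32000, worst second:", min(simulate(200, 32000)))
```

Under these assumptions the sketch reproduces the pattern described above: about 40 seconds of service, a stall until the 60 second mark, and then the same cycle again, while halving the rate or raising the limit makes the stall go away.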

There are still two features of the data I haven't figured out yet: the dip at about 20 seconds and the small spike at about 41 seconds. If anyone has any suggestions for those, I'd be interested to hear them.

Posted by ekr at 09:15 PM | Comments (6) | TrackBack

December 22, 2003

Forming possessives

Strunk & White's rule 1 tells us:
Form the possessive singular of nouns by 's

Great. So how do we form the possessive of "Bill Gates"? "Gates" is a plural noun, right? But when we talk about Bill Gates there's only one of him...

Posted by ekr at 09:13 PM | Comments (40) | TrackBack

Big victory for the Judean People's Front

If you don't read /. you've been mercifully spared the debate over Bruce Perens's decision that the new UserLinux project should include only GNOME as its UI framework, not KDE. What a terrible tragedy! Apparently if you want KDE you'll have to download one of the other 932 distros that come with it. Oh, the humanity!

Yes, yes, I know that this is really bad news for the KDE guys, but seriously, if you stripped the logos (and the silly names), not one user in 100 could tell you whether a given app was KDE or GNOME. And we'd certainly be much better off without all of this infighting.

Posted by ekr at 08:39 AM | Comments (42) | TrackBack

December 21, 2003

A trivial Turing Machine

Here's a simple Turing Machine, to give you the flavor of how things work. Say that we've been given a tape with some number of symbols on it and we want to know whether there are an odd or even number of symbols. There's always a special symbol called end. It doesn't count but it just means that there are no more symbols. For convenience, our Turing Machine is equipped with two lights, one that says "EVEN" and the other that says "ODD", but we could just as well write the output on the tape.

In that case, our program looks like this:

  1. Read a symbol. If it's end, light the EVEN light and stop. If it's any other symbol, go to step 2.
  2. Read a symbol. If it's end, light the ODD light and stop. If it's any other symbol, go to step 1.

A little thinking quickly reveals that this simple machine works just fine, with only two states, no matter how long the tape is.
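For the skeptical, the two-step program translates almost directly into code. This is just a sketch: the tape is modeled as a Python list, the string "end" stands in for the special end symbol, and the function name is made up for the example.

```python
END = "end"  # stands in for the special end-of-tape symbol


def parity(tape):
    """Run the two-step program over a list of symbols terminated by END."""
    state = 1  # step 1 means an even number of symbols has been read so far
    for symbol in tape:
        if symbol == END:
            return "EVEN" if state == 1 else "ODD"
        state = 2 if state == 1 else 1  # any other symbol sends us to the other step
    raise ValueError("tape has no end marker")


print(parity(["x", "y", END]))
print(parity(["x", END]))
```

Note that the only thing the machine remembers between symbols is which of the two steps it is on, which is why the length of the tape doesn't matter.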

Posted by ekr at 01:04 PM | Comments (6) | TrackBack

A toy I want to buy

I'm a big fan of desktop toys, but there's one toy I have never been able to find that I think would be cool for techies: a Turing Machine.

For non CS types, a Turing Machine (TM) is a sort of primitive idealized computer. A Turing Machine consists of a state machine connected to a tape reader and an infinitely long tape. The tape is divided up into segments, each of which can have one of some finite set of symbols on it. The state machine can be in any of a finite number of states and has a set of rules for which states lead to other states. For instance, you might have a rule that said "if you're in state 1 and you read symbol 2, go to state 2". The cool thing about a Turing Machine is that in principle any program you can run on an ordinary computer can be run on a Turing Machine--though it may be a lot slower. [0] Sometimes it can be quite tricky to figure out how to write the programs, but it can always be done and it's interesting to think about how.
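Here is a small sketch of what such a machine looks like as code, assuming the usual textbook formulation where each rule also says what to write and which way to move the head (the description above only mentions state changes). The rule format, the blank symbol, the "start" and "halt" state names and the bit-flipping example are all choices made for this illustration.

```python
BLANK = "_"  # hypothetical blank symbol for cells that have never been written


def run(rules, tape, state="start", max_steps=10_000):
    """rules maps (state, symbol) -> (new_state, symbol_to_write, move)."""
    cells = dict(enumerate(tape))  # sparse tape: unwritten cells read as BLANK
    head = 0
    steps = 0
    while state != "halt":
        if steps >= max_steps:
            raise RuntimeError("step limit reached without halting")
        symbol = cells.get(head, BLANK)
        state, cells[head], move = rules[(state, symbol)]
        head += 1 if move == "R" else -1
        steps += 1
    return "".join(cells[i] for i in sorted(cells)).strip(BLANK)


# Example rule table: walk right, flipping every bit, and halt at the first blank.
flip_bits = {
    ("start", "0"): ("start", "1", "R"),
    ("start", "1"): ("start", "0", "R"),
    ("start", BLANK): ("halt", BLANK, "R"),
}

result = run(flip_bits, "10110")
print(result)
```

The dictionary plays the role of the state machine, and the step limit is only there because a real program can't wait forever on a machine that never halts.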

My toy Turing Machine would have a small tape reader and a mylar tape, plus some sort of dry-erase marker you could use to mark the tape. (Yes, I know that this isn't a real Turing Machine since the tape is finite, but you can do lots of cool stuff with finite tape TMs, too). You'd also have some sort of readout on the front to tell you what state you were in. I think this would make a great desktop toy, but I've never seen one. Does any company make them?

[0] Actually, because a TM has an infinitely long tape whereas real computers have only finite sized memories, a TM is more capable than a real computer.

Posted by ekr at 12:49 PM | Comments (13) | TrackBack

December 20, 2003

What I really want in a portable computer

Ok, so I love the Treo 600. Heck, everyone loves the Treo 600. It's really the first device that gives you a realistic chance of being connected to the Internet 24x7. It's like having an auxiliary brain. I know that this sounds nightmarish to some people, but to people like me who came of age reading Neuromancer and, more importantly, True Names, it's something we've been waiting for most of our lives. Great as it is, though, the Treo is just the first step and has some obvious failings. The good news is that most of these are easily fixable with only modest improvements in currently available technology.

The first big problem is display size. The Treo's display is about 45x45 mm and 160x160 pixels. The resolution could be improved, but the size of the screen is basically limited by the size of the device. So, what we need is a display that's not actually connected to the device. The obvious fix is a head-up display: effectively a pair of glasses that displays an image in front of your face. The good news is that these already exist. The bad news is that they currently look pretty stupid and the resolution isn't that great (640x480). Plus, they're expensive. Still, we're not that far off. If the resolution were 1024x768 (about as good as your average laptop), I'd probably be willing to look stupid.

The second big problem is that the Treo is a pretty wimpy computer. It's incredibly slow and the operating system--if you can call it that--is stone age. However, that's mostly a power problem. If you were willing to lug around more battery, you could have a much faster processor and a correspondingly better operating system. Between improvements in battery technology and the development of low power processors, we're closing in on acceptable levels pretty quickly. After all, the Sony X505, screen, keyboard, and all, weighs in at only 1.7 lbs [*]. Without the screen or the keyboard (and the corresponding power drain) you could probably build an acceptable unit in a package weighing a pound or so even now.

The remaining problem is network connectivity. Nominally, Sprint's PCS Vision network gives you rates of 50-70 kb/s, but I don't usually see anything that fast. However, the network is getting faster and even twice as fast would be fairly livable. Certainly, a lot of the delays I see on my Treo appear to be because the processor is too slow. If you had a better processor, you would probably have a better user experience. Still, it would be pretty nice to have the Treo WiFi capable so that you could get high speed connectivity if it were available.

The bottom line, then, is that I don't think we're that far off. As I've observed before, people seem more and more willing to load themselves up with computer gear even with today's primitive equipment. As that trend converges with improved technology, I predict a lot more people looking like Steve Mann.

Posted by ekr at 06:19 PM | Comments (21) | TrackBack

Insurance coverage for prescription drugs

In the comments section, Samsarra points out a real drawback of drugs going OTC. They're not covered by medical insurance. In fact, insurance companies are one of the big driving forces behind prescription drugs going OTC. It was pressure from health insurance companies that led to Claritin being made OTC.
I'm in favor of easier access to medications, but it is sort of irksome to pay for prescription insurance and then to have all the medications I need go OTC. I just read that Singulair is heading that way, which will be very irksome. I will most likely have around $100 a month in OTC drugs between the Prilosec and the Singulair.

I would like to have some differently structured prescription insurance that would handle unexpected extremely expensive prescriptions (all the crap I needed during my pregnancy comes to mind) but not cover day to day stuff, because I feel like I'm not getting my money's worth from that.

The entire premise of insurance is to let you hedge risk of loss. But in order to get that hedging, insurance companies have to charge premiums that are greater than the expected loss. But what this means is that if you have insurance for routine stuff, you're overpaying because you're always paying that risk premium as well. A secondary problem is what's called price illusion. If all of your medical problems are paid for by insurance, you have a tendency to consume more because it doesn't cost you anything. And of course this raises the cost of insurance. These two effects mean that medical insurance is a lousy way to pay for routine care.

The traditional way to handle this problem is with a deductible. You pay the first X dollars of your medical expenses but afterwards the insurance company kicks in. This allows you to hedge major risks but avoids having to use insurance to pay for routine stuff. An alternative approach is to have no insurance at all for routine stuff but only have major medical insurance for unexpected large expenses.
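A quick back-of-the-envelope calculation shows why insuring routine expenses costs more than paying for them directly. Every number here is invented for illustration: a 20% markup over expected payouts, $1200 a year of predictable routine costs, and a 1% chance of a $50,000 catastrophe. Only the structure of the argument comes from the text above.

```python
LOADING_PERCENT = 20  # hypothetical markup the insurer charges over expected payouts


def premium(expected_payout):
    """Premium in whole dollars: the expected payout plus the insurer's markup."""
    return expected_payout * (100 + LOADING_PERCENT) // 100


routine_costs = 1200          # hypothetical predictable yearly expenses
catastrophe_cost = 50_000     # hypothetical rare, large expense
catastrophe_chance_pct = 1    # hypothetical yearly probability, in percent

expected_catastrophe = catastrophe_cost * catastrophe_chance_pct // 100

# Plan A: insurance pays for everything, routine and catastrophic alike.
insure_everything = premium(routine_costs + expected_catastrophe)

# Plan B: pay routine costs out of pocket and insure only the catastrophe.
major_medical_only = routine_costs + premium(expected_catastrophe)

print("insure everything:  ", insure_everything)
print("major medical only: ", major_medical_only)
print("extra paid for insuring routine care:", insure_everything - major_medical_only)
```

With these made-up numbers, the difference between the two plans is exactly the markup on the routine costs, which buys no reduction in risk because those costs were never uncertain in the first place.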

The method we have now is one of the worst approaches. It's extremely expensive and creates really perverse incentives for both customers and insurance companies. From this perspective, universal medical insurance (as in Canada or the UK) is just as bad--worse actually, since there's no opportunity for your insurance premiums to adjust to your increased use of medical services. It's possible that universal medical insurance is good for paying for catastrophic problems, but it's not a particularly efficient way to pay for routine low-level care.

Posted by ekr at 03:49 PM | Comments (69) | TrackBack

December 19, 2003

Look, it's Plan B

On Tuesday an FDA advisory panel recommended that Plan B should be made available without a prescription. [*]. Plan B (also called the "morning after pill") is basically a high dose oral contraceptive that prevents implantation even when taken after sex. This seems like a good thing to me. People aren't perfect about using safe sex techniques and sometimes accidents happen. There's no good reason why you should have to go to a doctor in order to avoid pregnancy in such situations.

It's just my impression, but it seems to me that lately a large number of drugs are going OTC (Claritin and Prilosec are the two biggies lately). If this is a trend, it's one I'm in favor of.

Posted by ekr at 06:59 PM | Comments (63) | TrackBack

December 18, 2003

Why don't we just tax gasoline more?

Back when the energy bill was still on the table, there was a fair amount of whining about how the bill wouldn't raise Corporate Average Fuel Economy (CAFE) standards (see, for instance, this Public Citizen Web site, or Gregg Easterbrook here).

CAFE is one of those top-down command and control measures that environmentally minded politicians and pundits seem to love and economists hate. Basically, it works like this: manufacturers are required to achieve a certain mean fuel economy for the cars that they manufacture. (Click here for a good description). However, like most such programs, CAFE is littered with loopholes and the automakers have adapted to them. The most famous of these is that SUVs are classified as "light trucks" and have to meet a much lower standard (20.7 MPG rather than 27.5 MPG). Since people respond to incentives, it's not exactly surprising that SUVs and minivans get lousy gas mileage.

The solution that Public Citizen and Easterbrook endorse, of course, is to close the SUV loopholes, and jack up the MPG standards--in other words, more of the same. And most likely whatever new laws we pass will have similar loopholes that will need to be closed in a few years. However, there's a much simpler solution: tax gasoline more.

The first thing we have to do is look clearly at the situation in terms of costs and benefits:

  • When I use gasoline in my car and that lets me get around, that's a benefit to me.
  • When the engine in my car emits CO2 and pollutants, that's a cost to everyone else (economists call this a negative externality).

Let's say that gasoline costs $2.00/gallon and that it's worth $.25/mile for me to drive. If my car gets 20 mpg, then I'm getting $5.00 worth of value for every gallon of gas. Thus, I'm making a $3.00 profit with each gallon of gas I consume. That's all well and good, but remember that the pollution generated by my car is imposing costs on other people. We need to consider two possibilities:

  1. Those costs are more than $3.00.
  2. Those costs are less than $3.00.

In the first case, whenever I drive I'm decreasing the net welfare of society (I'm making a $3.00 profit but the rest of the population is incurring a > $3.00 loss). Thus, I shouldn't be driving at all.

In the second case, the net welfare of society is positive, but I'm the one that's reaping all the benefit whereas everyone else is bearing the costs. This seems deeply unfair.

There's a simple way to solve both problems: impose a tax on every gallon of gas equal to the size of the negative externality (this is generally called a Pigouvian tax after A.C. Pigou, the economist who thought of it.) This automatically produces the right outcome: if the tax is greater than $3.00, then it's not worth it to me to drive and I don't. If it's less than $3.00 then society has my tax money (which, remember, is equal to the size of the externality) and it can then (at least in theory) be passed on to the people affected (in this case, society at large). Moreover, since the cost of gas goes up, I have an incentive to buy a more efficient car--in fact, precisely the right incentive to do so.
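The two cases can be worked through in a few lines. This is a sketch using the numbers from the example above ($2.00/gallon, $.25/mile, 20 mpg); the two externality sizes, $1.00 and $4.00 per gallon, are arbitrary values picked to land on either side of the $3.00 threshold, and the function names are made up.

```python
# All amounts are in cents per gallon to keep the arithmetic exact.
GAS_PRICE = 200       # $2.00/gallon, from the example above
VALUE_PER_MILE = 25   # $.25/mile
MPG = 20

VALUE_PER_GALLON = VALUE_PER_MILE * MPG  # 500 cents of driving value per gallon


def private_surplus(tax):
    """What the driver nets per gallon after paying for gas and the tax."""
    return VALUE_PER_GALLON - GAS_PRICE - tax


def outcome(externality):
    """Compare behaviour with no tax and with a tax equal to the externality."""
    return {
        "drives_untaxed": private_surplus(0) > 0,
        "drives_taxed": private_surplus(externality) > 0,
        "net_social_value": VALUE_PER_GALLON - GAS_PRICE - externality,
    }


small = outcome(100)  # hypothetical externality below the $3.00 surplus
large = outcome(400)  # hypothetical externality above the $3.00 surplus
print(small)
print(large)
```

Without the tax I drive in both cases. With the tax set equal to the externality, I drive exactly when the net value to society is positive, which is the whole point.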

There are two standard objections to this approach:

  1. It's not revenue neutral. The government now has a lot of extra money, which is bad if you like low taxes.
  2. It's regressive. Poor people drive about the same as rich people and so they'll have to pay a much higher percentage of their income in taxes.

It's relatively easy to solve both problems by turning the extra tax money into a tax cut and weighting that tax cut towards the poor.

How large should this tax be? Estimates vary, but it looks like it should be about $1.00. [*]. The link above recommends $1.40, but that counts congestion as an externality linked to gasoline and there are better ways to account for congestion (for instance, toll roads or other forms of congestion pricing).

Posted by ekr at 07:48 PM | Comments (43) | TrackBack

December 17, 2003

Qveere Eye For Thye Medieval Man

I just got this link from Sasha Volokh. Well, see for yourself.
Posted by ekr at 12:35 PM | Comments (90) | TrackBack

Reproduction and bonding don't necessarily go together

Actually, the data on women's reproductive strategies doesn't exactly support Roback's view very well either. There have been a number of studies that claim to demonstrate that between 1 and 30% of infants have fathers other than the nominal father. Here's the relevant passage from Jared Diamond's The Third Chimpanzee which I recommend in any case.
People have many reasons to lie when asked whether they have committed adultery. That's why it's notoriously difficult to get accurate scientific information about this important subject. One of the few existing sets of hard facts emerged as a totally unexpected by-product of a medical study, performed nearly half a century ago for a different reason. That study's findings have never been revealed until now.

I recently learned these facts from the distinguished medical scientist who ran the study. (Since he does not wish to be identified in this connection, I shall refer to him as Dr. X.) In the 1940s Dr. X was studying the genetics of human blood groups, which are molecules that we acquire only by inheritance. Each of us has dozens of blood-group substances on our red blood cells, and we inherit each substance either from our mother or from our father. The study's research plan was straightforward: go to the obstetrics ward of a highly respectable U.S. hospital; collect blood samples from one thousand newborn babies and their mothers and fathers; identify the blood groups in all the samples; and then use standard genetic reasoning to deduce the inheritance patterns.

To Dr. X's shock, the blood groups revealed nearly 10 percent of these babies to be the fruits of adultery! Proof of the babies' illegitimate origin was that they had one or more blood groups lacking in both alleged parents. There could be no question of mistaken maternity: the blood samples were drawn from an infant and its mother soon after the infant emerged from the mother. A blood group present in a baby but absent in its undoubted mother could only have come from its father. Absence of the blood group from the mother's husband as well showed conclusively that the baby had been sired by some other man, extramaritally. The true incidence of extramarital sex must have been considerably higher than 10 percent, since many other blood-group substances now being used in paternity tests were not yet known in the 1940s, and since most bouts of intercourse do not result in conception.

At the time that Dr. X made his discovery, research on American sexual habits was virtually taboo. He decided to maintain a prudent silence, never published his findings, and it was only with difficulty that I got his permission to mention his results without betraying his name. However, his results were later confirmed by several similar genetic studies whose results did get published. Those studies variously showed between about 5 and 30 percent of American and British babies to have been adulterously conceived. Again, the proportion of the tested couples of whom at least the wife had practiced adultery must have been higher, for the same two reasons as in Dr. X's study.

Disclaimer: I haven't read the actual studies here. Apparently a number of them were done by Baker and Bellis and are described in their book Sperm Competition.

If the adultery rate is this high--or even remotely this high--then it suggests that even if sex does serve the joint purposes of reproduction and unity, those two purposes aren't necessarily coupled (sorry, couldn't help it) as closely as Roback would like, and weren't even in 1940, before the age of free love and easy divorce. Which, of course, rather undermines her entire argument.

Posted by ekr at 08:04 AM | Comments (10) | TrackBack

The natural purpose of sex...

Matthew Yglesias points to this anti-gay marriage article by Jennifer Roback Morse in National Review. Here are the three paragraphs that seem to me to sum up the argument:

So, what is the meaning of human sexuality anyhow? Sexual activity has two natural, organic purposes: procreation and spousal unity. Babies are the most basic and natural consequences of sexual activity. "Spousal unity" means simply that sex builds attachments between husband and wife.

Spousal unity is the feature of human sexuality that makes it distinct from purely animal sexuality. As far as I know, humans are the only animals that copulate face to face. Shakespeare described the sexual act as "making the two-backed beast." Both the Hebrew and the Christian Bible describe the sexual act as uniting the spouses in the most literal sense: "the two become one flesh." Two people become, if only for a short while, one flesh. Evolutionary psychology observes the survival value to spousal cooperation. Males and females who attach themselves to each other, have a better chance of seeing their offspring survive long enough to produce grandchildren. Science can now tell us how the hormones released during sex help to create emotional bonds between the partners.

...

We can construct, deconstruct and reconstruct our sexuality any way we want: it is our privilege as thinking creatures. However, human sexuality has a specific nature, regardless of what we believe or say about it. We are more likely to be satisfied with the outcome, if we work with our biology rather than against it. We will be happier if we face reality on its own terms.

I'll give Roback credit: she manages to skirt the naturalistic fallacy here--although this whole "work with our biology" thing comes pretty close.

However, it seems to me that if you're in favor of lifetime marriage, you probably don't want to be basing your argument on evolutionary psychology. A little observation of our primate cousins and human sexual behavior suggests that the "natural" male mating strategy is to attempt to impregnate as many females as possible, not to form lifelong bonds.

Posted by ekr at 07:51 AM | Comments (9) | TrackBack

December 16, 2003

What's the big problem with phonecams?

Just read this baffling NYT editorial complaining about cameras on cellphones.
The ads suggest that the purpose of putting cameras in cellphones is to take photos and share them immediately by sending them over the airwaves to friends and relatives. But the real purpose is to sell minutes on your wireless service. Although no one really wants the return of the wall-tethered rotary-dial black Bell, there is something to be said for the days when a cellphone was just a cellphone.

Huh? Why does there have to be a single purpose? The way capitalism works is that companies offer you something of value to you and you pay them money in return. Their purpose is to make money. Your purpose is to get whatever you pay for. You might as well argue that the purpose of cars is to let car dealers make money rather than letting you drive from place to place. Actually, the analogy is pretty close here. One of the major sources of money for car dealers is financing, not profits from the car.

Posted by ekr at 10:00 PM | Comments (10) | TrackBack

December 15, 2003

What did the labels have against iTunes?

One of the most disappointing things for a cynic is to discover that you weren't actually cynical enough.

One of the most striking features of the multi-year history of digital music is the resistance that the labels showed to providing a simple music download service like iTunes. Instead, the labels kept creating these subscription services that were loaded with all sorts of DRM. I'd always assumed that the problem was simple fear: that they were terrified that if they ever let music be distributed in unrestricted digital form, it would all end up on Napster, thus completely destroying their business. This never made any sense, of course, since it was always trivial for people to rip CDs and put them on Napster, regardless of whether the labels assisted them.

However, this revealing Rolling Stone interview with Steve Jobs clears up the matter for me:

We said: These [music subscription] services that are out there now are going to fail. Music Net's gonna fail, Press Play's gonna fail. Here's why: People don't want to buy their music as a subscription. They bought 45's; then they bought LP's; then they bought cassettes; then they bought 8-tracks; then they bought CD's. They're going to want to buy downloads. People want to own their music. You don't want to rent your music -- and then, one day, if you stop paying, all your music goes away.

And, you know, at 10 bucks a month, that's $120 a year. That's $1,200 a decade. That's a lot of money for me to listen to the songs I love. It's cheaper to buy, and that's what they're gonna want to do.

They didn't see it that way. There were people running around -- business-development people -- who kept pointing out AOL as the great model for this and saying: No, we want that -- we want a subscription business. We said: It ain't gonna work.

Ah... I get it now. The problem wasn't fear that they would lose their current revenue stream but rather the desire to secure a new revenue stream based on a subscription model. That actually makes more sense, and I'm kind of embarrassed I didn't see it. The worst part, of course, is that I thought I was being cynical by assuming the labels were stupid, but it can be so difficult to tell stupid from greedy.

Posted by ekr at 08:32 PM | Comments (23) | TrackBack

December 14, 2003

This guy's got Cornwell's number

A few years ago, Lisa and I plowed through the BBC Sharpe TV movies (Lisa is a big Sean Bean fan). The movies are based on a series of novels by Bernard Cornwell, who has also written a bunch of other historical novels, including "The Archer's Tale", the subject of this review I happened to run across:

You know when you start a Bernard Cornwell book you can strike certain items off a laundry list: undervalued superhero, check; bloody violence, check; loyal friends, some disposable, some not, check; fantasy chick, check; pitiless villain, check; final battle where the hero triumphs, check; an opening to the next chapter in the series, check. Cornwell never disappoints, nor does he ever really surprise. He is a guilty pleasure of several hours of, i dont want to say mindless reading, predictability.

Yep. All the Cornwell books are pretty good--not exactly great literature, but definitely an entertaining afternoon's reading.

Posted by ekr at 10:05 PM | Comments (10) | TrackBack

Back on the air with the Treo

Finally went back to the Sprint store on Friday to get my Treo fixed. I figured they'd just reprovision it, but instead they told me that the phone had failed the diagnostics and they had to give me a new handset. I doubt this was correct, but taking the new unit was easier than arguing, so I let them make the trade. Since they had to provision the new phone, this had pretty much the same effect as reprovisioning my existing unit. Of course, it took 5 hours to get the Internet service working (strangely enough, just as I was calling customer service to complain that it still didn't work). It works now, though. I just have to remember not to touch anything.
Posted by ekr at 09:35 PM | Comments (11) | TrackBack

December 13, 2003

Celebrity Poker Showdown II

Caught my second installment of Celebrity Poker Showdown last night. This time it was the cast of the West Wing, exhibiting some pretty scary poker playing. Some highlights (lowlights) include:
  • Allison Janney has a 4/10 off-suit and John Spencer has a pair of 6s. The flop comes down 6,7,4 and Janney apparently thinks that a pair of 4s is a raising hand. The turn is a 2 of hearts and the river comes down the fourth 6. Janney goes all-in (what the heck is she thinking???) and Spencer has this look of shock on his face like "I can't believe it, but I'm more than happy to take your money."
  • 10 minutes later, Richard Schiff goes all-in with JD/9Q before the flop. John Spencer has a 9H/8C and calls him. No one makes anything on the rest but Schiff snakes out with Jack high.
  • 2 minutes later, Schiff is holding Queen high and goes all-in again against Timothy Busfield, who's holding Ace high but can't bring himself to call.

Schiff goes on to win everything, leaving me scratching my head trying to decide if he's a genius or merely insane--probably both.

Posted by ekr at 08:04 PM | Comments (113) | TrackBack

December 12, 2003

Congestion control food fight

In a previous post, I wrote about TCP congestion control, but TCP isn't the only protocol on the Internet. TCP is what's called a stream protocol. Data is delivered in strict sequence: the first byte written is the first byte read, the second byte written is the second byte read, etc.

This is great if what you're transmitting is file-oriented data like web pages and file downloads, but it's not so great for other applications. The most obvious problem is how it handles packet loss. Say you're using TCP to transmit some kind of real time data like voice telephony or video. The sender sends packets 1,2,3,4,5,6,7 and packet 4 gets lost. The receiver sees packets 1,2,3,5,6,7, but because packets must be delivered in order, the receiving application only sees packets 1,2,3. The typical TCP retransmit time is about 500 ms, so 500 ms later the sender retransmits, the receiver gets packet 4, and it delivers packets 4,5,6,7 to the application. Again, that's fine if what you're transmitting is Web pages, but if it's real time voice, what you hear in your ear is 500 ms of silence followed by the delayed voice data, which is not at all what you want.
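To make the stall concrete, here's a small Python sketch of in-order delivery. The function name and the timings (packets 20 ms apart, the retransmitted packet 4 showing up 500 ms late) are made up for illustration; this is a toy model of the behavior described above, not how any real TCP stack is written.

```python
def deliver_in_order(arrivals):
    """Return (time_ms, seq) pairs in the order the application reads them.

    arrivals: list of (arrival_time_ms, sequence_number) as seen on the wire.
    A packet is only handed to the application once every earlier packet
    has been handed over too.
    """
    buffered = {}
    delivered = []
    next_seq = 1
    for time_ms, seq in sorted(arrivals):
        buffered[seq] = time_ms
        # Release every packet that is now contiguous with what was delivered.
        while next_seq in buffered:
            delivered.append((time_ms, next_seq))
            del buffered[next_seq]
            next_seq += 1
    return delivered


# Assumed timings: packets 1-7 sent 20 ms apart, packet 4 lost and
# retransmitted so that it arrives 500 ms after it should have.
arrivals = [(20 * n, n) for n in (1, 2, 3, 5, 6, 7)] + [(20 * 4 + 500, 4)]
for when, seq in deliver_in_order(arrivals):
    print(f"t={when:4d} ms  application reads packet {seq}")
```

Packets 5, 6, and 7 are sitting in the receiver's buffer the whole time; the application just isn't allowed to see them until 4 shows up.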

How to do real-time over the Internet... well sort of
With real-time applications like voice and videoconferencing, you need a protocol other than TCP. Roughly, you want something that:

  1. Delivers packets in a predictable, timely fashion.
  2. Doesn't require that data be delivered in order.

In other words, you want a datagram protocol like User Datagram Protocol (UDP). Actually, UDP probably isn't as predictable and timely as you might want, but it's pretty much the best you're going to get on the Internet.

So, you're doing telephony over the Internet (generally called Voice over IP (VoIP)) and of course you've decided to use UDP (actually, you're probably using RTP, the Real-time Transport Protocol, over UDP). The way this works is that the sender captures short voice samples (typically about 20ms or so) and sends them to the receiver, one sample per UDP packet. The receiver decodes them and plays them on the other end, thus reproducing the voice stream.

If a packet is missing or arrives out of order, the receiver ignores it and tries to compensate, perhaps by playing silence or replaying the previous sample. Typically, these systems have a little bit of buffering, maybe 50-100 ms, just to smooth out Internet jitter, so if you happened to get packets 2 and 3 in the immediate sequence 3,2 you would be able to play them correctly. And if you get 2,4 in immediate sequence, you probably try to interpolate 3.
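Here's a rough Python sketch of that receiver-side logic. Everything in it (the function name, the 20 ms frame size, the 60 ms buffer) is an assumption for illustration; real systems differ a lot in how they conceal missing frames.

```python
def playout(arrivals, first_seq, last_seq, frame_ms=20, buffer_ms=60):
    """Decide what to do in each playout slot.

    arrivals: dict mapping sequence number -> arrival time in ms
              (must contain first_seq).
    A frame is played if it arrived by its playout deadline; otherwise the
    receiver conceals it (silence, repeat the previous frame, interpolate...).
    """
    # Hold the first frame for buffer_ms to absorb jitter and reordering.
    start = arrivals[first_seq] + buffer_ms
    schedule = []
    for seq in range(first_seq, last_seq + 1):
        deadline = start + (seq - first_seq) * frame_ms
        arrived = arrivals.get(seq)
        if arrived is not None and arrived <= deadline:
            schedule.append((seq, "play"))
        else:
            schedule.append((seq, "conceal"))
    return schedule


# Packets 2 and 3 arrive swapped, packet 4 never arrives.
print(playout({1: 0, 3: 25, 2: 30, 5: 80}, first_seq=1, last_seq=5))
```

Note that a packet that arrives after its deadline is treated exactly like a lost one. Unlike TCP, nobody waits for it.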

Silence Suppression and Comfort Noise
One interesting feature of many VoIP-type systems is how they respond to silence. In most phone conversations, only one person is talking at any given time, sometimes for periods as long as minutes. There's no point in sending long periods of zero-amplitude samples (or more likely light breathing). Instead, what the silent system does is stop sending entirely until there's something to transmit. This is called silence suppression.

Unfortunately, people often find it very disconcerting to have the other end go entirely blank. So, instead, some applications will generate comfort noise. Effectively, the receiver generates noise of the kind that you would hear over an unused but connected telephone line, just so that the person at the other end knows that they're connected. RFC 3389 describes a way for the sender to occasionally send a description (spectrum and volume) of the kind of comfort noise it would like the receiver to generate in place of actual voice packets (these description packets are smaller than voice samples).

Real-time application and network congestion
It should be obvious at this point that the rate at which real-time applications transmit data has almost nothing to do with the characteristics of the network and almost everything to do with what's going on at either end of the connection. It should be equally obvious that these applications aren't going to play nice with TCP. If the sending rate is independent of network conditions, at least two things can go wrong:

  1. Real-time media can starve TCP applications when network bandwidth is scarce. As packets get lost, TCP will start to back off, leaving more and more of the network for the real-time application. This is fine (maybe) if both the TCP connection and the VoIP call are yours, but it's unfair for my TCP connection to starve because you want to make a phone call.
  2. Real-time media can cause congestion collapse. Say you've got 4 people trying to make phone calls at 64 kb/s (ridiculously fast, but this is for illustration) over a (128 kb/s) ISDN line. Since they're offering 256 kb/s aggregate traffic, each application is losing half of its data, which means that effectively no one can make a phone call.
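The arithmetic in case (2) is simple enough to write down. This little Python helper (a made-up name, and it assumes the router drops evenly across all the flows) just computes the fraction of each flow that can't fit through the link:

```python
def loss_fraction(flows, rate_kbps, link_kbps):
    """Fraction of each flow's packets dropped when fixed-rate flows share a link.

    Assumes every flow sends at the same constant rate and the router
    spreads its drops evenly across them.
    """
    offered = flows * rate_kbps
    if offered <= link_kbps:
        return 0.0
    return 1.0 - link_kbps / offered


print(loss_fraction(4, 64, 128))  # four 64 kb/s calls on a 128 kb/s line
print(loss_fraction(2, 64, 128))  # two calls fit exactly
```

With four callers the link carries 128 of the 256 kb/s offered, so every call loses half its packets. Because none of the senders slows down, it stays that way.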

In order to avoid situations like (2), some real-time media systems incorporate variable-rate encoding. It's possible to encode voice and video at a number of quality levels, with each increase in quality level coming at a corresponding increase in bandwidth consumed. Thus, some systems detect when they're getting undue packet loss and switch to a slower/lower quality encoding scheme (these schemes are called codecs, for "coder-decoder"). However, this kind of rate-adjustment isn't designed to co-exist with TCP and in general it doesn't react the same way that TCP does. In particular, it's designed to operate over a rather long time scale. As a consequence, it's quite possible for real-time applications to starve TCP connections. In addition, there's no guarantee it will prevent congestion collapse. TCP required substantial tuning before it handled congestion properly and these variable-rate codecs have had no such tuning.

Enter DCCP
In an attempt to close this gap, Eddie Kohler, Mark Handley, Sally Floyd, and Jitendra Padhye designed DCCP, the Datagram Congestion Control Protocol, which is a datagram protocol like UDP but with TCP-style congestion control. Unlike TCP, DCCP is designed with pluggable rate control modules, called Congestion Control IDs (CCIDs). The two currently defined ones are CCID-2, which has almost identical congestion behavior to TCP, and CCID-3 (also called TCP-Friendly Rate Control (TFRC)), which is designed for real-time applications. TCP congestion control can result in fairly wide variations in sending rate. TFRC is designed to be somewhat smoother. Both CCIDs are specifically designed to play nice with TCP.

There's only one problem. The real-time media people don't want DCCP. There's currently a long thread going on in the DCCP mailing list (pretty much all the messages beginning here) with a good summary by Eddie Kohler here.

The basic objection is this: As we just discussed, real-time applications want to send at a constant bit rate (CBR), at least over moderate time scales. However, proper congestion control response requires that they adjust their sending rate. This manifests itself in two main ways:

  1. When congestion occurs, the application has to back off.
  2. After silent periods, the application has to slow start.
The real-time media people don't want to do either one, because it degrades the user experience.

What kind of Internet do we want?
The real question here is what kind of Internet we want to have. The only way that TCP can coexist with large-scale real-time media usage on the current Internet is if those streams use congestion control--though it doesn't have to be DCCP, of course. The fact that such streams currently don't have congestion control hasn't been a problem so far only because they represent such a small fraction of total Internet usage. As VoIP takes off, we have the makings of a real problem.

There are basically four possibilities:

  1. Give up on TCP (this means no more Web, e-mail, etc.!). This is a non-starter. Nearly every major Internet application runs over TCP. There has been talk of rewiring TCP to make it play nice with media flows, but the time it takes to deploy such changes is way too long for this to work.
  2. Give up on real-time Internet media. This is almost equally bad. Real-time media (especially VoIP) is becoming increasingly important.
  3. Provide committed resources for the real-time media flows. Effectively, this means that each real-time application would tell every router along its data path "I need 16 kb/s" and if it couldn't get a promise for that, it wouldn't connect. This might work, but it's quite a different Internet than the one we have now and attempts to add this kind of reservation capability have not been successful in the past.
  4. Provide some kind of congestion control for real-time media.

This whole issue has been smoldering in the background for a while, but the introduction of DCCP and the IAB draft (full disclosure: I'm on the IAB, though I didn't have much personal part in this particular document) on congestion control for voice applications has brought it to full burn. Basically, there's a deep philosophical divide between the real-time media folks, who think their traffic is what's most important and the old-time Internet community which is very concerned about fairness for the existing applications, which are clearly very important at the moment. Given this philosophical divide and the increasing popularity of VoIP, I expect to see this issue get quite a bit more heated over the next six months to a year.

Posted by ekr at 09:18 AM | Comments (56) | TrackBack

December 11, 2003

Arresting spammers

Check it out. Virginia has filed charges against a North Carolina spammer [*].
According to antispam organization Spamhaus, "Stubberfield" is well-known for pornographic and "get rich quick" offers online and was ranked No. 8 on the group's top 10 spammers list for November. The charges were based in part on reports from America Online subscribers. Kilgore announced the indictment at AOL headquarters.

"Falsification (of e-mail headers or routing information) prevents the receiver from knowing who sent the spam or contacting them through the 'from address' of the e-mail," Kilgore said in a statement. "This is what makes this e-mail a crime in Virginia, and the volume that was sent during this period elevates the charge to a felony."

This seems to be the preferred tactic for prosecuting spammers. Instead of trying to make spam illegal, make forging the headers illegal. I haven't decided how I feel about this yet. Mr. Stubberfield doesn't sound like my favorite kind of person, but on the other hand I'm not sure how comfortable I am with the government prosecuting people based on what's in their e-mail headers.

Posted by ekr at 10:01 PM | Comments (10) | TrackBack

TCP Congestion Control

Security gets a lot of attention here on EG, but there's a lot more to internetworking. One aspect of protocol design that doesn't get much attention is congestion control. I want to talk about some congestion-related stuff going on in the IETF DCCP working group, but first I have to explain about congestion control. I'll do that in this post and then follow up soon on the topic of DCCP.

Flow Control
Before we talk about congestion control, first we need to talk about rate control. Say that Alice wants to transfer a file over a dedicated network--like a phone line--to Bob. Since they own the network and want to get the file there as fast as possible, they basically want to transmit at network speed. Now, if you know exactly how fast the network is, you can just send at that data rate. But even modem lines do rate change occasionally due to line conditions, so it's better to design a protocol that's adaptive. If you're trying to transmit data over the Internet, conditions definitely change, so you certainly need to be adaptive.

On the Internet, the main protocol for doing large data transfers is the Transmission Control Protocol (TCP). (The following two paragraphs are cribbed from my previous post, which you can go to to get a little more background.)

Whenever you have a large chunk of data to transmit, the first thing you do is break up the data into a series of segments small enough so that each segment can fit in an IP packet. Once you've done that, you've got to arrange that the segments get delivered to the remote end and not lost on the way. The obvious way to do this is to send one segment at a time. When the receiver receives a packet, it sends you an acknowledgement (called an ACK). When you get the ACK you send the next segment. If you don't get it within some timeout period (say 500 milliseconds) you retransmit the segment. This is called a stop-and-wait protocol.

The basic problem with a stop-and-wait protocol is that it's not very efficient. It takes time for packets to get from point A to point B and the entire time that the ACK is in transit, there is no data flowing from the sender to the receiver. A better approach is to use what's called a sliding window protocol. Instead of sending just one segment at a time, the sender sends several, with the maximum number being defined by the window size. Thus, while the ACK for segment 1 is in flight, segment 2 or 3 can already be on its way to the receiver. This way, the channel stays more or less full. TCP is a sliding window protocol.

Let's say that the sender S is transmitting data at 2 packets per second to the receiver R on a network with round-trip time of 1 second. The window is 2 packets. On a functioning network, the timeline looks like this:

Time | Sender              | Receiver
0    | Send S1             | -
.5   | Send S2             | Receive S1, Send Acknowledgement (A1)
1    | Receive A1, Send S3 | Receive S2, send A2
1.5  | Receive A2, Send S4 | Receive S3, send A3
...

Figure 1: simple flow control

The key thing to note here is that it's the arrival of acknowledgement A1 that allows the sender to send S3. Because the window is 2, he couldn't send as long as he had S1 and S2 outstanding, but once he knows S1 has been received he can send S3.
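If you want to play with this, here's a small Python sketch that reproduces the Figure 1 timeline. It's a toy model under the same assumptions as the figure (one segment per half second, half a second each way, window of 2, no loss); the function and variable names are mine.

```python
def sliding_window(total_segments, window=2, one_way_ticks=1, ticks=4):
    """Tick-based sketch of the loss-free timeline (1 tick = 0.5 s).

    The sender may put one new segment on the wire per tick, as long as
    fewer than `window` segments are unacknowledged.
    """
    next_seg = 1          # next new segment to send
    acked = 0             # highest segment acknowledged so far
    data_in_flight = {}   # arrival tick -> segment number
    acks_in_flight = {}   # arrival tick -> ack number
    log = []
    for t in range(ticks):
        sender, receiver = [], []
        if t in acks_in_flight:
            acked = acks_in_flight.pop(t)
            sender.append(f"Receive A{acked}")
        if t in data_in_flight:
            seg = data_in_flight.pop(t)
            acks_in_flight[t + one_way_ticks] = seg
            receiver.append(f"Receive S{seg}, send A{seg}")
        # Outstanding segments = next_seg - 1 - acked; send only if < window.
        if next_seg <= total_segments and next_seg - acked <= window:
            data_in_flight[t + one_way_ticks] = next_seg
            sender.append(f"Send S{next_seg}")
            next_seg += 1
        log.append((t * 0.5, ", ".join(sender) or "-", ", ".join(receiver) or "-"))
    return log


for time_s, sender, receiver in sliding_window(total_segments=10):
    print(f"{time_s:<4} | {sender:<20} | {receiver}")
```

The window check is the whole point: at time 1, S3 only goes out because A1 has just come in.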

Now, what happens when a packet gets lost? There's what's called a retransmission. It looks something like this:

Time | Sender              | Receiver
0    | Send S1             | -
.5   | Send S2 LOST        | Receive S1, Send Acknowledgement (A1)
1    | Receive A1, Send S3 | -
1.5  | Retransmit S2       | Receive S3
2    | Retransmit S3       | Receive S2, send A3
...

Figure 2: packet loss

Now, note that two interesting things have happened here. First, the sender noticed that he didn't receive an ACK for S2 and he retransmitted it. Second, the receiver didn't send an ACK for S3 when he had not received S2. TCP uses what are called cumulative ACKs, which simply means that an ACK for packet n indicates that all previous packets were received. Thus, the receiver can't send A3 until he has seen S2 and S3. A side effect of this is that the sender has to retransmit both S2 and S3, because he doesn't know which packet was lost. [0]
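The cumulative ACK rule fits in a few lines of Python. This is just an illustration with a made-up function name, numbering segments from 1 as in the figures:

```python
def cumulative_ack(received):
    """Highest n such that segments 1..n have all been received (0 if none)."""
    n = 0
    while n + 1 in received:
        n += 1
    return n


print(cumulative_ack({1}))        # 1: only S1 has arrived
print(cumulative_ack({1, 3}))     # still 1: S3 is there but S2 is missing
print(cumulative_ack({1, 2, 3}))  # 3: the retransmitted S2 fills the hole
```

That middle case is the receiver's state at time 1.5 in Figure 2: it has S3 in hand but has nothing new it can acknowledge.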

What we've just shown is a single packet loss. Now, what happens if the network suddenly slows down by a factor of two, so that we can only send half as many packets per second and the round trip time doubles? Here's the key thing: this is a network event. The sender and receiver don't know that the network has changed, so they're still operating under the old parameters. This means that we start to get timeouts and retransmissions as well, but it's a lot messier.

Time | Sender              | Receiver
0    | Send S1             | -
.5   | Send S2             | -
1    | Retransmit S1       | Receive S1, send A1
1.5  | Retransmit S2       | -
2    | Receive A1, Send S3 | Receive S2, send A2
...

Figure 3: network speed change

Note that not only has our transmission rate been cut in half, we're being really inefficient with our use of the network, because S1 and S2 were retransmitted. As we'll see shortly, this makes the situation worse. If the sender had just had a better estimate of the round trip time, we could have avoided the retransmissions entirely. The obvious fix is to have the sender adjust its estimate of the round-trip time based on network conditions. If we do that, we can eventually mostly adjust to the new network conditions.
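One common way to do that adjustment is a running average of measured round-trip times, with the retransmit timeout set to some multiple of the average. Here's a Python sketch; the smoothing weight of 0.875 and the multiplier of 2 are values I've picked for illustration, not something any particular stack is guaranteed to use.

```python
def smoothed_rtt(samples, initial, alpha=0.875):
    """Running (exponentially weighted) average of round-trip samples.

    alpha is the weight kept on the old estimate; it is an assumed value.
    Returns the estimate after each sample.
    """
    srtt = initial
    history = []
    for sample in samples:
        srtt = alpha * srtt + (1 - alpha) * sample
        history.append(srtt)
    return history


def retransmit_timeout(srtt, beta=2.0):
    """Timeout as a multiple of the smoothed RTT (beta is assumed)."""
    return beta * srtt


# The network slows down: RTT jumps from 1 second to 2 seconds.
for estimate in smoothed_rtt([2.0] * 10, initial=1.0):
    print(f"srtt={estimate:.3f}s  timeout={retransmit_timeout(estimate):.3f}s")
```

The estimate creeps toward 2 seconds rather than jumping there, which is exactly the "eventually mostly" part: the adjustment works, but it isn't fast.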

Congestion Issues
What I've just described is more or less how TCP as originally described in RFC 793 works. Unfortunately, it's totally broken. The basic problem is that RFC 793 TCP responds way too slowly to network conditions. This isn't a big deal when you own the entire network, but the Internet is a shared resource.

Consider the following case. Alice and Bob share a fast corporate network like an Ethernet which is connected to the Internet. Alice is sending something to some Internet site and consumes the entire network because it's otherwise idle. Now, when Bob starts sending stuff himself, suddenly the Internet link is at twice its capacity--even though the corporate network is unloaded. Alice and Bob's router (which connects them to the Internet) responds by dropping packets. Say that Alice sends packets A1, A2, and A3 and Bob sends packets B1, B2, and B3. Since Alice is transmitting at the network rate already, only half of these packets can get through. Most routers drop pretty fairly, so we'll assume that half of Alice's packets get through and half of Bob's get through, which is pretty much the same as cutting the transmission rate in half. This gives us effectively the scenario we saw in Figure 3, which is where things start to go bad, as shown in Figure 4 (we'll assume that they share a common receiver who has a fast Internet link). [1]

Time | Alice         | Bob           | Receiver
0    | Send A1,A2,A3 | Send B1,B2,B3 | -
-    | -             | -             | Receive A2,A3,B2
0    | Send A1,A2,A3 | Send B1,B2,B3 | -
...

Figure 4: congestion

Note one difference between Figure 4 and Figures 1-3. Because Alice and Bob are on a fast Ethernet, they send all their packets in one shot and then let the router send them. This works great until we see congestion. But in the congestion case it starts to go very wrong. As in Figure 3, Alice and Bob aren't adjusting their transmission rate. They're just responding to congestion by retransmitting their first set of packets in place of sending new packets. That's a fine response to simple packet loss, but the problem here is that too much load is being offered to the router. Worse yet, they're burning up bandwidth retransmitting data that has actually been received (in this case, A2, A3, and B2). In order to fix these problems, Alice and Bob need to actually send less data--half as much as they were before. Old-style TCP will eventually arrange that by increasing the retransmit timer, but it takes a while. During the initial period of congestion, old-style TCP is not as responsive as it should be. In the case I've just described, it probably just takes a little while to settle out, during which neither side is getting optimal network performance. However, under the right (wrong) circumstances (typically with more senders in the game), the interaction of this set of algorithms can get so bad that data transmission can grind more or less to a halt. This is called congestion collapse.

Congestion Control
This isn't a theoretical issue. In 1986, the Internet experienced a number of actual congestion collapse events. This led to a number of fixes being applied to TCP, which are described in a seminal paper by Karels and Jacobson. The paper describes 7 fixes, but a number of them are complicated and I want to focus on three that I think are at the heart of what it means to have congestion avoidance. For more detail, check out the Karels/Jacobson paper or Rich Stevens's fine TCP/IP Illustrated.

The first two fixes, congestion windows and exponential retransmit timers, are designed to make TCP more responsive to packet loss events that might signal congestion. The idea behind exponential retransmit timers is simple. Instead of just blindly retransmitting at the retransmit interval, each time you do a retransmit you increase the retransmit timer (typically by doubling). This means that as soon as congestion starts to occur, you cut your retransmit rate radically. Congestion windows attack the problem of just retransmitting the same data over and over. Instead of just trusting the receiver to tell you about its window, the sender maintains a separate sending window called a congestion window. When packet loss is detected, the sender unilaterally reduces his window and will only send up to that window size, thus reducing the problem of retransmitting a large number of packets when only a single one is lost.
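Both ideas are easy to sketch in Python. The names here are hypothetical, and the choice to halve the congestion window on loss is my assumption for the example (the text above only says the sender reduces it):

```python
def retransmit_times(initial_timeout, retransmits):
    """Times at which a segment is sent and re-sent if no ACK ever arrives."""
    times = [0.0]
    timeout = initial_timeout
    now = 0.0
    for _ in range(retransmits):
        now += timeout
        times.append(now)
        timeout *= 2  # double the timer after every retransmit
    return times


def send_limit(receiver_window, congestion_window):
    """The sender obeys whichever window is smaller."""
    return min(receiver_window, congestion_window)


def on_loss(congestion_window):
    """Assumed reduction rule: halve, but never go below one segment."""
    return max(1, congestion_window // 2)


print(retransmit_times(0.5, 4))   # gaps of 0.5, 1, 2, 4 seconds
cwnd = on_loss(8)
print(cwnd, send_limit(receiver_window=8, congestion_window=cwnd))
```

With a fixed 500 ms timer the sender would have retransmitted 15 times in the first 7.5 seconds; with doubling it retransmits 4 times.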

The third fix is what's called "slow start". Remember that I said that when Bob first started sending he shouldn't send at some fast default data rate? Slow start means that when Bob starts a connection, he initially uses a very small congestion window and then gradually increases it as he gets successful data transmission. This might look something like Figure 5: [1]

Time  Sender                    Receiver
0     Send S1                   -
.5    -                         Receive S1, send acknowledgement (A1)
1     Receive A1, send S2,S3    -
1.5   -                         Receive S2, send A2
...

Figure 5: slow start

Karels and Jacobson's fixes also include a slow-start-like algorithm for recovering from congestion events. Once you have decreased your congestion window in response to congestion, you slowly probe back up to higher data rates. This lets you get back to high speeds after experiencing transient congestion, thus adjusting both up and down in response to congestion.
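
To make the shape of this concrete, here is a rough Python sketch of how a congestion window might evolve round trip by round trip. The specific rules are assumptions chosen for illustration: the window doubles each round trip during the initial slow start (matching the 1, then 2 packets in Figure 5), halves when a loss is seen, and afterwards grows by one packet per round trip. The name window_trace and the max_window limit are hypothetical.

```python
def window_trace(rounds, loss_rounds=(), max_window=16):
    """Return the congestion window used in each round trip.

    rounds      -- number of round trips to simulate
    loss_rounds -- round indexes in which a loss is detected
    max_window  -- assumed ceiling (e.g. the receiver's window)
    """
    cwnd = 1            # slow start begins with a very small window
    recovering = False  # becomes True after the first loss
    trace = []
    for r in range(rounds):
        trace.append(cwnd)
        if r in loss_rounds:
            # Back off in response to congestion.
            cwnd = max(1, cwnd // 2)
            recovering = True
        elif recovering:
            # Probe back up slowly after a congestion event.
            cwnd = min(cwnd + 1, max_window)
        else:
            # Initial slow start: grow quickly while all goes well.
            cwnd = min(cwnd * 2, max_window)
    return trace


if __name__ == "__main__":
    print(window_trace(3))                   # no loss
    print(window_trace(8, loss_rounds={4}))  # one loss in round 4
```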

Fairness
The final concern we have is fairness. Alice and Bob have an equal right to use the network, so they should get equal shares of the bandwidth. TCP needs to ensure that just because Alice was there first doesn't mean that she gets all the bandwidth. The description here isn't enough to show that this will happen, but you can demonstrate under certain reasonable assumptions that TCP is fair. However, it's only fair if everyone plays by the rules. If you had a TCP implementation which was slightly less aggressive in response to congestion, it could get more than its fair share of the network. A related issue is that if you designed a non-TCP protocol--which of course has to share the network with TCP--it needs to do so fairly. A critical issue when designing Internet protocols is ensuring that they do so. In the next installment we'll talk about an important application where this is a problem.
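
One way to get a feel for the fairness claim is a toy simulation. The sketch below is not a proof, and its model is an assumption of mine rather than something spelled out above: two senders share a link, both add one unit of rate per round while the link has room, and both halve their rate in any round where the combined rate exceeds the link's capacity. The function name share_bandwidth and all the numbers are hypothetical.

```python
def share_bandwidth(rate_a, rate_b, capacity=100.0, rounds=500):
    """Simulate two senders that follow the same up/down rules.

    Each round, if the combined rate exceeds capacity, both senders
    see congestion and halve their rate; otherwise both increase
    their rate by one unit. Returns the final (rate_a, rate_b).
    """
    for _ in range(rounds):
        if rate_a + rate_b > capacity:
            # Congestion: both back off multiplicatively.
            rate_a /= 2
            rate_b /= 2
        else:
            # No congestion: both probe upward additively.
            rate_a += 1
            rate_b += 1
    return rate_a, rate_b


if __name__ == "__main__":
    # Alice was there first and starts with most of the bandwidth.
    alice, bob = share_bandwidth(80.0, 10.0)
    print(alice, bob)
```

The additive step leaves the gap between the two rates unchanged and every halving cuts the gap in half, so under these assumptions the rates drift toward equal shares no matter who started first. A sender that backed off by less than half would keep a larger share, which is the cheating problem described above.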

 

[0] I've shown S3 being retransmitted at time 2 under the assumption that the sender uses individual packet-level retransmit timers, so S3 doesn't timeout until time 2. There are other designs, of course.
[1] Yes, I know that 2 isn't half of 3, but you can't deliver 1.5 packets.

Posted by ekr at 10:34 AM | Comments (29) | TrackBack

December 10, 2003

Slow progress on the Sprint front

Well, we're making some progress on the cell phone porting thing. Calls in and out of my phone now work with the new number, but Internet access still doesn't. The problem seems to be that when they made me a new account I got a new username/password, but that never made it to my phone. I tried to push a new username/password pair using the Web UI, but it didn't work. Have to go in and get the phone reprovisioned. I guess this is what you get for being a guinea pig for number portability.
Posted by ekr at 08:44 PM | Comments (34) | TrackBack

December 09, 2003

Bad security questions II

Actually, the whole security question thing is getting less and less viable. As far as I can tell, the principle is to have some piece of information that is fairly secret but that doesn't have to be memorized. That typically means some piece of your life history. Unfortunately, two trends are conspiring to make this approach less and less secure:
  1. There are only a limited number of security questions currently in use (mother's maiden name, pet's name, favorite color, etc.). If you have a large number of accounts then you start to have to overlap questions, which radically diminishes the security of the questions, since now someone at merchant A knows your security question at merchant B.
  2. As more and more information moves to the Web, it gets easier and easier to determine this kind of information about someone.

What would be really great, of course, would be an authentication system that didn't require either substantial memorization or this kind of life-based security question. For a while, it seemed like public-key based authentication was that technology, but it seems to have been doomed by bad implementation.

Posted by ekr at 07:51 PM | Comments (42) | TrackBack

Bad security questions

RTFM, Inc. is changing its banking from Wells Fargo to Bank X [0]. Being the owner and sole employee, I get to handle the transaction. It's of course common practice for banks to ask you for your mother's maiden name as a form of backup authentication. However, in this case, they asked me instead for my favorite color. This is a singularly bad choice of a challenge question because the number of colors people actually choose is incredibly small. Here, for instance, are the results of a survey in the US [*] [1]

Since so many people choose blue, the additional security provided by this question is less than optimal. An attacker who guesses blue will be right almost half the time. Worse yet, what happens if they guess wrong? Can they just hang up the phone and try again with a different operator? Unless the bank's systems stop them from doing so, the attacker can just try blue, green, and red in sequence and get almost a 75% success rate!

To his credit, when I pushed back on this security question my account manager said that he never liked that question either and invited me to use a different, better question, which I won't reveal here (though it's not incredibly surprising, either). Still, I wonder how many other people have accounts with Bank X and still have this as their security question.

[0] Name changed to protect the guilty.
[1] It's not clear exactly how the survey was performed, and I doubt it was perfectly scientific (though the people who did the survey seem to do a fair amount of political polling), but the result seems qualitatively right. Informal surveys produce similar results [*].

Posted by ekr at 07:38 PM | Comments (41) | TrackBack

Permanent kittens

Have you ever noticed how much cuter kittens are than cats? And it's not just that they look cuter (though see neoteny), but also that they're friendlier. Most cats won't give you the time of day but a kitten will curl up in your lap and play with you as much as you want. Same thing with puppies. A few years ago this gave me an idea: what if we could genetically engineer pets that stayed juvenile forever? Wouldn't that be a great pet?

As a nice side effect, manufacturing this kind of animal would be a great testbed for aging research. At the moment it's pretty hard to do broad-scale testing of longevity technologies because people are naturally a bit antsy about screwing with their own bodies, so though there's life extension work going on in the lab, it's a long shot as far as financial incentives. But this is an application that you could sell more or less right away (except maybe in California [*]). Now, it's certainly possible, even likely, that the technology used to create permanently juvenile animals isn't the same as that required to prolong life, but I bet you'd learn a lot about life extension working on the problem. And since animals mature in a fraction of their lifespan, you'd know if your techniques were working a long time before you would know if life extension had worked.

Posted by ekr at 06:42 PM | Comments (49) | TrackBack

December 08, 2003

More joy of phone porting

Well, spent some more time on the phone with Sprint this morning. The bad news is that the phone techs have no idea how to fix things until the port goes through, allegedly on Wednesday morning. I could take it into the store but I'm too busy today. Based on the current track record, I'm not exactly confident that things will work on Wednesday either.

The good news is that my old cell phone, which allegedly has already lost its number, continues to work, so I'm back to having separate cell phone and palm, just like before.

Posted by ekr at 11:45 AM | Comments (18) | TrackBack

December 07, 2003

Sorting data off a tap

Here's a somewhat interesting though not overly difficult technical problem that I ran across on a job this week. We're capturing and decoding network traffic using ethernet. Here's the catch: ethernet can be used in a full duplex mode. Say you're trying to capture traffic between A and B and the network is saturated in both directions. In order to capture all this traffic, you need two ethernet cables to the tap, one for the data from A to B and one for the data from B to A. And of course that means that the traffic from A to B comes into our capture machine on one Ethernet interface (say interface 1) and from B to A on another (say interface 2).

In principle, our capture engine is really simple. We wait for our incoming ethernet interfaces to be ready to read. When they are, we read packets off them. However, if we have two interfaces things get tricky. Let's examine the simplest possible case, the TCP three-way handshake:

Now, if A and B and the network between them are fast enough, the SYN and the SYN/ACK will arrive at our capture host more or less simultaneously. By the time our process wakes up, both interface 1 and interface 2 are ready to read. Unless we know whether A is the client or server, we have no way of knowing which packet arrived first without reading both packets and examining the timestamps.

Now, if we have only two packets, it's easy to examine them and figure out which one was first. Say that the packets arrived in the order 1,2 (the packet on interface 1, then the packet on interface 2). In that case, we know that we can process the packet on interface 1 then the packet on interface 2.

However, there are lots of situations where it's arbitrarily more complicated. In particular, the interfaces have buffering, so it's quite possible that by the time we wake up there are a bunch of packets available on each interface. For instance, consider what happens if one side sends two packets, so that we get the order 1,1,2. The simpleminded algorithm we just described would give us the delivery order 1,2,1, which is not good. There are a lot of ways to do this kind of sorting, but here's a simple, effective one.

  1. Set TEMP = 0.
  2. Wait for either interface to be ready to read.
  3. Read a packet off the ready interface (if both are ready, choose one arbitrarily) and write it into TEMP.
  4. Set OTHER to the interface you didn't read from.
  5. If OTHER isn't ready to read, deliver TEMP. Go to step 1.
  6. If OTHER is ready to read, then read a packet off OTHER. If this packet is older than TEMP, deliver it. If TEMP is older, deliver it and set TEMP to the new packet.
  7. Set OTHER to be the interface of the packet you just delivered (TEMP already holds the oldest unread packet from the other interface).
  8. Go to step 5.
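
A sketch of this kind of two-interface merge in Python might look like the following. Several things here are assumptions for the sake of a runnable example: the interfaces are stood in for by in-memory queues of (timestamp, label) tuples, "ready to read" just means the queue is non-empty, and the name deliver_in_order is made up. After each delivery, the sketch reads next from the interface whose packet was just delivered, since the packet being held is already the oldest one seen from the other interface.

```python
from collections import deque


def deliver_in_order(iface1_packets, iface2_packets):
    """Merge packets from two capture interfaces by timestamp.

    Each argument is a sequence of (timestamp, label) tuples in the
    order that interface received them. Returns the delivery order.
    """
    queues = [deque(iface1_packets), deque(iface2_packets)]
    delivered = []

    while queues[0] or queues[1]:
        # Read one packet off a ready interface and hold it.
        src = 0 if queues[0] else 1
        temp = queues[src].popleft()
        other = 1 - src

        # Keep comparing the held packet against the other side.
        while queues[other]:
            pkt = queues[other].popleft()
            if pkt[0] < temp[0]:
                # The new packet is older: deliver it and keep
                # reading from the same interface.
                delivered.append(pkt)
            else:
                # The held packet is older: deliver it, hold the new
                # one, and read next from the interface we just
                # delivered from.
                delivered.append(temp)
                temp = pkt
                other = 1 - other

        # Nothing left to compare against, so deliver what we hold.
        delivered.append(temp)

    return delivered


if __name__ == "__main__":
    # The 1,1,2 case: two packets on interface 1, then one on 2.
    iface1 = [(1, "a1"), (2, "a2")]
    iface2 = [(3, "b1")]
    print([label for _, label in deliver_in_order(iface1, iface2)])
```

Note that a real capture loop can only see what has arrived so far, so a packet that is still in flight on the idle interface when the held packet is delivered can still end up out of order.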

This isn't always the most efficient approach. In particular, if reading single packets off the interface is a lot slower per packet than reading multiple packets at once, it's probably more efficient to read a bunch of packets and then sort them. On the other hand, it's more complicated to implement merging two lists, so there's a tradeoff here.
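
For the batch variant, a language with a merge routine in its standard library takes away most of the implementation burden. Here is a small Python sketch using heapq.merge, under the assumption that each interface hands back its batch as a list of (timestamp, label) tuples already in timestamp order. The name merge_batches is hypothetical.

```python
import heapq


def merge_batches(batch1, batch2):
    """Merge two per-interface batches into one timestamp-ordered list.

    Each batch must already be sorted by timestamp, which holds if
    each interface timestamps packets in the order it receives them.
    """
    return list(heapq.merge(batch1, batch2, key=lambda pkt: pkt[0]))


if __name__ == "__main__":
    iface1 = [(1, "a1"), (2, "a2")]
    iface2 = [(3, "b1")]
    print(merge_batches(iface1, iface2))
```

This only orders the packets within the batches you have read, so packets that straddle two reads still need the same care as in the one-at-a-time approach.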

Posted by ekr at 11:06 PM | Comments (14) | TrackBack

The Sprint porting experience

So, I went into Sprint to port my Verizon cell number onto my Treo 600. Unfortunately, when I got there they informed me that my Verizon number was in a different service area from my Sprint number and so they can't do the port. Instead, they have to cancel my old Sprint account and give me a new number in the same service area. Then they can do the port.

The whole procedure took about 25 minutes and apparently involved a whole bunch of separate pieces of software. As I left I made a critical mistake: I didn't verify that the phone worked. Of course, the first time I tried to make a phone call... nothing. My phone can't be validated. Next step, Sprint phone support. I explain my problem to the support rep and she understands how to fix it. Turns out my phone has been programmed with my old Verizon number, the one I want ported, not the new number. And since the new number is the only one that currently works, I'm SOL. So, I need to reprogram my phone. Unfortunately, I've called her from my cell phone, and the interface is modal. Doh!

So, I call back from a landline and we go through the old "activating your phone" dialog (which at least I now know how to do). Unfortunately, that doesn't work, so I get bounced to a specialist, which means more holding on the line. The specialist bounces me to a number porting specialist who bounces me to another number porting specialist.

The final specialist informs me that Verizon has already released the number and that there are just "two more steps" to get the number ported. That should happen by midnight tonight. I'm instructed to program the Verizon number into the phone and wait. Ugh.

Posted by ekr at 12:57 PM | Comments (9) | TrackBack

December 06, 2003

Razor wire? What are we thinking?!?!?!

Am I the only one who thinks this is not good?
ABU HISHMA, Iraq, Dec. 6 -- As the guerrilla war against Iraqi insurgents intensifies, American soldiers have begun wrapping entire villages in barbed wire.

In selective cases, American soldiers are demolishing buildings thought to be used by Iraqi attackers. They have begun imprisoning the relatives of suspected guerrillas, in hopes of pressing the insurgents to turn themselves in.

Reading the rest of the article is not going to make you feel any better, either.

Posted by ekr at 08:57 PM | Comments (57) | TrackBack

December 05, 2003

Another moon program?

Ok, this time I do agree with Gregg Easterbrook, who is arguing that the rumored new moon mission is a big waste of money [*].
NASA doesn't need a grand ambition, it needs a cheap, reliable means of getting back and forth to low-Earth orbit. Here's a twenty-first century vision for NASA: Cancel the shuttle, mothball the does-nothing space station, and use all the budget money the two would have consumed to develop an affordable means of space flight. Then we can talk about the Moon and Mars.

When you're faced with some goal that's at the limit of your technical capabilities, it's generally a better idea to spend your time building up infrastructure first, rather than trying to accomplish everything right away. For instance, the human genome project took 13 years but most of the actual sequencing progress was made in the final few years, and now sequencing is almost a routine procedure.

In this case, as Easterbrook suggests, it would probably be good to take the next 5-10 years and just work on better launch platforms (or even a beanstalk). If we can do that, it will be a lot easier to mount substantial space missions economically.

Posted by ekr at 10:33 PM | Comments (15) | TrackBack

The Lancet wants to ban smoking

This week's issue of the Lancet contains a truly astonishing editorial endorsing a ban on cigarettes:
Tim Lord, the Chief Executive of the Tobacco Manufacturers' Association, believes that price is the main determinant of how many smokers there are. We disagree. Availability and acceptability are more important. If tobacco were an illegal substance, possession of cigarettes would become a crime, and the number of smokers would drastically fall. Cigarette smoking is a dangerous addiction. We should be doing a great deal more to prevent this disease and to help its victims. We call on Tony Blair's government to ban tobacco.

Let's ignore the all-too typical assumption that people can't be trusted to make their own choices but rather need doctors to decide for them. Instead, let's focus on the argument that a ban is needed because prices won't do the job. In fact, smoking is fairly price sensitive, especially among young smokers [*] [*]. So, what exactly is the need for a ban?

Posted by ekr at 09:31 PM | Comments (11) | TrackBack

December 04, 2003

Barbra Streisand loses

Via Eugene Volokh I see that Barbra Streisand has lost her lawsuit against Ken Adelman [*]. Better yet, she is being forced to pay Adelman's legal fees. I said it was a bad idea to piss off Ken Adelman...
Posted by ekr at 07:37 AM | Comments (50) | TrackBack

December 03, 2003

Drug free school zone, huh?

Mark Kleiman points to an interesting article in the FAS Drug Policy Bulletin. The authors conclude that drug free school zones are unlikely to have much of an effect, at least in preventing people from selling drugs near schools. Part of the problem is that a very large portion of the state of Massachusetts is covered by "drug free school zones". See, for instance, the following map of New Bedford:

The major effect of drug free school zone legislation seems to be to enhance penalties for drug dealers, regardless of whether they're selling drugs to children. Actually, according to this article, drug sales to children don't seem to be much of a problem:
Our review of case files did not suggest that drug dealers were selling drugs to children. Only one of our 443 study cases involved a sale to a minor, and in that case, there was a personal relationship between the adult and the minor - it was not a street sale. More than 70% of the cases occurred when school was not in session. None of the cases involved a street pusher enticing children - the image that justifies the legislation in the minds of many.

This doesn't surprise me. It's true that I lived in a suburban area when I was in high school, but we didn't exactly have people crowding around trying to sell us drugs.

Posted by ekr at 09:08 PM | Comments (60) | TrackBack

December 02, 2003

Ouch

Caught Celebrity Poker Showdown on Bravo tonight. Not particularly good play, but the highlight/lowlight of the night was seeing Willy Garson call Ben Affleck's all-in with a Q-8 suited. That would be a bad call even if you didn't know that Affleck had a pair of queens. Phil Gordon's commentary pretty much sums it up: "That was the worst call I've ever seen."
Posted by ekr at 09:49 PM | Comments (54) | TrackBack

This is pretty sweet

Check out the new Sony X505. It costs $4000 and weighs 1.7 lbs. When the original Sony 505 was introduced 5 years ago it cost $2000 and weighed 3 lbs. Dropping over a pound in 5 years is nothing to sneeze at. At this rate, when the Sony T505 is introduced in 2013, it will cost $8,000 and weigh -.3 lbs.

UPDATE: Revised after Eu-Jin Goh pointed out that the X505 is $4000.
UPDATE2: Corrected an error. The old 505 was $2000, not $200. Thanks to Eu-Jin again.

Posted by ekr at 09:19 PM | Comments (30) | TrackBack

Russia won't ratify Kyoto

Lookitthat... Russia isn't going to ratify the Kyoto protocol [*]. Apparently, this means that the whole Kyoto thing is dead, since there isn't a critical mass of ratifying countries. Not that the signatories were complying with their required reductions anyway:
Illarionov's comments came as delegates from 180 countries met in Milan, Italy, to explore the future of the accord. The European Commission warned meanwhile Tuesday that the European Union was falling short of its own targets in the accord, and needed to urgently introduce new measures to correct the situation.

I wonder if they'll bother now...

Posted by ekr at 08:44 PM | Comments (9) | TrackBack

December 01, 2003

Making money II

In which our hero extracts $200 from a cell phone company...

In previous episodes, I had purchased a Treo 600, only to discover that Amazon was selling them for $200 less. So, today Terence and I went into the Sprint store to see if I could get some of that money back.

Actually, it was pretty easy. I waited about 5 minutes for a customer service tech and then told her I wanted to speak to a manager. I explained the situation and she agreed to price match. The whole process took about 30 minutes and involved a few rounds of me having to explain the situation to her (Yes, Amazon really offered this on new phones. Yes, it was on top of the normal $100 activation rebate, not instead of it) but the job was done and I walked out $200 richer.

I'd heard lots of horror stories about Sprint support, but the service manager was unfailingly polite and cheerful. I doubt having to spend 30 minutes giving $200 rebates to customers is the highlight of her day, but if it wasn't, she didn't let on. It's actually a bit of a surprise that this isn't more routinized. As Terence points out, you would expect that there would be a parade of Palo Alto nerds walking in demanding their rebate and they would have it all set up: "Oh, another one of you people. Here's your $200. Go away." Maybe I'm just the only one cheap enough to bother.

Posted by ekr at 09:06 PM | Comments (10) | TrackBack