What would be maximally convenient would be an archival filesystem like the one used in Plan 9 [*]. The really cool thing about it is that it presents nightly snapshots of the filesystem state. So, for instance, the entire filesystem state as of December 31, 2001 would be stored under /n/dump/2001/1231. This is much more convenient than conventional backup systems.
Unfortunately, I don't know of an implementation of a filesystem like this for FreeBSD. In theory, you could just do it with a copy, but that would consume way too much storage. Plan 9 uses some optimization to avoid duplicate data. You'd need to have something like that to make this work in practice.
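To make the duplicate-data point concrete, here is a minimal sketch of one common way to do it: store each block under a hash of its contents, so a nightly snapshot that repeats yesterday's data costs almost nothing extra. This is an illustration under my own assumptions, not a description of Plan 9's actual code; the class name, the block size, and the choice of SHA-256 are all hypothetical.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, chosen only for illustration


class SnapshotStore:
    """Toy content-addressed store: identical blocks are kept only once."""

    def __init__(self):
        self.blocks = {}     # hex digest -> block bytes
        self.snapshots = {}  # snapshot name -> {path: [digest, ...]}

    def _put_block(self, block):
        digest = hashlib.sha256(block).hexdigest()
        # setdefault leaves an existing identical block untouched.
        self.blocks.setdefault(digest, block)
        return digest

    def snapshot(self, name, files):
        """Record every file in `files` (path -> bytes) under `name`."""
        tree = {}
        for path, data in files.items():
            tree[path] = [
                self._put_block(data[i:i + BLOCK_SIZE])
                for i in range(0, len(data), BLOCK_SIZE)
            ]
        self.snapshots[name] = tree

    def read(self, name, path):
        """Reassemble a file as it looked in the named snapshot."""
        return b"".join(self.blocks[d] for d in self.snapshots[name][path])

    def stored_bytes(self):
        """Bytes actually held, after duplicate blocks are collapsed."""
        return sum(len(b) for b in self.blocks.values())
```

Taking two snapshots where only one small file changed adds only that file's new blocks to `stored_bytes()`, while `read("2001/1230", path)` and `read("2001/1231", path)` still return each day's version.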
OnStar has said that its equipment was not involved in that case. An OnStar spokeswoman, Geri Lama, suggested that Mr. Dunnam's worries were overblown. The signals that the company sends to unlock car doors or track location-based information can be triggered only with a secure exchange of specific identifying data, which ought to deter all but the most determined hackers, she said.
I'm kind of curious what "secure exchange of specific identifying data" means. Sometimes when you hear "ought to deter all but the most determined hackers" it means "We know what we're doing. We designed our system securely but we're being careful about what we claim." Sometimes it means "we bungled things". It's easy to get these protocols wrong. If anyone has a pointer to documentation on their protocols I'd love to take a look.
While acknowledging uncertainty, the NRC tentatively suggested an estimate of 12 cents to reflect the cost of carbon emissions resulting from a one gallon decrease in gasoline consumption (which corresponds to a cost of $50 per metric ton of carbon). Further, it suggested an energy-security cost associated with consuming one gallon of gasoline of 12 cents (which corresponds to a cost of $5 per barrel of oil). Finally, the NRC estimated a cost of 2 cents per gallon due to emissions of air pollutants associated with the production and distribution of gasoline, resulting in total external costs of 26 cents per gallon.
The CBO report says that the gas tax is currently $.41. If these estimates are correct, then the gas tax is too high right now.
Again, I'm not qualified to review any of this data in detail, and I haven't read the NRC report. However, if we accept these results, it sure looks like at worst it would take a modest increase in gas taxes to achieve an efficient outcome.
Here's the really interesting bit:
The advantage of a gasoline tax over CAFE standards is much greater in the short run. Neither the higher tax nor higher CAFE standards would achieve full effectiveness until all existing vehicles were replaced, or after about 14 years in CBO's analysis. But over the initial 14 years, the tax would save 42 percent more gasoline than would CAFE standards with trading, while costing 27 percent less (see Summary Figure 1). The gasoline tax would outperform the CAFE standards because, while both policies would improve the fuel economy of new vehicles, the tax would produce greater immediate gasoline savings by inducing owners of both new and existing vehicles to drive less. In contrast, by making new vehicles cheaper to operate, higher CAFE standards would encourage owners of new vehicles to drive more (and would not affect the driving incentives of existing-vehicle owners at all).
Given the uncertainty of predictions beyond 10 years, decreasing consumption now is much more interesting than decreasing it 14 years hence, so this is an even stronger argument in favor of gas taxes.
I'm not really qualified to review their economic modelling, but this argument sounds intuitively right to me.
Shorter and alternate sentences and speedier release programs already were being implemented by the state's Department of Corrections - or were part of a recent lawsuit settlement over California's parole system. But senior administration officials said last week Schwarzenegger has asked them to consider further steps. The state faces a budget deficit that could grow to between $12 billion to $24 billion by the middle of 2005 if current spending and revenues don't change.
The changes would reverse years of a get-tough policy on criminals under California's last three governors, and could face opposition from Republican lawmakers who make up a minority in the Legislature.
Seeing as about 11% of the CA general fund (not including bond-based expenditures, which are large) is spent on courts and corrections, this could be a pretty big deal.
Now, maybe I'm just radically naive about how political campaigning works, but I don't see the point of this. So, what? Random people are going to type "miserable failure" into Google, get Bush's biography and think "ah, what a loser, I'll vote for Dean". That can't be right, seeing as most people would never type "miserable failure" into Google if they hadn't heard of this little hack. Surely the purpose isn't that. Rather, I'm assuming that the ostensible purpose is PR: people will hear about it and will sort of subconsciously make the association. Frankly, I doubt it. Sure, people who already hate Bush (or love Dean or whatever) will feel all ra ra about it, but speaking as someone who just dislikes Bush, I find the whole thing rather juvenile, the Internet version of throwing a pie in Bush's face.
Like many such political stunts I suspect that its real purpose is not to directly advance one's political agenda but rather as a sort of act of group bonding for the faithful. In which case, I guess it doesn't matter if it looks silly to people like me who are not--provided of course that it doesn't alienate the outsiders more than it energizes the faithful.
My guess is that the Allies did most of the damage during our campaign to push the Nazis back into Germany. Now, no doubt the Nazis did some of the damage during that period as well, but by then the Allies had better air support and more materiel so we were able to blow more stuff up. Besides, when we shelled the Germans, they were mostly occupying territory that was more or less untouched. When they shelled us, we were occupying territory that we had taken from them and therefore had probably already been shelled by the Allies.
Just a thought.
This isn't some isolated behavior, either. People all over the Western world save up the rest of the year so that they can buy presents, so we need to conclude that they really do prefer things this way. It's just rather counterintuitive, since we're used to economic situations where the marginal value of money decreases the more you have of it, but in this case it's actually increasing, at least up to a point.
So, if you have a Mac, this works fine. The Mac supplies charging power to the iPod through the Firewire interface. If you don't have a Mac, things are a little different.
I was using my trusty Sony Vaio, which has the four-pin IEEE 1394 (iLink) connector, which doesn't provide power. Apple gives you a 4 to 6 pin converter, but it just changes the form factor, so you're still not supplying power to the iPod. To charge the iPod, you connect the I-jack to the power adaptor. This means that you can either charge the iPod or copy data to it, but not both. This is ok some of the time, but when you're moving a lot of data to the iPod (which runs the disc constantly) you're burning up battery like crazy. This means you have to be pretty careful to have your iPod charged before you do any big copying activity. [*].
There are a number of things you can do to work around this:
All of these options seem kind of lame. I'm no Firewire expert, but it seems like Apple could have put another jack in the base and let you run a $10 Firewire cable from the base to the power adaptor. I'm not sure why they didn't, except that having things this way makes using an iPod with your Mac a much nicer experience than with a PC.
McGinnis, the agency's pharmacy affairs director, said the FDA would not piggyback its inspections on the Canadian system because the United States inspects drug manufacturers around the world, while Health Canada relies on inspections done by the drug maker's host country.
If Canada's drugs are manufactured in the same factories as American drugs, then it's possible that the reason that Canada is getting safe drugs is that the manufacturers are complying with FDA regs, not because of the Canadian inspections. That's fine if most drugs are sold into the US, but if most people start buying their drugs from Canada, then it might be less important to comply with the FDA and quality could suffer. Now, I'm not convinced that the FDA's onsite inspection procedure is really any better, but if it is, then mass reimportation could in fact be a problem.
16:43:24.369144 ld1.rtfm.com.50469 > ld2.rtfm.com.8080: S 3615660492:3615660492(0) win 57344 (DF)
16:43:24.369241 ld2.rtfm.com.8080 > ld1.rtfm.com.50469: S 2150875758:2150875758(0) ack 3615660493 win 57344 (DF)
16:43:24.369257 ld1.rtfm.com.50469 > ld2.rtfm.com.8080: . ack 1 win 57920 (DF)
16:43:24.369307 ld1.rtfm.com.50469 > ld2.rtfm.com.8080: P 1:78(77) ack 1 win 57920 (DF)
16:43:24.369377 ld2.rtfm.com.8080 > ld1.rtfm.com.50469: R 2150875759:2150875759(0) win 0
Now, if you're a networking guy you immediately suspect
that you're running into some kind of system limit, especially
when nothing crops up in the Apache error logs. I'm also not
running out of per-process file descriptors or mbufs.
Here's the second clue: the stall isn't permanent. The graph
below shows the number of requests served as a function of
time.
The key here is the 60 second periodicity. This smells of TCP TIME_WAIT, the period after a connection is shut down during which its host/port quartet can't be reused, which is 60 seconds on FreeBSD. One way that this can happen is if you run out of client TCP ports. Say you have a maximum client TCP port of 5000, then after you've used 5000 ports, you have to wait for those ports to exit TIME_WAIT before you can reconnect. But I've already expanded the client's port range up to 60,000 so that can't be it.
Here's the third clue: we're doing about 200 transactions per second and we get about 40 seconds of requests before everything stalls, for a total of about 8000 transactions. If we reduce the number of connections to 100 per second, things work fine. This points the finger at the source of the problem: by default the maximum total number of open sockets (across all processes) on this BSD system is 8008.
These three pieces of information are enough to work out what's going on. We're hitting the server at 200 requests per second. After each request finishes, the socket sits in the TIME_WAIT state for 60 seconds. So, after 40 or so seconds, we've consumed all the available sockets on the system. 20 seconds later, the first of the sockets in TIME_WAIT starts to time out and so we can create sockets again, and the cycle repeats.
The way to fix this problem is to increase the maximum number of open sockets. BSD has a setting for this called kern.ipc.maxsockets. When this is set to 32000, things work fine.
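The arithmetic above is easy to check with a toy simulation. This is only a sketch of the model described here (every request ties up one socket for the full TIME_WAIT period, and requests that can't get a socket are simply dropped), not a model of the real BSD socket code; the function name and the one-second tick are my own choices.

```python
from collections import deque


def simulate(rate=200, time_wait=60, max_sockets=8008, duration=180):
    """Return the number of requests served in each one-second tick."""
    expiries = deque()  # expiry tick of each socket still in TIME_WAIT
    served = []
    for now in range(duration):
        # Sockets whose TIME_WAIT has elapsed become available again.
        while expiries and expiries[0] <= now:
            expiries.popleft()
        free = max_sockets - len(expiries)
        n = min(rate, free)
        # Each served request leaves one socket in TIME_WAIT.
        expiries.extend([now + time_wait] * n)
        served.append(n)
    return served
```

With the defaults the model serves 200 requests per tick for the first 40 ticks, stalls until tick 60, and then repeats with a 60 second period. Calling `simulate(max_sockets=32000)` or `simulate(rate=100)` never stalls, since the steady-state demand (rate times TIME_WAIT) stays under the limit.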
There are still two features of the data I haven't figured out yet: the dip at about 20 seconds and the small spike at about 41 seconds. If anyone has any suggestions for those, I'd be interested to hear them.
Form the possessive singular of nouns by 's
Great. So how do we form the possessive of "Bill Gates"? "Gates" is a plural noun, right? But when we talk about Bill Gates there's only one of him...
Yes, yes, I know that this is really bad news for the KDE guys, but seriously, if you stripped the logos (and the silly names), not one user in 100 could tell you whether a given app was KDE or GNOME. And we'd certainly be much better off without all of this infighting.
In that case, our program looks like this:
A little thinking quickly reveals that this simple machine works just fine, with only two states, no matter how long the tape is.
For non CS types, a Turing Machine (TM) is a sort of primitive idealized computer. A Turing Machine consists of a state machine connected to a tape reader and an infinitely long tape. The tape is divided up into segments, each of which can have one of some finite set of symbols on it. The state machine can be in any of a finite number of states and has a set of rules for which states lead to other states. For instance, you might have a rule that said "if you're in state 1 and you read symbol 2, go to state 2". The cool thing about a Turing Machine is that in principle any program you can run on an ordinary computer can be run on a Turing Machine--though it may be a lot slower. [0] Sometimes it can be quite tricky to figure out how to write the programs, but it can always be done and it's interesting to think about how.
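If you'd rather play with one in software, a Turing Machine simulator only takes a few lines. The sketch below is my own illustration: the rule format (state, symbol) -> (new state, symbol to write, direction to move) extends the "if you're in state 1 and you read symbol 2, go to state 2" style of rule with a write and a move, and the bit-inverting example machine is hypothetical, chosen only because it is tiny.

```python
def run_tm(rules, tape, state="start", halt="halt", blank="_", max_steps=10000):
    """Run a Turing Machine and return the final tape contents as a string.

    rules maps (state, symbol) -> (new_state, symbol_to_write, "L" or "R").
    """
    cells = dict(enumerate(tape))  # sparse tape: position -> symbol
    pos = 0
    steps = 0
    while state != halt:
        if steps >= max_steps:
            raise RuntimeError("step limit reached without halting")
        symbol = cells.get(pos, blank)
        state, write, move = rules[(state, symbol)]
        cells[pos] = write
        pos += 1 if move == "R" else -1
        steps += 1
    if not cells:
        return ""
    lo, hi = min(cells), max(cells)
    return "".join(cells.get(i, blank) for i in range(lo, hi + 1)).strip(blank)


# Hypothetical example machine: flip every bit, halt at the first blank.
INVERT = {
    ("start", "0"): ("start", "1", "R"),
    ("start", "1"): ("start", "0", "R"),
    ("start", "_"): ("halt", "_", "R"),
}
```

Running `run_tm(INVERT, "0110")` walks right across the tape flipping each bit and stops when it reads a blank. The dictionary-backed tape grows on demand, which is as close to "infinitely long" as a real computer gets.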
My toy Turing Machine would have a small tape reader and a mylar tape, plus some sort of dry-erase marker you could use to mark the tape. (Yes, I know that this isn't a real Turing Machine since the tape is finite, but you can do lots of cool stuff with finite tape TMs, too). You'd also have some sort of readout on the front to tell you what state you were in. I think this would make a great desktop toy, but I've never seen one. Does any company make them?
[0] Actually, because a TM has an infinitely long tape whereas real computers have only finite sized memories, a TM is more capable than a real computer.
The first big problem is display size. The Treo's display is about 45x45 mm and 160x160 pixels. The resolution could be improved, but the size of the screen is basically limited by the size of the device. So, what we need is a display that's not actually connected to the device. The obvious fix is a head-up display. Effectively a pair of glasses which displays an image in front of your face. The good news is that these already exist. The bad news is that they currently look pretty stupid and the resolution isn't that great (640x480). Plus, they're expensive. Still, we're not that far off. If the resolution were 1024x768 (about as good as your average laptop), I'd probably be willing to look stupid.
The second big problem is that the Treo is a pretty wimpy computer. It's incredibly slow and the operating system--if you can call it that--is stone age. However, that's mostly a power problem. If you were willing to lug around more battery, you could have a much faster processor and a correspondingly better operating system. Between improvements in battery technology and the development of low power processors, we're closing in on acceptable levels pretty quickly. After all, the Sony X505, screen, keyboard, and all, weighs in at only 1.7 lbs [*]. Without the screen or the keyboard (and the corresponding power drain) you could probably build an acceptable unit in a package weighing a pound or so even now.
The remaining problem is network connectivity. Nominally, Sprint's PCS Vision network gives you rates of 50-70 kb/s, but I don't usually see anything that fast. However, the network is getting faster and even twice as fast would be fairly livable. Certainly, a lot of the delays I see on my Treo appear to be because the processor is too slow. If you had a better processor, you would probably have a better user experience. Still, it would be pretty nice to have the Treo WiFi capable so that you could get high speed connectivity if it were available.
The bottom line, then, is that I don't think we're that far off. As I've observed before, people seem more and more willing to load themselves up with computer gear even with today's primitive equipment. As that trend converges with improved technology, I predict a lot more people looking like Steve Mann.
I'm in favor of easier access to medications, but it is sort of irksome to pay for prescription insurance and then to have all the medications I need go OTC. I just read that Singulair is heading that way, which will be very irksome. I will most likely have around $100 a month in OTC drugs between the Prilosec and the Singulair. I would like to have some differently structured prescription insurance that would handle unexpected extremely expensive prescriptions (all the crap I needed during my pregnancy comes to mind) but not cover day to day stuff, because I feel like I'm not getting my money's worth from that.
The entire premise of insurance is to let you hedge risk of loss. But in order to get that hedging, insurance companies have to charge premiums that are greater than the expected loss. But what this means is that if you have insurance for routine stuff, you're overpaying because you're always paying that risk premium as well. A secondary problem is what's called price illusion. If all of your medical problems are paid for by insurance, you have a tendency to consume more because it doesn't cost you anything. And of course this raises the cost of insurance. These two effects mean that medical insurance is a lousy way to pay for routine care.
The traditional way to handle this problem is with a deductible. You pay the first X dollars of your medical expenses but afterwards the insurance company kicks in. This allows you to hedge major risks but avoids having to use insurance to pay for routine stuff. An alternative approach is to have no insurance at all for routine stuff but only have major medical insurance for unexpected large expenses.
The method we have now is one of the worst approaches. It's extremely expensive and creates really perverse incentives for both customers and insurance companies. From this perspective, universal medical insurance (as in Canada or the UK) is just as bad--worse actually, since there's no opportunity for your insurance premiums to adjust to your increased use of medical services. It's possible that universal medical insurance is good for paying for catastrophic problems, but it's not a particularly efficient way to pay for routine low-level care.
It's just my impression, but it seems to me that lately a large number of drugs are going OTC (Claritin and Prilosec are the two biggies lately). If this is a trend, it's one I'm in favor of.
CAFE is one of those top-down command and control measures that environmentally minded politicians and pundits seem to love and economists hate. Basically, it works like this: manufacturers are required to achieve a certain mean fuel economy for the cars that they manufacture. (Click here for a good description). However, like most such programs, CAFE is littered with loopholes and the automakers have adapted to them. The most famous of these is that SUVs are classified as "light trucks" and have to meet a much lower standard (20.7 MPG rather than 27.5 MPG). Since people respond to incentives, it's not exactly surprising that SUVs and minivans get lousy gas mileage.
The solution that Public Citizen and Easterbrook endorse, of course, is to close the SUV loopholes, and jack up the MPG standards--in other words, more of the same. And most likely whatever new laws we pass will have similar loopholes that will need to be closed in a few years. However, there's a much simpler solution: tax gasoline more.
The first thing we have to do is look clearly at the situation in terms of costs and benefits:
Let's say that gasoline costs $2.00/gallon and that it's worth $.25/mile for me to drive. If my car gets 20 mpg, then I'm getting $5.00 worth of value for every gallon of gas. Thus, I'm making a $3.00 profit with each gallon of gas I consume. That's all well and good, but remember that the pollution generated by my car is imposing costs on other people. We need to consider two possibilities:
In the first case, whenever I drive I'm decreasing the net welfare of society (I'm making a $3.00 profit but the rest of the population is incurring a > $3.00 loss). Thus, I shouldn't be driving at all.
In the second case, the net welfare of society is positive, but I'm the one that's reaping all the benefit whereas everyone else is bearing the costs. This seems deeply unfair.
There's a simple way to solve both problems: impose a tax on every gallon of gas equal to the size of the negative externality (this is generally called a Pigouvian tax after A.C. Pigou, the economist who thought of it). This automatically produces the right outcome. If the tax is greater than $3.00, then it's not worth it to me to drive and I don't. If it's less than $3.00 then society has my tax money (which, remember, is equal to the size of the externality) and it can then (at least in theory) be passed on to the people affected (in this case, society at large). Moreover, since the cost of gas goes up, I have an incentive to buy a more efficient car--in fact, precisely the right incentive to do so.
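The driving example can be worked through in a few lines. This is just a sketch of the arithmetic using the numbers from the example ($2.00/gallon, $.25/mile, 20 mpg); the function names are mine, and the externality values passed in are made up for illustration.

```python
def private_profit_per_gallon(value_per_mile, mpg, gas_price, tax=0.0):
    """What the driver nets per gallon: value of the miles minus price and tax."""
    return value_per_mile * mpg - (gas_price + tax)


def pigouvian_outcome(value_per_mile, mpg, gas_price, externality):
    """Set the tax equal to the externality and see what the driver does.

    Returns (drives, net_welfare_per_gallon), where net welfare counts the
    driver's value, the cost of the gas, and the harm imposed on everyone else.
    """
    tax = externality
    drives = private_profit_per_gallon(value_per_mile, mpg, gas_price, tax) > 0
    if not drives:
        return False, 0.0
    net_welfare = value_per_mile * mpg - gas_price - externality
    return True, net_welfare
```

With an assumed externality of $1.00 per gallon the driver keeps driving and society comes out $2.00 ahead per gallon; with an assumed externality of $4.00 the tax exceeds the $3.00 profit and the driver stays home, which is exactly the first case above.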
There are two standard objections to this approach:
It's relatively easy to solve both problems by turning the extra tax money into a tax cut and weighting that tax cut towards the poor.
How large should this tax be? Estimates vary, but it looks like it should be about $1.00. [*]. The link above recommends $1.40, but that counts congestion as an externality linked to gasoline and there are better ways to account for congestion (for instance, toll roads or other forms of congestion pricing).
People have many reasons to lie when asked whether they have committed adultery. That's why it's notoriously difficult to get accurate scientific information about this important subject. One of the few existing sets of hard facts emerged as a totally unexpected by-product of a medical study, performed nearly half a century ago for a different reason. That study's findings have never been revealed until now. I recently learned these facts from the distinguished medical scientist who ran the study. (Since he does not wish to be identified in this connection, I shall refer to him as Dr. X.) In the 1940s Dr. X. was studying the genetics of human blood groups, which are molecules that we acquire only by inheritance. Each of us has dozens of blood-group substances on our red blood cells, and we inherit each substance either from our mother or from our father. The study's research plan was straightforward: go to the obstetrics ward of a highly respectable U.S. hospital; collect blood samples from one thousand newborn babies and their mothers and fathers; identify the blood groups in all the samples; and then use standard genetic reasoning to deduce the inheritance patterns.
To Dr. X's shock, the blood groups revealed nearly 10 percent of these babies to be the fruits of adultery! Proof of the babies' illegitimate origin was that they had one or more blood groups lacking in both alleged parents. There could be no question of mistaken maternity: the blood samples were drawn from an infant and its mother soon after the infant emerged from the mother. A blood group present in a baby but absent in its undoubted mother could only have come from its father. Absence of the blood group from the mother's husband as well showed conclusively that the baby had been sired by some other man, extramaritally. The true incidence of extramarital sex must have been considerably higher than 10 percent, since many other blood-group substances now being used in paternity tests were not yet known in the 1940s, and since most bouts of intercourse do not result in conception.
At the time that Dr. X made his discovery, research on American sexual habits was virtually taboo. He decided to maintain a prudent silence, never published his findings, and it was only with difficulty that I got his permission to mention his results without betraying his name. However, his results were later confirmed by several similar genetic studies whose results did get published. Those studies variously showed between about 5 and 30 percent of American and British babies to have been adulterously conceived. Again, the proportion of the tested couples of whom at least the wife had practiced adultery must have been higher, for the same two reasons as in Dr. X's study.
Disclaimer: I haven't read the actual studies here. Apparently a number of them were done by Baker and Bellis and are described in their book Sperm Competition.
If the adultery rate is this high--or even remotely this high--then it suggests that even if sex does serve the joint purposes of reproduction and unity, those two purposes aren't necessarily coupled (sorry, couldn't help it) as closely as Roback would like, and weren't even in 1940, before the age of free love and easy divorce. Which, of course, rather undermines her entire argument.
So, what is the meaning of human sexuality anyhow? Sexual activity has two natural, organic purposes: procreation and spousal unity. Babies are the most basic and natural consequences of sexual activity. "Spousal unity" means simply that sex builds attachments between husband and wife. Spousal unity is the feature of human sexuality that makes it distinct from purely animal sexuality. As far as I know, humans are the only animals that copulate face to face. Shakespeare described the sexual act as "making the two-backed beast." Both the Hebrew and the Christian Bible describe the sexual act as uniting the spouses in the most literal sense: "the two become one flesh." Two people become, if only for a short while, one flesh. Evolutionary psychology observes the survival value to spousal cooperation. Males and females who attach themselves to each other, have a better chance of seeing their offspring survive long enough to produce grandchildren. Science can now tell us how the hormones released during sex help to create emotional bonds between the partners.
...
We can construct, deconstruct and reconstruct our sexuality any way we want: it is our privilege as thinking creatures. However, human sexuality has a specific nature, regardless of what we believe or say about it. We are more likely to be satisfied with the outcome, if we work with our biology rather than against it. We will be happier if we face reality on its own terms.
I'll give Roback credit: she manages to skirt the naturalistic fallacy here--although this whole "work with our biology" thing comes pretty close.
However, it seems to me that if you're in favor of lifetime marriage, you probably don't want to be basing your argument on evolutionary psychology. A little observation of our primate cousins and human sexual behavior suggests that the "natural" male mating strategy is to attempt to impregnate as many females as possible, not to form lifelong bonds.
The ads suggest that the purpose of putting cameras in cellphones is to take photos and share them immediately by sending them over the airwaves to friends and relatives. But the real purpose is to sell minutes on your wireless service. Although no one really wants the return of the wall-tethered rotary-dial black Bell, there is something to be said for the days when a cellphone was just a cellphone.
Huh? Why does there have to be a single purpose? The way capitalism works is that companies offer you something of value to you and you pay them money in return. Their purpose is to make money. Your purpose is to get whatever you pay for. You might as well argue that the purpose of cars is to let car dealers make money rather than letting you drive from place to place. Actually, the analogy is pretty close here. One of the major sources of money for car dealers is financing, not profits from the car.
One of the most striking features of the multi-year history of digital music is the resistance that the labels showed to providing a simple music download service like iTunes. Instead, the labels kept creating these subscription services that were loaded with all sorts of DRM. I'd always assumed that the problem was simple fear: that they were terrified that if they ever let music be distributed in unrestricted digital form, it would all end up on Napster, thus completely destroying their business. This never made any sense, of course, since it was always trivial for people to rip CDs and put them on Napster, regardless of whether the labels assisted them.
However, this revealing Rolling Stone interview with Steve Jobs clears up the matter for me:
We said: These [music subscription] services that are out there now are going to fail. Music Net's gonna fail, Press Play's gonna fail. Here's why: People don't want to buy their music as a subscription. They bought 45's; then they bought LP's; then they bought cassettes; then they bought 8-tracks; then they bought CD's. They're going to want to buy downloads. People want to own their music. You don't want to rent your music -- and then, one day, if you stop paying, all your music goes away. And, you know, at 10 bucks a month, that's $120 a year. That's $1,200 a decade. That's a lot of money for me to listen to the songs I love. It's cheaper to buy, and that's what they're gonna want to do.
They didn't see it that way. There were people running around -- business-development people -- who kept pointing out AOL as the great model for this and saying: No, we want that -- we want a subscription business. We said: It ain't gonna work.
Ah... I get it now. The problem wasn't fear that they would lose their current revenue stream but rather the desire to secure a new revenue stream based on a subscription model. That actually makes more sense, and I'm kind of embarrassed I didn't see it. The worst part, of course, is that I thought I was being cynical by assuming the labels were stupid, but it can be so difficult to tell stupid from greedy.
You know when you start a Bernard Cornwell book you can strike certain items off a laundry list: undervalued superhero, check; bloody violence, check; loyal friends, some disposable, some not, check; fantasy chick, check; pitiless villain, check; final battle where the hero triumphs, check; an opening to the next chapter in the series, check. Cornwell never disappoints, nor does he ever really surprise. He is a guilty pleasure of several hours of, i dont want to say mindless reading, predictability.
Yep. All the Cornwell books are pretty good--not exactly great literature, but definitely an entertaining afternoon's reading.
Schiff goes on to win everything, leaving me scratching my head trying to decide if he's a genius or merely insane--probably both.
This is great if what you're transmitting is file-oriented data like web pages and file downloads, but it's not so great for other applications. The most obvious problem is how it handles packet loss. Say you're using TCP to transmit some kind of real time data like voice telephony or video. The sender sends packets 1,2,3,4,5,6,7 and packet 4 gets lost. The receiver sees packets 1,2,3,5,6,7, but because packets must be delivered in order, the receiving application only sees packets 1,2,3. The typical TCP retransmit time is about 500 ms, so 500 ms later the sender retransmits packet 4, and the receiver gets it and delivers packets 4,5,6,7 to the application. Again, that's fine if what you're transmitting is Web pages, but if it's real time voice, what you hear in your ear is 500 ms of silence followed by the delayed voice data, which is not at all what you want.
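The stall is easy to see in a toy model. The sketch below is not a TCP implementation; it just computes when the application gets each packet under in-order delivery versus delivery on arrival. The 20 ms packet spacing and zero network delay are assumptions I've made for the example, and the function names are hypothetical.

```python
def in_order_delivery(arrivals):
    """arrivals maps sequence number -> arrival time in ms.

    With in-order delivery a packet can't reach the application until
    every earlier packet has arrived, so each packet's delivery time is
    the latest arrival time among itself and its predecessors.
    """
    delivered = {}
    latest = 0
    for seq in sorted(arrivals):
        latest = max(latest, arrivals[seq])
        delivered[seq] = latest
    return delivered


def example_arrivals(lost=4, retransmit_ms=500, spacing_ms=20, count=7):
    """Packets 1..count sent every spacing_ms; `lost` arrives retransmit_ms late."""
    arrivals = {seq: seq * spacing_ms for seq in range(1, count + 1)}
    arrivals[lost] += retransmit_ms
    return arrivals
```

Here `example_arrivals()` is what a datagram receiver sees (packets 5, 6 and 7 show up on schedule), while `in_order_delivery(example_arrivals())` holds 5, 6 and 7 back until the retransmitted packet 4 arrives.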
How to do real-time over the Internet... well sort of
With real-time applications like voice and videoconferencing,
you need a protocol other than TCP. Roughly, you want something
that:
In other words, you want a datagram protocol like User Datagram Protocol (UDP). Actually, UDP probably isn't as predictable and timely as you might want, but it's pretty much the best you're going to get on the Internet.
So, you're doing telephony over the Internet (generally
called Voice over IP (VoIP)) and of course you've
decided to use UDP (actually, you're probably using
RTP (the Real-time Transport Protocol) over UDP). The way
this works is that the sender captures short voice samples (typically
about 20ms or so) and sends them to the receiver, one
sample per UDP packet. The receiver decodes them and plays
them on the other end, thus reproducing the voice stream.
If a packet is missing or arrives
out of order, the receiver ignores it and tries to compensate, perhaps
by playing silence or replaying the previous sample.
Typically, these systems have a little bit of buffering,
maybe 50-100 ms, just to smooth out Internet jitter, so
if you happened to get packets 2 and 3 in the immediate sequence
3,2 you would be able to play them correctly. And if you
get 2,4 in immediate sequence, you probably try to interpolate
3.
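Here's a rough sketch of that buffering logic. Everything in it (the function name, the 20 ms frame size, the 60 ms buffer, the arrival times) is an assumption chosen for illustration, and "conceal" stands in for whatever the receiver does to cover a gap, such as silence, replay, or interpolation.

```python
def playout_schedule(arrivals, n_packets, frame_ms=20, buffer_ms=60):
    """Decide what the receiver plays in each 20 ms slot.

    arrivals: dict mapping sequence number -> arrival time in ms
              (a missing key means the packet was lost).
    Packet n is due to be played at buffer_ms + n * frame_ms.
    """
    schedule = []
    for seq in range(n_packets):
        deadline = buffer_ms + seq * frame_ms
        arrived_at = arrivals.get(seq)
        if arrived_at is not None and arrived_at <= deadline:
            schedule.append(("play", seq))      # made it in time, order doesn't matter
        else:
            schedule.append(("conceal", seq))   # lost or too late: cover the gap
    return schedule


# Packets 2 and 3 arrive swapped, packet 4 is lost, packet 6 is hopelessly late.
arrivals = {0: 0, 1: 20, 3: 45, 2: 50, 5: 100, 6: 400}
for action, seq in playout_schedule(arrivals, 7):
    print(action, seq)
```

The swapped pair plays normally because both packets beat their deadlines. The late packet is treated exactly like a lost one, which is the whole point: a real-time receiver would rather skip data than wait for it.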
Silence Suppression and Comfort Noise
Unfortunately, people often find it very disconcerting
to have the other end go entirely blank. So, instead,
some applications will generate comfort noise.
Effectively, the receiver generates noise of the kind
that you would hear over an unused but connected telephone
line, just so that the person at the other end knows
that they're connected.
RFC 3389
describes a way for the sender to occasionally send a description
(spectrum and volume)
of the kind of comfort noise it would like the receiver
to generate in place of actual voice packets (these packets
are smaller than voice samples).
Real-time applications and network congestion
In order to avoid situations like (2), some real-time media
systems incorporate variable-rate encoding. It's possible
to encode voice and video at a number of quality levels, with each
increase in quality level coming at a corresponding increase
in bandwidth consumed. Thus, some systems detect when they're
getting undue packet loss and switch to a slower/lower
quality encoding scheme (these schemes are called codecs,
for "coder-decoder"). However, this kind of rate-adjustment
isn't designed to co-exist with TCP and in general it doesn't
react the same way that TCP does. In particular, it's designed
to operate over a rather long time scale. As a consequence, it's quite
possible for real-time applications to starve TCP connections.
In addition, there's no guarantee it will prevent congestion
collapse. TCP required substantial tuning before it handled
congestion properly and these variable-rate codecs have had no
such tuning.
Enter DCCP
There's only one problem. The real-time media people don't want
DCCP. There's currently a long thread going on in the DCCP mailing
list (pretty much all the messages beginning here) with a good summary by Eddie Kohler here.
The basic objection is this: As we just discussed,
real-time applications want to send
at a constant bit rate (CBR), at least over moderate time
scales. However, proper congestion control response requires that
they adjust their sending rate. This manifests itself in two main ways:
What kind of Internet do we want?
There are basically four possibilities:
This whole issue has been smoldering in the background for
a while, but the introduction of DCCP and the IAB
draft (full disclosure: I'm on the IAB, though I didn't have much personal
part in this particular document) on congestion control for voice applications has brought it
to full burn.
Basically, there's a deep philosophical divide between the
real-time media folks, who think their traffic is what's
most important and the old-time Internet community which
is very concerned about fairness for the existing applications,
which are clearly very important at the moment.
Given this philosophical divide and the increasing
popularity of VoIP, I expect to see this
issue get quite a bit more heated over the next six months
to a year.
One interesting feature of many VoIP-type systems is how
they respond to silence. In most phone conversations, only
one person is talking at any given time, sometimes for
periods as long as minutes. There's no point in sending
long periods of zero-amplitude samples (or more likely light breathing).
Instead, what the silent system does is stop sending entirely
until there's something to transmit. This is called
silence suppression.
It should be obvious at this point that the rate at which
real-time applications transmit data has almost nothing to do with the
characteristics of the network and almost everything to do
with what's going on at either end of the connection. It
should be equally obvious that these applications aren't
going to play nice with TCP. If the sending rate
is independent of network conditions, at least two things
can go wrong:
In an attempt to close this gap, Eddie Kohler, Mark Handley, Sally
Floyd, and Jitendra Padhye designed DCCP, the Datagram
Congestion Control Protocol, which is a datagram protocol
like UDP but with TCP-style congestion control.
Unlike TCP, DCCP is designed with pluggable rate control modules
(they're called Congestion Control IDs (CCIDs)). The two
currently defined ones are
CCID-2, which has almost identical congestion behavior to TCP,
and CCID-3 (also called TCP-Friendly Rate Control (TFRC))
which is designed for real-time applications. TCP congestion
control can result in fairly wide variations in sending rate.
TFRC is designed to be somewhat smoother. Both CCIDs are specifically
designed to play nice with TCP.
The real-time media people don't want to do either one, because
it degrades the user experience.
The real question here is what kind of Internet we want to have.
The only way that TCP can coexist with large-scale real-time media
usage on the current Internet
is if those streams use congestion control--though it doesn't have to be DCCP, of course.
The fact that such streams currently don't have congestion
control hasn't been a problem so far only because they
represent such a small fraction of total Internet usage.
As VoIP takes off, we have the makings of a real problem.
According to antispam organization Spamhaus, "Stubberfield" is well-known for pornographic and "get rich quick" offers online and was ranked No. 8 on the group's top 10 spammers list for November. The charges were based in part on reports from America Online subscribers. Kilgore announced the indictment at AOL headquarters. "Falsification (of e-mail headers or routing information) prevents the receiver from knowing who sent the spam or contacting them through the 'from address' of the e-mail," Kilgore said in a statement. "This is what makes this e-mail a crime in Virginia, and the volume that was sent during this period elevates the charge to a felony."
This seems to be the preferred tactic for prosecuting spammers. Instead of trying to make spam illegal, make forging the headers illegal. I haven't decided how I feel about this yet. Mr. Stubberfield doesn't sound like my favorite kind of person, but on the other hand I'm not sure how comfortable I am with the government prosecuting people based on what's in their e-mail headers.
Flow Control
Before we talk about congestion control, first we need to talk about
rate control. Say that Alice wants to transfer a file over a
dedicated network--like a phone line--to Bob. Since they own
the network, they want to get the file there as fast as possible, so
they basically want to transmit at network speed.
Now, if you know exactly how fast the network is, you
can just send at that data rate. But even modem lines
change rate occasionally due to line conditions, so
it's better to design a protocol that's adaptive. If you're
trying to transmit data over the Internet, conditions
definitely change, so you certainly need to be adaptive.
On the Internet, the main protocol for doing large data transfers is the Transmission Control Protocol (TCP). (The following two paragraphs are cribbed from my previous post, which you can go to for a little more background.)
Whenever you have a large chunk of data to transmit, the first thing you do is break up the data into a series of segments small enough so that each segment can fit in an IP packet. Once you've done that, you've got to arrange that the segments get delivered to the remote end and not lost on the way. The obvious way to do this is to send one segment at a time. When the receiver receives a segment, it sends you an acknowledgement (called an ACK). When you get the ACK you send the next segment. If you don't get the ACK within some timeout period (say 500 milliseconds) you retransmit the segment. This is called a stop-and-wait protocol.
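A stop-and-wait sender is simple enough to model in a few lines. This is only a timing sketch: the function name, the 100 ms round trip, and the way losses are specified are my own inventions for the example, and no actual packets are sent.

```python
def stop_and_wait(n_segments, rtt_ms, timeout_ms, lost=()):
    """Total time (ms) to deliver n_segments with stop-and-wait.

    lost: set of (segment, attempt) pairs whose transmission is dropped,
          e.g. (2, 1) means the first attempt at segment 2 is lost.
    """
    lost = set(lost)
    clock = 0
    for seg in range(1, n_segments + 1):
        attempt = 1
        while (seg, attempt) in lost:
            clock += timeout_ms   # no ACK comes back; wait out the timer, then resend
            attempt += 1
        clock += rtt_ms           # segment goes out, ACK comes back
    return clock


print(stop_and_wait(4, rtt_ms=100, timeout_ms=500))                 # no loss
print(stop_and_wait(4, rtt_ms=100, timeout_ms=500, lost={(2, 1)}))  # one drop
```

Even with no loss at all, the sender spends a full round trip per segment doing nothing but waiting.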
The basic problem with a stop-and-wait protocol is that it's not very efficient. It takes time for packets to get from point A to point B and the entire time that the ACK is in transit, there is no data flowing from the sender to the receiver. A better approach is to use what's called a sliding window protocol. Instead of sending just one segment at a time, the sender sends several, with the maximum number being defined by the window size. Thus, while the ACK for segment 1 is in flight, segment 2 or 3 can already be on its way to the receiver. This way, the channel stays more or less full. TCP is a sliding window protocol.
Let's say that the sender S is transmitting data at 2 packets per second to the receiver R on a network with round-trip time of 1 second. The window is 2 packets. On a functioning network, the timeline looks like this:
| Time | Sender | Receiver |
| 0 | Send S1 | - |
| .5 | Send S2 | Receive S1, send acknowledgement (A1) |
| 1 | Receive A1, send S3 | Receive S2, send A2 |
| 1.5 | Receive A2, send S4 | Receive S3, send A3 |
| ... | | |
The key thing to note here is that it's the arrival of acknowledgement A1 that allows the sender to send S3. Because the window is 2, he couldn't send as long as he had S1 and S2 outstanding, but once he knows S1 has been received he can send S3.
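The sender's side of that timeline can be reproduced with a small tick-based simulation. It's a sketch under the same assumptions as the example (one send opportunity every half second, one-second round trip, no loss); the function and variable names are mine.

```python
def sliding_window_timeline(n_segments, window=2, one_way_ticks=1, n_ticks=6):
    """Return [(send_time_seconds, segment)] for a lossless sliding-window sender.

    One tick is 0.5 s, and the sender may send at most one segment per tick.
    """
    acks_due = {}            # tick -> segments whose ACK reaches the sender then
    next_seg, acked = 1, 0   # next segment to send, highest segment ACKed
    timeline = []
    for tick in range(n_ticks):
        for seg in acks_due.pop(tick, []):
            acked = max(acked, seg)
        outstanding = next_seg - 1 - acked
        if next_seg <= n_segments and outstanding < window:
            timeline.append((tick * 0.5, next_seg))
            # The ACK comes back one full round trip after the send.
            acks_due.setdefault(tick + 2 * one_way_ticks, []).append(next_seg)
            next_seg += 1
    return timeline


print(sliding_window_timeline(4, window=2))  # S3 goes out at t=1, when A1 arrives
print(sliding_window_timeline(4, window=1))  # window of 1 is just stop-and-wait
```

With a window of 2 the sender gets a segment out every half second. Shrink the window to 1 and it only manages one per second, because every send has to wait for the previous ACK.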
Now, what happens when a packet gets lost? There's what's called a retransmission. It looks something like this:
| Time | Sender | Receiver |
| 0 | Send S1 | - |
| .5 | Send S2 (lost) | Receive S1, send acknowledgement (A1) |
| 1 | Receive A1, send S3 | - |
| 1.5 | Retransmit S2 | Receive S3 |
| 2 | Retransmit S3 | Receive S2, send A3 |
| ... | | |
Now, note that two interesting things have happened here. First, the sender noticed that he didn't receive an ACK for S2 and he retransmitted it. Second, the receiver didn't send an ACK for S3 when he had not received S2. TCP uses what are called cumulative ACKs, which simply means that an ACK for packet n indicates that all previous packets were received. Thus, the receiver can't send A3 until he has seen S2 and S3. A side effect of this is that the sender has to retransmit both S2 and S3, because he doesn't know which packet was lost. [0]
What we've just shown is a single packet loss. Now, what happens if the network suddenly slows down by a factor of two, so that we can only send half as many packets per second and the round trip time doubles? Here's the key thing: this is a network event. The sender and receiver don't know that the network has changed, so they're still operating under the old parameters. This means that we start to get timeouts and retransmissions as well, but it's a lot messier.
| Time | Sender | Receiver |
| 0 | Send S1 | - |
| .5 | Send S2 | - |
| 1 | Retransmit S1 | Receive S1, send A1 |
| 1.5 | Retransmit S2 | - |
| 2 | Receive A1, send S3 | Receive S2, send A2 |
| ... | | |
Note that not only has our transmission rate been cut in half, we're being really inefficient with our use of the network, because S1 and S2 were retransmitted. As we'll see shortly, this makes the situation worse. If the sender had just had a better estimate of the round trip time, we could have avoided the retransmissions entirely. The obvious fix is to have the sender adjust its estimate of the round-trip time based on network conditions. If we do that, we can eventually mostly adjust to the new network conditions.
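RFC 793 suggests one way to do this: keep a smoothed round-trip time (SRTT) as a weighted average of measured samples and derive the retransmission timeout (RTO) from it. The sketch below follows the RFC's formulas; the specific values of alpha and beta are picked from the ranges the RFC gives as examples, and the sample RTTs are invented.

```python
def update_rto(srtt, sample, alpha=0.875, beta=2.0, lbound=1.0, ubound=60.0):
    """One step of the RFC 793 retransmission timeout calculation (seconds).

    SRTT = alpha * SRTT + (1 - alpha) * RTT
    RTO  = min(ubound, max(lbound, beta * SRTT))
    """
    srtt = alpha * srtt + (1 - alpha) * sample
    rto = min(ubound, max(lbound, beta * srtt))
    return srtt, rto


# The round trip time doubles from 1 s to 2 s; watch the estimate catch up.
srtt = 1.0
for i in range(1, 11):
    srtt, rto = update_rto(srtt, sample=2.0)
    print(f"sample {i:2d}: srtt={srtt:.3f}s rto={rto:.3f}s")
```

Note how many samples it takes for the estimate to get close to the new round trip time. That lag is a preview of the responsiveness problem discussed next.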
Congestion Issues
What I've just described is
more or less how TCP as
originally described in RFC 793
works. Unfortunately, it's totally broken.
The basic problem is that RFC 793 TCP responds way too
slowly to network conditions. This isn't a big deal
when you own the entire network, but the Internet is
a shared resource.
Consider the following case. Alice and Bob share a fast corporate network like an Ethernet which is connected to the Internet. Alice is sending something to some Internet site and consumes the entire network because it's otherwise idle. Now, when Bob starts sending stuff himself, suddenly the Internet link is at twice its capacity--even though the corporate network is unloaded. Alice and Bob's router (which connects them to the Internet) responds by dropping packets. Say that Alice sends packets A1, A2, and A3 and Bob sends packets B1, B2, and B3. Since Alice is transmitting at the network rate already, only half of these packets can get through. Most routers drop pretty fairly, so we'll assume that half of Alice's packets get through and half of Bob's get through, which is pretty much the same as cutting the transmission rate in half. This gives us effectively the scenario we saw in Figure 3, which is where things start to go bad, as shown in Figure 4 (we'll assume that they share a common receiver who has a fast Internet link). [1]
| Time | Alice | Bob | Receiver |
| 0 | Send A1,A2,A3 | Send B1,B2,B3 | - |
| - | - | - | Receive A2,A3,B2 |
| 0 | Send A1,A2,A3 | Send B1,B2,B3 | - |
| ... | | | |
Note one difference between Figure 4 and Figs 1-3. Because Alice and Bob are on a fast Ethernet, they send all their packets in one shot and then let the router send them. This works great until we see congestion. But in the congestion case it starts to go very wrong. As in Figure 3, Alice and Bob aren't adjusting their transmission rate. They're just responding to congestion by retransmitting their first set of packets in place of sending new packets. That's a fine response to simple packet loss, but the problem here is that too much load is being offered to the router. Worse yet, they're burning up bandwidth retransmitting data that has actually been received (in this case, A2, A3, and B2). In order to fix these problems, Alice and Bob need to actually send less data--half as much as they were before. Old-style TCP will eventually arrange that by increasing the retransmit timer, but it takes a while. During the initial period of congestion, old-style TCP is not as responsive as it should be. In the case I've just described, it probably just takes a little while to settle out, during which neither side is getting optimal network performance. However, under the right (wrong) circumstances (typically with more senders in the game), the interaction of this set of algorithms can get so bad that data transmission can grind more or less to a halt. This is called congestion collapse.
Congestion Control
This isn't a theoretical issue. In 1986, the Internet experienced a number
of actual congestion collapse events. This led to a number of fixes
being applied to TCP, which are described in a seminal
paper by
Karels and Jacobson. The paper describes 7 fixes, but a number
of them are complicated and I want to focus on three that I think
are at the heart of what it means to have congestion avoidance.
For more detail, check out the Karels/Jacobson paper
or Rich Stevens's fine TCP/IP Illustrated.
The first two fixes, congestion windows and exponential retransmit timers, are designed to make TCP more responsive to packet loss events that might signal congestion. The idea behind exponential retransmit timers is simple. Instead of just blindly retransmitting at the retransmit interval, each time you do a retransmit you increase the retransmit timer (typically by doubling). This means that as soon as congestion starts to occur, you cut your retransmit rate radically. Congestion windows attack the problem of just retransmitting the same data over and over. Instead of just trusting the receiver to tell you about its window, the sender maintains a separate sending window called a congestion window. When packet loss is detected, the sender unilaterally reduces his window and will only send up to that window size, thus reducing the problem of retransmitting a large number of packets when only a single one is lost.
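Both ideas fit in a few lines. This is a sketch, not a description of any particular TCP stack: the function names, the 64-second cap, and the choice to halve the window on loss are assumptions I've made for the example.

```python
def backoff_schedule(initial_rto, retries, max_rto=64.0):
    """Times (seconds after the original send) at which retransmits go out
    when the retransmit timer doubles after every retransmission."""
    times, rto, clock = [], initial_rto, 0.0
    for _ in range(retries):
        clock += rto
        times.append(clock)
        rto = min(rto * 2, max_rto)   # back off: wait twice as long next time
    return times


def sendable(cwnd, receiver_window):
    """The sender may have at most this many segments outstanding."""
    return min(cwnd, receiver_window)


def on_loss(cwnd):
    """Shrink the congestion window when loss is detected (halving is assumed)."""
    return max(1, cwnd // 2)


print(backoff_schedule(0.5, 4))   # retransmits get further and further apart
cwnd = 8
print(sendable(cwnd, receiver_window=16))
cwnd = on_loss(cwnd)
print(sendable(cwnd, receiver_window=16))
```

The receiver's window still matters, but only as an upper limit. The sender's own view of the network is what actually throttles it once loss starts.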
The third fix is what's called "slow start". Remember that I said that when Bob first started sending he shouldn't send at some fast default data rate? Slow start means that when Bob starts a connection, he initially uses a very small congestion window and then gradually increases it as he gets successful data transmission. This might look something like Figure 5: [1]
| Time | Sender | Receiver |
| 0 | Send S1 | - |
| .5 | - | Receive S1, send acknowledgement (A1) |
| 1 | Receive A1, send S2,S3 | - |
| 1.5 | - | Receive S2, send A2 |
| ... | | |
Karels and Jacobson's fixes also include a slow-start like algorithm for recovering from congestion events. Once you have decreased your congestion window in response to congestion, you slowly probe back up to higher data rates. This lets you get back to high speeds after experiencing transient congestion, thus adjusting both up and down in response to congestion.
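Here's a simplified sketch of how the congestion window might evolve round trip by round trip. The details are assumptions on my part rather than a faithful rendering of the paper: the window doubles each round up to a threshold, grows by one segment per round after that, and on loss the threshold drops to half the current window while the window itself restarts at one segment.

```python
def cwnd_trace(rounds, loss_rounds=(), ssthresh=16):
    """Congestion window (in segments) at the start of each round trip."""
    cwnd, trace = 1, []
    for r in range(rounds):
        trace.append(cwnd)
        if r in loss_rounds:
            ssthresh = max(cwnd // 2, 2)      # remember roughly where trouble began
            cwnd = 1                          # start over from a tiny window
        elif cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)    # slow start: double every round trip
        else:
            cwnd += 1                         # past the threshold: probe gently
    return trace


print(cwnd_trace(8, ssthresh=8))                    # clean start-up
print(cwnd_trace(8, loss_rounds={4}, ssthresh=8))   # loss in round 4, then recovery
```

"Slow" is relative, of course. Doubling every round trip gets you to a big window quickly, it just doesn't start there.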
Fairness
The final concern we have is fairness. Alice and Bob have an equal
right to use the network, so they should get equal shares of the
bandwidth. TCP needs to ensure that just because Alice was there
first doesn't mean that she gets all the bandwidth. The description
here isn't enough to show that this will happen, but you can
demonstrate under certain reasonable assumptions that TCP is
fair. However, it's only fair if everyone plays by the rules.
If you had a TCP implementation which was slightly
less aggressive in response to congestion it could get more than
its fair share of the network. A related issue is that if you
designed a non-TCP protocol--which of course has to share the
network with TCP--it needs to do so fairly.
A critical issue when designing
Internet protocols is ensuring that they do so. In the
next installment we'll talk about an important application where
this is a problem.
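You can get a feel for the fairness argument with a toy model. The assumptions here are mine and deliberately crude: two senders share one link, both see every congestion event at the same time, each adds one unit of rate per round when there's room, and each multiplies its rate by a backoff factor when the link is overloaded.

```python
def share_link(a, b, capacity, rounds, a_backoff=0.5, b_backoff=0.5):
    """Toy model of two senders with additive increase, multiplicative decrease."""
    for _ in range(rounds):
        if a + b > capacity:
            a, b = a * a_backoff, b * b_backoff   # congestion: both back off
        else:
            a, b = a + 1, b + 1                   # room to spare: both probe upward
    return a, b


# Alice was there first and has nearly the whole link; Bob is just starting.
print(share_link(18.0, 1.0, capacity=20, rounds=200))

# Same starting point reversed, but Alice's stack backs off less than it should.
print(share_link(1.0, 18.0, capacity=20, rounds=200, a_backoff=0.75))
```

In the first run the gap between the two rates gets cut in half at every congestion event, so they end up essentially equal no matter who started first. In the second run the less responsive sender ends up with the bigger share, which is exactly the cheating problem.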
[0] I've shown S3 being retransmitted at time 2 under the assumption
that the sender uses individual packet-level retransmit timers,
so S3 doesn't time out until time 2. There are other designs, of
course.
[1] Yes, I know that 2 isn't half of 3, but you can't deliver 1.5 packets.
What would be really great, of course, would be an authentication system that didn't require either substantial memorization or this kind of life-based security question. For a while, it seemed like public-key based authentication was that technology, but it seems to have been doomed by bad implementation.
Since so many people choose blue, the additional security provided by this question is less than optimal. An attacker who guesses blue will be right almost half the time. Worse yet, what happens if they guess wrong? Can they just hang up the phone and try again with a different operator? Unless the bank's systems stop them from doing so, the attacker can just try blue, green, and red in sequence and get almost a 75% success rate!
To his credit, when I pushed back on this security question my account manager said that he never liked that question either and invited me to use a different, better question, which I won't reveal here (though it's not incredibly unsurprising, either). Still I wonder how many other people have accounts with Bank X and still have this as their security question.
[0] Name changed to protect the guilty.
[1] It's not clear exactly how the survey was performed, and I doubt
it was perfectly scientific (though the people who did the survey
seem to do a fair amount of political polling), but the result
seems qualitatively right. Informal surveys produce similar
results [*].
As a nice side effect, manufacturing this kind of animal would be a great testbed for aging research. At the moment it's pretty hard to do broad-scale testing of longevity technologies because people are naturally a bit antsy about screwing with their own bodies, so though there's life extension work going on in the lab, it's a long shot as far as financial incentives. But this is an application that you could sell more or less right away (except maybe in California [*]). Now, it's certainly possible, even likely, that the technology used to create permanently juvenile animals isn't the same as that required to prolong life, but I bet you'd learn a lot about life extension working on the problem. And since animals mature in a fraction of their lifespan, you'd know if your techniques were working a long time before you would know if life extension had worked.
The good news is that my old cell phone, which allegedly has already lost its number, continues to work, so I'm back to having separate cell phone and palm, just like before.
In principle, our capture engine is really simple. We wait for our incoming ethernet interfaces to be ready to read. When they are, we read packets from them. However, if we have two interfaces things get tricky. Let's examine the simplest possible case, the TCP three-way handshake:
Now, if A and B and the network between them are fast enough, the SYN and the SYN/ACK will arrive at our capture host more or less simultaneously. By the time our process wakes up, both interface 1 and interface 2 are ready to read. Unless we know whether A is the client or server, we have no way of knowing which packet arrived first without reading both packets and examining the timestamps.
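As a sketch of the "read both and compare timestamps" idea: if each interface hands us packets already in timestamp order, we can merge the two streams by timestamp. The tuple layout and the timestamps below are invented for the example, and it assumes both interfaces stamp packets from the same clock.

```python
import heapq


def merge_by_timestamp(*interfaces):
    """Merge per-interface packet lists into one list ordered by timestamp.

    Each packet is a (timestamp_seconds, interface_name, description) tuple,
    and each per-interface list is assumed to be sorted by timestamp already.
    """
    return list(heapq.merge(*interfaces, key=lambda pkt: pkt[0]))


# Both interfaces are readable by the time we wake up.
if1 = [(0.000100, "if1", "SYN"), (0.000300, "if1", "ACK")]
if2 = [(0.000200, "if2", "SYN/ACK")]
for ts, iface, what in merge_by_timestamp(if1, if2):
    print(f"{ts:.6f} {iface} {what}")
```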
Now, if we have only two packets, it's easy to examine them and figure out which one was first. Say that the packets arrived in the order 1,2 (the packet on interface 1, then the packet on interface 2). In that case, we know th