On Perfect Shuffles

There is a fun fact known to card players -- if you take a pristine, ordered deck of cards and perform the usual ripple shuffle on it 8 times, you will end up with the original, unshuffled deck. Some players even frown upon dealers who do the ripple shuffle too many times because of worries that the deck will somehow "lose randomness" that way. In fact, the "eight perfect shuffles" anecdote is just a mathematical curiosity. Practice isn't as compelling as the theory.

First, let me define what most people mean by the term "a perfect shuffle". You take a deck of 52 different cards and put it on the table in front of you. You then pick up the top half (26 cards) with your right hand and move those cards to the right side. You pick up the remaining 26 cards with your left hand. While holding the two halves of the deck in your two hands, you let cards drop from the bottoms of the two half-decks into the middle, creating a new deck. First, you drop one card from your left hand, then a card from your right hand, then a card from the left, and so on until all 52 cards are in the new deck in the middle, forming an alternating zipper-like pattern. You then straighten the edges of the new deck and repeat the procedure (take the top half into the right hand, etc.)

If you do this eight times, you will end up with the deck in the original order; that's true. However, it is very important that the first card you drop fall from your left hand (the bottom half). In fact, that card will never move from the bottom. It is always going to be the first one to fall and the only one to touch the table. Similarly, the top card will always remain on top. The "shuffle" isn't really doing its job here, is it?

We can fix that. Instead of starting the zipper by dropping a card from the left hand, let's start with the right one. That way, the bottom card will at least move one position up in the deck, and the top card will move one position down. Suddenly, the rule of eight does not apply anymore. It will now take 52 shuffles to get back to the original order of the deck.

Don't try this; you'll mess it up. Write a computer program to simulate this shuffle instead. If you have ever actually tried to perform the perfect shuffle, you probably know how difficult it is to get it exactly right. If you misplace even a single card, the whole trick is ruined.

Let's call the original perfect shuffle left-handed (because we start dropping cards with the left hand), and the new, arguably better shuffle right-handed. We could say that the left-handed perfect shuffle has a period of 8, while the right-handed version has a period of 52. Let's see what would happen if we alternated between the left-handed and the right-handed shuffles like this: L, R, L, R, L, R, ... It turns out, that we get a period of 504. We get the same period by using the pattern R, L, R, L, R, L, ...

Fancier combinations like (L, L, R, L, L, R, L, L, R, ...) and (L, R, R, L, R, R, L, R, R, ...) produce shorter periods of 80 and 160, respectively.

Now, what happens if you get really good at the perfect shuffle, but you have trouble controlling whether you do the left-handed or the right-handed version on each try? Let's say you pick the version at random. It turns out that you will never get back to the original ordering (well, at least not in your lifetime). Five or ten shuffles in, the deck's ordering will look pretty random. The chances of getting back to the sorted deck in fewer than 1000 shuffles are smaller than one in 200, and once you have performed a few shuffles already, the chances drop exponentially.

I'm not sure what the moral of the story is, except that -- if you are sitting at the table, ripple-shuffling a deck of cards, and you stop to look at the order of the cards, chances are that the cards will not be perfectly ordered... unless you are a sleight-of-hand expert.

Captain Obvious would be proud.

Bob Fosse

Michael Jackson started his solo career in 1975, at the earliest, when he was still a member of the Jackson 5. He was a phenomenal dancer and deserves great praise just for his dancing alone, not to mention the songs. But what he most certainly wasn't is original.

Here is Bob Fosse in 1974:
There is even a very crude moonwalk in there.

A plea to all writers

If you are writing a story, a novel or a screenplay, and one of your characters obtains some valuable information -- video evidence of a crime, a hard drive full of top-secret research or pictures of a cheating spouse -- listen up, I'm talking to you! I bet you are thinking of a thrilling plot twist along the lines of...

Dude with the secret USB memory stick:
  "I've got something you want. You can have
  it back for $100,000 in unmarked bills."
Embarrassed cheating husband:
  "Alright! Meet me at 8pm on the corner of
  Wirth Ave. and Chaidez St."

And then they meet, and an exchange takes place, and the matter is closed, and the story moves on.

No! Fail! Bad writer! Everyone with an IQ above 50 knows to make a copy of that USB memory stick first, or that hard drive, or that answering machine message, or those pictures, or that security camera footage. Information is not a truck. It's not even a series of tubes. You can't copy-paste a series of tubes. You can, however, Xerox an image. Who wouldn't?! That picture is clearly worth a lot of money. Why would you not keep a copy for yourself?

So please, stop writing this crap. No one believes it. I've even heard exchange dialogues that go like this:

Girl with a video tape:
  "Here, take it. Don't worry; it's the
  only copy, I just want this to be over with."
Guy with the envelope of money:
  "Good. Here is the money. You may count
  it. Give me the tape."

Of course she has another copy at home! Is she stupid? Even the writer knows it, or else there would be no mention of any "copy" in the script. If you write something like this, it's as if you are saying, "Yeah, I know it's stupid and implausible, but I'm writing this anyway because I can't think of anything cooler than stealing a hard drive full of medical research and then blackmailing aliens to trade for it." Right. Your character can crack a password and break strong encryption, but doesn't know how to drag and drop desktop icons. Genius!

An alternative to democracy

Winston Churchill famously said, "Democracy is the worst form of government, except for all those other forms that have been tried from time to time."

I think I have an alternative to democracy that might work better. It certainly hasn't been tried before on the scale of a country. And I certainly haven't thought through all the details and unintended consequences of this plan. For now, it is just a thought experiment, and I will try my best to explain the idea concisely. Here goes...

What are the main advantages of democracy over other forms of government? Here are a few obvious ones.

  • Fairness -- each person has one vote.

  • Term limits -- an unpopular leader cannot stay in office indefinitely.

  • Majority rule -- new laws need majority support; unpopular laws can be repealed if a majority wishes to.

The obvious drawbacks are

  • Minority abuse -- a majority can treat a small number of citizens unfairly.

  • Myopia -- law makers care about being re-elected in the next election, not about long-term effects of their policies.

  • Logistics of voting -- it is impossible to have a true, full-country referendum on every issue.

The common solutions to these problems are human rights (basic rights to protect minorities), transparency (to expose bad, short-sighted policies for public review) and representation (citizens elect a small number of representatives, who vote on new laws on their behalf). Also, a Constitution is used to establish the process of elections and transfer of power, as well as checks and balances (branches of government).

When is democracy not a good idea? Take a look at any small or medium company. The CEO is in charge and can hire or fire anyone he or she wants. The CEO has advisors (vice presidents), who have directors under them, who, in turn, manage other employees, etc. The corporate structure is typically a tree. Decisions made by VPs and directors carry more weight than decisions made by regular employees. Ideally, useful employees get promoted and get more responsibility (more weight to their decisions).

This structure resembles a monarchy rather than a democracy. Decisions on which products to build, which competitors to acquire and whom to fire are made by individuals, not by a vote of every employee in the company. Why is that? Why aren't businesses democratic? The answer is that some people have more knowledge and experience than others and thus, can make better decisions.

Isn't the same true in politics? It is, but proposing the idea that some votes should count more than others is a sure way to get yelled at. People do not like to feel that their vote counts for less than their neighbour's vote. This would be outrageous!

Why isn't there the same outrage about the corporate structure? Because employees have a choice. If you don't like your job, you can quit and find a new one. If you don't like your country's form of government, emigrating to a different country is much more difficult.

What if it wasn't? Can't we get the best of both worlds? Imagine a country that is made up of a large number of political units (let's call them pods). Each pod operates like a company, with a strict hierarchy of members and a "CEO", or "king" at the top. Each pod makes its own laws, cooperates and trades with other pods. If a citizen is unhappy in his or her current pod (unfair work conditions, taxes too high, etc.), switching to a different pod is easy, as long as they will accept the new member -- just like changing jobs. Pods can split, merge or disappear. New pods can be started by citizens. I call this competitive monarchy, or capitalist monarchy, if you wish.

To ensure a safe and productive environment for all, the country as a whole needs a Constitution. This is a small set of common laws that apply to all pods. It needs to protect pods from unfair competition (antitrust laws), national security (a common military force) and basic human rights that allow anyone the common freedoms of life, liberty, speech, quitting a pod, etc.

Within a pod, there are no elections (unless the king wishes to have them) and no term limits. Laws can be passed quickly; skilled decision makers can be promoted; good leaders can be rewarded with higher influence.

There are still federal taxes and a small federal government that deals with foreign policy and constitutional amendments, but instead of paying state taxes, you pay a pod tax, which goes towards promoting the interests of your pod. Being a member of multiple pods should be allowed, too.

If you think of political units of today, we have states, provinces, districts and other municipalities (geographic groups), political parties (groups based on political interests) as well as unions (groups based on occupation). A pod can be any one of these and more -- it is a group of people with any set of common principles.

As travel costs decrease, geographic divisions will become less and less important, so it will not make sense to have states or provinces as political units. As communication costs decrease, the cost of advertising and political campaigns will decrease as well, so political parties will become less important. If you feel strongly about an issue, like abortion rights, instead of joining the Democratic party and having to support a number of policies you disagree with, you should be able to join the pro-choice pod -- a smaller, more targeted political unit.

It's a bit like the Facebook of politics -- instead of joining a party, you join zero or more groups, each with its own, small, targeted agenda. Unpopular groups disappear into obscurity. Popular groups become more powerful. As a citizen, you can pay attention to the political issues that matter to you, which solves democracy's logistics problem in a better way than representation did. Minorities have a stronger voice and can be more easily protected from bullying by majorities. Within a group, laws can be drafted as quickly (or slowly) as the leadership wishes to.

Yet we still retain the benefits of democracy. Instead of "one person; one vote", as a citizen, you can have as much influence as you can muster, given your time and skills. You can spread your influence across multiple issues, or target it towards a few issues you feel strongly about. Instead of term limits and majority rule, we have proportional representation -- issues that have a lot of support get more visibility, and people who are at the center of the currently hot debate have a lot of influence, for a short time.

If you have read this far without getting bored, I'm surprised. This system is probably completely unworkable, but I can't quite see why. Maybe it's the 2am. Sounds like total madness, right?

Racism is a problem of overfitting

In machine learning, over-fitting is a problem that arises when an algorithm makes a particular kind learning mistake -- when it takes a small number of examples and assigns a unjustifiably big weight to a few features of the data. If you think of the human brain as a biological computer running a machine learning algorithm, then racism is just an example of over-fitting. A person experiences, witnesses or hears about a few stories of black people robbing convenience stores and starts thinking that all black people are criminals. This conclusion is clearly unjustified. The "blackness" feature has nothing to do with the "is a criminal" feature. A much better explanation for these observations is a more complicated combination of other features - social, geographic and economic factors, selection bias, etc. These, more complicated feature interactions are more difficult to learn, both for machines and for human brains.

The solution to over-fitting is regularization. In order not to let the feature weights get out of hand, you impose a penalty on the absolute value of these weights, effectively pushing them towards zero. I claim that the human analogue to regularization is skepticism. Skeptics are people who aren't easily convinced by a few examples. They demand to see large scientific studies and well-justified statistics before they believe a claim. This is, effectively, a way of saying, "I'm going to learn more slowly by requiring more evidence to change my views."

So here is my claim -- skepticism is a cure for racism.

Yet another evil, intentional misuse of statistics.

The Sydney Morning Herald reports on a recent CDC study that compared success rates of married couples to those of "cohabiting unions" -- couples who live together without getting married. The headline reads: Marriage outlasts living together. They define success as maintaining the relationship for at least 5 years.

They compare the following two numbers:

  • 78% of marriages last 5 years or more.

  • 29% of cohabiting unions last 5 years or more.

(Instead of 29%, they actually say "less than 30%", so let's just assume it's 29%.) The difference seems striking!

... unless you keep reading. "51 per cent of couples who lived together made the transition to marriage within three years". WTF? I don't know whether this is CDC's or the Herald's doing, but this seems like deliberate misdirection. Shouldn't getting married be counted towards success? In other words, 80% of cohabiting unions either stay together for 5 years or get married in the first 3 years. That sounds better, but let's turn the annoying 3 into a 5 to make the numbers comparable.

Let's make the assumption that the rates of divorce and splitting up are exponentially distributed, which means independent of the amount of time that has passed. In reality, both distributions probably have more weight towards zero, so we might under-estimate divorces and split-ups slightly. The two errors should cancel out somewhat, so I wouldn't worry about them too much. Assume exponential distributions.

A married couple will split up after Y years with probability
  f(Y) = 1 - e-λ Y

We know that 22% of married couples split up after 5 years, so
  f(5) = 0.22

Solve for λ to get
  λ = 0.3028255

A cohabiting couple will either split up or get married after Y years with probability
  g(Y) = 1 - e-μ Y

We know that 29% of cohabiting couples are still cohabiting after 5 years, so
  1 - g(5) = 0.29

Solve for μ to get
  μ = 0.247575

Of the remaining couples, a fraction F will get married and a fraction (1 - F) will split up. We know that 51% of cohabiting couples get married after 3 years, so
  g(3)F = 0.51

Solve for F to get
  F = 0.97

This is a bit suspicious. It suggests that, of cohabiting couples studied by the CDC, 97% ended their cohabitation status by getting married and only 3% split up. Not impossible, but at the very least interesting. Let's say this is true and continue.

How can a cohabiting couple not make it to the 5-year mark? There are 2 ways:

  1. they split up, or

  2. they get married and then divorced.

The probability of (1) is
  g(5) (1 - F) = 0.71 * (1 - 0.97) = 2%

The probability of (2) is
   μ e-μ Y F (1 - e-λ (5 - Y)) dY

This evaluates to 36%, which leaves 62% of a chance of staying together for 5 years without splitting up or getting married and then divorced.


  • 78% of marriages last 5 years or more.

  • 62% of cohabiting unions last 5 years or more.

Not nearly as sensational as the article claims. Married couples have a 16% higher chance of staying together than unmarried couples who live together do. Sounds reasonable.

Visiting my girlfriend's parents in Slovakia

Marianna and I are visiting her parents in Slovakia for a week. I do not speak Slovak, and her parents do not speak English or Russian, so it's been entertaining. Highlights so far:

- Tasting amazingly good wine and 22-year-old plum whisky in a small cellar of her father's friend.

- Two days later, thanking my Russian genes after tasting 32 different wines, one after another, at a 400 year old winery.

- Listening to her explain the plot of Dead Like Me to her mom and seeing mom's WTF reaction.

- Walking on 10cm-thick ice with an axe and making holes for the fish to breathe; then running like hell after the ice started making cracking sounds under our feet.

- Her dad: "Does your father like wine, beer or vodka?" Me: "Sure". Him: "No problem then. We'll find a common language."

- Watching Mario bake trdelnik in the kitchen of a huge restaurant.

- Realizing that 100% of my brain is occupied by the the translation process, using all applicable Russian and English and whatever small pieces of Ukrainian and Polish I know to make sense of the stray words I recognize; I suddenly find it difficult to remember facts or answer questions; could be the jet lag, too.

- An excellent backwards dinner that started with a honey-garlic-eucharist dessert-like thing and ended with a sauerkraut soup, followed by the opening of Christmas presents before midnight on the 24th of December.

- A 30-degree temperature change in 5 days (that's 54 degrees for those of you in the USA and Belize), and the crazy wind that comes with it.

- Uncle: "You are my hero! Drinking all this wine after so much Slivovitz and still staying alert?"

- Realizing that I only need to know about 20 words in a language to start cracking jokes. (Not bad, but Ryan Stiles and Colin Mochrie can get away with 4, so I still have a lot to learn.)

- Marianna's grandfather: "Are you Catholic, Orthodox?" Me, cautiously: "Nothing." And as I was preparing for a long and painful conversation, realizing that I was in the company of 20 other atheists! Unbelievable.

- Languages people have used this week to communicate with me: Slovak, Czech, Russian, English, German, French, Italian, Ukrainian, Polish, Latin, Hungarian. The Hungarian I did not understand. They rest have worked surprisingly well.

- My firearm target practice average is now 100%, just like my baseball batting average. I shot once and hit the 10.

- In Slovak, the word for basement is pivnica, which comes from pivo, which means beer or drinking. This makes perfect sense because there is absolutely no other purpose to having a basement.

And now we are off to Canada to see my family and continue with the second half of the vacation...

On Dollhouse's Topher

Aside from the amazing writing, one thing I love about Dollhouse is the Topher character. It's the most (the only?) accurate portrayal of a genius computer hacker guy I've ever seen. He is spot on -- extremely competitive, always trying to do the absolutely best job possible, gets excited when somebody challenges him (Alpha), talks down to the dumb people around him, gets really into it when explaining something technical, has no respect for all that "feelings" and "cosmic morality" crap, but isn't an antisocial basement slob.

Any other show would have him buried under rotting, empty pizza boxes, with long, dirty hair, afraid of any contact with humans, let alone females. He would have a sex robot, too. Instead, Topher looks like a normal, approachable guy, works just fine with several other people, including a woman, who is his boss, and enjoys an occasional date with a girl. Yes, she is one of the "dolls", but that's quite far from the usual robotic girlfriend cliche you would find in Buffy or Firefly, both of which featured nerdy geniuses with robotic girlfriends. He likes the social aspect of his date; they spend the time playing video games and having a passionate conversation. In that sense, his relationship is more substantial than Adelle's relationship with Victor.

I get Topher completely. He is me. I'm glad Joss finally got it right, on third try. Fourth, if you count Dr. Horrible. That was a step in the right direction. A tiny part of me is waiting for Dollhouse to get cancelled, as it obviously will be, just so that I can see the next set of characters he comes up with. Looking at the garbage that was Season 1 of Buffy and the steep slope of improvement since then, I will be following his work very closely.


SLOCCount is a program for estimating the cost of writing code. It takes a directory of source code and estimates the amount of time and money it would take to hire somebody to write that code for you.

Here is what SLOCCount says about all of the code I wrote while solving problems on UVa and during various online programming competitions. This doesn't include TopCoder, so the true number is much larger.

Total Physical Source Lines of Code (SLOC)                = 178,686
Development Effort Estimate, Person-Years (Person-Months) = 46.32 (555.78)
 (Basic COCOMO model, Person-Months = 2.4 * (KSLOC**1.05))
Schedule Estimate, Years (Months)                         = 2.30 (27.61)
 (Basic COCOMO model, Months = 2.5 * (person-months**0.38))
Estimated Average Number of Developers (Effort/Schedule)  = 20.13
Total Estimated Cost to Develop                           = $ 6,256,575
 (average salary = $56,286/year, overhead = 2.40).
SLOCCount, Copyright (C) 2001-2004 David A. Wheeler
SLOCCount is Open Source Software/Free Software, licensed under the GNU GPL.
SLOCCount comes with ABSOLUTELY NO WARRANTY, and you are welcome to
redistribute it under certain conditions as specified by the GNU GPL license;
see the documentation for details.
Please credit this data as "generated using David A. Wheeler's 'SLOCCount'."

I'm not sure what to think about this.