Tempo experiment analysis: the good
Sorry for the delay in getting this analysis out; sickness and wanting to have everything finished got in the way. It's not completely finished, but I have enough interesting things to discuss for now. This blog post deals with the things that went well.
The best thing that happened was that you guys actually played the game! :) I got much, much more data than I was expecting -- in the end, there were 1041 tapping games played, with a total of 14,342 taps. I was hoping for 200 or so tap-games, so this gives me five times as much data as I was hoping for in my wildest dreams!
Of those tap-games, 882 resulted in the player accepting the computer's tempo, 89 resulted in the player giving their own tempo, and 70 resulted in a disagreement about tempo but no other tempo given. 635 tap-games used an audio metronome, while 361 used a visual metronome, and 45 used no metronome at all.
Most tempi (plural of "tempo"; no, it's not "tempos") were very close to the metronome -- looking at tap-games with user-agreed or user-specified tempo, and excluding tap-games with no metronome, we find that 82.1% of them had a difference of less than 5%, and 41.7% had a difference of less than 1%.
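To make the "difference of less than 5%" figure concrete, here is a minimal sketch of one way to compute it. It assumes the difference is measured relative to the metronome tempo, which the numbers above don't spell out; the function names and the sample tempi are made up for illustration and are not the experiment's data.

```python
def relative_tempo_difference(player_bpm, metronome_bpm):
    """Difference between two tempi as a fraction of the metronome tempo.

    Assumption: the metronome is the reference; the post does not say so.
    """
    if metronome_bpm <= 0:
        raise ValueError("metronome tempo must be positive")
    return abs(player_bpm - metronome_bpm) / metronome_bpm


def share_within(pairs, threshold):
    """Fraction of (player_bpm, metronome_bpm) pairs strictly below threshold."""
    if not pairs:
        raise ValueError("need at least one tap-game")
    hits = sum(1 for p, m in pairs if relative_tempo_difference(p, m) < threshold)
    return hits / len(pairs)


# Made-up example tempi, not real results.
games = [(120.5, 120.0), (95.0, 100.0), (61.0, 60.0), (150.0, 120.0)]
print(share_within(games, 0.05))
```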

The distribution of tap-games played per level shows an unsurprising majority of level 1, but a surprising amount of interest in level 9. I'm guessing that this is because the game was too easy (see next blog post).

Returning to the main point of this experiment, 560 tap-games have their tempo detected trivially with linear regression via ordinary least squares: the player tapped the correct rhythm, with no note more than a tenth of a quarter-note beat away from the strictly-metronomic position. I call those the "boring" tap-games... I mean, if the obvious solution works, then it's no fun! :)
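For the curious, here is a minimal sketch of what such a "boring" check could look like. It is not the actual analysis code: it assumes each tap has already been matched to its expected position in the rhythm (measured in quarter-note beats), and the names `fit_tempo` and `is_boring` are hypothetical. It fits tap time = offset + period * beat with ordinary least squares, then checks that every tap lies within a tenth of a beat of the fitted line.

```python
def fit_tempo(beats, times):
    """Ordinary least squares fit of times = offset + period * beats.

    beats: expected note positions in quarter-note beats (assumed known).
    times: tap times in seconds.
    Returns (bpm, offset_seconds, period_seconds_per_beat).
    """
    n = len(beats)
    if n < 2 or n != len(times):
        raise ValueError("need at least two taps, one per expected beat")
    mean_b = sum(beats) / n
    mean_t = sum(times) / n
    sxx = sum((b - mean_b) ** 2 for b in beats)
    if sxx == 0:
        raise ValueError("all beat positions are identical")
    sxy = sum((b - mean_b) * (t - mean_t) for b, t in zip(beats, times))
    period = sxy / sxx                # slope: seconds per quarter note
    if period <= 0:
        raise ValueError("fitted period is not positive")
    offset = mean_t - period * mean_b  # intercept: time of beat 0
    return 60.0 / period, offset, period


def is_boring(beats, times, tolerance_beats=0.1):
    """True if no tap is more than tolerance_beats away from the fitted line."""
    try:
        _, offset, period = fit_tempo(beats, times)
    except ValueError:
        return False
    return all(
        abs(t - (offset + period * b)) / period <= tolerance_beats
        for b, t in zip(beats, times)
    )


# Made-up taps: four quarter notes at roughly 120 beats per minute.
print(fit_tempo([0, 1, 2, 3], [0.0, 0.5, 1.0, 1.5])[0])
print(is_boring([0, 1, 2, 3], [0.0, 0.5, 1.0, 2.5]))
```

The residuals are divided by the fitted period so that the tolerance is expressed in beats rather than seconds, which keeps the threshold independent of tempo.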
Happily, this leaves 481 "interesting" tap-games which require more complicated solutions. I'm still working on this part; my latest algorithm correctly identifies most of the tempi, but there's still a bunch left, and I think my current approach isn't going to pan out. I need to take a break for a day and return to it with a fresh mind.
Oh, on the topic of "boring" tap-games -- judging from the data I've been looking at manually, I estimate that approximately two thirds of the tap-games can be handled with simple linear regression. The missing hundred or so tap-games (between "two thirds of 1041" and the "560" mentioned above) come from wanting to avoid false positives. I'm being paranoid about false positives; I would rather have the computer spend more time analyzing the taps than produce an incorrect tempo estimate.