Friday, June 21, 2013

Finite Haiku

So a funny thing happened in my brain today. I wondered... given that a haiku has exactly 17 syllables all up (5/7/5), and that the English language has a certain number of words/sounds, how many unique haiku exist?

Well... I'm still not sure, but I did some poking around to work out the worst case scenario. The worst case is that English is a living language, and that a haiku is simply a sequence of 17 syllables, each drawn from a pool of all possible syllables in the language. While this is very simplistic, and throwing random syllables together is unlikely to produce words, let alone meaning, it does future-proof against new words that may turn up. And it does provide a maximum number.

So how many syllables are there in English? Turns out I couldn't find any definitive answer, but I did find an article here which again is kind of a worst case scenario. The author refers to 15,831 syllable candidates. This does seem rather large, and I'd be interested if someone else has any good sources on something more accurate.

So if we take this worst case of 15,831 syllable candidates, and we have 17 positions to fill, again using the worst case assumption that any syllable can follow any syllable, we end up with 15,831^17 unique haiku (15,831 choices for each of the 17 positions) - which will include both the nonsense ones and also every possible sensible haiku. It did take a while to find a calculator that wasn't going to fall over punching in that kind of number. Luckily, Wolfram Alpha was obliging and came up with roughly 2.5*10^71.

A stupidly large number. How stupidly large? Well... let's compare it to some other things. For the bridge players out there, there are 5.4*10^28 unique bridge deals. For the chess players, it's estimated that there are 1*10^120 unique chess games. So the haiku ceiling lands somewhere between the two.

So I'd like to make the number more accurate, but I'm not sure how. Any suggestions? Maybe if I could find the average number of syllables in a word (not in a normal distribution, but across the English language), I could use that and the total number of unique words. Any other ideas?
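
For anyone who wants to check the arithmetic without hunting for a calculator, here is a rough Python sketch. It assumes the 15,831-candidate pool from the article and treats each of the 17 positions as an independent pick, so the count is the pool size raised to the number of positions. The constant names are mine, and the bridge figure is computed as 52! / (13!)^4, which is the usual way of counting deals.

```python
from math import factorial

# Assumptions taken from the post: a pool of 15,831 syllable candidates,
# 17 syllable positions, and any syllable may follow any other.
SYLLABLE_POOL = 15_831
HAIKU_SYLLABLES = 17

# One independent choice per position: pool size to the power of positions.
haiku_bound = SYLLABLE_POOL ** HAIKU_SYLLABLES

# Comparison points: bridge deals (52 cards into four hands of 13)
# and the rough 10^120 estimate for chess games quoted in the post.
bridge_deals = factorial(52) // factorial(13) ** 4
chess_games = 10 ** 120

for label, value in [
    ("bridge deals", bridge_deals),
    ("haiku upper bound", haiku_bound),
    ("chess games (estimate)", chess_games),
]:
    # Python integers are arbitrary precision, so nothing falls over here.
    print(f"{label}: about {value:.2e} ({len(str(value))} digits)")
```

Python's integers grow as needed, so the exact value is available as well as the rounded one.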

1 comment:

anti ob said...

Hey! That's not nice; RSS feed apparently lost track of For Battle... there's like a whole new post in here I've never seen before!

Assuming we're not allowed a word to cross the boundary between lines, you only care about words of 7 syllables or less. - which I just discovered and have no reason to believe is complete, but which should at least give us a ballpark estimate - lists the following numbers of words of a given number of syllables:

1 - 8,267
2 - 40,902
3 - 52,135
4 - 40,125
5 - 24,384
6 - 14,129
7 - 7,978

(Which seems semi-reasonable, given the oft-quoted estimate of 250,000 words in English. Of course limiting yourself to English for a traditional Japanese form of poetry is ludicrous, but hey; it's what I speak.)

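So for 5-syllable lines we should have the following, where the multiplier is the number of distinct orders the word lengths can appear in (the powers already account for the order of the words themselves):

[5x1]
8267^5
[3x1 + 1x2]
4 * 8267^3 * 40902
[2x1 + 1x3]
3 * 8267^2 * 52135
[1x1 + 2x2]
3 * 8267 * 40902^2
[1x2 + 1x3]
2 * 40902 * 52135
[1x1 + 1x4]
2 * 8267 * 40125
[1x5]
24384

38613546186107286107 + 92437533470428104 + 10689232116045 + 41491418352804 + 4264851540 + 663426750 + 24384

= 38706035905156485734 possible 5-syllable lines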

7 syllable lines are left as an exercise for the reader who is not stupidly awake at 5am on a Saturday, as I feel that may have been enough to let me get to sleep...
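
For the reader who does take up the exercise, here is a minimal Python sketch of one way to do it. It uses the word counts from the table above and makes a few assumptions of its own: words may repeat, word order matters, and no word crosses a line boundary. Rather than listing every combination by hand, it builds the count up one syllable at a time: a line of n syllables ends in some word of k syllables, preceded by any valid line of n - k syllables. Counted this way, each mix of word lengths is weighted by its number of distinct orderings (1, 4, 3, 3, 2, 2, 1 for the 5-syllable cases), not by a factorial. The function and variable names are made up for illustration.

```python
# Word counts by syllable length, copied from the table in the comment.
WORDS_BY_SYLLABLES = {
    1: 8267,
    2: 40902,
    3: 52135,
    4: 40125,
    5: 24384,
    6: 14129,
    7: 7978,
}


def count_lines(syllables, words=WORDS_BY_SYLLABLES):
    """Count ordered word sequences that total exactly `syllables` syllables."""
    ways = [0] * (syllables + 1)
    ways[0] = 1  # the empty line: one way to use zero syllables
    for total in range(1, syllables + 1):
        # The last word has `length` syllables; whatever comes before it
        # is any valid sequence totalling `total - length` syllables.
        ways[total] = sum(
            count * ways[total - length]
            for length, count in words.items()
            if length <= total
        )
    return ways[syllables]


five = count_lines(5)
seven = count_lines(7)
haiku = five * seven * five  # the 5/7/5 lines are chosen independently

print(f"5-syllable lines: about {five:.3e}")
print(f"7-syllable lines: about {seven:.3e}")
print(f"5/7/5 haiku:      about {haiku:.3e}")
```

The result is still an upper bound on sensible haiku, since it happily counts lines like "the the the the the".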