As we discussed earlier, I performed a frequency analysis on a large body of text: my personal journal entries from 4-24-02 through 4-23-05. The three years' worth of data proved rather useful. In the end, I had 1,110,954 words and 4,397,669 characters to work with. I ran the frequency analysis and got the data in the attached spreadsheet. (The raw numbers are included below in case there are any problems with the attachment.)
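
For reference, the counting step can be sketched roughly as follows. This is a minimal illustration in Python, not the actual tool used for the spreadsheet; the function names are hypothetical. It assumes that "characters" means the letters a-z only, which is consistent with the raw numbers below (the per-letter counts add up exactly to the total character count), and it assumes words are separated by whitespace.

```python
from collections import Counter
import string


def letter_frequencies(text):
    """Count each letter a-z, ignoring case, punctuation, digits and spaces."""
    counts = Counter(ch for ch in text.lower() if ch in string.ascii_lowercase)
    # Report every letter, including those that never occur.
    return {letter: counts.get(letter, 0) for letter in string.ascii_lowercase}


def basic_stats(text):
    """Return the word count and the total letter count (assumed definitions)."""
    freqs = letter_frequencies(text)
    return {
        "words": len(text.split()),          # whitespace-separated tokens
        "characters": sum(freqs.values()),   # letters a-z only
    }


if __name__ == "__main__":
    sample = "Hello, World!"
    for letter, count in letter_frequencies(sample).items():
        if count:
            print(letter, count)
    print(basic_stats(sample))
```

Counting in a single pass with `Counter` keeps the sketch short; any equivalent tally (a 26-element array, a spreadsheet formula) would give the same numbers.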

Then I went through and eliminated the common words. In addition to the list of 29 distinct words that you gave us, I added some of my own, since I know which words I commonly use in my entries. They are: "too", "my", "me", "she", "her", "his", "so", "we", "well", "really", "when", "went", "had", "as", and "am". This dropped my word count to 638,952 and my character count to 3,232,981.
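
The filtering step might look something like the sketch below. It is only an illustration under stated assumptions: the tokenization rule (lowercased runs of letters and apostrophes) and the names `EXTRA_STOP_WORDS` and `remove_stop_words` are hypothetical, and the given list of 29 words is not reproduced here, so it would have to be added to the stop-word set by the reader.

```python
import re

# The 15 words I added on top of the given list of 29.
# The given list itself is not included; merge it in before real use.
EXTRA_STOP_WORDS = {
    "too", "my", "me", "she", "her", "his", "so", "we",
    "well", "really", "when", "went", "had", "as", "am",
}


def remove_stop_words(text, stop_words):
    """Split text into lowercase words and drop any that are in stop_words."""
    # Assumed tokenization: runs of letters, with apostrophes kept inside words.
    words = re.findall(r"[a-z']+", text.lower())
    return [word for word in words if word not in stop_words]


if __name__ == "__main__":
    kept = remove_stop_words("She went home and I was really tired", EXTRA_STOP_WORDS)
    print(kept)
```

The remaining words can then be joined back together and run through the same letter count as before to get the reduced totals.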

Raw Data (slightly processed data behind this link)

Source of Data: Personal Journal Entries from 4-24-02 through 4-23-05

Letter counts, all words:
a 379,170
b 60,191
c 94,713
d 193,355
e 523,105
f 95,787
g 102,053
h 254,292
i 328,993
j 8,693
k 50,879
l 190,526
m 118,844
n 280,195
o 350,506
p 69,655
q 5,631
r 221,794
s 243,821
t 461,249
u 114,311
v 38,133
w 122,230
x 6,173
y 80,361
z 3,009


Statistics

Words: 1,110,954
Total Characters: 4,397,669

Letter counts, common words removed:
a 233,168
b 44,713
c 93,342
d 155,159
e 428,475
f 61,644
g 102,053
h 142,252
i 192,941
j 8,693
k 47,585
l 162,260
m 93,802
n 215,374
o 252,187
p 69,655
q 5,631
r 188,999
s 180,950
t 268,251
u 107,998
v 38,133
w 68,854
x 6,173
y 61,680
z 3,009

Statistics

Words: 638,952
Total Characters: 3,232,981


Words omitted: the given list of 29 words, plus "too", "my", "me", "she", "her", "his", "so", "we", "well", "really", "when", "went", "had", "as", "am"
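
Because the two sets of counts have different totals, comparing them is easier as percentages. A minimal sketch of that conversion, assuming the counts are held in a dictionary keyed by letter (the function name `relative_frequencies` is hypothetical):

```python
def relative_frequencies(counts):
    """Convert raw letter counts into percentages of the total."""
    total = sum(counts.values())
    if total == 0:
        # Avoid dividing by zero on empty input.
        return {letter: 0.0 for letter in counts}
    return {letter: 100.0 * count / total for letter, count in counts.items()}


if __name__ == "__main__":
    # Toy counts for illustration only, not taken from the journal data.
    toy = {"a": 1, "b": 3}
    print(relative_frequencies(toy))
```

Feeding each of the two letter tables above into this function gives two percentage distributions that can be compared letter by letter.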