anyway, there are a couple of very particular circumstances in which you might use one of these in a biological context, and in which it would arguably be superior to the arithmetic mean. a book which i think does a pretty good job of laying these out is A Primer of Ecological Statistics by N.J. Gotelli and A.M. Ellison (2004), pp. 61-63. (if you're really into this stuff, you're welcome to borrow my copy and read it for yourself.)
so here i expand somewhat on one of their hypothetical examples to illustrate the use of the geometric mean in summarizing population growth rates: assume an initial population of 1000 individuals and, for simplicity's sake, a growth rate of 10% the first year, increasing by one percentage point per year up to 20% in the eleventh year. so, at the end of the first year, the population is (1000 * 1.10) = 1100. likewise, in the second year, the population grows by 11% to (1100 * 1.11) = 1221. by the end of the eleventh year, the population reaches 4633.07 (you'll have to forgive the biologically unrealistic fractional individuals).
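if you'd like to check those numbers in R, here's a minimal sketch. the variable names (growth, pop) are just my own choices for illustration. it leans on the base function cumprod(), which returns a running product, so no explicit loop is needed:

```r
# yearly growth multipliers for 10%, 11%, ..., 20% (eleven values)
growth <- 1 + (10:20) / 100

# population at the end of each year of growth, starting from 1000 individuals
pop <- 1000 * cumprod(growth)

# after the first, second and eleventh rounds of growth
round(pop[c(1, 2, 11)], 2)   # 1100.00 1221.00 4633.07
```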
now, if you wanted to summarize the growth over these eleven years, you'd be tempted to just 'average' the rates: that is, take the arithmetic mean of 10%, 11%, 12% ... up to 20%, which (as you can probably work out in your head) is exactly 15%. in other words, on average, you'd say, there was 15% growth per year for those eleven years. it makes sense, but, as it turns out, it's not quite right: 1000 * 1.15 = 1150 (1st year); 1150 * 1.15 = 1322.5 (2nd year) ... ending after the eleventh year with 4652.39, which is almost 20 greater than it should be (4633.07, from the previous paragraph).
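here's the same comparison as a rough R sketch. the names actual and arith are mine, and Y is simply the vector of the eleven yearly multipliers:

```r
# the eleven yearly growth multipliers, 1.10 through 1.20
Y <- 1 + (10:20) / 100

# true final population: apply each year's own rate in turn
actual <- 1000 * prod(Y)

# final population if every year had grown by the arithmetic mean instead
arith <- 1000 * mean(Y)^length(Y)

round(c(actual, arith, arith - actual), 2)   # 4633.07 4652.39 19.32
```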
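so, the arithmetic mean overestimates the average growth. as it turns out, you get the right answer if you instead use the geometric mean of the eleven values (1.10, 1.11, 1.12 ... 1.20), which is only a little bit smaller: 1.149565... (as opposed to 1.15). that's because the geometric mean is the eleventh root of the product of the eleven multipliers, so compounding it eleven times gives back exactly that product.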
as a side note: R doesn't have a built-in function for calculating geometric means, but it's nevertheless fairly easy to do:
> Y = c(1.10, 1.11, 1.12, 1.13, 1.14, 1.15, 1.16, 1.17, 1.18, 1.19, 1.20)
> mean(Y) ## regular old arithmetic mean
[1] 1.15
> exp(mean(log(Y))) ## geometric mean using base "e"
[1] 1.149565
> 10^(mean(log(Y, base=10))) ## same answer in base 10
[1] 1.149565
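if you find yourself doing this a lot, one option is to wrap the calculation in a small function. this is only a sketch: the name geo_mean is my own invention (it is not part of base R), and it assumes every value is strictly positive, since the logs are undefined otherwise. the last line checks that compounding the geometric mean over all eleven years gives back the true final population:

```r
# hypothetical helper; geo_mean is not a base R function
geo_mean <- function(x) {
  # logs only make sense for strictly positive numbers
  stopifnot(is.numeric(x), all(x > 0))
  exp(mean(log(x)))
}

Y <- c(1.10, 1.11, 1.12, 1.13, 1.14, 1.15, 1.16, 1.17, 1.18, 1.19, 1.20)
g <- geo_mean(Y)

# compare compounding g eleven times with applying each year's own rate
all.equal(1000 * g^length(Y), 1000 * prod(Y))   # TRUE
```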
this blog entry has already turned out much longer than i anticipated, so i'll leave it as an exercise to the reader (if there are any of you left by now) to work through the calculations. given that you haven't yet been introduced to R's 'looping' functions, it would probably make more sense to do the calculations using a spreadsheet. (i know, i know; i warned you away from them for statistical work, but they nevertheless have their uses for quick-and-dirty calculations).
let me know if you're interested in doing this, and i'm happy to help you get started.
i'll pick up with harmonic means in my next entry! (i know you can hardly wait!)
1 comment:
Well, I'm not sure if the growth rates need to be regular or not, but what about sea level rise over time, say the last 11,000 years? On the one hand, an arithmetic mean of the growth rates would show a value x, but that wouldn't really describe the monumental changes in the growth rate that have taken place, due to deglaciation in the beginning and now anthropogenic warming of the troposphere and the resultant melting of the polar ice reservoirs. Perhaps a geometric mean or a harmonic mean would be more appropriate?