Distributions ... continuing from Safe Withdrawal Rates ***
thanks to John R. for his comments

Every time I do some calculations I come up with that #$%@!*? Magic Sum, and it involves the reciprocal of some variable.

>Huh?
The reciprocals might be the reciprocal of stock prices or stock gains or portfolio values or ...

>Example?
Okay, for example, *** involves withdrawal rates which have the form W/P where W is the withdrawal amount and P is the portfolio. Assuming that W is a constant and that we know the distribution of the Ps, what's the distribution of withdrawal rates?

>The distribution of portfolios? What does that mean?
Don't you remember? We were following 10,000 investors, all of whom invest in some asset with a prescribed return distribution. (For example, a lognormal distribution with Mean Return = 9% and Volatility = 15%.)


Okay, let's concentrate on lognormal distributions.
We start with a variable, r, with a normal distribution described by f(r). This variable has Mean = M, SD = S.
We note this fact by writing: r = N[M,S] meaning r is Normally distributed with Mean = M, Standard Deviation = S.
Note: The "standard" notation is N[Mean,Variance], but here we'll use N[Mean,SD].

>M is the average annual return?
Well ... uh, actually, it's some return: average, annualized, whatever. The important thing here is that it's some constant. In practice I guess one would use annualized.

Okay, now we consider the variable: g = er which is then lognormal (by definition of "lognormal").

Suppose r represents annual returns, so that g represents annual gain factors.
Note that g = er is approximately 1+r, as expected. We can also consider that the return is continuously compounded where we divide the year into B seconds with a return of r/B per second so the "annual" return would be (1+r/B)B which, for B = a few jillion, is indistinguishable from er.    

Over n years, the cumulative gain factor is the product of the annual factors, namely: g1g2...gn = er1+r2+...+rn

Now we consider reciprocal: 1/g = e-r where the variable -r has Mean = -M, SD = S.
Over n years, the product of these reciprocals is 1/{g1g2...gn} = e-{r1+r2+...+rn}

We now consider the Magic Sum: gMS(n) = I1/g1 + I1I2/g1g2 + I1I2I3/g1g2g3 + ... + I1I2...In/g1g2...gn
where I1, I2, ... are annual inflation factors (an inflation rate of 3% means an inflation factor of I = 1.03)
and g1, g2, ... are the annual gain factors.

For the situation we're considering, the inflation rate is a constant, so Ik = Ik where I = 1+InflationRate = 1+i.
That gives: gMS(n) = I/g1 + I2/g1g2 + I3/g1g2g3 + ... +In/g1g2...gn

Here's our question: Given the g-distribution, what's the distribution of:   I/g1+I2/g1g2+ I3/g1g2g3+ ... +In/g1g2...gn ?

>So what good is this gMS?
In order that our portfolio last n years, the (initial) withdrawal rate must be less than 1/gMS.
In fact, if it were exactly 1/gMS then it would last exactly n years. This we call the Maximum Rate of Withdrawal

>We?
Well, I do. For example, the value of MRW for n = 30 years looks like this, for a S&P 500 portfolio.

>Do you see that notorious 4% "safe" withdrawal?
Yes, but we're talking S&P 500. Here's a picture for another allocation:


Both assume the sequence of annual returns occurs in historical order. Suppose they didn't. Suppose you started your retirement after a few bangup years on the market, like 1995-2000 when the S&P 500 skyrocketed. Suppose you started your retirement in, say, the year 2000. Your portfolio has ballooned. You expect to withdraw a percentage of this fat portfolio. As you might expect, the market tanked for the next three years! Indeed, if we took those 3 BAD years and stuck them in as the first 3 years of historical data, we' d get

Anyway, our goal is to establish the distribution of gMS(n).

To put it differently:
Starting with a $1.00 portfolio and
an initial withdrawal rate of w,
the value of our portfolio after n withdrawals is
g1g2...gn[ 1 - w*gMS(n)]
Note that the guy out front, that's g1g2...gn, is just the buy-and-hold portfolio, after n years.
We could, for example, assume that it's:

where Po = $1.00 and T = n, the number of years, and r and s are the Mean Annual Return and Standard Deviation (respectively).

>Huh?
It's noted here (as number 6).

Continuing, for our lognormal g-distribution, 1/gk = e-rk so our Magic Sum becomes:

gMS(n) = I e-r1 + I2 e-(r1+r2) + I3 e-(r1+r2+r3) + ... +Ine-(r1+r2+r3...+rn)

Look at a particular term:
Ine-(r1+r2+r3...+rn) = en log(I) -(r1+r2+r3...+rn) = [elog(I) -(1/n)(r1+r2+r3...+rn)]n

Our question now becomes: Given the r-distribution, what's the distribution of log(I) - (1/n)(r1+r2+r3...+rn) ?

As it turns out, if the annual returns rk are independent and selected from the same normal distribution, (which is the case we're considering),
then log(I) - (1/n)(r1+r2+r3...+rn) is Normal with Mean = log(I) - M and SD = S.
That is: N[log(I) - M,S].

Okay, so we consider a variable z = eq, where q is N[log(I) - M,S].
Since q is Normal, then z is lognormal and we then have:

gMS(n) = z1 + z1z2 + z1z2z3 + ... +z1z2z3...zn = eq1 + eq1+q2 + ... + eq1+q2+...+qn

Note that, if we know the Mean and Standard Deviation of q (normally distributed), then we also know the Mean and Standard Deviation of lognormally distributed z = eq (as noted here), namely:
A
Mean(z) = eMean(q)+SD2(q)/2 = I e-(M-S2/2) = M
SD2(z) = e2Mean(q)+SD2(q) [eSD2(q)-1] = I2e-2(M-S2/2)[eS2-1] = M2 [eSD2(q)-1] = S2
since Mean(q) = log(I) - M and SD(q) = S.

Our question now becomes:
If the set zk is lognormally distributed, what's the distribution of z1 + z1z2 + z1z2z3 + ... +z1z2z3...zn ?
>Don't you find that confusing? I mean ...?
Let's just look at two years worth, so we consider just z1 + z1z2.

Understand that, although z1 and z2 are selected from the same lognormal distribution, they are independent. Let's call them u and v.

  • We pick a jillion values for u and plot their distribution.
  • We then pick another jillion values for v, multiply by u, then plot the distribution of uv.
  • Finally we plot the distribution of the jillion sums, like so
>So what's the distribution of ... uh ...?
Of z1 + z1z2 + z1z2z3 + ... +z1z2z3...zn ?
I have no idea ... but I'm thinking ...


We'll find the Expected (or Average) value of x*y where Mean(x), Mean(y) and SD(x), SD(y) are the Means of Standard Deviations of the two random variables x and y.

As noted here :
        { Mean(xy) - Mean(x) Mean(y) } = SD(x)SD(y) PC
where PC is the pearson correlation.

Since we're assuming that our variables are independent, then PC = 0 so that Mean(xy) = Mean(x) Mean(y)
... and that extends to a product with umpteen terms. That is, the Mean of a Product is the Product of the Means.

In particular, for our variables z1, z2, etc. (all of which are selected from the same distribution), we have:
Mean(z1z2... zk) = Mean(z1)Mean(z2)...Mean(zk) = Mean(z)k
so, from A:
[1] ...         Mean(z1z2... zk) = Ik e-k(M-S2/2) = Mk
where M = I e-(M-S2/2)

So we have:
B
Mean(gMS(n)) = Mean(z1 + z1z2 + z1z2z3 + ... +z1z2z3...zn) = M + M2 + M3 + ... + Mn

>Do you know what you're doing?
That's questionable. Anyway, let's forge ahead.
For independent random variables x and y, we have a magic formula:
        SD2(x*y) = Mean2(x)SD2(y) + Mean2(y)SD2(x) + SD2(x)SD2(y)

Let's look at the relative size of these terms, for
M = 0.09 (that's a 9% return) and S = 0.15 (that's a 15% SD) and I = 1.03 (that's an inflation rate of i = 3%) and
Mean(x) = M = I e-(M-S2/2) = 0.9520
SD2(x) = S2 = I2e-2(M-S2/2)[eS2-1] = 0.0206
Then SD2(x*y) = (0.9520)2(0.0206) + (0.9520)2(0.0206) + (0.0206)(0.0206) = 0.0187+0.0187+0.0004

We can write this as:
        SD2(z1z2...zk-1zk) = Mean2(z1z2...zk-1)SD2(zk) + Mean2(zk)SD2(z1z2...zk-1) + SD2(z1z2...zk-1)SD2(zk)

Since Mean2(zk) = M2 and SD2(zk) = S2 we can rewrite this as:
[3] ...         SD2(z1z2...zk-1zk) = Mean2(z1z2...zk-1) S2 + SD2(z1z2...zk-1) (M2 + S2)

We'll go step-by-step:

Product Mean2 SD2
z1 M2 S2
z1z2 M4 (M2+S2)2 - M4
z1z2z3 M6 (M2+S2)3 - M6
... ... ...
z1z2z3...zk M2k (M2+S2)k - M2k

We now have:

C
SD2(gMS(n)) = SD2(z1 + z1z2 + z1z2z3 + ... +z1z2z3...zn) = Σ (M2+S2)k - ΣM2k

>Is that legal?
Yes. The Variance of a Sum is the Sum of the Variances ... provided the variables have zero correlation.

>There's that independence again, eh?
Yes. A simplifying assumption.

>Don't you feel uncomfortable with all those simplifying assumptions?
The proof is in the pudding ... as they say.

>So you think that z1, z1z2, z1z2z3 ... you think they're independent?
Well, not really. I think that assuming z1, z2, z3 are independent, that's good. But z1, z1z2, z1z2z3? Not so good.

>So?
We'll start off with this simplifying assumption because ... uh, the proof is in the pudding. If we don't like what we get, we'll change it


For large values of n, formulas B and C reduce to:

Mean(gMS) = M / (1 - M)   as ninfinity
SD2(gMS) = (M2+S2) / [1-(M2+S2)] - M2/(1-M2)   as ninfinity

Remember our earlier example? M = 0.9520 and S2 = 0.0206

That'd give:

Mean(gMS) = 19.8   as ninfinity
SD2(gMS) = 3.012   so   SD(gMS) = 1.734   as ninfinity

>So, what does all that mean?
Remember? The Maximum Rate of Withdrawal is 1/gMS(n).
The Expected value of gMS (for large n) is 19.8 and 1/19.8 = 0.051 or 5.1%

>For this particular example.
Yes, of course.
For smaller n-values, the Expected Value of gMS(n) is given by:    

Mean(gMS(n)) = M (1 - Mn) / (1 - M)

and 1/(Expected Value) is

(1/M - 1) / (1 - Mn)

which, for M = 0.9520 is shown here

>Uh ... I've forgotten. What's that M again?.
Okay, we'll look at A and B and sum the series and write:

D
If portfolio returns are lognormally distributed with Mean and Standard Deviation M and S,
and g1, g2, g3 ... are annual gain factors
and I = 1 + i = 1 + InflationRate is the annual inflation factor
and M = I e-(M - S2/2) and S2 = I2e-2(M-S2/2)[eS2-1] = M2[eS2-1]
and gMS(n) = I/g1 + I2/g1g2 + I3/g1g2g3 + ... + In/g1g2...gn
then

Mean(gMS(n)) = M (1 - Mn) / (1 - M)

SD2(gMS(n)) = (M2+S2) [1-(M2+S2)n] / [1-(M2+S2)] - M2 (1-M2n)/(1-M2)

and 1/Mean(gMS(n)), namely (1/M - 1) / (1 - Mn), is
and gives an estimate of the Maximum Rate of Withdrawal.

Note: A random selection of returns would give a variety of values for 1/gMS(n).
Perhaps what we need is not the reciprocal of the average of gMS(n) but the average of the 1/gMS(n) values.
That means we should look at the average of the reciprocals, not the reciprocal of the average, eh?
However, the reciprocal of the average is the smaller of these two numbers, so it's "safer" when we're estimating a rate of withdrawal.

>The proof is in the pudding, you said. So, is it any good?
I have no idea, but it sure avoids a jillion Monte Carlo simulations, eh?

However, look at this:

See? For an S&P500 portfolio, starting in 1928, gMS(n) 20 roughly

>I'd say that, so far, the only useful thing is that funny formula ... which is a complete mystery to me.
Okay.

  • We start with the annual Gain Factors, each year: g1, g2, g3, ...
    (That means that if $1.00 grows to $1.23 in a given year, then 1.23 is the Gain Factor for that year.)
  • The annual Gain Factors are reduced by the Inflation Factor: I = 1+i to give g1/I, g2/I, g3/I, ...
  • The value of your portfolio, after n years, turns out to be:
    g1g2g3...gn [ 1 - w(I/g1 + I2/g1g2 + I3/g1g2g2 + ... + In/g1g2g2...gn) ]
    where w is your initial withdrawal rate. (If it's 4.5% then w = 0.045)
    (The subtracted terms, like w I/g1 and w I2/g1g2 are the loss of portfolio dollars ... due to your withdrawals)
  • We conclude that the value of your portfolio, after n years, is the value of a buy-and-hold portfolio (without withdrawals), namely g1g2g3...gn, multiplied by 1 - w gMS(n)
    where gMS(n) = I/g1 + I2/g1g2 + I3/g1g2g2 + ... + In/g1g2g2...gn
    (Note that the terms are all of the form of a product of   InflationFactor/GainFactor)
  • Whether or not your portfolio will survive n years will depend upon whether 1 - w gMS(n) is positive or negative.
  • For your portfolio to survive, you'll need 1 - w gMS(n) > 0 so you should withdraw less than 1/gMS(n).
  • To this end, we consider 1/gMS(n) ... to be sure that w is less than this.
  • The Expected Value of gMS(n) (as derived above ... with certain assumption!) is: M + M2 + ... + Mn
    where M is the Expected InflationFactor/GainFactor = I/(AnnualizedGainFactor)
  • For a lognormal distribution of returns, Annualized Gain Factor = eM - S2/2
    (In fact, M - S2/2 is an excellent approximation for annualized returns ... if you know the Mean and Standard Deviation!)
  • Then M = I / eM - S2/2.
  • To deal with M + M2 + ... + Mn, we then need to evaluate a sum of the form:
    (1)     SUM = 1+x + x2 + ... + xN-1
    We multiply by x and get
    (2)     x SUM = x + x2 + x3 + ... + xN
    Subtract (2) from (1) and get:
    (3)     (1-x)SUM = 1 - xN   so   SUM = (1 - xN) / (1-x)
  • We use this to evaluate our sum, which we write as: M[1 + M + M2 + ... + Mn-1]
    This gives: Expected Value of gMS(n) = M [1 - Mn] / [1 - M]
  • Recall that we're interested in 1/gMS(n) so, inserting the Expected Value of gMS(n)
    and substituting M = I / eM - S2/2 and staring at the reciprocal of that Expected Value of gMS(n),
    namely [1 - M] / [1 - Mn] / M = [1/M - 1] / [1 - Mn] we get voila!
    ... the funny formula for our estimate of the Maximum Rate of Withdrawal:

>zzzZZZ

You may be interested to know that, even if I believed the above stuff, I still have no idea what's the distribution of gMS values ... but if I generate a thousand gMS values (using random lognormal returns) and plot their distribution (as described here), it looks like this

Doesn't it look familiar? That lognormal curve has the same Mean and Standard Deviation as the thousand-value sample.

>zzzZZZ

How about a calculator, using that funny formula?
Mean Annual Return = %
Standard Deviation = %
Inflation Rate = %
Number of Years =
Maximum Rate of Withdrawal = % ... maybe :^)

>zzzZZZ

to continue