Variance of a Special SUM

We assume that g_1, g_2, g_3, ... are independent random variables, each with the same Mean and Variance (= StandardDeviation^2) given by:
      M[g_i] = g
      VAR[g_i] = V = s^2

We want to determine the Variance of a Special Sum, namely:

[1a]       SUM(n) = g_1 + g_1 g_2 + g_1 g_2 g_3 + ... + g_1 g_2 g_3 ... g_n

>What's so special about ...?
Pay attention.
We'll assume that the g's are the daily Gain Factors for some stock price. That is, they're the numbers g_1 = P_1/P_0, g_2 = P_2/P_1, ... g_n = P_n/P_{n-1} where the P's are the stock prices.
Then [1a] is the sum of the stock prices over the past n days, assuming the price started at $1.00 n days ago ... or it's the sum of the prices over the next n days, if today's price is $1.00. In what follows we'll assume that the starting price is P_0 = $1.00.
For convenience, we'll set:   G_m = g_1 g_2 g_3 ... g_m   for m = 1, 2, 3, ... n     (This corresponds to the price after m days: G_m = P_m/P_0)
We can then write:

[1b]       SUM(n) = G_1 + G_2 + G_3 + ... + G_n
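To make [1a] and [1b] concrete, here is a minimal Python sketch. The function names and the three sample Gain Factors are made up for illustration; they aren't from the text.

```python
from itertools import accumulate
from operator import mul

def cumulative_gains(gains):
    """Return [G_1, ..., G_n] where G_m = g_1 * g_2 * ... * g_m."""
    return list(accumulate(gains, mul))

def special_sum(gains):
    """SUM(n) = G_1 + G_2 + ... + G_n, as in [1b]."""
    return sum(cumulative_gains(gains))

# Hypothetical Gain Factors for three days (+1%, -2%, +3%), with P_0 = $1.00
example_gains = [1.01, 0.98, 1.03]
example_G = cumulative_gains(example_gains)   # prices after 1, 2 and 3 days
example_sum = special_sum(example_gains)      # sum of those three prices
```

With P_0 = $1.00, example_G is the list of daily prices and example_sum is their total.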

Since we're assuming that the g's are independent, the Mean of G_m = g_1 g_2 g_3 ... g_m is g^m.

>Huh?
Okay, we'll recall some magic Stat Stuff regarding the Mean, Variance, Standard Deviation and Covariance of random variables (which we'll call M[x], VAR[x], S[x] and COVAR[x,y]):
Stat Stuff
If x, y, x1, x2 etc. are random variables and C is a constant, then:
  1. M[x+y] = M[x] + M[y]   and M[x+C] = M[x] + C since M[C] = C
  2. VAR[x] = S^2[x] = M[(x-M[x])^2] = M[x^2] - (M[x])^2   so VAR[x-M[x]] = VAR[x+C] = VAR[x]
  3. COVAR[x,y] = M[xy] - M[x] M[y] = COVAR[x+C, y]
        and notice that COVAR[x,x] = VAR[x]
  4. VAR[x+y] = VAR[x] + VAR[y] + 2 COVAR[x,y] = VAR[x] + VAR[y] + 2 r(x,y) S[x]S[y]
        where r(x,y) = COVAR[x,y] / (S[x]S[y]) is the Correlation Coefficient
  5. VAR[x_1+x_2+...+x_m] = ΣVAR[x_i] + 2 ΣCOVAR[x_j, x_k]
        i = 1 to m, k = 2 to m and j < k
  6. COVAR[x_1+x_2+...+x_n, y] = COVAR[x_1,y] + COVAR[x_2,y] + ... + COVAR[x_n,y]
  7. If the variables are independent (so COVAR[x,y] = 0 and r(x,y) = 0), then:
    1. M[xy] = M[x] M[y]   and   M[x_1 x_2...x_m] = M[x_1]M[x_2]...M[x_m]
    2. VAR[x+y] = VAR[x] + VAR[y]   and   VAR[x_1+x_2+...+x_m] = VAR[x_1]+VAR[x_2]+...+VAR[x_m]
    3. VAR[xy] = M^2[x]VAR[y] + M^2[y]VAR[x] + VAR[x]VAR[y]

In addition, we'll need some other magic formulas:
Magic Formulas
  1. (1 + x)^n = 1 + nx   approximately, when nx is small.
  2. 1 + 2 + 3 + ... + n = n(n + 1)/2
  3. 1 + x + x^2 + ... + x^{m-1} = (x^m - 1) / (x - 1)
  4. 1 + 2x + 3x^2 + ... + (m-1)x^{m-2} = [(m-1)x^m - m x^{m-1} + 1] / (x - 1)^2
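If you don't trust Magic Formulas #3 and #4, a few lines of Python will check them term by term. This is just a numerical spot check for x not equal to 1, and the function names are invented for the occasion.

```python
def geometric_direct(x, m):
    """Left side of Magic Formula #3: 1 + x + x^2 + ... + x^(m-1)."""
    return sum(x**k for k in range(m))

def geometric_closed(x, m):
    """Right side of Magic Formula #3 (needs x != 1)."""
    return (x**m - 1) / (x - 1)

def weighted_direct(x, m):
    """Left side of Magic Formula #4: 1 + 2x + 3x^2 + ... + (m-1)x^(m-2)."""
    return sum(k * x**(k - 1) for k in range(1, m))

def weighted_closed(x, m):
    """Right side of Magic Formula #4 (needs x != 1)."""
    return ((m - 1) * x**m - m * x**(m - 1) + 1) / (x - 1)**2
```

For x very close to 1 the closed forms subtract nearly equal numbers, so expect agreement to many digits but not to the last one.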

>Can I just bypass the math and go directly to the result ... please?
Well ... okay. Skip ahead to the formula marked [!!!].

Continuing ... to get the Variance of the SUM(n) we'll use Stat Stuff #5:

[2a]       VAR[G_1 + G_2 + ... + G_n] = ΣVAR[G_i] + 2 ΣCOVAR[G_j, G_k]   where i goes from 1 to n and the latter sum is for j < k and k goes from 2 to n

>Huh? Do you really expect me to ...?
Okay, in all its grandeur, it looks like:
VAR[G_1 + G_2 + ... + G_n] = VAR[G_1] + VAR[G_2] + ... + VAR[G_n]
+ 2COVAR[G_1, G_2]
+ 2COVAR[G_1, G_3] + 2COVAR[G_2, G_3]
+ 2COVAR[G_1, G_4] + 2COVAR[G_2, G_4] + 2COVAR[G_3, G_4]
...
+ 2COVAR[G_1, G_n] + 2COVAR[G_2, G_n] + ... + 2COVAR[G_{n-1}, G_n]


Consider COVAR[G_j,G_k]. Remember that k = 2, 3, ... n and j < k.
From Stat Stuff #3:
      COVAR[G_j,G_k] = M[G_j G_k] - M[G_j]M[G_k]     Mean of the Product - the Product of the Means
But the g's are independent, so that
      M[G_k] = M[g_1 g_2...g_k] = M[g_1]M[g_2]...M[g_k] = g^k     Mean of a Product = the Product of the Means
So we can rewrite:   COVAR[G_j,G_k] = M[G_j G_k] - g^{j+k} so ...

>If Mean of a Product equals the Product of the Means, why isn't that COVAR[G_j,G_k] zero?
Because G_j and G_k aren't independent since G_k contains all the factors of G_j ... and more!

>Huh?
Pay attention:
Consider the term M[G_j G_k] = M[(g_1 g_2...g_j)*(g_1 g_2...g_k)] = M[(g_1 g_2...g_j)^2 g_{j+1} g_{j+2}...g_k]   noting that j is less than k.

Now we use "Mean of a Product equals the Product of the Means" because (g_1 g_2...g_j)^2 and g_{j+1} g_{j+2}...g_k are independent:
      M[(g_1 g_2...g_j)^2 g_{j+1} g_{j+2}...g_k] = M[(g_1 g_2...g_j)^2] M[g_{j+1} g_{j+2}...g_k]

For the second factor we have:
      M[g_{j+1} g_{j+2}...g_k] = M[g_{j+1}]M[g_{j+2}]...M[g_k] = g^{k-j}.

For the first factor we use Stat Stuff #2: Mean[x^2] = (Mean[x])^2 + VAR[x] with x = G_j = g_1 g_2...g_j and get:
      Mean[(g_1 g_2...g_j)^2] = (Mean[g_1 g_2...g_j])^2 + VAR[g_1 g_2...g_j] = (g^j)^2 + VAR[g_1 g_2...g_j]
where (again!) the Mean of a Product = the Product of the Means (since the g's are independent).

It might look more elegant if we rewrite this like so:
      Mean[G_j^2] = (Mean[G_j])^2 + VAR[G_j] = g^{2j} + VAR[G_j]

>Let's forget elegance, okay?
Putting it all together:
COVAR[G_j,G_k] = M[G_j G_k] - M[G_j]M[G_k]
= M[(g_1 g_2...g_j)^2 g_{j+1} g_{j+2}...g_k] - g^{j+k}
= M[G_j^2] M[g_{j+1} g_{j+2}...g_k] - g^{j+k}
= [g^{2j} + VAR[G_j]] g^{k-j} - g^{j+k}
= VAR[G_j] g^{k-j}
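Since this result surprises people, here's a rough Monte Carlo check in Python. It is a sketch under assumptions that aren't in the text: the Gain Factors are drawn from a Normal distribution with g = 1.01 and s = 0.02, the function names are invented, and VAR[G_j] is taken from the exact expression (g^2 + s^2)^j - g^{2j} that appears as [3] below.

```python
import random

def simulated_cov(j, k, g=1.01, s=0.02, trials=20_000, seed=1):
    """Sample covariance of G_j and G_k (requires 1 <= j < k).

    ASSUMPTION: each daily Gain Factor is Normal with mean g and SD s.
    """
    rng = random.Random(seed)
    gj_values, gk_values = [], []
    for _trial in range(trials):
        product = 1.0
        for day in range(1, k + 1):
            product *= rng.gauss(g, s)   # product is now G_day
            if day == j:
                gj_values.append(product)
        gk_values.append(product)
    mean_j = sum(gj_values) / trials
    mean_k = sum(gk_values) / trials
    cross = sum((a - mean_j) * (b - mean_k) for a, b in zip(gj_values, gk_values))
    return cross / (trials - 1)

def predicted_cov(j, k, g=1.01, s=0.02):
    """COVAR[G_j, G_k] = VAR[G_j] * g^(k-j), with VAR[G_j] from [3]."""
    var_gj = (g * g + s * s)**j - g**(2 * j)
    return var_gj * g**(k - j)
```

Comparing simulated_cov(2, 5) with predicted_cov(2, 5) should give two numbers that agree to within sampling noise (a percent or two with 20,000 trials).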


So far we have:

      VAR[G_1 + G_2 + ... + G_n] = ΣVAR[G_i] + 2 ΣVAR[G_j] g^{k-j}

But we know those VAR[G_i] for each i = 1, 2, 3, ... n
>We do?
Yes, we did it here and it looks like this:

[3]         VAR[G_m] = VAR[g_1 g_2 g_3...g_m] = (g^2 + s^2)^m - g^{2m}
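One way to convince yourself of [3] is to build VAR[G_m] one factor at a time with Stat Stuff #7.3, since G_m = G_{m-1} g_m with G_{m-1} and g_m independent. The sketch below does that recursion and sets it beside the closed form; it's an illustration with invented function names, not necessarily the route taken in the original derivation.

```python
def var_product_recursive(m, g, s):
    """VAR[G_m] from repeated use of Stat Stuff #7.3.

    VAR[G_i] = M[G_{i-1}]^2 * s^2 + g^2 * VAR[G_{i-1}] + VAR[G_{i-1}] * s^2
    """
    var = s * s                          # VAR[G_1] = VAR[g_1] = s^2
    for i in range(2, m + 1):
        mean_prev = g**(i - 1)           # M[G_{i-1}] = g^(i-1)
        var = mean_prev**2 * s * s + g * g * var + var * s * s
    return var

def var_product_closed(m, g, s):
    """Equation [3]: VAR[G_m] = (g^2 + s^2)^m - g^(2m)."""
    return (g * g + s * s)**m - g**(2 * m)
```

For g = 1.01 and s = 0.02 the two agree to rounding error for every m you care to try.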

For typical parameters, namely daily Gain Factors and Standard Deviations, we'd have g = 1+r with r small (r is the daily return, say 0.01 or less) so g is close to "1" and s small (say 0.02 or less) ... so s/g is small ... so we can use Magic Formula #1, like so:

      VAR[G_m] = (g^2 + s^2)^m - g^{2m} = g^{2m}[ (1 + s^2/g^2)^m - 1 ] ≈ g^{2m}[ (1 + m s^2/g^2) - 1 ] = m g^{2m-2} s^2

This says that (approximately), the Standard Deviation of m-day gains is SQRT(m g^{2m-2} s^2) = SQRT(m) g^{m-1} s.
That's just the 1-day Standard Deviation, s, increased by a factor: the square root of the time period SQRT(m) ... a familiar result.

>And increased by g^{m-1}, too.
Yes. That's like applying the average 1-day Gain Factor m-1 times.
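How much do we lose with Magic Formula #1? Here's a small Python sketch that compares the exact VAR[G_m] from [3] with the approximation m g^{2m-2} s^2. The function names are invented, and the parameters g = 1.01, s = 0.02 are the "typical" values mentioned above.

```python
def var_gain_exact(m, g, s):
    """Equation [3]: (g^2 + s^2)^m - g^(2m)."""
    return (g * g + s * s)**m - g**(2 * m)

def var_gain_approx(m, g, s):
    """Approximation after Magic Formula #1: m * g^(2m-2) * s^2."""
    return m * g**(2 * m - 2) * s * s

def relative_error(m, g=1.01, s=0.02):
    """(exact - approx) / exact; grows slowly with m."""
    exact = var_gain_exact(m, g, s)
    return (exact - var_gain_approx(m, g, s)) / exact
```

For these parameters the relative error is zero at m = 1 and stays under 1% out to m = 40, which is why the approximation is only claimed for n not too large.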

Anyway, [2a] becomes ...

>We're talking approximation, right?
Yes, but I won't keep repeating that word. Anyway, [2a] becomes ... approximately:
VAR[G_1 + G_2 + ... + G_n] = ΣVAR[G_i] + 2 ΣCOVAR[G_j, G_k]
= ΣVAR[G_i] + 2 ΣVAR[G_j] g^{k-j}
= Σ[ i g^{2i-2} s^2 ] + 2 Σ[ j g^{2j-2} s^2 ] g^{k-j}
= (s^2/g^2) Σ[ i g^{2i} ] + 2(s^2/g^2) Σ[ j g^{k+j} ]     where i = 1 to n, k = 2 to n and j < k

[!]      VAR[G_1 + G_2 + ... + G_n] = ΣVAR[G_i] + 2 ΣVAR[G_j] g^{k-j} = (s^2/g^2) Σ[ i g^{2i} ] + 2(s^2/g^2) Σ[ j g^{k+j} ]   approx
            i from 1 to n, k from 2 to n and j < k (meaning j = 1, 2, ... k-1)

From [!], we have two sums to evaluate:   Σ[ i g^{2i} ] and Σ[ j g^{k+j} ]

Evaluating   Σ[ i g^{2i} ]

We have:
Σ[ i g^{2i} ] = g^2 + 2g^4 + 3g^6 + ... + n g^{2n}
= x + 2x^2 + 3x^3 + ... + n x^n = x (1 + 2x + 3x^2 + ... + n x^{n-1})    where x = g^2 ... and we have a magic formula for that sum
= x [n x^{n+1} - (n+1)x^n + 1] / (x - 1)^2
= g^2 [n g^{2n+2} - (n+1)g^{2n} + 1] / (g^2 - 1)^2

Magic Formula #4 was used (with m = n+1):     1 + 2x + 3x^2 + ... + n x^{n-1} = [n x^{n+1} - (n+1)x^n + 1] / (x - 1)^2
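A quick numerical check of this first sum, as a Python sketch with made-up function names. It adds up i g^{2i} directly and compares with the closed form (which needs g different from 1).

```python
def single_sum_direct(n, g):
    """g^2 + 2g^4 + 3g^6 + ... + n g^(2n), added term by term."""
    return sum(i * g**(2 * i) for i in range(1, n + 1))

def single_sum_closed(n, g):
    """g^2 [n g^(2n+2) - (n+1) g^(2n) + 1] / (g^2 - 1)^2, for g != 1."""
    x = g * g
    return x * (n * x**(n + 1) - (n + 1) * x**n + 1) / (x - 1)**2
```

With g = 1.01 the two versions agree to many digits for n anywhere from 1 to 40.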

Evaluating   Σ[ j g^{k+j} ]

>That looks awful. How about in all its grandeur, eh?
Okay, in all its grandeur, it looks like:
Σ[ j g^{k+j} ] = [g^3] + [g^4 + 2g^5] + [g^5 + 2g^6 + 3g^7] + [g^6 + 2g^7 + 3g^8 + 4g^9] + ... + [g^{n+1} + 2g^{n+2} + 3g^{n+3} + ... + (n-1)g^{2n-1}]
= g^3 + g^4[1 + 2g] + g^5[1 + 2g + 3g^2] + ... + g^{n+1}[1 + 2g + 3g^2 + ... + (n-1)g^{n-2}]   for n-1 terms
= (g^3 + g^4 + g^5 + ... + g^{n+1}) + (g^4 + g^5 + g^6 + ... + g^{n+1})2g + (g^5 + g^6 + ... + g^{n+1})3g^2 + ... + g^{n+1}(n-1)g^{n-2}
      where we've collected the terms multiplying 1 then 2g then 3g^2 etc. ... ending with the term multiplying (n-1)g^{n-2}
= g^3[1 + g + g^2 + ... + g^{n-2}] + 2g^5[1 + g + g^2 + ... + g^{n-3}] + 3g^7[1 + g + g^2 + ... + g^{n-4}] + ... + (n-1)g^{2n-1}[1]
= g^3[(g^{n-1} - 1)/(g - 1)] + 2g^5[(g^{n-2} - 1)/(g - 1)] + 3g^7[(g^{n-3} - 1)/(g - 1)] + ... + (n-1)g^{2n-1}[(g - 1)/(g - 1)]
      where we've used Magic Formula #3: 1 + x + x^2 + ... + x^{m-1} = (x^m - 1)/(x - 1)
= [ [g^{n+2} - g^3] + [2g^{n+3} - 2g^5] + [3g^{n+4} - 3g^7] + [4g^{n+5} - 4g^9] + ... + [(n-1)g^{2n} - (n-1)g^{2n-1}] ] / (g - 1)
= [ g^{n+2}[1 + 2g + 3g^2 + ... + (n-1)g^{n-2}] - g^3[1 + 2g^2 + 3g^4 + ... + (n-1)g^{2n-4}] ] / (g - 1)
      where we've collected like terms
= [ g^{n+2}[(n-1)g^n - n g^{n-1} + 1] / (g - 1)^2 - g^3[(n-1)g^{2n} - n g^{2n-2} + 1] / (g^2 - 1)^2 ] / (g - 1)
      where we've used Magic Formula #4 (with m = n) with x = g and again with x = g^2
= [ { (n-1) g^{2n+2} - n g^{2n+1} + g^{n+2} } (g + 1)^2 - (n-1) g^{2n+3} + n g^{2n+1} - g^3 ] / [ (g - 1)(g^2 - 1)^2 ]
      where we've used 1/(g - 1)^2 = (g + 1)^2/(g^2 - 1)^2 to put everything over the common denominator (g^2 - 1)^2
= ...
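That's a lot of algebra to get wrong, so here is a Python sketch (function names invented) that adds up j g^{k+j} over k = 2 to n and j = 1 to k-1 directly and compares the total with the last closed-form line above.

```python
def double_sum_direct(n, g):
    """Sum of j * g^(k+j) for k = 2..n and j = 1..k-1, term by term."""
    return sum(j * g**(k + j) for k in range(2, n + 1) for j in range(1, k))

def double_sum_closed(n, g):
    """Closed form from the last line of the derivation, for g != 1."""
    braces = (n - 1) * g**(2 * n + 2) - n * g**(2 * n + 1) + g**(n + 2)
    numerator = (braces * (g + 1)**2
                 - (n - 1) * g**(2 * n + 3) + n * g**(2 * n + 1) - g**3)
    return numerator / ((g - 1) * (g * g - 1)**2)
```

For n = 2 both reduce to the single term g^3. When g is close to 1 the closed form divides a tiny numerator by a tiny denominator, so the last few digits may wobble.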

>Can't you just give the final result?!
Okay, we've calculated both sums ... so here it is:

[!!!]   VAR[G_1 + G_2 + ... + G_n] = s^2 [n g^{2n+2} - (n+1)g^{2n} + 1] / (g^2 - 1)^2
                            + 2(s^2/g^2) [ {(n-1) g^{2n+2} - n g^{2n+1} + g^{n+2}} (g + 1)^2 - (n-1) g^{2n+3} + n g^{2n+1} - g^3 ] / [ (g - 1)(g^2 - 1)^2 ]

        where
        g_1, g_2, ... g_n are random Gain Factors over n days,
        they are from a distribution with Mean = g and Standard Deviation = s,
        G_m = g_1 g_2 ... g_m are the cumulative Gain Factors
        and the formula is good for daily gains and n not too large (say n < 50)
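Here is [!!!] as a Python function, as a sketch with invented names. Alongside it is the variance you get from [2a] without Magic Formula #1, that is, with the exact VAR[G_i] from [3] and COVAR[G_j,G_k] = VAR[G_j] g^{k-j}. Setting the two side by side shows how much the approximation costs.

```python
def var_sum_formula(n, g, s):
    """Approximate VAR[G_1 + ... + G_n] from [!!!] (needs g != 1)."""
    first = s * s * (n * g**(2 * n + 2) - (n + 1) * g**(2 * n) + 1) / (g * g - 1)**2
    braces = (n - 1) * g**(2 * n + 2) - n * g**(2 * n + 1) + g**(n + 2)
    numerator = (braces * (g + 1)**2
                 - (n - 1) * g**(2 * n + 3) + n * g**(2 * n + 1) - g**3)
    second = 2 * (s * s / (g * g)) * numerator / ((g - 1) * (g * g - 1)**2)
    return first + second

def var_sum_no_approx(n, g, s):
    """[2a] with exact VAR[G_i] = (g^2+s^2)^i - g^(2i) and exact covariances."""
    def var_G(m):
        return (g * g + s * s)**m - g**(2 * m)
    singles = sum(var_G(i) for i in range(1, n + 1))
    pairs = sum(var_G(j) * g**(k - j) for k in range(2, n + 1) for j in range(1, k))
    return singles + 2 * pairs
```

With g = 1.01 and s = 0.02 the two agree exactly (up to rounding) at n = 1, where both give s^2, and differ by less than 1% at n = 40.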

>Isn't there something more elegant?
You said to forget elegance. Besides, we won't be using it ... not with pencil and paper. We'll use a spreadsheet and ...
>How good ... uh, how bad is it?
Okay, here's what we'll do (again!):

  1. Generate n random daily Gain Factors: g_1, g_2, ... g_n.
  2. With these, construct the numbers G_1 = g_1, G_2 = g_1 g_2, ... G_n = g_1 g_2...g_n.
  3. Calculate the SUM(n) = G_1 + G_2 + ... + G_n.
  4. Repeat steps 1, 2 and 3 ten thousand times and calculate the Variance of the 10,000 numbers SUM(n).
  5. Repeat steps 1, 2, 3 and 4 for n = 1, 2, 3, ... 40.
  6. Compare the Variances obtained (using this actual data) with the formula [!!!].
The result is shown below where we also plot the Standard Deviation = SQRT(Variance).
It assumes an average daily return of 1% and a Standard Deviation (of daily returns) of 2%:


Figure 2
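If you'd rather not use a spreadsheet, the steps above can be sketched in Python. Two assumptions that aren't in the text: the Gain Factors are drawn from a Normal distribution (mean 1.01, Standard Deviation 0.02), and the comparison uses the two sums in [!] added term by term, which is what [!!!] evaluates in closed form. Function names are invented.

```python
import random

def simulated_var_of_sum(n, g=1.01, s=0.02, trials=10_000, seed=0):
    """Steps 1 to 4: sample Variance of SUM(n) over many random trials.

    ASSUMPTION: each daily Gain Factor is Normal with mean g and SD s.
    """
    rng = random.Random(seed)
    sums = []
    for _trial in range(trials):
        price, total = 1.0, 0.0            # starting price P_0 = $1.00
        for _day in range(n):
            price *= rng.gauss(g, s)       # G_m = G_{m-1} * g_m
            total += price                 # running G_1 + ... + G_m
        sums.append(total)
    mean = sum(sums) / trials
    return sum((x - mean)**2 for x in sums) / (trials - 1)

def approx_var_of_sum(n, g=1.01, s=0.02):
    """Right side of [!]: (s^2/g^2) * (single sum + 2 * double sum)."""
    single = sum(i * g**(2 * i) for i in range(1, n + 1))
    double = sum(j * g**(k + j) for k in range(2, n + 1) for j in range(1, k))
    return (s * s / (g * g)) * (single + 2 * double)
```

Calling both functions for n = 1, 2, 3, ... 40 and taking square roots gives the two Standard Deviation curves to compare. With 10,000 trials, expect the simulated Variance to bounce around the formula by a percent or two.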


Notice an interesting thing, in [!!!].
The Variance of the sum of Gain Factors for the past n days is proportional to the Variance of the Returns, namely s^2.

That means that the Standard Deviation of this Special Sum is proportional to s.
It looks like:

         SD[G_1 + G_2 + ... + G_n] = f(n,g) s.
If we assume that the starting stock price was P_0, n days ago, then we have:

         SD[P_1 + P_2 + ... + P_n] = P_0 f(n,g) s.

>Yeah, so what good is it?
Some time ago I was looking for the Variance of stock prices over the past n days, in connection with Bollinger Bands.

>I remember. You got a lousy result.
Uh ... yes, thanks. I took "the Variance of a Sum = the Sum of the Variances" as an approximation and ...

>That's your creeping senility ... again?
Yes, thanks ... again.