Monday, March 5, 2012

Minimum Sample Size, Were Results Obtained by Chance and Minimum Account Size Required

To build on the last post, which covered expectancy and R-Multiple distribution, let’s take a look at other important statistical measures.  This will be important when we start building and testing our trading systems.

Summary
1. Determine the minimum sample size of trades needed to test a trading system.
2. The minimum sample size ensures an acceptable margin of error (generally +/- 5%) at a 95% confidence level (about 2 standard deviations), meaning that 95 out of 100 samples would land within that margin of the true value for the entire population.
3. The T-Test determines if your results occurred by chance.
4. Optimal F determines the minimum account size required to trade the idea.

Determining Sample Size

Sample size determination is the act of choosing the number of observations to include in a statistical sample. You see a lot of rules of thumb out there for how many trades you should have before you can verify if you have a robust trading system.  I have often seen that you should have “n=30” or “n=20” where n is equal to the number of trades you should have. 

It’s important to distinguish between a sample and a population.  A sample is a subset of a population.  A population is the entire set of values that are potentially observable.  We take a random sample from the population to make estimations about the entire population and make our testing more manageable.  Samples are collected and statistics are calculated from the samples so that one can make inferences or extrapolations from the sample to the population.

Say we want to know how many people in your town drink coffee.  The total population of the town is 20,000 people.  It would be highly impractical to ask all 20,000 people if they drink coffee.  So we could randomly stop 100 or 200 or 1000 people on the street and ask them if they drink coffee.  We can use the formulas below to determine what the random sample tells us about the population.

The sampling margin of error or “level of precision” is the range in which the true value of the population is estimated to be. This range is often expressed in percentage points (e.g., ±5 percent), in the same way that results for political campaign polls are reported by the media. Thus, in our survey above, if 50% of the people in the sample drink coffee and the level of precision is ±5%, then we would conclude that between 45% and 55% of the town's population drinks coffee.

The confidence or risk level is based on the Central Limit Theorem (which we will discuss later). The key idea encompassed in the Central Limit Theorem is that when a population is repeatedly sampled, the average of the values obtained by those samples approaches the true population value. Furthermore, the values obtained by these samples are distributed approximately normally about the true value, with some samples coming in higher and some lower than the true population value. In a normal distribution, approximately 95% of the sample values are within two standard deviations of the true population value (e.g., mean).

If a 95% confidence level is selected, 95 out of 100 samples will have the true population value within the range of precision specified earlier. For example, if we choose a 95% confidence level for our coffee drinkers, then we can be 95% confident that the share of coffee drinkers in our sample is within +/- 5% of the true share for the whole town. There is always a chance that the sample we obtain does not represent the true population value. This risk is reduced for 99% confidence levels and increased for 90% (or lower) confidence levels.

The degree of variability in the attributes being measured refers to the distribution of attributes in the population. The more heterogeneous a population, the larger the sample size required to obtain a given level of precision. The less variable (more homogeneous) a population, the smaller the sample size. Note that a proportion of 50% indicates a greater level of variability than either 20% or 80%. This is because 20% and 80% indicate that a large majority do not or do, respectively, have the attribute of interest. Because a proportion of .5 indicates the maximum variability in a population, it is often used in determining a more conservative sample size, that is, the sample size may be larger than if the true variability of the population attribute were used.

Sample Size Formula #1

The first minimum sample size formula you can use is as follows:

Ss1 = N / (1 + N * e^2)

where

Ss1 = sample size
N = Total Population
e = sampling error required (expressed as a decimal)
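
As a quick check, formula #1 can be sketched in a few lines of Python. The function name is just an illustrative choice, and the inputs are the town of 20,000 people from the coffee example with a sampling error of .05. The result is left unrounded so you can decide how to round it.

```python
def sample_size_known_population(population: int, error: float) -> float:
    """Formula #1: N / (1 + N * e^2), returned unrounded."""
    return population / (1 + population * error ** 2)


# Town of 20,000 people, sampling error of +/- 5%
n = sample_size_known_population(20_000, 0.05)
print(round(n, 2))  # 392.16
```

Note the parentheses around the denominator. Without them the expression would be evaluated as N / 1 plus N * e^2, which gives a very different number.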

Sample Size Formula #2

A second minimum sample size formula that we can use is as follows:

Ss2 = Z^2 * (p) * (1 – p) / C^2

where

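Ss2 = sample size
Z = Z-value (for example, use 1.96 for a 95% confidence level and 2.576 for a 99% level)
p = degree of variability, i.e. the estimated proportion of the population with the attribute, expressed as a decimal (use .5 when it is unknown, since that gives the most conservative sample size)
C = sampling error (level of precision), expressed as a decimal (e.g., .04 = +/- 4%)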
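
Here is a minimal Python sketch of formula #2. Rather than hard-coding the Z-value, it derives it from the confidence level using the standard library's `statistics.NormalDist` (Python 3.8+). The function names are my own, and the example uses the +/- 4% precision mentioned in the definition of C.

```python
from statistics import NormalDist


def z_value(confidence: float) -> float:
    """Two-sided Z-value for a confidence level such as 0.95."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def sample_size_unknown_population(confidence: float, precision: float,
                                   p: float = 0.5) -> float:
    """Formula #2: Z^2 * p * (1 - p) / C^2, returned unrounded."""
    z = z_value(confidence)
    return z ** 2 * p * (1 - p) / precision ** 2


print(round(z_value(0.95), 3))  # 1.96
print(round(z_value(0.99), 3))  # 2.576

# 95% confidence level with a level of precision of +/- 4%
print(round(sample_size_unknown_population(0.95, 0.04)))  # 600
```

Tightening the precision from +/- 5% to +/- 4% raises the required sample noticeably, because C is squared in the denominator.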

Let’s apply the minimum sample size formulas to the coffee drinking problem above.

Ss1 = 20,000 / (1 + 20,000 * (.05)^2) = 392

This tells us that we will need to randomly stop 392 people to have a level of precision of +/- 5%.  If we randomly ask 392 people if they drink coffee and 60% of them say yes, then we can conclude that between 55% and 65% of the town's 20,000 people probably drink coffee.  That is 12,000 people, give or take 1,000, or 11,000 to 13,000.  Note that the +/- 5% is measured in percentage points of the whole population, not as 5% of the 12,000.

Ss2 = 1.96^2 * .5 * (1 - .5) / .05^2 = 384

If 50% of all the people in a population of 20,000 people drink coffee in the morning, and if you were to repeat the survey of 384 people ("Did you drink coffee this morning?"), then 95% of the time, your survey would find that between 45% and 55% of the people in your sample answered "Yes". The remaining 5% of the time, or for 1 in 20 surveys, you would expect the survey response to be more than the margin of error away from the true answer. When you survey a sample of the population, you don't know that you've found the correct answer, but you do know that there's a 95% chance that you're within the margin of error of the correct answer.
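
The "repeat the survey" idea can be tried directly with a small simulation. This is only a sketch under simplifying assumptions: each respondent is treated as an independent draw with a 50% chance of answering "Yes" (the town is not modelled person by person), the sample size is computed from formula #2 inside the code, and the function name, the 2,000 repetitions and the seed are arbitrary choices of mine.

```python
import random


def fraction_within_margin(true_p: float, sample_size: int, margin: float,
                           repeats: int, seed: int = 0) -> float:
    """Share of simulated surveys whose 'Yes' rate is within the margin of true_p."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(repeats):
        yes = sum(rng.random() < true_p for _ in range(sample_size))
        if abs(yes / sample_size - true_p) <= margin:
            hits += 1
    return hits / repeats


# Sample size from formula #2 at a 95% confidence level and +/- 5% precision
sample_size = round(1.96 ** 2 * 0.5 * 0.5 / 0.05 ** 2)
share = fraction_within_margin(0.5, sample_size, 0.05, repeats=2000)
print(f"{share:.1%} of simulated surveys landed within +/- 5 points")
```

The printed share should come out close to 95%, with some wobble from run to run if you change the seed.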

In trading, because we don’t always know the entire population, it is probably better to use the Ss2 equation.  Say we were testing a system, and need to know how many trades to sample.  We would plug in our required confidence level (generally 95% or 99%) and level of precision (generally +/- 5%) to get our required sample size.  You can see that using a sample size of 30 trades is probably not enough in many cases. 
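
To see why 30 trades may be too few, the Ss2 equation can be turned around to give the level of precision for a given number of trades. The sketch below assumes we are estimating a proportion such as the winning percentage, uses Z = 1.96 (95% confidence) and p = .5, and the function name is my own.

```python
from math import sqrt


def margin_of_error(n_trades: int, z: float = 1.96, p: float = 0.5) -> float:
    """Ss2 solved for C: C = Z * sqrt(p * (1 - p) / n)."""
    return z * sqrt(p * (1 - p) / n_trades)


for n in (20, 30, 100, 400):
    print(f"n = {n:>3}: +/- {margin_of_error(n):.1%}")
# n =  20: +/- 21.9%
# n =  30: +/- 17.9%
# n = 100: +/- 9.8%
# n = 400: +/- 4.9%
```

Under these assumptions, a 30-trade sample pins down a win rate only to within about 18 percentage points either way.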

Determining if the Test Results Occurred by Chance Alone

The t-Test is a simple statistical test to determine if the results of a system occurred by chance alone.  The t-Test is calculated as follows:

t = square root (n) * (ATNP / SDev of All Trades)

where

n = sample size
ATNP = Average Trade Net Profit
SDev of All Trades = Standard Deviation of All Trades

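You want to make sure that the t-Test has a value greater than 1.6 (or less than -1.6).  A value between -1.6 and 1.6 favors chance, while a value beyond 1.6 suggests the results are unlikely to be due to chance alone and that you may have a tradable idea.  The 1.6 cutoff is close to 1.645, the one-tailed critical value at the 5% significance level for large samples.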
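
Here is a minimal sketch of the calculation from a list of trade results. The eight trades are made-up numbers purely for illustration, the function name is my own, and I am assuming the sample standard deviation (the n - 1 version that `statistics.stdev` computes).

```python
from math import sqrt
from statistics import mean, stdev


def t_statistic(trades: list[float]) -> float:
    """t = sqrt(n) * (average trade net profit / standard deviation of all trades)."""
    n = len(trades)
    return sqrt(n) * mean(trades) / stdev(trades)


# Hypothetical net profit per trade, in dollars
sample_trades = [250.0, -150.0, 400.0, -200.0, 300.0, 100.0, -50.0, 350.0]

t = t_statistic(sample_trades)
print(f"{t:.2f}")  # 1.51
```

These trades average $125 each, but with only eight of them and a standard deviation of about $235, the t value stays below 1.6, so the profit could easily be chance.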

Determining Optimal-F and Maximum Leverage

“Optimal f is the market’s line in the sand for a trading system. Optimal f is the maximum number of contracts you can trade given your account size. Trade any more contracts and your account becomes more and more likely to break under risk of ruin. You don’t have to trade the optimal f number of contracts, but you should never trade more than the optimal f number. The optimal f value, when divided into the largest losing trade for the idea gives the maximum leverage that can be applied to the idea and still avoid risk of ruin. Maximum leverage converts into the minimum account size required to trade n contracts of this idea.”  Henry Carstens, Vertical Solutions.

optimal f = (((1 + win loss ratio) * probability of winning trade) - 1)/ (win loss ratio)

maximum leverage = largest losing trade / optimal f
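
The two formulas above translate directly into Python. This is a sketch only: the function names and the input numbers (a win/loss ratio of 2.0, a 45% win rate and a $500 largest loss) are hypothetical, and the guard against a zero or negative optimal f is my own addition, since dividing by it would be meaningless.

```python
def optimal_f(win_loss_ratio: float, win_probability: float) -> float:
    """optimal f = (((1 + win loss ratio) * probability of winning trade) - 1) / win loss ratio"""
    return ((1 + win_loss_ratio) * win_probability - 1) / win_loss_ratio


def maximum_leverage(largest_loss: float, f: float) -> float:
    """Dollars to hold per contract: largest losing trade / optimal f."""
    if f <= 0:
        # Assumption: a non-positive optimal f is treated as "no edge to size".
        raise ValueError("optimal f must be positive")
    return largest_loss / f


f = optimal_f(win_loss_ratio=2.0, win_probability=0.45)
print(f"{f:.3f}")                               # 0.175
print(f"{maximum_leverage(500.0, f):,.2f}")     # 2,857.14

# With a 1.0 win/loss ratio the same win rate gives a negative optimal f
print(f"{optimal_f(1.0, 0.45):.3f}")            # -0.100
```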

Example of T-Test, Optimal F and Maximum Leverage

For this example, we are testing a trading idea using 400 sample trades (a round number just above the minimum from the sample size equation above).  Our average trade net profit is $200 or 4 ES points and the standard deviation of all trades is $450 or 9 points.  Our largest loss for this system was $700, our win/loss ratio was 1.5 and our winning percentage was 0.6.

Since our average trade has a profit of 4 points and the standard deviation is 9 points, about 68% of the trades fall within one standard deviation of the mean, or between -5 pts and +13 pts, and approximately 95% of all trades fall within two standard deviations of the mean, or between -14 pts and +22 pts.  This assumes that trade results are roughly normally distributed.
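
These ranges can be checked with the standard library's `statistics.NormalDist`. The sketch assumes trade results follow a normal distribution with a mean of 4 points and a standard deviation of 9 points, which real trade results may not do. The helper name is my own.

```python
from statistics import NormalDist

# Assumed model: trade results in ES points, normally distributed
trades = NormalDist(mu=4, sigma=9)


def share_between(low: float, high: float) -> float:
    """Fraction of trades expected to fall between low and high points."""
    return trades.cdf(high) - trades.cdf(low)


print(f"{share_between(-5, 13):.1%}")   # 68.3%
print(f"{share_between(-14, 22):.1%}")  # 95.4%
```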

Did the results occur by chance?

T-test = SqRoot(400) * (200/450) = 8.889

Because the T-test is greater than 1.6, we can conclude that the results of this test are not by chance alone.
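
One way to put the t value in perspective is to convert it into the probability of seeing a value at least that large by chance. The sketch below uses a large-sample normal approximation, which is my assumption here. With a small number of trades you would use a Student's t distribution instead, which the Python standard library does not provide. The function name is my own.

```python
from math import erfc, sqrt


def one_sided_p_value(t: float) -> float:
    """Chance of a value >= t under a standard normal (large-sample approximation)."""
    return 0.5 * erfc(t / sqrt(2))


t_example = sqrt(400) * (200 / 450)
print(f"{t_example:.3f}")                # 8.889
print(f"{one_sided_p_value(1.6):.3f}")   # 0.055
print(f"{one_sided_p_value(t_example):.1e}")
```

A t value of exactly 1.6 corresponds to roughly a 5.5% chance under this approximation, while the example's value of 8.889 corresponds to a chance that is effectively zero.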

How many contracts should we trade?

optimal f = (((1 + 1.5) * .6) - 1) / 1.5

optimal f = .333

maximum leverage = 700 / .333 = $2,100 (approximately)

maximum leverage = 1 contract for every $2,100 in the account

The thinking is that if we keep $2,100 in the account for each contract we trade, we avoid blowing out the account.

Links

Sample size calculator.  Calculator.

Survey size calculator.  Survey Says.

Table of sample sizes for given populations.  Table of Sample Sizes Based on Population.

Formulas explained, including level of precision, confidence level and degree of variability.  Formulas for Determining Sample Size.

Z-scores for given levels of precision and confidence levels.  Z-score Table.

T-test, Optimal F and Maximum Leverage.  Vertical Solutions.

PDF of Introduction to Testing Trading Ideas.  Testing Trading Ideas PDF.
