Sample Size Website - Log
Make no mistake - working out sample sizes is pretty difficult!
My goal is to set up the best, free, sample size calculation website on the internet today. This blog is my story of how the site comes to be.
So far, I have scoped out the project (and it is indeed a project, not a task). The site will work out estimated sample sizes for:
- Means
- Proportions
- Standard Deviations
- Rates
As a statistician, I already know what equations need to be used. As a programmer, I have written Java classes that calculate the Complementary Error Function (ERFC), and its t-test equivalent.
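To give a feel for what such a class involves, here is a minimal sketch of an ERFC calculation in Java. It is not the class written for the site: it assumes the well-known Abramowitz and Stegun approximation (formula 7.1.26, absolute error of roughly 1.5e-7), and the names `ErfcSketch`, `erfc` and `normalUpperTail` are hypothetical.

```java
/** Illustrative sketch only; class and method names are hypothetical. */
final class ErfcSketch {
    // Coefficients of Abramowitz & Stegun formula 7.1.26 (an assumed choice of approximation).
    private static final double P = 0.3275911;
    private static final double A1 = 0.254829592;
    private static final double A2 = -0.284496736;
    private static final double A3 = 1.421413741;
    private static final double A4 = -1.453152027;
    private static final double A5 = 1.061405429;

    private ErfcSketch() {}

    /** Complementary error function, erfc(x) = 1 - erf(x). */
    static double erfc(double x) {
        if (x < 0) {
            // Symmetry: erfc(-x) = 2 - erfc(x).
            return 2.0 - erfc(-x);
        }
        double t = 1.0 / (1.0 + P * x);
        // Horner evaluation of the degree-5 polynomial in t.
        double poly = t * (A1 + t * (A2 + t * (A3 + t * (A4 + t * A5))));
        return poly * Math.exp(-x * x);
    }

    /** Upper-tail probability P(Z > z) of the standard normal distribution. */
    static double normalUpperTail(double z) {
        return 0.5 * erfc(z / Math.sqrt(2.0));
    }
}
```

Calling `ErfcSketch.normalUpperTail(1.96)` gives a value close to 0.025, which is the one-sided tail area behind the familiar 95% two-sided confidence level.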
Fortunately, Java already provides the arcsin function, for use with the Dobson-Gebski correction for small proportions!
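As a rough illustration of where `Math.asin` comes in, the sketch below shows the plain arcsine square-root transformation of a proportion, whose variance is approximately 1/n whatever the value of p. This is only the underlying transformation, not the Dobson-Gebski correction itself, and the names `ArcsineSketch`, `transform` and `effectSize` are hypothetical.

```java
/** Illustrative sketch only; class and method names are hypothetical. */
final class ArcsineSketch {
    private ArcsineSketch() {}

    /** Arcsine square-root transform: phi = 2 * asin(sqrt(p)), for p in [0, 1]. */
    static double transform(double p) {
        if (p < 0.0 || p > 1.0) {
            throw new IllegalArgumentException("p must lie in [0, 1], got " + p);
        }
        return 2.0 * Math.asin(Math.sqrt(p));
    }

    /** Absolute difference between two proportions on the transformed scale. */
    static double effectSize(double p1, double p2) {
        return Math.abs(transform(p1) - transform(p2));
    }
}
```

Working on the transformed scale means a difference between two small proportions (say 0.01 and 0.03) is not judged with a variance that depends heavily on p itself.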
Next steps?
Test the "ERFC" Java class with user inputs. This should happen shortly after the new year ...
Links to Definitions used in Sample Size Calculations
Sample sizes are almost always calculated as part of an experiment, to test a hypothesis. The following describes the hypothesis testing process, and hyperlinks to useful web pages are included.
- Define the hypothesis
- Decide what data type to use: means, standard deviations, proportions, rates (Poisson), etc.
What assumptions have been made?
- For means, is the data set Gaussian? If not, can the central limit theorem be used?
- For proportions, are there enough points? (e.g. n × p > 10 and n × (1 − p) > 10)
- What confidence and power should be used, or what confidence interval?
- What is the size of the difference to be detected?
- Is the test for 1-sample or 2-sample?
- Is the test 1-tailed or 2-tailed?
- Formula for means
- Formula for proportions
- Formula for standard deviations
- Formula for rates
The formula provides an estimate of the sample size needed. The experiment is then run using the sample size provided. After the experiment, the appropriate statistical test is used to reject, or fail to reject, the null hypothesis.
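To make the formula step concrete, here is a minimal sketch for means, assuming the standard normal-approximation formula n = ((z_alpha + z_beta) × sigma / delta)², rounded up. The z-values for the chosen confidence and power are passed in as plain numbers and are assumed to have been looked up or computed elsewhere. The names `MeanSampleSizeSketch`, `oneSample` and `twoSamplePerGroup` are hypothetical and not taken from the site.

```java
/** Illustrative sketch only; class and method names are hypothetical. */
final class MeanSampleSizeSketch {
    private MeanSampleSizeSketch() {}

    /**
     * 1-sample test of a mean.
     * zAlpha: z-value for the confidence level (z at alpha/2 if 2-tailed, z at alpha if 1-tailed).
     * zBeta:  z-value for the power (e.g. about 0.8416 for 80% power).
     * sigma:  assumed standard deviation; delta: size of the difference to be detected.
     */
    static long oneSample(double zAlpha, double zBeta, double sigma, double delta) {
        return (long) Math.ceil(rawSize(zAlpha, zBeta, sigma, delta));
    }

    /** 2-sample test with equal group sizes and a common sigma: sample size per group. */
    static long twoSamplePerGroup(double zAlpha, double zBeta, double sigma, double delta) {
        return (long) Math.ceil(2.0 * rawSize(zAlpha, zBeta, sigma, delta));
    }

    // Unrounded ((zAlpha + zBeta) * sigma / delta)^2.
    private static double rawSize(double zAlpha, double zBeta, double sigma, double delta) {
        if (sigma <= 0.0 || delta == 0.0) {
            throw new IllegalArgumentException("sigma must be positive and delta non-zero");
        }
        double ratio = (zAlpha + zBeta) * sigma / delta;
        return ratio * ratio;
    }
}
```

For example, with 95% confidence (2-tailed, z ≈ 1.959964), 80% power (z ≈ 0.841621), sigma = 10 and delta = 5, `oneSample` returns 32 and `twoSamplePerGroup` returns 63. Rounding up is deliberate: rounding down would leave the test slightly under-powered.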