Hypothesis Testing

By Issa Bass

 Introduction

A confidence interval helps estimate the range within which we can, with a certain degree of confidence, find the values of the population mean and the population variance by analyzing a sample.

Hypothesis testing is about testing the validity of a hypothesis made about a population.

A hypothesis is a value judgment, a statement based on an opinion about a population. It is developed in order to make an inference about that population. 

Based on experience, a design engineer can make a hypothesis about the performance or qualities of the products she is about to produce, but the validity of that hypothesis needs to be ascertained in order to conform the products to the customer specifications. A test therefore needs to be conducted to determine whether the empirical evidence confirms the hypothesis.

Some examples of hypotheses are:

 1. The average number of defects per circuit board produced on a given line is 3

 2. The lifetime of a given light bulb is 350 hours

 3. It will take less than 10 minutes for a given drug to start taking effect.

Most of the time, the population being studied is so large that examining every single item would not be cost effective.  So,  a sample will be taken and an inference will be made for the whole population.

4-2 How to conduct a hypothesis test

Suppose that Sikasso, a company that produces computer circuit boards, wants to test a hypothesis made by an engineer that exactly 20% of the defects found on the boards are traceable to the CPU socket. Since the company produces thousands of boards a day, it would not be cost effective to test every single board to validate or reject that statement, so a sample of boards is analyzed and statistics are computed. Based on the results found and some decision rules, the hypothesis is or is not rejected. If 10% or 29% of the defects on the sample taken are actually traced to the CPU socket, the hypothesis will certainly be rejected, but what if 19.5% of the defects are actually traced to the CPU socket?

Should the difference of 0.5 percentage points be attributed to sampling error? Should we reject the statement in that case?

To answer these questions, we need to understand how a hypothesis test is conducted. The following six steps are usually followed to determine whether a hypothesis is to be rejected or not beyond a reasonable doubt.

1. Null hypothesis

The first step consists in stating the hypothesis. In the case of the circuit boards at Sikasso, the hypothesis would be: “exactly 20% of the defects on the circuit board are traceable to the CPU socket”. That statement is called the null hypothesis; it is denoted \(H_0\) and is read “H sub zero”.

The statement will be written as:

\[ H_0: p = 0.20 \]

where \(p\) is the proportion of defects traceable to the CPU socket.

2. Alternate hypothesis

If the null hypothesis is not rejected, we continue to assume that 20% of the defects are traceable to the CPU socket. But if the sample provides enough statistical evidence that the null hypothesis is untrue, an alternate hypothesis should be assumed to be true. That alternate hypothesis, denoted \(H_1\), tells what should be concluded if \(H_0\) is rejected. Here it would state that the proportion is different from 20%.

3.  Test statistic

The decision on whether to reject \(H_0\) or fail to reject it depends on the information provided by the sample taken from the population being studied. The objective here is to generate a single number that will be compared to a critical value to decide whether \(H_0\) is rejected or not. That number is called the test statistic.

To test the mean \(\mu\), the z formula is used when the sample size is greater than 30:

\[ Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \]
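As a minimal Python sketch of the z formula (the sample mean minus the hypothesized mean, divided by the standard error), assuming the population standard deviation is known. The function name `z_statistic`, its parameter names, and the numbers are illustrative and do not come from the article.

```python
import math

def z_statistic(sample_mean, hypothesized_mean, sigma, n):
    """Return the z test statistic for a sample mean (sigma assumed known)."""
    standard_error = sigma / math.sqrt(n)  # standard deviation of the sample mean
    return (sample_mean - hypothesized_mean) / standard_error

# Hypothetical numbers, chosen only for illustration.
z = z_statistic(sample_mean=52.0, hypothesized_mean=50.0, sigma=8.0, n=64)
print(round(z, 2))  # 2.0
```

The sample mean here lies two standard errors above the hypothesized mean, so the statistic is 2.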

4. Level of significance or level of risk

The level of risk addresses the risk of failing to reject a hypothesis when it is actually false or rejecting a hypothesis when it is actually true.

Suppose that in the case of the defects on the circuit boards, a sample of 40 boards was randomly taken for analysis and 45% of the defects were actually found to be traceable to the CPU sockets. In that case, we would reject the null hypothesis as false. But what if the sample was not representative, for instance because it was taken from a substandard batch? We would have rejected a null hypothesis that may be true. We would therefore have committed what is called a type I error.

However, if we find that 20% of the defects in a sample are traceable to the CPU socket only because the boards in that sample, out of the whole population, happened to have those defects, we would have made a type II error: we would have failed to reject the null hypothesis when it is actually false.

The probability of making a type I error is referred to as \(\alpha\) and the probability of making a type II error is referred to as \(\beta\).

There is an inverse relationship between \(\alpha\) and \(\beta\): for a given sample size, lowering one raises the other.
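The trade-off between the two error probabilities can be illustrated with a short Python sketch. It assumes a right-tailed z test with a known standard deviation, and the hypothesized mean, true mean, standard deviation, and sample size are all hypothetical values chosen for illustration. The helper name `type_ii_error` is also an assumption.

```python
import math
from statistics import NormalDist

def type_ii_error(alpha, mu0, mu_true, sigma, n):
    """Type II error probability for a right-tailed z test of H0: mu = mu0
    when the true mean is mu_true (sigma assumed known)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)             # critical value for the chosen type I risk
    shift = (mu_true - mu0) / (sigma / math.sqrt(n))   # true mean's distance from mu0, in standard errors
    return std_normal.cdf(z_crit - shift)              # chance the statistic stays below the critical value

# Hypothetical setting: H0 says mu = 100, the true mean is 103, sigma = 10, n = 40.
for alpha in (0.01, 0.05, 0.10):
    beta = type_ii_error(alpha, mu0=100, mu_true=103, sigma=10, n=40)
    print(f"type I risk = {alpha:.2f}  type II risk = {beta:.3f}")
```

With the sample size held fixed, each increase in the type I risk printed by the loop comes with a smaller type II risk.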

5. Decision rule determination

The decision rule determines the conditions under which the null hypothesis is rejected or not. In a one-tailed (right-tailed) test, the region of rejection is the right tail of the distribution: the set of values of the test statistic that would be very unlikely to occur if the null hypothesis were true.

The critical value is the dividing point between the area where \(H_0\) is rejected and the area where it is not rejected.
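A right-tailed decision rule can be sketched in Python as follows, assuming a z test statistic. The function name `right_tailed_decision` and the value z = 2.10 are hypothetical; the critical value comes from the standard normal distribution in Python's `statistics` module.

```python
from statistics import NormalDist

def right_tailed_decision(z, alpha):
    """Return the critical value and whether z falls in the right-tail rejection region."""
    critical_value = NormalDist().inv_cdf(1 - alpha)  # leaves an area of alpha in the right tail
    return critical_value, z > critical_value

critical_value, reject = right_tailed_decision(z=2.10, alpha=0.05)
print(round(critical_value, 3), reject)  # 1.645 True
```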

6. Decision making

Only two decisions are considered: either we fail to reject the null hypothesis or we reject it. The decision depends on the level of significance, which is usually set between 0.01 and 0.10.

Even when we fail to reject the null hypothesis, we never say “we accept the null hypothesis” because failing to reject a null hypothesis that was assumed true does not equate to proving its validity.

4-3 Testing for a population mean

4-3.1 Large sample with known \(\sigma\)

When the sample size is greater than 30 and \(\sigma\) is known, the z formula can be used to test a null hypothesis about the mean.

Example:

An old survey had found that the average income of operations managers for Fortune 500 companies was $80,000 a year. A pollster wants to test that figure to determine if it is still valid. She takes a random sample of 150 operations managers to determine if their average income is $80,000. The mean of the sample is found to be $78,000, and the standard deviation is assumed to be $15,000. The level of significance is set at 5%. Should she reject $80,000 as the average income or not?

Solution:

The null hypothesis is that the mean income is $80,000 and the alternate hypothesis is that it is anything other than $80,000:

\[ H_0: \mu = 80{,}000 \qquad H_1: \mu \neq 80{,}000 \]

Since the sample size n is larger than 30, we can use the Z formula to test the hypothesis. Since the type I error is set at 5%, in other words \(\alpha = 0.05\), and we are dealing with a two tailed test, the area under each tail of the distribution will be \(\alpha/2 = 0.025\). The area between the mean and the critical value on each side will be 0.4750 (0.5 - 0.025). The critical Z value is obtained from the Z-score table by looking up the area 0.4750.

0.4750 corresponds to \(Z = 1.96\).

The null hypothesis will not be rejected if \(-1.96 \le Z \le 1.96\) and rejected otherwise.

\[ Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{78{,}000 - 80{,}000}{15{,}000 / \sqrt{150}} \approx -1.63 \]

Since Z is within the interval \([-1.96, 1.96]\), the statistical decision should be not to reject the null hypothesis.

$78,000 is only the sample mean. A 95% confidence interval built around it, roughly $78,000 plus or minus $2,400, would still contain $80,000, which is consistent with the decision not to reject the null hypothesis.
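The income example can be checked with a few lines of Python. This is a sketch under the example's own assumptions (a standard deviation of $15,000 treated as known, a two-tailed test at the 5% level). The variable names are illustrative, and the critical value is computed with the standard library's `NormalDist` rather than read from a table.

```python
import math
from statistics import NormalDist

mu0 = 80_000          # hypothesized mean income (null hypothesis)
sample_mean = 78_000  # mean of the 150 sampled managers
sigma = 15_000        # standard deviation, treated as known
n = 150
alpha = 0.05

z = (sample_mean - mu0) / (sigma / math.sqrt(n))
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
reject = abs(z) > z_crit

print(f"z = {z:.2f}, critical value = {z_crit:.2f}, reject H0: {reject}")
# z = -1.63, critical value = 1.96, reject H0: False
```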

5 Statistical Inference about two populations

So far all our discussion has been focused on samples taken from one population. We have learnt how to determine sample sizes, how to determine confidence intervals for \(\mu\), for proportions and for \(\sigma^2\), and how to test a hypothesis about those parameters.

Very often, it is not enough to be able to make statistical inference about one population.  We sometimes want to compare two populations. A quality controller may want to compare data from a production line to see what effect the aging machines are having on the production process over a certain period of time. A manager may want to know how the productivity of her employees compares to the average productivity in the industry.

In this section, we will learn how to test and estimate the difference between two population means, proportions and variances.

5-1 Inference about the difference between two means

Just as in the analysis of a single population, to estimate the difference between two populations, the researcher would draw samples from each population. The best estimator for the population mean \(\mu\) was the sample mean \(\bar{X}\), so the best estimator of the difference between the population means \(\mu_1 - \mu_2\) will be the difference between the sample means \(\bar{X}_1 - \bar{X}_2\).

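The central limit theorem applies in this case too. When the two populations are normal, \(\bar{X}_1 - \bar{X}_2\) will be normally distributed, and it will be approximately normal if the sample sizes are large, \(n_1 > 30\) and \(n_2 > 30\).

The standard deviation for \(\bar{X}_1 - \bar{X}_2\) will be

\[ \sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \]

and its expected value

\[ E(\bar{X}_1 - \bar{X}_2) = \mu_1 - \mu_2 \]

Therefore

\[ Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \]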

This equation can be transformed to obtain the confidence interval.

Example:

In December, the average productivity per employee at Senegal-Electric was 150 machines per hour with a standard deviation of 15 machines. For the same month, the average productivity per employee at Cazamance-Electromotive was 135 machines per hour with a standard deviation of 9 machines. If 45 employees at Senegal-Electric and 39 at Cazamance-Electromotive were randomly sampled, what is the probability that the difference in sample average would be greater than 20 machines?

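Solution:

\[ Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \frac{20 - (150 - 135)}{\sqrt{\dfrac{15^2}{45} + \dfrac{9^2}{39}}} = \frac{5}{\sqrt{7.08}} \approx 1.88 \]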

From the Z-score table, the probability of getting a value between 0 and 1.88 is 0.4699 and the probability for z to be larger than 1.88 will be 0.5 - 0.4699 = 0.0301.

So the probability that the difference between the sample averages is greater than 20 machines is 0.0301. In other words, there is a 3.01% chance that the difference would exceed 20 machines.

Since the population standard deviations are seldom known, the formula above is rarely used, and the standard error of the sampling distribution has to be estimated. The approach we take when making an inference about the two means then depends on whether their variances are equal or not.
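The same probability can be computed directly in Python. This sketch uses the figures from the example and the standard library's normal distribution instead of a Z-score table, so the result matches the table value only up to rounding. The variable names are illustrative.

```python
import math
from statistics import NormalDist

mu1, sigma1, n1 = 150, 15, 45  # Senegal-Electric
mu2, sigma2, n2 = 135, 9, 39   # Cazamance-Electromotive
threshold = 20                 # difference between sample means of interest

std_error = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)  # standard deviation of the difference
z = (threshold - (mu1 - mu2)) / std_error
probability = 1 - NormalDist().cdf(z)                   # area to the right of z

print(f"z = {z:.2f}, P(difference > {threshold}) = {probability:.4f}")
```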

5-2 Small independent samples with equal variances

In the previous example, the sample sizes were both greater than 30, so the z statistic was used. If one or both samples are smaller than 30, the t-statistic must be used.

If the population variances \(\sigma_1^2\) and \(\sigma_2^2\) are unknown and we assume that they are equal, they can be estimated using the sample variances \(s_1^2\) and \(s_2^2\). The estimate based on the two sample variances is called the pooled sample variance. Its formula is given as:

\[ s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} \]

\(n_1 - 1\) is the degree of freedom for sample 1

\(n_2 - 1\) is the degree of freedom for sample 2

The denominator is just the sum of the two degrees of freedom.
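The pooled sample variance weights each sample variance by its degrees of freedom. A minimal Python sketch follows; the function name `pooled_variance` and the numbers passed to it are hypothetical and chosen only for illustration.

```python
def pooled_variance(s1, n1, s2, n2):
    """Pooled sample variance from two sample standard deviations and sample sizes."""
    df1, df2 = n1 - 1, n2 - 1  # degrees of freedom of each sample
    return (df1 * s1**2 + df2 * s2**2) / (df1 + df2)

# Hypothetical samples: s1 = 4 with n1 = 10, s2 = 5 with n2 = 12.
print(pooled_variance(s1=4, n1=10, s2=5, n2=12))  # 20.95
```

The result lies between the two sample variances (16 and 25) and closer to the variance of the larger sample.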

Example 1:

The variances of two populations are assumed to be equal. A sample of 15 items taken from population 1 generated a standard deviation of 3, and a sample of 19 items taken from population 2 generated a standard deviation of 2. Find the pooled sample variance.

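Solution:

\[ s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{14(3^2) + 18(2^2)}{15 + 19 - 2} = \frac{198}{32} \approx 6.19 \]

Note that if

\[ n_1 = n_2 = n \]

then \(s_p^2\) can be simplified:

\[ s_p^2 = \frac{(n - 1)s_1^2 + (n - 1)s_2^2}{2(n - 1)} \]

Therefore

\[ s_p^2 = \frac{s_1^2 + s_2^2}{2} \]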

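For sample sizes smaller than 30, the t-statistic will be used.

But since \(\sigma_1\) and \(\sigma_2\) are unknown and can be estimated from the sample standard deviations, the denominator \(\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}\) will be changed to

\[ \sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)} \]

and therefore,

\[ t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{s_p^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}} \]

with \(n_1 + n_2 - 2\) degrees of freedom.

This equation can be transformed to obtain the confidence interval for the difference between the population means.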

Example:

The general manager of Jolof-Semiconductors oversees two production plants and has decided to raise the customer satisfaction index (CSI) to at least 98. To determine if there is a difference in the mean of the CSI in the two plants, random samples are taken over several weeks. For the Kayor plant, a sample of 17 weeks has yielded a mean CSI of 96 and a standard deviation of 3, and for the Matam plant a sample of 19 weeks has yielded a mean CSI of 98 and a standard deviation of 4.

At the 0.05 level, determine if a difference exists in the mean level of CSI for the two plants, assuming that the CSIs are normal and have the same variance.

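Solution:

The hypotheses are

\[ H_0: \mu_1 - \mu_2 = 0 \qquad H_1: \mu_1 - \mu_2 \neq 0 \]

Let's estimate the common variance with the pooled sample variance \(s_p^2\).

\[ s_p^2 = \frac{(17 - 1)(3^2) + (19 - 1)(4^2)}{17 + 19 - 2} = \frac{432}{34} \approx 12.71 \]

The value of the test statistic is:

\[ t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{s_p^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}} = \frac{(96 - 98) - 0}{\sqrt{12.71 \left( \dfrac{1}{17} + \dfrac{1}{19} \right)}} \]

Therefore

\[ t \approx -1.68 \]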

Because the alternate hypothesis does not involve “greater than” or “less than” but rather “is different from”, we are faced with a two tailed rejection region with \(\alpha/2 = 0.025\) at the end of each tail and 34 degrees of freedom (df). From the t table we obtain \(t_{0.025, 34} = 2.032\), and \(H_0\) is rejected when \(t < -2.032\) or \(t > 2.032\).

Conclusion:

Since \(t = -1.68\) is not in the rejection region, we cannot reject the null hypothesis. There is not enough evidence at a significance level of 0.05 to conclude that there is a difference in the mean level of the CSIs for the two plants.
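The CSI example can be reproduced with a short Python sketch. It assumes equal population variances, as the example does. Because the standard library has no t distribution, the critical value of about 2.032 for 34 degrees of freedom is hard-coded from a t table rather than computed. The variable names are illustrative.

```python
import math

# Sample summaries from the example.
mean1, s1, n1 = 96, 3, 17  # Kayor plant
mean2, s2, n2 = 98, 4, 19  # Matam plant

df = n1 + n2 - 2  # degrees of freedom of the pooled estimate
pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
t = (mean1 - mean2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))

# Two-tailed critical value for a 0.05 significance level and 34 degrees of freedom,
# taken from a t table (assumption: hard-coded, not computed).
t_crit = 2.032
reject = abs(t) > t_crit

print(f"pooled variance = {pooled_var:.2f}, t = {t:.2f}, reject H0: {reject}")
# pooled variance = 12.71, t = -1.68, reject H0: False
```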


About the author
Issa Bass is the managing editor of SixSigmaFirst. He can be reached at issa@sixsigmafirst.com
