English for Economics and Management (经济管理专业英语)

Basic Concepts of Hypothesis Testing

We have seen that, in making decisions, we always have to deal with uncertainty. We can never be absolutely sure that any decision we make is correct (except perhaps after the fact, when it may be too late). Although intuitive feelings and hunches often play a role in managerial decisions, we need a more objective way to reduce the uncertainty. Unfortunately, as we shall see, we can never completely eliminate uncertainty. We can, however, remove some of the guesswork and make our decisions in a more objective manner.

In this chapter we shall examine the role of hypothesis testing in decision making. We shall develop rules that will help us control and minimize the probability of error in choosing among alternatives. To illustrate some of the basic concepts in hypothesis testing, let us look at an example.

Ever since the Arab embargo on oil following the Yom Kippur War in 1973, numerous products have entered the market claiming to substantially increase the miles-per-gallon ratings of the family “dream-turning-nightmare” machine. (Note: the gasoline shortage suddenly made many fine but fuel-hungry large cars unpopular.) Some involve additives that are mixed with the gasoline of the car, whereas others are devices that must be installed along the fuel line or attached directly to the carburetor. All claim significant improvements in gasoline mileage. You have undoubtedly seen the advertisements: “Enjoy 10 percent to 40 percent improvement in your car's mileage ratings.” On what basis are these claims made? Unfortunately, they often represent nothing more than testimonials by “satisfied” users, usually in the form of letters written to the manufacturer. We know that convenience sampling of this sort is rarely representative of the population of interest (i.e., all purchasers of the additive or device). Moreover, few consumers are capable of conducting, or motivated to conduct, the objective and systematic research necessary to test the claims. Additionally, there is a strong psychological tendency for consumers to look for and find improvements in order to justify their expenditure of funds.

Fortunately, there are a number of federal and private agencies that are set up specifically to evaluate claims of this sort. Let's follow through both the logic and the procedures for testing the mileage claims. To begin with, the claim is either true or it is false: an additive either increases mileage or it does not. It is our role as researchers to establish procedures for deciding between these two alternatives. Stated in formal terms, these alternative possibilities are:

1. The additive has no effect on gasoline mileage.

2. The additive has an effect on gasoline mileage.

Since the first is often a hypothesis of “no effect,” it is referred to as the null hypothesis (H0). As we shall see, the null hypothesis may also specify a particular value of the parameter of interest. But characteristically it denies the effect we are seeking to evaluate. The second hypothesis, called the alternative hypothesis (H1), asserts that the null hypothesis is false. Note that these two hypotheses are mutually exclusive and exhaustive. They cannot both be true, nor can they both be false. One must be true and one must be false. It is the role of our data collection and statistical procedures to decide between these two alternatives. But, if you think for a moment, you'll realize how difficult it is to conceive of proving the null hypothesis. Except for taking a complete census, which is not feasible considering the millions of cars on the road, we must rely on drawing samples from a population. We have already seen that these sample statistics distribute themselves about a central value, such as the population mean or the population proportion. The sample means virtually never equal the population mean, and the sample proportions virtually never equal the population proportion. In addition, they almost always differ from one another and by variable amounts. How then can sample statistics be used to prove no difference? In a word, they cannot.
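The scatter of sample means about the population mean can be seen in a small simulation. The following is only a sketch under stated assumptions: the population is taken to be normal with a mean of 25 miles per gallon and a standard deviation of 3, and the function name `sample_means` is hypothetical. None of these values comes from the text.

```python
import random
import statistics


def sample_means(population_mean, population_sd, sample_size, n_samples, seed=0):
    """Draw n_samples random samples from a normal population; return their means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = []
    for _ in range(n_samples):
        sample = [rng.gauss(population_mean, population_sd) for _ in range(sample_size)]
        means.append(statistics.fmean(sample))
    return means


# Assumed (hypothetical) population: mean 25 mpg, standard deviation 3 mpg.
POPULATION_MEAN = 25.0
means = sample_means(POPULATION_MEAN, 3.0, sample_size=30, n_samples=5)
for m in means:
    print(f"sample mean = {m:.2f} mpg, difference from population mean = {m - POPULATION_MEAN:+.2f}")
```

Each sample mean lands near 25 but not on it, and the five means differ from one another by varying amounts, which is why a sample result cannot prove "no difference."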

However, since H0 and H1 are mutually exclusive and exhaustive, rejecting H0 allows us to assert H1. Note that the proof is indirect. We assert H1 by rejecting H0. Also, statistical proof is always one-way. We may reject H0 and assert H1, but we cannot reject H1 and, thereby, assert H0. In brief, just as we cannot prove H1 directly, neither can we disprove it. For example, if two sample means happen to be identical, this does not prove no difference in the population parameters. In other words, we have not proved that they come from the same population, or from two different populations in which the population means are the same. We know that sample statistics, even if drawn from different populations, can occasionally be the same. It should be noted that the null and alternative hypotheses are always stated in terms of population parameters rather than in terms of sample statistics, although we use sample statistics to test the hypotheses.

Let's briefly summarize the logic of hypothesis testing up to this point.

First, we set up two mutually exclusive and exhaustive hypotheses.

1. The null hypothesis (H0):

a. Cannot be proved. We cannot prove the mileage additive or device does not work.

b. Can be rejected. If rejected, we assert H1. We can reject the hypothesis that the additive or device does not work and thereby assert that it does work.

2. The alternative hypothesis (H1):

a. Cannot be proved directly. We cannot directly prove that the mileage additive or device works.

b. Cannot be rejected directly. We cannot directly reject the possibility that the additive or device works.

Only by rejecting H0 (item 1b) can we assert our “indirect proof” of the alternative hypothesis.

But how do we go about rejecting H0? This is where probability theory enters the arena. First, it should be emphasized that in hypothesis testing we assume that the null hypothesis describes the true distribution. If, in the sampling distribution of a statistic under the null hypothesis, a particular result would have a low probability of occurrence, then one of two possible conditions has occurred. Namely, either the null hypothesis is true and we obtained a result which had a very small probability of occurring, or the null hypothesis is not true. We choose to accept the latter condition, i.e., we reject H0 and assert H1.

How do we define low probability? The definition is arbitrary but not capricious. Some researchers are willing to reject H0 when the obtained statistic would have occurred 5 percent of the time or less in the appropriate sampling distribution. This criterion for rejecting H0 is variously referred to as the 5 percent significance level, the 0.05 level of significance, or simply the 5 percent or 0.05 level. The criterion of rejection is typically represented by the Greek letter α (alpha). Thus, when we use the 0.05 significance level, α=0.05.

Other researchers set up a more stringent criterion for rejecting H0 and asserting H1, namely the 0.01 or 1 percent significance level (i.e., α=0.01). Only when the obtained statistic would have occurred 1 percent or fewer times in the sampling distribution of interest would the researcher be willing to reject H0 and assert indirect proof of H1. These two levels of significance are commonly used, although we will occasionally encounter other values, such as 0.10 or 0.001.
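The rejection rule at the two common levels can be sketched as follows. This is an illustration only, under assumptions that are not in the text: a two-sided z-test with a known population standard deviation, a null mean of 25 miles per gallon, and a hypothetical sample of 36 cars averaging 26.2. The function names are also hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p_value(sample_mean, null_mean, population_sd, sample_size):
    """Probability, under H0, of a sample mean at least this far from null_mean."""
    standard_error = population_sd / sqrt(sample_size)
    z = (sample_mean - null_mean) / standard_error
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def decide(p_value, alpha):
    """Reject H0 when the result would occur alpha of the time or less under H0."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"


# Hypothetical data: H0 says the mean is 25 mpg; 36 treated cars average 26.2 mpg.
p = two_sided_p_value(sample_mean=26.2, null_mean=25.0, population_sd=3.0, sample_size=36)
for alpha in (0.05, 0.01):
    print(f"alpha = {alpha}: p = {p:.4f} -> {decide(p, alpha)}")
```

With these assumed numbers the p-value is about 0.016, so the same result leads to rejecting H0 at the 0.05 level but not at the more stringent 0.01 level.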

The level of significance that is set is not merely a matter of preference among different researchers or different statisticians. The choice has to do with the consequences of making one of two types of error: mistakenly rejecting a true H0 or failing to reject a false H0. The same researcher or statistician may use different levels of significance in different experiments.

Note that statistical proof is not absolute proof in any sense of the word. If our test allows us to reject the null hypothesis that the additive or device does not work, we have not demonstrated beyond all reasonable doubt that it does work. As we have noted repeatedly, statistical analysis and probability theory help to reduce uncertainty. They do not eliminate it altogether. Indeed, analysis of the logic of statistical inference reveals that there are two types of errors we may commit:

1. We may reject the null hypothesis when it is true. Thus, we may falsely reject H0, that the additive or device does not improve mileage, and assert H1, that it does improve mileage. Such an error of falsely rejecting H0 is known as a Type I or Type α error. The probability of this type of error is given by α. Thus, if α=0.05, we will mistakenly reject a true null hypothesis approximately five percent of the time. Hence, when there is in fact no effect, our claim to have demonstrated an effect of changed conditions (such as better mileage when mixing an additive with the gasoline) will be wrong about 5 times in 100. Some might consider this risk of error too high. That is why some investigators use α=0.01 or less. They are more conservative about their willingness to claim an effect where there may not be one.

2. In the second type of error, we fail to reject H0 when it is actually false. This class of error is known as a Type II or Type β error. If the device or additive actually improved mileage and we did not reject H0, we would have failed to claim an effect when there was one. Note that we do not claim to have proved H0 but merely that we failed to reject it. We should not minimize the importance of a Type II error. Many promising lines of research have undoubtedly been abandoned prematurely because the results of preliminary investigations were not encouraging.
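The claim in item 1, that a true H0 is rejected about α of the time, can be checked by simulation. The sketch below makes assumptions that are not in the text: a normal population with a mean of 25 miles per gallon and a known standard deviation of 3, a two-sided z-test, and the hypothetical function name `type_i_error_rate`.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def type_i_error_rate(alpha, n_experiments=10000, sample_size=20, seed=1):
    """Fraction of simulated experiments that reject H0 although H0 is true."""
    rng = random.Random(seed)
    null_mean, population_sd = 25.0, 3.0  # assumed values; H0 is true by construction
    critical_z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # two-sided cutoff
    standard_error = population_sd / sqrt(sample_size)
    rejections = 0
    for _ in range(n_experiments):
        sample = [rng.gauss(null_mean, population_sd) for _ in range(sample_size)]
        z = (fmean(sample) - null_mean) / standard_error
        if abs(z) >= critical_z:
            rejections += 1  # a Type I error: H0 rejected while true
    return rejections / n_experiments


for alpha in (0.05, 0.01):
    print(f"alpha = {alpha}: observed Type I error rate = {type_i_error_rate(alpha):.4f}")
```

Because every simulated sample is drawn from the population that H0 describes, each rejection is a Type I error, and the observed rate should come out close to the chosen α.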

An ideal situation is one that results in a balance between the two types of errors. In this ideal situation, we should be able to state in advance the probability of making both a Type I and a Type II error.

As we have seen, we may state the probability of a Type I error in terms of the value of α we employ. The probability of a Type II error is represented by β. How can we evaluate this probability? We can determine β only when H0 is false and we know the true value of the parameter under H1. Since this is rarely the case, it is difficult to evaluate this probability. However, procedures are available that permit us to estimate β even when the parameter is not known. These procedures are beyond the scope of this text.

There are certain strategies, however, that may be employed to reduce the probability of a Type II error. For example, the lower we set our α, the greater the likelihood of a Type II error. Thus, in general, if α=0.05, β will be less than if α=0.01. Another way to reduce β is to increase the sample size. The larger the sample, the smaller the probability of a Type II error.
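When the true parameter value is assumed to be known, β can be computed directly, which makes both effects visible. The following is a sketch under assumptions that are not in the text: a one-sided (upper-tail) z-test with a known population standard deviation of 3, a null mean of 25 miles per gallon, and a true mean of 26. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error_probability(alpha, null_mean, true_mean, population_sd, sample_size):
    """Beta for an upper-tail z-test: chance of failing to reject a false H0."""
    standard_normal = NormalDist()
    standard_error = population_sd / sqrt(sample_size)
    critical_z = standard_normal.inv_cdf(1.0 - alpha)  # rejection cutoff in z units
    shift = (true_mean - null_mean) / standard_error   # true effect in z units
    # H0 is retained when the z statistic falls below the cutoff.
    return standard_normal.cdf(critical_z - shift)


# Assumed values: H0 mean 25 mpg, true mean 26 mpg, standard deviation 3 mpg.
for alpha in (0.05, 0.01):
    for n in (36, 100):
        beta = type_ii_error_probability(alpha, 25.0, 26.0, 3.0, n)
        print(f"alpha = {alpha}, n = {n}: beta = {beta:.3f}")
```

Under these assumptions, with 36 cars β is about 0.36 at α=0.05 and about 0.63 at α=0.01, and raising the sample size to 100 lowers β at either level.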

Table 3-1 summarizes the type of error as a function of the true status of H0 and the decision we have made. Note that a Type I error can be made only when H0 is true, and a Type II error only when H0 is false. We see that (1-α) is the probability of accepting (that is, not rejecting) H0 when it is true, and α is the probability of rejecting a true H0. Thus, if α=0.05 and H0 is true, the probability of accepting H0 equals 1-0.05=0.95. Or, stated another way, if the null hypothesis is true, there is a 95 percent chance that it will be accepted.

Table 3-1 The Type of Error Made as a Function of the True Status of H0 and the Decision We Have Made

Richard P. Runyon and Audrey Haber, Business Statistics © 1982, pp. 224-228. Reprinted by permission of Richard D. Irwin, Inc., Homewood, Illinois.

Key Terms and Concepts

hypothesis testing: one of the basic subdivisions of statistical inference, dealing with methods for testing hypotheses about population parameters.

Yom Kippur War: the war in the Middle East that began on Yom Kippur in 1973.

null hypothesis: the basic hypothesis, H0, is generally referred to as the null hypothesis. It is a statement that specifies hypothesized values for one or more of the population parameters.

alternative hypothesis: the hypothesis opposite to the null hypothesis; a statement that specifies that the population parameter is a value other than that specified in the null hypothesis.

population mean: the mean of the population. It is usually not available, since it is the average of the complete set of measurements or observations that interest the person collecting a sample.

population proportion: the relative frequency of occurrence, for qualitative variables, in the complete set of measurements or observations.

significance level: a probability value considered low enough, in the sampling distribution under the null hypothesis, to allow the assertion that nonchance factors are operating.

Suggestions for Further Study

Reading ranks first in importance when studying a foreign language. Reading is also the most productive strategy in learning a foreign language. If a student can read a word, most likely he can understand the word in chatting or broadcasting.

Three situations make reading ineffective. First, one reads slowly, hesitating or stopping at every new word. Second, one reads fast, but stops at the wrong places and misses important ideas. Third, one reads fast and stops at the right places, but misunderstands the message or theme.

In reading, not every word is important, and this idea applies to Chinese reading materials as well. English learners in China tend to stop at every new word simply because English is a foreign language. If one day they can pass over new words as they pass over unfamiliar Chinese characters in the evening news, they are free.

Key Words Reading