It’s a question that arises with virtually every major new finding in science or medicine: What makes a result reliable enough to be taken seriously? The answer has to do with statistical significance, but also with judgments about what standards make sense in a given situation.

The unit of measurement usually given when talking about statistical significance is the standard deviation, expressed with the lowercase Greek letter sigma (σ). The term refers to the amount of variability in a given set of data: whether the data points are all clustered together, or very spread out.

In many situations, the results of an experiment follow what is called a “normal distribution.” For example, if you flip a coin 100 times and count how many times it comes up heads, the average result will be 50. But if you do this test 100 times, most of the results will be close to 50, though not exactly 50. You’ll get almost as many cases with 49 or 51 as with 50. You’ll get quite a few 45s or 55s, but almost no 20s or 80s. If you plot your 100 tests on a graph, you’ll get a well-known shape called a bell curve that’s highest in the middle and tapers off on either side.

The deviation is how far a given data point is from the average. In the coin example, a result of 47 has a deviation of three from the average (or “mean”) value of 50. The standard deviation is just the square root of the average of all the squared deviations.

One standard deviation, or one sigma, plotted above or below the average value on the normal distribution curve, would define a region that includes 68 percent of all the data points. Two sigmas above or below would include about 95 percent of the data, and three sigmas would include 99.7 percent.

So, when is a particular data point, or research result, considered significant? The standard deviation can provide a yardstick: If a data point is a few standard deviations away from the model being tested, this is strong evidence that the data point is not consistent with that model.

However, how to use this yardstick depends on the situation. The Lebel Professor of Electrical Engineering at MIT, who teaches the course Fundamentals of Probability, says, “Statistics is an art, with a lot of room for creativity and mistakes.” Part of the art comes down to deciding what measures make sense for a given setting.

For example, if you’re taking a poll on how people plan to vote in an election, the accepted convention is that two standard deviations above or below the average, which gives a 95 percent confidence level, is reasonable. That two-sigma interval is what pollsters mean when they state the “margin of sampling error,” such as 3 percent, in their findings.

That means if you asked an entire population a survey question and got a certain answer, and then asked the same question to a random group of 1,000 people, there is a 95 percent chance that the second group’s results would fall within two sigma of the first result. If a poll found that 55 percent of the entire population favors candidate A, then 95 percent of the time, a second poll’s result would be somewhere between 52 and 58 percent.

Of course, that also means that 5 percent of the time, the result would be outside the two-sigma range. That much uncertainty is fine for an opinion poll, but maybe not for the result of a crucial experiment challenging scientists’ understanding of an important phenomenon, such as last fall’s announcement of a possible detection of neutrinos moving faster than the speed of light in an experiment at the European Organization for Nuclear Research, known as CERN.

Technically, the results of that experiment had a very high level of confidence: six sigma. In most cases, a five-sigma result is considered the gold standard for significance, corresponding to less than a one-in-a-million chance that the findings are just a result of random variations; six sigma translates to about one chance in a half-billion that the result is a random fluke. (A popular business-management strategy called “Six Sigma” derives from this term, and is based on instituting rigorous quality-control procedures to reduce waste.)

But in that CERN experiment, which had the potential to overturn a century’s worth of accepted physics that has been confirmed in thousands of different kinds of tests, that’s still not nearly good enough. For one thing, the six-sigma figure assumes that the researchers have done the analysis correctly and haven’t overlooked some systematic source of error. And because the result was so unexpected and so revolutionary, that’s exactly what most physicists think happened: some undetected source of error.
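
The numbers quoted in this article can be checked with a few lines of Python. The sketch below is only an illustration, not part of the original reporting, and the function names (coverage, fluke_odds, binomial_sigma, poll_margin) are made up for this example. It assumes an ideal normal distribution, counts deviations in either direction (a two-sided convention; a one-sided convention would halve the fluke probabilities), and treats the poll as a simple random sample.

```python
import math


def coverage(k: float) -> float:
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


def fluke_odds(k: float) -> float:
    """Return N such that a deviation of at least k sigma, in either
    direction, happens by chance about once in N trials (two-sided)."""
    return 1.0 / math.erfc(k / math.sqrt(2))


def binomial_sigma(n: int, p: float) -> float:
    """Standard deviation of the number of successes in n independent trials."""
    return math.sqrt(n * p * (1 - p))


def poll_margin(p: float, sample_size: int, k: float = 2.0) -> float:
    """k-sigma margin of sampling error for a proportion p
    measured on a simple random sample (an assumption of this sketch)."""
    return k * math.sqrt(p * (1 - p) / sample_size)


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(f"{k} sigma covers {coverage(k):.1%} of the data")
    for k in (5, 6):
        print(f"{k} sigma: about 1 chance in {fluke_odds(k):,.0f}")
    print(f"100 coin flips: sigma = {binomial_sigma(100, 0.5):.1f} heads")
    print(f"Poll of 1,000 at 55 percent: +/- {poll_margin(0.55, 1000):.1%}")
```

Under these assumptions, one, two and three sigma cover about 68, 95 and 99.7 percent of the data; five sigma works out to roughly one chance in 1.7 million and six sigma to roughly one in 500 million; 100 coin flips have a standard deviation of 5 heads; and a poll of 1,000 people at 55 percent has a two-sigma margin of about 3 percentage points.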