Biostatistics for the Clinician

University of Texas-Houston
Health Science Center

Lesson 1.1

Biostatistics: How Much, Why, When and What

Lesson 1: Summary Measures of Data

1.1 Biostatistics: How Much, Why, When and What

A critical distinction between the scientific approach and other methods of inquiry lies in the emphasis placed on real world validation. Where research has shown that particular approaches are appropriate and effective for specific applications, clinicians are wise to select those approaches. Clinicians are then called upon to defend decisions on the basis of empirical research evidence. Consequently, the clinician must be an intelligent consumer of medical research outcomes, able to understand, interpret, critically evaluate and apply valid results from the latest medical research.

1.1.1 How Much: The Six Year Old Biostatistician

Norman & Streiner (1986) tell an old story about three little French boys who happened to see a man and woman naked on a bed in a basement apartment. The four year old said, "Look, that man and woman are wrestling!" The five year old said, "You silly, they're not wrestling, they're making love!" The six year old said, "Yes! And very poorly too!!" The four year old did not understand. The five year old had achieved a conceptual understanding. The six year old understood it well enough, presumably without actual experience, to be a critical evaluator. The purpose of these lessons is to make you a critical evaluator of medical research: a "six year old biostatistician".

Practice
Exercise 1:
Given achievement of the objectives of these lessons you should:

No Response
Be a competent biostatistician
Be a critical evaluator of medical research
Be a competent producer of medical research
None of the above


Professional Benefits


You are about to overview the most frequently used and most important descriptive and inferential biostatistical methods as they are relevant for the clinician. The goal is that you will appreciate how the application of the theories of measurement, statistical inference, and decision trees contributes to better clinical decisions and ultimately to improved patient care and outcomes.

Practice
Exercise 2:
Being well informed about biostatistics contributes to (check all that apply):

No Response
Better clinical decisions
Improved patient outcomes
Improved patient care
All except "No Response" above


Conceptual Understanding


Conceptual understanding, rather than computational ability, will be the focus. Development of an adequate vocabulary, an examination of fundamental principles and a survey of the widely used procedures or tools to extract information from data, will form a basis for fruitful collaboration with a professional biostatistician when appropriate.

Practice
Exercise 3:
Computation is a focus of these lessons.

No Response
True
False


Practice
Exercise 4:
Conceptual understanding is a goal of these lessons.

No Response
True
False

Needs of Clinicians


The object is to help you understand the tools and procedures used in statistics and, when you actually need a statistical analysis done, to know to ask a biostatistician for help. So the objective here is not to make you into biostatisticians, but into people who appreciate what biostatistics can contribute to the appropriate care of your patients and who seek the appropriate help when necessary. The needs of practicing physicians, not the skills required of a biostatistician or of a sophisticated medical researcher, will inform the presentations.

Practice
Exercise 5:
Biostatistical issues emphasized will be based on the needs of:

No Response
Biostatisticians
Evaluators
Practicing physicians
Medical researchers


1.1.2 Why You Need Biostatistics

Now if you had to boil it all down, the goal of biostatistics is very straightforward: to show that the treatment and only the treatment caused the effect. That is really the job of biostatistics. When you are performing experiments on a being as complex as a human, you try to control as many variables as you can. You may take a sample that has only one gender. You may take a sample that has only a narrow range of ages.

OK, let's go back and look at what biostatistics is going to do for us here. It's going to limit the effects of chance, and that's the main thing that it does. You'll see this with small samples in particular, so it's going to take care of things like the false positives you might get in a small sample.
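To make the idea of limiting chance concrete, here is a minimal simulation sketch in Python. It is an illustration only: the group size of 10, the 5% threshold, the normally distributed outcome with a known standard deviation of 1, and the function name `false_positive_rate` are all assumptions chosen for the example. Both groups are drawn from the same population, so every "significant" difference the test reports is a false positive.

```python
import random
from statistics import NormalDist


def false_positive_rate(n_per_group, trials=5000, alpha=0.05, seed=0):
    """Fraction of no-effect experiments that a z-test flags as significant."""
    rng = random.Random(seed)          # fixed seed so the sketch is repeatable
    sigma = 1.0                        # assumed known standard deviation
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided cutoff, about 1.96
    se = sigma * (2 / n_per_group) ** 0.5          # standard error of the difference
    flagged = 0
    for _ in range(trials):
        # Both groups come from the SAME population: there is no real effect.
        treated = [rng.gauss(0, sigma) for _ in range(n_per_group)]
        control = [rng.gauss(0, sigma) for _ in range(n_per_group)]
        diff = sum(treated) / n_per_group - sum(control) / n_per_group
        if abs(diff) / se > z_crit:
            flagged += 1               # a false positive
    return flagged / trials


rate = false_positive_rate(n_per_group=10)
print(f"False positive rate with 10 per group: {rate:.3f}")
```

Under these assumptions the simulated rate should land close to the chosen threshold of 5%. That is what it means for a significance test to place a limit on the effects of chance: you decide in advance how often you are willing to be fooled.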

It helps us determine sample size. You have to have a big enough sample to find effects, so it's going to help you figure out whether the sample size is big enough to detect the results that you think are clinically important. To give you an example, suppose you have a new formula that is supposed to increase the weekly weight gain of newborns, and you want to know whether it really is useful for that purpose. Well, you first have to answer a question from the statistician: what do you, as a clinician, consider a useful increase in weight per week? If a statistician is going to help you with these things, you're going to be asked what you think is clinically important. In order to design the experiment with enough people in the sample to find small effects, you'll have to define the smallest effect that is useful to you.
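The link between the smallest useful effect and the sample size can be sketched with the standard normal-approximation formula for comparing two means, n per group = 2 * ((z_alpha + z_beta) * sigma / delta)^2. The numbers below are hypothetical and are not from the lesson: a standard deviation of 40 grams per week, a 5% two-sided false positive rate, and 80% power. The function name `sample_size_per_group` is also just a label for this sketch. A biostatistician would refine the calculation for a real study.

```python
import math
from statistics import NormalDist


def sample_size_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate newborns needed per group to detect a mean difference `delta`."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # guards against false positives (alpha)
    z_beta = z.inv_cdf(power)            # guards against false negatives (beta)
    n = 2 * ((z_alpha + z_beta) * sigma / delta) ** 2
    return math.ceil(n)                  # round up to whole subjects


# Hypothetical: weekly weight gain varies with a standard deviation of 40 g.
n_small_effect = sample_size_per_group(delta=10, sigma=40)   # 10 g/week is "useful"
n_large_effect = sample_size_per_group(delta=20, sigma=40)   # 20 g/week is "useful"
print(n_small_effect, n_large_effect)
```

Halving the smallest effect you care about roughly quadruples the number of newborns needed in each group, which is why the statistician has to ask what you consider clinically important before the study can be designed.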

Is 5 years useful in longevity? Is 1 pound more than formula X important for this newborn formula? You're also going to try to control for confounding variables: gender and other things that might confound results. You're going to try to design alternative ways of measuring effects in humans, because humans are not laboratory chemicals and controls are very difficult to impose on them. In your epidemiology lectures you'll have talked about approaches other than randomized trials. In principle the randomized trial is the ideal experimental design, but in practice it has serious flaws, because people don't always agree to take part. Some kinds of people say, "I'm going to be in a randomized trial," and others don't, so you already have a special kind of population that's skewed a bit. The question is whether you can design alternative ways of measuring effects without going to the extreme of a randomized clinical trial.

But even then there are many other things that might affect the outcomes you'd see when you apply a treatment to an individual. So when you boil it all down, what biostatistics is trying to do is to eliminate or minimize anything that might interfere with your being able to prove that the treatment and only the treatment caused the effect. These issues are summarized below.

BIOSTATISTICS
Goal of Experimental Method:
  • To prove that the treatment and only the treatment caused the effect.
Usefulness:
  • Place limits on effects of chance in small sample experiments - (Alpha or False Positives).
  • Determine sample size needed to detect clinically relevant effects - (Beta or False Negatives).
  • Control for effects of one or more confounding variables.
  • Assist in developing alternative designs for human experiments.
  • Use maximum information content measurement.
  • Measure intangibles such as intelligence, depression, and well-being.

1.1.3 When You Need Biostatistics

Whole Populations

Now, biostatistics is useful in some areas but not in others. You need to know when it makes sense to use biostatistics and when it doesn't make sense. To help illustrate let's look at some questions.
  • Let's say you want to know whether there are more women or men in your biostatistics class. How would you calculate the answer?

  • If you find a difference in the number of men and women, is the difference statistically significant? More specifically what test would you use to try to establish significance?

Well, for the first question all you have to do is count. No need for any fancy statistics because the number is relatively small and it's easy to count.

For the second question, once you have counted them, how would you know whether the result was statistically significant? Say there are 150 total students and you find out there are 80 women and 70 men. What would you do?

The answer is, no significance test is needed here. The reason is you have all the data from the entire population.

On the other hand, let's suppose you took a small sample from the larger population of the whole class, counted the women and men in that small group, and then wanted to extrapolate to the whole class. Then you need inferential statistics, because you're inferring from that small group what the whole population is like. When you use the entire population, inferential statistical methods are inappropriate, unnecessary and irrelevant. Inferential statistics is needed only when you're trying to infer from a small group to a larger group. So inferential statistics is worth zip, zero when you have the entire population and can count.
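The contrast between counting a whole population and inferring from a sample can be sketched in a few lines of Python. The class figures (80 women out of 150) come from the example above. The sample of 30 students containing 17 women is hypothetical, and the interval uses the simple normal approximation for a proportion, ignoring the finite size of the class, so treat it as an illustration rather than a recommended method.

```python
import math
from statistics import NormalDist

# Whole population: just count. The answer is exact, with no uncertainty.
women_in_class, class_size = 80, 150
exact_proportion = women_in_class / class_size

# Hypothetical sample of 30 students from the same class, 17 of them women.
women_in_sample, sample_size = 17, 30
estimate = women_in_sample / sample_size

# Inference: a 95% confidence interval expresses the uncertainty in the estimate.
z = NormalDist().inv_cdf(0.975)
margin = z * math.sqrt(estimate * (1 - estimate) / sample_size)
interval = (estimate - margin, estimate + margin)

print(f"Counted:   {exact_proportion:.3f}")
print(f"Estimated: {estimate:.3f} (95% CI {interval[0]:.3f} to {interval[1]:.3f})")
```

The counted proportion needs no interval at all, while the sample estimate is only meaningful together with its margin of error. That margin is the part inferential statistics supplies.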

Practice
Exercise 6:
Inferential statistics are needed when you:

No Response
Summarize data from a group
Generalize from samples to populations
Compute statistical measures
Have huge samples


Intuitive Biostatistics

Keep in mind that when you think biostatistics, don't think of complicated technical jargon and computations. Don't think of the names of the tests: t-test, chi-squared and so on. Instead, try to approach it from the fresh point of view of a person who's kind of naive, walks in and says, "I think I'll try to figure out how to do this sort of thing." That's the way you'll develop a conceptual feel for the subject, rather than relying on faded memories of times past when you were in some biostatistics course.

Practice
Exercise 7:
You will do better with biostatistics if you focus on the underlying concepts rather than the jargon and the notation.

No Response
True
False

Huge Samples

Now let's take a second problem. Let's say you perform a test of fitness on people, and you want to examine whether their fitness affects their longevity. In other words you're trying to find evidence concerning the hypothesis that longevity is associated with fitness. You want to be able to determine whether fitness causes a longevity effect. Let's say you get 300,000 males and follow them for 50 years to see how long they live. So you measure the fitness of these 300,000 males, track them for 50 years, see when they die, and get the following results. For those who are fit, the mean longevity is 75 years. For those who are not fit, it is 70 years. Is the difference in longevity statistically significant, and what test would you use to find out?

Before you go to the statistics books and look through the lists of significance tests though, first do a little thinking. The last time you read the New England Journal of Medicine how many studies did you read that had 300,000 males in the trial?

Probably some have had tens of thousands or even scores of thousands. But there are probably no medical studies that involve several hundred thousand. When you get very large numbers, the error due to chance becomes negligibly small, so for very large numbers you don't need statistical tests. If you're able to go out and get 300,000 volunteers, you don't need inferential statistics, because the numbers are there. They are overwhelming. The error associated with such large samples is made very small by the fact that you have huge groups.
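A rough calculation shows why the error shrinks. For two groups of equal size, the standard error of the difference between their means is approximately sd * sqrt(2 / n), where n is the number in each group. The sketch below assumes, purely for illustration, that age at death has a standard deviation of 10 years and that the 300,000 men split evenly into fit and unfit groups. Neither figure is given in the lesson.

```python
import math


def se_of_mean_difference(sd, n_per_group):
    """Approximate standard error of the difference between two group means."""
    return sd * math.sqrt(2 / n_per_group)


assumed_sd = 10.0                    # hypothetical spread of age at death, in years

se_huge = se_of_mean_difference(assumed_sd, n_per_group=150_000)
se_small = se_of_mean_difference(assumed_sd, n_per_group=25)

print(f"150,000 per group: standard error = {se_huge:.3f} years")
print(f"     25 per group: standard error = {se_small:.3f} years")
```

With 150,000 men per group the chance error is a few hundredths of a year, so an observed difference of 5 years dwarfs it. With 25 per group the chance error is several years, and the same 5 year difference is no longer obviously real.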

Practice
Exercise 8:
Inferential statistics are not useful when you:

No Response
Predict treatment effects
Generalize from samples to populations
Have small samples
Have huge samples


Samples from Populations

Now let's go back to the last question, but this time let's say you have only 50 males. Let's say you get the same difference in mean longevity between the fit and unfit groups. What test should you use, and are the results statistically significant?

Let's be more precise. Does increased fitness increase longevity? What test would you use to answer this question?

You would probably use a t-test, because you are comparing the means of two independent groups. Our focus now, however, is really not on the kind of inferential test that ought to be used. The point is that this is a situation where you should realize that the way to approach the problem is with inferential statistics. You have samples from larger populations, and you want to use data from the samples to make inferences about, and generalize to, those populations.
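For readers who want to see what a two-group t-test actually computes, here is a minimal sketch using only the Python standard library. It uses the Welch form of the test, which does not assume equal variances in the two groups. That choice, the function name `welch_t`, and the six ages at death in each group are all assumptions made up for the example, not data from the lesson.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return the Welch t statistic and its approximate degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    va = variance(sample_a) / na      # squared standard error of mean A
    vb = variance(sample_b) / nb      # squared standard error of mean B
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Hypothetical ages at death, in years.
fit = [78, 74, 81, 69, 77, 73]
unfit = [70, 66, 75, 68, 72, 69]

t_stat, df = welch_t(fit, unfit)
print(f"t = {t_stat:.2f} on about {df:.1f} degrees of freedom")
```

The statistic is simply the difference in means divided by its standard error. Turning it into a p-value requires the t distribution, which a statistics package provides (for example `scipy.stats.ttest_ind` with `equal_var=False`), and that is the kind of step where collaborating with a biostatistician pays off.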

Clinical Relevance

Now let's go to a last question. Let's suppose you have results like the fitness longevity research just described. You have applied a significance test and found that the results are statistically significant. Is that useful to you as a physician? What measure would you apply or what do you call the measure of usefulness to physicians when you look at results like that?

So, you know that the results are statistically significant! Does that mean something to you? And if it does, what do you call the kind of analysis you go through next? In other words, you've combed the journals and found articles showing that these results are statistically significant. So, you know the results were probably not due to chance. That's one of the mistakes that biostatistics prevents you from making: concluding that there is an effect when the result you got arose just by chance from having a funny group or something like that. So, you've got another result that's statistically significant. Is that the end of your analysis? Do you care about anything else?

Another way to phrase the question is, "What kinds of questions do you ask as a physician once you have results from a statistician that have been validated as statistically significant?" If you have 50 males in a study, obviously you've left out one group in the population. So the question is should you go back and look at another group or something like that so you can apply it to the population that may be in your practice? But, the questions are really broader than that.

Some obvious questions include:

  • Is it relevant? Does it pertain to my patients?
  • Is it feasible? Can it be implemented in making patients' lives better?
  • Can the result be applied to my patient population?
  • Is it reproducible?
  • Is there a need for more evidence?
  • Is data to be applied to patients as a whole or individual patients?
  • Is it appropriate to apply results from a group to specific individuals?

Group Data vs. Individual Differences

Let's look more closely at these questions. Are you going to be applying the results to patients as a whole or individual patients? If you apply group results to individuals, is it appropriate? Can you apply results that have some statistical importance to individual patients? Do you have a better alternative?

Modern medicine is based upon this leap of faith -- that the best evidence you have is epidemiological evidence. But, one of the primary distractions physicians have to be careful about is the tendency, because they've seen results in 10 of their own patients, to inappropriately generalize from this perhaps unrepresentative sample to others, neglecting the population kinds of statistics.

Practice
Exercise 9:
What sources of information should the physician particularly guard against in clinical decision making?

No Response
Valid medical research studies
Patient's individual characteristics
Biased samples
None of the Above


So, appropriateness is certainly an important question. Can you apply everything you know to every person? You'll have to answer that based upon individual characteristics. But knowing nothing else about a person other than that you had this result, medicine as a discipline would strongly suggest that you have no alternative but to use this kind of statistically based information in clinical decision making. Otherwise you're distracted by all kinds of bias.

Practice
Exercise 10:
Can the physician appropriately apply results obtained from patient groups to his own individual patients?

No Response
Yes
No
It depends

Who Decides?

Is it clinically significant? And, does it make sense to my patients? Is this result worth whatever it requires to be fit throughout my life? Look at things like smoking cessation. There are many countries around the world where smoking cessation is nothing like it is in the United States. They have in effect decided that their lifestyle, which includes smoking, is more important than the heart and lung disease, the cost to the general population and all the other terrible things that smoking causes.

Now, I wouldn't suggest that's a very wise clinical decision. The point, though, is that you have to make a decision based upon a broader array of things than the numbers that the statistician gives you. If, for example, you have a treatment whose effect is small, but the sample size is large enough for that effect to be statistically significant, the small difference may still not warrant a change. The change in your patients' lives may be so small that it is not worth what the treatment demands of them.
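The gap between statistical and clinical significance can be shown with a small calculation. Everything numeric here is hypothetical: an observed difference of 0.5 units, a standard deviation of 10 units, 20,000 patients per group, and a smallest clinically useful difference of 2 units. The p-value uses the normal approximation, which is reasonable only because the groups in this sketch are large.

```python
import math
from statistics import NormalDist


def two_sided_p_value(diff, sd, n_per_group):
    """Normal-approximation p-value for a difference between two group means."""
    se = sd * math.sqrt(2 / n_per_group)   # standard error of the difference
    z = diff / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


observed_diff = 0.5          # hypothetical observed difference, in outcome units
smallest_useful_diff = 2.0   # hypothetical smallest difference the clinician cares about

p_value = two_sided_p_value(observed_diff, sd=10.0, n_per_group=20_000)
statistically_significant = p_value < 0.05
clinically_important = observed_diff >= smallest_useful_diff

print(f"p = {p_value:.2g}")
print(f"statistically significant: {statistically_significant}")
print(f"clinically important:      {clinically_important}")
```

With groups this large the tiny difference is statistically significant by a wide margin, yet it falls well short of the threshold the clinician set. The test answers whether the effect is likely to be real. Only the clinician can answer whether it is large enough to matter.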

So, fundamentally your job is to go to the literature and find out whether somebody has applied the appropriate tests. If the results are statistically significant, that's where your job begins. You say whether or not the result is clinically significant or appropriate for your patients. If you're doing the analysis yourself, you collaborate with a biostatistician to get the best from the data. But, in the final analysis it is the physician who determines whether or not to apply the result to particular patients, taking into account all the individual things you know about them, and whether or not the result is worth it.

Practice
Exercise 11:
Who decides whether a biostatistical result is clinically relevant?

No Response
Biostatistician
Patient
Physician
Other


1.1.4 What Statistics

You are going to start by focusing on summary measures of data. That is, how you take data from a population or a group and somehow express it concisely in a few summary measures like some of those listed below.

Summary Measures of Data:
  • Definitions: Types of Variables, Quality of Measurements
  • Central Tendency: Mean, Mode, Median
  • Variability: Standard Deviation
  • Exploratory Data Analysis (EDA): Median, Hinges, Ranges, Box & Whisker Plots, Outliers
  • Distributions: Gaussian, Binomial, Poisson

In the rest of Lesson 1 you're going to look at several ways of summarizing data. First, in the next section, you will look at kinds of variables. Then you will get into measures of central tendency and variability and how they are used. You're then going to take a peek at exploratory data analysis, which is an area that would be very useful if you were going to be collecting data yourself. And, lastly you'll be looking at distributions of information, Gaussian, binomial and Poisson distributions. You'll get a big picture view of the whole of the field of descriptive statistics, the more technical name biostatisticians use to describe summary measures of data.


End Lesson 1.1