Wednesday, February 3, 2016

Reliability & Validity

Two big concepts-- Reliability and Validity

Without question, in order to understand effective social science research (or any kind of scientific research), you have to understand the concepts of "validity" and "reliability."

To put it simply, "validity" refers to whether a measure actually measures what you purport to be measuring.

For example, if you create a concept called "television use" and then decide to measure it by asking people how many TVs they own, that MIGHT be an indicator of how much TV they watch, but you definitely have some validity problems, right? Why?

It would probably be better to measure the concept by asking people how many hours of TV they watch per day, on average (or better still, you might have them go hour by hour through yesterday only and report what they watched).

I'm sure you can see that using a measurement like this is better than just asking how many TVs someone owns...

Reliability, meanwhile, simply refers to how consistently you get the same result when you repeat the measurement.

Let's take an example and put the two concepts together--

Suppose you have a digital bathroom scale and you step up on it and weigh yourself and it reads "145 lbs."

Now let's suppose you repeat that process ten times in a row and you get results like this--

1. 145
2. 144
3. 144.5
4. 145
5. 145.1
6. 144.5
7. 144.8
8. 145.2
9. 145
10. 144

Acting reasonably, we should see these results and say "this scale has come pretty close to giving me the same reading 10 times in a row, so I conclude it's reliable." If so, its "reliability" is strong and is not in question.
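
To put a number on "pretty close," here is a minimal Python sketch that summarizes the ten readings above. Using the mean, range, and standard deviation as the summary is an assumption made for illustration; nothing here says these are the only (or the official) ways to gauge reliability.

```python
import statistics

# The ten readings from the bathroom-scale example above, in pounds.
readings = [145, 144, 144.5, 145, 145.1, 144.5, 144.8, 145.2, 145, 144]

mean_reading = statistics.mean(readings)   # the typical reading
spread = max(readings) - min(readings)     # gap between highest and lowest reading
std_dev = statistics.stdev(readings)       # sample standard deviation

print(f"mean: {mean_reading:.2f} lbs")
print(f"range: {spread:.1f} lbs")
print(f"standard deviation: {std_dev:.2f} lbs")
```

A spread that is tiny compared to the mean is what "reliable" looks like here. Notice that none of these numbers can tell us whether the scale is RIGHT.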

What we do not know, however, is if the scale is RIGHT. What if it's wrong by, say, 10 lbs. and you REALLY weigh closer to 155 lbs.?

The scale's readings are reliable, but we can't say for sure if the scale is valid.

To confirm that the scale really is measuring pounds accurately, we might "test" it by weighing other items whose weights we already know. For example, a 10 lb. bag of potatoes, a 25 lb. weight (from a weight room), a 50 lb. bag of rock salt, a certified calibration weight and so on.

Now, if we weigh all those items and each time the scale gives us readings that are really close to what we should expect, we can conclude that the scale is indeed valid.
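
The known-weights check can be sketched in a few lines of Python. The function name `average_error` and both sets of scale readings are hypothetical, invented purely for illustration; only the three known weights come from the examples above.

```python
def average_error(known_weights, scale_readings):
    """Mean of (reading - known weight); a value near 0 suggests an accurate scale."""
    errors = [reading - known
              for known, reading in zip(known_weights, scale_readings)]
    return sum(errors) / len(errors)

# Known weights in pounds: potatoes, gym weight, rock salt.
known_weights = [10, 25, 50]

# HYPOTHETICAL readings, made up for this sketch.
good_scale = [10.1, 24.9, 50.2]   # close to the known weights
off_scale = [20.0, 35.1, 59.9]    # consistently about 10 lbs. too heavy

print(round(average_error(known_weights, good_scale), 2))
print(round(average_error(known_weights, off_scale), 2))
```

The second scale would look perfectly reliable if you only repeated one measurement, yet the check against known weights exposes that it is not valid.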

Understanding Independent (IV) and Dependent Variables (DV)

Understanding Variables

Ok, so in a nutshell, here's how quantitative research works (we'll talk about qualitative later).

First, you come up with an interesting question that you'd like to answer. In other words, you come up with concepts to see if they're related. Maybe you're interested in the relationship between TV viewing and obesity in kids; the portrayal of models in fashion magazines and real-life body image among females (or males); video games and reflexes; video games and violence; sexual content on TV and real-life sexual behavior; rap/rock lyrics and attitudes toward women; and so on. The possibilities are, literally, endless.

Then, using your own observations, thoughts, opinions, etc., you decide which way you believe the relationship goes. You come up with a declarative statement like "kids who watch a lot of TV get fat."

You believe this to be true for whatever reason. This is known as your "theoretical rationale." Why do you think it might be true? As long as it makes sense (face validity), you're probably on to something.

For example, you might say-- "well, kids who watch a lot of TV are spending time watching TV INSTEAD of running around and playing outside, so they're probably not getting a lot of exercise, so they're not burning as many calories. Also, it seems likely that kids watching TV are more likely to mindlessly snack than are kids playing kickball or some other activity. So, it seems to me that kids watching TV burn fewer calories and consume more calories, so it makes sense that this might lead to more childhood obesity."

Makes sense to me.

Every premise has a theoretical rationale.

We need to refine it, however, and form a hypothesis. A hypothesis is simply a declarative, testable statement that examines the relationship between variables.

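Variables are either independent (IV) or dependent (DV), sometimes called predictor and outcome variables. An IV or predictor variable is the one we think does the influencing; it isn't changed by the other variable. A DV or outcome variable, however, is the one we expect to change in response to the IV. We measure the DV to see whether it changes as the IV changes.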

For example, if I developed a hypothesis on the tv-obesity topic, I might come up with something like this:

H1: The more TV a child watches, the more likely the child is to be overweight.

In this one simple hypothesis, we have three concepts that we need to identify. What do we mean by "child," "TV watching," and "overweight?"

The conceptual definition is the dictionary definition of the concept, and how we plan on "measuring" that concept is called the "operational" definition.

For example, TV watching is defined as the number of hours, on average, someone watches TV per day (conceptual definition). To measure this, we had children circle the TV shows they watched "yesterday" from a grid provided to them (operational definition).
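
One way to see the difference is to write the operational definition out as a procedure. The sketch below is only an assumption about how the grid could be scored: the show names, their lengths, and the function `hours_watched` are all hypothetical.

```python
# HYPOTHETICAL grid handed to each child: show name -> length in minutes.
tv_grid = {"Show A": 30, "Show B": 60, "Show C": 30, "Show D": 120}

def hours_watched(circled_shows, grid):
    """Operational measure: total hours of the shows a child circled."""
    total_minutes = sum(grid[show] for show in circled_shows)
    return total_minutes / 60

# One child's (made-up) answers for "yesterday."
circled = ["Show A", "Show D"]
print(hours_watched(circled, tv_grid))
```

The conceptual definition says what "TV watching" means; the function is the concrete recipe that turns a child's circled grid into a number.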

Get it? You'd have to do this for each concept.

In terms of variables, in our first hypothesis, the IV is "TV watching" and the DV is "weight." We are suggesting, at least in this hypothesis, that an individual's "weight" will change based on how much TV he/she watches. Since we're suggesting that TV viewing can "influence" weight, weight is the DV.

To make it easier for us, most scholars follow the form of putting your IV first and your DV second in any hypothesis.

Then, once we've got this all figured out, we need to figure out how the heck we would "test" this hypothesis. The "test" is the statistical method used to figure out if the relationship between the variables is significant.

In this case, the IV is "ratio" and the DV is also "ratio." When we have two ratio variables, we use "correlation" as the statistical test.
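
As a rough illustration of what a correlation looks like in practice, here is a minimal Python sketch of the Pearson correlation coefficient, which is one common form of correlation and is assumed here. The data are HYPOTHETICAL numbers made up for the example, not real findings, and the sketch computes only the coefficient, not whether it is statistically significant.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How the two variables move together around their means.
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable moves on its own.
    variation_x = sum((x - mean_x) ** 2 for x in xs)
    variation_y = sum((y - mean_y) ** 2 for y in ys)
    return co_variation / math.sqrt(variation_x * variation_y)

# HYPOTHETICAL data for five children (invented for illustration).
tv_hours = [1, 2, 3, 4, 5]           # IV: hours of TV per day
weight_lbs = [60, 64, 63, 70, 72]    # DV: weight in pounds

r = pearson_r(tv_hours, weight_lbs)
print(f"r = {r:.2f}")
```

A value of r near +1 means the two variables rise together, near -1 means one falls as the other rises, and near 0 means little linear relationship.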

When the IV is "nominal" and the DV is "nominal," we use chi-square.
When the IV is "nominal" and the DV is "interval/ratio," we use a t-test.
When the IV is "interval/ratio" and the DV is "interval/ratio," we use correlation.

(In this class, we won't discuss what to use if the IV is "interval/ratio" and the DV is "nominal," which is logistic regression.)
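
The rules above amount to a simple lookup table. Here is a minimal Python sketch of it; the names `TEST_FOR_LEVELS` and `choose_test` are hypothetical, and the table covers only the combinations listed in this post.

```python
# Lookup table: (IV level, DV level) -> statistical test, as listed above.
TEST_FOR_LEVELS = {
    ("nominal", "nominal"): "chi-square",
    ("nominal", "interval/ratio"): "t-test",
    ("interval/ratio", "interval/ratio"): "correlation",
    ("interval/ratio", "nominal"): "logistic regression",
}

def normalize(level):
    """Treat 'interval' and 'ratio' as the combined 'interval/ratio' level."""
    level = level.lower()
    return "interval/ratio" if level in ("interval", "ratio") else level

def choose_test(iv_level, dv_level):
    """Return the test for the given IV and DV levels of measurement."""
    return TEST_FOR_LEVELS[(normalize(iv_level), normalize(dv_level))]

# H1 from this post: IV (TV watching) is ratio, DV (weight) is ratio.
print(choose_test("ratio", "ratio"))
```

Any combination not in the table (for example, an ordinal variable) raises a `KeyError`, which is a deliberate choice in this sketch: better to stop than to guess at a test.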

We'll talk more about the tests later in the course, but it's a good idea to know which test is used in which circumstance...