# Identifying independent and dependent variables

#### redsoxnation

##### New member
Yes, here I am again.

I am having a Dickens of a time trying to sort this problem out:

"By using statistical models, researchers found that for every extra year of education women had, the death rate for children under 5 dropped by almost 10 percent. In 2009, they estimated that 4.2 million fewer children died because women of childbearing age in developing countries were more educated. In 1970, women aged 18 to 44 in developing countries went to school for about 2 years. That rose to seven years in 2009."

The task here is to build the linear model implicit in this quotation, and to identify the independent and dependent variables, and the slope. In trying to do this, it just seems to me that there is too much information here. I understand I don't have to use all the information, but I just can't seem to identify what goes where. Is the "death rate" the dependent variable (as it depends on the education of the mother)? And the amount of education the independent variable? Is this y= mx + b?

I usually appreciate the fact that the "helpers" here give just enough information to get me on track, but if someone could "lay this out" so that I could get it, it would be much appreciated.

(To make matters worse, the "book" we are using isn't even a "book" yet -- it's a draft, and it's full of so many typographical errors that it makes my head spin).

Many, many thanks.

#### mmm4444bot

##### Super Moderator
Staff member
redsoxnation said:
Is the "death rate" the dependent variable (as it depends on the education of the mother)?

Yes. Specifically, y depends upon the number of years of education.

[Is] the amount of education the independent variable?

Yes. And the unit of this amount is "years", so x represents the number of years of education.

When x increases by 1, y goes down by almost 10 (measuring y in percent), so the slope m is about -10.
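Here is a minimal sketch of that model in Python, in case it helps to see the pieces in one place. It rests on assumptions that are not in the quotation: the death rate is expressed as an index with the 1970 level set to 100 (a hypothetical baseline), and "almost 10 percent" is read as 10 index points per extra year so that the model stays linear. The names `SLOPE`, `INTERCEPT`, and `death_rate_index` are made up for illustration.

```python
# Sketch of the linear model y = m*x + b implied by the quotation.
# x: years of schooling; y: under-5 death rate as an index (1970 level = 100).
SLOPE = -10.0       # "almost 10 percent" per extra year, read as 10 index points (assumption)
BASE_YEARS = 2.0    # about 2 years of schooling in 1970 (from the quotation)
BASE_RATE = 100.0   # hypothetical index value for 1970; not given in the quotation

# Solve BASE_RATE = SLOPE * BASE_YEARS + b for the intercept b.
INTERCEPT = BASE_RATE - SLOPE * BASE_YEARS


def death_rate_index(years_of_education: float) -> float:
    """Return the modeled death-rate index for a given number of school years."""
    return SLOPE * years_of_education + INTERCEPT


rate_1970 = death_rate_index(2.0)  # schooling level reported for 1970
rate_2009 = death_rate_index(7.0)  # schooling level reported for 2009
print(rate_1970, rate_2009)
```

The independent variable is the argument `years_of_education`, the dependent variable is the returned index, and the slope is the constant multiplying x. The 4.2 million figure is not needed to write down the line.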

#### Subhotosh Khan

##### Super Moderator
Staff member
redsoxnation said:
Is the "death rate" the dependent variable (as it depends on the education of the mother)? And the amount of education the independent variable? Is this y = mx + b?

Yes.

#### JeffM

##### Guest
Red

Mathematically, when y = f(x), x is the independent variable whereas y is the dependent variable. If you think of a function as a black box that generates a number (y) when you feed it a different number (x), y DEPENDS on what you feed into the black box.

It is frequently the case that if y = f(x) there is a different function, the inverse, such that x = g(y). This is true of every linear function with a nonzero slope.
If y = mx + b = f(x) and m ≠ 0, then x = (y - b) / m = g(y).
So, in mathematics, you can say that independent variable = argument of a function, and dependent variable = result of a function. The mathematics is not affected by what is cause and what is effect: it is a purely formal distinction.
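If it helps, the inverse relationship can be checked with a short Python sketch. The values of `m` and `b` below are hypothetical, chosen only so the arithmetic is easy to follow, and the function names `f` and `g` simply mirror the notation above.

```python
def f(x: float, m: float, b: float) -> float:
    """Linear function: y = m*x + b."""
    return m * x + b


def g(y: float, m: float, b: float) -> float:
    """Inverse of f: x = (y - b) / m, defined only when m is nonzero."""
    if m == 0:
        raise ValueError("a horizontal line (m = 0) has no inverse")
    return (y - b) / m


m, b = -10.0, 120.0   # hypothetical slope and intercept
x = 7.0
y = f(x, m, b)        # feed x into the black box
x_back = g(y, m, b)   # feed the result into the inverse
print(y, x_back)
```

Feeding the output of `f` into `g` recovers the original x, which is the formal sense in which either variable could be treated as the "independent" one.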

In a much more informal way, however, the independent variable or variables are ASSUMED to be the cause, and the dependent variable is ASSUMED to be the effect. So, it is impossible that the life expectancy of a woman's infant can influence how much education she previously received, but it is at least plausible that a mother's prior education can affect her infant's life expectancy. So, the math is set up formally to agree with the informal understanding: y is the presumed effect and is set up as y = f(x), where x is the presumed cause.

Does that make sense?

Now a warning. Statistical studies are set up to prove or disprove (with some degree of probability) an assumed cause and effect relationship between variables. What they actually prove or disprove is correlation. There may be some variable not taken into account that is driving all the variables. For example, it may be that maternal education has nothing to do with prolonged infant survival: greater wealth may permit both longer education and better medical care, but better educated mothers may do no better than uneducated ones if there is no decent medical care to be had.
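To make the warning concrete, here is a small simulation sketch in Python. Every number in it is invented for illustration and none comes from the study: a hidden "wealth" variable drives both education and the death rate, and education has no direct effect on the death rate at all. The helper `pearson` and the coefficients are assumptions of this sketch.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable

# Synthetic data: wealth is the hidden driver of both other variables.
wealth = [rng.uniform(0, 10) for _ in range(1000)]
education = [2 + 0.5 * w + rng.gauss(0, 1) for w in wealth]   # years of schooling
death_rate = [100 - 8 * w + rng.gauss(0, 5) for w in wealth]  # depends on wealth only

r = pearson(education, death_rate)
print(round(r, 2))
```

The correlation between education and death rate comes out strongly negative even though, by construction, education never enters the formula for the death rate. A regression of y on x would still report a negative slope here.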

#### redsoxnation

##### New member
Subhotosh Khan - Thank you for your confirmation.

JeffM - Yes, your answer does make sense. And I understand about the correlation; newspaper reporters often "cherry pick" exactly which facts they choose to leave in (and which to omit). Again, you've been a great help.

I'll be glad when this class is over (but right now, it's not looking good for the 'home team' - I may be on this board all summer....)