## BIO 550 Week 6 Discussion Questions


DQ1 Does association imply causation? Why or why not?

DQ2 Are weak associations indicative of a noncausal relationship? Select two other peers' postings and debate their rationale.

Most studies include multiple response variables, and the dependencies among them are often of great interest. For example, we may wish to know whether the levels of mRNA and the matching protein vary together in a tissue, or whether increasing levels of one metabolite are associated with changed levels of another. This month we begin a series of columns about relationships between variables (or features of a system), beginning with how pairwise dependencies can be characterized using correlation.

Two variables are independent when the value of one gives no information about the value of the other. For variables *X* and *Y*, we can express independence by saying that the chance of measuring any one of the possible values of *X* is unaffected by the value of *Y*, and vice versa, or by using conditional probability, *P*(*X*|*Y*) = *P*(*X*). For example, successive tosses of a coin are independent—for a fair coin, *P*(*H*) = 0.5 regardless of the outcome of the previous toss, because a toss does not alter the properties of the coin. In contrast, if a system is changed by observation, measurements may become associated or, equivalently, dependent. Cards drawn without replacement are not independent; when a red card is drawn, the probability of drawing a black card increases, because now there are fewer red cards.
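
The card example can be checked with exact arithmetic. The sketch below assumes a standard deck of 26 red and 26 black cards (the text does not specify deck composition), and the function name `p_black_second` is hypothetical. It compares the marginal probability that the second card is black with the same probability conditioned on the color of the first card.

```python
from fractions import Fraction

def p_black_second(first_color=None, red=26, black=26):
    """Probability that the second card drawn (without replacement) is black.

    first_color=None gives the marginal probability; "red" or "black"
    conditions on the color of the first card. Deck sizes are assumptions.
    """
    total = red + black
    if first_color is None:
        # Average over both possible colors of the first card.
        return (Fraction(red, total) * Fraction(black, total - 1)
                + Fraction(black, total) * Fraction(black - 1, total - 1))
    if first_color == "red":
        # One red card is gone, so all black cards remain among total - 1.
        return Fraction(black, total - 1)
    if first_color == "black":
        return Fraction(black - 1, total - 1)
    raise ValueError("first_color must be None, 'red' or 'black'")

p_marginal = p_black_second()          # P(second is black)
p_given_red = p_black_second("red")    # P(second is black | first is red)

# Dependence: conditioning on the first draw changes the probability.
print(p_marginal, p_given_red, p_given_red > p_marginal)
```

Because the conditional probability differs from the marginal one, the two draws are dependent. For a fair coin the analogous comparison would give 1/2 in both cases, which is what *P*(*X*|*Y*) = *P*(*X*) expresses.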

Association should not be confused with causality; if *X* causes *Y*, then the two are associated (dependent). However, associations can arise between variables both in the presence of a causal relationship (i.e., *X* causes *Y*) and in its absence (i.e., they have a common cause), as we’ve seen in the context of Bayesian networks^{1}. As an example, suppose we observe that people who drink more than 4 cups of coffee daily have a decreased chance of developing skin cancer. This does not necessarily mean that coffee confers resistance to cancer; one alternative explanation would be that people who drink a lot of coffee work indoors for long hours and thus have little exposure to the sun, a known risk factor. If this is the case, then the number of hours spent outdoors is a confounding variable: a cause common to both observations. In such a situation, a direct causal link cannot be inferred; the association merely suggests a hypothesis, such as a common cause, but does not offer proof. In addition, when many variables in complex systems are studied, spurious associations can arise. Thus, association does not imply causation.
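
A small simulation can make the confounding argument concrete. The sketch below is illustrative only: every probability in it is an invented assumption, not data, and the function names are hypothetical. Outdoor work lowers the chance of heavy coffee drinking and raises the chance of skin cancer, while coffee itself has no effect on cancer by construction.

```python
import random

def simulate(n=200_000, seed=0):
    """Generate (outdoors, heavy_coffee, cancer) records; all rates are assumptions."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        outdoors = rng.random() < 0.5
        # Confounder -> coffee: indoor workers drink more coffee.
        heavy_coffee = rng.random() < (0.2 if outdoors else 0.7)
        # Confounder -> cancer: sun exposure raises risk. Coffee is not used here.
        cancer = rng.random() < (0.10 if outdoors else 0.02)
        records.append((outdoors, heavy_coffee, cancer))
    return records

def cancer_rate(records, heavy_coffee, outdoors=None):
    """Cancer frequency among a coffee group, optionally within one exposure stratum."""
    group = [c for o, h, c in records
             if h == heavy_coffee and (outdoors is None or o == outdoors)]
    return sum(group) / len(group)

records = simulate()

# Crude comparison: heavy coffee drinkers appear protected.
print("crude:", cancer_rate(records, True), cancer_rate(records, False))

# Stratified by the confounder: the apparent protection largely disappears.
for outdoors in (False, True):
    print("outdoors =", outdoors,
          cancer_rate(records, True, outdoors),
          cancer_rate(records, False, outdoors))
```

In the crude comparison the heavy coffee drinkers show a lower cancer rate, yet within each exposure stratum the two coffee groups have nearly the same rate. This mirrors the point in the text: the association is real, but it is produced by the common cause rather than by coffee.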

In everyday language, dependence, association and correlation are used interchangeably. Technically, however, association is synonymous with dependence and is different from correlation (Fig. 1a). Association is a very general relationship: one variable provides information about another. Correlation is more specific: two variables are correlated when they display an increasing or decreasing trend. For example, in an increasing trend, observing that *X* > μ_{X} implies that it is more likely that *Y* > μ_{Y}. Because not all associations are correlations, and because causality, as discussed above, can be connected only to association, we cannot equate correlation with causality in either direction.
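
To illustrate that not every association is a correlation, the sketch below computes Pearson's correlation coefficient for two small, made-up data sets. The helper `pearson` and the data are assumptions chosen for illustration. In the first set *Y* follows a linear trend in *X*; in the second, *Y* = *X*² on values of *X* that are symmetric around zero, so *Y* is fully determined by *X* but shows no overall increasing or decreasing trend.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

x = [-3, -2, -1, 0, 1, 2, 3]          # symmetric around zero (assumed data)
y_linear = [2 * v + 1 for v in x]     # increasing trend: correlated
y_quadratic = [v ** 2 for v in x]     # dependent on x, but no monotone trend

r_linear = pearson(x, y_linear)
r_quadratic = pearson(x, y_quadratic)
print(r_linear, r_quadratic)
```

The linear case gives a coefficient of essentially 1, while the quadratic case gives essentially 0 even though knowing *X* tells us *Y* exactly. The second pair is therefore associated but not correlated in the sense used above.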