
Conditional Density of Discrete X Given Continuous Y

In probability theory and statistics, given two jointly distributed random variables $X$ and $Y$, the conditional probability distribution of $Y$ given $X$ is the probability distribution of $Y$ when $X$ is known to take a particular value; in some cases the conditional probabilities may be expressed as functions containing the unspecified value $x$ of $X$ as a parameter. When both $X$ and $Y$ are categorical variables, a conditional probability table is typically used to represent the conditional probability. The conditional distribution contrasts with the marginal distribution of a random variable, which is its distribution without reference to the value of the other variable.

If the conditional distribution of $Y$ given $X$ is a continuous distribution, then its probability density function is known as the conditional density function.[1] The properties of a conditional distribution, such as the moments, are often referred to by corresponding names such as the conditional mean and conditional variance.

More generally, one can refer to the conditional distribution of a subset of a set of more than two variables; this conditional distribution is contingent on the values of all the remaining variables, and if more than one variable is included in the subset then this conditional distribution is the conditional joint distribution of the included variables.

Conditional discrete distributions

For discrete random variables, the conditional probability mass function of $Y$ given $X=x$ can be written according to its definition as:

$$p_{Y\mid X}(y\mid x)\triangleq P(Y=y\mid X=x)=\frac{P(\{X=x\}\cap \{Y=y\})}{P(X=x)}$$

Because $P(X=x)$ appears in the denominator, this is defined only for non-zero (hence strictly positive) $P(X=x)$.

The relation with the probability distribution of $X$ given $Y$ is:

$$P(Y=y\mid X=x)\,P(X=x)=P(\{X=x\}\cap \{Y=y\})=P(X=x\mid Y=y)\,P(Y=y).$$

Example

Consider the roll of a fair die and let $X=1$ if the number is even (i.e., 2, 4, or 6) and $X=0$ otherwise. Furthermore, let $Y=1$ if the number is prime (i.e., 2, 3, or 5) and $Y=0$ otherwise.

D 1 2 3 4 5 6
X 0 1 0 1 0 1
Y 0 1 1 0 1 0

Then the unconditional probability that $X=1$ is 3/6 = 1/2 (since there are six possible rolls of the die, of which three are even), whereas the probability that $X=1$ conditional on $Y=1$ is 1/3 (since there are three possible prime number rolls, namely 2, 3, and 5, of which one is even).
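The die example can be checked by direct enumeration. The following is a minimal sketch that applies the definition of the conditional probability mass function to the six equally likely outcomes; the function and variable names are illustrative and not part of the original text.

```python
from fractions import Fraction

# Each face of a fair six-sided die has probability 1/6.
prob = {d: Fraction(1, 6) for d in range(1, 7)}

def X(d):
    """1 if the roll is even, 0 otherwise."""
    return 1 if d % 2 == 0 else 0

def Y(d):
    """1 if the roll is prime (2, 3, or 5), 0 otherwise."""
    return 1 if d in (2, 3, 5) else 0

def conditional_pmf_x_given_y(x, y):
    """P(X = x | Y = y) = P(X = x, Y = y) / P(Y = y)."""
    p_y = sum(p for d, p in prob.items() if Y(d) == y)
    if p_y == 0:
        raise ValueError("conditioning event has probability zero")
    p_xy = sum(p for d, p in prob.items() if X(d) == x and Y(d) == y)
    return p_xy / p_y

# Unconditional probability that the roll is even.
p_x1 = sum(p for d, p in prob.items() if X(d) == 1)
# Probability that the roll is even, given that it is prime.
p_x1_given_y1 = conditional_pmf_x_given_y(1, 1)

print(p_x1, p_x1_given_y1)  # 1/2 1/3
```

Exact fractions are used instead of floats so that the results match the hand calculation exactly.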

Conditional continuous distributions

Similarly for continuous random variables, the conditional probability density function of $Y$ given the occurrence of the value $x$ of $X$ can be written as[2]: p. 99

$$f_{Y\mid X}(y\mid x)=\frac{f_{X,Y}(x,y)}{f_{X}(x)}$$

where $f_{X,Y}(x,y)$ gives the joint density of $X$ and $Y$, while $f_{X}(x)$ gives the marginal density for $X$. In this case too, it is necessary that $f_{X}(x)>0$.

The relation with the probability distribution of $X$ given $Y$ is given by:

$$f_{Y\mid X}(y\mid x)\,f_{X}(x)=f_{X,Y}(x,y)=f_{X\mid Y}(x\mid y)\,f_{Y}(y).$$

The concept of the conditional distribution of a continuous random variable is not as intuitive as it might seem: Borel's paradox shows that conditional probability density functions need not be invariant under coordinate transformations.

Example

Consider a bivariate normal joint density for random variables $X$ and $Y$. To see the distribution of $Y$ conditional on $X=70$, one can first visualize the line $X=70$ in the $X,Y$ plane, and then visualize the plane containing that line and perpendicular to the $X,Y$ plane. The intersection of that plane with the joint normal density, once rescaled to give unit area under the intersection, is the relevant conditional density of $Y$. With $\mu_1, \sigma_1$ the mean and standard deviation of $Y$, $\mu_2, \sigma_2$ those of $X$, and $\rho$ the correlation coefficient, this conditional distribution is

$$Y\mid X=70\ \sim\ \mathcal{N}\left(\mu_{1}+\frac{\sigma_{1}}{\sigma_{2}}\rho\,(70-\mu_{2}),\,(1-\rho^{2})\sigma_{1}^{2}\right).$$
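The "slice and rescale" description can be checked numerically. The sketch below assumes hypothetical parameter values (the text gives none), evaluates the joint density along the line $X=70$, rescales the slice to unit area, and compares its mean and variance with the closed-form expression. As in the formula, index 1 refers to $Y$ and index 2 to $X$.

```python
import numpy as np

# Hypothetical parameters chosen for illustration only (not from the text).
mu1, sigma1 = 150.0, 20.0   # mean and standard deviation of Y
mu2, sigma2 = 65.0, 5.0     # mean and standard deviation of X
rho = 0.6                   # correlation between X and Y
x0 = 70.0                   # conditioning value X = 70

def joint_density(x, y):
    """Bivariate normal joint density f_{X,Y}(x, y)."""
    zx = (x - mu2) / sigma2
    zy = (y - mu1) / sigma1
    norm = 2.0 * np.pi * sigma1 * sigma2 * np.sqrt(1.0 - rho**2)
    quad = (zx**2 - 2.0 * rho * zx * zy + zy**2) / (2.0 * (1.0 - rho**2))
    return np.exp(-quad) / norm

# Closed-form conditional mean and variance of Y given X = x0.
cond_mean = mu1 + (sigma1 / sigma2) * rho * (x0 - mu2)
cond_var = (1.0 - rho**2) * sigma1**2

# Numerical version: slice the joint density at x = x0 and rescale to unit area.
y = np.linspace(mu1 - 10.0 * sigma1, mu1 + 10.0 * sigma1, 20001)
dy = y[1] - y[0]
slice_values = joint_density(x0, y)
cond_density = slice_values / (slice_values.sum() * dy)

num_mean = (y * cond_density).sum() * dy
num_var = ((y - num_mean) ** 2 * cond_density).sum() * dy

print(cond_mean, num_mean)
print(cond_var, num_var)
```

The rescaling step divides the slice by its own area, which is a Riemann-sum approximation of the marginal density $f_X(70)$, so the code is an instance of $f_{Y\mid X}(y\mid x)=f_{X,Y}(x,y)/f_X(x)$.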

Relation to independence

Random variables $X$, $Y$ are independent if and only if the conditional distribution of $Y$ given $X$ is, for all possible realizations of $X$, equal to the unconditional distribution of $Y$. For discrete random variables this means $P(Y=y\mid X=x)=P(Y=y)$ for all possible $y$ and $x$ with $P(X=x)>0$. For continuous random variables $X$ and $Y$ having a joint density function, it means $f_{Y}(y\mid X=x)=f_{Y}(y)$ for all possible $y$ and $x$ with $f_{X}(x)>0$.
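For discrete variables on a finite sample space, this criterion can be tested exhaustively. The sketch below reuses the fair-die setting from the earlier example; the third variable `at_most_two` is a hypothetical addition, introduced only to show a pair that does turn out to be independent.

```python
from fractions import Fraction
from itertools import product

# Fair six-sided die.
die = {d: Fraction(1, 6) for d in range(1, 7)}

def is_even(d):
    return int(d % 2 == 0)

def is_prime(d):
    return int(d in (2, 3, 5))

def at_most_two(d):
    # Hypothetical extra variable, not part of the original example.
    return int(d <= 2)

def are_independent(f, g):
    """True if P(g = b | f = a) == P(g = b) for all a with P(f = a) > 0."""
    values_f = {f(d) for d in die}
    values_g = {g(d) for d in die}
    for a, b in product(values_f, values_g):
        p_a = sum(p for d, p in die.items() if f(d) == a)
        if p_a == 0:
            continue  # the conditional distribution is undefined here
        p_b = sum(p for d, p in die.items() if g(d) == b)
        p_ab = sum(p for d, p in die.items() if f(d) == a and g(d) == b)
        if p_ab / p_a != p_b:
            return False
    return True

print(are_independent(is_even, is_prime))     # False
print(are_independent(is_even, at_most_two))  # True
```

Evenness and primality are dependent because $P(Y=1\mid X=1)=1/3$ differs from $P(Y=1)=1/2$, whereas knowing that the roll is even leaves the probability that it is at most 2 unchanged at 1/3.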

Properties

Seen as a function of $y$ for given $x$, $P(Y=y\mid X=x)$ is a probability mass function and so the sum over all $y$ (or integral if it is a conditional probability density) is 1. Seen as a function of $x$ for given $y$, it is a likelihood function, so that the sum over all $x$ need not be 1.

Additionally, a marginal of a joint distribution can be expressed as the expectation of the corresponding conditional distribution. For instance, $p_{X}(x)=E_{Y}[p_{X\mid Y}(x\mid Y)]$.
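These properties can be illustrated on a small joint probability table. The joint pmf below is a hypothetical example made up for this sketch; it checks that the conditional pmf sums to 1 over $y$, that the same quantity summed over $x$ generally does not, and that the marginal of $X$ is recovered as an expectation over $Y$.

```python
from fractions import Fraction as F

# Hypothetical joint pmf P(X = x, Y = y), keyed by (x, y); values sum to 1.
joint = {
    (0, 0): F(1, 10), (0, 1): F(2, 10), (0, 2): F(1, 10),
    (1, 0): F(3, 10), (1, 1): F(1, 10), (1, 2): F(2, 10),
}
xs = sorted({x for x, _ in joint})
ys = sorted({y for _, y in joint})

# Marginal pmfs of X and Y.
p_x = {x: sum(joint[x, y] for y in ys) for x in xs}
p_y = {y: sum(joint[x, y] for x in xs) for y in ys}

def p_y_given_x(y, x):
    """P(Y = y | X = x)."""
    return joint[x, y] / p_x[x]

def p_x_given_y(x, y):
    """P(X = x | Y = y)."""
    return joint[x, y] / p_y[y]

# As a function of y for fixed x: a pmf, so each sum equals 1.
sum_over_y = {x: sum(p_y_given_x(y, x) for y in ys) for x in xs}

# As a function of x for fixed y: a likelihood, sums need not equal 1.
sum_over_x = {y: sum(p_y_given_x(y, x) for x in xs) for y in ys}

# Marginal of X as the expectation over Y of p_{X|Y}(x | Y).
marginal_via_expectation = {
    x: sum(p_x_given_y(x, y) * p_y[y] for y in ys) for x in xs
}

print(sum_over_y)
print(sum_over_x)
print(marginal_via_expectation == p_x)  # True
```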

Measure-theoretic formulation

Let $(\Omega,\mathcal{F},P)$ be a probability space and $\mathcal{G}\subseteq\mathcal{F}$ a $\sigma$-field in $\mathcal{F}$. Given $A\in\mathcal{F}$, the Radon-Nikodym theorem implies that there is[3] a $\mathcal{G}$-measurable random variable $P(A\mid\mathcal{G}):\Omega\to\mathbb{R}$, called the conditional probability, such that

$$\int_{G}P(A\mid\mathcal{G})(\omega)\,dP(\omega)=P(A\cap G)$$

for every $G\in\mathcal{G}$, and such a random variable is uniquely defined up to sets of probability zero. A conditional probability is called regular if $P(\cdot\mid\mathcal{G})(\omega)$ is a probability measure on $(\Omega,\mathcal{F})$ for almost every $\omega\in\Omega$.

Special cases:

Let $X:\Omega\to E$ be an $(E,\mathcal{E})$-valued random variable. For each $B\in\mathcal{E}$, define

$$\mu_{X\mid\mathcal{G}}(B\mid\mathcal{G})=P(X^{-1}(B)\mid\mathcal{G}).$$

For any $\omega\in\Omega$, the function $\mu_{X\mid\mathcal{G}}(\cdot\mid\mathcal{G})(\omega):\mathcal{E}\to\mathbb{R}$ is called the conditional probability distribution of $X$ given $\mathcal{G}$. If it is a probability measure on $(E,\mathcal{E})$, then it is called regular.

For a real-valued random variable (with respect to the Borel $\sigma$-field $\mathcal{R}^{1}$ on $\mathbb{R}$), every conditional probability distribution is regular.[4] In this case, $E[X\mid\mathcal{G}]=\int_{-\infty}^{\infty}x\,\mu(dx,\cdot)$ almost surely.

Relation to conditional expectation

For any event $A\in\mathcal{F}$, define the indicator function:

$$\mathbf{1}_{A}(\omega)=\begin{cases}1 & \text{if }\omega\in A,\\ 0 & \text{if }\omega\notin A,\end{cases}$$

which is a random variable. Note that the expectation of this random variable is equal to the probability of $A$ itself:

$$E(\mathbf{1}_{A})=P(A).$$

Given a $\sigma$-field $\mathcal{G}\subseteq\mathcal{F}$, the conditional probability $P(A\mid\mathcal{G})$ is a version of the conditional expectation of the indicator function for $A$:

$$P(A\mid\mathcal{G})=E(\mathbf{1}_{A}\mid\mathcal{G})$$

An expectation of a random variable with respect to a regular conditional probability is equal to its conditional expectation.
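On a finite probability space the measure-theoretic objects reduce to simple averages, which makes the relation easy to check. The sketch below assumes the fair-die space and takes $\mathcal{G}$ to be the $\sigma$-field generated by the partition into odd and even rolls (an illustrative choice). It computes the conditional expectation of the indicator of $A=\{2,3,5\}$ and verifies the defining integral identity for every set in $\mathcal{G}$.

```python
from fractions import Fraction
from itertools import combinations

# Finite probability space: a fair six-sided die.
omega = range(1, 7)
P = {w: Fraction(1, 6) for w in omega}

# G is the sigma-field generated by this partition (odd rolls, even rolls).
partition = [frozenset({1, 3, 5}), frozenset({2, 4, 6})]
A = frozenset({2, 3, 5})  # event "the roll is prime"

def prob(event):
    return sum(P[w] for w in event)

def indicator(event, w):
    return 1 if w in event else 0

def cond_exp(f, w):
    """E[f | G](w): the P-weighted average of f over the block containing w."""
    block = next(b for b in partition if w in b)
    return sum(f(v) * P[v] for v in block) / prob(block)

# P(A | G) as a random variable, computed as E[1_A | G].
cond_prob_A = {w: cond_exp(lambda v: indicator(A, v), w) for w in omega}

# Every set in G is a union of partition blocks (including the empty union).
sigma_field = [
    frozenset().union(*blocks)
    for r in range(len(partition) + 1)
    for blocks in combinations(partition, r)
]

# Defining property: the integral of P(A | G) over G equals P(A and G).
defining_property_holds = all(
    sum(cond_prob_A[w] * P[w] for w in G) == prob(A & G) for G in sigma_field
)

# Unconditional version: E(1_A) = P(A).
expectation_of_indicator = sum(indicator(A, w) * P[w] for w in omega)

print(cond_prob_A[1], cond_prob_A[2])  # 2/3 1/3
print(defining_property_holds)         # True
```

The resulting random variable is constant on each block of the partition, which is what $\mathcal{G}$-measurability means in this finite setting.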

See also

  • Conditioning (probability)
  • Conditional probability
  • Regular conditional probability
  • Bayes' theorem

References

Citations

  1. ^ Ross, Sheldon M. (1993). Introduction to Probability Models (5th ed.). San Diego: Academic Press. pp. 88–91. ISBN 0-12-598455-3.
  2. ^ Park, Kun Il (2018). Fundamentals of Probability and Stochastic Processes with Applications to Communications. Springer. ISBN 978-3-319-68074-3.
  3. ^ Billingsley (1995), p. 430
  4. ^ Billingsley (1995), p. 439

Sources

  • Billingsley, Patrick (1995). Probability and Measure (3rd ed.). New York, NY: John Wiley and Sons.


Source: https://en.wikipedia.org/wiki/Conditional_probability_distribution
