Foundations of Statistics A: Report 1
- Student ID: 1W23CF13
- Name: TAI Yungche
- Date: 2025/07/01
Question 1
To prove that the sum of the probabilities for a Poisson distribution equals 1, start with the definition of the probability mass function (PMF).
The PMF for a Poisson random variable X is given by: \(P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}\), for \(k = 0, 1, 2, \dots\) and \(\lambda > 0\).
Show that \(\sum_{k=0}^{\infty} P(X = k) = 1\).
Evaluate the sum: $$ \sum_{k=0}^{\infty} P(X = k) = \sum_{k=0}^{\infty} \frac{\lambda^k e^{-\lambda}}{k!} $$ Since the term \(e^{-\lambda}\) does not depend on the summation index \(k\), factor it out of the summation: $$ = e^{-\lambda} \sum_{k=0}^{\infty} \frac{\lambda^k}{k!} $$ The summation part is the Maclaurin series expansion for \(e^{\lambda}\): $$ \sum_{k=0}^{\infty} \frac{\lambda^k}{k!} = 1 + \frac{\lambda}{1!} + \frac{\lambda^2}{2!} + \frac{\lambda^3}{3!} + \dots = e^{\lambda} $$ Substituting this result back into the expression: $$ = e^{-\lambda} \cdot e^{\lambda} $$ $$ = e^{-\lambda + \lambda} $$ $$ = e^0 $$ $$ = 1 $$
So, it is shown that: $$ \sum_{k=0}^{\infty} P(X = k) = 1 $$
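As a quick numerical sanity check (a sketch in Python, not part of the proof; the rate \(\lambda = 3.7\) and the truncation at \(k = 100\) are arbitrary choices), the partial sums of the Poisson PMF can be evaluated directly:

```python
# Partial sums of the Poisson PMF should approach 1 for any lambda > 0.
import math

lam = 3.7  # arbitrary example rate parameter
total = sum(lam**k * math.exp(-lam) / math.factorial(k) for k in range(100))
print(total)  # ~1.0 up to floating-point error
```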
Question 2
Part (i)
The cumulative distribution function (CDF), \(F(x)\), is defined as \(F(x) = P(X \leq x) = \int_{-\infty}^{x} f(t) dt\).
For \(x < 0\), the density function \(f(x) = 0\). $$ F(x) = \int_{-\infty}^{x} 0 \, dt = 0 $$
For \(x \geq 0\), the density function is \(f(x) = \beta e^{-\beta x}\), so $$ F(x) = \int_{-\infty}^{0} 0 \, dt + \int_{0}^{x} \beta e^{-\beta t} \, dt = \left[ -e^{-\beta t} \right]_{0}^{x} = -e^{-\beta x} + 1 = 1 - e^{-\beta x} $$
Combining both cases, the distribution function \(F(x)\) is: $$ F(x) = \begin{cases} 1 - e^{-\beta x}, & \text{if } x \geq 0 \\ 0, & \text{if } x < 0 \end{cases} $$
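As an informal check (a sketch assuming SciPy is available; note that SciPy parametrizes the exponential distribution by scale \(= 1/\beta\), and \(\beta = 2\) is an arbitrary example value), the derived CDF can be compared with `scipy.stats.expon`:

```python
# Compare the derived F(x) = 1 - exp(-beta * x) with SciPy's exponential CDF.
import numpy as np
from scipy.stats import expon

beta = 2.0                                  # arbitrary example rate
x = np.linspace(0.0, 5.0, 11)
derived = 1.0 - np.exp(-beta * x)
reference = expon(scale=1.0 / beta).cdf(x)  # SciPy uses scale = 1/beta
print(np.allclose(derived, reference))      # True
```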
Part (ii)
This property is known as the memoryless property. By the definition of conditional probability: $$ P(X > t + s | X > s) = \frac{P(\{X > t + s\} \cap \{X > s\})}{P(X > s)} $$ Since \(s > 0\) and \(t > 0\), \(t+s > s\). Therefore, if \(X > t+s\), it is automatically true that \(X > s\). The intersection of the two events is simply \(\{X > t+s\}\). $$ P(X > t + s | X > s) = \frac{P(X > t + s)}{P(X > s)} $$ First, find the probability \(P(X > a)\) for any \(a > 0\). This is the survival function. $$ P(X > a) = 1 - P(X \leq a) = 1 - F(a) $$ From part (i), for \(a \geq 0\), \(F(a) = 1 - e^{-\beta a}\). $$ P(X > a) = 1 - (1 - e^{-\beta a}) = e^{-\beta a} $$ Substitute this into the expression for conditional probability, using \(s\) and \(t+s\) for \(a\). Since \(s, t > 0\), both \(s\) and \(t+s\) are positive. $$ P(X > s) = e^{-\beta s} $$ $$ P(X > t+s) = e^{-\beta(t+s)} $$ Therefore, $$ P(X > t + s | X > s) = \frac{e^{-\beta(t+s)}}{e^{-\beta s}} = \frac{e^{-\beta t}e^{-\beta s}}{e^{-\beta s}} = e^{-\beta t} $$ The right-hand side of the equation to be proven is \(P(X > t)\). Since \(t > 0\): $$ P(X > t) = e^{-\beta t} $$ Thus, it is shown that $$ P(X > t + s | X > s) = e^{-\beta t} = P(X > t) $$
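The memoryless property can also be illustrated by simulation (a rough Monte Carlo sketch; the values \(\beta = 1.5\), \(s = 0.8\), \(t = 1.2\) and the sample size are arbitrary choices):

```python
# Estimate P(X > t+s | X > s) and P(X > t) from exponential samples;
# both should be close to exp(-beta * t).
import numpy as np

rng = np.random.default_rng(0)
beta, s, t = 1.5, 0.8, 1.2
x = rng.exponential(scale=1.0 / beta, size=1_000_000)

conditional = np.mean(x[x > s] > t + s)  # P(X > t+s | X > s)
unconditional = np.mean(x > t)           # P(X > t)
print(conditional, unconditional, np.exp(-beta * t))
```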
Question 3
Let \(X_1\) and \(X_2\) be independent and identically distributed (i.i.d.) random variables with a common cumulative distribution function (CDF) \(F(x)\). That is, \(P(X_1 \leq x) = P(X_2 \leq x) = F(x)\).
Part (i)
Let \(Y = \max(X_1, X_2)\). The distribution function of \(Y\), denoted by \(G(y)\), is defined as \(G(y) = P(Y \leq y)\). The maximum of two numbers is at most \(y\) exactly when both numbers are at most \(y\): $$ G(y) = P(\max(X_1, X_2) \leq y) = P(X_1 \leq y, X_2 \leq y) $$ By independence of \(X_1\) and \(X_2\): $$ G(y) = P(X_1 \leq y) P(X_2 \leq y) = F(y)^2 $$
Part (ii)
Let \(Z = \min(X_1, X_2)\). The distribution function of \(Z\), denoted by \(H(z)\), is defined as \(H(z) = P(Z \leq z)\). It is easier to first compute the survival function \(P(Z > z)\): the minimum exceeds \(z\) exactly when both variables exceed \(z\). By independence, $$ P(Z > z) = P(X_1 > z, X_2 > z) = P(X_1 > z) P(X_2 > z) = (1 - F(z))^2 $$ Therefore, $$ H(z) = 1 - P(Z > z) = 1 - (1 - F(z))^2 $$
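Both results can be checked empirically (a minimal simulation sketch; taking \(F\) to be the standard normal CDF is an arbitrary example choice):

```python
# Empirical CDFs of max(X1, X2) and min(X1, X2) versus F(y)^2 and 1 - (1 - F(z))^2.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
x1, x2 = rng.standard_normal((2, 1_000_000))
y, z = np.maximum(x1, x2), np.minimum(x1, x2)

for point in (-1.0, 0.0, 1.0):
    print(np.mean(y <= point), norm.cdf(point) ** 2)            # G(y) vs F(y)^2
    print(np.mean(z <= point), 1 - (1 - norm.cdf(point)) ** 2)  # H(z) vs 1-(1-F(z))^2
```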
Question 4
Part (i)
This formula is often taken as the definition of variance. The proof demonstrates its relationship with the computational formula for variance, \(E[X^2] - (E[X])^2\). Let \(\mu = E[X]\).
Start by expanding the right-hand side: $$ E[(X - E[X])^2] = E[(X - \mu)^2] $$ Expanding the square inside the expectation: $$ = E[X^2 - 2\mu X + \mu^2] $$ By the linearity of expectation, \(E[A+B] = E[A] + E[B]\): $$ = E[X^2] - E[2\mu X] + E[\mu^2] $$ Since \(\mu=E[X]\) is a constant, use the properties \(E[cA] = cE[A]\) and \(E[c]=c\): $$ = E[X^2] - 2\mu E[X] + \mu^2 $$ Substitute \(\mu = E[X]\) back into the expression: $$ = E[X^2] - 2E[X]E[X] + (E[X])^2 $$ $$ = E[X^2] - 2(E[X])^2 + (E[X])^2 $$ $$ = E[X^2] - (E[X])^2 $$ This is the common computational formula for variance. Thus, the two forms are equivalent. If variance is defined as \(\text{Var}[X] := E[X^2] - (E[X])^2\), then it is proven that \(\text{Var}[X] = E[(X - E[X])^2]\).
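A small numerical check of the two forms of the variance, using a fair six-sided die as an arbitrary example distribution:

```python
# E[(X - mu)^2] and E[X^2] - (E[X])^2 for a fair die both equal 35/12.
import numpy as np

values = np.arange(1, 7)
probs = np.full(6, 1 / 6)

mu = np.sum(values * probs)
definition = np.sum((values - mu) ** 2 * probs)     # E[(X - E[X])^2]
computational = np.sum(values**2 * probs) - mu**2   # E[X^2] - (E[X])^2
print(definition, computational)                    # both 2.9166...
```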
Part (ii)
Use the definition of variance, \(\text{Var}[Y] = E[(Y - E[Y])^2]\), where \(Y = aX + b\).
First, find the expected value of \(aX+b\): $$ E[aX + b] = aE[X] + b $$ Now substitute this into the variance formula: $$ \text{Var}[aX + b] = E[((aX + b) - E[aX + b])^2] $$ $$ = E[((aX + b) - (aE[X] + b))^2] $$ Simplify the expression inside the expectation: $$ = E[(aX + b - aE[X] - b)^2] $$ $$ = E[(aX - aE[X])^2] $$ $$ = E[(a(X - E[X]))^2] $$ $$ = E[a^2(X - E[X])^2] $$ Factor the constant \(a^2\) out of the expectation: $$ = a^2 E[(X - E[X])^2] $$ By the definition of variance, \(E[(X - E[X])^2] = \text{Var}[X]\). $$ = a^2 \text{Var}[X] $$ Thus, it is proven that \(\text{Var}[aX + b] = a^2\text{Var}[X]\).
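As a quick simulation check (the values \(a = 3\), \(b = 5\) and the choice of a standard normal \(X\) are arbitrary):

```python
# Var[aX + b] should match a^2 Var[X]; the shift b has no effect.
import numpy as np

rng = np.random.default_rng(2)
a, b = 3.0, 5.0
x = rng.standard_normal(1_000_000)
print(np.var(a * x + b), a**2 * np.var(x))  # both close to 9
```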
Question 5
Part (i)
The marginal density \(f_X(x)\) is obtained by integrating the joint density \(f_{X,Y}(x, y)\) with respect to \(y\). $$ f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x, y) \, dy = \int_{-\infty}^{\infty} \frac{1}{2\pi\sqrt{1-\rho^2}} \exp\left(-\frac{1}{2(1-\rho^2)}(x^2 + y^2 - 2\rho xy)\right) \, dy $$ The exponent can be rewritten by completing the square for the variable \(y\): $$ x^2 + y^2 - 2\rho xy = y^2 - 2(\rho x)y + x^2 = (y - \rho x)^2 - (\rho x)^2 + x^2 = (y - \rho x)^2 + x^2(1-\rho^2) $$ Substituting this back into the exponent of the exponential function gives: $$ -\frac{1}{2(1-\rho^2)} \left( (y - \rho x)^2 + x^2(1-\rho^2) \right) = -\frac{(y - \rho x)^2}{2(1-\rho^2)} - \frac{x^2}{2} $$ The integral for \(f_X(x)\) becomes: $$ f_X(x) = \int_{-\infty}^{\infty} \frac{1}{2\pi\sqrt{1-\rho^2}} \exp\left(-\frac{(y - \rho x)^2}{2(1-\rho^2)} - \frac{x^2}{2}\right) \, dy $$ $$ f_X(x) = \frac{1}{2\pi\sqrt{1-\rho^2}} e^{-x^2/2} \int_{-\infty}^{\infty} \exp\left(-\frac{(y - \rho x)^2}{2(1-\rho^2)}\right) \, dy $$ The integral is the integral of the kernel of a normal probability density function with mean \(\mu = \rho x\) and variance \(\sigma^2 = 1-\rho^2\). The value of such an integral is \(\sqrt{2\pi\sigma^2}\). $$ \int_{-\infty}^{\infty} \exp\left(-\frac{(y - \rho x)^2}{2(1-\rho^2)}\right) \, dy = \sqrt{2\pi(1-\rho^2)} $$ Substituting this result back: $$ f_X(x) = \frac{1}{2\pi\sqrt{1-\rho^2}} e^{-x^2/2} \cdot \sqrt{2\pi(1-\rho^2)} $$ $$ f_X(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2} $$ This is the probability density function for a standard normal distribution, \(N(0, 1)\).
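The marginalization can be verified numerically (a sketch assuming SciPy; \(\rho = 0.6\) and \(x = 0.75\) are arbitrary example values):

```python
# Integrate the joint density over y and compare with the N(0, 1) density at x.
import numpy as np
from scipy.integrate import quad

rho, x = 0.6, 0.75

def joint(y):
    c = 1.0 / (2 * np.pi * np.sqrt(1 - rho**2))
    return c * np.exp(-(x**2 + y**2 - 2 * rho * x * y) / (2 * (1 - rho**2)))

marginal, _ = quad(joint, -np.inf, np.inf)
print(marginal, np.exp(-x**2 / 2) / np.sqrt(2 * np.pi))  # should agree
```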
Part (ii)
To compute \(E[XY]\), use the law of total expectation: \(E[XY] = E[E[XY|X]]\). $$ E[XY|X=x] = E[xY|X=x] = xE[Y|X=x] $$ The conditional density of \(Y\) given \(X=x\) is \(f_{Y|X}(y|x) = \frac{f_{X,Y}(x,y)}{f_X(x)}\). Using the expressions from part (i): $$ f_{X,Y}(x,y) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2} \cdot \frac{1}{\sqrt{2\pi(1-\rho^2)}} \exp\left(-\frac{(y - \rho x)^2}{2(1-\rho^2)}\right) = f_X(x) \cdot \frac{1}{\sqrt{2\pi(1-\rho^2)}} \exp\left(-\frac{(y - \rho x)^2}{2(1-\rho^2)}\right) $$ So, the conditional density is: $$ f_{Y|X}(y|x) = \frac{1}{\sqrt{2\pi(1-\rho^2)}} \exp\left(-\frac{(y - \rho x)^2}{2(1-\rho^2)}\right) $$ This is the PDF of a normal distribution with mean \(E[Y|X=x] = \rho x\) and variance \(1-\rho^2\). Thus, \(E[Y|X] = \rho X\). Now, compute \(E[XY]\): $$ E[XY] = E[X \cdot E[Y|X]] = E[X \cdot (\rho X)] = E[\rho X^2] = \rho E[X^2] $$ From part (i), it is known that \(X\) has a standard normal distribution, \(X \sim N(0, 1)\). The variance of \(X\) is \(\text{Var}[X] = 1\). $$ \text{Var}[X] = E[X^2] - (E[X])^2 $$ $$ 1 = E[X^2] - (0)^2 \implies E[X^2] = 1 $$ Therefore, $$ E[XY] = \rho \cdot 1 = \rho $$
To find the value of \(\rho\) for which \(E[XY] = E[X]E[Y]\), first find \(E[X]\) and \(E[Y]\). Since \(X \sim N(0,1)\), \(E[X] = 0\). By symmetry of the joint PDF, \(Y\) also follows a standard normal distribution, \(Y \sim N(0,1)\), so \(E[Y] = 0\). Thus, \(E[X]E[Y] = 0 \cdot 0 = 0\). The condition becomes: $$ E[XY] = 0 $$ $$ \rho = 0 $$ The condition \(E[XY] = E[X]E[Y]\) holds when \(\rho=0\). This corresponds to the case where \(X\) and \(Y\) are uncorrelated, and for a joint normal distribution, this implies they are independent.
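A Monte Carlo check of \(E[XY] = \rho\) (\(\rho = 0.6\) and the sample size are arbitrary choices; setting \(\rho = 0\) makes both printed estimates close to 0):

```python
# Sample from the standard bivariate normal with correlation rho and
# compare the empirical E[XY] with rho and with E[X]E[Y].
import numpy as np

rng = np.random.default_rng(3)
rho = 0.6
cov = [[1.0, rho], [rho, 1.0]]
xy = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=1_000_000)
x, y = xy[:, 0], xy[:, 1]

print(np.mean(x * y), rho)       # E[XY] estimate vs rho
print(np.mean(x) * np.mean(y))   # E[X]E[Y], close to 0
```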
Question 6
To prove that the random variables \(X\) and \(-X\) have the same distribution, show that their characteristic functions are identical. By the uniqueness theorem for characteristic functions, if two random variables have the same characteristic function, they have the same distribution.
Let \(\varphi_X(t)\) be the characteristic function of \(X\). By definition: $$ \varphi_X(t) = E[e^{itX}] $$ Let \(\varphi_{-X}(t)\) be the characteristic function of \(-X\). By definition: $$ \varphi_{-X}(t) = E[e^{it(-X)}] = E[e^{i(-t)X}] = \varphi_X(-t) $$ The goal is to show that \(\varphi_X(t) = \varphi_{-X}(t)\) for all \(t\), which by the identity above is equivalent to showing that \(\varphi_X(t) = \varphi_X(-t)\).
It is given that \(\varphi_X(t)\) is a real-valued function. For any complex number \(z\), if \(z\) is real, then \(z = \bar{z}\), where \(\bar{z}\) is the complex conjugate of \(z\). Therefore: $$ \varphi_X(t) = \overline{\varphi_X(t)} $$ Compute the complex conjugate of \(\varphi_X(t)\) from its definition: $$ \overline{\varphi_X(t)} = \overline{E[e^{itX}]} $$ The conjugate of the expectation is the expectation of the conjugate: $$ = E[\overline{e^{itX}}] $$ Using Euler's formula, \(\overline{e^{i\theta}} = \overline{\cos(\theta) + i\sin(\theta)} = \cos(\theta) - i\sin(\theta) = e^{-i\theta}\). Thus: $$ = E[e^{-itX}] = E[e^{i(-t)X}] $$ This is the definition of the characteristic function of \(X\) evaluated at \(-t\): $$ = \varphi_X(-t) $$ So it is shown that \(\overline{\varphi_X(t)} = \varphi_X(-t)\).
Since \(\varphi_X(t)\) is real, \(\varphi_X(t) = \overline{\varphi_X(t)}\). Combining this with our derived property: $$ \varphi_X(t) = \varphi_X(-t) $$ As established earlier, \(\varphi_{-X}(t) = \varphi_X(-t)\). Therefore: $$ \varphi_X(t) = \varphi_{-X}(t) $$ Since the characteristic functions of \(X\) and \(-X\) are identical, by the uniqueness theorem, \(X\) and \(-X\) have the same distribution. This means the distribution of \(X\) is symmetric about 0.
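For illustration (a sketch using a standard normal \(X\), which is symmetric about 0, as an arbitrary example; \(t = 1.3\) is also arbitrary), the empirical characteristic function is essentially real and the empirical CDFs of \(X\) and \(-X\) coincide:

```python
# Empirical characteristic function of a symmetric sample and the
# empirical CDFs of X and -X at a few points.
import numpy as np

rng = np.random.default_rng(4)
x = rng.standard_normal(200_000)

t = 1.3
phi = np.mean(np.exp(1j * t * x))  # empirical E[exp(itX)]
print(phi.real, phi.imag)          # imaginary part is near 0

for q in (-1.0, 0.0, 1.0):
    print(np.mean(x <= q), np.mean(-x <= q))  # empirical CDFs of X and -X agree
```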
