

Probability - Discrete Probability Distributions

First created in February 2017

Random Variables

A random variable is a real-valued function defined on a sample space. One can imagine $X$ as a function that accepts no argument and produces a result randomly. Despite the random nature of $X$, the probability of each result (i.e. the frequency of occurrences in an infinite number of trials) is well defined, and can therefore be studied.

Cumulative Distribution Function: $F_X(x)=P(X\le x),~$ where $x\in\mathbb{R}$. $F_X$ is non-decreasing.

For an interval $\displaystyle [a,b], P(a<X\le b)=F_X(b)-F_X(a),~~0=\lim_{x\to -\infty}F_X(x)\le F_X(a)\le F_X(b)\le\lim_{x\to +\infty}F_X(x)=1.$

Discrete Random Variables

A random variable $X$ is discrete if its image is countable.

$\displaystyle p_k=P(X=x_k),~F(x)=\sum_{k\le x}p_k,~0\le p_k\le 1\text{ and }\sum_k p_k=1.$
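As a concrete illustration, here is a minimal Python sketch (using a hypothetical fair six-sided die) that builds the pmf, checks that the $p_k$ sum to 1, and evaluates the CDF by summing $p_k$ over $x_k\le x$:

```python
# pmf of a fair six-sided die: p_k = 1/6 for x_k = 1, ..., 6
pmf = {k: 1/6 for k in range(1, 7)}

def cdf(x, pmf):
    """F_X(x) = P(X <= x): sum of p_k over all x_k <= x."""
    return sum(p for xk, p in pmf.items() if xk <= x)

assert abs(sum(pmf.values()) - 1) < 1e-12  # the p_k sum to 1
print(cdf(3, pmf))                # P(X <= 3) = 0.5
print(cdf(5, pmf) - cdf(2, pmf))  # P(2 < X <= 5) = F(5) - F(2) = 0.5
```

The last line also illustrates the interval formula $P(a<X\le b)=F_X(b)-F_X(a)$ from above.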

Expected Value (or mean): $\displaystyle\mathrm{E}(X)=\sum_k x_kp_k.$

Note: Consider a $k$-dimensional space with a value vector $\mathbf{x}=(x_1,\ldots,x_k)$ and a probability vector $\mathbf{p}=(p_1,\ldots,p_k)$; then $\mathrm{E}(X)=\mathbf{x}\cdot\mathbf{p}.$
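
In code, the dot-product view is a one-liner; a minimal sketch with a hypothetical loaded die (the probabilities are made up for illustration):

```python
# Hypothetical loaded die: value vector x and probability vector p
xs = [1, 2, 3, 4, 5, 6]
ps = [0.1, 0.1, 0.1, 0.1, 0.1, 0.5]

# E(X) = x . p, the dot product of the two vectors
mean = sum(x * p for x, p in zip(xs, ps))
print(mean)  # 4.5
```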


Theorem: If $\displaystyle Y=g(X),~\mathrm{E}(Y)=\mathrm{E}(g(X))=\sum_k g(x_k)p_k.$
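A quick numeric check of this theorem, with $g(x)=x^2$ and an arbitrary three-point distribution:

```python
# Arbitrary three-point distribution and g(x) = x^2
xs = [0, 1, 2]
ps = [0.25, 0.5, 0.25]
g = lambda x: x**2

# E(g(X)) = sum_k g(x_k) p_k -- no need to find the distribution of Y = g(X)
print(sum(g(x) * p for x, p in zip(xs, ps)))  # 1.5
```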


Theorem: $\displaystyle \mathrm{E}(g(X))=g(\mathrm{E}(X)),$ where $g(x)=ax+b$ is linear.

Proof: $\displaystyle\mathrm{E}(g(X))=\sum_k (ax_k+b)\,p_k=a\sum_k x_k p_k+b\sum_k p_k=a\,\mathrm{E}(X)+b=g(\mathrm{E}(X)),$ using $\sum_k p_k=1$.

This property is called linearity of expectation: the expected value of a linear transformation is the linear transformation of the expected value.


Variance: $\displaystyle \mathrm{Var}(X)=\mathrm{E}\big((X-\mathrm{E}(X))^2\big)=\sum_k\big(x_k-\mu\big)^2p_k, \text{ where }\mu=\mathrm{E}(X) .$

Note: the variance is obtained by taking the deviation of the random variable from its expected value, $\big(X-\mathrm{E}(X)\big)$, squaring it, $\big(X-\mathrm{E}(X)\big)^2$, and then taking the expected value of this squared deviation.


Standard Deviation: $\mathrm{SD}(X)=\sqrt{\mathrm{Var}(X)}.$


Theorem: $\mathrm{Var}(X)=\mathrm{E}(X^2)-\big(\mathrm{E}(X)\big)^2.$ (Variance is the expected value of square minus the square of the expected value.)

Proof: Let $\mu=\mathrm{E}(X)$ and $g(X)=(X-\mu)^2$, then $\mathrm{Var}(X) =\sum_k(x_k-\mu)^2p_k =\sum_k(x_k^2-2x_k\mu+\mu^2)~p_k$

$\qquad=\sum_k x_k^2p_k-2\mu\sum_k x_k p_k+\mu^2\sum_k p_k =\mathrm{E}(X^2)-2\mu^2+\mu^2=\mathrm{E}(X^2)-\big(\mathrm{E}(X)\big)^2 .$

It follows that $\mathrm{E}(X^2)=\mu^2+\sigma^2,$ where $\sigma^2=\mathrm{Var}(X).$
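
Both formulas for the variance are easy to compare numerically; a small sketch using a fair die:

```python
# Fair die: compare the definition of variance with E(X^2) - E(X)^2
xs = [1, 2, 3, 4, 5, 6]
ps = [1/6] * 6

mu = sum(x * p for x, p in zip(xs, ps))                   # E(X) = 3.5
var_def = sum((x - mu)**2 * p for x, p in zip(xs, ps))    # E((X - mu)^2)
var_alt = sum(x * x * p for x, p in zip(xs, ps)) - mu**2  # E(X^2) - mu^2
print(var_def, var_alt)  # both 2.9166...
```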


Theorem: $\displaystyle \mathrm{E}(aX+b)=a\mathrm{E}(X)+b,~~ \mathrm{Var}(aX+b)=a^2\mathrm{Var}(X),~~ \mathrm{SD}(aX+b)=|a|\mathrm{SD}(X).~~$ ($a$ and $b$ are constants.)

Proof: $\mathrm{E}(aX+b)=a\mathrm{E}(X)+b,~$ as the function is linear.

$\displaystyle \qquad\mathrm{Var}(aX+b) =\mathrm{E}\Big(\big((aX+b)-\mathrm{E}(aX+b)\big)^2\Big) =\mathrm{E}\Big(\big(aX+b-a\mathrm{E}(X)-b\big)^2\Big) =\mathrm{E}\big(a^2(X-\mathrm{E}(X))^2\big)$

$\displaystyle \qquad\qquad=a^2\mathrm{E}\big((X-\mathrm{E}(X))^2\big) =a^2\mathrm{Var}(X) .$

$\displaystyle \qquad\mathrm{SD}(aX+b) =\sqrt{\mathrm{Var}(aX+b)} =\sqrt{a^2\mathrm{Var}(X)} =|a|\sqrt{\mathrm{Var}(X)} =|a|\mathrm{SD}(X) .$
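A numeric check of these rules, with arbitrary constants $a=2$, $b=7$ and a fair die for $X$:

```python
a, b = 2.0, 7.0            # arbitrary constants
xs = [1, 2, 3, 4, 5, 6]    # fair die
ps = [1/6] * 6

def mean_var(xs, ps):
    mu = sum(x * p for x, p in zip(xs, ps))
    return mu, sum((x - mu)**2 * p for x, p in zip(xs, ps))

mX, vX = mean_var(xs, ps)
mY, vY = mean_var([a * x + b for x in xs], ps)  # Y = aX + b keeps the same p_k
print(mY, a * mX + b)      # 14.0 = 14.0
print(vY, a * a * vX)      # 11.666... = 11.666...
```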

Binomial Distribution

$\displaystyle B(n,p,k)=\binom{n}{k}p^k(1-p)^{n-k}~$ for $n\in\mathbb{N}$ and $k=0,1,\ldots,n.$

The value represents the probability of exactly $k$ successes in $n$ trials. Therefore, summing the values over all possible $k=0,1,\ldots,n$ gives unity.

Let $\displaystyle q=1-p.~\sum_k B(n,p,k)=\sum_{k=0}^n B(n,p,k)=\sum_{k=0}^n\binom{n}{k}p^k q^{n-k}=(p+q)^n=1 .$

If $X$ is the random variable that counts the successes of some Bernoulli process (a process that gives a boolean outcome) with $n$ trials, each having success probability $p$, then $X$ has the binomial distribution $B(n,p)$. We write $X\sim B(n,p).$
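
A minimal Python sketch of the pmf (using the standard-library `math.comb` for the binomial coefficient), verifying that the probabilities sum to unity for an arbitrary choice of $n$ and $p$:

```python
from math import comb

def binom_pmf(n, p, k):
    """B(n, p, k): probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.3  # arbitrary parameters
total = sum(binom_pmf(n, p, k) for k in range(n + 1))
print(total)    # 1.0 (up to floating-point error), i.e. (p + q)^n = 1
```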


$\mathrm{E}(X)=np.$

Proof: $\displaystyle p+q=1,~~ \frac{d}{dp}\left(\frac{p}{q}\right)=\frac{~1~}{q^2},~ \frac{d}{dp}\left(\frac{p}{q}\right)^k=k\left(\frac{p}{q}\right)^{k-1}\frac{1}{q^2}~,~ k\left(\frac{p}{q}\right)^{k-1}=q^2\frac{d}{dp}\left(\frac{p}{q}\right)^k .$

$\displaystyle x_k=k,~\mathrm{E}(X) =\sum_{k=0}^n x_k p_k =\sum_{k=0}^n k\binom{n}{k}p^k q^{n-k} =pq^{n-1}\sum_{k=0}^n \binom{n}{k}k\left(\frac{p}{q}\right)^{k-1} =pq^{n-1}\sum_{k=0}^n \binom{n}{k}q^2\frac{d}{dp}\left(\frac{p}{q}\right)^k$

$\displaystyle \qquad=pq^{n+1}\frac{d}{dp}\left[\sum_{k=0}^n \binom{n}{k}\left(\frac{p}{q}\right)^k\right] =pq^{n+1}\frac{d}{dp}\left[1+\frac{p}{q}\right]^n =pq^{n+1}\frac{d}{dp}\left[\frac{~1~}{q}\right]^n =pq^{n+1}\frac{n}{{q}^{n+1}} =np .$


$\mathrm{Var}(X)=npq=np(1-p).$

Proof: $\displaystyle \frac{d^2}{dp^2}\left(\frac{p}{q}\right)^k =\frac{d}{dp}\left[k\left(\frac{p}{q}\right)^{k-1}\frac{1}{q^2}\right] =\frac{k}{q^2}\frac{d}{dp}\left[\left(\frac{p}{q}\right)^{k-1}\right]+k\left(\frac{p}{q}\right)^{k-1}\frac{d}{dp}\frac{1}{q^2}$

$\displaystyle \qquad=\frac{k(k-1)}{q^4}\left(\frac{p}{q}\right)^{k-2}+2k\left(\frac{p}{q}\right)^{k-1}\frac{1}{q^3} =\frac{k^2}{q^4}\left(\frac{p}{q}\right)^{k-2}-\frac{k}{q^4}\left(\frac{p}{q}\right)^{k-2}+2k\left(\frac{p}{q}\right)^{k-1}\frac{1}{q^3}$

$\displaystyle \qquad=\frac{k^2}{q^4}\left(\frac{p}{q}\right)^{k-2}-k\left(\frac{p}{q}\right)^{k-1}\left(\frac{1}{q^4}\left(\frac{p}{q}\right)^{-1}-\frac{2}{q^3}\right) =\frac{k^2}{q^4}\left(\frac{p}{q}\right)^{k-2}-q^2\frac{d}{dp}\left(\frac{p}{q}\right)^k\cdot \left(\frac{1}{q^4}\left(\frac{p}{q}\right)^{-1}-\frac{2}{q^3}\right)$

$\displaystyle \qquad=\frac{k^2}{q^4}\left(\frac{p}{q}\right)^{k-2}-\frac{1}{q^4}\frac{d}{dp}\left(\frac{p}{q}\right)^k\cdot \left(\frac{q^3}{p}-2q^3\right) .$

$\displaystyle ~~~k^2\left(\frac{p}{q}\right)^{k-2} =q^4\frac{d^2}{dp^2}\left(\frac{p}{q}\right)^k+\frac{d}{dp}\left(\frac{p}{q}\right)^k\cdot\left(\frac{q^3}{p}-2q^3\right) .$

Let $\displaystyle \mu=\mathrm{E}(X)=np,~ \mathrm{Var}(X) =\mathrm{E}(X^2)-\big(\mathrm{E}(X)\big)^2 =\left[\sum_{k=0}^n x_k^2p_k\right]-(np)^2 =\left[\sum_{k=0}^n k^2\binom{n}{k}p^k q^{n-k}\right]-(np)^2$

$\displaystyle \qquad=p^2q^{n-2}\left[\sum_{k=0}^n\binom{n}{k}k^2\left(\frac{p}{q}\right)^{k-2}\right]-(np)^2$

$\displaystyle \qquad=p^2q^{n-2}\left[\sum_{k=0}^n\binom{n}{k}\left[q^4\frac{d^2}{dp^2}\left(\frac{p}{q}\right)^k+\frac{d}{dp}\left(\frac{p}{q}\right)^k\cdot\left(\frac{q^3}{p}-2q^3\right)\right]\right]-(np)^2$

$\displaystyle \qquad=p^2q^{n-2}\left[q^4\frac{d^2}{dp^2}\sum_{k=0}^n\binom{n}{k}\left(\frac{p}{q}\right)^k+\left(\frac{q^3}{p}-2q^3\right)\cdot\frac{d}{dp}\sum_{k=0}^n\binom{n}{k}\left(\frac{p}{q}\right)^k\right]-(np)^2$

$\displaystyle \qquad=p^2q^{n-2}\left[q^4\frac{d^2}{dp^2}\left(1+\frac{p}{q}\right)^n+\left(\frac{q^3}{p}-2q^3\right)\cdot\frac{d}{dp}\left(1+\frac{p}{q}\right)^n\right]-(np)^2$

$\displaystyle \qquad=p^2q^{n-2}\left[q^4\frac{d^2}{dp^2}q^{-n}+\left(\frac{q^3}{p}-2q^3\right)\cdot\frac{d}{dp}q^{-n}\right]-(np)^2$

$\displaystyle \qquad=p^2q^{n-2}\left[q^4n(n+1)q^{-n-2}+\left(\frac{q^3}{p}-2q^3\right)nq^{-n-1}\right]-(np)^2$

$\displaystyle \qquad=p^2q^{n-2}q^4n(n+1)q^{-n-2}+p^2q^{n-2}\frac{q^3}{p}nq^{-n-1}-p^2q^{n-2}\cdot 2q^3nq^{-n-1}-(np)^2$

$\displaystyle \qquad=n(n+1)p^2+np-2np^2-n^2p^2 =n^2p^2+np^2+np-2np^2-n^2p^2 =np-np^2=np(1-p) .$
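
The two results $\mathrm{E}(X)=np$ and $\mathrm{Var}(X)=npq$ can be checked numerically from the pmf; a short sketch with arbitrary $n=10$, $p=0.3$:

```python
from math import comb

n, p = 10, 0.3
pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

mean = sum(k * pk for k, pk in enumerate(pmf))
var = sum(k * k * pk for k, pk in enumerate(pmf)) - mean**2
print(mean, n * p)            # 3.0 = np
print(var, n * p * (1 - p))   # 2.1 = np(1-p)
```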

Geometric Distribution

$G(p,k)=(1-p)^{k-1}p~$ for $k=1,2,\ldots~.$

Let $\displaystyle q=1-p.~\sum_{k=1}^\infty G(p,k)=\sum_{k=1}^\infty q^{k-1}p=p\sum_{k=0}^\infty q^k=p\cdot\frac{1}{1-q}=1 .$

Consider an infinite Bernoulli process of trials, each with success probability $p$. If the random variable $X$ is the number of trials conducted until the first success occurs, then $X$ has the geometric distribution $G(p)$. We write $X\sim G(p).$
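
A minimal simulation sketch of this process (the sample size and seed are arbitrary); the empirical mean of the trial counts should approach $1/p$, derived below:

```python
import random

def trials_until_success(p, rng):
    """Count Bernoulli(p) trials up to and including the first success."""
    k = 1
    while rng.random() >= p:  # failure with probability q = 1 - p
        k += 1
    return k

rng = random.Random(0)  # arbitrary seed for reproducibility
p = 0.25
samples = [trials_until_success(p, rng) for _ in range(100_000)]
print(sum(samples) / len(samples))  # close to E(X) = 1/p = 4
```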


$\displaystyle\mathrm{E}(X)=\frac{1}{p}.$

Proof: $\displaystyle\mathrm{E}(X) =\sum_{k=1}^\infty x_k p_k =\sum_{k=1}^\infty kq^{k-1}p =\sum_{k=1}^\infty\frac{d}{dq}q^k\cdot p =p\cdot\frac{d}{dq}\sum_{k=1}^\infty q^k =p\cdot\frac{d}{dq}\left(\frac{q}{1-q}\right) =p\cdot\frac{d}{dq}\left(\frac{q}{p}\right) =p\cdot\frac{p+q}{p^2} =\frac{1}{p} .$


$\displaystyle\mathrm{Var}(X)=\frac{q}{p^2}=\frac{1-p}{p^2}.$

Proof: $\displaystyle \frac{d}{dq}q^k=kq^{k-1},~ \frac{d^2}{dq^2}q^k =k(k-1)q^{k-2} =(k^2q^{k-1}-kq^{k-1})q^{-1},$

$\displaystyle ~~~k^2q^{k-1} =q\frac{d^2}{dq^2}q^k+kq^{k-1} =q\frac{d^2}{dq^2}q^k+\frac{d}{dq}q^k .$

$\displaystyle \qquad\mathrm{Var}(X) =\mathrm{E}(X^2)-\big(\mathrm{E}(X)\big)^2 =\left(\sum_{k=1}^\infty x_k^2 p_k\right)-\left(\frac{1}{p}\right)^2 =\left(\sum_{k=1}^\infty k^2q^{k-1}p\right)-p^{-2}$

$\displaystyle \qquad=p\sum_{k=1}^\infty\left(q\frac{d^2}{dq^2}q^k+\frac{d}{dq}q^k\right)-p^{-2} =p\left(q\frac{d^2}{dq^2}\sum_{k=1}^\infty q^k+\frac{d}{dq}\sum_{k=1}^\infty q^k\right)-p^{-2} =p\left(q\frac{d^2}{dq^2}\left(\frac{q}{1-q}\right)+\frac{d}{dq}\left(\frac{q}{1-q}\right)\right)-p^{-2}$

$\displaystyle \qquad=p\left(q\frac{d^2}{dq^2}\left(\frac{q}{p}\right)+\frac{d}{dq}\left(\frac{q}{p}\right)\right)-p^{-2} =p\left(q\cdot\frac{2}{p^3}+\frac{1}{p^2}\right)-p^{-2} =\frac{2q}{p^2}+\frac{p}{p^2}-\frac{1}{p^2} =\frac{2q+p-1}{p^2} =\frac{1-p}{p^2} .$
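
Since the pmf decays geometrically, truncating the infinite sums gives a quick numeric check of both moments; a sketch with arbitrary $p=0.25$:

```python
p = 0.25
q = 1 - p
# Truncate the infinite sums; for p = 0.25 the tail beyond k = 1000 is negligible.
pmf = [(k, q**(k - 1) * p) for k in range(1, 1001)]

mean = sum(k * pk for k, pk in pmf)
var = sum(k * k * pk for k, pk in pmf) - mean**2
print(mean, 1 / p)     # 4.0 = 1/p
print(var, q / p**2)   # 12.0 = q/p^2
```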


Theorem: $P(X>n)=(1-p)^n=q^n,$ where $X\sim G(p),~~n=1,2,\ldots~.$

Proof: $\displaystyle P(X>n)=\sum_{k=n+1}^\infty q^{k-1}p=pq^n\sum_{k=0}^\infty q^k=pq^n\cdot\frac{1}{1-q}=pq^n\cdot\frac{1}{p}=q^n .$
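
A numeric check of the tail formula, again truncating the infinite sum (with arbitrary $p=0.25$, $n=6$):

```python
p, n = 0.25, 6
q = 1 - p
tail = sum(q**(k - 1) * p for k in range(n + 1, 2000))  # truncated P(X > n)
print(tail, q**n)  # both 0.17797...
```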

End of Article

 
