Mathematical Foundations of Machine Learning (Part 2)
Eigenvalues and Eigenvectors of Matrices
1. Eigenvalues and eigenvectors: concepts and properties
(1) Let $\lambda$ be an eigenvalue of $A$. Then
$kA,\ aA + bE,\ A^{2},\ A^{m},\ f(A),\ A^{T},\ A^{-1},\ A^{*}$ each have an eigenvalue equal to, respectively,
$k\lambda,\ a\lambda + b,\ \lambda^{2},\ \lambda^{m},\ f(\lambda),\ \lambda,\ \lambda^{-1},\ \frac{|A|}{\lambda}$, with the same corresponding eigenvector (except for $A^{T}$). Here $E$ denotes the identity matrix, and the entries for $A^{-1}$ and $A^{*}$ assume that $A$ is invertible.
(2) If $\lambda_{1},\lambda_{2},\cdots,\lambda_{n}$ are the $n$ eigenvalues of $A$, then $\sum_{i=1}^{n}\lambda_{i} = \sum_{i=1}^{n}a_{ii}$ and $\prod_{i=1}^{n}\lambda_{i} = |A|$, and hence $|A| \neq 0 \Leftrightarrow A$ has no zero eigenvalue.
(3) Let $\lambda_{1},\lambda_{2},\cdots,\lambda_{s}$ be $s$ eigenvalues of $A$ with corresponding eigenvectors $\alpha_{1},\alpha_{2},\cdots,\alpha_{s}$. If
$\alpha = k_{1}\alpha_{1} + k_{2}\alpha_{2} + \cdots + k_{s}\alpha_{s}$,
then
$A^{n}\alpha = k_{1}A^{n}\alpha_{1} + k_{2}A^{n}\alpha_{2} + \cdots + k_{s}A^{n}\alpha_{s} = k_{1}\lambda_{1}^{n}\alpha_{1} + k_{2}\lambda_{2}^{n}\alpha_{2} + \cdots + k_{s}\lambda_{s}^{n}\alpha_{s}$.
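The trace and determinant identities in (2) can be checked by hand on a small case. The sketch below is only an illustration: the helper `eigenvalues_2x2` and the example matrix $\begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}$ are assumptions chosen for this example, and the helper handles real eigenvalues only.

```python
import math

def eigenvalues_2x2(a, b, c, d):
    """Real eigenvalues of [[a, b], [c, d]] from the characteristic polynomial."""
    trace = a + d
    det = a * d - b * c
    disc = trace * trace - 4 * det  # discriminant of lambda^2 - trace*lambda + det
    if disc < 0:
        raise ValueError("complex eigenvalues; this sketch covers the real case only")
    root = math.sqrt(disc)
    return (trace - root) / 2, (trace + root) / 2

# Illustrative matrix A = [[2, 1], [1, 2]]
lam1, lam2 = eigenvalues_2x2(2, 1, 1, 2)
trace_A = 2 + 2          # sum of the diagonal entries
det_A = 2 * 2 - 1 * 1    # determinant of A

print(lam1 + lam2 == trace_A)  # sum of eigenvalues equals the trace
print(lam1 * lam2 == det_A)    # product of eigenvalues equals the determinant
```

Because the determinant here is nonzero, neither eigenvalue is zero, which matches the last statement in (2).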
2. Similarity transformations and similar matrices: concepts and properties
(1) If $A \sim B$, then
1) $A^{T} \sim B^{T}$, $A^{-1} \sim B^{-1}$ (when $A$ is invertible), $A^{*} \sim B^{*}$
2) $|A| = |B|$, $\sum_{i=1}^{n}a_{ii} = \sum_{i=1}^{n}b_{ii}$, $r(A) = r(B)$
3) $|\lambda E - A| = |\lambda E - B|$ holds for all $\lambda$
3. Necessary and sufficient conditions for a matrix to be diagonalizable by similarity
(1) Let $A$ be an $n \times n$ matrix. Then $A$ is diagonalizable $\Leftrightarrow$ for every eigenvalue $\lambda_{i}$ of multiplicity $k_{i}$, $n - r(\lambda_{i}E - A) = k_{i}$.
(2) If $A$ is diagonalizable, then from $P^{-1}AP = \Lambda$ we get $A = P\Lambda P^{-1}$, and hence $A^{n} = P\Lambda^{n}P^{-1}$.
(3) Important conclusions
1) If $A \sim B$ and $C \sim D$, then $\begin{bmatrix} A & O \\ O & C \end{bmatrix} \sim \begin{bmatrix} B & O \\ O & D \end{bmatrix}$.
2) If $A \sim B$, then $f(A) \sim f(B)$ and $\left| f(A) \right| = \left| f(B) \right|$, where $f(A)$ is a polynomial in the $n \times n$ matrix $A$.
3) If $A$ is diagonalizable, then the number of its nonzero eigenvalues (counted with multiplicity) equals $r(A)$.
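The identity $A^{n} = P\Lambda^{n}P^{-1}$ can be illustrated with exact rational arithmetic. In this sketch the matrix `A`, its eigenvector matrix `P`, the hand-computed inverse `P_inv`, and the exponent `n = 5` are all assumptions picked for the example; the helpers `matmul` and `power_by_repeated_multiplication` are hypothetical names.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def power_by_repeated_multiplication(A, n):
    """Reference value of A^n, computed by multiplying n times."""
    size = len(A)
    result = [[Fraction(int(i == j)) for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = matmul(result, A)
    return result

# A = [[2, 1], [1, 2]] has eigenvalues 1 and 3 with eigenvectors (1, -1) and (1, 1).
A = [[Fraction(2), Fraction(1)], [Fraction(1), Fraction(2)]]
P = [[Fraction(1), Fraction(1)], [Fraction(-1), Fraction(1)]]    # eigenvectors as columns
P_inv = [[Fraction(1, 2), Fraction(-1, 2)], [Fraction(1, 2), Fraction(1, 2)]]

n = 5
Lambda_n = [[Fraction(1) ** n, Fraction(0)], [Fraction(0), Fraction(3) ** n]]
A_n = matmul(matmul(P, Lambda_n), P_inv)   # P * Lambda^n * P^{-1}

print(A_n == power_by_repeated_multiplication(A, n))
```

Raising the diagonal matrix to a power only requires powers of the eigenvalues, which is why the diagonalized form is convenient for large $n$.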
4. Eigenvalues, eigenvectors, and similar diagonal matrices of real symmetric matrices
(1) Similar matrices: let $A$ and $B$ be two $n \times n$ matrices. If there exists an invertible matrix $P$ such that $B = P^{-1}AP$, then $A$ and $B$ are said to be similar, written $A \sim B$.
(2) Properties of similar matrices: if $A \sim B$, then
1) $A^{T} \sim B^{T}$
2) $A^{-1} \sim B^{-1}$ (if $A$ and $B$ are both invertible)
3) $A^{k} \sim B^{k}$ ($k$ a positive integer)
4) $\left| \lambda E - A \right| = \left| \lambda E - B \right|$, so $A$ and $B$ have the same eigenvalues
5) $\left| A \right| = \left| B \right|$, so $A$ and $B$ are either both invertible or both singular
6) $r(A) = r(B)$ and $\left| \lambda E - A \right| = \left| \lambda E - B \right|$ are necessary for similarity but not sufficient: two matrices satisfying both are not necessarily similar
Quadratic Forms
1. A homogeneous quadratic function of the $n$ variables $x_{1},x_{2},\cdots,x_{n}$,
$f(x_{1},x_{2},\cdots,x_{n}) = \sum_{i=1}^{n}\sum_{j=1}^{n}a_{ij}x_{i}x_{j}$, where $a_{ij} = a_{ji}$ $(i,j = 1,2,\cdots,n)$, is called a quadratic form in $n$ variables, or simply a quadratic form.
Setting $x = \begin{bmatrix} x_{1} \\ x_{2} \\ \vdots \\ x_{n} \end{bmatrix}$ and $A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}$, the quadratic form $f$ can be rewritten in matrix-vector form as $f = x^{T}Ax$. Here $A$ is called the matrix of the quadratic form. Because $a_{ij} = a_{ji}$ $(i,j = 1,2,\cdots,n)$, the matrix of a quadratic form is always symmetric, and quadratic forms are in one-to-one correspondence with symmetric matrices. The rank of $A$ is called the rank of the quadratic form.
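2. The law of inertia; standard form and canonical form of a quadratic form
(1) Law of inertia
For any real quadratic form, whichever congruence transformation is chosen to reduce it to a standard form containing only square terms, the positive and negative indices of inertia do not depend on the transformation chosen. This is the law of inertia.
(2) Standard form
The quadratic form $f(x_{1},x_{2},\cdots,x_{n}) = x^{T}Ax$ is reduced by the congruence transformation $x = Cy$ ($C$ invertible) to
$f = x^{T}Ax = y^{T}C^{T}ACy = \sum_{i=1}^{r}d_{i}y_{i}^{2}$,
which is called a standard form of $f$ ($r \leq n$). Over a general number field the standard form is not unique and depends on the congruence transformation used, but the number of square terms with nonzero coefficients is uniquely determined by $r(A)$.
(3) Canonical form
Any real quadratic form $f$ can be reduced by a congruence transformation to the canonical form $f = z_{1}^{2} + z_{2}^{2} + \cdots + z_{p}^{2} - z_{p+1}^{2} - \cdots - z_{r}^{2}$, where $r$ is the rank of $A$, $p$ is the positive index of inertia, and $r - p$ is the negative index of inertia. The canonical form is unique.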
3. Reducing a quadratic form to standard form by orthogonal transformation or by completing the square; positive definiteness of quadratic forms and their matrices
Throughout, $A$ is a real symmetric matrix.
If $A$ is positive definite, then $kA$ $(k > 0)$, $A^{T}$, $A^{-1}$, $A^{*}$ are positive definite; $|A| > 0$, so $A$ is invertible; $a_{ii} > 0$ and $|A_{ii}| > 0$.
If $A$ and $B$ are positive definite, then $A + B$ is positive definite, but $AB$ and $BA$ are not necessarily positive definite.
$A$ is positive definite $\Leftrightarrow f(x) = x^{T}Ax > 0,\ \forall x \neq 0$
$\Leftrightarrow$ all leading principal minors of $A$ are greater than zero
$\Leftrightarrow$ all eigenvalues of $A$ are greater than zero
$\Leftrightarrow$ the positive index of inertia of $A$ is $n$
$\Leftrightarrow$ there exists an invertible matrix $P$ such that $A = P^{T}P$
$\Leftrightarrow$ there exists an orthogonal matrix $Q$ such that $Q^{T}AQ = Q^{-1}AQ = \begin{pmatrix} \lambda_{1} & & \\ & \ddots & \\ & & \lambda_{n} \end{pmatrix}$,
where $\lambda_{i} > 0,\ i = 1,2,\cdots,n$.
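The leading-principal-minor criterion above lends itself to a direct check on small matrices. The following is a minimal sketch, assuming a symmetric input given as a list of rows; the helper names `det`, `leading_principal_minors`, `is_positive_definite` and both example matrices are hypothetical, and the recursive determinant is only practical for small sizes.

```python
def det(M):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def leading_principal_minors(A):
    """Determinants of the top-left k x k blocks, k = 1..n."""
    return [det([row[:k] for row in A[:k]]) for k in range(1, len(A) + 1)]

def is_positive_definite(A):
    """Minor test for a symmetric matrix A: all leading principal minors > 0."""
    return all(m > 0 for m in leading_principal_minors(A))

A = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]  # illustrative symmetric matrix
B = [[1, 2], [2, 1]]                       # symmetric, but x = (1, -1) gives x^T B x = -2

print(leading_principal_minors(A), is_positive_definite(A))
print(leading_principal_minors(B), is_positive_definite(B))
```

For `B`, the vector $x = (1, -1)^{T}$ gives $x^{T}Bx = -2 < 0$, which agrees with the minor test rejecting it.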
Probability Theory and Mathematical Statistics
Random Events and Probability
1. Relations and operations of events
(1) Sub-event: $A \subset B$; if $A$ occurs, then $B$ occurs.
(2) Equal events: $A = B$, that is, $A \subset B$ and $B \subset A$.
(3) Union: $A \cup B$ (or $A + B$); at least one of $A$ and $B$ occurs.
(4) Difference: $A - B$; $A$ occurs but $B$ does not.
(5) Intersection: $A \cap B$ (or $AB$); $A$ and $B$ occur together.
(6) Mutually exclusive (incompatible) events: $A \cap B = \varnothing$.
(7) Complementary (opposite) events:
$A \cap B = \varnothing,\ A \cup B = \Omega,\ A = \overline{B},\ B = \overline{A}$.
2. Laws of operation
(1) Commutative laws: $A \cup B = B \cup A,\ A \cap B = B \cap A$
(2) Associative laws: $(A \cup B) \cup C = A \cup (B \cup C)$;
$(A \cap B) \cap C = A \cap (B \cap C)$
(3) Distributive law: $(A \cup B) \cap C = (A \cap C) \cup (B \cap C)$
3. De Morgan's laws
$\overline{A \cup B} = \overline{A} \cap \overline{B}$
$\overline{A \cap B} = \overline{A} \cup \overline{B}$
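Event operations map directly onto set operations, so De Morgan's laws can be checked on a finite sample space. In this sketch the sample space (one roll of a die) and the events `A` and `B` are assumptions made purely for illustration.

```python
omega = set(range(1, 7))   # sample space: outcomes of one die roll (illustrative)
A = {2, 4, 6}              # event "the roll is even"
B = {4, 5, 6}              # event "the roll is at least four"

def complement(event):
    """Complement of an event relative to the sample space omega."""
    return omega - event

# Complement of a union equals the intersection of the complements.
lhs_union = complement(A | B)
rhs_union = complement(A) & complement(B)

# Complement of an intersection equals the union of the complements.
lhs_inter = complement(A & B)
rhs_inter = complement(A) | complement(B)

print(lhs_union == rhs_union, lhs_inter == rhs_inter)
```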
4. Complete group of events
$A_{1},A_{2},\cdots,A_{n}$ are pairwise mutually exclusive and their union is the certain event, that is, $A_{i} \cap A_{j} = \varnothing$ for $i \neq j$, and $\bigcup_{i=1}^{n}A_{i} = \Omega$.
5. Basic concepts of probability
(1) Probability: a measure of how likely an event is to occur. Its rigorous definition is as follows:
a probability $P(\cdot)$ is a function defined on the set of events that satisfies the following three conditions:
1) for any event $A$, $P(A) \geq 0$;
2) for the certain event $\Omega$, $P(\Omega) = 1$;
3) for $A_{1},A_{2},\cdots,A_{n},\cdots$ with $A_{i}A_{j} = \varnothing$ $(i \neq j)$: $P\left(\bigcup_{i=1}^{\infty}A_{i}\right) = \sum_{i=1}^{\infty}P(A_{i})$.
(2) Basic properties of probability
1) $P(\overline{A}) = 1 - P(A)$;
2) $P(A - B) = P(A) - P(AB)$; in particular, when $B \subset A$, $P(A - B) = P(A) - P(B)$ and $P(B) \leq P(A)$;
3) $P(A \cup B) = P(A) + P(B) - P(AB)$, and for three events
$P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(AB) - P(BC) - P(AC) + P(ABC)$;
4) if $A_{1},A_{2},\cdots,A_{n}$ are pairwise mutually exclusive, then $P\left(\bigcup_{i=1}^{n}A_{i}\right) = \sum_{i=1}^{n}P(A_{i})$.
(3) Classical probability: the experiment has only finitely many outcomes and every outcome is equally likely. The probability is computed as $P(A) = \frac{\text{number of outcomes in } A}{\text{total number of outcomes}}$.
(4) Geometric probability: the sample space $\Omega$ is a region in Euclidean space and every sample point is equally likely to occur. The probability is computed as $P(A) = \frac{m(A)}{m(\Omega)}$, where $m(\cdot)$ denotes the geometric measure (length, area, or volume).
6. Basic formulas of probability
(1) Conditional probability: $P(B|A) = \frac{P(AB)}{P(A)}$ (with $P(A) > 0$), the probability that $B$ occurs given that $A$ has occurred.
(2) Law of total probability:
$P(A) = \sum_{i=1}^{n}P(A|B_{i})P(B_{i})$, where $B_{i}B_{j} = \varnothing$ for $i \neq j$ and $\bigcup_{i=1}^{n}B_{i} = \Omega$.
(3) Bayes' formula:
$P(B_{j}|A) = \frac{P(A|B_{j})P(B_{j})}{\sum_{i=1}^{n}P(A|B_{i})P(B_{i})},\ j = 1,2,\cdots,n$
Note: in the formulas above the number of events $B_{i}$ may be countably infinite.
(4) Multiplication formula:
$P(A_{1}A_{2}) = P(A_{1})P(A_{2}|A_{1}) = P(A_{2})P(A_{1}|A_{2})$
$P(A_{1}A_{2}\cdots A_{n}) = P(A_{1})P(A_{2}|A_{1})P(A_{3}|A_{1}A_{2})\cdots P(A_{n}|A_{1}A_{2}\cdots A_{n-1})$
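A short worked example may help connect the law of total probability with Bayes' formula. The numbers below are invented for illustration: three mutually exclusive causes `B1`, `B2`, `B3` with assumed priors $P(B_{i})$, and assumed likelihoods $P(A|B_{i})$ of some event $A$.

```python
from fractions import Fraction

# Hypothetical priors P(B_i); they sum to 1, so the B_i form a complete group.
prior = {"B1": Fraction(1, 2), "B2": Fraction(3, 10), "B3": Fraction(1, 5)}
# Hypothetical likelihoods P(A | B_i).
likelihood = {"B1": Fraction(1, 100), "B2": Fraction(2, 100), "B3": Fraction(5, 100)}

# Law of total probability: P(A) = sum_i P(A | B_i) P(B_i)
p_A = sum(likelihood[b] * prior[b] for b in prior)

# Bayes' formula: P(B_j | A) = P(A | B_j) P(B_j) / P(A)
posterior = {b: likelihood[b] * prior[b] / p_A for b in prior}

print(p_A)
for name, value in posterior.items():
    print(name, value)
```

Note how `B3`, the least likely cause a priori, becomes the most likely one after observing $A$ because its likelihood is the largest.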
7. Independence of events
(1) $A$ and $B$ are independent $\Leftrightarrow P(AB) = P(A)P(B)$
(2) $A$, $B$, $C$ are pairwise independent
$\Leftrightarrow P(AB) = P(A)P(B)$; $P(BC) = P(B)P(C)$;
$P(AC) = P(A)P(C)$
(3) $A$, $B$, $C$ are mutually independent $\Leftrightarrow P(AB) = P(A)P(B)$;
$P(BC) = P(B)P(C)$; $P(AC) = P(A)P(C)$;
$P(ABC) = P(A)P(B)P(C)$
8. Repeated independent trials
Repeat an experiment independently $n$ times. If the event $A$ occurs with probability $p$ in each trial, then the probability that $A$ occurs exactly $k$ times in the $n$ trials is
$P(X = k) = C_{n}^{k}p^{k}(1 - p)^{n-k}$.
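The formula for $k$ successes in $n$ independent trials can be evaluated directly. This is a minimal sketch; the function name `binomial_pmf` and the parameters $n = 10$, $p = 0.3$ are assumptions for the example.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) = C(n, k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.3   # illustrative parameters
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

print(sum(pmf))                  # total probability, 1 up to rounding
print(binomial_pmf(2, 4, 0.5))   # two heads in four fair coin tosses
```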
9. Important formulas and conclusions
(1) $P(\overline{A}) = 1 - P(A)$
(2) $P(A \cup B) = P(A) + P(B) - P(AB)$
$P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(AB) - P(BC) - P(AC) + P(ABC)$
(3) $P(A - B) = P(A) - P(AB)$
(4) $P(A\overline{B}) = P(A) - P(AB)$, $P(A) = P(AB) + P(A\overline{B})$,
$P(A \cup B) = P(A) + P(\overline{A}B) = P(AB) + P(A\overline{B}) + P(\overline{A}B)$
(5) The conditional probability $P(\cdot|B)$ satisfies all the properties of a probability, for example:
$P(\overline{A}_{1}|B) = 1 - P(A_{1}|B)$
$P(A_{1} \cup A_{2}|B) = P(A_{1}|B) + P(A_{2}|B) - P(A_{1}A_{2}|B)$
$P(A_{1}A_{2}|B) = P(A_{1}|B)P(A_{2}|A_{1}B)$
(6) If $A_{1},A_{2},\cdots,A_{n}$ are mutually independent, then $P\left(\bigcap_{i=1}^{n}A_{i}\right) = \prod_{i=1}^{n}P(A_{i})$ and
$P\left(\bigcup_{i=1}^{n}A_{i}\right) = 1 - \prod_{i=1}^{n}(1 - P(A_{i}))$
(7) Relations among mutual exclusivity, complementarity, and independence:
if $A$ and $B$ are complementary, then they are mutually exclusive, but the converse does not hold; if $A$ and $B$ are mutually exclusive (or complementary) and both have nonzero probability, then $A$ and $B$ are not independent.
(8) If $A_{1},A_{2},\cdots,A_{m},B_{1},B_{2},\cdots,B_{n}$ are mutually independent, then $f(A_{1},A_{2},\cdots,A_{m})$ and
$g(B_{1},B_{2},\cdots,B_{n})$ are also independent, where $f(\cdot)$ and $g(\cdot)$ denote events obtained by applying arbitrary event operations to the respective events. In addition, an event of probability 1 (or 0) is independent of every event.
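Conclusion (6) can be verified by brute-force enumeration. The sketch below assumes three mutually independent events with made-up probabilities and sums the weights of all $2^{3}$ occurrence patterns; it is an illustration, not a general proof.

```python
from fractions import Fraction
from itertools import product
from math import prod

probs = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 4)]   # hypothetical P(A_i)

p_all = Fraction(0)   # P(A_1 ∩ A_2 ∩ A_3)
p_any = Fraction(0)   # P(A_1 ∪ A_2 ∪ A_3)
for outcome in product([True, False], repeat=len(probs)):
    # Under mutual independence the probability of a pattern is a product.
    weight = prod(p if occurred else 1 - p for p, occurred in zip(probs, outcome))
    if all(outcome):
        p_all += weight
    if any(outcome):
        p_any += weight

print(p_all == prod(probs))
print(p_any == 1 - prod(1 - p for p in probs))
```

The union is easiest to compute through its complement: none of the events occurs with probability $\prod_{i}(1 - P(A_{i}))$.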
Random Variables and Their Probability Distributions
1. Random variables and probability distributions
A random variable is a variable whose value is random; strictly speaking, it is a real-valued function defined on the sample space. The probability distribution usually refers to the distribution function or the distribution law (probability mass function).
2. The distribution function: concept and properties
Definition: $F(x) = P(X \leq x),\ -\infty < x < +\infty$
Properties: (1) $0 \leq F(x) \leq 1$ (2) $F(x)$ is monotonically non-decreasing
(3) right-continuous: $F(x + 0) = F(x)$ (4) $F(-\infty) = 0,\ F(+\infty) = 1$
3. Probability distribution of a discrete random variable
$P(X = x_{i}) = p_{i},\ i = 1,2,\cdots,n,\cdots$, with $p_{i} \geq 0$ and $\sum_{i=1}^{\infty}p_{i} = 1$
4. Probability density of a continuous random variable
The probability density $f(x)$ is non-negative and integrable, and: (1) $f(x) \geq 0$,
(2) $\int_{-\infty}^{+\infty}f(x)\,dx = 1$,
(3) if $x$ is a point of continuity of $f(x)$, then
$f(x) = F'(x)$, where the distribution function is $F(x) = \int_{-\infty}^{x}f(t)\,dt$.
5. Common distributions
(1) 0-1 distribution: $P(X = k) = p^{k}(1 - p)^{1-k},\ k = 0,1$
(2) Binomial distribution $B(n,p)$:
$P(X = k) = C_{n}^{k}p^{k}(1 - p)^{n-k},\ k = 0,1,\cdots,n$
(3) Poisson distribution $P(\lambda)$:
$P(X = k) = \frac{\lambda^{k}}{k!}e^{-\lambda},\ \lambda > 0,\ k = 0,1,2,\cdots$
(4) Uniform distribution $U(a,b)$: $f(x) = \begin{cases} \frac{1}{b - a}, & a < x < b \\ 0, & \text{otherwise} \end{cases}$
(5) Normal distribution $N(\mu,\sigma^{2})$:
$\varphi(x) = \frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{(x - \mu)^{2}}{2\sigma^{2}}},\ \sigma > 0,\ -\infty < x < +\infty$
(6) Exponential distribution $E(\lambda)$: $f(x) = \begin{cases} \lambda e^{-\lambda x}, & x > 0 \\ 0, & \text{otherwise} \end{cases}$, where $\lambda > 0$
(7) Geometric distribution $G(p)$: $P(X = k) = (1 - p)^{k-1}p,\ 0 < p < 1,\ k = 1,2,\cdots$
(8) Hypergeometric distribution
$H(N,M,n)$: $P(X = k) = \frac{C_{M}^{k}C_{N-M}^{n-k}}{C_{N}^{n}},\ k = 0,1,\cdots,\min(n,M)$
6. Probability distribution of a function of a random variable
(1) Discrete case: $P(X = x_{i}) = p_{i},\ Y = g(X)$.
Then: $P(Y = y_{j}) = \sum_{g(x_{i}) = y_{j}}P(X = x_{i})$
(2) Continuous case: $X \sim f_{X}(x),\ Y = g(X)$.
Then: $F_{Y}(y) = P(Y \leq y) = P(g(X) \leq y) = \int_{g(x) \leq y}f_{X}(x)\,dx$,
$f_{Y}(y) = F'_{Y}(y)$
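For the discrete case, the distribution of $Y = g(X)$ is obtained by adding up the probabilities of all $x_{i}$ that map to the same value. A minimal sketch follows; the helper name `pushforward`, the distribution of $X$, and the choice $g(x) = x^{2}$ are assumptions for illustration.

```python
from collections import defaultdict
from fractions import Fraction

def pushforward(pmf_x, g):
    """Distribution of Y = g(X): P(Y = y) is the sum of P(X = x) over g(x) = y."""
    pmf_y = defaultdict(Fraction)      # missing keys start at Fraction(0)
    for x, p in pmf_x.items():
        pmf_y[g(x)] += p
    return dict(pmf_y)

# Hypothetical distribution of X on {-1, 0, 1}
pmf_x = {-1: Fraction(1, 4), 0: Fraction(1, 2), 1: Fraction(1, 4)}
pmf_y = pushforward(pmf_x, lambda x: x * x)

print(pmf_y)   # the values -1 and 1 of X both map to Y = 1
```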
7. Important formulas and conclusions
(1) $X \sim N(0,1) \Rightarrow \varphi(0) = \frac{1}{\sqrt{2\pi}},\ \Phi(0) = \frac{1}{2}$,
$\Phi(-a) = P(X \leq -a) = 1 - \Phi(a)$
(2) $X \sim N(\mu,\sigma^{2}) \Rightarrow \frac{X - \mu}{\sigma} \sim N(0,1),\ P(X \leq a) = \Phi\left(\frac{a - \mu}{\sigma}\right)$
(3) $X \sim E(\lambda) \Rightarrow P(X > s + t|X > s) = P(X > t)$ (the exponential distribution is memoryless)
(4) $X \sim G(p) \Rightarrow P(X = m + k|X > m) = P(X = k)$ (the geometric distribution is memoryless)
(5) The distribution function of a discrete random variable is a step function with jump discontinuities; the distribution function of a continuous random variable is continuous, but not necessarily differentiable everywhere.
(6) There exist random variables that are neither discrete nor continuous.
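The standardization rule in (2) and the memoryless property in (3) are easy to check numerically. The sketch below expresses $\Phi$ through the error function, $\Phi(x) = \frac{1}{2}\left(1 + \operatorname{erf}(x/\sqrt{2})\right)$; the function names and all numeric inputs are assumptions chosen for the example.

```python
import math

def std_normal_cdf(x):
    """Phi(x) for N(0, 1), written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def normal_cdf(a, mu, sigma):
    """P(X <= a) for X ~ N(mu, sigma^2), by standardizing to N(0, 1)."""
    return std_normal_cdf((a - mu) / sigma)

def exp_survival(t, lam):
    """P(X > t) for X ~ E(lam) and t >= 0."""
    return math.exp(-lam * t)

print(std_normal_cdf(0.0))                              # Phi(0)
print(std_normal_cdf(-1.3), 1 - std_normal_cdf(1.3))    # symmetry of Phi
print(normal_cdf(5.0, 3.0, 2.0), std_normal_cdf(1.0))   # standardization

# Memorylessness: P(X > s + t | X > s) compared with P(X > t)
lam, s, t = 0.5, 1.0, 2.0
print(exp_survival(s + t, lam) / exp_survival(s, lam), exp_survival(t, lam))
```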
Multidimensional Random Variables and Their Distributions
1. Two-dimensional random variables and their joint distribution
A random vector $(X,Y)$ is formed from two random variables.
Its joint distribution function is $F(x,y) = P(X \leq x,Y \leq y)$.
2. Distribution of a two-dimensional discrete random variable
(1) Joint distribution law
$P\{X = x_{i},Y = y_{j}\} = p_{ij};\ i,j = 1,2,\cdots$
(2) Marginal distribution laws
$p_{i\cdot} = \sum_{j=1}^{\infty}p_{ij},\ i = 1,2,\cdots$
$p_{\cdot j} = \sum_{i=1}^{\infty}p_{ij},\ j = 1,2,\cdots$
(3) Conditional distribution laws
$P\{X = x_{i}|Y = y_{j}\} = \frac{p_{ij}}{p_{\cdot j}}$
$P\{Y = y_{j}|X = x_{i}\} = \frac{p_{ij}}{p_{i\cdot}}$
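Marginal and conditional distribution laws can be read off a joint table by summing rows or columns and then dividing. In the sketch below the joint table `joint` is invented for illustration, and the dictionary names are hypothetical.

```python
from fractions import Fraction

# Hypothetical joint law p_ij = P(X = x_i, Y = y_j) on {0, 1} x {0, 1}
joint = {
    (0, 0): Fraction(1, 8), (0, 1): Fraction(3, 8),
    (1, 0): Fraction(2, 8), (1, 1): Fraction(2, 8),
}
xs = sorted({x for x, _ in joint})
ys = sorted({y for _, y in joint})

# Marginals: sum the joint law over the other variable.
p_x = {x: sum(joint[(x, y)] for y in ys) for x in xs}
p_y = {y: sum(joint[(x, y)] for x in xs) for y in ys}

# Conditional law of X given Y = y: p_ij divided by the marginal of Y.
cond_x_given_y = {(x, y): joint[(x, y)] / p_y[y] for x in xs for y in ys}

print(p_x, p_y)
print(cond_x_given_y[(0, 0)])   # P(X = 0 | Y = 0)
```

For each fixed $y$, the conditional probabilities over $x$ sum to 1, as any distribution law must.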
3. Density of a two-dimensional continuous random variable
(1) Joint probability density $f(x,y)$:
1) $f(x,y) \geq 0$
2) $\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}f(x,y)\,dx\,dy = 1$
(2) Distribution function: $F(x,y) = \int_{-\infty}^{x}\int_{-\infty}^{y}f(u,v)\,dv\,du$
(3) Marginal probability densities:
$f_{X}(x) = \int_{-\infty}^{+\infty}f(x,y)\,dy$
$f_{Y}(y) = \int_{-\infty}^{+\infty}f(x,y)\,dx$
(4) Conditional probability densities: $f_{X|Y}(x|y) = \frac{f(x,y)}{f_{Y}(y)}$
$f_{Y|X}(y|x) = \frac{f(x,y)}{f_{X}(x)}$
4. Common joint distributions of two-dimensional random variables
(1) Two-dimensional uniform distribution: $(X,Y) \sim U(D)$, $f(x,y) = \begin{cases} \frac{1}{S(D)}, & (x,y) \in D \\ 0, & \text{otherwise} \end{cases}$, where $S(D)$ is the area of the region $D$
(2) Two-dimensional normal distribution: $(X,Y) \sim N(\mu_{1},\mu_{2},\sigma_{1}^{2},\sigma_{2}^{2},\rho)$
$f(x,y) = \frac{1}{2\pi\sigma_{1}\sigma_{2}\sqrt{1 - \rho^{2}}}\exp\left\{ \frac{-1}{2(1 - \rho^{2})}\left[ \frac{(x - \mu_{1})^{2}}{\sigma_{1}^{2}} - 2\rho\frac{(x - \mu_{1})(y - \mu_{2})}{\sigma_{1}\sigma_{2}} + \frac{(y - \mu_{2})^{2}}{\sigma_{2}^{2}} \right] \right\}$
5. Independence and correlation of random variables
$X$ and $Y$ are independent $\Leftrightarrow F(x,y) = F_{X}(x)F_{Y}(y)$
$\Leftrightarrow p_{ij} = p_{i\cdot} \cdot p_{\cdot j}$ (discrete case)
$\Leftrightarrow f(x,y) = f_{X}(x)f_{Y}(y)$ (continuous case)
Correlation of $X$ and $Y$:
when the correlation coefficient $\rho_{XY} = 0$, $X$ and $Y$ are said to be uncorrelated; otherwise $X$ and $Y$ are said to be correlated.
6. Probability distribution of simple functions of two random variables
Discrete case:
$P(X = x_{i},Y = y_{j}) = p_{ij},\ Z = g(X,Y)$.
Then:
$P(Z = z_{k}) = P\{g(X,Y) = z_{k}\} = \sum_{g(x_{i},y_{j}) = z_{k}}P(X = x_{i},Y = y_{j})$
Continuous case:
$(X,Y) \sim f(x,y),\ Z = g(X,Y)$.
Then:
$F_{Z}(z) = P\{g(X,Y) \leq z\} = \iint_{g(x,y) \leq z}f(x,y)\,dx\,dy$, $f_{Z}(z) = F'_{Z}(z)$
7. Important formulas and conclusions
(1) Marginal density formulas: $f_{X}(x) = \int_{-\infty}^{+\infty}f(x,y)\,dy$,
$f_{Y}(y) = \int_{-\infty}^{+\infty}f(x,y)\,dx$
(2) $P\{(X,Y) \in D\} = \iint_{D}f(x,y)\,dx\,dy$
(3) If $(X,Y)$ follows the two-dimensional normal distribution $N(\mu_{1},\mu_{2},\sigma_{1}^{2},\sigma_{2}^{2},\rho)$,
then:
1) $X \sim N(\mu_{1},\sigma_{1}^{2}),\ Y \sim N(\mu_{2},\sigma_{2}^{2})$.
2) $X$ and $Y$ are independent $\Leftrightarrow \rho = 0$, that is, $X$ and $Y$ are uncorrelated.
3) $C_{1}X + C_{2}Y \sim N(C_{1}\mu_{1} + C_{2}\mu_{2},\ C_{1}^{2}\sigma_{1}^{2} + C_{2}^{2}\sigma_{2}^{2} + 2C_{1}C_{2}\sigma_{1}\sigma_{2}\rho)$
4) The conditional distribution of $X$ given $Y = y$ is
$N\left(\mu_{1} + \rho\frac{\sigma_{1}}{\sigma_{2}}(y - \mu_{2}),\ \sigma_{1}^{2}(1 - \rho^{2})\right)$
5) The conditional distribution of $Y$ given $X = x$ is
$N\left(\mu_{2} + \rho\frac{\sigma_{2}}{\sigma_{1}}(x - \mu_{1}),\ \sigma_{2}^{2}(1 - \rho^{2})\right)$
(4) If $X$ and $Y$ are independent and follow $N(\mu_{1},\sigma_{1}^{2})$ and $N(\mu_{2},\sigma_{2}^{2})$ respectively,
then:
$(X,Y) \sim N(\mu_{1},\mu_{2},\sigma_{1}^{2},\sigma_{2}^{2},0)$,
$C_{1}X + C_{2}Y \sim N(C_{1}\mu_{1} + C_{2}\mu_{2},\ C_{1}^{2}\sigma_{1}^{2} + C_{2}^{2}\sigma_{2}^{2})$.
(5) If $X$ and $Y$ are independent and $f(x)$ and $g(x)$ are continuous functions,
then $f(X)$ and $g(Y)$ are also independent.
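The parameter formulas for the two-dimensional normal distribution in (3) translate into a few lines of arithmetic. The sketch below only evaluates those closed-form parameters; the function names and the numeric inputs are hypothetical, and `sigma1`, `sigma2` are standard deviations (not variances).

```python
def linear_combination_params(c1, c2, mu1, mu2, sigma1, sigma2, rho):
    """Mean and variance of C1*X + C2*Y for a bivariate normal (X, Y)."""
    mean = c1 * mu1 + c2 * mu2
    var = (c1**2 * sigma1**2 + c2**2 * sigma2**2
           + 2 * c1 * c2 * sigma1 * sigma2 * rho)
    return mean, var

def conditional_x_given_y(y, mu1, mu2, sigma1, sigma2, rho):
    """Mean and variance of the conditional distribution of X given Y = y."""
    mean = mu1 + rho * sigma1 / sigma2 * (y - mu2)
    var = sigma1**2 * (1 - rho**2)
    return mean, var

# X - Y for standard normal marginals with correlation 0.5
print(linear_combination_params(1, -1, 0.0, 0.0, 1.0, 1.0, 0.5))
# X | Y = 2 with mu1 = 1, mu2 = 0, sigma1 = 2, sigma2 = 1, rho = 0.5
print(conditional_x_given_y(2.0, 1.0, 0.0, 2.0, 1.0, 0.5))
```

Setting `rho = 0` in `linear_combination_params` recovers the independent case stated in (4), and the conditional variance shrinks toward zero as $|\rho|$ approaches 1.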
Numerical Characteristics of Random Variables
1. Mathematical expectation
Discrete case: $P\{X = x_{i}\} = p_{i},\ E(X) = \sum_{i}x_{i}p_{i}$;
Continuous case: $X \sim f(x),\ E(X) = \int_{-\infty}^{+\infty}xf(x)\,dx$
Properties:
(1) $E(C) = C,\ E[E(X)] = E(X)$
(2) $E(C_{1}X + C_{2}Y) = C_{1}E(X) + C_{2}E(Y)$
(3) If $X$ and $Y$ are independent, then $E(XY) = E(X)E(Y)$
(4) $\left[E(XY)\right]^{2} \leq E(X^{2})E(Y^{2})$
2. Variance: $D(X) = E\left\{\left[X - E(X)\right]^{2}\right\} = E(X^{2}) - \left[E(X)\right]^{2}$
3. Standard deviation: $\sqrt{D(X)}$
4. Discrete case: $D(X) = \sum_{i}\left[x_{i} - E(X)\right]^{2}p_{i}$
5. Continuous case: $D(X) = \int_{-\infty}^{+\infty}\left[x - E(X)\right]^{2}f(x)\,dx$
Properties:
(1) $D(C) = 0,\ D[E(X)] = 0,\ D[D(X)] = 0$
(2) If $X$ and $Y$ are independent, then $D(X \pm Y) = D(X) + D(Y)$
(3) $D(C_{1}X + C_{2}) = C_{1}^{2}D(X)$
(4) In general,
$D(X \pm Y) = D(X) + D(Y) \pm 2Cov(X,Y) = D(X) + D(Y) \pm 2\rho\sqrt{D(X)}\sqrt{D(Y)}$
(5) $D(X) < E\left[(X - C)^{2}\right]$ for $C \neq E(X)$
(6) $D(X) = 0 \Leftrightarrow P\{X = C\} = 1$ for some constant $C$
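The definitions and properties of the variance can be confirmed exactly on a small discrete distribution. The sketch below assumes a fair six-sided die as the example distribution; the helper names `expectation` and `variance` are hypothetical.

```python
from fractions import Fraction

pmf = {k: Fraction(1, 6) for k in range(1, 7)}   # fair die (illustrative)

def expectation(pmf, g=lambda x: x):
    """E[g(X)] = sum of g(x_i) * p_i for a discrete distribution."""
    return sum(g(x) * p for x, p in pmf.items())

def variance(pmf):
    """D(X) = E[(X - E(X))^2]."""
    mean = expectation(pmf)
    return expectation(pmf, lambda x: (x - mean) ** 2)

mean = expectation(pmf)
var = variance(pmf)
print(mean, var)

# Shortcut formula D(X) = E(X^2) - [E(X)]^2
print(var == expectation(pmf, lambda x: x * x) - mean**2)

# Property (3): D(C1*X + C2) = C1^2 * D(X), checked with C1 = 3, C2 = 2
pmf_affine = {3 * x + 2: p for x, p in pmf.items()}
print(variance(pmf_affine) == 9 * var)

# Property (5): E[(X - C)^2] exceeds D(X) for C different from E(X), here C = 3
print(expectation(pmf, lambda x: (x - 3) ** 2) > var)
```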
6. Mathematical expectation of a function of random variables
(1) For a function $Y = g(X)$:
if $X$ is discrete: $P\{X = x_{i}\} = p_{i},\ E(Y) = \sum_{i}g(x_{i})p_{i}$;
if $X$ is continuous: $X \sim f(x),\ E(Y) = \int_{-\infty}^{+\infty}g(x)f(x)\,dx$
(2) For $Z = g(X,Y)$:
if $(X,Y)$ is discrete with $P\{X = x_{i},Y = y_{j}\} = p_{ij}$:
$E(Z) = \sum_{i}\sum_{j}g(x_{i},y_{j})p_{ij}$;
if $(X,Y) \sim f(x,y)$: $E(Z) = \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}g(x,y)f(x,y)\,dx\,dy$
7. Covariance
$Cov(X,Y) = E\left[(X - E(X))(Y - E(Y))\right]$
8. Correlation coefficient
$\rho_{XY} = \frac{Cov(X,Y)}{\sqrt{D(X)}\sqrt{D(Y)}}$. The $k$-th raw moment is
$E(X^{k})$; the $k$-th central moment is
$E\left\{\left[X - E(X)\right]^{k}\right\}$.
Properties:
(1) $Cov(X,Y) = Cov(Y,X)$
(2) $Cov(aX,bY) = ab\,Cov(X,Y)$
(3) $Cov(X_{1} + X_{2},Y) = Cov(X_{1},Y) + Cov(X_{2},Y)$
(4) $\left|\rho(X,Y)\right| \leq 1$
(5) $\rho(X,Y) = 1 \Leftrightarrow P(Y = aX + b) = 1$, where $a > 0$;
$\rho(X,Y) = -1 \Leftrightarrow P(Y = aX + b) = 1$, where $a < 0$
9. Important formulas and conclusions
(1) $D(X) = E(X^{2}) - E^{2}(X)$
(2) $Cov(X,Y) = E(XY) - E(X)E(Y)$
(3) $\left|\rho(X,Y)\right| \leq 1$, and
$\rho(X,Y) = 1 \Leftrightarrow P(Y = aX + b) = 1$, where $a > 0$;
$\rho(X,Y) = -1 \Leftrightarrow P(Y = aX + b) = 1$, where $a < 0$
(4) The following five conditions are equivalent:
$\rho(X,Y) = 0 \Leftrightarrow Cov(X,Y) = 0$
$\Leftrightarrow E(XY) = E(X)E(Y)$
$\Leftrightarrow D(X + Y) = D(X) + D(Y)$
$\Leftrightarrow D(X - Y) = D(X) + D(Y)$
Note: independence of $X$ and $Y$ is a sufficient condition for each of the five conditions above, but it is not a necessary condition.
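The closing note, that uncorrelated variables need not be independent, is easiest to see on a concrete case. The example below is an assumption chosen for illustration: $X$ is uniform on $\{-1, 0, 1\}$ and $Y = X^{2}$, so $Y$ is completely determined by $X$.

```python
from fractions import Fraction

# Joint law of (X, Y) with X uniform on {-1, 0, 1} and Y = X^2 (illustrative)
joint = {(-1, 1): Fraction(1, 3), (0, 0): Fraction(1, 3), (1, 1): Fraction(1, 3)}

def E(g):
    """E[g(X, Y)] under the joint law above."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

# Cov(X, Y) = E(XY) - E(X)E(Y)
cov = E(lambda x, y: x * y) - E(lambda x, y: x) * E(lambda x, y: y)

# Independence would require P(X = 0, Y = 0) = P(X = 0) * P(Y = 0).
p_x0 = sum(p for (x, _), p in joint.items() if x == 0)
p_y0 = sum(p for (_, y), p in joint.items() if y == 0)
p_x0_y0 = joint.get((0, 0), Fraction(0))

print(cov)                        # zero covariance: uncorrelated
print(p_x0_y0, p_x0 * p_y0)       # unequal: not independent
```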
Basic Concepts of Mathematical Statistics
1. Basic concepts
Population: the totality of the objects under study. It is a random variable, denoted by $X$.
Individual: each basic element that makes up the population.
Simple random sample: $n$ mutually independent random variables $X_{1},X_{2},\cdots,X_{n}$ drawn from the population $X$ and having the same distribution as the population are called a simple random sample of size $n$, or simply a sample.
Statistic: let $X_{1},X_{2},\cdots,X_{n}$ be a sample from the population $X$. If $g(X_{1},X_{2},\cdots,X_{n})$ is a continuous function of the sample and $g(\cdot)$ contains no unknown parameters, then $g(X_{1},X_{2},\cdots,X_{n})$ is called a statistic.
Sample mean: $\overline{X} = \frac{1}{n}\sum_{i=1}^{n}X_{i}$
Sample variance: $S^{2} = \frac{1}{n-1}\sum_{i=1}^{n}(X_{i} - \overline{X})^{2}$
Sample moments: the $k$-th sample raw moment is $A_{k} = \frac{1}{n}\sum_{i=1}^{n}X_{i}^{k},\ k = 1,2,\cdots$
The $k$-th sample central moment is $B_{k} = \frac{1}{n}\sum_{i=1}^{n}(X_{i} - \overline{X})^{k},\ k = 1,2,\cdots$
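The sample mean, the sample variance $S^{2}$ (divisor $n - 1$), and the second sample central moment $B_{2}$ (divisor $n$) can be computed directly from their definitions. The data in this sketch are made up; the comparison with Python's standard `statistics` module is only a cross-check.

```python
import statistics

sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]   # hypothetical observations
n = len(sample)

x_bar = sum(sample) / n                                  # sample mean
s2 = sum((x - x_bar) ** 2 for x in sample) / (n - 1)     # sample variance S^2
b2 = sum((x - x_bar) ** 2 for x in sample) / n           # second central moment B_2

print(x_bar, s2, b2)
print(statistics.mean(sample), statistics.variance(sample), statistics.pvariance(sample))
```

`statistics.variance` uses the divisor $n - 1$ and `statistics.pvariance` uses the divisor $n$, so they line up with $S^{2}$ and $B_{2}$ respectively.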
2. Distributions
$\chi^{2}$ distribution: $\chi^{2} = X_{1}^{2} + X_{2}^{2} + \cdots + X_{n}^{2} \sim \chi^{2}(n)$, where $X_{1},X_{2},\cdots,X_{n}$ are mutually independent and all follow $N(0,1)$.
$t$ distribution: $T = \frac{X}{\sqrt{Y/n}} \sim t(n)$,
where $X \sim N(0,1)$, $Y \sim \chi^{2}(n)$, and $X$, $Y$ are independent.
$F$ distribution: $F = \frac{X/n_{1}}{Y/n_{2}} \sim F(n_{1},n_{2})$, where $X \sim \chi^{2}(n_{1})$, $Y \sim \chi^{2}(n_{2})$, and $X$, $Y$ are independent.
Quantile: if $P(X \leq x_{\alpha}) = \alpha$, then $x_{\alpha}$ is called the $\alpha$-quantile of $X$.
3. Common sampling distributions for a normal population
(1) Let $X_{1},X_{2},\cdots,X_{n}$ be a sample from the normal population $N(\mu,\sigma^{2})$, with
$\overline{X} = \frac{1}{n}\sum_{i=1}^{n}X_{i}$ and $S^{2} = \frac{1}{n-1}\sum_{i=1}^{n}(X_{i} - \overline{X})^{2}$. Then:
1) $\overline{X} \sim N\left(\mu,\frac{\sigma^{2}}{n}\right)$, or equivalently $\frac{\overline{X} - \mu}{\sigma/\sqrt{n}} \sim N(0,1)$
2) $\frac{(n-1)S^{2}}{\sigma^{2}} = \frac{1}{\sigma^{2}}\sum_{i=1}^{n}(X_{i} - \overline{X})^{2} \sim \chi^{2}(n-1)$
3) $\frac{1}{\sigma^{2}}\sum_{i=1}^{n}(X_{i} - \mu)^{2} \sim \chi^{2}(n)$
4) $\frac{\overline{X} - \mu}{S/\sqrt{n}} \sim t(n-1)$
4. Important formulas and conclusions
(1) For $\chi^{2} \sim \chi^{2}(n)$: $E(\chi^{2}(n)) = n,\ D(\chi^{2}(n)) = 2n$;
(2) For $T \sim t(n)$: $E(T) = 0$ $(n > 1)$, $D(T) = \frac{n}{n-2}$ $(n > 2)$;
(3) For $F \sim F(m,n)$:
$\frac{1}{F} \sim F(n,m),\ F_{a/2}(m,n) = \frac{1}{F_{1-a/2}(n,m)}$;
(4) For any population $X$ (with finite variance):
$E(\overline{X}) = E(X),\ E(S^{2}) = D(X),\ D(\overline{X}) = \frac{D(X)}{n}$
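Conclusion (4) holds for any population with finite variance, and for a small discrete population it can be verified exactly by enumerating every possible sample. In the sketch below the population distribution `pmf` and the sample size `n = 3` are assumptions chosen to keep the enumeration tiny.

```python
from fractions import Fraction
from itertools import product

# Hypothetical population: X takes the values 0, 1, 3
pmf = {0: Fraction(1, 2), 1: Fraction(1, 3), 3: Fraction(1, 6)}
n = 3   # sample size

mu = sum(x * p for x, p in pmf.items())                # E(X)
var = sum((x - mu) ** 2 * p for x, p in pmf.items())   # D(X)

e_mean = Fraction(0)      # accumulates E(X_bar)
e_mean_sq = Fraction(0)   # accumulates E(X_bar^2)
e_s2 = Fraction(0)        # accumulates E(S^2)
for sample in product(pmf, repeat=n):        # all i.i.d. samples of size n
    weight = Fraction(1)
    for x in sample:
        weight *= pmf[x]                     # independence: probabilities multiply
    x_bar = Fraction(sum(sample), n)
    s2 = sum((x - x_bar) ** 2 for x in sample) / (n - 1)
    e_mean += weight * x_bar
    e_mean_sq += weight * x_bar**2
    e_s2 += weight * s2

var_of_mean = e_mean_sq - e_mean**2          # D(X_bar)

print(e_mean == mu, e_s2 == var, var_of_mean == var / n)
```

The divisor $n - 1$ in $S^{2}$ is what makes $E(S^{2}) = D(X)$ hold exactly; dividing by $n$ instead would give $\frac{n-1}{n}D(X)$.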