Sufficient Statistics

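The distribution of a random variable can depend on the values of some parameters. A sufficient statistic captures all the information that the sample carries about those parameters. In other words, once the value of a sufficient statistic is known, the conditional distribution of the random variable given that statistic no longer depends on the original parameters. Definitions found online are as follows: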

  1. In statistics, a statistic is sufficient for the parameter θ, which indexes the distribution family of the data, precisely when the data's conditional probability distribution, given the statistic's value, no longer depends on θ. P(x|t,θ) = P(x|t)
  2. Suppose one has samples from a distribution, does not know exactly what that distribution is, but does know that it comes from a certain set of distributions that is determined partly or wholly by a certain parameter, θ. A statistic is sufficient for inference about θ if and only if the values of any sample from that distribution give no more information about θ than does the value of the statistic on that sample. E.g. if we know that a distribution is normal with variance 1 but has an unknown mean, the sample average is a sufficient statistic for the mean.
  3. Sufficient statistics have many uses in statistical inference problems. In hypothesis testing, the Likelihood Ratio Test can often be reduced to a sufficient statistic of the data. In parameter estimation, the Minimum Variance Unbiased Estimator of a parameter θ can be characterized by sufficient statistics and the Rao-Blackwell Theorem. Minimal sufficient statistics are, roughly speaking, sufficient statistics that cannot be compressed any more without losing information about the unknown parameter. Completeness is a technical characterization of sufficient statistics that allows one to prove minimality. These topics are covered in detail in this module. Further examples of sufficient statistics may be found in the module on the Fisher-Neyman Factorization Theorem.
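Definition 1 can be checked numerically on a small case. The sketch below is an illustration that is not part of the quoted sources: it assumes i.i.d. Bernoulli(θ) data and takes the sum of the observations as the statistic t. The function name conditional_given_sum and the values of n, t and θ are hypothetical choices made only for the demonstration. It enumerates every binary sequence with the given sum and confirms that P(x|t,θ) is the same for two different values of θ.

```python
from itertools import product
from math import comb, isclose


def conditional_given_sum(theta, n, t):
    """Return P(x | sum(x) = t, theta) for every binary sequence x of length n with sum t."""
    # Joint probability of each i.i.d. Bernoulli(theta) sequence whose sum equals t.
    joint = {
        x: theta ** sum(x) * (1 - theta) ** (n - sum(x))
        for x in product((0, 1), repeat=n)
        if sum(x) == t
    }
    total = sum(joint.values())  # this is P(T = t | theta)
    return {x: p / total for x, p in joint.items()}


# Hypothetical sample size, statistic value and parameter values.
n, t = 5, 2
cond_a = conditional_given_sum(0.2, n, t)
cond_b = conditional_given_sum(0.7, n, t)

# The conditional distribution is uniform over the comb(n, t) sequences for both thetas.
for x in cond_a:
    assert isclose(cond_a[x], cond_b[x])
    assert isclose(cond_a[x], 1 / comb(n, t))
```

Because the conditional probabilities agree for both parameter values, knowing the individual observations in addition to their sum tells us nothing further about θ in this model.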

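A statistic is a function of the sample that involves no unknown quantities. In general, a statistic carries less information than the sample itself, but the information it discards may be irrelevant. For a normal distribution, for example, the sample mean and the sample variance are jointly sufficient for the mean and variance: they contain less information than the full sample, yet given their values the conditional distribution of the sample no longer depends on the parameters.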
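One way to see the normal case concretely is through the likelihood. The sketch below is an illustration under stated assumptions, not something taken from the references: it assumes i.i.d. normal data and uses the identity sum((x_i - mu)^2) = n*s2 + n*(xbar - mu)^2, where xbar is the sample mean and s2 is the sample variance with divisor n. The function names loglik_full and loglik_from_stats and the simulated data are hypothetical. It shows that the log-likelihood computed from the whole sample equals the one computed from n, xbar and s2 alone.

```python
import math
import random


def loglik_full(xs, mu, sigma2):
    """Normal log-likelihood computed from every observation."""
    n = len(xs)
    ss = sum((x - mu) ** 2 for x in xs)
    return -0.5 * n * math.log(2 * math.pi * sigma2) - ss / (2 * sigma2)


def loglik_from_stats(n, xbar, s2, mu, sigma2):
    """Same log-likelihood computed only from n, the sample mean and the (1/n) sample variance."""
    # Uses sum((x_i - mu)^2) = n * s2 + n * (xbar - mu)^2.
    ss = n * s2 + n * (xbar - mu) ** 2
    return -0.5 * n * math.log(2 * math.pi * sigma2) - ss / (2 * sigma2)


# Hypothetical simulated sample, used only for the demonstration.
random.seed(0)
xs = [random.gauss(1.5, 2.0) for _ in range(50)]
n = len(xs)
xbar = sum(xs) / n
s2 = sum((x - xbar) ** 2 for x in xs) / n

# For any candidate (mu, sigma2) the two computations agree up to rounding.
for mu, sigma2 in [(0.0, 1.0), (1.5, 4.0), (-2.0, 0.5)]:
    full = loglik_full(xs, mu, sigma2)
    reduced = loglik_from_stats(n, xbar, s2, mu, sigma2)
    assert math.isclose(full, reduced, rel_tol=1e-9)
```

Since the likelihood depends on the data only through these two numbers, any likelihood-based inference about the mean and variance can discard the raw sample once they have been computed.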

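A small real-world example [1] is the relationship between zodiac signs and personality. Personality is certainly a random variable, and its distribution depends on a great many factors, such as family, the region where one grew up, education, and physiology. Yet, strangely, in many cases the information in all of these factors seems to be condensed into the single piece of information that is the "zodiac sign". For example, if you want to judge someone's personality, you can ask what their sign is; given the sign, you have an estimate of the "distribution" of their personality.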

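In many cases you can add another statistic, such as blood type, and the estimate becomes somewhat more precise. What is harder to believe is that some people also add the Chinese zodiac animal (shengxiao), a "statistic" unique to China, and then make statistical judgments about the personality of each sign.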

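These baffling combinations, and inferences so arcane that they border on "witchcraft", turn out to match in most cases! Can anyone tell me the reasoning behind this? Is it really that miraculous? Or is it God's arrangement?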

References

[1] http://sinokylin.spaces.live.com/Blog/cns!F34E44CB40CC7976!543.entry

[2] L. Scharf. (1991). Statistical Signal Processing. Addison-Wesley.
