
Analysis of Parallel Bayesian Classification

2012-12-21 


1 Bayes Trainer

Package: org.apache.mahout.classifier.bayes

Implementation

The implementation is divided up into three parts:

    The Trainer -- responsible for doing the counting of the words and the labels

    The Model -- responsible for holding the training data in a useful way

    The Classifier -- responsible for using the trainer's output to determine the category of previously unseen documents

1 Trainer

The trainer is implemented by several classes:

    BayesDriver

    Creates the Hadoop Bayes job and outputs the model. This class wraps the four map/reduce drivers listed below.

    common.BayesFeatureDriver

    common.BayesTfIdfDriver

    common.BayesWeightSummerDriver

    BayesThetaNormalizerDriver

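The trainer's input is read with KeyValueTextInputFormat. The first token on each line is the class label and the remaining tokens are the features (words), as in the following format: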

hockey puck stick goalie forward defenseman referee ice checking slapshot helmet
football field football pigskin referee helmet turf tackle

Here hockey and football are the class labels, and the remaining words are the features.
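To make the input format concrete, here is a minimal sketch in plain Java of how one such line could be split into a label and its features, followed by the kind of per-label word counting the trainer is described as doing. It is not Mahout code: the class name TrainingLineSketch and its methods are hypothetical, and it assumes tokens are separated by whitespace (a tab between key and value would also be handled by this split).

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; this is not part of org.apache.mahout.classifier.bayes.
class TrainingLineSketch {

    /** Returns the class label, assumed to be the first whitespace-separated token. */
    static String label(String line) {
        return line.trim().split("\\s+")[0];
    }

    /** Returns the features, assumed to be every token after the label. */
    static List<String> features(String line) {
        String[] tokens = line.trim().split("\\s+");
        return Arrays.asList(tokens).subList(1, tokens.length);
    }

    /** Counts how often each feature occurs under each label. */
    static Map<String, Map<String, Integer>> count(List<String> lines) {
        Map<String, Map<String, Integer>> counts = new HashMap<>();
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            Map<String, Integer> perLabel =
                counts.computeIfAbsent(label(line), k -> new HashMap<>());
            for (String feature : features(line)) {
                perLabel.merge(feature, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "hockey puck stick goalie forward defenseman referee ice checking slapshot helmet",
            "football field football pigskin referee helmet turf tackle");
        Map<String, Map<String, Integer>> counts = count(lines);
        // Print the feature counts gathered for each label.
        for (Map.Entry<String, Map<String, Integer>> e : counts.entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

Note that in the football line the word football appears twice: once as the label and once as an ordinary feature, so it is counted once under the football label.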

2 Model
