Decision Tree Algorithm: the DecisionTreeClassifier Class

code • 2022-5-16 • python

Contents

Main parameters
- criterion
- splitter
- max_depth
- min_samples_split
- min_samples_leaf
- max_features
- max_leaf_nodes
Attributes
- classes_
- feature_importances_
- max_features_
- n_classes_
- n_features_in_
Methods
- apply(X[, check_input])
- decision_path(X[, check_input])
- fit(X, y[, sample_weight, check_input, ...])
- get_depth()
- get_n_leaves()
- get_params([deep])
- predict(X[, check_input])
- predict_proba(X[, check_input])
- score(X, y[, sample_weight])
- set_params(**params)

sklearn.tree.DecisionTreeClassifier(*, criterion='gini', splitter='best', max_depth=None, min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_features=None, random_state=None, max_leaf_nodes=None, min_impurity_decrease=0.0, class_weight=None, ccp_alpha=0.0)

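Main parameters

criterion

{"gini", "entropy"}, default="gini"
The function used to measure the quality of a split, i.e. how the splitting feature and threshold are chosen.

Option	Description
"gini"	Gini impurity
"entropy"	Information gain based on Shannon entropy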
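As a rough illustration of what the two criteria measure, the sketch below computes the textbook Gini impurity and Shannon entropy of a list of labels by hand. The helper names `gini` and `entropy` are made up for this example and are not part of scikit-learn.

```python
import math
from collections import Counter

def gini(labels):
    """Gini impurity: 1 - sum(p_k^2) over the class proportions p_k."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    """Shannon entropy in bits: -sum(p_k * log2(p_k))."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

mixed = [0, 0, 1, 1]   # two classes, evenly mixed
pure = [1, 1, 1, 1]    # a single class

print(gini(mixed), entropy(mixed))
print(gini(pure), entropy(pure))
```

Both measures are zero for a pure node and largest for an evenly mixed one, which is why a split that lowers them is considered a good split.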

splitter

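{"best", "random"}, default="best"
Strategy used to choose the split at each node: "best" picks the best split, "random" picks the best among randomly drawn splits.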

max_depth

int, default=None
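Maximum depth of the tree. If None, nodes are expanded until all leaves are pure or contain fewer than min_samples_split samples.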
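A minimal sketch of the effect of max_depth, assuming the iris dataset bundled with scikit-learn as sample data (the dataset choice and the variable names are only for illustration):

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Unrestricted tree: grows until the leaves are pure.
full = DecisionTreeClassifier(random_state=0).fit(X, y)
# Capped tree: no path from the root may contain more than 2 splits.
shallow = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

print(full.get_depth(), shallow.get_depth())
```

A smaller max_depth gives a simpler tree and is a common way to limit overfitting.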

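min_samples_split

int or float, default=2
Minimum number of samples a node must contain before it can be split.
A node with fewer samples than this value is not split any further. If a float is given, it is treated as a fraction and ceil(min_samples_split * n_samples) is used.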

min_samples_leaf

int or float, default=1
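Minimum number of samples required in a leaf. A split is only made if each resulting child node keeps at least this many samples. If a float is given, ceil(min_samples_leaf * n_samples) is used.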
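The two size limits can be checked on a fitted tree. The sketch below is one way to do it, assuming the iris dataset and reading node sizes from the fitted `tree_` object; the threshold values 20 and 5 are arbitrary choices for the example.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(
    min_samples_split=20, min_samples_leaf=5, random_state=0
).fit(X, y)

tree = clf.tree_
is_leaf = tree.children_left == -1   # leaves have no children (marked with -1)
samples = tree.n_node_samples        # training samples reaching each node

print("smallest node that was split:", samples[~is_leaf].min())
print("smallest leaf:", samples[is_leaf].min())
```

Every node that was split holds at least min_samples_split samples, and every leaf holds at least min_samples_leaf samples.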

max_features

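int, float or {"auto", "sqrt", "log2"}, default=None
Number of features to consider when looking for the best split at each node. Note that "auto" is deprecated in newer scikit-learn releases; "sqrt" gives the same result.

Value	Number of features considered
int	max_features
float	int(max_features * n_features)
"auto"	sqrt(n_features)
"sqrt"	sqrt(n_features)
"log2"	log2(n_features)
None	n_features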
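To see how each setting is resolved, the sketch below fits a tree on synthetic data with 64 features and reads back the fitted `max_features_` attribute. The random data and the `resolved` dictionary are assumptions made for the example, and "auto" is left out because newer releases no longer accept it.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.random((200, 64))          # 64 synthetic features
y = rng.integers(0, 2, size=200)   # random binary labels

resolved = {}
for setting in [10, 0.25, "sqrt", "log2", None]:
    clf = DecisionTreeClassifier(max_features=setting, random_state=0).fit(X, y)
    resolved[setting] = clf.max_features_   # the value actually used during fitting
    print(repr(setting), "->", clf.max_features_)
```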

max_leaf_nodes

int, default=None
Upper limit on the number of leaf nodes. If None, the number of leaves is unlimited.
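A short sketch of the limit in action, again assuming the iris dataset as sample data:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Allow at most 4 leaves; the fitted tree never has more.
clf = DecisionTreeClassifier(max_leaf_nodes=4, random_state=0).fit(X, y)
print(clf.get_n_leaves())
```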

Attributes

classes_

ndarray of shape (n_classes,) or list of ndarray
The class labels (for multi-output problems, a list of arrays of class labels).

feature_importances_

ndarray of shape (n_features,)
The importance of each feature.

max_features_

int
The resolved value of the max_features parameter.

n_classes_

int or list of int
The number of classes, or, for multi-output problems, a list containing the number of classes of each output.

n_features_in_

int
Number of features seen when the tree was fitted.
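The attributes above only exist after fit has been called. A minimal sketch that prints them, assuming the iris dataset (150 samples, 4 features, 3 classes):

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(random_state=0).fit(X, y)

print(clf.classes_)               # the distinct labels found in y
print(clf.n_classes_)             # how many classes there are
print(clf.n_features_in_)         # how many columns X had during fit
print(clf.max_features_)          # max_features=None resolves to all features
print(clf.feature_importances_)   # one non-negative weight per feature
```

The importances are normalized, so for a tree with at least one split they add up to 1.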

Methods

apply(X[, check_input])

Return the index of the leaf node that each sample ends up in.

decision_path(X[, check_input])

Return the decision path in the tree, i.e. the nodes each sample passes through.
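The relationship between apply and decision_path can be sketched as follows, assuming the iris dataset. decision_path returns a sparse indicator matrix; it is converted to a dense array here only to make it easy to inspect.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

leaf_ids = clf.apply(X)                  # one leaf index per sample
paths = clf.decision_path(X).toarray()   # dense copy of the sparse indicator matrix

# Row i has a 1 in every column (node) that sample i passes through,
# ending with the leaf reported by apply.
print(leaf_ids[:5])
print(paths[0])
```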

fit(X, y[, sample_weight, check_input, …])

Build a decision tree from the given training set.

get_depth()

Return the depth of the decision tree.

get_n_leaves()

Return the number of leaves of the decision tree.

get_params([deep])

Get the parameters of this decision tree.

predict(X[, check_input])

Predict the class of each sample in X (the same method returns a regression value on the regressor counterpart).

predict_proba(X[, check_input])

Predict the probability that each input sample in X belongs to each class.
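A small sketch of how predict and predict_proba relate, assuming the iris dataset; the variable `from_proba` is introduced only for the comparison.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

proba = clf.predict_proba(X)   # shape (n_samples, n_classes), columns ordered like clf.classes_
pred = clf.predict(X)

# The predicted class is the one with the highest probability in the sample's leaf.
from_proba = clf.classes_[np.argmax(proba, axis=1)]
print(proba[:3])
print(pred[:3])
```

Each row of the probability matrix sums to 1, because it is the class distribution of the training samples in the leaf the sample falls into.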

score(X, y[, sample_weight])

Return the mean accuracy on the given test data and labels.
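For a classifier, the score is plain accuracy. The sketch below compares it with a manual calculation, assuming the iris dataset and an arbitrary 70/30 train/test split:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

clf = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

accuracy = clf.score(X_test, y_test)
manual = np.mean(clf.predict(X_test) == y_test)   # fraction of correct predictions
print(accuracy, manual)
```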

set_params(**params)

Set the parameters of this estimator (the constructor arguments, not the fitted attributes).
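A minimal sketch of reading and changing parameters; no data is needed because get_params and set_params work on an unfitted estimator. The name `same` is used only to show the return value.

```python
from sklearn.tree import DecisionTreeClassifier

clf = DecisionTreeClassifier(max_depth=3)
print(clf.get_params()["max_depth"])

# set_params returns the estimator itself, so calls can be chained.
same = clf.set_params(max_depth=5, criterion="entropy")
print(clf.get_params()["max_depth"], clf.get_params()["criterion"])
```

Changed parameters take effect the next time fit is called; an already fitted tree is not rebuilt.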

Sharing is welcome; please credit the source when reposting: 内存溢出

Original URL: http://outofmemory.cn/langs/916996.html
