Tuning hyperparameters

The simplest way to simplify the decision tree is to limit its depth. How deep is it now? Figure 2.5 shows 20 splits, or 21 layers. At the same time, we have only three features, or six if we count the one-hot encoded categorical color. Let's aggressively limit the maximum depth of the tree so that it is comparable with the number of features. The tree_model object has a max_depth property, so we set it to a value below the number of features:

In []: 
tree_model.max_depth = 4

After this change, we can retrain our model and reevaluate its accuracy:

In []: 
tree_model = tree_model.fit(X_train, y_train) 
tree_model.score(X_train, y_train) 
Out[]: 
0.90571428571428569

Note that accuracy on the training set is now lower by about 6%. How about the test set?

In []: 
tree_model.score(X_test, y_test) 
Out[]: 
0.92000000000000004

Accuracy on previously unseen data is now higher by about 4%. This doesn't look like a great achievement until you realize that it corresponds to an additional 40 correctly classified creatures out of our initial set of 1,000. In modern machine learning contests, the final difference between 1st and 100th place can easily be about 1%.
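To see how the depth limit trades training accuracy against test accuracy, it can help to try several values of max_depth side by side. The following is a minimal sketch, not the chapter's own code: it assumes scikit-learn's DecisionTreeClassifier and uses a synthetic dataset from make_classification as a stand-in for the creature data, so the numbers it prints will differ from those above. The names results and model are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (an assumption; this is not the book's dataset):
# 1,000 samples, 6 features, with some label noise so that a full tree overfits.
X, y = make_classification(n_samples=1000, n_features=6, n_informative=3,
                           n_redundant=0, flip_y=0.1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# Fit one tree per depth limit; None means the depth is unlimited.
results = {}
for depth in [2, 4, 8, None]:
    model = DecisionTreeClassifier(max_depth=depth, random_state=0)
    model.fit(X_train, y_train)
    results[depth] = (
        model.score(X_train, y_train),  # accuracy on the training set
        model.score(X_test, y_test),    # accuracy on the held-out test set
        model.get_depth(),              # depth the fitted tree actually reached
    )

for depth, (train_acc, test_acc, actual_depth) in results.items():
    print(f"max_depth={depth}: train={train_acc:.3f}, "
          f"test={test_acc:.3f}, actual depth={actual_depth}")
```

An unlimited tree typically fits the training set almost perfectly, while shallower trees give up some training accuracy. Whether the test accuracy improves depends on the data, which is why the comparison is worth running rather than assuming.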

Let's draw the tree structure after pruning. The code for this visualization is the same as before:

Figure 2.7: Tree structure after limiting its depth
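The earlier visualization listing is not repeated in this section. As a quick alternative check that the depth limit took effect, a fitted tree can also be printed as plain text. The sketch below is an illustration under assumptions: it uses scikit-learn's export_text helper and the bundled iris dataset in place of the creature data, and the variable names are hypothetical.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

# The iris data is only a stand-in (an assumption; not the book's dataset).
iris = load_iris()

# Limit the depth in the same way as in the text above.
pruned_tree = DecisionTreeClassifier(max_depth=4, random_state=0)
pruned_tree.fit(iris.data, iris.target)

# get_depth() reports the depth the fitted tree actually reached.
print("Depth after limiting:", pruned_tree.get_depth())

# export_text renders the split rules as an indented text outline.
tree_text = export_text(pruned_tree, feature_names=list(iris.feature_names))
print(tree_text)
```

The text outline shows one indentation level per layer, so a tree limited to max_depth = 4 has at most four levels of nested rules.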
