Features
When we talk about features in the context of ML, we mean some characteristic property of the object or phenomenon we are investigating.
Other names for the same concept you'll see in some publications are explanatory variable, independent variable, and predictor.
Features are used to distinguish objects from each other and to measure the similarity between them.
For instance:
- If the objects of interest are books, features could be the title, page count, author's name, year of publication, genre, and so on
- If the objects of interest are images, features could be the intensity of each pixel
- If the objects of interest are blog posts, features could be the language, length, or presence of certain terms
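The blog post case can be made concrete with a minimal sketch in Python. The helper name `extract_features`, the sample text, and the chosen terms are all illustrative assumptions; the snippet only shows one plausible way to turn raw text into a length feature and a few term-presence features.

```python
def extract_features(text, terms):
    """Map a raw blog post to a dict of simple features (hypothetical helper)."""
    words = text.lower().split()
    # Length measured in whitespace-separated words (an assumed definition).
    features = {"length": len(words)}
    for term in terms:
        # Presence of each term becomes a boolean feature.
        features[f"has_{term}"] = term in words
    return features


# Made-up sample text, used only for illustration.
post = "Features help a model tell objects apart"
post_features = extract_features(post, terms=["features", "labels"])
print(post_features)
```

Two posts described this way can then be compared feature by feature, which is the sense in which features measure similarity between objects.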
It's useful to imagine your data as a spreadsheet table. In this case, each sample (data point) would be a row, and each feature would be a column. For example, Table 1.1 shows a tiny dataset of books consisting of four samples where each has eight features.
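The row-and-column picture can also be sketched in code. The snippet below assumes a plain list of dictionaries as the table representation; the three books and their values are placeholders invented for illustration and are not the contents of Table 1.1.

```python
# Each dict is one sample (a row); each key is one feature (a column).
# Placeholder values only, not the data from Table 1.1.
books = [
    {"title": "Book A", "page_count": 120, "genre": "fiction"},
    {"title": "Book B", "page_count": 340, "genre": "science"},
    {"title": "Book C", "page_count": 95, "genre": "fiction"},
]

feature_names = list(books[0].keys())               # the column headers
page_counts = [book["page_count"] for book in books]  # one column across all rows

n_samples, n_features = len(books), len(feature_names)
print(n_samples, n_features)
```

Reading down a column gives every sample's value for a single feature, while reading across a row gives the full description of a single sample.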
Table 1.1: an example of an ML dataset (dummy books):