上QQ阅读APP看书，第一时间看更新

How to do it...

To access the observation variables, type:

iris.data

This outputs a NumPy array:

array([[ 5.1,  3.5,  1.4,  0.2],
       [ 4.9,  3. ,  1.4,  0.2],
       [ 4.7,  3.2,  1.3,  0.2], 
#...rest of output suppressed because of length

Let's examine the NumPy array:

iris.data.shape

This returns:

(150L, 4L)

This means that the data is 150 rows by 4 columns. Let's look at the first row:

iris.data[0]

array([ 5.1,  3.5,  1.4,  0.2])

The NumPy array for the first row has four numbers.

To determine what they mean, type:

          iris.feature_names
['sepal length (cm)',
 'sepal width (cm)',
 'petal length (cm)',
 'petal width (cm)']
         

The feature or column names name the data. They are strings, and in this case, they correspond to dimensions in different types of flowers. Putting it all together, we have 150 examples of flowers with four measurements per flower in centimeters. For example, the first flower has measurements of 5.1 cm for sepal length, 3.5 cm for sepal width, 1.4 cm for petal length, and 0.2 cm for petal width. Now, let's look at the output variable in a similar manner:

iris.target

This yields an array of outputs: 0, 1, and 2. There are only three outputs. Type this:

iris.target.shape

You get a shape of:

(150L,)

This refers to an array of length 150 (150 x 1). Let's look at what the numbers refer to:

iris.target_names

array(['setosa', 'versicolor', 'virginica'], 
      dtype='|S10')

The output of the iris.target_names variable gives the English names for the numbers in the iris.target variable. The number zero corresponds to the setosa flower, number one corresponds to the versicolor flower, and number two corresponds to the virginica flower. Look at the first row of iris.target:

iris.target[0]

This produces zero, and thus the first row of observations we examined before correspond to the setosa flower.