Last updated: 2021-06-11 18:27:53
Cover
Copyright Information
Preface
About the Book
1. Introduction to Data Science in Python
Introduction
Application of Data Science
Overview of Python
Python for Data Science
Scikit-Learn
Summary
2. Regression
Simple Linear Regression
Multiple Linear Regression
Conducting Regression Analysis Using Python
Multiple Regression Analysis
Assumptions of Regression Analysis
Explaining the Results of Regression Analysis
3. Binary Classification
Understanding the Business Context
Feature Engineering
Data-Driven Feature Engineering
Correlation Matrix and Visualization
4. Multiclass Classification with RandomForest
Training a Random Forest Classifier
Evaluating the Model's Performance
Maximum Depth
Minimum Sample in Leaf
Maximum Features
5. Performing Your First Cluster Analysis
Clustering with k-means
Interpreting k-means Results
Choosing the Number of Clusters
Initializing Clusters
Calculating the Distance to the Centroid
Standardizing Data
6. How to Assess Performance
Splitting Data
Assessing Model Performance for Regression Models
Assessing Model Performance for Classification Models
The Confusion Matrix
Receiver Operating Characteristic Curve
Area Under the ROC Curve
Saving and Loading Models
7. The Generalization of Machine Learning Models
Overfitting
Underfitting
Data
Random State
Cross-Validation
cross_val_score
LogisticRegressionCV
Hyperparameter Tuning with GridSearchCV
Hyperparameter Tuning with RandomizedSearchCV
Model Regularization with Lasso Regression
Ridge Regression
8. Hyperparameter Tuning
What Are Hyperparameters?
Finding the Best Hyperparameterization
Tuning Using Grid Search
GridSearchCV
Random Search
9. Interpreting a Machine Learning Model
Linear Model Coefficients
RandomForest Variable Importance
Variable Importance via Permutation
Partial Dependence Plots
Local Interpretation with LIME
10. Analyzing a Dataset
Exploring Your Data
Analyzing Your Dataset
Analyzing the Content of a Categorical Variable
Summarizing Numerical Variables
Visualizing Your Data
Boxplots
11. Data Preparation
Handling Row Duplication
Converting Data Types
Handling Incorrect Values
Handling Missing Values