Cover
Copyright Information
Credits
Foreword
About the Authors
About the Reviewers
www.PacktPub.com
Preface
Chapter 1. Data Understanding
Introduction
Using an empty aggregate to evaluate sample size
Evaluating the need to sample from the initial data
Using CHAID stumps when interviewing an SME
Using a single cluster K-means as an alternative to anomaly detection
Using an @NULL multiple Derive to explore missing data
Creating an Outlier report to give to SMEs
Detecting potential model instability early using the Partition node and Feature Selection node
Chapter 2. Data Preparation – Select
Using the Feature Selection node creatively to remove or decapitate perfect predictors
Running a Statistics node on anti-join to evaluate the potential missing data
Evaluating the use of sampling for speed
Removing redundant variables using correlation matrices
Selecting variables using the CHAID Modeling node
Selecting variables using the Means node
Selecting variables using single-antecedent Association Rules
Chapter 3. Data Preparation – Clean
Binning scale variables to address missing data
Using a full data model/partial data model approach to address missing data
Imputing in-stream mean or median
Imputing missing values randomly from uniform or normal distributions
Using random imputation to match a variable's distribution
Searching for similar records using a Neural Network for inexact matching
Using neuro-fuzzy searching to find similar names
Producing longer Soundex codes
Chapter 4. Data Preparation – Construct
Building transformations with multiple Derive nodes
Calculating and comparing conversion rates
Grouping categorical values
Transforming high skew and kurtosis variables with a multiple Derive node
Creating flag variables for aggregation
Using Association Rules for interaction detection/feature creation
Creating time-aligned cohorts
Chapter 5. Data Preparation – Integrate and Format
Speeding up merge with caching and optimization settings
Merging a lookup table
Shuffle-down (nonstandard aggregation)
Cartesian product merge using key-less merge by key
Multiplying out using Cartesian product merge user source and derive dummy
Changing large numbers of variable names without scripting
Parsing nonstandard dates
Parsing and performing a conversion on a complex stream
Sequence processing
Chapter 6. Selecting and Building a Model
Evaluating balancing with Auto Classifier
Building models with and without outliers
Using Neural Network for Feature Selection
Creating a bootstrap sample
Creating bagged logistic regression models
Using KNN to match similar cases
Using Auto Classifier to tune models
Next-Best-Offer for large datasets
Chapter 7. Modeling – Assessment, Evaluation, Deployment, and Monitoring
How (and why) to validate as well as test
Using classification trees to explore the predictions of a Neural Network
Correcting a confusion matrix for an imbalanced target variable by incorporating priors
Using aggregate to write cluster centers to Excel for conditional formatting
Creating a classification tree financial summary using aggregate and an Excel Export node
Reformatting data for reporting with a Transpose node
Changing formatting of fields in a Table node
Combining generated filters
Chapter 8. CLEM Scripting
Building iterative Neural Network forecasts
Quantifying variable importance with Monte Carlo simulation
Implementing champion/challenger model management
Detecting outliers with the jackknife method
Optimizing K-means cluster solutions
Automating time series forecasts
Automating HTML reports and graphs
Rolling your own modeling algorithm – Weibull analysis
Appendix A. Business Understanding
Define business objectives by Tom Khabaza
Assessing the situation by Meta Brown
Translating your business objective into a data mining objective by Dean Abbott
Produce a project plan – ensuring a realistic timeline by Keith McCormick
Index