Applied Predictive Analytics: Principles and Techniques for the Professional Data Analyst
Description:
Applied Predictive Analytics shows tech-savvy business managers and data analysts how to use the techniques of predictive analytics to solve practical business problems. It teaches readers the methods, principles, and techniques for conducting predictive analytics projects from start to finish. The author focuses on best practices, including tips and tricks, that are essential for successful predictive modeling.
The author explains the theory behind the principles of predictive analytics in plain English; readers don't need an extensive background in math and statistics, which makes the book well suited to most tech-savvy business and data analysts. Each of the techniques chapters will begin with a description of the specific technique and how it relates to the overall process model for predictive analytics. The depth of each description will match the complexity of the approach; the intent is to describe the techniques in enough depth for a practitioner to understand the effect of the major parameters, use the technique effectively, and interpret the results.
For example, with decision trees, the description covers the primary algorithms (C5, CART, and CHAID) in qualitative terms (what trees are, what a split is), how they are similar and different (Gini vs. entropy vs. chi-square tests), why one might use one technique over another, how one can be fooled by the models built using each algorithm (i.e., their weaknesses), what knobs one can adjust (depth, complexity penalties, priors, costs, etc.), and how to interpret the results.
Each of the techniques is illustrated by hands-on examples, either unique to the task or as part of a more comprehensive case study. The companion website will provide all of the data sets used to generate these examples, along with a free trial version of software, so that readers can recreate and explore the examples and case studies. The book concludes with a series of in-depth case studies that apply predictive analytics to common types of business scenarios.
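The Gini vs. entropy contrast mentioned in the decision tree example above can be made concrete with a small sketch. This is not taken from the book; it is a minimal illustration, assuming a binary target and a hypothetical toy split, of how the two impurity measures score the same candidate split. Gini impurity is the metric associated with CART, and entropy with C5; the chi-square test used by CHAID is omitted here. All function names and data are illustrative assumptions.

```python
import math
from collections import Counter


def gini(labels):
    """Gini impurity: 1 - sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits: -sum of p * log2(p) over classes."""
    n = len(labels)
    return -sum((count / n) * math.log2(count / n)
                for count in Counter(labels).values())


def split_gain(parent, left, right, impurity):
    """Reduction in impurity when parent is split into left and right."""
    n = len(parent)
    weighted = (len(left) / n) * impurity(left) \
        + (len(right) / n) * impurity(right)
    return impurity(parent) - weighted


# Hypothetical toy data: 1 = responder, 0 = non-responder.
parent = [1, 1, 1, 1, 0, 0, 0, 0]
left = [1, 1, 1, 0]    # records sent to the left branch
right = [1, 0, 0, 0]   # records sent to the right branch

gini_gain = split_gain(parent, left, right, gini)
entropy_gain = split_gain(parent, left, right, entropy)
print(f"Gini gain:    {gini_gain:.4f}")
print(f"Entropy gain: {entropy_gain:.4f}")
```

Both measures reward the same thing, purer child nodes, but on different scales, so the gains are only comparable across candidate splits scored with the same measure.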
Other
- Author: Dean Abbott
- Edition: 1
- Publication date: 2014-04-07
- Printing allowed: 2 pages
- Copying allowed: 10 pages
- Format: Page Fidelity
- ISBN 13: 9781118727935
- Print ISBN: 9781118727966
- ISBN 10: 1118727932
Table of contents
- Title Page
- Copyright
- Contents
- Chapter 1 Overview of Predictive Analytics
- What Is Analytics?
- What Is Predictive Analytics?
- Supervised vs. Unsupervised Learning
- Parametric vs. Non-Parametric Models
- Business Intelligence
- Predictive Analytics vs. Business Intelligence
- Do Predictive Models Just State the Obvious?
- Similarities between Business Intelligence and Predictive Analytics
- Predictive Analytics vs. Statistics
- Statistics and Analytics
- Predictive Analytics and Statistics Contrasted
- Predictive Analytics vs. Data Mining
- Who Uses Predictive Analytics?
- Challenges in Using Predictive Analytics
- Obstacles in Management
- Obstacles with Data
- Obstacles with Modeling
- Obstacles in Deployment
- What Educational Background Is Needed to Become a Predictive Modeler?
- Chapter 2 Setting Up the Problem
- Predictive Analytics Processing Steps: CRISP-DM
- Business Understanding
- The Three-Legged Stool
- Business Objectives
- Defining Data for Predictive Modeling
- Defining the Columns as Measures
- Defining the Unit of Analysis
- Which Unit of Analysis?
- Defining the Target Variable
- Temporal Considerations for Target Variable
- Defining Measures of Success for Predictive Models
- Success Criteria for Classification
- Success Criteria for Estimation
- Other Customized Success Criteria
- Doing Predictive Modeling Out of Order
- Building Models First
- Early Model Deployment
- Case Study: Recovering Lapsed Donors
- Overview
- Business Objectives
- Data for the Competition
- The Target Variables
- Modeling Objectives
- Model Selection and Evaluation Criteria
- Model Deployment
- Case Study: Fraud Detection
- Overview
- Business Objectives
- Data for the Project
- The Target Variables
- Modeling Objectives
- Model Selection and Evaluation Criteria
- Model Deployment
- Summary
- Chapter 3 Data Understanding
- What the Data Looks Like
- Single Variable Summaries
- Mean
- Standard Deviation
- The Normal Distribution
- Uniform Distribution
- Applying Simple Statistics in Data Understanding
- Skewness
- Kurtosis
- Rank-Ordered Statistics
- Categorical Variable Assessment
- Data Visualization in One Dimension
- Histograms
- Multiple Variable Summaries
- Hidden Value in Variable Interactions: Simpson’s Paradox
- The Combinatorial Explosion of Interactions
- Correlations
- Spurious Correlations
- Back to Correlations
- Crosstabs
- Data Visualization, Two or Higher Dimensions
- Scatterplots
- Anscombe’s Quartet
- Scatterplot Matrices
- Overlaying the Target Variable in Summary
- Scatterplots in More Than Two Dimensions
- The Value of Statistical Significance
- Pulling It All Together into a Data Audit
- Summary
- Chapter 4 Data Preparation
- Variable Cleaning
- Incorrect Values
- Consistency in Data Formats
- Outliers
- Multidimensional Outliers
- Missing Values
- Fixing Missing Data
- Feature Creation
- Simple Variable Transformations
- Fixing Skew
- Binning Continuous Variables
- Numeric Variable Scaling
- Nominal Variable Transformation
- Ordinal Variable Transformations
- Date and Time Variable Features
- ZIP Code Features
- Which Version of a Variable Is Best?
- Multidimensional Features
- Variable Selection Prior to Modeling
- Sampling
- Example: Why Normalization Matters for K-Means Clustering
- Summary
- Chapter 5 Itemsets and Association Rules
- Terminology
- Condition
- Left-Hand-Side, Antecedent(s)
- Right-Hand-Side, Consequent, Output, Conclusion
- Rule (Item Set)
- Support
- Antecedent Support
- Confidence, Accuracy
- Lift
- Parameter Settings
- How the Data Is Organized
- Standard Predictive Modeling Data Format
- Transactional Format
- Measures of Interesting Rules
- Deploying Association Rules
- Variable Selection
- Interaction Variable Creation
- Problems with Association Rules
- Redundant Rules
- Too Many Rules
- Too Few Rules
- Building Classification Rules from Association Rules
- Summary
- Chapter 6 Descriptive Modeling
- Data Preparation Issues with Descriptive Modeling
- Principal Component Analysis
- The PCA Algorithm
- Applying PCA to New Data
- PCA for Data Interpretation
- Additional Considerations before Using PCA
- The Effect of Variable Magnitude on PCA Models
- Clustering Algorithms
- The K-Means Algorithm
- Data Preparation for K-Means
- Selecting the Number of Clusters
- The Kohonen SOM Algorithm
- Visualizing Kohonen Maps
- Similarities with K-Means
- Summary
- Chapter 7 Interpreting Descriptive Models
- Standard Cluster Model Interpretation
- Problems with Interpretation Methods
- Identifying Key Variables in Forming Cluster Models
- Cluster Prototypes
- Cluster Outliers
- Summary
- Chapter 8 Predictive Modeling
- Decision Trees
- The Decision Tree Landscape
- Building Decision Trees
- Decision Tree Splitting Metrics
- Decision Tree Knobs and Options
- Reweighting Records: Priors
- Reweighting Records: Misclassification Costs
- Other Practical Considerations for Decision Trees
- Logistic Regression
- Interpreting Logistic Regression Models
- Other Practical Considerations for Logistic Regression
- Neural Networks
- Building Blocks: The Neuron
- Neural Network Training
- The Flexibility of Neural Networks
- Neural Network Settings
- Neural Network Pruning
- Interpreting Neural Networks
- Neural Network Decision Boundaries
- Other Practical Considerations for Neural Networks
- K-Nearest Neighbor
- The k-NN Learning Algorithm
- Distance Metrics for k-NN
- Other Practical Considerations for k-NN
- Naïve Bayes
- Bayes’ Theorem
- The Naïve Bayes Classifier
- Interpreting Naïve Bayes Classifiers
- Other Practical Considerations for Naïve Bayes
- Regression Models
- Linear Regression
- Linear Regression Assumptions
- Variable Selection in Linear Regression
- Interpreting Linear Regression Models
- Using Linear Regression for Classification
- Other Regression Algorithms
- Summary
- Chapter 9 Assessing Predictive Models
- Batch Approach to Model Assessment
- Percent Correct Classification
- Rank-Ordered Approach to Model Assessment
- Assessing Regression Models
- Summary
- Chapter 10 Model Ensembles
- Motivation for Ensembles
- The Wisdom of Crowds
- Bias Variance Tradeoff
- Bagging
- Boosting
- Improvements to Bagging and Boosting
- Random Forests
- Stochastic Gradient Boosting
- Heterogeneous Ensembles
- Model Ensembles and Occam’s Razor
- Interpreting Model Ensembles
- Summary
- Chapter 11 Text Mining
- Motivation for Text Mining
- A Predictive Modeling Approach to Text Mining
- Structured vs. Unstructured Data
- Why Text Mining Is Hard
- Text Mining Applications
- Data Sources for Text Mining
- Data Preparation Steps
- POS Tagging
- Tokens
- Stop Word and Punctuation Filters
- Character Length and Number Filters
- Stemming
- Dictionaries
- The Sentiment Polarity Movie Data Set
- Text Mining Features
- Term Frequency
- Inverse Document Frequency
- TF-IDF
- Cosine Similarity
- Multi-Word Features: N-Grams
- Reducing Keyword Features
- Grouping Terms
- Modeling with Text Mining Features
- Regular Expressions
- Uses of Regular Expressions in Text Mining
- Summary
- Chapter 12 Model Deployment
- General Deployment Considerations
- Deployment Steps
- Summary
- Chapter 13 Case Studies
- Survey Analysis Case Study: Overview
- Business Understanding: Defining the Problem
- Data Understanding
- Data Preparation
- Modeling
- Deployment: “What-If” Analysis
- Revisit Models
- Deployment
- Summary and Conclusions
- Help Desk Case Study
- Data Understanding: Defining the Data
- Data Preparation
- Modeling
- Revisit Business Understanding
- Deployment
- Summary and Conclusions
ABOUT E-BOOKS AT HEIMKAUP.IS
Your bookshelf is your own space, and your books are stored there. You can reach your bookshelf anywhere and anytime on a computer or smart device. Simple and convenient!
E-book to own
An e-book to own must be downloaded to the devices you want to use within one year of the book's purchase.
You can reach your books anywhere
You can access all your e-books and e-textbooks in an instant, anywhere and anytime, on your bookshelf. No bag, no Kindle, and no hassle (let alone excess baggage).
Easy to browse and search
You can move between pages and chapters as suits you best and jump straight to specific chapters from the table of contents. Search finds words, chapters, or pages in a single click.
Notes and highlights
You can highlight passages in different colors and write notes freely in the e-book. You can even see the notes and highlights of classmates and teachers if they allow it. All in one place.
What do you want to see? / You decide how the page looks
You adapt the page to your needs. Enlarge or shrink images and text with multi-level zoom to view the page as suits you best in your studies.
More benefits
- You can print pages from the book (within the limits set by the publisher)
- Option to link to other digital and interactive material, such as videos or questions on the content
- Easy to copy and paste content or text, e.g. for homework or essays
- Supports technology that helps students with visual or hearing impairments