Edition: Second
Authors: Tilo Wendler, Sören Gröttrup
ISBN: 9783030543372, 3030543374
Year of publication: 2021
Number of pages: 1285
Language: English
File format: PDF (converted to PDF, EPUB, or AZW3 on request)
File size: 86 MB
If you need the file for Data Mining with SPSS Modeler: Theory, Exercises and Solutions converted to PDF, EPUB, AZW3, MOBI, or DJVU, notify support and they will convert it for you.
Please note that Data Mining with SPSS Modeler: Theory, Exercises and Solutions is the original-language edition, not a Persian translation. The International Library website offers only original-language books and does not provide any books translated into or written in Persian.
Preface to the Second Edition
Preface to the First Edition
Contents
1: Introduction
1.1 The Concept of the SPSS Modeler
1.2 Structure and Features of This Book
1.2.1 Prerequisites for Using This Book
1.2.2 Structure of the Book and the Exercise/Solution Concept
1.2.3 Using the Data and Streams Provided with the Book
1.2.4 Datasets Provided with This Book
1.2.5 Template Concept of This Book
1.3 Introducing the Modeling Process
1.3.1 Exercises
1.3.2 Solutions
References
2: Basic Functions of the SPSS Modeler
2.1 Defining Streams and Scrolling Through a Dataset
2.2 Switching Between Different Streams
2.3 Defining or Modifying Value Labels
2.4 Adding Comments to a Stream
2.5 Exercises
2.6 Solutions
2.7 Data Handling Methods
2.7.1 Theory
2.7.2 Calculations
2.7.3 String Functions
2.7.4 Extracting/Selecting Records
2.7.5 Filtering Data
2.7.6 Data Standardization: Z-Transformation
2.7.7 Partitioning Datasets
2.7.8 Sampling Methods
2.7.9 Merge Datasets
2.7.10 Append Datasets
2.7.11 Exercises
2.7.12 Solutions
References
3: Univariate Statistics
3.1 Theory
3.1.1 Discrete Versus Continuous Variables
3.1.2 Scales of Measurement
3.1.3 Exercises
3.1.4 Solutions
3.2 Simple Data Examination Tasks
3.2.1 Theory
3.2.2 Frequency Distribution of Discrete Variables
3.2.3 Frequency Distribution of Continuous Variables
3.2.4 Distribution Analysis with the Data Audit Node
3.2.5 Concept of "SuperNodes" and Transforming a Variable to Normality
3.2.6 Reclassifying Values
3.2.7 Binning Continuous Data
3.2.8 Exercises
3.2.9 Solutions
References
4: Multivariate Statistics
4.1 Theory
4.2 Scatterplot
4.3 Scatterplot Matrix
4.4 Correlation
4.5 Correlation Matrix
4.6 Exclusion of Spurious Correlations
4.7 Contingency Tables
4.8 Exercises
4.9 Solutions
References
5: Regression Models
5.1 Introduction to Regression Models
5.1.1 Motivating Examples
5.1.2 Concept of the Modeling Process and Cross-Validation
5.2 Simple Linear Regression
5.2.1 Theory
5.2.2 Building the Stream in SPSS Modeler
5.2.3 Identification and Interpretation of the Model Parameters
5.2.4 Assessment of the Goodness of Fit
5.2.5 Predicting Unknown Values
5.2.6 Exercises
5.2.7 Solutions
5.3 Multiple Linear Regression
5.3.1 Theory
5.3.2 Building the Model in SPSS Modeler
5.3.3 Final MLR Model and Its Goodness of Fit
5.3.4 Prediction of Unknown Values
5.3.5 Cross-Validation of the Model
5.3.6 Boosting and Bagging (for Regression Models)
5.3.7 Exercises
5.3.8 Solutions
5.4 Generalized Linear (Mixed) Model
5.4.1 Theory
5.4.2 Building a Model with the GLMM Node
5.4.3 The Model Nugget
5.4.4 Cross-Validation and Fitting a Quadric Regression Model
5.4.5 Exercises
5.4.6 Solutions
5.5 The Auto Numeric Node
5.5.1 Building a Stream with the Auto Numeric Node
5.5.2 The Auto Numeric Model Nugget
5.5.3 Exercises
5.5.4 Solutions
References
6: Factor Analysis
6.1 Motivating Example
6.2 General Theory of Factor Analysis
6.3 Principal Component Analysis
6.3.1 Theory
6.3.2 Building a Model in SPSS Modeler
6.3.3 Exercises
6.3.4 Solutions
6.4 Principal Factor Analysis
6.4.1 Theory
6.4.2 Building a Model
6.4.3 Feature Selection vs. Feature Reduction
6.4.4 Exercises
6.4.5 Solutions
References
7: Cluster Analysis
7.1 Motivating Examples
7.2 General Theory of Cluster Analysis
7.2.1 Exercises
7.2.2 Solutions
7.3 TwoStep Hierarchical Agglomerative Clustering
7.3.1 Theory of Hierarchical Clustering
7.3.2 Characteristics of the TwoStep Algorithm
7.3.3 Building a Model in SPSS Modeler
7.3.4 Exercises
7.3.5 Solutions
7.4 K-Means Partitioning Clustering
7.4.1 Theory
7.4.2 Building a Model in SPSS Modeler
7.4.3 Exercises
7.4.4 Solutions
7.5 Auto Clustering
7.5.1 Motivation and Implementation of the Auto Cluster Node
7.5.2 Building a Model in SPSS Modeler
7.5.3 Exercises
7.5.4 Solutions
7.6 Summary
References
8: Classification Models
8.1 Motivating Examples
8.2 General Theory of Classification Models
8.2.1 Process of Training and Using a Classification Model
8.2.2 Classification Algorithms
8.2.3 Classification Versus Clustering
8.2.4 Decision Boundary and the Problem with Over- and Underfitting
8.2.5 Performance Measures of Classification Models
8.2.6 The Analysis Node
8.2.7 The Evaluation Node
8.2.8 A Detailed Example how to Create a ROC Curve
8.2.9 Exercises
8.2.10 Solutions
8.3 Logistic Regression
8.3.1 Theory
8.3.2 Building the Model in SPSS Modeler
8.3.3 Optional: Model Types and Variable Interactions
8.3.4 Final Model and Its Goodness of Fit
8.3.5 Classification of Unknown Values
8.3.6 Cross-Validation of the Model
8.3.7 Exercises
8.3.8 Solutions
8.4 Linear Discriminate Classification
8.4.1 Theory
8.4.2 Building the Model with SPSS Modeler
8.4.3 The Model Nugget and the Estimated Model Parameters
8.4.4 Exercises
8.4.5 Solutions
8.5 Support Vector Machine
8.5.1 Theory
8.5.2 Building the Model with SPSS Modeler
8.5.3 The Model Nugget
8.5.4 Exercises
8.5.5 Solutions
8.6 Neuronal Networks
8.6.1 Theory
8.6.2 Building a Network with SPSS Modeler
8.6.3 The Model Nugget
8.6.4 Exercises
8.6.5 Solutions
8.7 K-Nearest Neighbor
8.7.1 Theory
8.7.2 Building the Model with SPSS Modeler
8.7.3 The Model Nugget
8.7.4 Dimensional Reduction with PCA for Data Preprocessing
8.7.5 Exercises
8.7.6 Solutions
8.8 Decision Trees
8.8.1 Theory
8.8.2 Building a Decision Tree with the C5.0 Node
8.8.3 The Model Nugget
8.8.4 Building a Decision Tree with the CHAID Node
8.8.5 Exercises
8.8.6 Solutions
8.9 The Auto Classifier Node
8.9.1 Building a Stream with the Auto Classifier Node
8.9.2 The Auto Classifier Model Nugget
8.9.3 Exercises
8.9.4 Solutions
References
9: Using R with the Modeler
9.1 Advantages of R with the Modeler
9.2 Connecting with R
9.3 Test the SPSS Modeler Connection to R
9.4 Calculating New Variables in R
9.5 Model Building in R
9.6 Modifying the Data Structure in R
9.7 Solutions
References
10: Imbalanced Data and Resampling Techniques
10.1 Characteristics of Imbalanced Datasets and Consequences
10.2 Resampling Techniques
10.2.1 Random Oversampling Examples (ROSE)
10.2.2 Synthetic Minority Oversampling Technique (SMOTE)
10.2.3 Adaptive Synthetic Sampling Method (abbr. ADASYN)
10.3 Implementation in SPSS Modeler
10.4 Using R to Implement Balancing Methods
10.4.1 SMOTE-Approach Using R
10.4.2 ROSE-Approach Using R
10.5 Exercises
10.5.1 Exercise 1: Recap Imbalanced Data
10.5.2 Exercise 2: Resampling Application to Identify Cancer
10.5.3 Exercise 3: Comparing Resampling Algorithms
10.6 Solutions
10.6.1 Exercise 1: Recap Imbalanced Data
10.6.2 Exercise 2: Resampling Application to Identify Cancer
10.6.3 Exercise 3: Comparing Resampling Algorithms
References
11: Case Study: Fault Detection in Semiconductor Manufacturing Process
11.1 Case Study Background
11.2 The Standard Process in Data Mining
11.2.1 Business Understanding (CRISP-DM Step 1)
11.2.2 Data Understanding (CRISP-DM Step 2)
11.2.3 Data Preparation (CRISP-DM Step 3)
11.2.3.1 Merging Data (CRISP-DM Step 3.1)
11.2.3.2 Separating Training and Test Data (CRISP-DM Step 3.2)
11.2.3.3 Reducing Dimensionality of Data by Feature Removal (CRISP-DM Step 3.3)
11.2.3.3.1 Deleting Features or Records
11.2.3.3.2 Identifying Correlated Features
11.2.3.4 Outlier Identification and Treatment (CRISP-DM Step 3.4)
11.2.3.5 Impute Missing Values (CRISP-DM Step 3.5)
11.2.3.5.1 Reasons for Missing Values and Implications for the Data Analysis Process
11.2.3.5.2 Imputation Methods
11.2.3.5.3 Implementing Imputation Methods
11.2.3.6 Calculating New Features (CRISP-DM Step 3.6)
11.2.3.7 Identification of Important Features by Using Feature Selection (CRISP-DM Step 3.7)
11.2.4 Modeling (CRISP-DM Step 4)
11.2.4.1 Balancing (CRISP-DM Step 4.1)
11.2.4.2 Feature Scaling and Model Building (CRISP-DM Step 4.2)
11.2.5 Evaluation and Deployment of Model (CRISP-DM Step 5 and 6)
11.3 Lessons Learned
11.4 Exercises
11.5 Solutions
References
12: Appendix
12.1 Data Sets Used in This Book
12.1.1 adult_income_data.txt
12.1.2 bank_full.csv
12.1.3 beer.sav
12.1.4 benchmark.xlsx
12.1.5 car_simple.sav
12.1.6 car_sales_modified.sav
12.1.7 chess_endgame_data.txt
12.1.8 credit_card_sampling_data.sav
12.1.9 customer_bank_data.csv
12.1.10 diabetes_data_reduced.sav
12.1.11 DRUG1n.csv
12.1.12 EEG_Sleep_Signals.csv
12.1.13 employee_dataset_001 and employee_dataset_002
12.1.14 England Payment Datasets
12.1.15 Features_eeg_signals.csv
12.1.16 gene_expression_leukemia_all.csv
12.1.17 gene_expression_leukemia_short.csv
12.1.18 gravity_constant_data.csv
12.1.19 hacide_train.SAV and hacide_test.SAV
12.1.20 Housing.data.txt
12.1.21 income_vs_purchase.sav
12.1.22 Iris.csv
12.1.23 IT-projects.txt
12.1.24 IT user satisfaction.sav
12.1.25 longley.csv
12.1.26 LPGA2009.csv
12.1.27 Mtcars.csv
12.1.28 nutrition_habites.sav
12.1.29 optdigits_training.txt, optdigits_test.txt
12.1.30 Orthodont.csv
12.1.31 Ozone.csv
12.1.32 pisa2012_math_q45.sav
12.1.33 sales_list.sav
12.1.34 secom.sav
12.1.35 ships.csv
12.1.36 test_scores.sav
12.1.37 Titanic.xlsx
12.1.38 tree_credit.sav
12.1.39 wine_data.txt
12.1.40 WisconsinBreastCancerData.csv and wisconsin_breast_cancer_data.sav
12.1.41 z_pm_customer1.sav
References