Edition: 1
Authors: Peter C. Bruce, Peter Gedeck, Janet Dobbins
ISBN: 9781394253807
Publisher: Wiley
Year of publication: 2024
Number of pages: 366
Language: English
File format: PDF (converted to PDF, EPUB, or AZW3 on user request)
File size: 12 MB
If you need the file for Statistics for Data Science and Analytics converted to PDF, EPUB, AZW3, MOBI, or DJVU, notify support and they will convert it for you.
Please note that Statistics for Data Science and Analytics is the original-language edition and has not been translated into Persian. The International Library website provides original-language books only and does not offer any books translated into or written in Persian.
Front Matter
  Title Page
  Copyright
  Contents
  About the Authors
  Acknowledgments
  About the Companion Website
  Introduction

Chapter 1
  1.1 Big Data: Predicting Pregnancy
  1.2 Phantom Protection from Vitamin E
  1.3 Statistician, Heal Thyself
  1.4 Identifying Terrorists in Airports
  1.5 Looking Ahead
  1.6 Big Data and Statisticians
    1.6.1 Data Scientists

Chapter 2
  2.1 Statistical Science
  2.2 Big Data
  2.3 Data Science
  2.4 Example: Hospital Errors
  2.5 Experiment
  2.6 Designing an Experiment
    2.6.1 A/B Tests; A Controlled Experiment for the Hospital Plans
    2.6.2 Randomizing
    2.6.3 Planning
    2.6.4 Bias
      2.6.4.1 Placebo
      2.6.4.2 Blinding
      2.6.4.3 Before‐after Pairing
  2.7 The Data
    2.7.1 Dataframe Format
  2.8 Variables and Their Flavors
    2.8.1 Numeric Variables
    2.8.2 Categorical Variables
    2.8.3 Binary Variables
    2.8.4 Text Data
    2.8.5 Random Variables
    2.8.6 Simplified Columnar Format
  2.9 Python: Data Structures and Operations
    2.9.1 Primary Data Types
    2.9.2 Comments
    2.9.3 Variables
    2.9.4 Operations on Data
      2.9.4.1 Converting Data Types
    2.9.5 Advanced Data Structures
      2.9.5.1 Classes and Objects
      2.9.5.2 Data Types and Their Declaration
  2.10 Are We Sure We Made a Difference?
  2.11 Is Chance Responsible? The Foundation of Hypothesis Testing
    2.11.1 Looking at Just One Hospital
  2.12 Probability
    2.12.1 Interpreting Our Result
  2.13 Significance or Alpha Level
    2.13.1 Increasing the Sample Size
    2.13.2 Simulating Probabilities with Random Numbers
  2.14 Other Kinds of Studies
  2.15 When to Use Hypothesis Tests
  2.16 Experiments Falling Short of the Gold Standard
  2.17 Summary
  2.18 Python: Iterations and Conditional Execution
    2.18.1 if Statements
    2.18.2 for Statements
    2.18.3 while Statements
    2.18.4 break and continue Statements
    2.18.5 Example: Calculate Mean, Standard Deviation, Subsetting
    2.18.6 List Comprehensions
  2.19 Python: Numpy, scipy, and pandas—The Workhorses of Data Science
    2.19.1 Numpy
    2.19.2 Scipy
    2.19.3 Pandas
      2.19.3.1 Reading and Writing Data
      2.19.3.2 Accessing Data
      2.19.3.3 Manipulating Data
      2.19.3.4 Iterating Over a DataFrame
      2.19.3.5 And a Lot More
  Exercises

Chapter 3
  3.1 Exploratory Data Analysis
  3.2 What to Measure—Central Location
    3.2.1 Mean
    3.2.2 Median
    3.2.3 Mode
    3.2.4 Expected Value
    3.2.5 Proportions for Binary Data
      3.2.5.1 Percents
  3.3 What to Measure—Variability
    3.3.1 Range
    3.3.2 Percentiles
    3.3.3 Interquartile Range
    3.3.4 Deviations and Residuals
    3.3.5 Mean Absolute Deviation
    3.3.6 Variance and Standard Deviation
      3.3.6.1 Denominator of N or N–1?
    3.3.7 Population Variance
    3.3.8 Degrees of Freedom
  3.4 What to Measure—Distance (Nearness)
  3.5 Test Statistic
    3.5.1 Test Statistic for this Study
  3.6 Examining and Displaying the Data
    3.6.1 Frequency Tables
    3.6.2 Histograms
    3.6.3 Bar Chart
    3.6.4 Box Plots
    3.6.5 Tails and Skew
    3.6.6 Errors and Outliers Are Not the Same Thing!
  3.7 Python: Exploratory Data Analysis/Data Visualization
    3.7.1 Matplotlib
    3.7.2 Data Visualization Using Pandas and Seaborn
  Exercises

Chapter 4
  4.1 Avoid Being Fooled by Chance
  4.2 The Null Hypothesis
  4.3 Repeating the Experiment
    4.3.1 Shuffling and Picking Numbers from a Hat or Box
    4.3.2 How Many Reshuffles?
    4.3.3 The t‐Test
    4.3.4 Conclusion
  4.4 Statistical Significance
    4.4.1 Bottom Line
      4.4.1.1 Statistical Significance as a Screening Device
    4.4.2 Torturing the Data
    4.4.3 Practical Significance
  4.5 Power
  4.6 The Normal Distribution
    4.6.1 The Exact Test
  4.7 Summary
  4.8 Python: Random Numbers
    4.8.1 Generating Random Numbers Using the random Package
    4.8.2 Random Numbers in numpy and scipy
    4.8.3 Using Random Numbers in Other Packages
    4.8.4 Example: Implement a Resampling Experiment
    4.8.5 Write Functions for Code Reuse
    4.8.6 Organizing Code into Modules
  Exercises

Chapter 5
  5.1 What Is Probability
  5.2 Simple Probability
    5.2.1 Venn Diagrams
  5.3 Probability Distributions
    5.3.1 Binomial Distribution
      5.3.1.1 Example
  5.4 From Binomial to Normal Distribution
    5.4.1 Standardization (Normalization)
    5.4.2 Standard Normal Distribution
      5.4.2.1 z‐Tables
    5.4.3 The 95 Percent Rule
  5.5 Appendix: Binomial Formula and Normal Approximation
    5.5.1 Normal Approximation
  5.6 Python: Probability
    5.6.1 Converting Counts to Probabilities
    5.6.2 Probability Distributions in Python
    5.6.3 Probability Distributions in random
    5.6.4 Probability Distributions in the scipy Package
      5.6.4.1 Continuous Distributions
      5.6.4.2 Discrete Distributions
  Exercises

Chapter 6
  6.1 Two‐way Tables
  6.2 Conditional Probability
    6.2.1 From Numbers to Percentages to Conditional Probabilities
  6.3 Bayesian Estimates
    6.3.1 Let's Review the Different Probabilities
    6.3.2 Bayesian Calculations
  6.4 Independence
    6.4.1 Chi‐square Test
      6.4.1.1 Sensor Calibration
      6.4.1.2 Standardizing Departure from Expected
  6.5 Multiplication Rule
  6.6 Simpson's Paradox
  6.7 Python: Counting and Contingency Tables
    6.7.1 Counting in Python
    6.7.2 Counting in Pandas
    6.7.3 Two‐way Tables Using Pandas
    6.7.4 Chi‐square Test
  Exercises

Chapter 7
  7.1 Literary Digest—Sampling Trumps “All Data”
  7.2 Simple Random Samples
  7.3 Margin of Error: Sampling Distribution for a Proportion
    7.3.1 The Confidence Interval
    7.3.2 A More Manageable Box: Sampling with Replacement
    7.3.3 Summing Up
  7.4 Sampling Distribution for a Mean
    7.4.1 Simulating the Behavior of Samples from a Hypothetical Population
  7.5 The Bootstrap
    7.5.1 Resampling Procedure (Bootstrap)
  7.6 Rationale for the Bootstrap
    7.6.1 Let's Recap
    7.6.2 Formula‐based Counterparts to Resampling
      7.6.2.1 FORMULA: The Z‐interval
      7.6.2.2 Proportions
    7.6.3 For a Mean: T‐interval
    7.6.4 Example—Manual Calculations
    7.6.5 Example—Software
    7.6.6 A Bit of History—1906 at Guinness Brewery
    7.6.7 The Bootstrap Today
    7.6.8 Central Limit Theorem
  7.7 Standard Error
    7.7.1 Standard Error via Formula
  7.8 Other Sampling Methods
    7.8.1 Stratified Sampling
    7.8.2 Cluster Sampling
    7.8.3 Systematic Sampling
    7.8.4 Multistage Sampling
    7.8.5 Convenience Sampling
    7.8.6 Self‐selection
    7.8.7 Nonresponse Bias
  7.9 Absolute vs. Relative Sample Size
  7.10 Python: Random Sampling Strategies
    7.10.1 Implement Simple Random Sample (SRS)
    7.10.2 Determining Confidence Intervals
    7.10.3 Bootstrap Sampling to Determine Confidence Intervals for a Mean
    7.10.4 Advanced Sampling Techniques
      7.10.4.1 Stratified Sampling for Categorical Variables
      7.10.4.2 Stratified Sampling of Continuous Variables
  Exercises

Chapter 8
  8.1 Count Data—R × C Tables
  8.2 The Role of Experiments (Many Are Costly)
    8.2.1 Example: Marriage Therapy
  8.3 Chi‐Square Test
    8.3.1 Alternate Option
    8.3.2 Testing for the Role of Chance
    8.3.3 Standardization to the Chi‐Square Statistic
    8.3.4 Chi‐Square Example on the Computer
  8.4 Single Sample—Goodness‐of‐Fit
    8.4.1 Resampling Procedure
  8.5 Numeric Data: ANOVA
  8.6 Components of Variance
    8.6.1 From ANOVA to Regression
  8.7 Factorial Design
    8.7.1 Stratification and Blocking
    8.7.2 Blocking
  8.8 The Problem of Multiple Inference
  8.9 Continuous Testing
    8.9.1 Medicine
    8.9.2 Business
  8.10 Bandit Algorithms
    8.10.1 Web Testing
  8.11 Appendix: ANOVA, the Factor Diagram, and the F‐Statistic
    8.11.1 Decomposition: The Factor Diagram
    8.11.2 Constructing the ANOVA Table
    8.11.3 Inference Using the ANOVA Table
    8.11.4 The F‐Distribution
    8.11.5 Different Sized Groups
      8.11.5.1 Resampling Method
      8.11.5.2 Formula Method
    8.11.6 Caveats and Assumptions
  8.12 More than One Factor or Variable—From ANOVA to Statistical Models
  8.13 Python: Contingency Tables and Chi‐square Test
    8.13.1 Example: Marriage Therapy
    8.13.2 Example: Imanishi‐Kari Data
  8.14 Python: ANOVA
    8.14.1 Visual Comparison of Groups
    8.14.2 ANOVA Using Resampling Test
    8.14.3 ANOVA Using the F‐Statistic
  Exercises

Chapter 9
  9.1 Example: Delta Wire
  9.2 Example: Cotton Dust and Lung Disease
  9.3 The Vector Product Sum Test
    9.3.1 Example: Baseball Payroll
      9.3.1.1 Resampling Procedure
  9.4 Correlation Coefficient
    9.4.1 Inference for the Correlation Coefficient—Resampling
      9.4.1.1 Hypothesis Test—Resampling
      9.4.1.2 Example: Baseball Again
      9.4.1.3 Inference for the Correlation Coefficient: Formulas
  9.5 Correlation is not Causation
    9.5.1 A Lurking External Cause
    9.5.2 Coincidence
  9.6 Other Forms of Association
  9.7 Python: Correlation
    9.7.1 Vector Operations
    9.7.2 Resampling Test for Vector Product Sums
    9.7.3 Calculating Correlation Coefficient
    9.7.4 Calculate Correlation with numpy, pandas
    9.7.5 Hypothesis Tests for Correlation
    9.7.6 Using the t Statistic
    9.7.7 Visualizing Correlation
  Exercises

Chapter 10
  10.1 Finding the Regression Line by Eye
    10.1.1 Making Predictions Based on the Regression Line
  10.2 Finding the Regression Line by Minimizing Residuals
    10.2.1 The “Loss Function”
  10.3 Linear Relationships
    10.3.1 Example: Workplace Exposure and PEFR
    10.3.2 Residual Plots
      10.3.2.1 How to Read the Payroll Residual Plot
  10.4 Prediction vs. Explanation
    10.4.1 Research Studies: Regression for Explanation
    10.4.2 Assessing the Performance of Regression for Explanation
    10.4.3 Big Data: Regression for Prediction
    10.4.4 Assessing the Performance of Regression for Prediction
  10.5 Python: Linear Regression
    10.5.1 Linear Regression Using Statsmodels
    10.5.2 Using the Non‐formula Interface to statsmodels
    10.5.3 Linear Regression Using scikit‐learn
    10.5.4 Splitting Datasets and Evaluating Model Performance
  Exercises

Chapter 11
  11.1 Terminology
  11.2 Example—Housing Prices
    11.2.1 Explaining Home Prices
    11.2.2 House Prices in Boston
    11.2.3 Explore the Data
      11.2.3.1 Performing and Interpreting a Regression Analysis
    11.2.4 Using the Regression Equation
  11.3 Interaction
    11.3.1 Original Regression with No Interaction Term
    11.3.2 The Regression with an Interaction Term
    11.3.3 Does Crime Pay?
  11.4 Regression Assumptions
    11.4.1 Violation of Assumptions—Is the Model Useless?
  11.5 Assessing Explanatory Regression Models
    11.5.1 Overall Model Strength R²
    11.5.2 Assessing Individual Coefficients
    11.5.3 Resampling Procedure to Test Statistical Significance
    11.5.4 Resampling Procedure for a Confidence Interval (the Pulmonary Data)
      11.5.4.1 Interpretation
    11.5.5 Formula‐based Inference
    11.5.6 Interpreting Software Output
    11.5.7 More Practice: Bootstrapping the Boston Housing Model
    11.5.8 Inference for Regression—Hypothesis Tests
  11.6 Assessing Regression for Prediction
    11.6.1 Separate Training and Holdout Data
    11.6.2 Root Mean Squared Error—RMSE
    11.6.3 Tayko
    11.6.4 Binary and Categorical Variables in Regression
    11.6.5 Multicollinearity
    11.6.6 Tayko—Building the Model
    11.6.7 Reviewing the Output
    11.6.8 Scoring the Model to the Validation Partition
    11.6.9 The Naive Rule
  11.7 Python: Multiple Linear Regression
    11.7.1 Using Statsmodels
      11.7.1.1 Adding Interaction Terms
    11.7.2 Diagnostic Plots
    11.7.3 Using Scikit‐learn
      11.7.3.1 Adding Interaction Terms
    11.7.4 Resampling Procedures
      11.7.4.1 Estimating the Significance of the Coefficients
      11.7.4.2 Estimating Confidence Intervals—The Bootstrap
  Exercises

Chapter 12
  12.1 K‐Nearest‐Neighbors
    12.1.1 Predicting Which Customers Might be Pregnant
    12.1.2 Small Hypothetical Example
    12.1.3 Setting k
    12.1.4 K‐Nearest‐Neighbors and Numerical Outcomes
    12.1.5 Explanatory Modeling
  12.2 Python: Classification
    12.2.1 Classification Using scikit‐learn
    12.2.2 Evaluating the Model
    12.2.3 Streamlining Model Fitting Using Pipelines
  Exercises

Index