
Data Science and Predictive Analytics: Biomedical and Health Applications using R

Book details

Data Science and Predictive Analytics: Biomedical and Health Applications using R

Edition: 1
Authors:
Series:
ISBN: 9783319723464, 9783319723471
Publisher: Springer International Publishing
Year: 2018
Pages: 851
Language: English
File format: PDF (can be converted to PDF, EPUB, or AZW3 on user request)
File size: 18 MB

Price (toman): 44,000





If you need the file of Data Science and Predictive Analytics: Biomedical and Health Applications using R converted to PDF, EPUB, AZW3, MOBI, or DJVU format, you can notify support and they will convert the file.

Note that Data Science and Predictive Analytics: Biomedical and Health Applications using R is the original-language edition, not a Persian translation. The International Library website provides only original-language books and does not offer any books translated into or written in Persian.


About Data Science and Predictive Analytics: Biomedical and Health Applications using R

Over the past decade, Big Data have become ubiquitous in all economic sectors, scientific disciplines, and human activities. They have led to striking technological advances, affecting all human experiences. Our ability to manage, understand, interrogate, and interpret such extremely large, multisource, heterogeneous, incomplete, multiscale, and incongruent data has not kept pace with the rapid increase of the volume, complexity and proliferation of the deluge of digital information. There are three reasons for this shortfall. First, the volume of data is increasing much faster than the corresponding rise of our computational processing power (Kryder’s law > Moore’s law). Second, traditional discipline-bounds inhibit expeditious progress. Third, our education and training activities have fallen behind the accelerated trend of scientific, information, and communication advances. There are very few rigorous instructional resources, interactive learning materials, and dynamic training environments that support active data science learning. The textbook balances the mathematical foundations with dexterous demonstrations and examples of data, tools, modules and workflows that serve as pillars for the urgently needed bridge to close that supply and demand predictive analytic skills gap.

Exposing the enormous opportunities presented by the tsunami of Big data, this textbook aims to identify specific knowledge gaps, educational barriers, and workforce readiness deficiencies. Specifically, it focuses on the development of a transdisciplinary curriculum integrating modern computational methods, advanced data science techniques, innovative biomedical applications, and impactful health analytics.

The content of this graduate-level textbook fills a substantial gap in integrating modern engineering concepts, computational algorithms, mathematical optimization, statistical computing and biomedical inference. Big data analytic techniques and predictive scientific methods demand broad transdisciplinary knowledge, appeal to an extremely wide spectrum of readers/learners, and provide incredible opportunities for engagement throughout the academy, industry, regulatory and funding agencies.

The two examples below demonstrate the powerful need for scientific knowledge, computational abilities, interdisciplinary expertise, and modern technologies necessary to achieve desired outcomes (improving human health and optimizing future return on investment). This can only be achieved by appropriately trained teams of researchers who can develop robust decision support systems using modern techniques and effective end-to-end protocols, like the ones described in this textbook.

• A geriatric neurologist is examining a patient complaining of gait imbalance and posture instability. To determine if the patient may suffer from Parkinson’s disease, the physician acquires clinical, cognitive, phenotypic, imaging, and genetics data (Big Data). Most clinics and healthcare centers are not equipped with skilled data analytic teams that can wrangle, harmonize and interpret such complex datasets. A learner who completes a course of study using this textbook will have the competency and ability to manage the data, generate a protocol for deriving biomarkers, and provide an actionable decision support system. The results of this protocol will help the physician understand the entire patient dataset and assist in making a holistic evidence-based, data-driven, clinical diagnosis.

• To improve the return on investment for their shareholders, a healthcare manufacturer needs to forecast the demand for their product subject to environmental, demographic, economic, and bio-social sentiment data (Big Data). The organization’s data-analytics team is tasked with developing a protocol that identifies, aggregates, harmonizes, models and analyzes these heterogeneous data elements to generate a trend forecast. This system needs to provide an automated, adaptive, scalable, and reliable prediction of the optimal investment, e.g., R&D allocation, that maximizes the company’s bottom line. A reader who completes a course of study using this textbook will be able to ingest the observed structured and unstructured data, mathematically represent the data as a computable object, and apply appropriate model-based and model-free prediction techniques. The results of these techniques may be used to forecast the expected relation between the company’s investment, product supply, and general demand for healthcare (providers and patients), and to estimate the return on the initial investment.
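The model-based versus model-free distinction mentioned above can be sketched in a few lines. This is a minimal, illustrative toy example with hypothetical data and function names (not taken from the textbook, which develops such comparisons in R on real biomedical datasets): a closed-form linear regression stands in for model-based prediction, and a k-nearest-neighbors average for model-free prediction.

```python
# Toy contrast between model-based and model-free prediction.
# Hypothetical data; illustrative sketch only.

def ols_fit(xs, ys):
    """Model-based: closed-form simple linear regression (OLS).
    Assumes a global linear relationship y = alpha + beta*x and
    returns a prediction function derived from that fitted model."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    beta = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))
    alpha = my - beta * mx
    return lambda x: alpha + beta * x

def knn_predict(xs, ys, x, k=3):
    """Model-free: no assumed functional form; average the outcomes
    of the k observations nearest to the query point."""
    nearest = sorted(range(len(xs)), key=lambda i: abs(xs[i] - x))[:k]
    return sum(ys[i] for i in nearest) / k

# Hypothetical demand observations (predictor -> observed demand).
xs = [1, 2, 3, 4, 5, 6]
ys = [2.1, 4.0, 6.2, 7.9, 10.1, 12.0]

model = ols_fit(xs, ys)
print(model(7))                # model-based forecast beyond the data
print(knn_predict(xs, ys, 7))  # model-free forecast, bounded by observed ys
```

Note the design trade-off the sketch exposes: the model-based forecast extrapolates the fitted trend beyond the observed range, while the model-free forecast cannot exceed the outcomes it has already seen.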




Table of Contents

Foreword......Page 6
Genesis......Page 10
Limitations/Prerequisites......Page 11
Scope of the Book......Page 12
Acknowledgements......Page 13
DSPA Application and Use Disclaimer......Page 14
Biomedical, Biosocial, Environmental, and Health Disclaimer......Page 15
Notations......Page 16
Contents......Page 17
1.1 DSPA Mission and Objectives......Page 33
1.2.2 Parkinson's Disease......Page 34
1.2.3 Drug and Substance Use......Page 35
1.2.6 Neurodegeneration......Page 36
1.2.7 Genetic Forensics: 2013-2016 Ebola Outbreak......Page 37
1.2.8 Next Generation Sequence (NGS) Analysis......Page 38
1.2.9 Neuroimaging-Genetics......Page 39
1.3 Common Characteristics of Big (Biomedical and Health) Data......Page 40
1.5 Predictive Analytics......Page 41
1.7 Examples of Data Repositories, Archives, and Services......Page 42
1.8 DSPA Expectations......Page 43
2.1 Why Use R?......Page 45
2.2.3 RStudio GUI Layout......Page 47
2.3 Help......Page 48
2.4 Simple Wide-to-Long Data Format Translation......Page 49
2.5 Data Generation......Page 50
2.6 Input/Output (I/O)......Page 54
2.7 Slicing and Extracting Data......Page 56
2.9 Variable Information......Page 57
2.10 Data Selection and Manipulation......Page 59
2.11 Math Functions......Page 62
2.13 Advanced Data Processing......Page 64
2.14 Strings......Page 69
2.15 Plotting......Page 71
2.16 QQ Normal Probability Plot......Page 73
2.18 Graphics Parameters......Page 77
2.19 Optimization and Model Fitting......Page 79
2.20 Statistics......Page 80
2.21.1 Programming......Page 81
2.22 Data Simulation Primer......Page 82
2.23.1 HTML SOCR Data Import......Page 88
2.23.2 R Debugging......Page 89
2.24.1 Confirm that You Have Installed R/RStudio......Page 92
2.24.5 Simulation......Page 93
References......Page 94
3.1 Saving and Loading R Data Structures......Page 95
3.2 Importing and Saving Data from CSV Files......Page 96
3.4 Exploring Numeric Variables......Page 98
3.5 Measuring the Central Tendency: Mean, Median, Mode......Page 99
3.6 Measuring Spread: Quartiles and the Five-Number Summary......Page 100
3.7 Visualizing Numeric Variables: Boxplots......Page 102
3.8 Visualizing Numeric Variables: Histograms......Page 103
3.9 Understanding Numeric Data: Uniform and Normal Distributions......Page 104
3.10 Measuring Spread: Variance and Standard Deviation......Page 105
3.11 Exploring Categorical Variables......Page 108
3.12 Exploring Relationships Between Variables......Page 109
3.13 Missing Data......Page 111
3.13.1 Simulate Some Real Multivariate Data......Page 116
3.13.2 TBI Data Example......Page 130
General Idea of EM Algorithm......Page 154
EM-Based Imputation......Page 155
A Simple Manual Implementation of EM-Based Imputation......Page 156
Plotting Complete and Imputed Data......Page 159
Comparison......Page 160
3.14 Parsing Webpages and Visualizing Tabular HTML Data......Page 162
3.15 Cohort-Rebalancing (for Imbalanced Groups)......Page 167
3.16.1 Importing Data from SQL Databases......Page 170
3.16.2 R Code Fragments......Page 171
3.17.2 Explore some Bivariate Relations in the Data......Page 172
References......Page 173
4.1 Common Questions......Page 174
4.3.1 Histograms and Density Plots......Page 175
4.3.2 Pie Chart......Page 178
4.3.3 Heat Map......Page 180
4.4.1 Paired Scatter Plots......Page 183
4.4.2 Jitter Plot......Page 188
4.4.3 Bar Plots......Page 190
4.4.4 Trees and Graphs......Page 195
4.4.5 Correlation Plots......Page 198
4.5.1 Line Plots Using ggplot......Page 202
4.5.3 Distributions......Page 204
4.5.4 2D Kernel Density and 3D Surface Plots......Page 205
4.5.5 Multiple 2D Image Surface Plots......Page 207
4.5.6 3D and 4D Visualizations......Page 209
4.6.1 Hands-on Activity (Health Behavior Risks)......Page 214
Housing Price Data......Page 218
Modeling the Home Price Index Data (Fig. 4.48)......Page 220
Map of the Neighborhoods of Los Angeles (LA)......Page 222
Latin Letter Frequency in Different Languages......Page 224
4.7.2 Trees and Graphs......Page 229
References......Page 230
Chapter 5: Linear Algebra and Matrix Computing......Page 231
5.1.1 Create Matrices......Page 232
5.1.2 Adding Columns and Rows......Page 233
5.3.1 Addition......Page 234
Matrix Multiplication......Page 235
5.3.6 Multiplicative Inverse......Page 237
5.4.1 Linear Models......Page 239
5.4.2 Solving Systems of Equations......Page 240
5.4.3 The Identity Matrix......Page 242
5.5 Scalars, Vectors and Matrices......Page 243
Mean......Page 245
Applications of Matrix Algebra: Linear Modeling......Page 246
Finding Function Extrema (Min/Max) Using Calculus......Page 247
5.5.2 Least Square Estimation......Page 248
5.6 Eigenvalues and Eigenvectors......Page 249
5.8 Matrix Notation (Another View)......Page 250
5.9 Multivariate Linear Regression......Page 254
5.10 Sample Covariance Matrix......Page 257
5.11.3 Matrix Equations......Page 259
5.11.8 Least Square Estimation......Page 260
References......Page 261
6.1 Example: Reducing 2D to 1D......Page 262
6.2 Matrix Rotations......Page 266
6.4 Summary (PCA vs. ICA vs. FA)......Page 271
6.5.1 Principal Components......Page 272
6.6 Independent Component Analysis (ICA)......Page 279
6.7 Factor Analysis (FA)......Page 283
6.8 Singular Value Decomposition (SVD)......Page 285
6.10 Case Study for Dimension Reduction (Parkinson's Disease)......Page 287
6.11.1 Parkinson's Disease Example......Page 294
References......Page 295
Chapter 7: Lazy Learning: Classification Using Nearest Neighbors......Page 296
7.1 Motivation......Page 297
7.2.1 Distance Function and Dummy Coding......Page 298
7.2.3 Rescaling of the Features......Page 299
7.3.1 Step 1: Collecting Data......Page 300
7.3.2 Step 2: Exploring and Preparing the Data......Page 301
7.3.3 Normalizing Data......Page 302
7.3.6 Step 4: Evaluating Model Performance......Page 303
7.3.7 Step 5: Improving Model Performance......Page 304
7.3.8 Testing Alternative Values of k......Page 305
7.3.9 Quantitative Assessment (Tables 7.2 and 7.3)......Page 311
7.4.2 Parkinson's Disease......Page 315
References......Page 316
8.1 Overview of the Naive Bayes Algorithm......Page 317
8.3 Bayes Formula......Page 318
8.4 The Laplace Estimator......Page 320
8.5.2 Step 2: Exploring and Preparing the Data......Page 321
Data Preparation: Processing Text Data for Analysis......Page 322
Data Preparation: Creating Training and Test Datasets......Page 323
Visualizing Text Data: Word Clouds......Page 325
Data Preparation: Creating Indicator Features for Frequent Words......Page 326
8.5.3 Step 3: Training a Model on the Data......Page 327
8.5.4 Step 4: Evaluating Model Performance......Page 328
8.5.5 Step 5: Improving Model Performance......Page 329
8.5.6 Step 6: Compare Naive Bayesian against LDA......Page 330
8.6 Practice Problem......Page 331
8.7.1 Explain These Two Concepts......Page 332
References......Page 333
9.1 Motivation......Page 334
9.2 Hands-on Example: Iris Data......Page 335
9.3 Decision Tree Overview......Page 337
9.3.1 Divide and Conquer......Page 338
9.3.2 Entropy......Page 339
9.3.4 C5.0 Decision Tree Algorithm......Page 340
9.3.5 Pruning the Decision Tree......Page 342
9.4.2 Step 2: Exploring and Preparing the Data......Page 343
Data Preparation: Creating Random Training and Test Datasets......Page 345
9.4.3 Step 3: Training a Model On the Data......Page 346
9.4.4 Step 4: Evaluating Model Performance......Page 349
9.4.5 Step 5: Trial Option......Page 350
9.4.6 Loading the Misclassification Error Matrix......Page 351
9.4.7 Parameter Tuning......Page 352
9.6.1 Separate and Conquer......Page 358
9.7.1 Step 3: Training a Model on the Data......Page 359
9.7.2 Step 4: Evaluating Model Performance......Page 360
9.7.4 Step 5: Alternative Model2......Page 361
9.8 Practice Problem......Page 364
9.9.2 Decision Tree Partitioning......Page 369
References......Page 370
10.1.1 Simple Linear Regression......Page 371
10.2 Ordinary Least Squares Estimation......Page 373
10.2.2 Correlations......Page 375
10.2.3 Multiple Linear Regression......Page 376
10.3.2 Step 2: Exploring and Preparing the Data......Page 378
10.3.4 Visualizing Relationships Among Features: The Scatterplot Matrix......Page 382
10.3.5 Step 3: Training a Model on the Data......Page 384
10.3.6 Step 4: Evaluating Model Performance......Page 385
10.4 Step 5: Improving Model Performance......Page 387
10.4.1 Model Specification: Adding Non-linear Relationships......Page 395
10.4.2 Transformation: Converting a Numeric Variable to a Binary Indicator......Page 396
10.4.3 Model Specification: Adding Interaction Effects......Page 397
10.5.1 Adding Regression to Trees......Page 399
10.6.1 Step 2: Exploring and Preparing the Data......Page 400
10.6.3 Visualizing Decision Trees......Page 401
10.6.4 Step 4: Evaluating Model Performance......Page 403
10.6.6 Step 5: Improving Model Performance......Page 404
10.7 Practice Problem: Heart Attack Data......Page 406
References......Page 407
11.1.1 From Biological to Artificial Neurons......Page 408
11.1.2 Activation Functions......Page 409
11.1.5 The Number of Nodes in Each Layer......Page 411
11.1.6 Training Neural Networks with Backpropagation......Page 412
Variables......Page 413
11.2.2 Step 2: Exploring and Preparing the Data......Page 414
11.2.3 Step 3: Training a Model on the Data......Page 416
11.2.4 Step 4: Evaluating Model Performance......Page 417
11.2.5 Step 5: Improving Model Performance......Page 418
11.3 Simple NN Demo: Learning to Compute......Page 419
11.4 Case Study 2: Google Trends and the Stock Market - Classification......Page 421
11.5 Support Vector Machines (SVM)......Page 423
Linearly Separable Data......Page 424
Non-linearly Separable Data......Page 427
11.6 Case Study 3: Optical Character Recognition (OCR)......Page 428
11.6.1 Step 1: Prepare and Explore the Data......Page 429
11.6.2 Step 2: Training an SVM Model......Page 430
11.6.3 Step 3: Evaluating Model Performance......Page 431
11.6.4 Step 4: Improving Model Performance......Page 433
11.7.2 Step 2: Exploring and Preparing the Data......Page 434
11.7.3 Step 3: Training a Model on the Data......Page 436
11.7.4 Step 4: Evaluating Model Performance......Page 437
11.7.6 Parameter Tuning......Page 438
11.7.7 Improving the Performance of Gaussian Kernels......Page 440
11.8.2 Problem 2: Quality of Life and Chronic Disease......Page 441
11.9 Appendix......Page 445
11.10.2 Pediatric Schizophrenia Study......Page 446
References......Page 447
12.1 Association Rules......Page 448
12.3 Measuring Rule Importance by Using Support and Confidence......Page 449
12.4 Building a Set of Rules with the Apriori Principle......Page 450
12.5 A Toy Example......Page 451
12.6.2 Step 2: Exploring and Preparing the Data......Page 452
Visualizing Item Support: Item Frequency Plots......Page 454
Visualizing Transaction Data: Plotting the Sparse Matrix......Page 455
12.6.3 Step 3: Training a Model on the Data......Page 457
12.6.4 Step 4: Evaluating Model Performance......Page 458
Sorting the Set of Association Rules......Page 460
Taking Subsets of Association Rules......Page 461
12.7 Practice Problems: Groceries......Page 463
12.8 Summary......Page 466
References......Page 467
13.1 Clustering as a Machine Learning Task......Page 468
13.2 Silhouette Plots......Page 471
13.3.1 Using Distance to Assign and Update Clusters......Page 472
13.4.1 Step 1: Collecting Data......Page 473
13.4.2 Step 2: Exploring and Preparing the Data......Page 474
13.4.3 Step 3: Training a Model on the Data......Page 475
13.4.4 Step 4: Evaluating Model Performance......Page 476
13.4.5 Step 5: Usage of Cluster Information......Page 479
13.5 Model Improvement......Page 480
13.5.1 Tuning the Parameter k......Page 482
13.6.1 Step 1: Collecting Data......Page 484
13.6.2 Step 2: Exploring and Preparing the Data......Page 485
13.6.3 Step 3: Training a Model on the Data......Page 486
13.6.4 Step 4: Evaluating Model Performance......Page 487
13.6.5 Practice Problem: Youth Development......Page 490
13.7 Hierarchical Clustering......Page 492
13.8 Gaussian Mixture Models......Page 495
13.10 Assignments: 13. k-Means Clustering......Page 497
References......Page 498
14.1 Measuring the Performance of Classification Methods......Page 499
14.2.1 Binary Outcomes......Page 501
14.2.2 Confusion Matrices......Page 502
14.2.3 Other Measures of Performance Beyond Accuracy......Page 504
14.2.4 The Kappa (κ) Statistic......Page 505
14.2.5 Computation of Observed Accuracy and Expected Accuracy......Page 508
14.2.6 Sensitivity and Specificity......Page 509
14.2.7 Precision and Recall......Page 510
14.2.8 The F-Measure......Page 511
14.3 Visualizing Performance Tradeoffs (ROC Curve)......Page 512
14.4.1 The Holdout Method......Page 515
14.4.2 Cross-Validation......Page 516
14.4.3 Bootstrap Sampling......Page 518
14.5 Assignment: 14. Evaluation of Model Performance......Page 519
References......Page 520
15.2 Using caret for Automated Parameter Tuning......Page 521
15.2.1 Customizing the Tuning Process......Page 525
15.2.2 Improving Model Performance with Meta-learning......Page 526
15.2.3 Bagging......Page 527
15.2.4 Boosting......Page 529
Training Random Forests......Page 530
Evaluating Random Forest Performance......Page 531
15.2.6 Adaptive Boosting......Page 532
15.3 Assignment: 15. Improving Model Performance......Page 534
References......Page 535
16.1 Working with Specialized Data and Databases......Page 536
16.1.1 Data Format Conversion......Page 537
16.1.2 Querying Data in SQL Databases......Page 538
16.1.3 Real Random Number Generation......Page 544
16.1.4 Downloading the Complete Text of Web Pages......Page 545
16.1.5 Reading and Writing XML with the XML Package......Page 546
16.1.6 Web-Page Data Scraping......Page 547
16.1.7 Parsing JSON from Web APIs......Page 548
16.1.8 Reading and Writing Microsoft Excel Spreadsheets Using XLSX......Page 549
16.2.1 Working with Bioinformatics Data......Page 550
16.2.2 Visualizing Network Data......Page 551
16.3.1 Definition......Page 556
k-Means Clustering......Page 557
Concept Drift Streams......Page 559
16.3.5 Printing, Plotting and Saving Streams......Page 560
16.3.6 Stream Animation......Page 561
16.3.7 Case-Study: SOCR Knee Pain Data......Page 563
16.3.8 Data Stream Clustering and Classification (DSC)......Page 565
16.3.9 Evaluation of Data Stream Clustering......Page 568
16.4 Optimization and Improving the Computational Performance......Page 569
16.4.1 Generalizing Tabular Data Structures with dplyr......Page 570
16.4.3 Creating Disk-Based Data Frames with ff......Page 571
16.5 Parallel Computing......Page 572
16.5.2 Parallel Processing with Multiple Cores......Page 573
16.5.3 Parallelization Using foreach and doParallel......Page 575
16.6.2 Growing Bigger and Faster Random Forests with bigrf......Page 576
16.7 Practice Problem......Page 577
16.8.3 Data Conversion and Parallel Computing......Page 578
References......Page 579
17.1.1 Filtering Techniques......Page 580
17.1.3 Embedded Techniques......Page 581
17.2.2 Step 2: Exploring and Preparing the Data......Page 582
17.2.3 Step 3: Training a Model on the Data......Page 583
Comparing with RFE......Page 587
Comparing with Stepwise Feature Selection......Page 589
17.3 Practice Problem......Page 592
17.4.2 Use the PPMI Dataset......Page 594
References......Page 595
Chapter 18: Regularized Linear Modeling and Controlled Variable Selection......Page 596
18.3 Regularized Linear Modeling......Page 597
18.3.1 Ridge Regression......Page 599
18.3.2 Least Absolute Shrinkage and Selection Operator (LASSO) Regression......Page 602
18.4 Linear Regression......Page 605
18.4.3 Estimating the Prediction Error......Page 606
18.4.4 Improving the Prediction Accuracy......Page 607
18.4.5 Variable Selection......Page 608
18.5.2 Role of the Regularization Parameter......Page 609
18.5.4 General Regularization Framework......Page 610
18.6.1 Example: Neuroimaging-Genetics Study of Parkinson's Disease Dataset......Page 611
18.6.3 LASSO and Ridge Solution Paths......Page 613
18.6.4 Choice of the Regularization Parameter......Page 621
18.6.6 n-Fold Cross Validation......Page 622
18.6.7 LASSO 10-Fold Cross Validation......Page 623
18.6.8 Stepwise OLS (Ordinary Least Squares)......Page 624
18.6.9 Final Models......Page 625
18.6.11 Comparing Selected Features......Page 627
18.7 Knock-off Filtering: Simulated Example......Page 628
18.7.1 Notes......Page 630
18.8.1 Fetching, Cleaning and Preparing the Data......Page 631
18.8.2 Preparing the Response Vector......Page 632
18.8.3 False Discovery Rate (FDR)......Page 640
Graphical Interpretation of the Benjamini-Hochberg (BH) Method......Page 641
FDR Adjusting the p-Values......Page 642
18.8.4 Running the Knockoff Filter......Page 643
18.9 Assignment: 18. Regularized Linear Modeling and Knockoff Filtering......Page 644
References......Page 645
19.1 Time Series Analysis......Page 646
19.1.1 Step 1: Plot Time Series......Page 649
19.1.2 Step 2: Find Proper Parameter Values for ARIMA Model......Page 651
19.1.3 Check the Differencing Parameter......Page 652
19.1.4 Identifying the AR and MA Parameters......Page 653
19.1.5 Step 3: Build an ARIMA Model......Page 655
19.1.6 Step 4: Forecasting with ARIMA Model......Page 660
19.2.1 Foundations of SEM......Page 661
19.2.2 SEM Components......Page 664
Step 2 - Exploring and Preparing the Data......Page 665
Step 3 - Fitting a Model on the Data......Page 668
19.2.4 Outputs of Lavaan SEM......Page 670
19.3.1 Mean Trend......Page 671
19.3.2 Modeling the Correlation......Page 675
19.4 GLMM/GEE Longitudinal Data Analysis......Page 676
19.4.1 GEE Versus GLMM......Page 678
19.5.1 Imaging Data......Page 680
References......Page 681
Chapter 20: Natural Language Processing/Text Mining......Page 682
20.1 A Simple NLP/TM Example......Page 683
20.1.1 Define and Load the Unstructured-Text Documents......Page 684
20.1.2 Create a New VCorpus Object......Page 686
Remove Stopwords......Page 687
Stemming: Removal of Plurals and Action Suffixes......Page 688
20.1.5 Bags of Words......Page 689
20.1.6 Document Term Matrix......Page 690
20.2 Case-Study: Job Ranking......Page 692
20.2.3 Step 3: Build the Document Term Matrix......Page 693
20.2.4 Area Under the ROC Curve......Page 697
20.3.2 Inverse Document Frequency (IDF)......Page 699
20.3.3 TF-IDF......Page 700
20.4 Cosine Similarity......Page 708
20.5.1 Data Preprocessing......Page 709
20.5.2 NLP/TM Analytics......Page 712
20.5.3 Prediction Optimization......Page 715
20.6.1 Mining Twitter Data......Page 717
References......Page 718
21.1 Forecasting Types and Assessment Approaches......Page 719
21.2.2 Example (Google Flu Trends)......Page 720
21.2.3 Example (Autism)......Page 722
21.3 Internal Statistical Cross-Validation is an Iterative Process......Page 723
21.4 Example (Linear Regression)......Page 724
21.4.2 Exhaustive Cross-Validation......Page 725
21.5 Case-Studies......Page 726
21.5.1 Example 1: Prediction of Parkinson's Disease Using Adaptive Boosting (AdaBoost)......Page 727
21.5.2 Example 2: Sleep Dataset......Page 730
21.5.3 Example 3: Model-Based (Linear Regression) Prediction Using the Attitude Dataset......Page 732
21.5.4 Example 4: Parkinson's Data (ppmi_data)......Page 733
21.7 Alternative Predictor Functions......Page 734
21.7.1 Logistic Regression......Page 735
21.7.2 Quadratic Discriminant Analysis (QDA)......Page 736
21.7.3 Foundation of LDA and QDA for Prediction, Dimensionality Reduction, and Forecasting......Page 737
QDA (Quadratic Discriminant Analysis)......Page 738
21.7.4 Neural Networks......Page 739
21.7.5 SVM......Page 740
21.7.6 k-Nearest Neighbors Algorithm (k-NN)......Page 741
21.7.7 k-Means Clustering (k-MC)......Page 742
Iris Petal Data......Page 749
Spirals Data......Page 750
Income Data......Page 751
21.8 Compare the Results......Page 752
21.9 Assignment: 21. Prediction and Internal Statistical Cross-Validation......Page 755
References......Page 756
22.1 Free (Unconstrained) Optimization......Page 757
22.1.1 Example 1: Minimizing a Univariate Function (Inverse-CDF)......Page 758
22.1.2 Example 2: Minimizing a Bivariate Function......Page 760
22.1.3 Example 3: Using Simulated Annealing to Find the Maximum of an Oscillatory Function......Page 761
22.2.2 Lagrange Multipliers......Page 762
Linear Programming (LP)......Page 763
Mixed Integer Linear Programming (MILP)......Page 768
22.2.4 Quadratic Programming (QP)......Page 769
22.3 General Non-linear Optimization......Page 770
Motivation......Page 771
Example 1: Linear Example......Page 772
Example 2: Quadratic Example......Page 773
Example 3: More Complex Non-linear Optimization......Page 774
22.4 Manual Versus Automated Lagrange Multiplier Optimization......Page 775
22.5 Data Denoising......Page 778
22.6.2 Linear Programming (LP)......Page 783
22.6.5 Complex Non-linear Optimization......Page 784
References......Page 785
Chapter 23: Deep Learning, Neural Networks......Page 786
23.1.1 Perceptrons......Page 787
23.2 Biological Relevance......Page 789
23.3.1 Exclusive OR (XOR) Operator......Page 791
23.3.2 NAND Operator......Page 792
23.3.3 Complex Networks Designed Using Simple Building Blocks......Page 793
23.4 Classification......Page 794
23.4.1 Sonar Data Example......Page 795
23.4.2 MXNet Notes......Page 802
23.5 Case-Studies......Page 803
23.5.1 ALS Regression Example......Page 804
23.5.2 Spirals 2D Data......Page 806
23.5.3 IBS Study......Page 810
23.5.4 Country QoL Ranking Data......Page 813
23.5.5 Handwritten Digits Classification......Page 816
Configuring the Neural Network......Page 820
Forecasting......Page 821
Examining the Network Structure Using LeNet......Page 825
23.6.2 Load, Preprocess and Classify New Images - US Weather Pattern......Page 827
23.6.3 Lake Mapourika, New Zealand......Page 831
23.6.4 Beach Image......Page 832
23.6.5 Volcano......Page 833
23.6.6 Brain Surface......Page 835
23.6.7 Face Mask......Page 836
23.7.1 Deep Learning Classification......Page 837
References......Page 838
Summary......Page 839
Glossary......Page 842
Index......Page 844



