Authors: Benjamin S. Baumer, Daniel T. Kaplan, Nicholas J. Horton
ISBN: 9781498724487
Publisher: CRC
Year: 2017
Pages: 557
Language: English
File format: PDF (converted to PDF, EPUB, or AZW3 on request)
File size: 69 MB
If the author is Iranian, the book cannot be downloaded and the payment will be refunded.
If you need the file for Modern Data Science with R converted to PDF, EPUB, AZW3, MOBI, or DJVU, contact support and they will convert it for you.
Note that Modern Data Science with R is offered in its original language and is not a Persian translation. The International Library website provides original-language books only and does not offer any books translated into or written in Persian.
Modern Data Science with R is a comprehensive data science textbook for undergraduates that incorporates statistical and computational thinking to solve real-world problems with data. Rather than focus exclusively on case studies or programming syntax, this book illustrates how statistical programming in the state-of-the-art R/RStudio computing environment can be leveraged to extract meaningful information from a variety of data in the service of addressing compelling statistical questions. Contemporary data science requires a tight integration of knowledge from statistics, computer science, mathematics, and a domain of application. This book will help readers with some background in statistics and modest prior experience with coding develop and practice the appropriate skills to tackle complex data science projects. The book features a number of exercises and has a flexible organization conducive to teaching a variety of semester courses.
Contents
Tables
Figures
Preface

Intro to Data Science
1 Prologue - why data science?
  1.1 What is data science?
  1.2 Case study: The evolution of sabermetrics
  1.3 Datasets
  1.4 Further resources
2 Data visualization
  2.1 The 2012 federal election cycle
    2.1.1 Are these two groups different?
    2.1.2 Graphing variation
    2.1.3 Examining relationships among variables
    2.1.4 Networks
  2.2 Composing data graphics
    2.2.1 A taxonomy for data graphics
    2.2.2 Color
    2.2.3 Dissecting data graphics
  2.3 Importance of data graphics: Challenger
  2.4 Creating effective presentations
  2.5 The wider world of data visualization
  2.6 Further resources
  2.7 Exercises
3 Grammar for graphics
  3.1 A grammar for data graphics
    3.1.1 Aesthetics
    3.1.2 Scale
    3.1.3 Guides
    3.1.4 Facets
    3.1.5 Layers
  3.2 Canonical data graphics in R
    3.2.1 Univariate displays
    3.2.2 Multivariate displays
    3.2.3 Maps
    3.2.4 Networks
  3.3 Extended example: Historical baby names
    3.3.1 Percentage of people alive today
    3.3.2 Most common women's names
  3.4 Further resources
  3.5 Exercises
4 Data wrangling
  4.1 A grammar for data wrangling
    4.1.1 select() and filter()
    4.1.2 mutate() and rename()
    4.1.3 arrange()
    4.1.4 summarize() with group_by()
  4.2 Extended example: Ben's time with the Mets
  4.3 Combining multiple tables
    4.3.1 inner_join()
    4.3.2 left_join()
  4.4 Extended example: Manny Ramirez
  4.5 Further resources
  4.6 Exercises
5 Tidy data & iteration
  5.1 Tidy data
    5.1.1 Motivation
    5.1.2 What are tidy data?
  5.2 Reshaping data
    5.2.1 Data verbs for converting wide to narrow and vice versa
    5.2.2 Spreading
    5.2.3 Gathering
    5.2.4 Example: Gender-neutral names
  5.3 Naming conventions
  5.4 Automation and iteration
    5.4.1 Vectorized operations
    5.4.2 The apply() family of functions
    5.4.3 Iteration over subgroups with dplyr::do()
    5.4.4 Iteration with mosaic::do
  5.5 Data intake
    5.5.1 Data-table friendly formats
    5.5.2 APIs
    5.5.3 Cleaning data
    5.5.4 Example: Japanese nuclear reactors
  5.6 Further resources
  5.7 Exercises
6 Professional Ethics
  6.1 Introduction
  6.2 Truthful falsehoods
  6.3 Some settings for professional ethics
    6.3.1 The chief executive officer
    6.3.2 Employment discrimination
    6.3.3 Data scraping
    6.3.4 Reproducible spreadsheet analysis
    6.3.5 Drug dangers
    6.3.6 Legal negotiations
  6.4 Some principles to guide ethical action
    6.4.1 Applying the precepts
  6.5 Data and disclosure
    6.5.1 Reidentification and disclosure avoidance
    6.5.2 Safe data storage
    6.5.3 Data scraping and terms of use
  6.6 Reproducibility
    6.6.1 Example: Erroneous data merging
  6.7 Professional guidelines for ethical conduct
  6.8 Ethics, collectively
  6.9 Further resources
  6.10 Exercises

Statistics & Modeling
7 Statistical Foundations
  7.1 Samples and populations
  7.2 Sample statistics
  7.3 The bootstrap
  7.4 Outliers
  7.5 Statistical models: Explaining variation
  7.6 Confounding and accounting for other factors
  7.7 The perils of p-values
  7.8 Further resources
  7.9 Exercises
8 Statistical Learning & Predictive Analytics
  8.1 Supervised learning
  8.2 Classifiers
    8.2.1 Decision trees
    8.2.2 Example: High-earners in the 1994 United States Census
    8.2.3 Tuning parameters
    8.2.4 Random forests
    8.2.5 Nearest neighbor
    8.2.6 Naïve Bayes
    8.2.7 Artificial neural networks
  8.3 Ensemble methods
  8.4 Evaluating models
    8.4.1 Cross-validation
    8.4.2 Measuring prediction error
    8.4.3 Confusion matrix
    8.4.4 ROC curves
    8.4.5 Bias-variance trade-off
    8.4.6 Example: Evaluation of income models
  8.5 Extended example: Who has diabetes?
  8.6 Regularization
  8.7 Further resources
  8.8 Exercises
9 Unsupervised Learning
  9.1 Clustering
    9.1.1 Hierarchical clustering
    9.1.2 k-means
  9.2 Dimension reduction
    9.2.1 Intuitive approaches
    9.2.2 Singular value decomposition
  9.3 Further resources
  9.4 Exercises
10 Simulation
  10.1 Reasoning in reverse
  10.2 Extended example: Grouping cancers
  10.3 Randomizing functions
  10.4 Simulating variability
    10.4.1 The partially planned rendezvous
    10.4.2 The jobs report
    10.4.3 Restaurant health and sanitation grades
  10.5 Simulating a complex system
  10.6 Random networks
  10.7 Key principles of simulation
  10.8 Further resources
  10.9 Exercises

Topics in Data Science
11 Interactive data graphics
  11.1 Rich Web content using D3.js and htmlwidgets
    11.1.1 Leaflet
    11.1.2 Plot.ly
    11.1.3 DataTables
    11.1.4 dygraphs
    11.1.5 streamgraphs
  11.2 Dynamic visualization using ggvis
  11.3 Interactive Web apps with Shiny
  11.4 Further customization
  11.5 Extended example: Hot dog eating
  11.6 Further resources
  11.7 Exercises
12 Database querying using SQL
  12.1 From dplyr to SQL
  12.2 Flat-file databases
  12.3 The SQL universe
  12.4 The SQL data manipulation language
    12.4.1 SELECT...FROM
    12.4.2 WHERE
    12.4.3 GROUP BY
    12.4.4 ORDER BY
    12.4.5 HAVING
    12.4.6 LIMIT
    12.4.7 JOIN
    12.4.8 UNION
    12.4.9 Subqueries
  12.5 Extended example: FiveThirtyEight flights
  12.6 SQL vs. R
  12.7 Further resources
  12.8 Exercises
13 Database administration
  13.1 Constructing efficient SQL databases
    13.1.1 Creating new databases
    13.1.2 CREATE TABLE
    13.1.3 Keys
    13.1.4 Indices
    13.1.5 EXPLAIN
    13.1.6 Partitioning
  13.2 Changing SQL data
    13.2.1 UPDATE
    13.2.2 INSERT
    13.2.3 LOAD DATA
  13.3 Extended example: Building a database
    13.3.1 Extract
    13.3.2 Transform
    13.3.3 Load into MySQL database
  13.4 Scalability
  13.5 Further resources
  13.6 Exercises
14 Working with spatial data
  14.1 Motivation: What's so great about spatial data?
  14.2 Spatial data structures
  14.3 Making maps
    14.3.1 Static maps with ggmap
    14.3.2 Projections
    14.3.3 Geocoding, routes, and distances
    14.3.4 Dynamic maps with leaflet
  14.4 Extended example: Congressional districts
    14.4.1 Election results
    14.4.2 Congressional districts
    14.4.3 Putting it all together
    14.4.4 Using ggmap
    14.4.5 Using leaflet
  14.5 Effective maps: How (not) to lie
  14.6 Extended example: Historical airline route maps
    14.6.1 Using ggmap
    14.6.2 Using leaflet
  14.7 Projecting polygons
  14.8 Playing well with others
  14.9 Further resources
  14.10 Exercises
15 Text as data
  15.1 Tools for working with text
    15.1.1 Regular expressions using Macbeth
    15.1.2 Example: Life and death in Macbeth
  15.2 Analyzing textual data
    15.2.1 Corpora
    15.2.2 Word clouds
    15.2.3 Document term matrices
  15.3 Ingesting text
    15.3.1 Example: Scraping the songs of the Beatles
    15.3.2 Scraping data from Twitter
  15.4 Further resources
  15.5 Exercises
16 Network science
  16.1 Introduction to network science
    16.1.1 Definitions
    16.1.2 A brief history of network science
  16.2 Extended example: Six degrees of Kristen Stewart
    16.2.1 Collecting Hollywood data
    16.2.2 Building the Hollywood network
    16.2.3 Building a Kristen Stewart oracle
  16.3 PageRank
  16.4 Extended example: 1996 men's college basketball
  16.5 Further resources
  16.6 Exercises
17 Epilogue - towards "big data"
  17.1 Notions of big data
  17.2 Tools for bigger data
    17.2.1 Data and memory structures for big data
    17.2.2 Compilation
    17.2.3 Parallel and distributed computing
    17.2.4 Alternatives to SQL
  17.3 Alternatives to R
  17.4 Closing thoughts
  17.5 Further resources

Packages used in this book
  A.1 The mdsr package
  A.2 The etl package suite
  A.3 Other packages
  A.4 Further resources
Intro to R & RStudio
  B.1 Installation
    B.1.1 Installation under Windows
    B.1.2 Installation under Mac OS X
    B.1.3 Installation under Linux
    B.1.4 RStudio
  B.2 Running RStudio and sample session
  B.3 Learning R
    B.3.1 Getting help
    B.3.2 swirl
  B.4 Fundamental structures and objects
    B.4.1 Objects and vectors
    B.4.2 Operators
    B.4.3 Lists
    B.4.4 Matrices
    B.4.5 Dataframes
    B.4.6 Attributes and classes
    B.4.7 Options
    B.4.8 Functions
  B.5 Add-ons: Packages
    B.5.1 Introduction to packages
    B.5.2 CRAN task views
    B.5.3 Session information
    B.5.4 Packages and name conflicts
    B.5.5 Maintaining packages
    B.5.6 Installed libraries and packages
  B.6 Further resources
  B.7 Exercises
Algorithmic thinking
  C.1 Introduction
  C.2 Simple example
  C.3 Extended example: Law of large numbers
  C.4 Non-standard evaluation
  C.5 Debugging and defensive coding
  C.6 Further resources
  C.7 Exercises
Reproducible analysis & workflow
  D.1 Scriptable statistical computing
  D.2 Reproducible analysis with R Markdown
  D.3 Projects and version control
  D.4 Further resources
  D.5 Exercises
Regression modeling
  E.1 Simple linear regression
    E.1.1 Motivating example: Modeling usage of a rail trail
    E.1.2 Model visualization
    E.1.3 Measuring the strength of fit
    E.1.4 Categorical explanatory variables
  E.2 Multiple regression
    E.2.1 Parallel slopes: Multiple regression with a categorical variable
    E.2.2 Parallel planes: Multiple regression with a second quantitative variable
    E.2.3 Non-parallel slopes: Multiple regression with interaction
    E.2.4 Modelling non-linear relationships
  E.3 Inference for regression
  E.4 Assumptions underlying regression
  E.5 Logistic regression
  E.6 Further resources
  E.7 Exercises
Setting up a database server
  F.1 SQLite
  F.2 MySQL
    F.2.1 Installation
    F.2.2 Access
    F.2.3 Running scripts from the command line
  F.3 PostgreSQL
  F.4 Connecting to SQL
    F.4.1 The command line client
    F.4.2 GUIs
    F.4.3 R and RStudio
    F.4.4 Load into SQLite database

Biblio
Index
R index