Category: Applied mathematics
Edition: 1
Authors: Nathan Carter
Series: Handbooks in Mathematics
ISBN: 9780367027056, 9780367528492
Publisher: CRC Press
Year of publication: 2020
Number of pages: 545
Language: English
File format: PDF (convertible to EPUB or AZW3 on request)
File size: 19 MB
Keywords for Data Science for Mathematicians: linear algebra, statistics, clustering, operations research, machine learning, neural networks
Mathematicians have skills that, if deepened in the right ways, would enable them to use data to answer questions important to them and others, and to report those answers in compelling ways. Data science combines parts of mathematics, statistics, and computer science. Gaining such power, and the ability to teach it, has reinvigorated the careers of many mathematicians. This handbook will assist mathematicians in better understanding the opportunities presented by data science. As it applies to curricula, research, and career opportunities, data science is a fast-growing field. Contributors from both academia and industry present their views on these opportunities and on how to take advantage of them.
Cover
Half Title
Series Page
Title Page
Copyright Page
Contents
Foreword
1. Introduction
1.1 Who should read this book?
1.2 What is data science?
1.3 Is data science new?
1.4 What can I expect from this book?
1.5 What will this book expect from me?
2. Programming with Data
2.1 Introduction
2.2 The computing environment
2.2.1 Hardware
2.2.2 The command line
2.2.3 Programming languages
2.2.4 Integrated development environments (IDEs)
2.2.5 Notebooks
2.2.6 Version control
2.3 Best practices
2.3.1 Write readable code
2.3.2 Don't repeat yourself
2.3.3 Set seeds for random processes
2.3.4 Profile, benchmark, and optimize judiciously
2.3.5 Test your code
2.3.6 Don't rely on black boxes
2.4 Data-centric coding
2.4.1 Obtaining data
2.4.1.1 Files
2.4.1.2 The web
2.4.1.3 Databases
2.4.1.4 Other sources and concerns
2.4.2 Data structures
2.4.3 Cleaning data
2.4.3.1 Missing data
2.4.3.2 Data values
2.4.3.3 Outliers
2.4.3.4 Other issues
2.4.4 Exploratory data analysis (EDA)
2.5 Getting help
2.6 Conclusion
3. Linear Algebra
3.1 Data and matrices
3.1.1 Data, vectors, and matrices
3.1.2 Term-by-document matrices
3.1.3 Matrix storage and manipulation issues
3.2 Matrix decompositions
3.2.1 Matrix decompositions and data science
3.2.2 The LU decomposition
3.2.2.1 Gaussian elimination
3.2.2.2 The matrices L and U
3.2.2.3 Permuting rows
3.2.2.4 Computational notes
3.2.3 The Cholesky decomposition
3.2.4 Least-squares curve-fitting
3.2.5 Recommender systems and the QR decomposition
3.2.5.1 A motivating example
3.2.5.2 The QR decomposition
3.2.5.3 Applications of the QR decomposition
3.2.6 The singular value decomposition
3.2.6.1 SVD in our recommender system
3.2.6.2 Further reading on the SVD
3.3 Eigenvalues and eigenvectors
3.3.1 Eigenproblems
3.3.2 Finding eigenvalues
3.3.3 The power method
3.3.4 PageRank
3.4 Numerical computing
3.4.1 Floating point computing
3.4.2 Floating point arithmetic
3.4.3 Further reading
3.5 Projects
3.5.1 Creating a database
3.5.2 The QR decomposition and query-matching
3.5.3 The SVD and latent semantic indexing
3.5.4 Searching a web
4. Basic Statistics
4.1 Introduction
4.2 Exploratory data analysis and visualizations
4.2.1 Descriptive statistics
4.2.2 Sampling and bias
4.3 Modeling
4.3.1 Linear regression
4.3.2 Polynomial regression
4.3.3 Group-wise models and clustering
4.3.4 Probability models
4.3.5 Maximum likelihood estimation
4.4 Confidence intervals
4.4.1 The sampling distribution
4.4.2 Confidence intervals from the sampling distribution
4.4.3 Bootstrap resampling
4.5 Inference
4.5.1 Hypothesis testing
4.5.1.1 First example
4.5.1.2 General strategy for hypothesis testing
4.5.1.3 Inference to compare two populations
4.5.1.4 Other types of hypothesis tests
4.5.2 Randomization-based inference
4.5.3 Type I and Type II error
4.5.4 Power and effect size
4.5.5 The trouble with p-hacking
4.5.6 Bias and scope of inference
4.6 Advanced regression
4.6.1 Transformations
4.6.2 Outliers and high leverage points
4.6.3 Multiple regression, interaction
4.6.4 What to do when the regression assumptions fail
4.6.5 Indicator variables and ANOVA
4.7 The linear algebra approach to statistics
4.7.1 The general linear model
4.7.2 Ridge regression and penalized regression
4.7.3 Logistic regression
4.7.4 The generalized linear model
4.7.5 Categorical data analysis
4.8 Causality
4.8.1 Experimental design
4.8.2 Quasi-experiments
4.9 Bayesian statistics
4.9.1 Bayes' formula
4.9.2 Prior and posterior distributions
4.10 A word on curricula
4.10.1 Data wrangling
4.10.2 Cleaning data
4.11 Conclusion
4.12 Sample projects
5. Clustering
5.1 Introduction
5.1.1 What is clustering?
5.1.2 Example applications
5.1.3 Clustering observations
5.2 Visualization
5.3 Distances
5.4 Partitioning and the k-means algorithm
5.4.1 The k-means algorithm
5.4.2 Issues with k-means
5.4.3 Example with wine data
5.4.4 Validation
5.4.5 Other partitioning algorithms
5.5 Hierarchical clustering
5.5.1 Linkages
5.5.2 Algorithm
5.5.3 Hierarchical simple example
5.5.4 Dendrograms and wine example
5.5.5 Other hierarchical algorithms
5.6 Case study
5.6.1 k-means results
5.6.2 Hierarchical results
5.6.3 Case study conclusions
5.7 Model-based methods
5.7.1 Model development
5.7.2 Model estimation
5.7.3 mclust and model selection
5.7.4 Example with wine data
5.7.5 Model-based versus k-means
5.8 Density-based methods
5.8.1 Example with iris data
5.9 Dealing with network data
5.9.1 Network clustering example
5.10 Challenges
5.10.1 Feature selection
5.10.2 Hierarchical clusters
5.10.3 Overlapping clusters, or fuzzy clustering
5.11 Exercises
6. Operations Research
6.1 History and background
6.1.1 How does OR connect to data science?
6.1.2 The OR process
6.1.3 Balance between efficiency and complexity
6.2 Optimization
6.2.1 Complexity-tractability trade-off
6.2.2 Linear optimization
6.2.2.1 Duality and optimality conditions
6.2.2.2 Extension to integer programming
6.2.3 Convex optimization
6.2.3.1 Duality and optimality conditions
6.2.4 Non-convex optimization
6.3 Simulation
6.3.1 Probability principles of simulation
6.3.2 Generating random variables
6.3.2.1 Simulation from a known distribution
6.3.2.2 Simulation from an empirical distribution: bootstrapping
6.3.2.3 Markov Chain Monte Carlo (MCMC) methods
6.3.3 Simulation techniques for statistical and machine learning model assessment
6.3.3.1 Bootstrapping confidence intervals
6.3.3.2 Cross-validation
6.3.4 Simulation techniques for prescriptive analytics
6.3.4.1 Discrete-event simulation
6.3.4.2 Agent-based modeling
6.3.4.3 Using these tools for prescriptive analytics
6.4 Stochastic optimization
6.4.1 Dynamic programming formulation
6.4.2 Solution techniques
6.5 Putting the methods to use: prescriptive analytics
6.5.1 Bike-sharing systems
6.5.2 A customer choice model for online retail
6.5.3 HIV treatment and prevention
6.6 Tools
6.6.1 Optimization solvers
6.6.2 Simulation software and packages
6.6.3 Stochastic optimization software and packages
6.7 Looking to the future
6.8 Projects
6.8.1 The vehicle routing problem
6.8.2 The unit commitment problem for power systems
6.8.3 Modeling project
6.8.4 Data project
7. Dimensionality Reduction
7.1 Introduction
7.2 The geometry of data and dimension
7.3 Principal Component Analysis
7.3.1 Derivation and properties
7.3.2 Connection to SVD
7.3.3 How PCA is used for dimension estimation and data reduction
7.3.4 Topological dimension
7.3.5 Multidimensional scaling
7.4 Good projections
7.5 Non-integer dimensions
7.5.1 Background on dynamical systems
7.5.2 Fractal dimension
7.5.3 The correlation dimension
7.5.4 Correlation dimension of the Lorenz attractor
7.6 Dimension reduction on the Grassmannian
7.7 Dimensionality reduction in the presence of symmetry
7.8 Category theory applied to data visualization
7.9 Other methods
7.9.1 Nonlinear Principal Component Analysis
7.9.2 Whitney's reduction network
7.9.3 The generalized singular value decomposition
7.9.4 False nearest neighbors
7.9.5 Additional methods
7.10 Interesting theorems on dimension
7.10.1 Whitney's theorem
7.10.2 Takens' theorem
7.10.3 Nash embedding theorems
7.10.4 Johnson-Lindenstrauss lemma
7.11 Conclusions
7.11.1 Summary and method of application
7.11.2 Suggested exercises
8. Machine Learning
8.1 Introduction
8.1.1 Core concepts of supervised learning
8.1.2 Types of supervised learning
8.2 Training dataset and test dataset
8.2.1 Constraints
8.2.2 Methods for data separation
8.3 Machine learning workflow
8.3.1 Step 1: obtaining the initial dataset
8.3.2 Step 2: preprocessing
8.3.2.1 Missing values and outliers
8.3.2.2 Feature engineering
8.3.3 Step 3: creating training and test datasets
8.3.4 Step 4: model creation
8.3.4.1 Scaling and normalization
8.3.4.2 Feature selection
8.3.5 Step 5: prediction and evaluation
8.3.6 Iterative model building
8.4 Implementing the ML workflow
8.4.1 Using scikit-learn
8.4.2 Transformer objects
8.5 Gradient descent
8.5.1 Loss functions
8.5.2 A powerful optimization tool
8.5.3 Application to regression
8.5.4 Support for regularization
8.6 Logistic regression
8.6.1 Logistic regression framework
8.6.2 Parameter estimation for logistic regression
8.6.3 Evaluating the performance of a classifier
8.7 Naïve Bayes classifier
8.7.1 Using Bayes' rule
8.7.1.1 Estimating the probabilities
8.7.1.2 Laplace smoothing
8.7.2 Health care example
8.8 Support vector machines
8.8.1 Linear SVMs in the case of linear separability
8.8.2 Linear SVMs without linear separability
8.8.3 Nonlinear SVMs
8.9 Decision trees
8.9.1 Classification trees
8.9.2 Regression decision trees
8.9.3 Pruning
8.10 Ensemble methods
8.10.1 Bagging
8.10.2 Random forests
8.10.3 Boosting
8.11 Next steps
9. Deep Learning
9.1 Introduction
9.1.1 Overview
9.1.2 History of neural networks
9.2 Multilayer perceptrons
9.2.1 Backpropagation
9.2.2 Neurons
9.2.3 Neural networks for classification
9.3 Training techniques
9.3.1 Initialization
9.3.2 Optimization algorithms
9.3.3 Dropout
9.3.4 Batch normalization
9.3.5 Weight regularization
9.3.6 Early stopping
9.4 Convolutional neural networks
9.4.1 Convnet layers
9.4.2 Convolutional architectures for ImageNet
9.5 Recurrent neural networks
9.5.1 LSTM cells
9.6 Transformers
9.6.1 Overview
9.6.2 Attention layers
9.6.3 Self-attention layers
9.6.4 Word order
9.6.5 Using transformers
9.7 Deep learning frameworks
9.7.1 Hardware acceleration
9.7.2 History of deep learning frameworks
9.7.3 TensorFlow with Keras
9.8 Open questions
9.9 Exercises and solutions
10. Topological Data Analysis
10.1 Introduction
10.2 Example applications
10.2.1 Image processing
10.2.2 Molecule configurations
10.2.3 Agent-based modeling
10.2.4 Dynamical systems
10.3 Topology
10.4 Simplicial complexes
10.5 Homology
10.5.1 Simplicial homology
10.5.2 Homology definitions
10.5.3 Homology example
10.5.4 Homology computation using linear algebra
10.6 Persistent homology
10.7 Sublevelset persistence
10.8 Software and exercises
10.9 References
10.10 Appendix: stability of persistent homology
10.10.1 Distances between datasets
10.10.2 Bottleneck distance and visualization
10.10.3 Stability results
Bibliography
Index