Edition: 0.16.1
Authors: Aston Zhang, Zachary C. Lipton, Mu Li, and Alexander J. Smola
Publisher: d2l.ai
Year of publication: 2021
Number of pages: 1021
Language: English
File format: PDF (converted to PDF, EPUB, or AZW3 upon user request)
File size: 28 MB
If you would like the file of Dive into Deep Learning converted to PDF, EPUB, AZW3, MOBI, or DJVU format, you can notify the support team and they will convert the file for you.
Please note that Dive into Deep Learning is the original English-language edition, not a Persian translation. The International Library website offers original-language books only and does not provide any books translated into or written in Persian.
Contents:
Preface
Installation
Notation
Introduction: A Motivating Example Key Components Data Models Objective Functions Optimization Algorithms Kinds of Machine Learning Problems Supervised Learning Unsupervised learning Interacting with an Environment Reinforcement Learning Roots The Road to Deep Learning Success Stories Characteristics
Preliminaries: Data Manipulation Getting Started Operations Broadcasting Mechanism Indexing and Slicing Saving Memory Conversion to Other Python Objects Data Preprocessing Reading the Dataset Handling Missing Data Conversion to the Tensor Format Linear Algebra Scalars Vectors Matrices Tensors Basic Properties of Tensor Arithmetic Reduction Dot Products Matrix-Vector Products Matrix-Matrix Multiplication Norms More on Linear Algebra Calculus Derivatives and Differentiation Partial Derivatives Gradients Chain Rule Automatic Differentiation A Simple Example Backward for Non-Scalar Variables Detaching Computation Computing the Gradient of Python Control Flow Probability Basic Probability Theory Dealing with Multiple Random Variables Expectation and Variance Documentation Finding All the Functions and Classes in a Module Finding the Usage of Specific Functions and Classes
Linear Neural Networks: Linear Regression Basic Elements of Linear Regression Vectorization for Speed The Normal Distribution and Squared Loss From Linear Regression to Deep Networks Linear Regression Implementation from Scratch Generating the Dataset Reading the Dataset Initializing Model Parameters Defining the Model Defining the Loss Function Defining the Optimization Algorithm Training Concise Implementation of Linear Regression Generating the Dataset Reading the Dataset Defining the Model Initializing Model Parameters Defining the Loss Function Defining the Optimization Algorithm Training Softmax Regression Classification Problem Network Architecture Parameterization Cost of Fully-Connected Layers Softmax Operation Vectorization for Minibatches Loss Function Information Theory Basics Model Prediction and Evaluation The Image Classification Dataset Reading the Dataset Reading a Minibatch Putting All Things Together Implementation of Softmax Regression from Scratch Initializing Model Parameters Defining the Softmax Operation Defining the Model Defining the Loss Function Classification Accuracy Training Prediction Concise Implementation of Softmax Regression Initializing Model Parameters Softmax Implementation Revisited Optimization Algorithm Training
Multilayer Perceptrons: Multilayer Perceptrons Hidden Layers Activation Functions Implementation of Multilayer Perceptrons from Scratch Initializing Model Parameters Activation Function Model Loss Function Training Concise Implementation of Multilayer Perceptrons Model Model Selection, Underfitting, and Overfitting Training Error and Generalization Error Model Selection Underfitting or Overfitting? Polynomial Regression Weight Decay Norms and Weight Decay High-Dimensional Linear Regression Implementation from Scratch Concise Implementation Dropout Overfitting Revisited Robustness through Perturbations Dropout in Practice Implementation from Scratch Concise Implementation Forward Propagation, Backward Propagation, and Computational Graphs Forward Propagation Computational Graph of Forward Propagation Backpropagation Training Neural Networks Numerical Stability and Initialization Vanishing and Exploding Gradients Parameter Initialization Environment and Distribution Shift Types of Distribution Shift Examples of Distribution Shift Correction of Distribution Shift A Taxonomy of Learning Problems Fairness, Accountability, and Transparency in Machine Learning Predicting House Prices on Kaggle Downloading and Caching Datasets Kaggle Accessing and Reading the Dataset Data Preprocessing Training K-Fold Cross-Validation Model Selection Submitting Predictions on Kaggle
Deep Learning Computation: Layers and Blocks A Custom Block The Sequential Block Executing Code in the Forward Propagation Function Efficiency Parameter Management Parameter Access Parameter Initialization Tied Parameters Deferred Initialization Instantiating a Network Custom Layers Layers without Parameters Layers with Parameters File I/O Loading and Saving Tensors Loading and Saving Model Parameters GPUs Computing Devices Tensors and GPUs Neural Networks and GPUs
Convolutional Neural Networks: From Fully-Connected Layers to Convolutions Invariance Constraining the MLP Convolutions “Where’s Waldo” Revisited Convolutions for Images The Cross-Correlation Operation Convolutional Layers Object Edge Detection in Images Learning a Kernel Cross-Correlation and Convolution Feature Map and Receptive Field Padding and Stride Padding Stride Multiple Input and Multiple Output Channels Multiple Input Channels Multiple Output Channels 1×1 Convolutional Layer Pooling Maximum Pooling and Average Pooling Padding and Stride Multiple Channels Convolutional Neural Networks (LeNet) LeNet Training
Modern Convolutional Neural Networks: Deep Convolutional Neural Networks (AlexNet) Learning Representations AlexNet Reading the Dataset Training Networks Using Blocks (VGG) VGG Blocks VGG Network Training Network in Network (NiN) NiN Blocks NiN Model Training Networks with Parallel Concatenations (GoogLeNet) Inception Blocks GoogLeNet Model Training Batch Normalization Training Deep Networks Batch Normalization Layers Implementation from Scratch Applying Batch Normalization in LeNet Concise Implementation Controversy Residual Networks (ResNet) Function Classes Residual Blocks ResNet Model Training Densely Connected Networks (DenseNet) From ResNet to DenseNet Dense Blocks Transition Layers DenseNet Model Training
Recurrent Neural Networks: Sequence Models Statistical Tools Training Prediction Text Preprocessing Reading the Dataset Tokenization Vocabulary Putting All Things Together Language Models and the Dataset Learning a Language Model Markov Models and n-grams Natural Language Statistics Reading Long Sequence Data Recurrent Neural Networks Neural Networks without Hidden States Recurrent Neural Networks with Hidden States RNN-based Character-Level Language Models Perplexity Implementation of Recurrent Neural Networks from Scratch One-Hot Encoding Initializing the Model Parameters RNN Model Prediction Gradient Clipping Training Concise Implementation of Recurrent Neural Networks Defining the Model Training and Predicting Backpropagation Through Time Analysis of Gradients in RNNs Backpropagation Through Time in Detail
Modern Recurrent Neural Networks: Gated Recurrent Units (GRU) Gated Hidden State Implementation from Scratch Concise Implementation Long Short-Term Memory (LSTM) Gated Memory Cell Implementation from Scratch Concise Implementation Deep Recurrent Neural Networks Functional Dependencies Concise Implementation Training and Prediction Bidirectional Recurrent Neural Networks Dynamic Programming in Hidden Markov Models Bidirectional Model Training a Bidirectional RNN for a Wrong Application Machine Translation and the Dataset Downloading and Preprocessing the Dataset Tokenization Vocabulary Loading the Dataset Putting All Things Together Encoder-Decoder Architecture Encoder Decoder Putting the Encoder and Decoder Together Sequence to Sequence Learning Encoder Decoder Loss Function Training Prediction Evaluation of Predicted Sequences Beam Search Greedy Search Exhaustive Search Beam Search
Attention Mechanisms: Attention Cues Attention Cues in Biology Queries, Keys, and Values Visualization of Attention Attention Pooling: Nadaraya-Watson Kernel Regression Generating the Dataset Average Pooling Nonparametric Attention Pooling Parametric Attention Pooling Attention Scoring Functions Masked Softmax Operation Additive Attention Scaled Dot-Product Attention Bahdanau Attention Model Defining the Decoder with Attention Training Multi-Head Attention Model Implementation Self-Attention and Positional Encoding Self-Attention Comparing CNNs, RNNs, and Self-Attention Positional Encoding Transformer Model Positionwise Feed-Forward Networks Residual Connection and Layer Normalization Encoder Decoder Training
Optimization Algorithms: Optimization and Deep Learning Optimization and Estimation Optimization Challenges in Deep Learning Convexity Basics Properties Constraints Gradient Descent Gradient Descent in One Dimension Multivariate Gradient Descent Adaptive Methods Stochastic Gradient Descent Stochastic Gradient Updates Dynamic Learning Rate Convergence Analysis for Convex Objectives Stochastic Gradients and Finite Samples Minibatch Stochastic Gradient Descent Vectorization and Caches Minibatches Reading the Dataset Implementation from Scratch Concise Implementation Momentum Basics Practical Experiments Theoretical Analysis Adagrad Sparse Features and Learning Rates Preconditioning The Algorithm Implementation from Scratch Concise Implementation RMSProp The Algorithm Implementation from Scratch Concise Implementation Adadelta The Algorithm Implementation Adam The Algorithm Implementation Yogi Learning Rate Scheduling Toy Problem Schedulers Policies
Computational Performance: Compilers and Interpreters Symbolic Programming Hybrid Programming HybridSequential Asynchronous Computation Asynchrony via Backend Barriers and Blockers Improving Computation Improving Memory Footprint Automatic Parallelism Parallel Computation on GPUs Parallel Computation and Communication Hardware Computers Memory Storage CPUs GPUs and other Accelerators Networks and Buses More Latency Numbers Training on Multiple GPUs Splitting the Problem Data Parallelism A Toy Network Data Synchronization Distributing Data Training Experiment Concise Implementation for Multiple GPUs A Toy Network Parameter Initialization and Logistics Training Experiments Parameter Servers Data Parallel Training Ring Synchronization Multi-Machine Training (key,value) Stores
Computer Vision: Image Augmentation Common Image Augmentation Method Using an Image Augmentation Training Model Fine-Tuning Hot Dog Recognition Object Detection and Bounding Boxes Bounding Box Anchor Boxes Generating Multiple Anchor Boxes Intersection over Union Labeling Training Set Anchor Boxes Bounding Boxes for Prediction Multiscale Object Detection The Object Detection Dataset Downloading the Dataset Reading the Dataset Demonstration Single Shot Multibox Detection (SSD) Model Training Prediction Region-based CNNs (R-CNNs) R-CNNs Fast R-CNN Faster R-CNN Mask R-CNN Semantic Segmentation and the Dataset Image Segmentation and Instance Segmentation The Pascal VOC2012 Semantic Segmentation Dataset Transposed Convolution Basic 2D Transposed Convolution Padding, Strides, and Channels Analogy to Matrix Transposition Fully Convolutional Networks (FCN) Constructing a Model Initializing the Transposed Convolution Layer Reading the Dataset Training Prediction Neural Style Transfer Technique Reading the Content and Style Images Preprocessing and Postprocessing Extracting Features Defining the Loss Function Creating and Initializing the Composite Image Training Image Classification (CIFAR-10) on Kaggle Obtaining and Organizing the Dataset Image Augmentation Reading the Dataset Defining the Model Defining the Training Functions Training and Validating the Model Classifying the Testing Set and Submitting Results on Kaggle Dog Breed Identification (ImageNet Dogs) on Kaggle Obtaining and Organizing the Dataset Image Augmentation Reading the Dataset Defining the Model Defining the Training Functions Training and Validating the Model Classifying the Testing Set and Submitting Results on Kaggle
Natural Language Processing: Pretraining: Word Embedding (word2vec) Why Not Use One-hot Vectors? The Skip-Gram Model The Continuous Bag of Words (CBOW) Model Approximate Training Negative Sampling Hierarchical Softmax The Dataset for Pretraining Word Embedding Reading and Preprocessing the Dataset Subsampling Loading the Dataset Putting All Things Together Pretraining word2vec The Skip-Gram Model Training Applying the Word Embedding Model Word Embedding with Global Vectors (GloVe) The GloVe Model Understanding GloVe from Conditional Probability Ratios Subword Embedding fastText Byte Pair Encoding Finding Synonyms and Analogies Using Pretrained Word Vectors Applying Pretrained Word Vectors Bidirectional Encoder Representations from Transformers (BERT) From Context-Independent to Context-Sensitive From Task-Specific to Task-Agnostic BERT: Combining the Best of Both Worlds Input Representation Pretraining Tasks Putting All Things Together The Dataset for Pretraining BERT Defining Helper Functions for Pretraining Tasks Transforming Text into the Pretraining Dataset Pretraining BERT Pretraining BERT Representing Text with BERT
Natural Language Processing: Applications: Sentiment Analysis and the Dataset The Sentiment Analysis Dataset Putting All Things Together Sentiment Analysis: Using Recurrent Neural Networks Using a Recurrent Neural Network Model Sentiment Analysis: Using Convolutional Neural Networks One-Dimensional Convolutional Layer Max-Over-Time Pooling Layer The TextCNN Model Natural Language Inference and the Dataset Natural Language Inference The Stanford Natural Language Inference (SNLI) Dataset Natural Language Inference: Using Attention The Model Training and Evaluating the Model Fine-Tuning BERT for Sequence-Level and Token-Level Applications Single Text Classification Text Pair Classification or Regression Text Tagging Question Answering Natural Language Inference: Fine-Tuning BERT Loading Pretrained BERT The Dataset for Fine-Tuning BERT Fine-Tuning BERT
Recommender Systems: Overview of Recommender Systems Collaborative Filtering Explicit Feedback and Implicit Feedback Recommendation Tasks The MovieLens Dataset Getting the Data Statistics of the Dataset Splitting the dataset Loading the data Matrix Factorization The Matrix Factorization Model Model Implementation Evaluation Measures Training and Evaluating the Model AutoRec: Rating Prediction with Autoencoders Model Implementing the Model Reimplementing the Evaluator Training and Evaluating the Model Personalized Ranking for Recommender Systems Bayesian Personalized Ranking Loss and its Implementation Hinge Loss and its Implementation Neural Collaborative Filtering for Personalized Ranking The NeuMF model Model Implementation Customized Dataset with Negative Sampling Evaluator Training and Evaluating the Model Sequence-Aware Recommender Systems Model Architectures Model Implementation Sequential Dataset with Negative Sampling Load the MovieLens 100K dataset Train the Model Feature-Rich Recommender Systems An Online Advertising Dataset Dataset Wrapper Factorization Machines 2-Way Factorization Machines An Efficient Optimization Criterion Model Implementation Load the Advertising Dataset Train the Model Deep Factorization Machines Model Architectures Implementation of DeepFM Training and Evaluating the Model
Generative Adversarial Networks: Generative Adversarial Networks Generate some “real” data Generator Discriminator Training Deep Convolutional Generative Adversarial Networks The Pokemon Dataset The Generator Discriminator Training
Appendix: Mathematics for Deep Learning: Geometry and Linear Algebraic Operations Geometry of Vectors Dot Products and Angles Hyperplanes Geometry of Linear Transformations Linear Dependence Rank Invertibility Determinant Tensors and Common Linear Algebra Operations Eigendecompositions Finding Eigenvalues Decomposing Matrices Operations on Eigendecompositions Eigendecompositions of Symmetric Matrices Gershgorin Circle Theorem A Useful Application: The Growth of Iterated Maps Conclusions Single Variable Calculus Differential Calculus Rules of Calculus Multivariable Calculus Higher-Dimensional Differentiation Geometry of Gradients and Gradient Descent A Note on Mathematical Optimization Multivariate Chain Rule The Backpropagation Algorithm Hessians A Little Matrix Calculus Integral Calculus Geometric Interpretation The Fundamental Theorem of Calculus Change of Variables A Comment on Sign Conventions Multiple Integrals Change of Variables in Multiple Integrals Random Variables Continuous Random Variables Maximum Likelihood The Maximum Likelihood Principle Numerical Optimization and the Negative Log-Likelihood Maximum Likelihood for Continuous Variables Distributions Bernoulli Discrete Uniform Continuous Uniform Binomial Poisson Gaussian Exponential Family Naive Bayes Optical Character Recognition The Probabilistic Model for Classification The Naive Bayes Classifier Training Statistics Evaluating and Comparing Estimators Conducting Hypothesis Tests Constructing Confidence Intervals Information Theory Information Entropy Mutual Information Kullback–Leibler Divergence Cross Entropy
Appendix: Tools for Deep Learning: Using Jupyter Editing and Running the Code Locally Advanced Options Using Amazon SageMaker Registering and Logging In Creating a SageMaker Instance Running and Stopping an Instance Updating Notebooks Using AWS EC2 Instances Creating and Running an EC2 Instance Installing CUDA Installing MXNet and Downloading the D2L Notebooks Running Jupyter Closing Unused Instances Using Google Colab Selecting Servers and GPUs Selecting Servers Selecting GPUs Contributing to This Book Minor Text Changes Propose a Major Change Adding a New Section or a New Framework Implementation Submitting a Major Change d2l API Document
Bibliography
Python Module Index
Index