Download the book Deep Learning for NLP and Speech Recognition

Book details

Deep Learning for NLP and Speech Recognition

Category: Cybernetics: Artificial Intelligence
Edition:
Authors: Uday Kamath, John Liu, Jimmy Whitaker
Series:
ISBN: 9783030145958, 3030145956
Publisher: Springer
Year of publication: 2019
Number of pages: 640
Language: English
File format: PDF (converted to PDF, EPUB, or AZW3 on request)
File size: 8 MB

Price (toman): 35,000

Average rating:
Number of ratings: 8

If you need the file of Deep Learning for NLP and Speech Recognition converted to PDF, EPUB, AZW3, MOBI, or DJVU, notify support and they will convert it for you.

Note that Deep Learning for NLP and Speech Recognition is offered in its original language and is not a Persian translation. The International Library website provides original-language books only and does not offer any books translated into or written in Persian.

Book description

This textbook explains deep learning architectures, with applications to various NLP tasks, including document classification, machine translation, language modeling, and speech recognition. With the widespread adoption of deep learning, natural language processing (NLP), and speech applications in many areas (including finance, healthcare, and government), there is a growing need for one comprehensive resource that maps deep learning techniques to NLP and speech and provides insights into using the tools and libraries for real-world applications. Deep Learning for NLP and Speech Recognition explains recent deep learning methods applicable to NLP and speech, provides state-of-the-art approaches, and offers real-world case studies with code to provide hands-on experience.

Many books focus on deep learning theory or deep learning for NLP-specific tasks, while others are cookbooks for tools and libraries, but the constant flux of new algorithms, tools, frameworks, and libraries in a rapidly evolving landscape means that there are few available texts that offer the material in this book.

The book is organized into three parts, aligned to different groups of readers and their expertise:

Machine Learning, NLP, and Speech Introduction: The first part has three chapters that introduce readers to the fields of NLP, speech recognition, deep learning, and machine learning with basic theory and hands-on case studies using Python-based tools and libraries.

Deep Learning Basics: The five chapters in the second part introduce deep learning and various topics that are crucial for speech and text processing, including word embeddings, convolutional neural networks, recurrent neural networks, and speech recognition basics. They cover theory, practical tips, state-of-the-art methods, experiments, and analysis of applying the methods discussed in theory to real-world tasks.

Advanced Deep Learning Techniques for Text and Speech: The third part has five chapters that discuss the latest cutting-edge research in the areas of deep learning that intersect with NLP and speech. Topics including attention mechanisms, memory-augmented networks, transfer learning, multi-task learning, domain adaptation, reinforcement learning, and end-to-end deep learning for speech recognition are covered using case studies.

Table of contents

Foreword
 Why This Book?
 What Does This Book Cover?
 Acknowledgments
Contents
Notation
Part I Machine Learning, NLP, and Speech Introduction
 1 Introduction
   1.1.1 Supervised Learning
   1.1.2 Unsupervised Learning
  1.2 History
   1.2.1 Deep Learning: A Brief History
   1.2.2 Natural Language Processing: A Brief History
   1.2.3 Automatic Speech Recognition: A Brief History
   1.3.1 Deep Learning
   1.3.2 Natural Language Processing
    1.3.3.3 Additional Tools and Libraries
   1.3.5 Online Courses and Resources
   1.3.6 Datasets
  1.4 Case Studies and Implementation Details
  References
  2.1 Introduction
   2.2.1 Input Space and Samples
   2.2.3 Training and Prediction
  2.3 The Learning Process
   2.4.1 Generalization–Approximation Trade-Off via the Vapnik–Chervonenkis Analysis
   2.4.2 Generalization–Approximation Trade-off via the Bias–Variance Analysis
    2.4.3.1 Classification Evaluation Metrics
    2.4.3.2 Regression Evaluation Metrics
   2.4.4 Model Validation
   2.4.5 Model Estimation and Comparisons
   2.4.6 Practical Tips for Machine Learning
   2.5.1 Linear Regression
    2.5.1.1 Discussion Points
    2.5.2.1 Discussion Points
   2.5.3 Regularization
    2.5.3.2 Lasso Regularization: L1 Norm
   2.5.4 Logistic Regression
    2.5.4.1 Gradient Descent
    2.5.4.2 Stochastic Gradient Descent
   2.5.5 Generative Classifiers
    2.5.5.1 Naive Bayes
   2.5.6 Practical Tips for Linear Algorithms
  2.6 Non-linear Algorithms
   2.6.1 Support Vector Machines
  2.7 Feature Transformation, Selection, and Dimensionality Reduction
    2.7.1.4 Discretization
    2.7.2.1 Principal Component Analysis
   2.8.1 Discrete Time Markov Chains
   2.8.2 Generative Approach: Hidden Markov Models
   2.8.3 Discriminative Approach: Conditional Random Fields
    2.8.3.2 CRF Distribution
    2.8.3.3 CRF Training
   2.9.2 Exploratory Data Analysis (EDA)
    2.9.3.1 Feature Transformation and Reduction Impact
    2.9.3.2 Hyperparameter Search and Validation
   2.9.4 Final Training and Testing Models
  References
   3.1.1 Computational Linguistics
   3.1.2 Natural Language
   3.1.3 Model of Language
  3.2 Morphological Analysis
   3.2.1 Stemming
   3.3.1 Tokens
   3.3.3 N-Grams
   3.3.4 Documents
    3.3.4.2 Bag-of-Words
  3.4 Syntactic Representations
   3.4.1 Part-of-Speech
    3.4.1.2 Hidden Markov Models
   3.4.2 Dependency Parsing
    3.4.2.2 Chunking
  3.5 Semantic Representations
   3.5.1 Named Entity Recognition
   3.5.2 Relation Extraction
   3.5.4 Semantic Role Labeling
   3.6.3 Anaphora/Cataphora
  3.7 Language Models
   3.7.2 Laplace Smoothing
   3.7.4 Perplexity
   3.8.1 Machine Learning Approach
    3.8.2.1 Emotional State Model
    3.8.2.2 Subjectivity and Objectivity Detection
   3.8.3 Entailment
  3.9 Text Clustering
    3.9.2.1 LSA
   3.10.1 Dictionary Based
  3.11 Question Answering
   3.11.1 Information Retrieval Based
   3.11.3 Automated Reasoning
   3.12.1 Extraction Based
   3.13.1 Acoustic Model
    3.13.1.2 MFCC
  3.14 Case Study
   3.14.2 EDA
   3.14.3 Text Clustering
    3.14.4.1 LSA
   3.14.5 Text Classification
   3.14.6 Exercises for Readers and Practitioners
  References
Part II Deep Learning Basics
  4.1 Introduction
   4.2.1 Bias
  4.3 Multilayer Perceptron (Neural Networks)
   4.3.1 Training an MLP
   4.3.2 Forward Propagation
   4.3.3 Error Computation
   4.3.4 Backpropagation
   4.3.5 Parameter Update
   4.3.6 Universal Approximation Theorem
  4.4 Deep Learning
   4.4.1 Activation Functions
    4.4.1.1 Sigmoid
    4.4.1.3 ReLU
    4.4.1.4 Other Activation Functions
    4.4.1.6 Hierarchical Softmax
    4.4.2.2 Mean Absolute (L1) Error
   4.4.3 Optimization Methods
    4.4.3.3 Adagrad
    4.4.3.5 ADAM
   4.5.1 Early Stopping
   4.5.2 Vanishing/Exploding Gradients
   4.5.4 Regularization
    4.5.4.2 L1 Regularization
    4.5.4.3 Dropout
    4.5.4.6 Batch Normalization
    4.5.5.2 Automated Tuning
   4.5.6 Data Availability and Quality
    4.5.6.3 Adversarial Training
    4.5.7.1 Computation and Memory Constraints
   4.6.1 Energy-Based Models
   4.6.2 Restricted Boltzmann Machines
   4.6.4 Autoencoders
    4.6.4.3 Sparse Autoencoders
    4.6.4.4 Variational Autoencoders
   4.6.6 Generative Adversarial Networks
  4.7 Framework Considerations
   4.7.1 Layer Abstraction
   4.7.2 Computational Graphs
   4.7.4 Static Computational Graphs
   4.8.1 Software Tools and Libraries
   4.8.2 Exploratory Data Analysis (EDA)
   4.8.3 Supervised Learning
   4.8.4 Unsupervised Learning
   4.8.5 Classifying with Unsupervised Features
   4.8.7 Exercises for Readers and Practitioners
  References
   5.2.1 Vector Space Model
    5.2.1.1 Curse of Dimensionality
    5.2.2.2 LSA
    5.2.3.1 Bengio
    5.2.3.2 Collobert and Weston
   5.2.4 word2vec
    5.2.4.1 CBOW
    5.2.4.2 Skip-Gram
    5.2.4.3 Hierarchical Softmax
    5.2.4.4 Negative Sampling
    5.2.4.6 word2vec CBOW: Forward and Backward Propagation
    5.2.4.7 word2vec Skip-gram: Forward and Backward Propagation
   5.2.5 GloVe
   5.2.6 Spectral Word Embeddings
   5.3.1 Out of Vocabulary
   5.3.2 Antonymy
   5.3.3 Polysemy
    5.3.3.2 Sense2vec
  5.4 Beyond Word Embeddings
   5.4.2 Word Vector Quantization
   5.4.3 Sentence Embeddings
   5.4.4 Concept Embeddings
   5.4.5 Retrofitting with Semantic Lexicons
    5.4.6.1 Word2Gauss
    5.4.6.2 Bayesian Skip-Gram
   5.4.7 Hyperbolic Embeddings
  5.5 Applications
   5.5.2 Document Clustering
   5.5.3 Language Modeling
   5.5.4 Text Anomaly Detection
   5.5.5 Contextualized Embeddings
   5.6.2 Exploratory Data Analysis
   5.6.3 Learning Word Embeddings
    5.6.3.2 Negative Sampling
    5.6.3.3 Training the Model
    5.6.3.5 Using the Gensim package
    5.6.3.6 Similarity
    5.6.3.7 GloVe Embeddings
    5.6.3.8 Co-occurrence Matrix
    5.6.3.9 GloVe Training
    5.6.3.10 GloVe Vector Similarity
    5.6.3.11 Using the Glove Package
    5.6.4.1 Document Vectors
   5.6.5 Word Sense Disambiguation
    5.6.5.2 Training with word2vec
  References
  6.1 Introduction
    6.2.1.2 The Convolution Operator and Its Properties
   6.2.2 Local Connectivity or Sparse Interactions
   6.2.4 Spatial Arrangement
   6.2.5 Detector Using Nonlinearity
    6.2.6.2 Average Pooling
    6.2.6.5 Spectral Pooling
  6.3 Forward and Backpropagation in CNN
   6.3.1 Gradient with Respect to Weights ∂E/∂W
   6.3.2 Gradient with Respect to the Inputs ∂E/∂X
  6.4 Text Inputs and CNNs
   6.4.1 Word Embeddings and CNN
   6.4.2 Character-Based Representation and CNN
  6.5 Classic CNN Architectures
   6.5.1 LeNet-5
   6.5.2 AlexNet
  6.6 Modern CNN Architectures
   6.6.1 Stacked or Hierarchical CNN
   6.6.2 Dilated CNN
   6.6.3 Inception Networks
   6.6.4 Other CNN Structures
  6.7 Applications of CNN in NLP
   6.7.1 Text Classification and Categorization
   6.7.4 Information Extraction
   6.7.5 Machine Translation
   6.7.7 Question and Answers
   6.8.2 Fast Filtering Algorithm
   6.9.1 Software Tools and Libraries
   6.9.3 Data Preprocessing and Data Splits
   6.9.4 CNN Model Experiments
   6.9.5 Understanding and Improving the Models
   6.9.6 Exercises for Readers and Practitioners
  References
  7.1 Introduction
   7.2.1 Recurrence and Memory
   7.2.2 PyTorch Example
   7.3.1 Forward and Backpropagation in RNNs
    7.3.1.1 Output Weights (V)
    7.3.1.2 Recurrent Weights (W)
   7.3.2 Vanishing Gradient Problem and Regularization
    7.3.2.1 Long Short-Term Memory
    7.3.2.2 Gated Recurrent Unit
    7.3.2.4 BPTT Sequence Length
   7.4.1 Deep RNNs
   7.4.2 Residual LSTM
   7.4.4 Bidirectional RNNs
   7.4.6 Recursive Neural Networks
  7.5 Extensions of Recurrent Networks
   7.5.1 Sequence-to-Sequence
   7.5.2 Attention
   7.5.3 Pointer Networks
   7.5.4 Transformer Networks
   7.6.1 Text Classification
   7.6.4 Topic Modeling and Summarization
   7.6.7 Language Models
   7.6.8 Neural Machine Translation
    7.6.8.1 BLEU
    7.6.9.2 Random Sampling and Temperature Sampling
    7.6.9.3 Optimizing Output: Beam Search Decoding
  7.7 Case Study
    7.7.2.1 Sequence Length Filtering
    7.7.2.2 Vocabulary Inspection
   7.7.3 Model Training
    7.7.3.1 RNN Baseline
    7.7.3.3 RNN, LSTM, and GRU Layer Depth Comparison
    7.7.3.5 Deep Bidirectional Comparison
    7.7.3.6 Transformer Network
   7.7.4 Results
   7.7.5 Exercises for Readers and Practitioners
   7.8.1 Memorization or Generalization
  References
  8.1 Introduction
   8.2.1 Speech Production
   8.2.2 Raw Waveform
    8.2.3.1 Pre-emphasis
    8.2.3.3 Windowing
    8.2.3.4 Fast Fourier Transform
    8.2.3.6 Discrete Cosine Transform
   8.2.4 Other Feature Types
  8.3 Phones
  8.4 Statistical Speech Recognition
   8.4.1 Acoustic Model: P(X|W)
    8.4.1.1 Lexicon Model: P(S|W)
   8.4.2 Language Model: P(W)
   8.4.3 HMM Decoding
  8.5 Error Metrics
  8.6 DNN/HMM Hybrid Model
  8.7 Case Study
   8.7.3 Sphinx
    8.7.3.1 Data Preparation
   8.7.4 Kaldi
    8.7.4.1 Data Preparation
    8.7.4.2 Model Training
   8.7.5 Results
   8.7.6 Exercises for Readers and Practitioners
  References
Part III Advanced Deep Learning Techniques for Text and Speech
  9.1 Introduction
  9.2 Attention Mechanism
   9.2.1 The Need for Attention Mechanism
   9.2.2 Soft Attention
   9.2.3 Scores-Based Attention
   9.2.5 Local vs. Global Attention
   9.2.6 Self-Attention
   9.2.7 Key-Value Attention
   9.2.8 Multi-Head Self-Attention
   9.2.9 Hierarchical Attention
   9.2.10 Applications of Attention Mechanism in Text and Speech
   9.3.1 Memory Networks
    9.3.2.2 Input and Query
    9.3.2.6 Multiple Layers
   9.3.3 Neural Turing Machines
    9.3.3.1 Read Operations
    9.3.3.2 Write Operations
    9.3.3.3 Addressing Mechanism
   9.3.4 Differentiable Neural Computer
    9.3.4.2 Memory Reads and Writes
    9.3.4.3 Selective Attention
    9.3.5.1 Input Module
    9.3.5.3 Episodic Memory Module
    9.3.5.5 Training
    9.3.6.1 Neural Stack
    9.3.6.2 Recurrent Networks, Controller, and Training
    9.3.7.1 Input Encoder
    9.3.7.2 Dynamic Memory
    9.3.7.3 Output Module and Training
   9.4.1 Attention-Based NMT
    9.4.2.2 Model Training
    9.4.2.3 Bahdanau Attention
    9.4.2.4 Results
    9.4.3.2 Exploratory Data Analysis
    9.4.3.3 LSTM Baseline
    9.4.3.4 End-to-End Memory Network
   9.4.4 Dynamic Memory Network
    9.4.4.1 Differentiable Neural Computer
    9.4.4.2 Recurrent Entity Network
   9.4.5 Exercises for Readers and Practitioners
  References
  10.1 Introduction
  10.2 Transfer Learning: Definition, Scenarios, and Categorization
   10.2.1 Definition
   10.2.3 Transfer Learning Categories
  10.3 Self-Taught Learning
    10.3.1.1 Unsupervised Pre-training and Supervised Fine-Tuning
   10.3.2 Theory
   10.3.4 Applications in Speech
    10.4.1.1 Multilinear Relationship Network
    10.4.1.2 Fully Adaptive Feature Sharing Network
    10.4.1.3 Cross-Stitch Networks
    10.4.1.4 A Joint Many-Task Network
    10.4.1.5 Sluice Networks
   10.4.3 Applications in NLP
   10.5.1 Software Tools and Libraries
   10.5.2 Exploratory Data Analysis
   10.5.3 Multitask Learning Experiments and Analysis
  References
  11.1 Introduction
    11.1.1.1 Stacked Autoencoders
    11.1.1.2 Deep Interpolation Between Source and Target
    11.1.1.3 Deep Domain Confusion
    11.1.1.4 Deep Adaptation Network
    11.1.1.5 Domain-Invariant Representation
    11.1.1.6 Domain Confusion and Invariant Representation
    11.1.1.7 Domain-Adversarial Neural Network
    11.1.1.8 Adversarial Discriminative Domain Adaptation
    11.1.1.9 Coupled Generative Adversarial Networks
    11.1.1.10 Cycle Generative Adversarial Networks
    11.1.1.11 Domain Separation Networks
    11.1.2.1 Siamese Networks Based Domain Adaptations
   11.1.3 Applications in NLP
   11.1.4 Applications in Speech Recognition
   11.2.1 Zero-Shot Learning
    11.2.1.1 Techniques
    11.2.2.1 Techniques
    11.2.3.1 Techniques
   11.2.5 Applications in NLP and Speech Recognition
  11.3 Case Study
   11.3.2 Exploratory Data Analysis
   11.3.3 Domain Adaptation Experiments
    11.3.3.2 Experiments
    11.3.3.3 Results and Analysis
   11.3.4 Exercises for Readers and Practitioners
  References
  12.1 Introduction
  12.2 Connectionist Temporal Classification (CTC)
   12.2.2 Deep Speech
    12.2.2.1 GPU Parallelism
   12.2.3 Deep Speech 2
   12.2.4 Wav2Letter
    12.2.5.1 Gram-CTC
  12.3 Seq-to-Seq
   12.3.0.2 Location-Aware Attention
   12.3.2 Listen, Attend, and Spell (LAS)
  12.4 Multitask Learning
    12.5.1.1 N-gram
   12.5.2 CTC Decoding
    12.5.3.1 Shallow Fusion
    12.5.4.1 Deep Fusion
   12.5.5 Combined CTC–Attention Decoding
   12.5.6 One-Pass Decoding
   12.6.1 Speech Embeddings
   12.6.3 Audio Word2Vec
   12.7.1 Software Tools and Libraries
    12.7.2.2 Acoustic Model Training
   12.7.3 Language Model Training
   12.7.4 ESPnet
    12.7.4.2 Model Training
   12.7.5 Results
  References
  13.2 RL Fundamentals
   13.2.1 Markov Decision Processes
   13.2.2 Value, Q, and Advantage Functions
   13.2.3 Bellman Equations
   13.2.4 Optimality
    13.2.5.2 Policy Improvement
    13.2.5.4 Bootstrapping
   13.2.6 Monte Carlo
   13.2.7 Temporal Difference Learning
    13.2.7.1 SARSA
   13.2.8 Policy Gradient
   13.2.9 Q-Learning
   13.2.10 Actor-Critic
    13.2.10.1 Advantage Actor Critic A2C
   13.3.1 Why RL for Seq2seq
   13.3.2 Deep Policy Gradient
    13.3.3.1 DQN
    13.3.3.2 Double DQN
    13.3.3.3 Dueling Networks
   13.3.4 Deep Advantage Actor-Critic
   13.4.1 Information Extraction
    13.4.1.1 Entity Extraction
    13.4.1.2 Relation Extraction
    13.4.1.4 Joint Entity/Relation Extraction
   13.4.2 Text Classification
   13.4.3 Dialogue Systems
   13.4.4 Text Summarization
  13.5 DRL for Speech
   13.5.2 Speech Enhancement and Noise Suppression
   13.6.1 Software Tools and Libraries
   13.6.3 Exploratory Data Analysis
    13.6.3.2 Policy Gradient
    13.6.3.3 DDQN
  References
 Transition to AI-Centric
 Explainable AI
 NLP Trends
 Closing Remarks
Index