Edition: [1 ed.]
Authors: E. R. Davies (editor), Matthew Turk (editor)
Series:
ISBN: 0128221097, 9780128221099
Publisher: Academic Press
Year of publication: 2021
Number of pages: 582 [584]
Language: English
File format: PDF (can be converted to EPUB or AZW3 on request)
File size: 26 MB
To have the file of Advanced Methods and Deep Learning in Computer Vision (Computer Vision and Pattern Recognition) converted to PDF, EPUB, AZW3, MOBI, or DJVU, notify the support team and they will convert the file for you.
Please note that Advanced Methods and Deep Learning in Computer Vision (Computer Vision and Pattern Recognition) is the original-language edition, not a Persian translation. The International Library website offers original-language books only and does not provide any books translated into or written in Persian.
Advanced Methods and Deep Learning in Computer Vision presents advanced computer vision methods, emphasizing machine and deep learning techniques that have emerged during the past 5–10 years. The book provides clear explanations of principles and algorithms supported with applications. Topics covered include machine learning, deep learning networks, generative adversarial networks, deep reinforcement learning, self-supervised learning, extraction of robust features, object detection, semantic segmentation, linguistic descriptions of images, visual search, visual tracking, 3D shape retrieval, image inpainting, novelty and anomaly detection.
This book provides easy learning for researchers and practitioners of advanced computer vision methods, but it is also suitable as a textbook for a second course on computer vision and deep learning for advanced undergraduates and graduate students.
Front Cover
Advanced Methods and Deep Learning in Computer Vision
Copyright
Contents
List of contributors
About the editors
Preface
1 The dramatically changing face of computer vision
  1.1 Introduction – computer vision and its origins
  1.2 Part A – Understanding low-level image processing operators
    1.2.1 The basics of edge detection
    1.2.2 The Canny operator
    1.2.3 Line segment detection
    1.2.4 Optimizing detection sensitivity
    1.2.5 Dealing with variations in the background intensity
    1.2.6 A theory combining the matched filter and zero-mean constructs
    1.2.7 Mask design – other considerations
    1.2.8 Corner detection
    1.2.9 The Harris 'interest point' operator
  1.3 Part B – 2-D object location and recognition
    1.3.1 The centroidal profile approach to shape analysis
    1.3.2 Hough-based schemes for object detection
    1.3.3 Application of the Hough transform to line detection
    1.3.4 Using RANSAC for line detection
    1.3.5 A graph-theoretic approach to object location
    1.3.6 Using the generalized Hough transform (GHT) to save computation
    1.3.7 Part-based approaches
  1.4 Part C – 3-D object location and the importance of invariance
    1.4.1 Introduction to 3-D vision
    1.4.2 Pose ambiguities under perspective projection
    1.4.3 Invariants as an aid to 3-D recognition
    1.4.4 Cross ratios: the 'ratio of ratios' concept
    1.4.5 Invariants for noncollinear points
    1.4.6 Vanishing point detection
    1.4.7 More on vanishing points
    1.4.8 Summary: the value of invariants
    1.4.9 Image transformations for camera calibration
    1.4.10 Camera calibration
    1.4.11 Intrinsic and extrinsic parameters
    1.4.12 Multiple view vision
    1.4.13 Generalized epipolar geometry
    1.4.14 The essential matrix
    1.4.15 The fundamental matrix
    1.4.16 Properties of the essential and fundamental matrices
    1.4.17 Estimating the fundamental matrix
    1.4.18 Improved methods of triangulation
    1.4.19 The achievements and limitations of multiple view vision
  1.5 Part D – Tracking moving objects
    1.5.1 Tracking – the basic concept
    1.5.2 Alternatives to background subtraction
  1.6 Part E – Texture analysis
    1.6.1 Introduction
    1.6.2 Basic approaches to texture analysis
    1.6.3 Laws' texture energy approach
    1.6.4 Ade's eigenfilter approach
    1.6.5 Appraisal of the Laws and Ade approaches
    1.6.6 More recent developments
  1.7 Part F – From artificial neural networks to deep learning methods
    1.7.1 Introduction: how ANNs metamorphosed into CNNs
    1.7.2 Parameters for defining CNN architectures
    1.7.3 Krizhevsky et al.'s AlexNet architecture
    1.7.4 Simonyan and Zisserman's VGGNet architecture
    1.7.5 Noh et al.'s DeconvNet architecture
    1.7.6 Badrinarayanan et al.'s SegNet architecture
    1.7.7 Application of deep learning to object tracking
    1.7.8 Application of deep learning to texture classification
    1.7.9 Texture analysis in the world of deep learning
  1.8 Part G – Summary
  Acknowledgments
  References
  Biographies
2 Advanced methods for robust object detection
  2.1 Introduction
  2.2 Preliminaries
  2.3 R-CNN
    2.3.1 System design
    2.3.2 Training
  2.4 SPP-Net
  2.5 Fast R-CNN
    2.5.1 Architecture
    2.5.2 RoI pooling
    2.5.3 Multitask loss
    2.5.4 Finetuning strategy
  2.6 Faster R-CNN
    2.6.1 Architecture
    2.6.2 Region proposal networks
  2.7 Cascade R-CNN
    2.7.1 Architecture
    2.7.2 Cascaded bounding box regression
    2.7.3 Cascaded detection
  2.8 Multiscale feature representation
    2.8.1 MS-CNN
      2.8.1.1 Architecture
    2.8.2 FPN
      2.8.2.1 Architecture
        Bottom-up pathway
        Top-down pathway and lateral connections
  2.9 YOLO
  2.10 SSD
    2.10.1 Architecture
    2.10.2 Training
  2.11 RetinaNet
    2.11.1 Focal loss
  2.12 Detection performances
  2.13 Conclusion
  References
  Biographies
3 Learning with limited supervision
  3.1 Introduction
  3.2 Context-aware active learning
    3.2.1 Active learning
    3.2.2 Context in active learning
    3.2.3 Framework for context-aware active learning
    3.2.4 Applications
  3.3 Weakly supervised event localization
    3.3.1 Network architecture
    3.3.2 k-max multiple instance learning
    3.3.3 Coactivity similarity
    3.3.4 Applications
  3.4 Domain adaptation of semantic segmentation using weak labels
    3.4.1 Weak labels for category classification
    3.4.2 Weak labels for feature alignment
    3.4.3 Network optimization
    3.4.4 Acquiring weak labels
    3.4.5 Applications
    3.4.6 Output space visualization
  3.5 Weakly-supervised reinforcement learning for dynamical tasks
    3.5.1 Learning subgoal prediction
    3.5.2 Supervised pretraining
    3.5.3 Applications
  3.6 Conclusions
  Acknowledgments
  References
  Biographies
4 Efficient methods for deep learning
  4.1 Model compression
    4.1.1 Parameter pruning
    4.1.2 Low-rank factorization
    4.1.3 Quantization
    4.1.4 Knowledge distillation
    4.1.5 Automated model compression
  4.2 Efficient neural network architectures
    4.2.1 Standard convolution layer
    4.2.2 Efficient convolution layers
    4.2.3 Manually designed efficient CNN models
    4.2.4 Neural architecture search
    4.2.5 Hardware-aware neural architecture search
  4.3 Conclusion
  References
5 Deep conditional image generation
  5.1 Introduction
  5.2 Visual pattern learning: a brief review
  5.3 Classical generative models
  5.4 Deep generative models
  5.5 Deep conditional image generation
  5.6 Disentanglement for controllable synthesis
    5.6.1 Disentangle visual content and style
    5.6.2 Disentangle structure and style
    5.6.3 Disentangle identity and attributes
  5.7 Conclusion and discussions
  References
6 Deep face recognition using full and partial face images
  6.1 Introduction
    6.1.1 Deep learning models
      6.1.1.1 The structure of a CNN
      6.1.1.2 Methods of training CNNs
      6.1.1.3 Datasets for deep face recognition experimentation
  6.2 Components of deep face recognition
    6.2.1 An example of a trained CNN model for face recognition
      6.2.1.1 Feature extraction
      6.2.1.2 Feature classification
  6.3 Face recognition using full face images
    6.3.1 Similarity matching using the FaceNet model
  6.4 Deep face recognition using partial face data
  6.5 Specific model training for full and partial faces
    6.5.1 Suggested architecture of the model
    6.5.2 Training phase
  6.6 Discussion and conclusions
  References
  Biographies
7 Unsupervised domain adaptation using shallow and deep representations
  7.1 Introduction
  7.2 Unsupervised domain adaptation using manifolds
    7.2.1 Unsupervised domain adaptation using product manifolds
  7.3 Unsupervised domain adaptation using dictionaries
    7.3.1 Generalized domain adaptive dictionary learning
    7.3.2 Joint hierarchical domain adaptation and feature learning
    7.3.3 Incremental dictionary learning for unsupervised domain adaptation
  7.4 Unsupervised domain adaptation using deep networks
    7.4.1 Discriminative approaches for domain adaptation
    7.4.2 Generative approaches for domain adaptation
  7.5 Summary
  References
  Biographies
8 Domain adaptation and continual learning in semantic segmentation
  8.1 Introduction
    8.1.1 Problem formulation
  8.2 Unsupervised domain adaptation
    8.2.1 Domain adaptation problem formulation
    8.2.2 Adaptation focus
      8.2.2.1 Input level adaptation
      8.2.2.2 Feature level adaptation
      8.2.2.3 Output level adaptation
    8.2.3 Unsupervised domain adaptation techniques
      8.2.3.1 Domain adversarial adaptation
      8.2.3.2 Generative-based adaptation
      8.2.3.3 Classifier discrepancy
      8.2.3.4 Self-supervised learning
        Self-training
        Entropy minimization
      8.2.3.5 Multitasking
  8.3 Continual learning
    8.3.1 Continual learning problem formulation
    8.3.2 Continual learning setups in semantic segmentation
    8.3.3 Incremental learning techniques
      8.3.3.1 Knowledge distillation
      8.3.3.2 Parameter freezing
      8.3.3.3 Geometrical feature-level regularization
      8.3.3.4 New directions
  8.4 Conclusion
  Acknowledgment
  References
  Biographies
9 Visual tracking
  9.1 Introduction
    9.1.1 Problem definition
    9.1.2 Challenges in tracking
    9.1.3 Motivation of the setting
    9.1.4 Historical development
  9.2 Template-based methods
    9.2.1 The basics
    9.2.2 Performance measures
    9.2.3 Normalized cross correlation
    9.2.4 Phase-only matched filter
  9.3 Online-learning-based methods
    9.3.1 The MOSSE filter
    9.3.2 Discriminative correlation filters
    9.3.3 Suitable features for DCFs
    9.3.4 Scale space tracking
    9.3.5 Spatial and temporal weighting
  9.4 Deep learning-based methods
    9.4.1 Deep features in DCFs
    9.4.2 Adaptive deep features
    9.4.3 End-to-end learning DCFs
  9.5 The transition from tracking to segmentation
    9.5.1 Video object segmentation
    9.5.2 A generative VOS method
    9.5.3 A discriminative VOS method
  9.6 Conclusions
  Acknowledgment
  References
  Biographies
10 Long-term deep object tracking
  10.1 Introduction
    10.1.1 Challenges in video object tracking
      10.1.1.1 Visual challenges in tracking
      10.1.1.2 Learning challenges in tracking
      10.1.1.3 Engineering challenges in tracking
  10.2 Short-term visual object tracking
    10.2.1 Shallow trackers
    10.2.2 Deep trackers
      10.2.2.1 Correlation filter-based tracking
      10.2.2.2 Noncorrelation filter-based tracking
  10.3 Long-term visual object tracking
    10.3.1 Long-term model decay
    10.3.2 Target disappearance and reappearance
    10.3.3 Long-term trackers
      10.3.3.1 Offline learning with Siamese trackers
    10.3.4 Representation invariance and equivariance
      10.3.4.1 Invariance in tracking
      10.3.4.2 Equivariance in tracking
      10.3.4.3 Translation equivariance
      10.3.4.4 Rotation equivariance
      10.3.4.5 Scale equivariance
      10.3.4.6 Efficiency of Siamese trackers
      10.3.4.7 Hybrid learning with Siamese trackers
      10.3.4.8 Online learning beyond Siamese trackers
    10.3.5 Datasets and benchmarks
  10.4 Discussion
  References
  Biographies
11 Learning for action-based scene understanding
  11.1 Introduction
  11.2 Affordances of objects
    11.2.1 Why would computer vision be interested in affordances?
    11.2.2 Early affordance work
    11.2.3 Affordance detection, classification, and segmentation
      11.2.3.1 Affordance detection from geometric features
      11.2.3.2 Semantic segmentation, and classification from images
    11.2.4 Affordance in the context of action recognition and robot learning
      11.2.4.1 Action recognition
      11.2.4.2 Affordance learning in robot vision
    11.2.5 Discussion on affordance learning
  11.3 Functional parsing of manipulation actions
    11.3.1 The active interplay between cognition and perception
    11.3.2 Grammars of action
      11.3.2.1 Different implementations of the grammar
      11.3.2.2 Are grammars expressive and parsimonious descriptions?
    11.3.3 Modules for action understanding
      11.3.3.1 Grasping: an essential feature for action understanding
      11.3.3.2 Geometry to robustify
    11.3.4 Discussion on activity understanding
  11.4 Functional scene understanding through deep learning with language and vision
    11.4.1 Attributes in zero-shot learning
    11.4.2 Shared embedding spaces
    11.4.3 Construction of semantic vector spaces
      11.4.3.1 word2vec
    11.4.4 Shared embedding spaces and graphical models
  11.5 Future directions
  11.6 Conclusions
  Acknowledgment
  References
  Biographies
12 Self-supervised temporal event segmentation inspired by cognitive theories
  12.1 Introduction
  12.2 The event segmentation theory from cognitive science
  12.3 Version 1: single-pass temporal segmentation using prediction
    12.3.1 Feature extraction and encoding
    12.3.2 Recurrent prediction for feature forecasting
    12.3.3 Feature reconstruction
    12.3.4 Self-supervised loss function
    12.3.5 Error gating
    12.3.6 Adaptive learning for plasticity
    12.3.7 Results
      12.3.7.1 Datasets
      12.3.7.2 Evaluation metrics
      12.3.7.3 Ablative studies
      12.3.7.4 Quantitative evaluation
        12.3.7.4.1 Improved features for action recognition
      12.3.7.5 Qualitative evaluation
  12.4 Version 2: segmentation using attention-based event models
    12.4.1 Feature extraction
    12.4.2 Attention unit
    12.4.3 Motion weighted loss function
    12.4.4 Results
      12.4.4.1 Dataset
      12.4.4.2 Evaluation metrics
        12.4.4.2.1 Frame level
        12.4.4.2.2 Activity level
      12.4.4.3 Ablative studies
      12.4.4.4 Quantitative evaluation
      12.4.4.5 Qualitative evaluation
  12.5 Version 3: spatio-temporal localization using prediction loss map
    12.5.1 Feature extraction
    12.5.2 Hierarchical prediction stack
    12.5.3 Prediction loss
    12.5.4 Action tubes extraction
    12.5.5 Results
      12.5.5.1 Data
      12.5.5.2 Metrics and baselines
      12.5.5.3 Quantitative evaluation
        12.5.5.3.1 Quality of localization proposals
        12.5.5.3.2 Spatial-temporal action localization
        12.5.5.3.3 Comparison with other LSTM-based approaches
        12.5.5.3.4 Ablative studies
        12.5.5.3.5 Unsupervised egocentric gaze prediction
      12.5.5.4 Qualitative evaluation
  12.6 Other event segmentation approaches in computer vision
    12.6.1 Supervised approaches
    12.6.2 Weakly-supervised approaches
    12.6.3 Unsupervised approaches
    12.6.4 Self-supervised approaches
  12.7 Conclusions
  Acknowledgments
  References
  Biographies
13 Probabilistic anomaly detection methods using learned models from time-series data for multimedia self-aware systems
  13.1 Introduction
  13.2 Base concepts and state of the art
    13.2.1 Generative models
    13.2.2 Dynamic Bayesian Network (DBN) models
    13.2.3 Variational autoencoder
    13.2.4 Types of anomalies and anomaly detection methods
    13.2.5 Anomaly detection in low-dimensional data
    13.2.6 Anomaly detection in high-dimensional data
  13.3 Framework for computing anomaly in self-aware systems
    13.3.1 General framework description
    13.3.2 Generalized dynamic Bayesian network (GDBN) model
    13.3.3 Real-time inference algorithm
    13.3.4 Multimodal abnormality measurements
      13.3.4.1 Discrete level
      13.3.4.2 Continuous level
      13.3.4.3 Observation level
    13.3.5 Use of generalized errors for continual learning
  13.4 Case study results: anomaly detection on multisensory data from a self-aware vehicle
    13.4.1 Case study presentation
    13.4.2 DBN model learning
    13.4.3 Multilevel anomaly detection
      13.4.3.1 Pedestrian avoidance task
      13.4.3.2 U-turn task
      13.4.3.3 Image-level anomalies
      13.4.3.4 Anomaly detection evaluation
    13.4.4 Proprioceptive sensory data anomalies
    13.4.5 Additional results
  13.5 Conclusions
  References
  Biographies
14 Deep plug-and-play and deep unfolding methods for image restoration
  14.1 Introduction
  14.2 Half quadratic splitting (HQS) algorithm
  14.3 Deep plug-and-play image restoration
    14.3.1 Learning deep CNN denoiser prior
      14.3.1.1 Denoising network architecture
    14.3.2 Training details
    14.3.3 Denoising results
      14.3.3.1 Grayscale image denoising
      14.3.3.2 Color image denoising
    14.3.4 HQS algorithm for plug-and-play IR
      14.3.4.1 Half quadratic splitting (HQS) algorithm
      14.3.4.2 General methodology for parameter setting
      14.3.4.3 Periodical geometric self-ensemble
  14.4 Deep unfolding image restoration
    14.4.1 Deep unfolding network
      14.4.1.1 Data module D
      14.4.1.2 Prior module P
      14.4.1.3 Hyper-parameter module H
    14.4.2 End-to-end training
  14.5 Experiments
    14.5.1 Image deblurring
      14.5.1.1 Quantitative and qualitative results
      14.5.1.2 Hand-designed vs. learned hyper-parameters
      14.5.1.3 Intermediate results
    14.5.2 Single image superresolution (SISR)
      14.5.2.1 Quantitative and qualitative comparison
      14.5.2.2 Hand-designed vs. learned hyper-parameters
      14.5.2.3 Intermediate results
  14.6 Discussion and conclusions
  Acknowledgments
  References
  Biographies
15 Visual adversarial attacks and defenses
  15.1 Introduction
  15.2 Problem definition
  15.3 Properties of an adversarial attack
  15.4 Types of perturbations
  15.5 Attack scenarios
    15.5.1 Target models
      15.5.1.1 Models for image-based tasks
      15.5.1.2 Models for video-based tasks
    15.5.2 Datasets and labels
      15.5.2.1 Image datasets
      15.5.2.2 Video datasets
  15.6 Image processing
  15.7 Image classification
    15.7.1 White-box, bounded attacks
    15.7.2 White-box, content-based attacks
    15.7.3 Black-box attacks
  15.8 Semantic segmentation and object detection
  15.9 Object tracking
  15.10 Video classification
  15.11 Defenses against adversarial attacks
    15.11.1 Detection
    15.11.2 Gradient masking
    15.11.3 Model robustness
  15.12 Conclusions
  Acknowledgment
  References
  Biographies
Index
Back Cover