Editor: Xavier Alameda-Pineda
Series: Computer Vision and Pattern Recognition
ISBN: 012814601X, 9780128146019
Publisher: Academic Press
Publication year: 2018
Number of pages: 482
Language: English
File format: PDF
File size: 22 MB
Multimodal Behavior Analysis in the Wild: Advances and Challenges presents the state-of-the-art in behavioral signal processing using different data modalities, with a special focus on identifying the strengths and limitations of current technologies. The book focuses on audio and video modalities, while also emphasizing emerging modalities such as accelerometer or proximity data. It covers tasks at different levels of complexity, from low level (speaker detection, sensorimotor links, source separation), through middle level (conversational group detection, addresser and addressee identification), to high level (personality and emotion recognition), providing insights on how to exploit inter-level and intra-level links.
This is a valuable resource on the state-of-the-art and future research challenges of multimodal behavioral analysis in the wild. It is suitable for researchers and graduate students in the fields of computer vision, audio processing, pattern recognition, machine learning, and social signal processing.
Cover
Computer Vision and Pattern Recognition Series
Multimodal Behavior Analysis in the Wild: Advances and Challenges
Copyright
List of Contributors
About the Editors
Multimodal behavior analysis in the wild: An introduction
  0.1 Analyzing human behavior in the wild from multimodal data
  0.2 Scope of the book
  0.3 Summary of important points
  References
1 Multimodal open-domain conversations with robotic platforms
  1.1 Introduction
    1.1.1 Constructive Dialog Model
  1.2 Open-domain dialogs
    1.2.1 Topic shifts and topic trees
    1.2.2 Dialogs using Wikipedia
  1.3 Multimodal dialogs
    1.3.1 Multimodal WikiTalk for robots
    1.3.2 Multimodal topic modeling
  1.4 Future directions
    1.4.1 Dialogs using domain ontologies
    1.4.2 IoT and an integrated robot architecture
  1.5 Conclusion
  References
2 Audio-motor integration for robot audition
  2.1 Introduction
  2.2 Audio-motor integration in psychophysics and robotics
  2.3 Single-microphone sound localization using head movements
    2.3.1 HRTF model and dynamic cues
    2.3.2 Learning-based sound localization
    2.3.3 Results
  2.4 Ego-noise reduction using proprioceptors
    2.4.1 Ego-noise: challenges and opportunities
    2.4.2 Proprioceptor-guided dictionary learning
    2.4.3 Phase-optimized dictionary learning
    2.4.4 Audio-motor integration via support vector machines
    2.4.5 Results
  2.5 Conclusion and perspectives
  References
3 Audio source separation into the wild
  3.1 Introduction
  3.2 Multichannel audio source separation
  3.3 Making MASS go from labs into the wild
    3.3.1 Moving sources and sensors
    3.3.2 Varying number of (active) sources
    3.3.3 Spatially diffuse sources and long mixing filters
    3.3.4 Ad hoc microphone arrays
  3.4 Conclusions and perspectives
  References
4 Designing audio-visual tools to support multisensory disabilities
  4.1 Introduction
  4.2 Related works
  4.3 The Glassense system
  4.4 Visual recognition module
    4.4.1 Object-instance recognition
    4.4.2 Experimental assessment
  4.5 Complementary hearing aid module
    4.5.1 Measurement of Glassense beam pattern
    4.5.2 Analysis of measured beam pattern
  4.6 Assessing usability with impaired users
    4.6.1 Glassense field tests with visually impaired
    4.6.2 Glassense field tests with binaural hearing loss
  4.7 Conclusion
  References
5 Audio-visual learning for body-worn cameras
  5.1 Introduction
  5.2 Multi-modal classification
  5.3 Cross-modal adaptation
  5.4 Audio-visual reidentification
  5.5 Reidentification dataset
  5.6 Reidentification results
  5.7 Closing remarks
  References
6 Activity recognition from visual lifelogs: State of the art and future challenges
  6.1 Introduction
  6.2 Activity recognition from egocentric images
  6.3 Activity recognition from egocentric photo-streams
  6.4 Experimental results
    6.4.1 Experimental setup
    6.4.2 Implementation
      6.4.2.1 Activity recognition at image level
      6.4.2.2 Activity recognition at batch level
    6.4.3 Results and discussion
  6.5 Conclusion
  Acknowledgments
  References
7 Lifelog retrieval for memory stimulation of people with memory impairment
  7.1 Introduction
  7.2 Related work
  7.3 Retrieval based on key-frame semantic selection
    7.3.1 Summarization of autobiographical episodes
      7.3.1.1 Episode temporal segmentation
      7.3.1.2 Episode semantic summarization
    7.3.2 Semantic key-frame selection
      7.3.2.1 With whom was I? Face detection
      7.3.2.2 What did I see? Rich image detection
    7.3.3 Egocentric image retrieval based on CNNs and inverted index search
      7.3.3.1 Extraction of CNN features and their textual representation
      7.3.3.2 Text based index and retrieval of CNN features with inverted index
  7.4 Experiments
    7.4.1 Dataset
    7.4.2 Experimental setup
    7.4.3 Evaluation measures
    7.4.4 Results
    7.4.5 Discussions
  7.5 Conclusions
  Acknowledgments
  References
8 Integrating signals for reasoning about visitors' behavior in cultural heritage
  8.1 Introduction
  8.2 Using technology for reasoning about visitors' behavior
  8.3 Discussion
  8.4 Conclusions
  References
9 Wearable systems for improving tourist experience
  9.1 Introduction
  9.2 Related work
    Personalized museum experience
    Object detection and recognition
    Content-based retrieval for cultural heritage
    Voice activity detection
  9.3 Behavior analysis for smart guides
  9.4 The indoor system
    Artwork detection and recognition
    Context modeling
    Experimental results
    Voice detection evaluation
  9.5 The outdoor system
    Context awareness
    Application modules
    Temporal smoothing
    Exploiting sensors for modeling behavior
    System implementation
    Application use cases
    Experimental results
    User experience evaluation
  9.6 Conclusions
  References
10 Recognizing social relationships from an egocentric vision perspective
  10.1 Introduction
  10.2 Related work
    10.2.1 Head pose estimation
    10.2.2 Social interactions
  10.3 Understanding people interactions
    10.3.1 Face detection and tracking
    10.3.2 Head pose estimation
    10.3.3 3D people localization
  10.4 Social group detection
    10.4.1 Correlation clustering via structural SVM
  10.5 Social relevance estimation
  10.6 Experimental results
    10.6.1 Head pose estimation
    10.6.2 Distance estimation
    10.6.3 Groups estimation
    10.6.4 Social relevance
  10.7 Conclusions
  References
11 Complex conversational scene analysis using wearable sensors
  11.1 Introduction
  11.2 Defining 'in the wild' and ecological validity
  11.3 Ecological validity vs. experimental control
  11.4 Ecological validity vs. robust automated perception
  11.5 Thin vs. thick slices of analysis
  11.6 Collecting data of social behavior
    11.6.1 Practical concerns when collecting data during social events
      Requirements of the hardware and software
      Ease of use
      Technical pilot test
      Issues during the data collection event
  11.7 Analyzing social actions with a single body worn accelerometer
    11.7.1 Feature extraction and classification
    11.7.2 Performance vs. sample size
    11.7.3 Transductive parameter transfer (TPT) for personalized models
    11.7.4 Discussion
  11.8 Chapter summary
  References
12 Detecting conversational groups in images using clustering games
  12.1 Introduction
  12.2 Related work
  12.3 Clustering games
    12.3.1 Notations and definitions
    12.3.2 Clustering games
  12.4 Conversational groups as equilibria of clustering games
    12.4.1 Frustum of attention
    12.4.2 Quantifying pairwise interactions
    12.4.3 The algorithm
  12.5 Finding ESS-clusters using game dynamics
  12.6 Experiments and results
    12.6.1 Datasets
    12.6.2 Evaluation metrics and parameter exploration
    12.6.3 Experiments
  12.7 Conclusions
  References
13 We are less free than how we think: Regular patterns in nonverbal communication
  13.1 Introduction
  13.2 On spotting cues: how many and when
    13.2.1 The cues
    13.2.2 Methodology
    13.2.3 Results
  13.3 On following turns: who talks with whom
    13.3.1 Conflict
    13.3.2 Methodology
    13.3.3 Results
  13.4 On speech dancing: who imitates whom
    13.4.1 Methodology
    13.4.2 Results
  13.5 Conclusions
  References
14 Crowd behavior analysis from fixed and moving cameras
  14.1 Introduction
  14.2 Microscopic and macroscopic crowd modeling
  14.3 Motion information for crowd representation from fixed cameras
    14.3.1 Pre-processing and selection of areas of interest
    14.3.2 Motion-based crowd behavior analysis
  14.4 Crowd behavior and density analysis
    14.4.1 Person detection and tracking in crowded scenes
    14.4.2 Low level features for crowd density estimation
  14.5 CNN-based crowd analysis methods for surveillance and anomaly detection
  14.6 Crowd analysis using moving sensors
  14.7 Metrics and datasets
    14.7.1 Metrics for performance evaluation
    14.7.2 Datasets for crowd behavior analysis
  14.8 Conclusions
  References
15 Towards multi-modality invariance: A study in visual representation
  15.1 Introduction and related work
  15.2 Variances in visual representation
  15.3 Reversal invariance in BoVW
    15.3.1 Reversal symmetry and Max-SIFT
    15.3.2 RIDE: generalized reversal invariance
    15.3.3 Application to image classification
    15.3.4 Experiments
    15.3.5 Summary
  15.4 Reversal invariance in CNN
    15.4.1 Reversal-invariant convolution (RI-Conv)
    15.4.2 Relationship to data augmentation
    15.4.3 CIFAR experiments
    15.4.4 ILSVRC2012 classification experiments
    15.4.5 Summary
  15.5 Conclusions
  References
16 Sentiment concept embedding for visual affect recognition
  16.1 Introduction
    16.1.1 Embeddings for image classification
    16.1.2 Affective computing
  16.2 Visual sentiment ontology
  16.3 Building output embeddings for ANPs
    16.3.1 Combining adjectives and nouns
    16.3.2 Loss functions for the embeddings
  16.4 Experimental results
    16.4.1 Adjective noun pair detection
    16.4.2 Zero-shot concept detection
  16.5 Visual affect recognition
    16.5.1 Visual emotion prediction
    16.5.2 Visual sentiment prediction
  16.6 Conclusions and future work
  References
17 Video-based emotion recognition in the wild
  17.1 Introduction
  17.2 Related work
  17.3 Proposed approach
  17.4 Experimental results
    17.4.1 EmotiW Challenge
    17.4.2 ChaLearn Challenges
  17.5 Conclusions and discussion
  Acknowledgments
  References
18 Real-world automatic continuous affect recognition from audiovisual signals
  18.1 Introduction
  18.2 Real world vs laboratory settings
  18.3 Audio and video affect cues and theories of emotion
    18.3.1 Audio signals
    18.3.2 Visual signals
    18.3.3 Quantifying affect
  18.4 Affective computing
    18.4.1 Multimodal fusion techniques
    18.4.2 Related work
    18.4.3 Databases
    18.4.4 Affect recognition competitions
  18.5 Audiovisual affect recognition: a representative end-to-end learning system
    18.5.1 Proposed model
      18.5.1.1 Visual network
      18.5.1.2 Speech network
      18.5.1.3 Objective function
      18.5.1.4 Network training
    18.5.2 Experiments & results
  18.6 Conclusions
  References
19 Affective facial computing: Generalizability across domains
  19.1 Introduction
  19.2 Overview of AFC
  19.3 Approaches to annotation
  19.4 Reliability and performance
  19.5 Factors influencing performance
  19.6 Systematic review of studies of cross-domain generalizability
    19.6.1 Study selection
    19.6.2 Databases
    19.6.3 Cross-domain generalizability
    19.6.4 Studies using deep- vs. shallow learning
    19.6.5 Discussion
  19.7 New directions
  19.8 Summary
  Acknowledgments
  References
20 Automatic recognition of self-reported and perceived emotions
  20.1 Introduction
  20.2 Emotion production and perception
    20.2.1 Descriptions of emotion
    20.2.2 Brunswik's functional lens model
    20.2.3 Appraisal theory
  20.3 Observations from perception experiments
  20.4 Collection and annotation of labeled emotion data
    20.4.1 Emotion-elicitation methods
    20.4.2 Data annotation tools
  20.5 Emotion datasets
    20.5.1 Text datasets
    20.5.2 Audio, visual, physiological, and multi-modal datasets
  20.6 Recognition of self-reported and perceived emotion
  20.7 Challenges and prospects
  20.8 Concluding remarks
  Acknowledgments
  References
Index
Back Cover