Download book: Pattern Recognition and Computer Vision: 5th Chinese Conference, PRCV 2022, Shenzhen, China, November 4–7, 2022, Proceedings, Part III


Book details

Series: LNCS, volume 13536
ISBN: 9783031189128, 9783031189135
Publisher: Springer
Publication year: 2022
Pages: 775 [789]
Language: English
File format: PDF (converted to PDF, EPUB, or AZW3 on request)
File size: 110 MB

Book price (toman): 56,000







Book description

The 4-volume set LNCS 13534, 13535, 13536 and 13537 constitutes the refereed proceedings of the 5th Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2022, held in Shenzhen, China, in November 2022. The 233 full papers presented were carefully reviewed and selected from 564 submissions. The papers have been organized in the following topical sections: Theories and Feature Extraction; Machine learning, Multimedia and Multimodal; Optimization and Neural Network and Deep Learning; Biomedical Image Processing and Analysis; Pattern Classification and Clustering; 3D Computer Vision and Reconstruction, Robots and Autonomous Driving; Recognition, Remote Sensing; Vision Analysis and Understanding; Image Processing and Low-level Vision; Object Detection, Segmentation and Tracking.



Table of contents

Preface
Organization
Contents – Part III
3D Computer Vision and Reconstruction, Robots and Autonomous Driving
Locally Geometry-Aware Improvements of LOP for Efficient Skeleton Extraction
	1 Introduction
	2 Related Work
	3 The Improved LOP
		3.1 Overview
		3.2 Bilateral Filter Based Weighting
		3.3 Adaptive Radius
	4 Experimental Results
	5 Conclusions
	References
Spherical Transformer: Adapting Spherical Signal to Convolutional Networks
	1 Introduction
	2 Related Work
	3 The Proposed Approach
		3.1 Spherical Sampling
		3.2 Spherical Transformer Module
		3.3 Network Architecture
	4 Experiments
		4.1 Spherical MNIST
		4.2 3D Object Classification
		4.3 Spherical Image Semantic Segmentation
	5 Conclusion
	References
Waterfall-Net: Waterfall Feature Aggregation for Point Cloud Semantic Segmentation
	1 Introduction
	2 Related Work
	3 Waterfall-Net
		3.1 Cascaded Sub-Networks Encoder
		3.2 Learn to Upsample
	4 Experiments
		4.1 Analysis of Waterfall-Net Architecture
		4.2 Results and Visualization
	5 Conclusion
	References
Sparse LiDAR and Binocular Stereo Fusion Network for 3D Object Detection
	1 Introduction
	2 Related Work
		2.1 LiDAR-Based 3D Object Detection
		2.2 Monocular-Based 3D Object Detection
		2.3 Stereo-Based 3D Object Detection
		2.4 Multi-modal 3D Object Detection
	3 Proposed Method
		3.1 Feature Extraction
		3.2 Attention Fusion
		3.3 3D Object Information Regression Prediction
		3.4 Implementation Details
	4 Experiments
		4.1 KITTI Dataset
		4.2 Evaluation Metrics
		4.3 Main Results
		4.4 Ablation Study
	5 Conclusion
	References
Full Head Performance Capture Using Multi-scale Mesh Propagation
	1 Introduction
	2 Related Work
	3 Template Fitting Based Dynamic Full Head Performance Capture
		3.1 Per-frame Multi-view Scan Reconstruction
		3.2 Template Warping
		3.3 Multi-scale Mesh Propagation
		3.4 Ear Reconstruction
	4 Experimental Results
	5 Conclusion
	References
Learning Cross-Domain Features for Domain Generalization on Point Clouds
	1 Introduction
	2 Related Work
		2.1 Deep Learning on Point Cloud
		2.2 Unsupervised Domain Adaptation
		2.3 Domain Generalization
	3 Network Architecture
		3.1 Point Set Mask
		3.2 Cross-Domain Mixup
		3.3 Hierarchical Feature Alignment
		3.4 Overall Loss
	4 Experiments and Results
		4.1 Dataset
		4.2 Comparative Methods
		4.3 Implementation Details
		4.4 Results
		4.5 Ablation Study
	5 Conclusion
	References
Unsupervised Pre-training for 3D Object Detection with Transformer
	1 Introduction
	2 Related Work
		2.1 Object Detection on Point Clouds
		2.2 Unsupervised Representation Learning on Point Clouds
	3 UP3DETR
		3.1 Pre-training
		3.2 Fine-tuning
	4 Experiments
		4.1 ScanNetV2 Object Detection
		4.2 SUN RGB-D Object Detection
		4.3 Ablations
	5 Conclusion
	References
Global Patch Cross-Attention for Point Cloud Analysis
	1 Introduction
	2 Related Work
		2.1 Multi-view Based and Voxelized Methods
		2.2 Point Based Method
		2.3 Attention Based Method
	3 Method
		3.1 Overview
		3.2 Global Patch Construction
		3.3 Local-Global Feature Aggregation
	4 Experiment
		4.1 Point Cloud Analysis
		4.2 Analysis of GPCAN
	5 Conclusion
	References
EEP-Net: Enhancing Local Neighborhood Features and Efficient Semantic Segmentation of Scale Point Clouds
	1 Introduction
	2 Related Work
		2.1 Projection-Based Methods
		2.2 Discretization-Based Methods
		2.3 Point-Based Methods
	3 EEP-Net
		3.1 Architecture of EEP Module
		3.2 Global Feature (GF)
		3.3 Architecture of EEP-Net
	4 Experiments
		4.1 Evaluation on S3DIS Dataset
		4.2 Ablation Study
	5 Conclusion
	References
CARR-Net: Leveraging on Subtle Variance of Neighbors for Point Cloud Semantic Segmentation
	1 Introduction
	2 Related Work
	3 Method
		3.1 CARR Module
		3.2 CARRs
		3.3 Overall Architecture
	4 Experiments
		4.1 Set up
		4.2 Evaluation on S3DIS Dataset
		4.3 Evaluation on SemanticKITTI Dataset
		4.4 Ablation Study
	5 Conclusion
	References
3D Meteorological Radar Data Visualization with Point Cloud Completion and Poisson Surface Reconstruction
	1 Introduction
	2 Method Description
		2.1 Data Format Conversion
		2.2 Echo Models and Completion Algorithms
		2.3 Drawing 3D Surface Using Bilateral-Filtering Poisson Surface Reconstruction Algorithm (BPSR)
	3 Experiments and Analysis
		3.1 Comparison of Meteorological Radar Data Completion
		3.2 Comparison of Point Cloud Completion Experiments
	4 Conclusion
	References
JVLDLoc: A Joint Optimization of Visual-LiDAR Constraints and Direction Priors for Localization in Driving Scenario
	1 Introduction
	2 Related Work
		2.1 Visual-LiDAR SLAM
		2.2 Using Vanishing Points as Direction Constraint
	3 Notation
	4 Method
		4.1 Tracking
		4.2 Local Mapping
		4.3 Global Mapping Using Direction Priors
	5 Experiments
		5.1 Improvements over Prior Map
		5.2 Effects of Direction Priors
		5.3 Comparison to Other Methods on KITTI Odometry Dataset
		5.4 Ablation Study
	6 Conclusion
	References
A Single-Pathway Biomimetic Model for Potential Collision Prediction
	1 Introduction
	2 Related work
	3 Proposed Method
		3.1 Problem Formulation
		3.2 The Single-Pathway LGMD2
		3.3 Collision Prediction Criteria
	4 Experiments and Analyses
		4.1 Datasets and Competing Methods
		4.2 Parameter Setting
		4.3 Evaluation Metrics
		4.4 Experimental Results
	5 Conclusions
	References
PilotAttnNet: Multi-modal Attention Network for End-to-End Steering Control
	1 Introduction
	2 Related Work
		2.1 Driving Model
		2.2 Driving Dataset
	3 Method
		3.1 Spatial Information Encoding
		3.2 The End-to-End Attentional Driving Model
	4 Experiments
		4.1 Dataset Description
		4.2 Evaluation Metrics
		4.3 Results
	5 Conclusion
	References
Stochastic Navigation Command Matching for Imitation Learning of a Driving Policy
	1 Introduction
	2 Related Works
	3 Method
		3.1 Problem Formulation
		3.2 Backbone Network
		3.3 Navigation Command
		3.4 Multi-branch Architecture
		3.5 Stochastic Navigation Command Matching
		3.6 Training
	4 Experiments
		4.1 Experiment Setting
		4.2 Quantitative Comparison
		4.3 Qualitative Comparison
		4.4 Visualization Results
	5 Conclusions
	References
Recognition, Remote Sensing
Group Activity Representation Learning with Self-supervised Predictive Coding
	1 Introduction
	2 Related Work
	3 Approach
		3.1 Spatial Graph Transformer Encoder
		3.2 Temporal Causal Transformer Decoder
		3.3 Joint Learning Measure
	4 Experiments
		4.1 Datasets
		4.2 Implementation Details
		4.3 Ablations on Volleyball
		4.4 Comparison with State-of-the-Art
	5 Conclusion
	References
Skeleton-Based Action Quality Assessment via Partially Connected LSTM with Triplet Losses
	1 Introduction
	2 Related Work
		2.1 Action Quality Assessment
		2.2 Graph-Based Methods
	3 Methods
		3.1 Joints Graph and Activation Matrix
		3.2 Partially Connected Layer
		3.3 Partially Connected LSTM
		3.4 Triplet Loss
	4 Experiments
		4.1 Evaluation Datasets and Settings
		4.2 Data Preprocess
		4.3 Experimental Results and Analysis
		4.4 Complexity Analysis
	5 Conclusion
	References
Hierarchical Long-Short Transformer for Group Activity Recognition
	1 Introduction
	2 Related Work
		2.1 Group Activity Recognition
		2.2 Transformer
	3 Methodology
		3.1 Overview of HLSTrans
		3.2 Long-Short Transformer Block
		3.3 Hierarchical Structure
		3.4 Position Bias
	4 Experiments
		4.1 Datasets
		4.2 Implementation
		4.3 Comparison to Others
	5 Conclusion
	References
GNN-Based Structural Dynamics Simulation for Modular Buildings
	1 Introduction
	2 Methodology
		2.1 Graph Representation Method
		2.2 GNN Model
	3 Numerical Studies
		3.1 Numerical Examples of Three Spring-Mass Systems
		3.2 Training and Prediction Results
	4 Conclusion
	References
Semantic-Augmented Local Decision Aggregation Network for Action Recognition
	1 Introduction
	2 Proposed Approach
		2.1 LDNet
		2.2 Semantic Information Module
		2.3 Combining Semantic Information Module with LDNet
	3 Experiments
		3.1 Datasets and Implementation Details
		3.2 Ablation Study
	4 Conclusions
	References
Consensus-Guided Keyword Targeting for Video Captioning
	1 Introduction
	2 Related Work
		2.1 Video Captioning
		2.2 Video Captioning Datasets
	3 Method
		3.1 Encoder-Decoder Framework
		3.2 Consensus-Guided Loss
		3.3 Keyword Targeting Loss
		3.4 Consensus-Guided Keyword Targeting Captioning Model
	4 Experiments
		4.1 Datasets and Metrics
		4.2 Implementation Details
		4.3 Quantitative Results
		4.4 Qualitative Results
	5 Conclusion
	References
Handwritten Mathematical Expression Recognition via GCAttention-Based Encoder and Bidirectional Mutual Learning Transformer
	1 Introduction
	2 Related Work
		2.1 Image-to-Markup
		2.2 CNN
		2.3 Global Contextual Attention
		2.4 Transformer
		2.5 Mutual Learning
	3 Methodology
		3.1 Encoder
		3.2 Decoder
		3.3 Positional Encoding
	4 Experiments
		4.1 Datasets
		4.2 Comparison with Prior Works
		4.3 Ablation Study
		4.4 The Program with GUI
	5 Conclusion
	References
Semi- and Self-supervised Learning for Scene Text Recognition with Fewer Labels
	1 Introduction
	2 Background and Related Work
		2.1 Scene Text Recognition
		2.2 Datasets
		2.3 Self-supervised Learning
	3 Method
		3.1 Architecture
		3.2 Data Augmentation
		3.3 Loss Function
		3.4 Pseudo-labeling
	4 Experiments
		4.1 Performance on Real Scene Datasets
		4.2 Performance on Synthetic Datasets
		4.3 Comparison with State-of-the-Art Models
	5 Conclusion
	References
TMCR: A Twin Matching Networks for Chinese Scene Text Retrieval
	1 Introduction
	2 Related Work
	3 Methods
		3.1 Detection Module
		3.2 Recognition Module
		3.3 Similarity Module
		3.4 Loss and Training
	4 Experiments
		4.1 Dataset and Implementation Details
		4.2 Comparisons with State-of-the-Art
		4.3 Ablation Study
		4.4 Model Generalization
	5 Conclusion
	References
Thai Scene Text Recognition with Character Combination
	1 Introduction
	2 Methodology
		2.1 Recognition Architecture
		2.2 Thai Character Combination
	3 Experimental Setting
		3.1 Thai STR Datasets
		3.2 Data Preparing
		3.3 Model Configurations and Training
		3.4 Evaluation Metric
	4 Experimental Results and Analyses
		4.1 Experimental Results
		4.2 The Effectiveness of TCC
		4.3 Failure Cases Analysis
	5 Conclusion
	References
Automatic Examination Paper Scores Calculation and Grades Analysis Based on OpenCV
	1 Introduction
	2 Methodology
		2.1 Image Acquisition
		2.2 Image Processing
		2.3 Data Processing
	3 Experimental Results and Discussions
	4 Conclusions
	References
Efficient License Plate Recognition via Parallel Position-Aware Attention
	1 Introduction
	2 Related Work
		2.1 License Plate Datasets
		2.2 License Plate Recognition
	3 Methods
		3.1 Feature Encoder
		3.2 Parallel Position-Aware Attention
		3.3 Character Decoder
		3.4 Loss Function
	4 Data Synthesis
	5 Experiments
		5.1 DataSets
		5.2 Experiment Settings
		5.3 Experimental Results
	6 Conclusions
	References
Semantic-Aware Non-local Network for Handwritten Mathematical Expression Recognition
	1 Introduction
	2 Related Works
		2.1 Grammar-Based HMER
		2.2 Encoder-Decoder Based HMER
	3 Methodology
		3.1 Non-local Neural Networks
		3.2 FastText Language Model
	4 Experiments
		4.1 Datasets
		4.2 Metrics
		4.3 Results
	5 Conclusion
	References
Math Word Problem Generation with Memory Retrieval
	1 Introduction
	2 Related Work
		2.1 Math Word Problem Generation
		2.2 Memory Retrieval for Text Generation
	3 Problem Setup
		3.1 MWPG
		3.2 Low-Resource MWPG
	4 Proposed Approach
		4.1 Overview
		4.2 Retrieval Module
		4.3 Generation Module
		4.4 Training
	5 Experiments
		5.1 Datasets
		5.2 Metrics
		5.3 Implementation Details
		5.4 Baselines
		5.5 Quantitative Results
		5.6 Qualitative Results
	6 Conclusions
	References
Traditional Mongolian Script Standard Compliance Testing Based on Deep Residual Network and Spatial Pyramid Pooling
	1 Introduction
	2 Related Work
	3 Model Architecture
		3.1 Convolutional Layers with Residual Learning
		3.2 The Spatial Pyramid Pooling Layer
	4 Experiment and Analysis
		4.1 Data and Experimental Environment
		4.2 Evaluation Metrics
		4.3 Results
	5 Conclusion
	References
FOV Recognizer: Telling the Field of View of Movie Shots
	1 Introduction
	2 Related Works
		2.1 Human Detection
		2.2 Field of View Recognition Method
	3 Movie Field of View Dataset (MFOVD)
	4 Field of View Recognition Method
	5 Experiments
		5.1 Recognition on Movie Field of View Dataset (MFOVD)
		5.2 Recognition on a Full Movie
	6 Conclusion
	References
Multi-level Temporal Relation Graph for Continuous Sign Language Recognition
	1 Introduction
	2 Related Work
		2.1 Sign Language Recognition
		2.2 Video Contexts Modeling
		2.3 Graph Convolutional Network
	3 Our Approach
		3.1 Visual Model
		3.2 Multi-level Temporal Relation Graph
		3.3 Alignment Model
	4 Results and Discussion
		4.1 Dataset
		4.2 Experimental Setup
		4.3 Comparison with SOTA Methods
		4.4 Model Validity Experiment
	5 Conclusions
	References
Beyond Vision: A Semantic Reasoning Enhanced Model for Gesture Recognition with Improved Spatiotemporal Capacity
	1 Introduction
	2 Related Work
		2.1 Temporal Information Model
		2.2 Attention Mechanism
		2.3 Semantic Information Model
	3 Method
		3.1 The Overview of the Network
		3.2 Long and Short-term Temporal Shift Module (LS-TSM)
		3.3 Spatial Attention Module
		3.4 Label Relation Module
	4 Experiment
		4.1 Datasets
		4.2 Implementation Details
		4.3 Comparison with the State of the Art
		4.4 Ablation Study
	5 Conclusion
	References
SemanticGAN: Facial Image Editing with Semantic to Realize Consistency
	1 Introduction
	2 Related Works
		2.1 Generative Adversarial Networks
		2.2 Facial Image Editing
	3 Proposed Method
		3.1 Preliminary
		3.2 Attribute-Related Fine Editing
		3.3 Attribute-Independent Optimization
	4 Experiments
		4.1 Implementation Details
		4.2 Attribute Face Editing
		4.3 Editing with SemanticGAN
		4.4 Ablation Studies
	5 Conclusion
	References
Least-Squares Estimation of Keypoint Coordinate for Human Pose Estimation
	1 Introduction
	2 Proposed Method
		2.1 Encoding
		2.2 Decoding
	3 Experiments
		3.1 Datasets and Evaluation
		3.2 Comparison with Other Methods
		3.3 Ablation Study
	4 Conclusion
	References
Joint Pixel-Level and Feature-Level Unsupervised Domain Adaptation for Surveillance Face Recognition
	1 Introduction
	2 Related Work
		2.1 Deep Face Recognition
		2.2 Unsupervised Domain Adaptation
	3 Methodology
		3.1 Training of Feature Extractor
		3.2 Training of Domain Classifier
		3.3 Training of Style Transformer
	4 Experiment
		4.1 Datasets
		4.2 Details of Training
		4.3 Ablation Experiment
		4.4 Quantity Comparison
		4.5 Comparison
	5 Conclusion
	References
Category-Oriented Adversarial Data Augmentation via Statistic Similarity for Satellite Images
	1 Introduction
	2 Related Works
		2.1 Data Augmentation
		2.2 Appearance Properties
	3 Proposed Method
		3.1 Problem Definition and Basic Solutions
		3.2 Statistic Similarity Evaluation
		3.3 Adversarial Generation Between Similar Categories
		3.4 Task of Object Detection
	4 Experimental Results
	5 Conclusion
	References
A Multi-scale Convolutional Neural Network Based on Multilevel Wavelet Decomposition for Hyperspectral Image Classification
	1 Introduction
	2 Related Works
		2.1 2D Discrete Wavelet Transform
		2.2 DenseNet
	3 Proposed Framework
	4 Experimental Results and Discussion
		4.1 HIS Datasets
		4.2 Experimental Setting
		4.3 Results and Discussion
	5 Conclusion
	References
High Spatial Resolution Remote Sensing Imagery Classification Based on Markov Random Field Model Integrating Granularity and Semantic Features
	1 Introduction
	2 Background on MRF-Based Methods
		2.1 MRF Model with Different Granularities
		2.2 MRF Model with Multilayer
	3 Proposed Method
		3.1 MRF Model
		3.2 Proposed MRF-MM Model
	4 Experimental Results
		4.1 Data
		4.2 Classification Experiment
		4.3 Test of the MRF-MM Model Parameters
	5 Conclusion
	References
Feature Difference Enhancement Fusion for Remote Sensing Image Change Detection
	1 Introduction
	2 Related Work
		2.1 Traditional Change Detection Methods
		2.2 Deep Learning Based Change Detection Methods
	3 Method
		3.1 Overall Structure of Proposed CD Architecture
		3.2 Difference Enhancement Fusion Module (DEFM)
	4 Experiments
		4.1 Experimental Setup
		4.2 Experimental Results
		4.3 Ablation Studies
	5 Conclusion
	References
WAFormer: Ship Detection in SAR Images Based on Window-Aware Swin-Transformer
	1 Introduction
	2 Related Works
		2.1 SAR Target Detection Based on Deep Learning
		2.2 Vision Transformer
	3 Method
		3.1 Motivation
		3.2 Overview
		3.3 Variable Size Window Self-attention
		3.4 WAFormer Block
	4 Experiments
		4.1 Dataset and Evaluation Metrics
		4.2 Implementation Details
		4.3 Comparison Results
		4.4 Related Configuration Adjustment
	5 Conclusion
	References
EllipseIoU: A General Metric for Aerial Object Detection
	1 Introduction
	2 Related Work
		2.1 Aerial Object Detection
		2.2 IoU-Based Metrics
	3 EllipseIoU Loss
		3.1 EllipseIoU
		3.2 EllipseIoU Loss
		3.3 Discussion on Several IoU-Based Metrics
	4 Experimental Results
		4.1 Datasets
		4.2 Results on DOTA and HRSC2016 Dataset
	5 Conclusion
	References
Transmission Tower Detection Algorithm Based on Feature-Enhanced Convolutional Network in Remote Sensing Image
	1 Introduction
	2 Dataset Production
		2.1 Data Collection
		2.2 Dataset Production
		2.3 Dataset Expansion
		2.4 YOLOv3 Algorithm Principle
		2.5 Algorithms in This Paper
	3 Experiments and Results Analysis
		3.1 Experimental Setup
		3.2 Analysis of Results
	4 Conclusion
	References
Vision Analysis and Understanding
Mining Diverse Clues with Transformers for Person Re-identification
	1 Introduction
	2 Related Work
	3 Proposed Method
		3.1 Vision Transformer as Feature Extractor
		3.2 Person ReID with Transformers
		3.3 MDCTNet Architecture
	4 Experiments
		4.1 Datasets
		4.2 Implementation Details
		4.3 Ablation Study
		4.4 Comparison with State-of-the-Arts
	5 Conclusion
	References
Mutual Learning Inspired Prediction Network for Video Anomaly Detection
	1 Introduction
	2 Mutual Learning Inspired Prediction Network
		2.1 Framework
		2.2 Boundary Perception-Based Mimicry Loss
		2.3 Self-supervised Weighted Loss
		2.4 Objective Function
		2.5 Anomaly Detection on Testing Data
	3 Experiment
		3.1 Dataset
		3.2 Evaluation Metrics
		3.3 Comparison with Existing Methods
		3.4 Running time
	4 Conclusion
	References
Weakly Supervised Video Anomaly Detection with Temporal and Abnormal Information
	1 Introduction
	2 Related Work
		2.1 Unsupervised Video Anomaly Detection
		2.2 Weakly-Supervised Video Anomaly Detection
		2.3 Multiple Instance Learning
		2.4 Pair-Based Loss in Deep Metric Learning
	3 Approach
		3.1 Temporal Strengthen Network
		3.2 Multi-positive Sample MIL
		3.3 Anomaly Samples' N-Pair Loss
	4 Experiment
		4.1 Datasets and Metrics
		4.2 Implementation Details
		4.3 Results on ShanghaiTech
		4.4 Results on UCF-Crime
		4.5 Ablation Studies
		4.6 Qualitative Analysis
	5 Conclusion
	References
Towards Class Interpretable Vision Transformer with Multi-Class-Tokens
	1 Introduction
	2 Related Work
		2.1 Vision Transformer
		2.2 Heatmap-Based Visual Interpretability
	3 Proposed Method
		3.1 Overview of Proposed Approach
		3.2 Multi-Class-Tokens and Cross Attention
		3.3 Non-parametric Scoring Function
		3.4 Heatmap Based Per-class Interpretability
	4 Experiments
		4.1 Datasets
		4.2 Implementation Details
		4.3 Comparison Results
		4.4 Ablation Study
	5 Conclusion
	References
Multimodal Violent Video Recognition Based on Mutual Distillation
	1 Introduction
	2 Related Work
		2.1 Violent Video Recognition
		2.2 Self-supervised Learning
		2.3 Knowledge Distillation
	3 Methods
		3.1 Mutual Distillation for Violent RGB Feature
		3.2 MAF-Net
	4 Experiments
		4.1 Datasets and Metrics
		4.2 Experiments on Mutual Distillation for Violent RGB Feature
		4.3 Experiments on Multimodal Feature Fusion
		4.4 Comparison with Others
	References
YFormer: A New Transformer Architecture for Video-Query Based Video Moment Retrieval
	1 Introduction
	2 Related Work
		2.1 Transformer in Computer Vision
		2.2 Video Moment Retrieval
	3 YFormer for Video Moment Retrieval
		3.1 Spatio-Temporal Feature Extractor
		3.2 Semantic Relevance Matcher
		3.3 Prediction Heads
		3.4 Losses
	4 Experiments
		4.1 Experiment Setup
		4.2 Quantitative Results
		4.3 Qualitative Results
		4.4 Ablation Study
	5 Conclusion
	References
Hightlight Video Detection in Figure Skating
	1 Introduction
	2 Related Works
		2.1 Action Quality Assessment
		2.2 Figure Skating
		2.3 Temporal Action Segmentation
	3 Approach
		3.1 Overview
		3.2 Video Segmentation
		3.3 Tube Self-Attention
		3.4 Frame Scoring
	4 Experiments
		4.1 Dataset
		4.2 Implementation Details
		4.3 Results During Training
		4.4 Ablation Study
		4.5 Results in the Singles Figure Skating Competition
	5 Conclusion
	References
Memory Enhanced Spatial-Temporal Graph Convolutional Autoencoder for Human-Related Video Anomaly Detection
	1 Introduction
	2 Related Work
		2.1 Video Anomaly Detection
		2.2 Graph Convolutional Networks
		2.3 Memory Networks
	3 Method
		3.1 Preprocessing
		3.2 Network Architecture
		3.3 Loss Function
		3.4 Anomaly Detection
	4 Experiments
		4.1 Datasets
		4.2 Implementation Details
		4.3 Evaluation
		4.4 Ablation Studies
	5 Conclusions
	References
Background Suppressed and Motion Enhanced Network for Weakly Supervised Video Anomaly Detection
	1 Introduction
	2 Related Work
		2.1 Unsupervised Video Anomaly Detection
		2.2 Weakly Supervised Video Anomaly Detection
	3 Background Suppressed and Motion Enhanced Network (BSMEN)
		3.1 Motion Discrimination Sequence Extraction (MDSE)
		3.2 Background Suppressed and Motion Enhanced Module (BSMEM)
		3.3 Loss Function
	4 Experiments
		4.1 Datasets
		4.2 Evaluation Metric
		4.3 Implementation Details
		4.4 Comparison with State-of-the-Art Methods
		4.5 Ablation Studies
	5 Conclusions
	References
Dirt Detection and Segmentation Network for Autonomous Washing Robots
	1 Introduction
	2 Related Works
	3 Method
		3.1 SVDD (Support Vector Data Description)
		3.2 Deep SVDD
		3.3 DDSN (Dirt Detection and Segmentation Network)
	4 Evaluation
		4.1 Experiment Setup
		4.2 Experimental Result on MVTecAD Dataset
		4.3 Experimental Result on Dirt Dataset
	5 Conclusion
	References
Finding Beautiful and Happy Images for Mental Health and Well-Being Applications
	1 Introduction
	2 Related Work
	3 A Beautiful Natural Image Database (BNID)
		3.1 Collecting the Images
		3.2 Collecting Beautifulness and Happiness Scores
		3.3 Analysis of Beautifulness and Happiness Scores
	4 Beautifulness and Happiness Assessment
		4.1 Image Beautifulness Assessment
		4.2 Loss Functions
		4.3 Final Score
		4.4 Extension to Image Happiness Prediction
	5 Experiments
		5.1 Image Beautifulness Assessment Results
		5.2 Image Happiness Assessment Results
	6 Concluding Remarks
	References
Query-UAP: Query-Efficient Universal Adversarial Perturbation for Large-Scale Person Re-Identification Attack
	1 Introduction
	2 Related Work
	3 Methodology
		3.1 Problem Definition
		3.2 Loss Function
		3.3 Query-UAP Attack
	4 Experiment
		4.1 Experimental Settings
		4.2 Comparison with State of the Arts
		4.3 Ablation Study
	5 Conclusion
	References
Robust Person Re-identification with Adversarial Examples Detection and Perturbation Extraction
	1 Introduction
	2 Related Work
		2.1 Adversarial Attack
		2.2 Adversarial Defense
	3 Proposed Method
		3.1 Networks Architecture
		3.2 Adversarial Examples Generation
		3.3 Perturbation Extractor and Purification
		3.4 Adversarial Example Detector
	4 Experiments
		4.1 Experimental Settings
		4.2 Robustness of Purification
		4.3 Adversarial Detection
		4.4 Ablation Experiments
	5 Conclusion
	References
Self-supervised and Template-Enhanced Unknown-Defect Detection
	1 Introduction
	2 Related Work
	3 Approach
		3.1 Framework
		3.2 Feature Fusion Module
		3.3 Loss Function
		3.4 Defect Detection
	4 Experiments
		4.1 Dataset
		4.2 Training Parameters
		4.3 Results
	5 Conclusion
	References
JoinTW: A Joint Image-to-Image Translation and Watermarking Method
	1 Introduction
	2 Related Work
	3 Proposed Work
		3.1 Problem Statement
		3.2 Method Overview
		3.3 The Watermark Extractor
		3.4 Watermark Embedding Generator
		3.5 The Adversary Net
		3.6 Training Details
		3.7 Loss Functions
	4 Experimental Results
		4.1 Datasets
		4.2 Training Details
		4.3 Visual Quality Study of the Generated Images
		4.4 Qualities of the Extracted Watermarks
		4.5 Watermark Robustness Under Agnostic Distortions
	5 Conclusions
	References
Author Index



