Download the book Algorithms and Architectures for Parallel Processing: 22nd International Conference, ICA3PP 2022, Copenhagen, Denmark, October 10–12, 2022, Proceedings

Book details

Series: Lecture Notes in Computer Science, 13777
ISBN: 3031226763, 9783031226762
Publisher: Springer
Year of publication: 2023
Number of pages: 817 [818]
Language: English
File format: PDF (converted to PDF, EPUB, or AZW3 on user request)
File size: 49 MB

Price (toman): 52,000




If you need this book converted to PDF, EPUB, AZW3, MOBI, or DJVU, notify support and they will convert the file for you.

Note that this book is the original-language edition and has not been translated into Persian. The International Library website offers books in their original language only and does not offer any books translated into or written in Persian.


Book description

This book constitutes the refereed proceedings of the 22nd International Conference on Algorithms and Architectures for Parallel Processing, ICA3PP 2022, which was held in October 2022. Due to the COVID-19 pandemic, the conference was held virtually. The 33 full papers and 10 short papers presented were carefully reviewed and selected from 91 submissions. The papers cover many dimensions of parallel algorithms and architectures, encompassing fundamental theoretical approaches, practical experimental projects, and commercial components and systems.



Table of contents

Preface
Organization
Contents
Efficient Remote Memory Paging for Disaggregated Memory Systems
	1 Introduction
	2 Motivation
		2.1 Remote Memory Access Latency
		2.2 Expensive RDMA Read Operations
	3 Design
		3.1 Architectural Overview
		3.2 Multi-level Cache
		3.3 Adaptive Dynamic Prefetch
	4 Implementation
		4.1 Redesign the Critical Data Path
		4.2 Page Writes
		4.3 Page Reads
	5 Evaluation
		5.1 Effectiveness of Our Optimization Schemes
		5.2 Microbenchmarks
		5.3 Performance of Graph Computing Workloads
		5.4 Performance of Machine Learning Workloads
	6 Related Work
	7 Conclusion
	References
pCOVID: A Privacy-Preserving COVID-19 Inference Framework
	1 Introduction
	2 Preliminaries
		2.1 COVID-19 Detection Using Convolutional Neural Network
		2.2 Additive Secret Sharing
	3 System Model and Design Goal
		3.1 System and Threat Model
		3.2 Design Goals
	4 Secret Sharing Friendly Model Optimization
		4.1 Fixed-Point Quantization
		4.2 Layer Fusion
	5 Additive-Secret-Sharing Based Protocols
		5.1 Modified Additive Secret Sharing Scheme
		5.2 Secure Convolution Protocol (SConv)
		5.3 Secure ReLU Protocol (SReLU)
		5.4 Secure Average Pooling Protocol (SAvgPool)
		5.5 Secure Truncation Protocol (STrun)
	6 Privacy-Preserving COVID-19 Inference
	7 Security Analysis
	8 Evaluation
		8.1 Theoretical Analysis
		8.2 Experimental Performance
	9 Conclusion
	References
Hierarchical Reinforcement Learning-Based Mobility-Aware Content Caching and Delivery Policy for Vehicle Networks
	1 Introduction
	2 Related Work
	3 System Model
		3.1 Network Model
		3.2 Communication Model
		3.3 Request Model
	4 Problem Formulation
		4.1 State Space
		4.2 Action Space
		4.3 Reward
	5 Hierarchical Reinforcement Learning-Based Caching and Delivery
		5.1 Vehicle Side Policy
		5.2 RSU Side Policy
	6 Experiments
		6.1 Experiment Setup
		6.2 Baseline Schemes
		6.3 Simulation Results
	7 Conclusion
	References
Compromise Privacy in Large-Batch Federated Learning via Malicious Model Parameters
	1 Introduction
	2 Related Work
		2.1 Federated Learning
		2.2 Gradient Inversion Attack
	3 Method
		3.1 Threat Model
		3.2 The Direct Data Leakage from Gradients
		3.3 Constructing Malicious Model Parameters
	4 Experiments
		4.1 Comparison with Previous Methods
		4.2 Performance on Models Using Different Activation Functions
		4.3 Factors Affecting the Performance of Our Method
	5 Conclusion
	References
PMemTrace: Lightweight and Efficient Memory Access Monitoring for Persistent Memory
	1 Introduction
	2 Related Work
	3 PMemTrace Design
		3.1 Architecture Overview
		3.2 Pointer Analysis
		3.3 Lightweight Thread-Local Access Permission Control
	4 Implementation
	5 Evaluation
		5.1 Environment
		5.2 Performance
		5.3 Efficiency
		5.4 Accuracy
	6 Conclusion
	References
SeqTrace: API Call Tracing Based on Intel PT and VMI for Malware Detection
	1 Introduction
	2 Background
		2.1 Virtual Machine Introspection
		2.2 Intel Processor Trace
	3 Design of SeqTrace
		3.1 System Architecture
		3.2 IPT-Based API-Call Tracing Approach
		3.3 VMI-Based Semantic Decoder
	4 Implementation
		4.1 Semantic Recognition
		4.2 Adaptive Configuration of Intel PT
	5 Experiment
		5.1 Experimental Setup
		5.2 Pretreatment
		5.3 Effectiveness of SeqTrace
		5.4 Overhead Evaluation
	6 Related Work
	7 Conclusion
	References
CP3: Hierarchical Cross-Platform Power/Performance Prediction Using a Transfer Learning Approach
	1 Introduction
	2 Related Work
	3 Motivation
	4 Methodology
		4.1 Hierarchical Division
		4.2 Partial Transfer
		4.3 Model Fusion
	5 Experimental Setup
		5.1 Platforms and Environment
		5.2 Benchmarks
		5.3 Training Data
		5.4 Metrics
	6 Evaluation of CP3
		6.1 Overall Results
		6.2 Across Different Architectures
		6.3 Across Same Architectures but Different Platforms
		6.4 Generality of CP3
		6.5 Comparisons
	7 Conclusion
	References
Effective Vehicle Lane-Change Sensing Using Onboard Smartphone Based on Temporal Convolutional Network
	1 Introduction
	2 System Architecture
	3 TCN-Based Lane-Change Sensing
		3.1 TCN-Based Lane-Change Behavior Classifier
		3.2 Offline Training Procedures
		3.3 Online Inference Procedures
	4 Performance Evaluation
		4.1 Experimental Setup
		4.2 Experimental Results
	5 Conclusion
	References
A Web Service Recommendation Method Based on Adaptive Gate Network and xDeepFM
	1 Introduction
	2 Related Work
	3 AGN-xDeepFM Method
		3.1 Data Preprocessing
		3.2 Similarity Calculation Based on AGN Model
		3.3 Web Services Recommendation based on xDeepFM
	4 Experimental Result and Analysis
		4.1 Dataset and Experimental Setting
		4.2 Evaluation Metrics
		4.3 Baselines
		4.4 Experimental Results and Analysis
		4.5 Hyper-parameter Analysis
	5 Conclusion
	References
PXCrypto: A Regulated Privacy-Preserving Cross-Chain Transaction Scheme
	1 Introduction
	2 Building Block
	3 System Overview
		3.1 Actor
		3.2 System Model
		3.3 System Goals
	4 Cross-Chain Asset Proof Scheme
	5 The Confidential Transaction Scheme
		5.1 Order Matching Mode
		5.2 Proxy Multi-party Computation (PMPC)
		5.3 Transaction Regulation Scheme
		5.4 Identity Regulation Scheme
	6 Security and Property Analysis
		6.1 Security Analysis
		6.2 Advanced Properties
	7 Implementation and Evaluation
	8 Related Work
	9 Conclusion
	References
CRFs for Digital Signature and NIZK Proof System in Web Services
	1 Introduction
		1.1 Related Work
		1.2 Our Contributions
	2 Preliminary
		2.1 Notations and Definitions
		2.2 Digital Signature
		2.3 NIZK Proof System
		2.4 Randomizable NIZK Proof System
	3 Cryptographic Reverse Firewalls for Digital Signatures
		3.1 Definitions of CRF for Digital Signatures
		3.2 Review on the PS Signature Scheme
		3.3 The CRF Construction for PS Signature Scheme
		3.4 Security Proofs of the CRF for PS Signature Scheme
		3.5 Computation and Communication Cost
	4 CRF Construction for NIZK Proof System
		4.1 Definitions of CRF W.r.t. NIZK and Its Securities
		4.2 Generic Construction of CRF W.r.t. NIZK
		4.3 Security Proof
		4.4 Instantiation of CRF for NIZK Proof System
	5 Conclusion
	References
SPAC: Scalable Pattern Approximate Counting in Graph Mining
	1 Introduction
	2 Related Work
		2.1 Accurate Counting
		2.2 Approximate Counting
	3 Design
		3.1 Pattern Number Distribution to Degrees
		3.2 Preprocess
		3.3 Sample
		3.4 Fit Formula
	4 Implementation
	5 Evaluation
		5.1 Experiment Setup
		5.2 SPAC and Accurate Counting
		5.3 SPAC and Approximate Counting
		5.4 Exception: Facebook
	6 Conclusion and Future Work
	References
Haica: A High Performance Computing & Artificial Intelligence Fused Computing Architecture
	1 Introduction
	2 Related Work
		2.1 HPC and AI Fusion
		2.2 Architecture Support for Multiple Precisions
	3 Background
		3.1 IEEE Floating-Point Standard and New Data Formats
		3.2 FMA Architecture
		3.3 Architecture of Systolic Array
	4 Motivation
	5 Haica: An HPC and AI Fusion Architecture
		5.1 Multiple-Low-Precision FMA
		5.2 Haica: A Double-Precision FMA and Low-Precision Systolic Array Fused Architecture
		5.3 Discussion
	6 Experiment
	7 Conclusion
	References
AOA: Adaptive Overclocking Algorithm on CPU-GPU Heterogeneous Platforms
	1 Introduction
	2 Related Work
		2.1 DNN in CPU-GPU Platforms
		2.2 Energy Efficiency Optimization Methods
	3 AOA Design
		3.1 Key Issue and Key Factors in AOA
		3.2 Description of AOA
	4 Experimental Results and Analysis
		4.1 Experiment Environment
		4.2 AOA Overall Result
		4.3 Comparison with Other Methods
	5 Discussion for Key Factors
		5.1 Impact of Factor 1: The Power Upper Bound Must Be Dynamic, Not Static
		5.2 Impact of Factor 2: Overclocking at both Ends of the CPU and GPU Must Be Coordinated, and Real-Time Load Balancing Must be Considered
	6 Conclusion
	References
GEM: Execution-Aware Cache Management for Graph Analytics
	1 Introduction
	2 Background
		2.1 Graph Processing Models
		2.2 Graph Representation
		2.3 Graph Analytics Execution Phases
	3 Motivation
		3.1 Data Accesses Breakdown
		3.2 Underutilized Cache Hierarchy
		3.3 Opportunity
	4 GEM Design
		4.1 Length-Aware Fetch Policy
		4.2 Reuse-Based Replacement Policy
	5 Evaluation
		5.1 Profiling Platform
		5.2 Applications
		5.3 Datasets
	6 Result
		6.1 Performance
		6.2 LLC MPKI Reduction
		6.3 Limitations
	7 Related Work
		7.1 Architectural Optimizations
		7.2 Software Optimizations
		7.3 Cache Bypassing
	8 Conclusion
	References
EnergyCIDN: Enhanced Energy-Aware Challenge-Based Collaborative Intrusion Detection in Internet of Things
	1 Introduction
	2 Our Proposed Approach
		2.1 Energy-Aware Challenge-Based CIDN Framework
		2.2 Hybrid Trust Management
		2.3 Alarm Aggregation
	3 Evaluation
		3.1 Experiment-1
		3.2 Experiment-2
	4 Related Work
	5 Discussion and Future Directions
	6 Conclusion
	References
Federated Learning-Based Intrusion Detection on Non-IID Data
	1 Introduction
	2 Related Work
	3 System Model
		3.1 Federated Learning
		3.2 Dirichlet Distribution
		3.3 Data Augmentation
	4 Algorithm and System Design
		4.1 Non-IID Data Setting
		4.2 Federated Learning Data Augmentation Based on ACGAN
		4.3 Intrusion Detection Based on Federated Learning
	5 Performance Evaluation
		5.1 Experimental Setup
		5.2 Results of Running on Non-IID Data
		5.3 Performance of Federated Learning Data Augmentation Algorithm Based on ACGAN
		5.4 Hyperparameter Analysis
	6 Conclusion
	References
Long-Term Fairness Scheduler for Pay-as-You-Use Cache Sharing Systems
	1 Introduction
	2 Desired Properties for Cache Sharing
	3 Background and Motivation
	4 Long-Term Cache Fairness Framework
		4.1 FairCache Allocation
		4.2 FairCache Property Proof
		4.3 Efficiency Optimization for FairCache Policy
	5 Evaluation
		5.1 Experimental Setup
		5.2 Testbed Experimental Results
	6 Related Work
	7 Conclusion
	References
MatGraph: An Energy-Efficient and Flexible CGRA Engine for Matrix-Based Graph Analytics
	1 Introduction
	2 Backgrounds
		2.1 Graph Representation
		2.2 Matrix-Based Graph Analytics
		2.3 Coarse-Grained Reconfigurable Architectures
	3 Motivations
		3.1 Challenges of Graph Analytics
		3.2 Overcoming Challenges
	4 Design
		4.1 Overview Architecture
		4.2 Reduced Instructions Based on Semirings
		4.3 Bitmap-Aware Instruction Filtering
		4.4 Sparsity Removing with Bidirectional Sliding Window
	5 Methodology
		5.1 Experimental Setup
		5.2 Baselines and System Configuration
		5.3 Graph Algorithms and Datasets
	6 Results
		6.1 Overall Results
		6.2 Effects of Optimizations
	7 Related Works
	8 Conclusion
	References
D-IOCost: Dynamic Cost-Aware Fair Queueing for Better I/O Proportionality and Performance
	1 Introduction
	2 Background and Motivation
		2.1 Budget-Based I/O Scheduler
		2.2 Fair Queueing I/O Scheduler
	3 Design and Implementation
		3.1 Weight Reallocation
		3.2 Dynamic Dispatch Parallelism
		3.3 Dynamic Time Window
	4 Experimental Evaluation
		4.1 Fairness and Performance on Different I/O Request Size
		4.2 Fairness and Performance on Different I/O Type and Access Format
	5 Summary
	References
Automated Binary Analysis: A Survey
	1 Introduction
	2 Data-Driven Binary Code Analysis
		2.1 Feature Engineering for Binary Representation
		2.2 Model Training for Binary Analysis
		2.3 Model Prediction, Evaluation, and Explanation
	3 Software-Engineering-Based Binary Code Analysis
		3.1 Offline Analysis
		3.2 Online Analysis
		3.3 Hybrid Analysis
	4 Challenge and Future Work
		4.1 Binary Optimization and Obfuscation
		4.2 Adversarial Binary Analysis
		4.3 Advanced Dynamic Attacks
	5 Conclusions
	References
LTNoT: Realizing the Trade-Offs Between Latency and Throughput in NVMe over TCP
	1 Introduction
	2 Background and Motivation
		2.1 NVMe-over-TCP
		2.2 NoT-Inherent CPU Overhead
		2.3 Blk-Switch
		2.4 Motivation
	3 System Overview
	4 LTNoT Design
		4.1 Inter-Queue I/O Isolation
		4.2 T-app and L-app Pipelines
		4.3 LTprio and Priority
	5 Performance Evaluation
		5.1 System Implementation
		5.2 Experimental Setup
		5.3 Evaluation Results
	6 Related Work and Discussion
	7 Conclusion
	References
AS-cast: Lock Down the Traffic of Decentralized Content Indexing at the Edge
	1 Introduction
	2 Motivation and Problem
	3 Adaptive Scoped Broadcast
		3.1 Scoped Broadcast
		3.2 Consistent Partitioning
		3.3 Dynamic Consistent Partitioning
	4 Experimentation
		4.1 Scalability and Trade-Off of AS-cast
		4.2 Traffic Containment in Dynamic Inter-autonomous Systems
	5 Related Work
	6 Conclusion
	References
Heterogeneous Graph Based Long- And Short-Term Preference Learning Model for Next POI Recommendation
	1 Introduction
	2 Related Works
	3 Preliminaries
	4 Our Model
		4.1 Embedding Layer
		4.2 The Long-Term Preference Learning Module
		4.3 The Short-Term Preference Learning Module
		4.4 Prediction Layer
		4.5 Model Optimization
	5 Experiments
		5.1 Datasets
		5.2 Baseline Models
		5.3 Evaluation Matrices
		5.4 Settings
		5.5 Comparison with Baselines
		5.6 Ablation Study
		5.7 Influence of Hyper-parameters
	6 Conclusion
	References
SMTWM: Secure Multiple Types Wildcard Pattern Matching Protocol from Oblivious Transfer
	1 Introduction
		1.1 Related Works
		1.2 Our Contributions
		1.3 Paper Organization
	2 Preliminaries and Definitions
		2.1 Oblivious Transfer
		2.2 Privacy Equality Test
		2.3 Secure Multiple Types Wildcard Pattern Matching Functionality
		2.4 Computationally Indistinguishability
		2.5 Security Definition
	3 Secure Multiple Types Wildcard Pattern Matching Protocol
		3.1 The Idea of Protocol
		3.2 Protocol Construction
	4 Protocol Efficiency
	5 Experiments
	6 Conclusion
	References
A Label Flipping Attack on Machine Learning Model and Its Defense Mechanism
	1 Introduction
	2 Related Work
		2.1 Poisoning Attack
		2.2 Defense Against Poisoning Attack
	3 Label Flipping Attack and Its Defense Method
		3.1 The Overall Block Diagram
		3.2 Label Flipping Attack Based on Agglomerative Hierarchical Clustering
		3.3 Label Correction Defense Method Based on TrAdaBoost
	4 Experimental Evaluation
		4.1 Experimental Results of Label Flipping Attack
		4.2 Experimental Results of Defense
	5 Conclusions and Limitations
	References
Astute Approach to Handling Memory Layouts of Regular Data Structures
	1 Introduction
		1.1 Motivational Example
		1.2 Objectives and Contributions
	2 Extensible Memory Layout Structures
		2.1 Decoupling the Memory Management
		2.2 Layout-Agnostic Functions
		2.3 Transformations
	3 Performance Impact of Constant Expressions
		3.1 Indexing Performance
		3.2 Constant-Loops Optimizations
	4 Implementation and Technical Insights
		4.1 Structures
		4.2 Functions
		4.3 Object Wrappers
	5 Related Work
	6 Conclusion
	A Experimental Methodology
		A.1 GPU Benchmarking Setup
		A.2 CPU Benchmarking Setup
	References
SparG: A Sparse GEMM Accelerator for Deep Learning Applications
	1 Introduction
	2 Background
		2.1 Matrix Multiplication in Deep Learning Workloads
		2.2 Sparsity in Deep Learning Workloads
	3 Inefficiency of TPU and SIGMA
		3.1 TPU
		3.2 SIGMA
		3.3 Mapping of Sparse Irregular GEMM
	4 The SparG Architecture
		4.1 Microarchitecture
		4.2 Example
	5 Evaluation
		5.1 Experimental Methodology
		5.2 Dense Regular and Dense Irregular GEMM
		5.3 Sparse Regular GEMM
		5.4 Sparse Irregular GEMM
		5.5 Scalability Analysis
		5.6 Hardware Cost Analysis
	6 Related Work
		6.1 Sparsity
		6.2 Flexible Interconnect
	7 Conclusion
	References
An Efficient Transformer Inference Engine on DSP
	1 Introduction
	2 Background
		2.1 Transformer
		2.2 Brief Introduction to the Processor
	3 Related Works
	4 Transformer Inference Engine on DSP
		4.1 High-efficiency Operator Library on Long Vector Architectures
		4.2 Software-managed Memory Optimization Strategy Based on Variable Lifecycle
		4.3 Sequence Warp
	5 Experiment
	6 Conclusion
	References
GCNPart: Interference-Aware Resource Partitioning Framework with Graph Convolutional Neural Networks and Deep Reinforcement Learning
	1 Introduction
	2 Motivation and Related Work
	3 Problem Formulation
	4 GCNPart: Design
		4.1 Overview Design
		4.2 The GCN-based Performance Prediction Model
		4.3 The DRL Decision Making Model
	5 Implementation
	6 Evaluations
		6.1 Experimental Settings
		6.2 Baselines
		6.3 Performance of the Prediction Model
		6.4 Overall Performance of the Partitioning Framework
		6.5 Impact of Transition Penalty
		6.6 Impact of Resource Sensitivity
	7 Related Work
	8 Limitations
	9 Conclusion
	References
PipeFB: An Optimized Pipeline Parallelism Scheme to Reduce the Peak Memory Usage
	1 Introduction
	2 Related Work
		2.1 Memory Optimization on a Single Device
		2.2 Distributed Parallelism Scheme
	3 Data Transfer Mechanism and Communication Analysis
		3.1 Data Transfer Mechanism
		3.2 Communication Analysis
	4 Design of the PipeFB
		4.1 Network Dividing and Deploying Method of PipeFB
		4.2 Communication Analysis of PipeFB
		4.3 Training Process of PipeFB
	5 PipeFB with the Data Transfer Mechanism
		5.1 PipeFB Applies the G-transfer
		5.2 Communication Analysis of the PipeFB with G-transfer
		5.3 PipeFB Applies the C-transfer
		5.4 Communication Analysis of the PipeFB with C-transfer
		5.5 Summary
	6 Evaluation
		6.1 Memory Usage Test
		6.2 Training Speed Test
	7 Conclusion
	References
Operator Placement for IoT Data Streaming Applications in Edge Computing Environment
	1 Introduction
	2 Related Work
	3 Operator Placement Strategy
		3.1 System Overview
		3.2 System Model
		3.3 Constraint Programming Formulation
	4 Proposed Algorithm
		4.1 Genetic Algorithm Based on Constraint Programming
		4.2 Algorithm Description
	5 Baseline Approaches
		5.1 Greedy Intra-node Communication (GD-I)
		5.2 Greedy Improved Intra-node Communication (GD-II)
		5.3 Graph Partitioning (GP)
		5.4 Random Strategy (RS)
	6 Experimental Results and Analysis
		6.1 Experiment Configurations
		6.2 Metrics
		6.3 Results and Analysis
	7 Conclusion
	References
Makespan and Security-Aware Workflow Scheduling for Cloud Service Cost Minimization Using Firefly Optimizer
	1 Introduction
	2 Related Work
	3 System Model
		3.1 Application Model
		3.2 Cloud Resource Model
		3.3 Security Model
	4 Problem Formulation and Methodology Overview
		4.1 Problem Definition
		4.2 Methodology Overview
	5 Proposed Improved Firefly Algorithm
		5.1 Solution Representation and Initialization
		5.2 The Improved Updating Scheme
		5.3 The Distance-Based Mapping Operator
		5.4 Task Assignment Scheme
		5.5 The Detailed Procedures of IFA
	6 Simulation
	7 Conclusion
	References
Efficient Multiple-Precision and Mixed-Precision Floating-Point Fused Multiply-Accumulate Unit for HPC and AI Applications
	1 Introduction
	2 Related Works
		2.1 Basic FMA Architecture
	3 The Proposed FMA Architecture
		3.1 Input Operands Processing
		3.2 Multiple-Precision Multiplier
		3.3 Alignment Shifter
		3.4 Adder
		3.5 Leading Zero Anticipator
		3.6 Normalization
		3.7 Rounding and Exponent Adjustment
	4 Synthesis and Evaluation
	5 Conclusion
	References
Efficient-Secure k-means Clustering Guaranteeing Personalized Local Differential Privacy
	1 Introduction
	2 Preliminaries
	3 Proposed Approach
		3.1 Overview
		3.2 Proposed Framework
		3.3 Privacy Analysis
	4 Experimental Evaluation
		4.1 Experimental Environment and Datasets
		4.2 Experimental Setup and Evaluation Metrics
		4.3 Experimental Analysis
	5 Conclusion
	References
Optimizing Yinyang K-Means Algorithm on ARMv8 Many-Core CPUs
	1 Introduction
	2 Background
		2.1 Yinyang Algorithm
		2.2 ARMv8 Architecture
		2.3 Non Uniform Memory Access
	3 Analysis
	4 Optimization Technique
		4.1 Vectorization
		4.2 NUMA-Aware Optimization
		4.3 Memory Layout Optimization
	5 Experimental Result
		5.1 Setup
		5.2 Comparison
	6 Conclusion and Future Work
	References
Mining High-Value Patents Leveraging Massive Patent Data
	1 Introduction
	2 Related Work
		2.1 Patent Mining
		2.2 Heterogeneous Feature Fusion
		2.3 Multi-view Learning
	3 Framework
		3.1 Preliminaries of High-value Patent Mining
		3.2 Preliminaries of Multi-view Learning
		3.3 Overview of Framework
	4 Feature Extraction
		4.1 Structured Features Extraction Based on Patent Knowledge Graph
		4.2 Semantic Features Extraction Based on BERT
		4.3 Visual Features Extraction Based on DenseNet
	5 Multi-view Learning
	6 Evaluation
		6.1 Dataset Description
		6.2 Metrics on High-Value Patent Mining
		6.3 Baselines
		6.4 Results
		6.5 Case Study
	7 Conclusion and Future Work
	References
EasyNUSC: An Efficient Heterogeneous Computing Framework for Non-uniform Sampling Two-Dimensional Convolution Applications
	1 Introduction
	2 Related Work
		2.1 Heterogeneous Programming Model
		2.2 Non-uniform Sampling Two-Dimensional Convolution
		2.3 Parallelization Research of NUSC Applications
	3 Programming Model
	4 Runtime Framework
		4.1 Programming Interfaces for Users
		4.2 Framework Design and Implementation
		4.3 Performance Optimization
	5 Application Examples
		5.1 Astronomy Gridding Algorithm
		5.2 Geometric Correction Algorithm for Remote Sensing Images
	6 Experiments and Evaluation
		6.1 Simplification of Programming
		6.2 Performance Evaluation Within a Node
	7 Conclusion
	References
DNNEmu: A Lightweight Performance Emulator for Distributed DNN Training
	1 Introduction
	2 Background and Related Work
		2.1 Performance Measurement
		2.2 Performance Prediction
	3 System Design
		3.1 Overview of DNNEmu
		3.2 Layer-Wise Profiler
		3.3 Performance Predictor
		3.4 Computation Simulator
	4 Experimental Results
		4.1 Experimental Setup
		4.2 Runtime Prediction
		4.3 Distributed Training
		4.4 Communication Optimization
	5 Conclusion
	References
Ordis: A Dynamic Order-Dispatch Algorithm for Ridehailing and Ridesharing in a Large Region
	1 Introduction
	2 Related Work
	3 Problem Formulation
	4 Order Dispatch Algorithm
		4.1 Distributed Multi-queue Model
		4.2 Order Dispatching Algorithms
	5 Performance Evaluation
		5.1 Simulation System
		5.2 Dataset
		5.3 Results
	6 Conclusion
	References
Multi-initial-Center Federated Learning with Data Distribution Similarity-Aware Constraint
	1 Introduction
	2 Related Work
		2.1 Classic Federated Learning
		2.2 Personalized Federated Learning
	3 Preliminaries
	4 Framework
		4.1 Problem Formulation
		4.2 Cluster Structure Optimization
		4.3 Model Weights Optimization
	5 Experiments
		5.1 Datasets and Baselines
		5.2 Experimental Setting
		5.3 Effectiveness of Proposed Framework
		5.4 Sensitivity Analysis
		5.5 Ablation Results
		5.6 Discussion
	6 Conclusion
	References
An Efficient Graph Accelerator with Distributed On-Chip Memory Hierarchy
	1 Introduction
	2 Background and Motivation
		2.1 Programming Model
		2.2 Limitation of Existing Architecture
		2.3 Distributed On-Chip Memory Hierarchy
	3 GraphS Architecture
		3.1 Hardware Design
		3.2 Workflow
	4 GraphS Optimizations
		4.1 Data Placement
		4.2 Omega Network
		4.3 Degree-Aware Preprocessing Method
	5 Evaluation
		5.1 Experimental Settings
		5.2 Experimental Results
		5.3 Performance Scaling
	6 Conclusion
	References
Routing Protocol Based on Mission-Oriented Opportunistic Networks
	1 Introduction
	2 Related Work
	3 Routing Protocol Based on Multi-dimensional Trust Evaluation Parameters
		3.1 Multidimensional Trust Evaluation Parameters
		3.2 MTEPRP Routing Protocol
	4 Experimental Evaluation
		4.1 Simulation Environment Settings
		4.2 Analysis of Simulation Result
	5 Conclusion
	References
Author Index



