Download the book: High Performance Computing: 38th International Conference, ISC High Performance 2023, Hamburg, Germany, May 21–25, 2023, Proceedings (Lecture Notes in Computer Science)

Book details

ISBN: 3031320409, 9783031320408
Publisher: Springer
Year of publication: 2023
Number of pages: 440 [432]
Language: English
File format: PDF (converted to PDF, EPUB, or AZW3 at the user's request)
File size: 34 MB

Price (Toman): 69,000



If you need the book file converted to PDF, EPUB, AZW3, MOBI, or DJVU, you can notify support and they will convert it for you.

Note that this book is the original-language edition and has not been translated into Persian. The International Library website offers original-language books only and does not provide any books translated into or written in Persian.


Table of contents

Preface
Organization
Contents
Architecture, Networks, and Storage
CPU Architecture Modelling and Co-design
	1 Introduction
	2 Approach to Modelling
	3 Methodology
	4 Model Tuning and Validation
	5 Applications
		5.1 GROMACS
		5.2 GPAW
	6 Results
		6.1 GROMACS
		6.2 GPAW
	7 Related Work
	8 Summary and Conclusions
	References
Illuminating the I/O Optimization Path of Scientific Applications
	1 Introduction
	2 Related Work
	3 Visualization, Diagnosis, and Recommendations
		3.1 Extracting I/O Behavior from Metrics
		3.2 Exploring I/O Behavior Interactively
		3.3 Automatic Detection of I/O Bottlenecks
		3.4 Exploring I/O Phases and Bottlenecks
		3.5 Towards Exploring File System Usage
	4 Results
		4.1 I/O Systems in NERSC and OLCF
		4.2 I/O Bottlenecks in OpenPMD
		4.3 Improving AMReX with Asynchronous I/O
	5 Conclusion
	References
Efficient Large Scale DLRM Implementation on Heterogeneous Memory Systems
	1 Introduction
	2 Related Work
	3 Implementing Embedding Tables in Heterogeneous Memory Systems
	4 Cached Embeddings
		4.1 CachedEmbeddings Performance
	5 DLRM Implementation Methodology
	6 End-to-End DLRM Performance
	7 Conclusions and Future Work
	References
HPC Algorithms and Applications
Efficient GPU Offloading with OpenMP for a Hyperbolic Finite Volume Solver on Dynamically Adaptive Meshes
	1 Introduction
	2 Science Case and Code Architecture
	3 A Realisation of GPU Offloads with target map
	4 User-Managed Memory Management
		4.1 Data Pre-allocation on the GPU
		4.2 Pre-allocation on the CPU with Unified Memory
	5 Results
	6 Discussion and Conclusions
	7 Summary and Outlook
	References
Shallow Water DG Simulations on FPGAs: Design and Comparison of a Novel Code Generation Pipeline
	1 Introduction
	2 Background
		2.1 Mathematical Model and Numerical Scheme
		2.2 Simulation Scenario: Radial Dam Break
		2.3 FPGAs
	3 Proposed Code Generation Pipeline (CGP)
		3.1 GHODDESS
		3.2 pystencils
		3.3 StencilStream
		3.4 Integration
	4 Existing Dataflow Design
	5 FPGA Designs, Experiments and Evaluation
		5.1 Performance of the CPU Reference and Validation
	6 Analysis
	7 Related Work
	8 Conclusion and Outlook
	References
Massively Parallel Genetic Optimization Through Asynchronous Propagation of Populations
	1 Introduction
	2 Related Work
	3 Propulate Algorithm and Implementation
	4 Experimental Evaluation
		4.1 Experimental Environment
		4.2 Benchmark Functions
		4.3 Meta-optimizing the Optimizer
		4.4 Benchmark Function Optimization
		4.5 HP Optimization for Remote Sensing Classification
		4.6 Scaling
	5 Conclusion
	References
Steering Customized AI Architectures for HPC Scientific Applications
	1 Introduction
	2 Related Work and Research Contributions
	3 Batching/Compression or Why Matricization Matters?
	4 The Graphcore IPU Hardware Technology
		4.1 Architecture Principles and Hardware Details
		4.2 Programming Model and Poplar Development Kit
	5 HPC Scientific Applications
		5.1 Adaptive Optics in Computational Astronomy
		5.2 Seismic Processing and Imaging
		5.3 Climate/Weather Prediction Applications
		5.4 Wireless Communications
	6 Implementation Details
	7 Performance Results
	8 Limitations and Perspectives
	9 Conclusion and Future Work
	References
GPU-Based Low-Precision Detection Approach for Massive MIMO Systems
	1 Introduction
	2 Brief Background
		2.1 Modulation
		2.2 Signal to Noise Ratio (SNR)
		2.3 Error Rate and Time Complexity
	3 Related Work
	4 System Model
		4.1 Tree-Based Representation
	5 Multi-level Approach
	6 GPU-Based Multi-level Approaches
		6.1 GPU Multi-level
		6.2 Multi-GPU Version
	7 Results and Discussions
	8 Conclusion and Perspectives
	References
A Mixed Precision Randomized Preconditioner for the LSQR Solver on GPUs
	1 Introduction
	2 Background
		2.1 Related Work
	3 Design and Implementation of the Mixed Precision Preconditioner
	4 Numerical Experiments
		4.1 Experiment Setup
		4.2 Discussion
	5 Conclusion
	References
Ready for the Frontier: Preparing Applications for the World's First Exascale System
	1 Introduction and Background
	2 Systems Overview
		2.1 Summit
		2.2 Frontier
	3 Applications
		3.1 CoMet
		3.2 Cholla: Computational Hydrodynamics on Parallel Architecture
		3.3 GESTS: GPUs for Extreme-Scale Turbulence Simulations
		3.4 LBPM: Lattice Boltzmann Methods for Porous Media
		3.5 LSMS
		3.6 NUCCOR/NTCL
		3.7 NAMD
		3.8 PIConGPU
	4 Lessons Learned
	5 Conclusions
	References
End-to-End Differentiable Reactive Molecular Dynamics Simulations Using JAX
	1 Introduction
		1.1 Related Work
		1.2 Our Contribution
	2 Background
		2.1 ReaxFF Overview
		2.2 JAX and JAX-MD Overview
	3 Design and Implementation
		3.1 Memory Management
		3.2 Generation of Interaction Lists
		3.3 Force Field Training
	4 Experimental Results
		4.1 Software and Hardware Setup
		4.2 Validation of MD Capabilities
		4.3 Performance and Scalability
		4.4 Training
	5 Conclusion
	References
Machine Learning, AI, and Quantum Computing
Allegro-Legato: Scalable, Fast, and Robust Neural-Network Quantum Molecular Dynamics via Sharpness-Aware Minimization
	1 Introduction
	2 Method Innovation
		2.1 Summary of Neural-Network Quantum Molecular Dynamics
		2.2 Summary of Sharpness-Aware Minimization
		2.3 Key Innovation: Allegro-Legato: SAM-Enhanced Allegro
		2.4 RXMD-NN: Scalable Parallel Implementation of Allegro-Legato NNQMD
	3 Results
		3.1 Experimental Platform
		3.2 Fidelity-Scaling Results
		3.3 Computational-Scaling Results
	4 Discussions
		4.1 Simulation Time
		4.2 Training Time
		4.3 Model Accuracy
		4.4 Implicit Sharpness Regularization in Allegro
		4.5 Training Details
	5 Applications
	6 Related Work
	7 Conclusion
	References
Quantum Annealing vs. QAOA: 127 Qubit Higher-Order Ising Problems on NISQ Computers
	1 Introduction
	2 Methods
		2.1 Ising Model Problem Instances
		2.2 Quantum Alternating Operator Ansatz
		2.3 Quantum Annealing
		2.4 Simulated Annealing Implementation
	3 Results
	4 Discussion
	References
Quantum Circuit Simulation by SGEMM Emulation on Tensor Cores and Automatic Precision Selection
	1 Introduction
	2 Background
		2.1 NVIDIA Tensor Core and SGEMM Emulation
		2.2 Quantum Circuit Simulation and Tensor Network Contraction
	3 SGEMM Emulation Library on Tensor Cores
	4 Automatic Precision Selection
		4.1 Exponent Statistics and Computing Mode Selection Rule
		4.2 Dynamic Kernel Selection
		4.3 The Overhead of the Exponent Statistics
	5 Experiment
		5.1 Preparation
		5.2 Exploratory Experiment
		5.3 Random Quantum Circuit Simulation
	6 Conclusion
	References
Performance Modeling, Evaluation, and Analysis
A Study on the Performance Implications of AArch64 Atomics
	1 Introduction
	2 The Problem
		2.1 RAJAPerf and the PI_ATOMIC kernel
		2.2 Performance Results
		2.3 A Closer Look at OpenMP Floating-Point Atomics
	3 Benchmarking CAS Operations
		3.1 Compare-and-Swap Operations
		3.2 Benchmark Description
		3.3 Assembly Kernels
	4 Experiments and Observations
		4.1 Evaluating the Performance of CAS
		4.2 A Closer Look at A64FX
		4.3 Testing LL-SC Implementations
		4.4 Summary and Recommendations
	5 Related Work
	6 Conclusions
	References
Analyzing Resource Utilization in an HPC System: A Case Study of NERSC's Perlmutter
	1 Introduction
	2 Related Work
	3 Background
		3.1 System Overview
		3.2 Data Collection
		3.3 Analysis Methods
	4 Results
		4.1 Workloads Overview
		4.2 Resource Utilization
		4.3 Temporal Characteristics
		4.4 Spatial Characteristics
		4.5 Correlations
	5 Discussion and Conclusion
	References
Overcoming Weak Scaling Challenges in Tree-Based Nearest Neighbor Time Series Mining
	1 Introduction
	2 Matrix Profile Background and Performance-Accuracy Trade-offs
		2.1 Related Work
		2.2 Potentials of Tree-based Methods
	3 Current Parallel Tree-Based Approach and Its Shortcomings
	4 Overcoming the Scalability Challenges
		4.1 Pipelining Mechanism
		4.2 Forest of Trees on Ensembles of Resources
	5 Modeling the Impact of Optimizations on Complexity
	6 Experimental Setup
	7 Evaluations
		7.1 Region of Benefit
		7.2 Performance on Real-World Datasets
		7.3 Single-Node Performance
		7.4 Scaling Overheads
		7.5 Effects of Pipelining and Forest Mechanisms
		7.6 Scaling Results
		7.7 Billion Scale Experiment
	8 Conclusions
	References
Porting Numerical Integration Codes from CUDA to oneAPI: A Case Study
	1 Introduction
	2 Background
		2.1 oneAPI and SYCL
		2.2 CUDA-Backend for SYCL
		2.3 Related Work
	3 Numerical Integration Use Case
		3.1 PAGANI
		3.2 m-Cubes
	4 Porting Process
		4.1 Challenges
	5 Experimental Results
		5.1 Offloading Mathematical Computations to Kernels
		5.2 Benchmark Integrands Performance Comparison
		5.3 Simple Integrands Performance Comparison
		5.4 Factors Limiting Performance
	6 Conclusion
	References
Performance Evaluation of a Next-Generation SX-Aurora TSUBASA Vector Supercomputer
	1 Introduction
	2 Overview of SX-Aurora TSUBASA VE30
		2.1 The SX-Aurora TSUBASA Product Family
		2.2 Basic Architecture of the VE30 Processor
		2.3 Architectural Improvements from the VE20 Processor
	3 Performance Evaluation
		3.1 Evaluation Environment
		3.2 Basic Benchmarks
		3.3 Evaluation of Architectural Improvements
		3.4 Real-World Workloads
	4 Performance Tuning for VE30
		4.1 Selective L3 Caching
		4.2 Partitioning Mode
	5 Conclusions
	References
Programming Environments and Systems Software
Expression Isolation of Compiler-Induced Numerical Inconsistencies in Heterogeneous Code
	1 Introduction
	2 Examples of Compiler-Induced Inconsistencies
	3 Technical Approach
		3.1 Hierarchy Extraction
		3.2 Hierarchical Code Isolation
		3.3 Source-to-Source Precision Enhancement
	4 Experimental Evaluation
		4.1 RQ1: Numerical Inconsistencies in Heterogeneous Programs
		4.2 RQ2: Comparison with the State of the Art
		4.3 Threats to Validity
	5 Related Work
	6 Conclusion
	References
SAI: AI-Enabled Speech Assistant Interface for Science Gateways in HPC
	1 Introduction and Motivation
		1.1 Motivation
		1.2 Challenges in Enabling Conversational Interface for HPC
		1.3 Contributions
	2 Background
		2.1 Conversational User Interface
		2.2 Open OnDemand
		2.3 Ontology and Knowledge Graphs
		2.4 Spack
	3 Terminologies
	4 Proposed SAI Framework
		4.1 Generating HPC Datasets for Speech and Text
		4.2 Fine-Tuning Speech Recognition Model for HPC Terminologies
		4.3 Designing an Entity Detection and Classification Model for SAI
		4.4 Creating the HPC Ontology and Knowledge Graphs
		4.5 Knowledge Graph Selection and Inference
		4.6 Software Installer Check and Interfacing with Spack
		4.7 Integration with Open OnDemand
	5 Insights into SAI Usage and Explainable Flow
	6 Experimental Evaluation
		6.1 Evaluation Platform
		6.2 Evaluation Methodology
		6.3 Evaluating ASR Model
		6.4 Evaluating NLU Model
		6.5 Performance Evaluation of Combined ASR and NLU Models
		6.6 Overhead Analysis of SAI
		6.7 Overhead Analysis of Scaling Passenger App Users
		6.8 Analysis of SAI Interactive App on Different Architectures
	7 Discussion
		7.1 Security and Authentication
		7.2 Handling Ambiguous Queries in SAI
		7.3 Trade-offs for Converting Speech to Entities
		7.4 Portability for New Software and Systems
	8 Related Work
	9 Future Work
	10 Conclusion
	References
Author Index



