Authors: Stephen Poole (editor), Oscar Hernandez (editor), Matthew Baker (editor), Tony Curtis (editor)
ISBN: 3031048873, 9783031048876
Publisher: Springer
Year: 2022
Pages: 212 [205]
Language: English
File format: PDF (converted to PDF, EPUB, or AZW3 on request)
File size: 14 MB
If you need the book OpenSHMEM and Related Technologies. OpenSHMEM in the Era of Exascale and Smart Networks (Lecture Notes in Computer Science) in PDF, EPUB, AZW3, MOBI, or DJVU format, notify support and they will convert the file for you.
Note that the book OpenSHMEM and Related Technologies. OpenSHMEM in the Era of Exascale and Smart Networks (Lecture Notes in Computer Science) is the original-language edition and has not been translated into Persian. The International Library website offers original-language books only and does not offer any book translated into or written in Persian.
Preface
Organization
Contents

Applications and Implementations

A Study in SHMEM: Parallel Graph Algorithm Acceleration with Distributed Symmetric Memory
  1 Introduction
  2 Background
    2.1 PGAS
    2.2 SHMEM
    2.3 Minimum Spanning Tree
  3 Related Research
    3.1 OpenSHMEM API Calls
    3.2 OpenSHMEM Graph Processing
    3.3 Productivity Studies
    3.4 Parallel MST
  4 Experiments
    4.1 API Level
    4.2 Datasets
    4.3 Testbed
    4.4 Algorithm
    4.5 Algorithm Variables
    4.6 SHMEM Optimizations
  5 Results
    5.1 API Level
    5.2 MST Algorithm
    5.3 Productivity Studies
  6 Discussion
    6.1 API Level
    6.2 Productivity Studies
    6.3 MST Algorithm
  7 Conclusions
  8 Future Work
  References

OpenFAM: A Library for Programming Disaggregated Memory
  1 Introduction
  2 The OpenFAM API
  3 The OpenFAM Reference Implementation
  4 Performance Measurements
    4.1 Data Path Performance
    4.2 Meta Data Operations
  5 Discussion and Future Work
  6 Related Work
  7 Summary
  References

OpenSHMEM over MPI as a Performance Contender: Thorough Analysis and Optimizations
  1 Introduction
  2 Background
    2.1 Semantics Overview
    2.2 OSHMPI
  3 Related Work
  4 Analysis of Performance Loss in Contiguous RMA
  5 Optimizations for Fast RMA
    5.1 Basic Datatype Decoding with IPO Link-Time Inlining
    5.2 Fast Window Attributes Access
    5.3 Avoiding Virtual Address Translation
    5.4 Optimizing MPI Progress
    5.5 Reducing Synchronization in OSHMPI
    5.6 Other Implementation-Specific Optimizations
  6 Evaluation
    6.1 Instruction Analysis
    6.2 Latency
    6.3 Message Rate
  7 Conclusion and Future Work
  References

Tools and Benchmarks

SKaMPI-OpenSHMEM: Measuring OpenSHMEM Communication Routines
  1 Introduction
  2 Related Works
  3 Measuring OpenSHMEM Communication Routines
    3.1 Point-to-Point Communication Routines
    3.2 Collective Operations
    3.3 Fine-Grain Measurements
  4 Experimental Evaluation
    4.1 Loop Measurement Granularity
    4.2 Point-to-Point Communications: Blocking vs Non-blocking
    4.3 Point-to-Point Communications: Overlap Capabilities
    4.4 Collective Communications: Broadcast
    4.5 Locks
  5 Conclusion and Perspectives
  References

A Tools Information Interface for OpenSHMEM
  1 Introduction
  2 Related Work
  3 OpenSHMEM Performance Variables
  4 Design of a Tools Information Interface
    4.1 shmem_t Variables
  5 Example Tool Usage
  6 Conclusion
  References

CircusTent: A Tool for Measuring the Performance of Atomic Memory Operations on Emerging Architectures
  1 Introduction
  2 Background
    2.1 Atomic Operations in Distributed Memory
    2.2 CircusTent
  3 CircusTent for Distributed Memory
    3.1 OpenSHMEM
    3.2 MPI RMA
  4 Evaluation
    4.1 Methodology
    4.2 Platforms
    4.3 Kernel Scalability
    4.4 Shared Memory Optimizations
    4.5 Observations
  5 Related Work
    5.1 Atomic Memory Operations
    5.2 Memory Benchmarks
  6 Conclusion
  7 Future Work
  References

SHMEM-ML: Leveraging OpenSHMEM and Apache Arrow for Scalable, Composable Machine Learning
  1 Motivation
    1.1 Related Work: Using HPC Frameworks Under Existing DS/ML Frameworks
    1.2 Related Work: Novel HPC DS/ML Frameworks
    1.3 Contributions
  2 Programming Model
    2.1 Distributed SHMEM-ML Arrays
    2.2 SHMEM-ML Arrays with Third Party Python Libraries
    2.3 Client-Server vs. SPMD
  3 Implementation
    3.1 Background: OpenSHMEM
    3.2 Background: Apache Arrow
    3.3 Background: Scikit-Learn, Tensorflow, and Horovod
    3.4 ND-Array Implementation
    3.5 Client-Server Implementation
    3.6 Integration with Scikit-Learn and Tensorflow/Keras
  4 Performance Evaluation
    4.1 Scikit-Learn
    4.2 Tensorflow
  5 Conclusions
  References

Programming Models and Extensions

OpenSHMEM Active Message Extension for Task-Based Programming
  1 Introduction
  2 Background
    2.1 Task-Based Programming and Active Messages
    2.2 OpenSHMEM
    2.3 Related Work
  3 Design and Implementation
    3.1 OpenSHMEM Active Message API Extension
    3.2 Implementation
  4 Performance Evaluation
    4.1 Latency and Throughput
    4.2 Tasking Framework Efficiency
  5 Conclusion and Future Work
  References

UCX Programming Interface for Remote Function Injection and Invocation
  1 Introduction
  2 Background
    2.1 Two-Chains
    2.2 Related Work
  3 Design and Implementation
    3.1 The ifunc API
    3.2 Using the API
    3.3 ifuncs versus UCX Active Messages
    3.4 Implementing the API
    3.5 Security Implications and Mitigations
  4 Evaluation
    4.1 Microbenchmark Description
    4.2 Testbed Platform
    4.3 Experimental Results and Analysis
    4.4 Takeaways
  5 Conclusion
    5.1 Future Work
  References

Can Deferring Small Messages Enhance the Performance of OpenSHMEM Applications?
  1 Introduction
  2 Related Work
    2.1 Nagle's Algorithm
    2.2 DMAPP Bundled Puts
    2.3 Bale: Exstack and Conveyor
    2.4 Libfabric (FI_MORE)
    2.5 IB Verbs Postlists
  3 Background
    3.1 How Nagle's Algorithm Works and Its Caveats
    3.2 The Bale Programming Model and a Performance Summary
    3.3 OpenSHMEM Contexts
  4 Design and Discussion
    4.1 Proposed API
    4.2 API Extensions (Items up for Discussion)
  5 Performance Evaluation
    5.1 Experimental Setup
    5.2 Perftools Measurement of IB Verbs Message Rates
    5.3 OSU Microbenchmarks: Chained vs. Unchained Message Rates
    5.4 GUPS with and Without Chaining
  6 Future Work and Conclusion
  References

Remote Programmability Model for SmartNICs in HPC Workloads
  1 Introduction
  2 Proposed System Design
    2.1 Rationale
    2.2 Example: SmartNIC-Based Storage
  3 Scalability
  4 Related Work
  5 Conclusions and Summary
  References

Dynamic Symmetric Heap Allocation in NVSHMEM
  1 Introduction
  2 NVSHMEM
    2.1 NVSHMEM Symmetric Memory
    2.2 Peer-to-Peer Memory Access
    2.3 Network Interconnect
  3 CUDA Virtual Memory Management (VMM) API
  4 Using VMM for Dynamic Heap Allocation
  5 Experimental Results
    5.1 Register Count
    5.2 Memory Allocation Time
  6 Related Work
  7 Conclusion and Future Work
  References

Author Index