#cuda
Publications
Movement and Placement of Non-Contiguous Data In Distributed GPU Computing
Carl Pearson
Ph.D. Dissertation
04/21
Machine Learning for CUDA+MPI Design Rules
Carl Pearson, Aurya Javeed, Karen Devine
in 23rd IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing (PDSEC)
03/22
Adaptive Cache Bypass and Insertion for Many-Core Accelerators
Xuhao Chen, Shengzhao Wu, Li-Wen Chang, Wei-Sheng Huang, Carl Pearson, Wen-mei Hwu
in Proceedings of International Workshop on Manycore Embedded Systems, 2016
06/14
Collaborative (CPU + GPU) Algorithms for Triangle Counting and Truss Decomposition
Vikram S. Mailthody, Ketan Date, Zaid Qureshi, Carl Pearson, Rakesh Nagi, Jinjun Xiong, Wen-mei Hwu
in 2018 IEEE High Performance Extreme Computing Conference
09/18
Posts
Improving MPI_Pack performance in CUDA-aware MPI
10/06/20
CUDA Releases: Component Versions and Sizes
10/23/23
Using nvtx-connector and Nsight Systems to Understand your Kokkos Application
07/29/24
Talks
Evaluating Characteristics of CUDA Communication Primitives on High-Bandwidth Interconnects
at ACM/SPEC International Conference on Performance Engineering
04/10/19
Benchmarking CUDA Communication Primitives on High-Bandwidth Interconnects
at ADA Liaison Meeting
06/05/19
Latency and Bandwidth Microbenchmarks of Six US Department of Energy Systems in the Top500
at Cluster 2023
11/02/23
Latency and Bandwidth Microbenchmarks of US Department of Energy Systems in the June 2023 Top500 List
at Supercomputing 2023
11/13/23
Kokkos Kernels: State on Exascale Architectures
at Kokkos User Group Meeting 2023
12/12/23