-
Programming Model Extensions for General-Purpose Processing-In-Memory
Hyesun Hong, Lukas Sommer, Bongjun Kim, Mikhail Kashkarov, Kumudha Narasimhan, Ilya Veselov, Mehdi Goli, Jaeyeon Kim, Ruyman Reyes Castro, Hanwoong Jung
ISC High Performance 2024, May 2024
[pdf]
-
A performance analysis of leading many-core technologies for Cellular Automata execution
Alessio De Rango, Donato D'Ambrosio, Alfonso Senatore, Giuseppe Mendicino, Kumudha Narasimhan, Mehdi Goli, Rod Burns
International Workshop on Algorithms, Models and Tools for Parallel Computing on Heterogeneous Platforms (HeteroPar) @ Euro-Par, August 2023
[pdf]
-
Accelerating Neural Networks using Open Standard Software on RISC-V
Kumudha Narasimhan, Mehdi Goli
International Workshop on RISC-V for HPC @ ISC 2023 (RISC-V HPC), May 2023
[pdf], [slides]
-
User-Driven Online Kernel Fusion for SYCL
Victor Perez, Lukas Sommer, Victor Lomüller, Kumudha Narasimhan, Mehdi Goli
ACM Transactions on Architecture and Code Optimization (TACO), Nov 2022
[pdf]
-
Toward Performance Portability of AI Graphs Using SYCL
Kumudha Narasimhan, Ouadie El Farouki, Mehdi Goli, Muhammad Tanvir, Svetlozar Georgiev, Isaac Ault
Performance, Portability, and Productivity in HPC (P3HPC) @ Supercomputing (SC), Nov 2022
[pdf]
-
Towards performance portability of AI models using SYCL-DNN
Muhammad Tanvir, Kumudha Narasimhan, Mehdi Goli, Ouadie El Farouki, Svetlozar Georgiev, Isaac Ault
International Workshop on OpenCL (IWOCL), May 2022
[pdf]
-
Improving Performance of SYCL Applications on CPU Architectures using LLVM-directed Compilation Flow
Pietro Ghiglio, Uwe Dolinsky, Mehdi Goli, Kumudha Narasimhan
Programming Models and Applications for Multicores and Manycores (PMAM) @ PPoPP, Apr 2022
[pdf], [slides]
-
A Practical Tile Size Selection Model for Affine Loop Nests
Kumudha Narasimhan, Aravind Acharya, Abhinav Baid, Uday Bondhugula
-
Towards Cross-Platform Performance Portability of DNN Models using SYCL
Mehdi Goli, Kumudha Narasimhan, Ruyman Reyes et al.
Performance, Portability, and Productivity in HPC (P3HPC) @ Supercomputing (SC), Nov 2020
[pdf]
-
Optimizing Dense Matrix Computations with PolyMage
Kumudha KN
-
Optimizing Geometric Multigrid Method Computation using a DSL Approach
Vinay Vasista, Kumudha Narasimhan, Siddharth Bhat, Uday Bondhugula
Supercomputing (SC), Nov 2017
[pdf]