Adding MFMA Support to gem5. Marco Kurzynski and Matthew D. Sinclair. Preprint on ArXiV. Download: PDF
NUMA-Aware Queue Scheduler for Multi-Chiplet GPUs. Neeraj Surawar, MS Thesis, December 2024. Download: PDF
PAL: A Variability-Aware Policy for Scheduling ML Workloads in GPU Clusters. Rutwik Jain, Brandon Tran, Keting Chen, Matthew D. Sinclair, Shivaram Venkataraman. In the International Conference for High Performance Computing, Networking, Storage, and Analysis (SC), November 2024. Download: Artifact, PDF
CPElide: Efficient Multi-Chiplet GPU Implicit Synchronization. Preyesh Dalmia, Rajesh Shashi Kumar, and Matthew D. Sinclair. In IEEE/ACM International Symposium on Microarchitecture (MICRO), November 2024. Download: Artifact, PDF, Presentation
Simulating Machine Learning Models at Scale. Vishnu Ramadas and Matthew D. Sinclair. At SRC TECHCON. September 2024. Download: PDF
Global Optimizations & Lightweight Dynamic Logic for Concurrency. Suchita Pati, Shaizeen Aga, Nuwan Jayasena, and Matthew D. Sinclair. Preprint on ArXiV. Download: PDF
PAL: A Variability-Aware Policy for Scheduling ML Workloads in GPU Clusters. Rutwik Jain, Brandon Tran, Keting Chen, Matthew D. Sinclair, and Shivaram Venkataraman. Preprint on ArXiV, August 2024. Download: PDF
Creating Flexible, High Fidelity Energy Modeling for Future HPC Systems. Matthew D. Sinclair, Bobby Bruce, William Godoy, Oscar Hernandez, Jason Lowe-Power, and Shivaram Venkataraman. In DOE Energy-Efficient Computing for Science Workshop. September 2024. Download: PDF
Designing Generalizable Power Models For Open-Source Architecture Simulators. Alex Smith, Bobby Bruce, Jason Lowe-Power and Matthew D. Sinclair. In 3rd Open-Source Computer Architecture Research Workshop (OSCAR). June 2024. Download: PDF
Cross-stack Optimizations for Sequence-based Models on GPUs. Suchita Pati. PhD Thesis, May 2024. Download: PDF
Further Closing the GAP: Improving the Accuracy of gem5's GPU Models. Vishnu Ramadas, Daniel Kouchekinia, and Matthew D. Sinclair. In 6th Young Architects' (YArch) Workshop, April 2024. Download: PDF
Building Better Tools To Enable Power- and Sustainability-Aware Co-Design. Matthew D. Sinclair, Brandon Tran, and Akanksha Chaudhari. In NSF Workshop on Sustainable Computing for Sustainability (NSF-WSCS). April 2024.
T3: Transparent Tracking & Triggering for Fine-grained Overlap of Compute & Collectives. Suchita Pati, Shaizeen Aga, Mahzabeen Islam, Nuwan Jayasena, and Matthew D. Sinclair. In International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). April 2024. Download: PDF, Lightning Talk
Architecture 2.0. Siddharth Garg, Brian Hirano, Jenny Huang, Yingyan (Celine) Lin, Vijay Janapa Reddi, Tushar Krishna, Srivatsan Krishnan, Benjamin Lee, Jason Lowe-Power, Martin Maas, Shvetank Prakash, Matthew D. Sinclair, Srinivas Sridharan, Amir Yazdanbakhsh, Jason Yik, and Cliff Young. December 2023. Download: Preliminary Summary, SIGARCH Blog Post
Fifty Years of ISCA: A data-driven retrospective on key trends. Matthew D. Sinclair, Parthasarathy Ranganathan, Gaurang Upasani, Adrian Sampson, David Patterson, Rutwik Jain, Shaan Shah, Nidhi Parthasarathy. In IEEE Micro, vol. 43, no. 6, pp. 109-124, Nov.-Dec. 2023, doi: 10.1109/MM.2023.3324465. Download: PDF, Data
Tale of Two Cs: Computation vs. Communication Scaling for Future Transformers on Future Hardware. Suchita Pati, Shaizeen Aga, Mahzabeen Islam, Nuwan Jayasena, Matthew D. Sinclair. In 2023 IEEE International Symposium on Workload Characterization (IISWC), October 2023. Download: PDF
Reducing Synchronization and Communication Overhead in GPUs. Preyesh Dalmia. PhD Thesis, September 2023. Download: PDF
Fifty Years of ISCA: A data-driven retrospective on key trends. Gaurang Upasani, Matthew D. Sinclair, Adrian Sampson, Parthasarathy Ranganathan, David Patterson, Shaan Shah, Nidhi Parthasarathy, Rutwik Jain. Preprint on ArXiV, June 2023. Download: PDF
Closing the Gap: Improving the Accuracy of gem5’s GPU Models. Vishnu Ramadas, Daniel Kouchekinia, Ndubuisi Osuji, and Matthew D. Sinclair. In 5th gem5 Users' Workshop, June 2023. Download: Abstract
Improving gem5’s GPUFS Support. Vishnu Ramadas, Matthew Poremba, Bradford M. Beckmann, and Matthew D. Sinclair. In 5th gem5 Users' Workshop, June 2023. Download: Abstract
Analyzing the Benefits of More Complex Cache Replacement Policies in Modern GPU LLCs. Jarvis Jia and Matthew D. Sinclair. In 5th gem5 Users' Workshop, June 2023. Download: Abstract
Improving the Speed of gem5’s GPU Regression Tests. James Braun and Matthew D. Sinclair. In 5th gem5 Users' Workshop, June 2023. Download: Abstract
Integrating Per-Stream Stat Tracking into Accel-Sim. Shichen (Justin) Qiao, Xin (Cassie) Su, Matthew D. Sinclair. Preprint on ArXiV, April 2023. Download: PDF
Computation vs. Communication Scaling for Future Transformers on Future Hardware. Suchita Pati, Shaizeen Aga, Mahzabeen Islam, Nuwan Jayasena, Matthew D. Sinclair. Preprint on ArXiV, February 2023. Download: PDF
Improving the Scalability of GPU Synchronization Primitives. Preyesh Dalmia, Rohan Mahapatra, Jeremy Intan, Dan Negrut, and Matthew D. Sinclair. In Transactions on Parallel and Distributed Systems (TPDS), 2022. Download: PDF, Artifact, Code
Demystifying BERT: System Design Implications. Suchita Pati, Shaizeen Aga, Nuwan Jayasena, and Matthew D. Sinclair. In IEEE International Symposium on Workload Characterization (IISWC), November 2022. Download: PDF
Not All GPUs Are Created Equal: Characterizing Variability in Large-Scale, Accelerator-Rich Systems. Prasoon Sinha, Akhil Guliani, Rutwik Jain, Brandon Tran, Matthew D. Sinclair, and Shivaram Venkataraman. In the International Conference for High Performance Computing, Networking, Storage, and Analysis (SC), November 2022. Download: PDF, Presentation, Artifact
Not All GPUs Are Created Equal: Characterizing Variability in Large-Scale, Accelerator-Rich Systems. Prasoon Sinha, Akhil Guliani, Rutwik Jain, Brandon Tran, Matthew D. Sinclair, and Shivaram Venkataraman. Preprint on ArXiV, August 2022. Download: PDF
gem5 GPU Accuracy Profiler (GAP). Charles Jamieson, Anushka Chandrashekar, Ian McDougall, and Matthew D. Sinclair. In 4th gem5 Users' Workshop, June 2022. Download: Abstract, PDF, Presentation
A Case for Fine-grained Coherence Specialization in Heterogeneous Systems. Johnathan Alsop, Weon Taek Na, Samuel Grayson, Matthew D. Sinclair, and Sarita V. Adve. In Transactions on Architecture and Code Optimization (TACO), 2022. Download: PDF
Only Buffer When You Need To: Reducing On-chip GPU Traffic with Reconfigurable Local Atomic Buffers. Preyesh Dalmia, Rohan Mahapatra, and Matthew D. Sinclair. In 28th IEEE International Symposium on High-Performance Computer Architecture (HPCA 2022), February 2022. Download: PDF
DENNI: Distributed Neural Network Inference on Severely Resource Constrained Edge Devices. Rohit Sanu, Ryan J. Toepfer, Matthew D. Sinclair, and Henry Duwe III. In 40th IEEE International Performance Computing and Communications Conference (IPCCC), October 2021. Download: PDF
Reducing Synchronization Overhead for Persistent RNNs. Qinjun Jiang. Published in UW-Madison 2020-2021 Trewartha Papers, May 2021. Download: PDF
A Case for Fine-grain Coherence Specialization in Heterogeneous Systems. Johnathan Alsop, Weon Taek Na, Matthew D. Sinclair, Samuel Grayson, and Sarita V. Adve. Preprint on ArXiV, April 2021. Download: PDF
Demystifying BERT: Implications for Accelerator Design. Suchita Pati, Shaizeen Aga, Nuwan Jayasena, and Matthew D. Sinclair. Preprint on ArXiV, April 2021. Download: PDF
Improving GPU Utilization in ML Workloads Through Finer-Grained Synchronization. Reese Kuper, Suchita Pati, and Matthew D. Sinclair. At 3rd Young Architects Workshop (YArch), April 2021.
Enabling Reproducible and Agile Full-System Simulation. Bobby R. Bruce, Ayaz Akram, Hoa Nguyen, Kyle Roarty, Mahyar Samani, Marjan Fariborz, Trivikram Reddy, Matthew D. Sinclair, and Jason Lowe-Power. In 2021 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), March 2021. Best Paper Nominee! Details: Materials, PDF, Talk
Co-designing Power Management with Job Scheduling for Efficient Exascale Computing. Matthew D. Sinclair and Shivaram Venkataraman. At DOE ASCR Workshop on Reimagining Codesign, March 2021. Details: PDF
Leveraging open source simulators for HPC codesign. Bobby Bruce, Jason Lowe-Power, and Matthew D. Sinclair. At DOE ASCR Workshop on Reimagining Codesign, March 2021. Details: PDF
Deadline-Aware Offloading for High-Throughput Accelerators. Tsung Tai Yeh, Matthew D. Sinclair, Bradford M. Beckmann, and Timothy G. Rogers. In Proceedings of the 27th IEEE International Symposium on High-Performance Computer Architecture (HPCA), March 2021. Download: PDF, Presentation, Short Presentation
Deterministic Atomic Buffering. Yuan Hsi Chou*, Christopher Ng*, Shaylin Cattell, Jeremy Intan, Matthew D. Sinclair, Joseph Devietti, Timothy G. Rogers, and Tor M. Aamodt. In 53rd IEEE/ACM International Symposium on Microarchitecture (MICRO), October 2020. Download: PDF, Talk, Lightning Talk, Source Code * NOTE: First two authors contributed equally and are listed alphabetically
SeqPoint: Identifying Representative Iterations of Sequence-based Neural Networks. Suchita Pati, Shaizeen Aga, Matthew D. Sinclair, and Nuwan Jayasena. In 2020 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), August 2020. Download: PDF
Preprint on ArXiV, July 2020. Download: PDF
Specializing Coherence, Consistency, and Push/Pull for GPU Graph Analytics. Giordano Salvador, Johnathan Alsop, Wesley H. Darvin, Muhammad Huzaifa, Matthew D. Sinclair, and Sarita V. Adve. An extended abstract in 2020 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), August 2020.
Extended version on ArXiV: PDF
Inter-Kernel Reuse-Aware Thread Block Scheduling. Muhammad Huzaifa, Johnathan Alsop, Abdulrahman Mahmoud, Giordano Salvador, Matthew D. Sinclair, and Sarita V. Adve. In ACM Transactions on Architecture and Code Optimization (TACO). 17,3, Article 24 (August 2020), 27 pages. Download: PDF
Designing Efficient Barriers and Semaphores for Graphics Processing Units. Rohan Mahapatra. MS Project, August 2020. Download: PDF
The gem5 Simulator: Version 20.0+. Jason Lowe-Power, Abdul Mutaal Ahmad, Ayaz Akram, Mohammad Alian, Rico Amslinger, Matteo Andreozzi, Adria Armejach, Nils Asmussen, Srikant Bharadwaj, Gabe Black, Gedare Bloom, Bobby R. Bruce, Daniel Rodrigues Carvalho, Jeronimo Castrillon, Lizhong Chen, Nicolas Derumigny, Stephan Diestelhorst, Wendy Elsasser, Marjan Fariborz, Amin Farmahini-Farahani, Pouya Fotouhi, Ryan Gambord, Jayneel Gandhi, Dibakar Gope, Thomas Grass, Bagus Hanindhito, Andreas Hansson, Swapnil Haria, Austin Harris, Timothy Hayes, Adrian Herrera, Matthew Horsnell, Syed Ali Raza Jafri, Radhika Jagtap, Hanhwi Jang, Reiley Jeyapaul, Timothy M. Jones, Matthias Jung, Subash Kannoth, Hamidreza Khaleghzadeh, Yuetsu Kodama, Tushar Krishna, Tommaso Marinelli, Christian Menard, Andrea Mondelli, Tiago Muck, Omar Naji, Krishnendra Nathella, Hoa Nguyen, Nikos Nikoleris, Lena E. Olson, Marc Orr, Binh Pham, Pablo Prieto, Trivikram Reddy, Alec Roelke, Mahyar Samani, Andreas Sandberg, Javier Setoain, Boris Shingarov, Matthew D. Sinclair, Tuan Ta, Rahul Thakur, Giacomo Travaglini, Michael Upton, Nilay Vaish, Ilias Vougioukas, Zhengrong Wang, Norbert Wehn, Christian Weis, David A. Wood, Hongil Yoon, and Eder F. Zulian. Preprint on ArXiV, July 2020. Download: PDF
Modeling Modern GPU Applications in gem5. Kyle Roarty and Matthew D. Sinclair. In 3rd gem5 Users' Workshop, June 2020. Download: HTML, Presentation
Enabling Multi-GPU Support in gem5. Bobbi W. Yogatama, Matthew D. Sinclair, and Michael M. Swift. In 3rd gem5 Users' Workshop, June 2020. Download: HTML, Presentation
Independent Forward Progress of Work-groups. Alexandru Dutu, Matthew D. Sinclair, Bradford M. Beckmann, David A. Wood, and Marcus Chow. In 47th International Symposium on Computer Architecture (ISCA), May 2020. Download: PDF
Optimizing GPU Cache Policies for MI Workloads. Johnathan Alsop, Matthew D. Sinclair, Anthony Gutierrez, Srikant Bharadwaj, Xianwei Zhang, Bradford Beckmann, Alexandru Dutu, Onur Kayiran, Michael LeBeane, Brandon Potter, Sooraj Puthoor, and Tsung Tai Yeh. Short paper in 2019 IEEE International Symposium on Workload Characterization (IISWC), November 2019.
Extended version on ArXiV: PDF
Analyzing Machine Learning Workloads Using a Detailed GPU Simulator. Jonathan Lew, Deval Shah, Suchita Pati, Shaylin Cattell, Mengchi Zhang, Amruth Sandhupatla, Christopher Ng, Negar Goli, Matthew D. Sinclair, Timothy G. Rogers, and Tor Aamodt. Extended abstract and poster in 2019 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), March 2019.
Extended version on ArXiV: PDF
Exploring GPU Architectural Optimizations for RNNs. Suchita Pati. At 1st Young Architects Workshop (YArch), February 2019. Download: PDF
Spandex: A Generalized Interface for Flexible Heterogeneous Coherence. Johnathan Alsop, Matthew D. Sinclair, and Sarita V. Adve. In 45th International Symposium on Computer Architecture (ISCA), June 2018. Download: BibTeX, PDF
Lost in Abstraction: Pitfalls of Analyzing GPUs at the Intermediate Language Level. Anthony Gutierrez, Bradford Beckmann, Alexandru Dutu, Joseph Gross, John Kalamatianos, Onur Kayiran, Michael LeBeane, Matthew Poremba, Brandon Potter, Sooraj Puthoor, Matthew D. Sinclair, Mark Wyse, Jieming Yin, Xianwei Zhang, Akshay Jain, and Timothy G. Rogers. In 24th IEEE International Symposium on High Performance Computer Architecture (HPCA), February 2018. Download: BibTeX, PDF
HPVM: Heterogeneous Parallel Virtual Machine. Maria Kotsifakou*, Prakalp Srivastava*, Matthew D. Sinclair, Rakesh Komuravelli, Vikram Adve, and Sarita Adve. In 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP), February 2018. Download: BibTeX, PDF * NOTE: First two authors contributed equally and are listed alphabetically
HeteroSync: A Benchmark Suite for Fine-Grained Synchronization on Tightly Coupled GPUs. Matthew D. Sinclair, Johnathan Alsop, and Sarita V. Adve, in the IEEE International Symposium on Workload Characterization (IISWC), October 2017. Download: BibTeX, PDF, Presentation
Chasing Away RAts: Semantics and Evaluation for Relaxed Atomics on Heterogeneous Systems. Matthew D. Sinclair, Johnathan Alsop, and Sarita V. Adve, in the 44th International Symposium on Computer Architecture (ISCA), June 2017. Download: BibTeX, PDF, Presentation, Lightning Presentation
POSTER: hVISC: A Portable Virtual Instruction Set for Heterogeneous Parallel Systems. Prakalp Srivastava, Maria Kotsifakou, Matthew D. Sinclair, Rakesh Komuravelli, Vikram Adve, and Sarita Adve. In the 25th International Conference on Parallel Architecture and Compilation (PACT), September 2016. Download: BibTeX, PDF
GSI: A GPU Stall Inspector to Characterize the Source of Memory Stalls for Tightly Coupled GPUs. Johnathan Alsop, Matthew D. Sinclair, Rakesh Komuravelli, and Sarita V. Adve. In IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), April 2016. Download: BibTeX, PDF
Efficient GPU Synchronization without Scopes: Saying No to Complex Consistency Models. Matthew D. Sinclair, Johnathan Alsop, and Sarita V. Adve, in the 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), December 2015. Selected as an IEEE MICRO Top Picks Honorable Mention from the 2015 Computer Architecture Conferences! Download: BibTeX, PDF, Presentation
Stash: Have Your Scratchpad and Cache it Too. Rakesh Komuravelli*, Matthew D. Sinclair*, Johnathan Alsop, Muhammad Huzaifa, Maria Kotsifakou, Prakalp Srivastava, Sarita V. Adve, and Vikram Adve. In 42nd International Symposium on Computer Architecture (ISCA), June 2015. Selected as an IEEE MICRO Top Picks Honorable Mention from the 2015 Computer Architecture Conferences! Download: BibTeX, PDF, Presentation * NOTE: First two authors contributed equally and are listed alphabetically
Virtual Instruction Set Computing for Heterogeneous Systems. Vikram Adve, Sarita Adve, Rakesh Komuravelli, Matthew D. Sinclair, and Prakalp Srivastava. In 4th USENIX Workshop on Hot Topics in Parallelism (HotPar), June 2012. Download: PDF
Porting CMP Benchmarks to GPUs. Matthew Sinclair, Henry Duwe, and Karthikeyan Sankaralingam. Technical Report TR-1693, Department of Computer Sciences, The University of Wisconsin-Madison, 2011. Download: BibTeX, PDF
Challenge Benchmarks that Must be Conquered to Sustain the GPU Revolution. Emily Blem, Matthew Sinclair, and Karthikeyan Sankaralingam. In Proceedings of 4th Annual Workshop on Emerging Applications and Many-Core Architecture (EAMA), June 2011. Download: Details, BibTeX, PDF, Presentation
Enabling New Uses for GPUs. Matthew D. Sinclair. Masters Thesis, University of Wisconsin-Madison, May 2011. Download: BibTeX, PDF
Sampling + DMR: Practical and Low-overhead Permanent Fault Detection. Shuou Nomura, Matthew D. Sinclair, Chen-han Ho, Venkatraman Govindaraju, Marc de Kruijf, and Karthikeyan Sankaralingam. In Proceedings of 38th International Symposium on Computer Architecture (ISCA 2011). Download: Details, BibTeX, PDF
GRASSY: Leveraging GPU Texture Units for Asteroseismic Data Analysis. Karthikeyan Sankaralingam, Richard Townsend, and Matthew D. Sinclair. In Proceedings of GPU Technology Conference (GTC), 2010. Download: Details, BibTeX, Presentation Video
Leveraging the untapped computation power of GPUs: fast spectral synthesis using texture interpolation. Richard Townsend, Karthikeyan Sankaralingam, and Matthew D. Sinclair. Addison-Wesley, 2010. Download: Details, BibTeX
Bitonic-MapReduce: Optimization of MapReduce on the Cell B.E. Architecture with a Bitonic Sort. Matthew D. Sinclair. Published in UW-Madison 2009-2010 Trewartha Papers. Download: PDF, BibTeX