Wisconsin Publications

Multifacet's general execution-driven multiprocessor simulator (GEMS) toolset
Milo M. K. Martin, Daniel J. Sorin, Bradford M. Beckmann, Michael R. Marty, Min Xu, Alaa R. Alameldeen, Kevin E. Moore, Mark D. Hill, David A. Wood
ACM SIGARCH Computer Architecture News - Special issue: dasCMP'05, 2005
Local copy: pdf
Web Site: http://www.cs.wisc.edu/gems
ISCA Tutorial Slides: ppt

External Publications

[1] Pablo Abad, Valentin Puente, José Angel Gregorio, and Pablo Prieto. Rotary router: an efficient architecture for CMP interconnection networks. In Proc. of the 34th Annual Intnl. Symp. on Computer Architecture, pages 116-125, June 2007.

[2] Pablo Abad, Valentin Puente, and Jose Angel Gregorio. Reducing the Interconnection Network Cost of Chip Multiprocessors. In NOCS '08: Proceedings of the Second ACM/IEEE International Symposium on Networks-on-Chip (NOCS 2008), pages 183-192, Washington, DC, USA, 2008. IEEE Computer Society.

[3] Rajeev Balasubramonian, Naveen Muralimanohar, Karthik Ramani, Liqun Cheng, and John B. Carter. Leveraging Wire Properties at the Microarchitecture Level. pages 40-52, December 2006.

[4] Lee Baugh, Naveen Neelakantam, and Craig Zilles. Using Hardware Memory Protection to Build a High-Performance, Strongly-Atomic Hybrid Transactional Memory. In Proc. of the 35th Annual Intnl. Symp. on Computer Architecture, June 2008.

[5] Abhishek Bhattacharjee, Gilberto Contreras, and Margaret Martonosi. Full-system chip multiprocessor power evaluations using FPGA-based emulation. In ISLPED '08: Proceeding of the thirteenth international symposium on Low power electronics and design, pages 335-340, New York, NY, USA, 2008. ACM.

[6] Colin Blundell, Joe Devietti, E Christopher Lewis, and Milo M.K. Martin. Making the fast case common and the uncommon case simple in unbounded transactional memory. In Proc. of the 34th Annual Intnl. Symp. on Computer Architecture, June 2007.

[7] Jichuan Chang and Gurindar S. Sohi. Cooperative Caching for Chip Multiprocessors. In Proc. of the 33rd Annual Intnl. Symp. on Computer Architecture, June 2006.

[8] Jichuan Chang and Gurindar S. Sohi. Cooperative cache partitioning for chip multiprocessors. In Proc. of the 21st Intnl. Conf. on Supercomputing, pages 242-252, June 2007.

[9] Kaiyu Chen, Sharad Malik, and Priyadarsan Patra. Runtime Validation of Memory Ordering Using Constraint Graph Checking. In Proc. of the 14th IEEE Symp. on High-Performance Computer Architecture, February 2008.

[10] Liqun Cheng and John B. Carter. Extending CC-NUMA systems to support write update optimizations. In Proc. of SC2008, pages 1-12, November 2008.

[11] Liqun Cheng, Naveen Muralimanohar, Karthik Ramani, Rajeev Balasubramonian, and John B. Carter. Interconnect-Aware Coherence Protocols for Chip Multiprocessors. In Proc. of the 33rd Annual Intnl. Symp. on Computer Architecture, June 2006.

[12] Derek Chiou, Dam Sunwoo, Hari Angepat, Joonsoo Kim, Nikhil A. Patil, William Reinhart, and D. Eric Johnson. Parallelizing computer system simulators. In Proc. of the Intnl. Parallel and Distributed Processing Symposium, pages 1-5, April 2008.

[13] Blas Cuesta, Antonio Robles, and Jose Duato. An Effective Starvation Avoidance Mechanism to Enhance the Token Coherence Protocol. In Proceedings of the 15th EUROMICRO International Conference on Parallel, Distributed and Network-Based Processing, 2007.

[14] John D. Davis, James Laudon, and Kunle Olukotun. Maximizing CMP Throughput with Mediocre Cores. In Proc. of the Intnl. Conf. on Parallel Architectures and Compilation Techniques, pages 51-62, September 2005.

[15] Andrew DeOrio, Adam Bauserman, and Valeria Bertacco. Post-silicon verification for cache coherence. In Computer Design, 2008. ICCD 2008. IEEE International Conference on, pages 348-355, Lake Tahoe, CA, October 2008.

[16] Dave Dice, Maurice Herlihy, Doug Lea, Yossi Lev, Victor Luchangco, Wayne Mesard, Mark Moir, Kevin Moore, and Dan Nussbaum. Applications of the Adaptive Transactional Memory Test Platform. In Proc. of the 3rd ACM SIGPLAN Workshop on Languages, Compilers, and Hardware Support for Transactional Computing, February 2008.

[17] Ricardo Fernandez-Pascual, Jose M. Garcia, Manuel E. Acacio, and Jose Duato. A Low Overhead Fault Tolerant Coherence Protocol for CMP Architectures. In Proc. of the 13th IEEE Symp. on High-Performance Computer Architecture, February 2007.

[18] Sevin Fide and Stephen Jenks. Architecture optimizations for synchronization and communication on chip multiprocessors. In Proc. of the Intnl. Parallel and Distributed Processing Symposium, pages 1-8, April 2008.

[19] Sevin Fide and Stephen Jenks. Proactive Use of Shared L3 Caches to Enhance Cache Communications in Multi-Core Processors. IEEE Comput. Archit. Lett., 7(2):57-60, 2008.

[20] Wenyin Fu and Katherine Compton. A Simulation Platform for Reconfigurable Computing Research. In Field Programmable Logic and Applications, 2006. FPL '06. International Conference on, pages 1-7, Madrid, August 2006.

[21] Venkatraman Govindaraju, Peter Djeu, Karthikeyan Sankaralingam, Mary Vernon, and William R. Mark. Toward a multicore architecture for real-time ray-tracing. Proc. of the 41st Annual IEEE/ACM International Symp. on Microarchitecture, pages 176-187, November 2008.

[22] Shantanu Gupta, Florin Sultan, Srihari Cadambi, Franjo Ivancic, and Martin Roetteler. RaceTM: detecting data races using transactional memory. In Proc. of the 20th ACM Symp. on Parallel Algorithms and Architectures, pages 104-106, June 2008.

[23] Enric Herrero, José González, and Ramon Canal. Distributed cooperative caching. In Proc. of the Intnl. Conf. on Parallel Architectures and Compilation Techniques, pages 134-143, October 2008.

[24] Hemayet Hossain, Sandhya Dwarkadas, and Michael C. Huang. Improving support for locality and fine-grain sharing in chip multiprocessors. In Proc. of the Intnl. Conf. on Parallel Architectures and Compilation Techniques, pages 155-165, October 2008.

[25] Natalie Enright Jerger, Li-Shiuan Peh, and Mikko Lipasti. Virtual Circuit Tree Multicasting: A Case for On-Chip Hardware Multicast Support. In Proc. of the 35th Annual Intnl. Symp. on Computer Architecture, pages 229-240, June 2008.

[26] Martha Mercaldi Kim, Mojtaba Mehrara, Mark Oskin, and Todd Austin. Architectural implications of brick and mortar silicon manufacturing. In Proc. of the 34th Annual Intnl. Symp. on Computer Architecture, pages 244-253, June 2007.

[27] Manhee Lee, Minseon Ahn, and Eun Jung Kim. I2SEMS: Interconnects-Independent Security Enhanced Shared Memory Multiprocessor Systems. In Proc. of the Intnl. Conf. on Parallel Architectures and Compilation Techniques, September 2007.

[28] Yosef Lev, Mark Moir, and Dan Nussbaum. PhTM: Phased Transactional Memory. In Proc. of the 2nd ACM SIGPLAN Workshop on Languages, Compilers, and Hardware Support for Transactional Computing, August 2007.

[29] Yossi Lev and Jan-Willem Maessen. Split Hardware Transactions: True nesting of transactions using best-effort hardware transactional memory. In Proc. of the 15th ACM SIGPLAN Symp. on Principles and Practice of Parallel Programming (PPoPP), February 2008.

[30] Sean Leventhal and Manoj Franklin. Perceptron Based Consumer Prediction in Shared-Memory Multiprocessors. In Proceedings of the 24th International Conference on Computer Design, October 2006.

[31] Man-Lap Li, Pradeep Ramachandran, Swarup Sahoo, Sarita Adve, Vikram Adve, and Yuanyuan Zhou. Understanding the Propagation of Hard Errors to Software and Implications for Resilient System Design. In Proc. of the 13th Intnl. Conf. on Architectural Support for Programming Languages and Operating Systems, March 2008.

[32] Tong Li, Alvin R. Lebeck, and Daniel J. Sorin. Spin Detection Hardware for Improved Management of Multithreaded Systems. IEEE Transactions on Parallel and Distributed Systems, 17(6), June 2006.

[33] Albert Meixner. Low-cost methods for error detection in multi-core systems. PhD thesis, Durham, NC, USA, 2008. Adviser-Daniel Sorin.

[34] Albert Meixner and Daniel J. Sorin. Error Detection via Online Checking of Cache Coherence with Token Coherence Signatures. In Proc. of the 13th IEEE Symp. on High-Performance Computer Architecture, pages 145-156, February 2007.

[35] Albert Meixner and Daniel J. Sorin. Dynamic Verification of Sequential Consistency. In Proc. of the 32nd Annual Intnl. Symp. on Computer Architecture, June 2005.

[36] Albert Meixner and Daniel J. Sorin. Dynamic Verification of Memory Consistency in Cache-Coherent Multithreaded Computer Architectures. In Proc. of the Intnl. Conf. on Dependable Systems and Networks, June 2006.

[37] Javier Merino, Valentín Puente, Pablo Prieto, and José Ángel Gregorio. SP-NUCA: a cost effective dynamic non-uniform cache architecture. SIGARCH Comput. Archit. News, 36(2):64-71, 2008.

[38] Mark Moir, Kevin Moore, and Dan Nussbaum. The Adaptive Transactional Memory Test Platform: A Tool for Experimenting with Transactional Code for Rock. In Proc. of the 3rd ACM SIGPLAN Workshop on Languages, Compilers, and Hardware Support for Transactional Computing, February 2008.

[39] Vivek Pandey, Weihang Jiang, Yuanyuan Zhou, and Ricardo Bianchini. DMA-aware memory energy management. Proceedings of the 12th International Symposium on High Performance Computer Architecture (HPCA), pages 133-144, February 2006.

[40] Soyeon Park, Weihang Jiang, Yuanyuan Zhou, and Sarita Adve. Managing energy-performance tradeoffs for multithreaded applications on multiprocessor architectures. In Proc. of the 2007 ACM Sigmetrics Conf. on Measurement and Modeling of Computer Systems, pages 169-180, June 2007.

[41] Avadh Patel and Kanad Ghose. Energy-efficient MESI cache coherence with pro-active snoop filtering for multicore microprocessors. In ISLPED '08: Proceeding of the thirteenth international symposium on Low power electronics and design, pages 247-252, New York, NY, USA, 2008. ACM.

[42] Nauman Rafique, Won-Taek Lim, and Mithuna Thottethodi. Architectural Support for Operating System-Driven CMP Cache Management. In Proc. of the Intnl. Conf. on Parallel Architectures and Compilation Techniques, September 2006.

[43] Nauman Rafique, Won-Taek Lim, and Mithuna Thottethodi. Effective Management of DRAM Bandwidth in Multicore Processors. In Proc. of the Intnl. Conf. on Parallel Architectures and Compilation Techniques, September 2007.

[44] Arun Raghavan, Colin Blundell, and Milo M. K. Martin. Token Tenure: PATCHing Token Counting Using Directory-Based Cache Coherence. In Proc. of the 41st Annual IEEE/ACM International Symp. on Microarchitecture, November 2008.

[45] Alberto Ros, Manuel E. Acacio, and Jose M. Garcia. DiCo-CMP: Efficient cache coherency in tiled CMP architectures. In Proc. of the 22nd Intnl Parallel & Distributed Processing Symposium, pages 1-11, April 2008.

[46] Guo Rui, Hong An, Ruiling Dou, Ming Cong, Yaobin Wang, and Qi Li. LogSPoTM: A Scalable Thread Level Speculation Model Based on Transactional Memory. In 13th Asia-Pacific Computer Systems Architecture Conference, August 2008.

[47] Resit Sendag, Ayse Yilmazer, Joshua J. Yi, and Augustus K. Uht. Quantifying and reducing the effects of wrong-path memory references in cache-coherent multiprocessor systems. In Proc. of the Intnl. Parallel and Distributed Processing Symposium, April 2006.

[48] Resit Sendag, Ayse Yilmazer, Joshua J. Yi, and Augustus K. Uht. The impact of wrong-path memory references in cache-coherent multiprocessor systems. J. Parallel Distrib. Comput., 67(12):1256-1269, 2007.

[49] Arrvindh Shriraman and Sandhya Dwarkadas. Refereeing Conflicts in Transactional Memory Systems. Technical Report 939, University of Rochester, 2008.

[50] Arrvindh Shriraman, Sandhya Dwarkadas, and Michael L. Scott. Flexible Decoupled Transactional Memory Support. In Proc. of the 35th Annual Intnl. Symp. on Computer Architecture, June 2008.

[51] Arrvindh Shriraman, Virendra J. Marathe, Sandhya Dwarkadas, Michael L. Scott, David Eisenstat, Christopher Heriot, William N. Scherer III, and Michael F. Spear. Hardware Acceleration of Software Transactional Memory. In Proc. of the 1st ACM SIGPLAN Workshop on Languages, Compilers, and Hardware Support for Transactional Computing, June 2006.

[52] Arrvindh Shriraman, Michael F. Spear, Hemayet Hossain, Virendra J. Marathe, Sandhya Dwarkadas, and Michael L. Scott. An integrated hardware-software approach to flexible transactional memory. In Proc. of the 34th Annual Intnl. Symp. on Computer Architecture, June 2007.

[53] Michael F Spear, Arrvindh Shriraman, Luke Dalessandro, Sandhya Dwarkadas, and Michael L. Scott. Nonblocking transactions without indirection using alert-on-update. In Proc. of the 19th ACM Symp. on Parallel Algorithms and Architectures, pages 210-220, March 2007.

[54] Fuad Tabba, Cong Wang, and James R. Goodman. NZTM: Nonblocking Zero-Indirection Transactional Memory. In Proc. of the 2nd ACM SIGPLAN Workshop on Languages, Compilers, and Hardware Support for Transactional Computing, August 2007.

[55] Enrique Vallejo, Tim Harris, Adrian Cristal, Osman Unsal, and Mateo Valero. Hybrid Transactional Memory to Accelerate Safe Lock-Based Transactions. In Proc. of the 3rd ACM SIGPLAN Workshop on Languages, Compilers, and Hardware Support for Transactional Computing, February 2008.

[56] Ilya Wagner and Valeria Bertacco. MCjammer: adaptive verification for multi-core designs. In DATE '08: Proceedings of the conference on Design, automation and test in Europe, pages 670-675, New York, NY, USA, 2008. ACM.

[57] David Wang, Brinda Ganesh, Nuengwong Tuaycharoen, Kathleen Baynes, Aamer Jaleel, and Bruce Jacob. DRAMsim: a memory system simulator. SIGARCH Comput. Archit. News, 33(4):100-107, 2005.

[58] Carole-Jean Wu and Margaret Martonosi. A Comparison of Capacity Management Schemes for Shared CMP Caches. In Proceedings of the 7th Workshop on Duplicating, Deconstructing, and Debunking, June 2008.

[59] Tom Yeh, Petros Faloutsos, Sanjay Patel, and Glenn Reinman. ParallAX: An Architecture for Real-Time Physics. In Proc. of the 34th Annual Intnl. Symp. on Computer Architecture, June 2007.

[60] Richard M. Yoo and Hsien-Hsin S. Lee. Adaptive Transaction Scheduling for Transactional Memory Systems. In 20th ACM Symposium on Parallelism in Algorithms and Architectures, June 2008.

[61] Xuemei Zhao, Karl Sammut, and Fangpo He. Performance Evaluation of a Novel CMP Cache Structure for Hybrid Workloads. In PDCAT '07: Proceedings of the Eighth International Conference on Parallel and Distributed Computing, Applications and Technologies, pages 89-96, Washington, DC, USA, 2007. IEEE Computer Society.

[62] Man-Lap Li, Pradeep Ramachandran, Swarup Kumar Sahoo, Sarita V. Adve, Vikram Adve, and Yuanyuan Zhou. Trace-Based Microarchitecture-Level Diagnosis of Permanent Hardware Faults. Proc. of the Intnl. Conf. on Dependable Systems and Networks, June 2008.

[63] Swarup Kumar Sahoo, Man-Lap Li, Pradeep Ramachandran, Sarita V. Adve, Vikram Adve, and Yuanyuan Zhou. Using Likely Program Invariants to Detect Hardware Errors. Proc. of the Intnl. Conf. on Dependable Systems and Networks, June 2008.

[64] Man-Lap Li, Pradeep Ramachandran, Ulya R. Karpuzcu, Siva Kumar Sastry Hari, and Sarita V. Adve. Accurate Microarchitecture-Level Fault Modeling for Studying Hardware Faults. Proceedings of the 15th International Symposium on High Performance Computer Architecture (HPCA), February 2009.

[65] Doe Hyun Yoon and Mattan Erez. Memory Mapped ECC: Low-Cost Error Protection for Last Level Caches. Proceedings of the 36th International Symposium on Computer Architecture (ISCA), June 2009.

[66] Varun Jannepally and Sohum Sohoni. Fast Encryption and Authentication for Cache-to-Cache Transfers using GCM-AES. International Conference on Sensors, Security, Software, and Intelligent Systems, 2009.

[67] Salil Mohan Pant and Gregory T. Byrd. Extending concurrency of transactional memory programs by using value prediction. In Proceedings of the 6th ACM Conference on Computing Frontiers, 2009.

[68] J. Ruben Titos Gil, Manuel E. Acacio Sanchez, and Jose M. Garcia Carrasco. Characterization of Conflicts in Log-Based Transactional Memory (LogTM). In Proceedings of the 16th Euromicro Conference on Parallel, Distributed and Network-Based Processing (PDP 2008), 2008.

[69] Salil Mohan Pant and Gregory T. Byrd. Limited early value communication to improve performance of transactional memory. In Proceedings of the 23rd International Conference on Supercomputing, 2009.

[70] Ruben Titos, Manuel E. Acacio, and Jose M. Garcia. Speculation-based conflict resolution in hardware transactional memory. IEEE International Symposium on Parallel & Distributed Processing, 2009.

[71] Shantanu Gupta, Florin Sultan, Srihari Cadambi, Franjo Ivancic, and Martin Roetteler. Using hardware transactional memory for data race detection. IEEE International Symposium on Parallel & Distributed Processing, 2009.

[72] Marc Lupon, Grigorios Magklis, and Antonio Gonzalez. Version management alternatives for hardware transactional memory. In Proceedings of the 9th Workshop on Memory Performance: Dealing with Applications, Systems and Architecture, 2008.

[73] Marc Lupon, Grigorios Magklis, and Antonio Gonzalez. FASTM: A Log-based Hardware Transactional Memory with Fast Abort Recovery. In Proceedings of the 18th International Conference on Parallel Architectures and Compilation Techniques (PACT), 2009.

[74] Ricardo Quislant, Eladio Gutierrez, and Oscar Plata. Improving Signatures by Locality Exploitation for Transactional Memory. In Proceedings of the 18th International Conference on Parallel Architectures and Compilation Techniques (PACT), 2009.

[75] Hashem H. Najaf-abadi, Niket K. Choudhary, and Eric Rotenberg. Core-Selectability in Chip Multiprocessors. In Proceedings of the 18th International Conference on Parallel Architectures and Compilation Techniques (PACT), 2009.

[76] Hemayet Hossain, Sandhya Dwarkadas, and Michael C. Huang. DDCache: Decoupled and Delegable Cache Data and Metadata. In Proceedings of the 18th International Conference on Parallel Architectures and Compilation Techniques (PACT), 2009.

[77] Bogdan F. Romanescu, Alvin R. Lebeck, Daniel J. Sorin, and Anne Bracy. UNified Instruction/Translation/Data (UNITD) Coherence: One Protocol to Rule Them All. In Proceedings of the 16th IEEE International Symposium on High-Performance Computer Architecture (HPCA), January 2010.

[78] Bogdan F. Romanescu, Alvin R. Lebeck, and Daniel J. Sorin. Specifying and Dynamically Verifying Address Translation-Aware Memory Consistency. Proceedings of the 15th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2010), March 2010.

[79] Wenyin Fu and Katherine Compton. Scheduling Intervals for Reconfigurable Computing. In IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM), 2008.

[80] Wenyin Fu and Katherine Compton. Active Kernel Monitoring to Combat Scheduler Gaming in Reconfigurable Computing Systems. In 18th International Conference on Field Programmable Logic and Applications (FPL), 2008.

[81] Wenyin Fu and Katherine Compton. Balanced allocation of compute time in hardware-accelerated systems. In International Conference on Field Programmable Technology, Taipei, 2008.

[82] Philip Garcia and Katherine Compton. A reconfigurable hardware interface for a modern computing system. In IEEE Symposium on Field-Programmable Custom Computing Machines, 2007.

[83] Philip Garcia and Katherine Compton. Kernel sharing on reconfigurable multiprocessor systems. In IEEE Conference on Field Programmable Technology, pages 225-232, 2008.

[84] Philip Garcia and Katherine Compton. Shared memory cache organizations for reconfigurable computing systems (short paper). In IEEE Symposium on Field-Programmable Custom Computing Machines, pages 239-242, 2009.

[85] K. Rupnow, W. Fu, and K. Compton. Block, Drop or Roll(back): Alternative Preemption Methods for RH Multi-Tasking. IEEE Symposium on Field-Programmable Custom Computing Machines, April 2009.

[86] K. Rupnow, J. Adriaens, W. Fu, and K. Compton. Accurately Evaluating Application Performance in Simulated Hybrid Multi-Tasking Systems. Accepted for publication at ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2010.




The Multifacet GEMS Development Team,
Page last modified: Friday, 05-Feb-2010 16:25:09 CST