Sponsored by IEEE and ACM
The International Conference for High Performance Computing, Networking, Storage and Analysis

SCHEDULE: NOV 16-21, 2014

Efficient Shared-Memory Implementation of High-Performance Conjugate Gradient Benchmark and Its Application to Unstructured Matrices

SESSION: Sparse Solvers

EVENT TYPE: Papers

TIME: 2:30PM - 3:00PM

SESSION CHAIR: Anne C. Elster

AUTHOR(S): Jongsoo Park, Mikhail Smelyanskiy, Karthikeyan Vaidyanathan, Alexander Heinecke, Dhiraj D. Kalamkar, Xing Liu, Md. Mostofa Ali Patwary, Yutong Lu, Pradeep Dubey

ROOM: 391-92

ABSTRACT:

High-performance sparse linear solvers, the backbone of modern HPC, face many challenges on upcoming extreme-scale architectures. High Performance Linpack (HPL), the widely recognized benchmark for ranking such systems, does not represent the challenges inherent to these solvers. To address this shortcoming, a new sparse benchmark, the High Performance Conjugate Gradient benchmark (HPCG), has recently been proposed. This is the first paper that analyzes and optimizes HPCG on two modern multi- and many-core IA-based architectures: Xeon and Xeon Phi. We explore a number of algorithmic and performance optimizations. By taking advantage of salient architectural features of these two architectures, our implementation sustains 75% and 67% of their achievable bandwidth, respectively. We further show that our optimizations generally apply to a wide range of matrices, on which we achieve 72% and 65% of achievable bandwidth. Lastly, we study the multi-node scalability of HPCG and the tradeoff between the number of parallel domains, convergence, and single-node parallel performance.

Chair/Author Details:

Anne C. Elster (Chair) - Norwegian University of Science & Technology / University of Texas at Austin

Jongsoo Park - Intel Corporation

Mikhail Smelyanskiy - Intel Corporation

Karthikeyan Vaidyanathan - Intel Corporation

Alexander Heinecke - Intel Corporation

Dhiraj D. Kalamkar - Intel Corporation

Xing Liu - Georgia Institute of Technology

Md. Mostofa Ali Patwary - Intel Corporation

Yutong Lu - National University of Defense Technology, China

Pradeep Dubey - Intel Corporation


Paper provided by the ACM Digital Library

Paper also available from IEEE Computer Society