TIME
| PRESENTATION
| SPEAKER
| LOCATION
| PLANNER
|
10:30AM - 11:00AM |
Lattice QCD with Domain Decomposition on Intel(R) Xeon Phi(TM) Co-Processors |
Simon Heybrock, Balint Joo, Dhiraj D. Kalamkar, Mikhail Smelyanskiy, Karthikeyan Vaidyanathan, Tilo Wettig, Pradeep Dubey |
393-94-95 |
 |
10:30AM - 11:00AM |
Fence Scoping |
Changhui Lin, Vijay Nagarajan, Rajiv Gupta |
391-92 |
 |
10:30AM - 11:00AM |
CYPRESS: Combining Static and Dynamic Analysis for Top-Down Communication Trace Compression |
Jidong Zhai, Jianfei Hu, Xiongchao Tang, Xiaosong Ma, Wenguang Chen |
388-89-90 |
 |
11:00AM - 11:30AM |
Mapping to Irregular Torus Topologies and Other Techniques for Petascale Biomolecular Simulation |
James C. Phillips, Yanhua Sun, Nikhil Jain, Eric J. Bohm, Laxmikant V. Kale |
393-94-95 |
 |
11:00AM - 11:30AM |
Recycled Error Bits: Energy-Efficient Architectural Support for Floating Point Accuracy |
Ralph Nathan, Bryan Anthonio, Shih-Lien L. Lu, Helia Naeimi, Daniel J. Sorin, Xiaobai Sun |
391-92 |
 |
11:00AM - 11:30AM |
Lightweight Distributed Metric Service: A Scalable Infrastructure for Continuous Monitoring of Large Scale Computing Systems and Applications |
Anthony Agelastos, Benjamin Allan, Jim Brandt, Paul Cassella, Jeremy Enos, Joshi Fullop, Ann Gentile, Steve Monk, Nichamon Naksinehaboon, Jeff Ogden, Mahesh Rajan, Michael Showerman, Joel Stevenson, Narate Taerat, Tom Tucker |
388-89-90 |
 |
11:30AM - 12:00PM |
A Volume Integral Equation Stokes Solver for Problems with Variable Coefficients |
Dhairya Malhotra, Amir Gholami, George Biros |
393-94-95 |
 |
11:30AM - 12:00PM |
Managing DRAM Latency Divergence in Irregular GPGPU Applications |
Niladrish Chatterjee, Mike O'Connor, Gabriel H. Loh, Nuwan Jayasena, Rajeev Balasubramonian |
391-92 |
 |
11:30AM - 12:00PM |
Dissecting On-node Memory Access Performance: A Semantic Approach |
Alfredo Gimenez, Todd Gamblin, Barry Rountree, Abhinav Bhatele, Ilir Jusufi, Peer-Timo Bremer, Bernd Hamann |
388-89-90 |
 |
1:30PM - 2:00PM |
Practical Symbolic Race Checking of GPU Programs |
Peng Li, Guodong Li, Ganesh Gopalakrishnan |
388-89-90 |
 |
1:30PM - 2:00PM |
Best Practices and Lessons Learned from Deploying and Operating Large-Scale Data-Centric Parallel File Systems |
Sarp Oral, James Simmons, Jason Hill, Dustin Leverman, Feiyi Wang, Matthew Ezell, Ross Miller, Douglas Fuller, Raghul Gunasekaran, Youngjae Kim, Saurabh Gupta, Devesh Tiwari, Sudharshan Vazhkudai, James H. Rogers, David Dillow, Arthur S. Bland, Galen M. Shipman |
393-94-95 |
 |
1:30PM - 2:00PM |
High-productivity Framework on GPU-rich Supercomputers for Operational Weather Prediction Code ASUCA |
Takashi Shimokawabe, Takayuki Aoki, Naoyuki Onodera |
391-92 |
 |
2:00PM - 2:30PM |
Scalable Kernel Fusion for Memory-Bound GPU Applications |
Mohamed Wahib, Naoya Maruyama |
388-89-90 |
 |
2:00PM - 2:30PM |
A User-Friendly Approach for Tuning Parallel File Operations |
Robert McLay, Doug James, Si Liu, John Cazes, William Barth |
393-94-95 |
 |
2:00PM - 2:30PM |
Pipelining Computational Stages of the Tomographic Reconstructor for Multi-Object Adaptive Optics on a Multi-GPU System |
Ali Charara, Hatem Ltaief, Damien Gratadour, David Keyes, Arnaud Sevin, Ahmad Abdelfattah, Eric Gendron, Carine Morel, Fabrice Vidal |
391-92 |
 |
2:30PM - 3:00PM |
A Unified Programming Model for Intra- and Inter-Node Offloading on Xeon Phi Clusters |
Matthias Noack, Florian Wende, Frank Cordes, Thomas Steinke |
388-89-90 |
 |
2:30PM - 3:00PM |
IndexFS: Scaling File System Metadata Performance with Stateless Caching and Bulk Insertion |
Kai Ren, Qing Zheng, Swapnil Patil, Garth Gibson |
393-94-95 |
 |
2:30PM - 3:00PM |
pTatin3D: High-performance Methods for Long-Term Lithospheric Dynamics |
Dave A. May, Jed Brown, Laetitia Le Pourhiet |
391-92 |
 |
3:30PM - 4:00PM |
Oil and Water Can Mix! An Integration of Polyhedral and AST-based Transformations |
Jun Shirako, Louis-Noel Pouchet, Vivek Sarkar |
393-94-95 |
 |
3:30PM - 4:00PM |
RAHTM: Routing Algorithm Aware Hierarchical Task Mapping |
Ahmed H. Abdel-Gawad, Mithuna Thottethodi, Abhinav Bhatele |
388-89-90 |
 |
3:30PM - 4:00PM |
A Computation- and Communication-Optimal Parallel Direct 3-Body Algorithm |
Penporn Koanantakool, Katherine Yelick |
391-92 |
 |
4:00PM - 4:30PM |
Compiler Techniques for Massively Scalable Implicit Task Parallelism |
Timothy G. Armstrong, Justin M. Wozniak, Michael Wilde, Ian Foster |
393-94-95 |
 |
4:00PM - 4:30PM |
Maximizing Throughput on a Dragonfly Network |
Nikhil Jain, Abhinav Bhatele, Xiang Ni, Nicholas J. Wright, Laxmikant V. Kale |
388-89-90 |
 |
4:00PM - 4:30PM |
A Communication-Optimal Framework for Contracting Distributed Tensors |
Samyam Rajbhandari, Akshay Nikam, Pai-Wei Lai, Kevin Stock, Sriram Krishnamoorthy, P. Sadayappan |
391-92 |
 |
4:30PM - 5:00PM |
MSL: A Synthesis Enabled Language for Distributed Implementations |
Zhilei Xu, Shoaib Kamil, Armando Solar-Lezama |
393-94-95 |
 |
4:30PM - 5:00PM |
Slim Fly: A Cost Effective Low-Diameter Network Topology |
Maciej Besta, Torsten Hoefler |
388-89-90 |
 |
4:30PM - 5:00PM |
Fast Parallel Computation of Longest Common Prefixes |
Julian Shun |
391-92 |
 |
TIME
| PRESENTATION
| SPEAKER
| LOCATION
| PLANNER
|
10:30AM - 11:00AM |
Fast Iterative Graph Computation: A Path Centric Approach |
Pingpeng Yuan, Wenya Zhang, Changfeng Xie, Ling Liu, Hai Jin, Kisung Lee |
391-92 |
 |
10:30AM - 11:00AM |
Parallel De Bruijn Graph Construction and Traversal for De Novo Genome Assembly |
Evangelos Georganas, Aydin Buluc, Jarrod Chapman, Leonid Oliker, Daniel Rokhsar, Katherine Yelick |
388-89-90 |
 |
10:30AM - 11:00AM |
Nonblocking Epochs in MPI One-Sided Communication |
Judicael A. Zounmevo, Xin Zhao, Pavan Balaji, William Gropp, Ahmad Afsahi |
393-94-95 |
 |
11:00AM - 11:30AM |
Efficient I/O and Storage of Adaptive Resolution Data |
Sidharth Kumar, John Edwards, Peer-Timo Bremer, Aaron Knoll, Cameron Christensen, Venkatram Vishwanath, Philip Carns, John A. Schmidt, Valerio Pascucci |
391-92 |
 |
11:00AM - 11:30AM |
Orion: Scaling Genomic Sequence Matching with Fine-Grained Parallelization |
Kanak Mahadik, Somali Chaterji, Bowen Zhou, Milind Kulkarni, Saurabh Bagchi |
388-89-90 |
 |
11:00AM - 11:30AM |
Enabling Efficient Multithreaded MPI Communication Through a Library-Based Implementation of MPI Endpoints |
Srinivas Sridharan, James Dinan, Dhiraj Kalamkar |
393-94-95 |
 |
11:30AM - 12:00PM |
An Image-Based Approach to Extreme Scale In Situ Visualization and Analysis |
James Ahrens, Sebastien Jourdain, Patrick O'Leary, John Patchett, David H. Rogers, Mark Petersen |
391-92 |
 |
11:30AM - 12:00PM |
Parallel Bayesian Network Structure Learning for Genome-Scale Gene Networks |
Sanchit Misra, Vasimuddin Md, Kiran Pamnany, Sriram P. Chockalingam, Yong Dong, Min Xie, Maneesha R. Aluru, Srinivas Aluru |
388-89-90 |
 |
11:30AM - 12:00PM |
MC-Checker: Detecting Memory Consistency Errors in MPI One-Sided Applications |
Zhezhe Chen, James Dinan, Zhen Tang, Pavan Balaji, Hua Zhong, Jun Wei, Tao Huang, Feng Qin |
393-94-95 |
 |
1:30PM - 2:00PM |
Scheduling Multi-Tenant Cloud Workloads on Accelerator-Based Systems |
Dipanjan Sengupta, Anshuman Goswami, Karsten Schwan, Krishna Pallavi |
391-92 |
 |
1:30PM - 2:00PM |
Faster Parallel Traversal of Scale Free Graphs at Extreme Scale with Vertex Delegates |
Roger Pearce, Maya Gokhale, Nancy M. Amato |
388-89-90 |
 |
1:30PM - 2:00PM |
Understanding Soft Error Resiliency of BlueGene/Q Compute Chip through Hardware Proton Irradiation and Software Fault Injection |
Chen-Yong Cher, Meeta S. Gupta, Pradip Bose, K. Paul Muller |
393-94-95 |
 |
2:00PM - 2:30PM |
Scaling MapReduce Vertically and Horizontally |
Ismail El-Helw, Rutger Hofman, Henri Bal |
391-92 |
 |
2:00PM - 2:30PM |
Pardicle: Parallel Approximate Density-Based Clustering |
Md. Mostofa Ali Patwary, Nadathur Satish, Narayanan Sundaram, Fredrik Manne, Salman Habib, Pradeep Dubey |
388-89-90 |
 |
2:00PM - 2:30PM |
Fail-in-Place Network Design: Interaction between Topology, Routing Algorithm and Failures |
Jens Domke, Torsten Hoefler, Satoshi Matsuoka |
393-94-95 |
 |
2:30PM - 3:00PM |
The DRIHM Project: A Flexible Approach to Integrate HPC, Grid and Cloud Resources for Hydro-Meteorological Research |
Daniele D'Agostino, Andrea Clematis, Antonella Galizia, Alfonso Quarati, Emanuele Danovaro, Luca Roverelli, Gabriele Zereik, Dieter Kranzlmüller, Michael Schiffers, Nils gentschen Felde, Christian Straube, Olivier Caumont, Evelyne Richard, Luis Garrote, Quillon Harpham, H.R.A. Jagers, Vladimir Dimitrijević, Ljiljana Dekić, Elisabetta Fiori, Fabio Delogu, Antonio Parodi |
391-92 |
 |
2:30PM - 3:00PM |
Scalable and High Performance Betweenness Centrality on the GPU |
Adam T. McLaughlin, David A. Bader |
388-89-90 |
 |
2:30PM - 3:00PM |
Correctness Field Testing of Production and Decommissioned High Performance Computing Platforms at Los Alamos National Laboratory |
Sarah E. Michalak, William N. Rust, John T. Daly, Andrew J. DuBois, David H. DuBois |
393-94-95 |
 |
3:30PM - 4:00PM |
Omnisc'IO: A Grammar-Based Approach to Spatial and Temporal I/O Patterns Prediction |
Matthieu Dorier, Shadi Ibrahim, Gabriel Antoniu, Robert Ross |
391-92 |
 |
3:30PM - 4:00PM |
Metascalable Quantum Molecular Dynamics Simulations of Hydrogen-on-Demand |
Ken-ichi Nomura, Rajiv K. Kalia, Aiichiro Nakano, Priya Vashishta, Kohei Shimamura, Fuyuki Shimojo, Manaschai Kunaseth, Paul C. Messina, Nichols A. Romero |
388-89-90 |
 |
3:30PM - 4:00PM |
Quantitatively Modeling Application Resiliency with the Data Vulnerability Factor |
Li Yu, Dong Li, Sparsh Mittal, Jeffrey S. Vetter |
393-94-95 |
 |
4:00PM - 4:30PM |
Two-Choice Randomized Dynamic I/O Scheduler for Object Storage Systems |
Dong Dai, Yong Chen, Dries Kimpe, Robert Ross |
391-92 |
 |
4:00PM - 4:30PM |
Efficient Implementation of Many-Body Quantum Chemical Methods on the Intel Xeon Phi Coprocessor |
Edoardo Apra, Michael Klemm, Karol Kowalski |
388-89-90 |
 |
4:00PM - 4:30PM |
A System Software Approach to Proactive Memory-Error Avoidance |
Carlos H. A. Costa, Yoonho Park, Bryan S. Rosenburg, Chen-Yong Cher, Kyung Dong Ryu |
393-94-95 |
 |
4:30PM - 5:00PM |
Parallel Programming with Migratable Objects: Charm++ in Practice |
Bilge Acun, Abhishek Gupta, Nikhil Jain, Akhil Langer, Harshitha Menon, Eric Mikida, Xiang Ni, Michael Robson, Yanhua Sun, Ehsan Totoni, Lukasz Wesolowski, Laxmikant V. Kale |
391-92 |
 |
4:30PM - 5:00PM |
Optimized Scheduling Strategies for Hybrid Density Functional Theory Electronic Structure Calculations |
William DF Dawson, Francois Gygi |
388-89-90 |
 |
4:30PM - 5:00PM |
Fault-Tolerant Dynamic Task Graph Scheduling |
Mehmet Can Kurt, Sriram Krishnamoorthy, Kunal Agrawal, Gagan Agrawal |
393-94-95 |
 |
TIME
| PRESENTATION
| SPEAKER
| LOCATION
| PLANNER
|
10:30AM - 11:00AM |
NUMARCK: Machine Learning Algorithm for Resiliency and Checkpointing |
Zhengzhang Chen, Seung Woo Son, William Hendrix, Ankit Agrawal, Wei-keng Liao, Alok Choudhary |
388-89-90 |
 |
10:30AM - 11:00AM |
Efficient Sparse Matrix-Vector Multiplication on GPUs using the CSR Storage Format |
Joseph L. Greathouse, Mayank Daga |
391-92 |
 |
10:30AM - 11:00AM |
Maximizing Throughput of Overprovisioned HPC Data Centers Under a Strict Power Budget |
Osman Sarood, Akhil Langer, Abhishek Gupta, Laxmikant V. Kale |
393-94-95 |
 |
11:00AM - 11:30AM |
Parallel Deep Neural Network Training for Big Data on Blue Gene/Q |
I-Hsin Chung, Tara Sainath, Bhuvana Ramabhadran, Michael Picheny, John Gunnels, Vernon Austel, Upendra Chaudhari, Brian Kingsbury |
388-89-90 |
 |
11:00AM - 11:30AM |
Fast Sparse Matrix-Vector Multiplication on GPUs for Graph Applications |
Arash Ashari, Naser Sedaghati, John Eisenlohr, Srinivasan Parthasarathy, P. Sadayappan |
391-92 |
 |
11:00AM - 11:30AM |
Application Centric Energy-Efficiency Study of Distributed Multi-Core and Hybrid CPU-GPU Systems |
Ben Cumming, Gilles Fourestey, Oliver Fuhrer, Tobias Gysi, Massimiliano Fatica, Thomas C. Schulthess |
393-94-95 |
 |
11:30AM - 12:00PM |
FAST: Near Real-time Searchable Data Analytics for the Cloud |
Yu Hua, Hong Jiang, Dan Feng |
388-89-90 |
 |
11:30AM - 12:00PM |
A Study on Balancing Parallelism and Data Locality in Stencil Calculations |
Catherine Olschanowsky, Stephen Guzik, John Loffeld, Jeffrey Hittinger, Michelle Strout |
391-92 |
 |
11:30AM - 12:00PM |
Scaling the Power Wall: A Path to Exascale |
Oreste Villa, Daniel R. Johnson, Mike O'Connor, Evgeny Bolotin, David W. Nellans, Justin Luitjens, Nikolai Sakharnykh, Peng Wang, Paulius Micikevicius, Anthony Scudiero, Stephen W. Keckler, William J. Dally |
393-94-95 |
 |
1:30PM - 2:00PM |
Structure Slicing: Extending Logical Regions with Fields |
Michael Bauer, Sean Treichler, Elliott Slaughter, Alex Aiken |
388-89-90 |
 |
1:30PM - 2:00PM |
Understanding the Effects of Communication and Coordination on Checkpointing at Scale |
Kurt B. Ferreira, Scott Levy, Patrick M. Widener, Dorian C. Arnold, Torsten Hoefler |
393-94-95 |
 |
1:30PM - 2:00PM |
Parallelization of Reordering Algorithms for Bandwidth and Wavefront Reduction |
Konstantinos I. Karantasis, Andrew Lenharth, Donald Nguyen, Maria Garzaran, Keshav Pingali |
391-92 |
 |
2:00PM - 2:30PM |
Optimizing Data Locality for Fork/Join Programs Using Constrained Work Stealing |
Jonathan Lifflander, Sriram Krishnamoorthy, Laxmikant V. Kale |
388-89-90 |
 |
2:00PM - 2:30PM |
Exploring Automatic, Online Failure Recovery for Scientific Applications at Extreme Scales |
Marc Gamell, Daniel S. Katz, Hemanth Kolla, Jacqueline Chen, Scott Klasky, Manish Parashar |
393-94-95 |
 |
2:00PM - 2:30PM |
Domain Decomposition Preconditioners for Communication-Avoiding Krylov Methods on a Hybrid CPU/GPU Cluster |
Ichitaro Yamazaki, Sivasankaran Rajamanickam, Erik G. Boman, Mark Hoemmen, Michael A. Heroux, Stanimire Tomov |
391-92 |
 |
2:30PM - 3:00PM |
DISC: A Domain-Interaction Based Programming Model With Support for Heterogeneous Execution |
Mehmet Can Kurt, Gagan Agrawal |
388-89-90 |
 |
2:30PM - 3:00PM |
Optimization of Multi-Level Checkpoint Model with Uncertain Execution Scales |
Sheng Di, Leonardo Bautista-Gomez, Franck Cappello |
393-94-95 |
 |
2:30PM - 3:00PM |
Efficient Shared-Memory Implementation of High-Performance Conjugate Gradient Benchmark and Its Application to Unstructured Matrices |
Jongsoo Park, Mikhail Smelyanskiy, Karthikeyan Vaidyanathan, Alexander Heinecke, Dhiraj D. Kalamkar, Xing Liu, Md. Mostofa Ali Patwary, Yutong Lu, Pradeep Dubey |
391-92 |
 |
3:30PM - 4:00PM |
FlexSlot: Moving Hadoop into the Cloud with Flexible Slot Management |
Yanfei Guo, Jia Rao, Changjun Jiang, Xiaobo Zhou |
388-89-90 |
 |
3:30PM - 4:00PM |
High-Performance Computation of Distributed-Memory Parallel 3D Voronoi and Delaunay Tessellation |
Tom Peterka, Dmitriy Morozov, Carolyn Phillips |
391-92 |
 |
3:30PM - 4:00PM |
ECC Parity: A Technique for Efficient Memory Error Resilience for Multi-Channel Memory Systems |
Xun Jian, Rakesh Kumar |
393-94-95 |
 |
4:00PM - 4:30PM |
Reciprocal Resource Fairness: Towards Cooperative Multiple-Resource Fair Sharing in IaaS Clouds |
Haikun Liu, Bingsheng He |
388-89-90 |
 |
4:00PM - 4:30PM |
Scalable Computation of Stream Surfaces on Large Scale Vector Fields |
Kewei Lu, Han-Wei Shen, Tom Peterka |
391-92 |
 |
4:00PM - 4:30PM |
Using an Adaptive HPC Runtime System to Reconfigure the Cache Hierarchy |
Ehsan Totoni, Josep Torrellas, Laxmikant V. Kale |
393-94-95 |
 |
4:30PM - 5:00PM |
Finding Constant From Change: Revisiting Network Performance Aware Optimizations on IaaS Clouds |
Yifan Gong, Bingsheng He, Dan Li |
388-89-90 |
 |
4:30PM - 5:00PM |
In-Situ Feature Extraction of Large Scale Combustion Simulations Using Segmented Merge Trees |
Aaditya G. Landge, Valerio Pascucci, Attila Gyulassy, Janine C. Bennett, Hemanth Kolla, Jacqueline Chen, Peer-Timo Bremer |
391-92 |
 |
4:30PM - 5:00PM |
Microbank: Architecting Through-Silicon Interposer-Based Main Memory Systems |
Young Hoon Son, Seongil O, Hyunggyun Yang, Daejin Jung, Jung Ho Ahn, John Kim, Jangwoo Kim, Jae W. Lee |
393-94-95 |
 |