The International Conference for High Performance Computing, Networking, Storage and Analysis
Sponsored by IEEE and ACM

SCHEDULE: NOV 16-21, 2014

Using Global View Resilience (GVR) to add Resilience to Exascale Applications

SESSION: Poster Reception

EVENT TYPE: Posters, Best Poster Finalist

TIME: 5:15PM - 7:00PM

AUTHOR(S): Hajime Fujita, Nan Dun, Aiman Fang, Zachary A. Rubenstein, Ziming Zheng, Kamil Iskra, Jeff Hammond, Anshu Dubey, Pavan Balaji, Andrew A. Chien

ROOM: New Orleans Theater Lobby


Resilience is a major challenge for future exascale machines.
Existing approaches are unlikely to handle complex failures such as latent errors, so a new approach is needed.
We propose Global View Resilience (GVR), a new library that exploits a global view data model and adds reliability through versioning (multi-version), user control of versioning timing and rate (multi-stream), and flexible cross-layer error signalling and recovery. GVR enables application programmers to exploit deep scientific and application code insights to manage resilience (and its overhead) in a flexible, portable fashion.
We applied the GVR library to several existing scientific application codes and showed that GVR can be applied easily and that the runtime overhead of versioning is negligible.
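
The versioning idea in the abstract can be illustrated with a small, self-contained sketch. It is not the GVR library or its API: the class `VersionedArray` and the functions `simulate`, `has_no_nan`, and `recover` are hypothetical names invented for this illustration, and everything lives in one process's memory rather than in a distributed global view. The sketch only shows the three ingredients named above: several preserved versions of an array (multi-version), per-array snapshot rates chosen by the application (multi-stream), and an application-supplied check that drives rollback when a latent error is found.

```python
import math


class VersionedArray:
    """Toy in-memory array that keeps every snapshot (illustrative only)."""

    def __init__(self, size, fill=0.0):
        self._current = [fill] * size
        self._snapshots = []  # oldest first; list index == version number

    def __len__(self):
        return len(self._current)

    def get(self, index):
        return self._current[index]

    def put(self, index, value):
        self._current[index] = value

    def snapshot(self):
        """Preserve the current contents and return the new version number."""
        self._snapshots.append(list(self._current))
        return len(self._snapshots) - 1

    def num_versions(self):
        return len(self._snapshots)

    def restore(self, version):
        """Replace the current contents with a copy of a stored version."""
        self._current = list(self._snapshots[version])


def simulate(steps, fast_every=1, slow_every=5):
    """Two streams: each array is versioned at its own, caller-chosen rate."""
    fast = VersionedArray(4)
    slow = VersionedArray(4)
    for step in range(steps):
        for i in range(4):
            fast.put(i, fast.get(i) + 1.0)
            slow.put(i, slow.get(i) + 0.1)
        if step % fast_every == 0:
            fast.snapshot()
        if step % slow_every == 0:
            slow.snapshot()
    return fast, slow


def has_no_nan(array):
    """Application-level check (an assumption): data is valid if NaN-free."""
    return not any(math.isnan(array.get(i)) for i in range(len(array)))


def recover(array, is_valid):
    """Walk back from the newest version until the application check passes."""
    for version in range(array.num_versions() - 1, -1, -1):
        array.restore(version)
        if is_valid(array):
            return version
    raise RuntimeError("no stored version passes the check")
```

Because a latent error may already be captured in the most recent snapshots by the time it is detected, `recover` walks backwards through older versions instead of assuming the newest one is clean. That is the situation a single-checkpoint scheme cannot handle, and it is why this sketch keeps multiple versions.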

Chair/Author Details:

Hajime Fujita - University of Chicago and Argonne National Laboratory

Nan Dun - University of Chicago and Argonne National Laboratory

Aiman Fang - University of Chicago

Zachary A. Rubenstein - University of Chicago

Ziming Zheng - HP Vertica

Kamil Iskra - Argonne National Laboratory

Jeff Hammond - Argonne National Laboratory

Anshu Dubey - Lawrence Berkeley National Laboratory

Pavan Balaji - Argonne National Laboratory

Andrew A. Chien - University of Chicago and Argonne National Laboratory
