The International Conference for High Performance Computing, Networking, Storage and Analysis
Sponsored by IEEE and ACM

SCHEDULE: NOV 16-21, 2014


Fault-Tolerant Dynamic Task Graph Scheduling

SESSION: Resilience

EVENT TYPE: Papers, Best Student Paper Finalists

TIME: 4:30PM - 5:00PM

SESSION CHAIR: Ananta Tiwari

AUTHOR(S): Mehmet Can Kurt, Sriram Krishnamoorthy, Kunal Agrawal, Gagan Agrawal



In this paper, we present an approach to fault-tolerant execution of dynamic task graphs scheduled using work-stealing. In particular, we focus on selective and localized recovery of tasks in the presence of soft faults. We present an application programming interface that elicits the basic task graph structure in terms of successor and predecessor relationships. The work-stealing-based task graph scheduling algorithm is then augmented to enable recovery when the data and meta-data associated with a task get corrupted. We use this redundancy, and the information on the task graph provided by the API, to selectively recover from faults with low space and time overheads. We show that the fault-tolerant design retains the essential properties of the underlying work-stealing-based task scheduling algorithm. We also show that the fault-tolerant execution is asymptotically optimal when the re-execution of tasks is taken into account. Experimental evaluation demonstrates the effectiveness of the approach.
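The idea of selective, localized recovery can be illustrated with a small sketch. This is not the paper's API or algorithm: the `Task` and `TaskGraph` names, the duplicated-copy corruption check, and the sequential on-demand execution (in place of work-stealing) are all assumptions made for illustration. It shows only how predecessor relationships let a corrupted task output be recomputed without re-running tasks whose outputs are still intact.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Sequence


@dataclass
class Task:
    """Hypothetical task record: a function plus its graph neighbors."""
    name: str
    fn: Callable[..., int]
    preds: List[str] = field(default_factory=list)
    succs: List[str] = field(default_factory=list)


class TaskGraph:
    """Illustrative sequential task graph, not a work-stealing scheduler."""

    def __init__(self) -> None:
        self.tasks: Dict[str, Task] = {}
        # Assumption: each output is kept as two copies; a mismatch
        # between the copies stands in for a detected soft fault.
        self.copies: Dict[str, List[int]] = {}
        self.executions: Dict[str, int] = {}

    def add_task(self, name: str, fn: Callable[..., int],
                 preds: Sequence[str] = ()) -> None:
        # Predecessors must be registered before their successors.
        self.tasks[name] = Task(name, fn, list(preds))
        self.executions[name] = 0
        for p in preds:
            self.tasks[p].succs.append(name)

    def _is_valid(self, name: str) -> bool:
        pair = self.copies.get(name)
        return pair is not None and pair[0] == pair[1]

    def _execute(self, name: str) -> None:
        task = self.tasks[name]
        # Reading an input recomputes that predecessor only if its
        # stored output is missing or corrupted.
        inputs = [self.get(p) for p in task.preds]
        value = task.fn(*inputs)
        self.copies[name] = [value, value]
        self.executions[name] += 1

    def get(self, name: str) -> int:
        if not self._is_valid(name):
            self._execute(name)
        return self.copies[name][0]

    def corrupt(self, name: str, bad_value: int) -> None:
        """Simulate a soft fault by overwriting one copy of an output."""
        self.copies[name][0] = bad_value


g = TaskGraph()
g.add_task("a", lambda: 2)
g.add_task("b", lambda: 3)
g.add_task("c", lambda x, y: x * y, preds=["a", "b"])
g.add_task("d", lambda x: x + 1, preds=["c"])

first = g.get("d")       # runs a, b, c, d once each
g.corrupt("c", 999)      # inject a fault into c's stored output
recovered = g.get("c")   # re-executes c only; a and b are still valid
print(first, recovered, g.executions)
```

In this sketch the fault in `c` is noticed only when its output is next read, and recovery touches `c` alone because the outputs of `a` and `b` still pass the check. A real scheduler would also have to protect task meta-data and coordinate recovery across concurrent workers, which this example does not attempt.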

Chair/Author Details:

Ananta Tiwari (Chair) - PMaC Lab, SDSC

Mehmet Can Kurt - Ohio State University

Sriram Krishnamoorthy - Pacific Northwest National Laboratory

Kunal Agrawal - Washington University in St. Louis

Gagan Agrawal - Ohio State University


Paper provided by the ACM Digital Library

Paper also available from IEEE Computer Society