BEGIN:VCALENDAR
PRODID:-//Microsoft Corporation//Outlook MIMEDIR//EN
VERSION:2.0
BEGIN:VEVENT
DTSTART:20141117T143000Z
DTEND:20141117T230000Z
LOCATION:398-99
DESCRIPTION;ENCODING=QUOTED-PRINTABLE:ABSTRACT: With High Performance Computing trends heading towards increasingly heterogeneous solutions, scientific developers face challenges adapting software to leverage these new architectures. For instance, many systems feature nodes that couple multi-core processors with GPU-based computational accelerators, such as the NVIDIA Kepler, or many-core coprocessors, such as the Intel Xeon Phi. To utilize these systems effectively, programmers need to exploit an extreme level of parallelism in their applications. Developers also need to juggle multiple programming paradigms, including MPI, OpenMP, CUDA, and OpenACC. =0A=0AThis tutorial provides an in-depth exploration of parallel debugging and optimization, focused on techniques that can be used with accelerators and coprocessors. We cover debugging techniques such as grouping, advanced breakpoints and barriers, and MPI message queue graphing. We discuss optimization techniques such as profiling, tracing, and cache memory optimization with tools such as Vampir, Scalasca, TAU, CrayPAT, VTune, and the NVIDIA Visual Profiler. Participants have the opportunity to do hands-on GPU and Intel Xeon Phi debugging and profiling. Additionally, we cover the OpenMP 4.0 standard, which introduces novel capabilities for both Xeon Phi and GPU programming, and discuss the peculiarities of that specification with respect to error finding and optimization. A laptop will be required for the hands-on sessions.
SUMMARY:Debugging and Performance Tools for MPI and OpenMP 4.0 Applications for CPU and Accelerators/Coprocessors
PRIORITY:3
END:VEVENT
END:VCALENDAR