BEGIN:VCALENDAR
PRODID:-//Microsoft Corporation//Outlook MIMEDIR//EN
VERSION:2.0
BEGIN:VEVENT
DTSTART:20141118T163000Z
DTEND:20141118T170000Z
LOCATION:393-94-95
DESCRIPTION;ENCODING=QUOTED-PRINTABLE:ABSTRACT: The gap between the cost of moving data and the cost of computing continues to grow, making it ever harder to design iterative solvers on extreme-scale architectures. This problem can be alleviated by alternative algorithms that reduce the amount of data movement. We investigate this in the context of Lattice Quantum Chromodynamics and implement such an alternative solver algorithm, based on domain decomposition, on Intel(R) Xeon Phi(TM) co-processor (KNC) clusters. We demonstrate close-to-linear on-chip scaling to all 60 cores of the KNC. With a mix of single- and half-precision, the domain-decomposition method sustains 400-500 Gflop/s per chip. Compared to an optimized KNC implementation of a standard solver [1], our full multi-node domain-decomposition solver strong-scales to more nodes and reduces the time-to-solution by a factor of 5.
SUMMARY:Lattice QCD with Domain Decomposition on Intel(R) Xeon Phi(TM) Co-Processors
PRIORITY:3
END:VEVENT
END:VCALENDAR