SCHEDULE: NOV 16-21, 2014

FlexSlot: Moving Hadoop into the Cloud with Flexible Slot Management

SESSION: Cloud Computing II


TIME: 3:30PM - 4:00PM

SESSION CHAIR: Jennifer M. Schopf

AUTHOR(S): Yanfei Guo, Jia Rao, Changjun Jiang, Xiaobo Zhou



Load imbalance is a major source of overhead in Hadoop, where the uneven distribution of input data among tasks can significantly delay job completion. Running Hadoop in a private cloud opens up opportunities to mitigate data skew through elastic resource allocation, in which stragglers are expedited with more resources. It also introduces problems that often cancel out the performance gain: (1) performance interference from co-running jobs may create new stragglers; (2) a semantic gap between Hadoop task management and resource-pool-based virtual cluster management prevents efficient resource usage.

We present FlexSlot, a user-transparent task slot management scheme that automatically identifies map stragglers and resizes their slots accordingly to accelerate task execution. FlexSlot also adaptively changes the number of slots on each virtual node to promote efficient usage of the resource pool. Experimental results with representative benchmarks show that FlexSlot reduces job completion time by 46% and achieves better resource utilization.
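
The abstract does not say how stragglers are detected or how slots are resized, so the following is only a minimal illustrative sketch of the general idea, not FlexSlot's actual algorithm. It assumes a straggler is a map task whose progress rate falls below a fraction of the median rate, and that resizing means growing the slot's memory in fixed steps. The names MapTask, find_stragglers, resize_slots and the parameters threshold, step_mb, max_mb are all hypothetical.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class MapTask:
    """Hypothetical view of a running map task (not a Hadoop API type)."""
    task_id: str
    progress: float    # fraction of input processed, 0.0 to 1.0
    elapsed_s: float   # seconds since the task started
    slot_mem_mb: int   # memory currently assigned to the task's slot

    @property
    def rate(self) -> float:
        # Progress per second; zero if the task has not run yet.
        return self.progress / self.elapsed_s if self.elapsed_s > 0 else 0.0


def find_stragglers(tasks, threshold=0.5):
    """Return tasks whose progress rate is below threshold * median rate.

    The median-based cutoff is an assumption made for this sketch.
    """
    if not tasks:
        return []
    cutoff = threshold * median(t.rate for t in tasks)
    return [t for t in tasks if t.rate < cutoff]


def resize_slots(tasks, step_mb=512, max_mb=4096, threshold=0.5):
    """Grow each straggler's slot by step_mb, capped at max_mb.

    Returns the ids of the tasks whose slots were actually resized.
    """
    resized = []
    for t in find_stragglers(tasks, threshold):
        new_size = min(t.slot_mem_mb + step_mb, max_mb)
        if new_size != t.slot_mem_mb:
            t.slot_mem_mb = new_size
            resized.append(t.task_id)
    return resized


if __name__ == "__main__":
    running = [
        MapTask("m1", 0.8, 100.0, 1024),
        MapTask("m2", 0.9, 100.0, 1024),
        MapTask("m3", 0.2, 100.0, 1024),  # far behind the others
    ]
    print(resize_slots(running))
```

In this sketch only the slow task m3 is given a larger slot. Adjusting the number of slots per virtual node, which the abstract also mentions, is left out.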

Chair/Author Details:

Jennifer M. Schopf (Chair) - Indiana University

Yanfei Guo - University of Colorado at Colorado Springs

Jia Rao - University of Colorado at Colorado Springs

Changjun Jiang - Tongji University

Xiaobo Zhou - University of Colorado at Colorado Springs

