FAST: Near Real-time Searchable Data Analytics for the Cloud

SESSION: Machine Learning and Data Analytics


TIME: 11:30AM - 12:00PM


AUTHOR(S): Yu Hua, Hong Jiang, Dan Feng



With the explosive growth in data volume and complexity and the increasing need for highly efficient searchable data analytics, existing cloud storage systems have largely failed to offer adequate support for real-time data analytics. To address this problem, we propose a near-real-time and cost-effective searchable data analytics methodology, called FAST. The idea behind FAST is to explore and exploit the semantic correlation within and among datasets via correlation-aware hashing and manageable flat-structured addressing, which significantly reduces processing latency while incurring an acceptably small loss of data-search accuracy. FAST supports several types of data analytics, which can be implemented in existing searchable storage systems. We present a real-world use case in which children reported missing in an extremely crowded environment are identified in a timely fashion by analyzing 60 million images using FAST. Extensive experimental results demonstrate the efficiency and efficacy of FAST in terms of performance improvements and energy savings.
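The abstract does not specify the hashing scheme or the addressing layout, so the following is only a minimal sketch of the general idea, not FAST's implementation. It assumes random-hyperplane hashing as a stand-in for "correlation-aware hashing" and a single-level dictionary of buckets as a stand-in for "flat-structured addressing". The names `make_hyperplanes`, `signature`, and `FlatIndex` are hypothetical and do not come from the paper.

```python
import random
from collections import defaultdict


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random hyperplanes (assumed scheme, not from the paper)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def signature(vector, hyperplanes):
    """Pack one bit per hyperplane: 1 if the vector lies on its non-negative side.

    Vectors that point in similar directions tend to share most bits,
    so correlated items tend to receive the same signature.
    """
    bits = 0
    for plane in hyperplanes:
        dot = sum(p * v for p, v in zip(plane, vector))
        bits = (bits << 1) | (1 if dot >= 0 else 0)
    return bits


class FlatIndex:
    """Hypothetical flat index: signature -> list of item ids, with no tree levels."""

    def __init__(self, num_bits, dim, seed=0):
        self.hyperplanes = make_hyperplanes(num_bits, dim, seed)
        self.buckets = defaultdict(list)

    def insert(self, item_id, vector):
        self.buckets[signature(vector, self.hyperplanes)].append(item_id)

    def query(self, vector):
        # A single hash computation and a single bucket lookup.
        # Similar items that fall into a different bucket are missed.
        return list(self.buckets.get(signature(vector, self.hyperplanes), []))
```

In this sketch a query costs one signature computation plus one bucket lookup, regardless of how many items are stored. The price is that a similar item whose signature differs in even one bit is not returned, which illustrates the kind of latency-for-accuracy trade-off the abstract describes.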

Hank Childs (Chair) - University of Oregon and Lawrence Berkeley National Laboratory

Yu Hua - Huazhong University of Science & Technology

Hong Jiang - University of Nebraska-Lincoln

Dan Feng - Huazhong University of Science & Technology

