Building Wavelet Histograms on Large Data in MapReduce
Published: 2011-12-23 04:15:00
Talk Details

Speaker: Dr. Feifei Li (University of Utah)

Time: December 28, 10:30 a.m.

Venue: Academic Lecture Hall, 4th floor, Information Building

Abstract: Massive data have become ubiquitous and are being generated at an ever-increasing rate almost everywhere (e.g., in large data centers). This phenomenon demands building and retrieving effective, concise summaries efficiently to represent the underlying data for further reasoning, mining, and analytics. In this talk, we present a comprehensive study of how to summarize massive data effectively and efficiently using wavelet histograms on large distributed data. We leverage both algorithmic (sampling, sketching) and system (MapReduce) techniques to achieve this goal. We demonstrate that by using distributed and parallel frameworks and blending algorithmic and database techniques, excellent scalability and efficiency can be achieved.

Bio: Feifei Li has been an assistant professor at the School of Computing, University of Utah, since August 2011. He was an assistant professor in the Computer Science Department at Florida State University between August 2007 and July 2011. He obtained his B.S. in computer engineering from Nanyang Technological University, Singapore, in 2002 (transferred from Tsinghua University, China) and his Ph.D. in computer science from Boston University in 2007. His research focuses on large-scale data management. He also works on probabilistic data, text/string processing, semantic web/graph data (e.g., RDF), and security and privacy issues in data management. His research has been actively supported by NSF, HP Labs, and the Florida Department of Revenue. He was a recipient of an NSF CAREER award in 2011, an HP IRP award in 2011, and the IEEE ICDE best paper award in 2004.
