The First Workshop on Big Data Benchmarks, Performance Optimization, and Emerging Hardware (BPOE 2013)

In conjunction with the 2013 IEEE International Conference on Big Data (IEEE Big Data 2013)

October 8, 2013, Silicon Valley, CA, USA

Program at a Glance

Opening remarks, Jianfeng Zhan, Chinese Academy of Sciences [.ppt]

 

Session One: Capacity Engineering, Big Data Benchmarks, and Case Studies

Session Chair: Jianfeng Zhan, Chinese Academy of Sciences

 

Invited Talk, "Facebook: Using Emerging Hardware to Build Infrastructure at Scale", [.ppt]

Bill Jia, PhD, Manager, Performance and Capacity Engineering, Facebook

 

Invited Talk, "BigDataBench: Benchmarking big data systems", [.ppt]

Yingjie Shi, Chinese Academy of Sciences

 

Quan Jing, Shi Yingjie, and Zhao Ming, "The Implications from Benchmarking Three Different Data Center Platforms", [.pdf]

University of Science and Technology of China, China

 

Runlin Zhou, Yingjie Shi, and Chunge Zhu, "AxPUE: Application Level Metrics for Power Usage Effectiveness in Data Centers", [.pdf]

National Computer Network Emergency Response Technical Team Coordination Center of China, China

 

Taoying Liu, "A Performance Evaluation of Hive for Scientific Data Management", [.pdf]

Institute of Computing Technology, CAS, China

 

Chen Pengfei, Qi Yong, Li Xinyi, and Li Su, "An Ensemble MIC-based Approach for Performance Diagnosis in Big Data Platform", [.pdf]

Xi'an Jiaotong University, China

 

Fengfeng Ning, Chuliang Weng, and Yuan Luo, "Virtualization I/O Optimization Based on Shared Memory", [.pdf]

Shanghai Jiao Tong University, China

 

Wen Xiong and Zhibin Yu, "A Characterization of Big Data Benchmarks", [.ppt], [.pdf]

Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, China

 

Shengyuan Liu, Jungang Xu, Zongzhen Liu, and Xu Liu, "Evaluating Task Scheduling in Hadoop-based Cloud Systems", [.pdf], [.ppt]

College of Computer and Control Engineering, University of Chinese Academy of Sciences, China

 

Session Two: Performance optimization of big data systems

Session Chair: Tilmann Rabl, University of Toronto

 

Stephan Müller, Lars Butzmann, Stefan Klauck, and Hasso Plattner, "Workload-Aware Aggregate Maintenance in Columnar In-Memory Databases", [.pdf], [.ppt]

Hasso Plattner Institute, Germany

 

Dong Yang, Xiang Zhong, Dong Yan, Fangqin Dai, Xusen Yin, Cheng Lian, Zhongliang Zhu, Weihua Jiang, and Gansha Wu, "NativeTask: A Hadoop Compatible Framework for High Performance", [.pdf]

Intel Corporation, China

 

Martin Dimitrov, Karthik Kumar, Patrick Lu, Vish Viswanathan, and Thomas Willhalm, "Memory system characterization of Big Data workloads", [.pdf], [.ppt]

Intel, USA

 

Tao Zhong, Kshitij Doshi, Xi Tang, Ting Lou, Zhongyan Lu, and Hong Li, "On Mixing High-Speed Updates and In-Memory Queries: A Big-Data Architecture for Real-time Analytics", [.pdf]

Intel

 

Wei-Chun Chung, Yu-Jung Chang, Chien-Chih Chen, Der-Tsai Lee, and Jan-Ming Ho, "Optimizing a MapReduce Module of Preprocessing High-Throughput DNA Sequencing Data", [.pdf]

Research Center for Information Technology Innovation, Academia Sinica

 

Session Three: Experience and evaluation with emerging hardware for big data

Session Chair: Weijia Xu, University of Texas at Austin

 

Tyler Clemons, S M Faisal, Shirish Tatikonda, Charu Aggarwal, and Srinivasan Parthasarathy, "Hash in a Flash: Hash Tables for Flash Devices", [.ppt], [.pdf]

The Ohio State University, United States

 

Shinichi Yamagiwa and Hiroshi Sakamoto, "A Reconfigurable Stream Compression Hardware based on Static Symbol-Lookup Table", [.pdf]

University of Tsukuba, Japan

 

Xi Luo, Walid Najjar, and Vagelis Hristidis, "Efficient Near-Duplicate Document Detection using FPGAs", [.pdf]

UC Riverside, USA

 

Yaakoub El-Khamra, Niall Gaffney, David Walling, Eric Wernert, Weijia Xu, and Hui Zhang, "Performance Evaluation of R with Intel Xeon Phi Coprocessor", [.pdf]

University of Texas at Austin, USA

 

Closing remarks, Weijia Xu


Bill Jia

Bill Jia is the manager of the Performance and Capacity Engineering group at Facebook, Menlo Park, California, where he leads efforts to ensure that products and services are well designed with robust architecture, tested for performance, and provisioned with right-sized capacity. He is directly involved in decisions on how to build data centers, how to best utilize them across different clusters, and how to provide the power and networking support needed to install and provision capacity. He also leads the effort to design and adopt the best hardware to help Facebook's infrastructure scale.

Prior to Facebook, Bill Jia was a senior performance and capacity engineer at Microsoft, Mountain View, California, where he was the lead engineer architecting platform capacity for Windows Live Hotmail, Calendar, MSN, and Storage.

 

Bill Jia publishes actively, with more than 20 refereed international conference and journal publications. He is the inventor or co-inventor of 8 patents registered with the United States Patent and Trademark Office. He has chaired a number of international conferences and invited sessions on large-scale services, operations research, management science, and related topics. Bill Jia holds a PhD in Operations Research from the University of Southern California. He obtained his Master of Science degree from the National University of Singapore and a dual Bachelor of Engineering degree from Shanghai Jiao Tong University.

Yingjie Shi

Yingjie Shi is currently an assistant professor at the Institute of Computing Technology, Chinese Academy of Sciences. She manages the platform software group, which researches evaluation and analysis mechanisms for emerging hardware and the benchmarking and optimization of basic software platforms. She received her PhD in Computer Software and Theory from Renmin University of China in 2013 and her MS in Computer Architecture from Huazhong University of Science and Technology, China, in 2007. Her research interests include big data management and the evaluation and optimization of software platforms for big data.