Users

Selected Research Papers Using BigDataBench

1.   Cloud Data Protection

Sherif Akoush, Lucian Carata, Ripduman Sohan, and Andy Hopper, Computer Laboratory, University of Cambridge. MrLazy: Lazy Runtime Label Propagation for MapReduce. In 6th USENIX Workshop on Hot Topics in Cloud Computing (HotCloud 14).

2.       Workload Characterization

Zhen Jia, Jianfeng Zhan, Lei Wang, Rui Han, Sally A. McKee, Qiang Yang, Chunjie Luo, and Jingwei Li. Characterizing and subsetting big data workloads [PDF]. In 2014 IEEE International Symposium on Workload Characterization (IISWC). IEEE, 2014.

Tao Jiang, Qianlong Zhang, Rui Hou, Lin Chai, Sally A. McKee, Zhen Jia, and Ninghui Sun. Understanding the Behavior of In-Memory Computing Workloads [PDF]. In 2014 IEEE International Symposium on Workload Characterization (IISWC). IEEE, 2014.

Wei Wei, Dejun Jiang, Jin Xiong, and Mingyu Chen. Exploring Opportunities for Non-Volatile Memories in Big Data Applications. BPOE-4, in conjunction with ASPLOS 2014.

Fengfeng Pan, Yinliang Yue, and Jin Xiong. I/O Characterization of Big Data Workloads in Data Centers. BPOE-4, in conjunction with ASPLOS 2014.

Zhen Jia, Lei Wang, Jianfeng Zhan, Lixin Zhang, and Chunjie Luo. Characterizing data analysis workloads in data centers. [PDF] [Slides]. In 2013 IEEE International Symposium on Workload Characterization (IISWC 2013). Best paper award.

3.       Evaluating and Optimizing Big Data Hardware Systems

Quan, J., Shi, Y., Zhao, M., & Yang, W. (2013, October). The implications from benchmarking three big data systems. [PDF]. In Big Data, 2013 IEEE International Conference on (pp. 31-38). IEEE.

4.       SSD Cache Management 

Liu, J., Chai, Y., Qin, X., & Xiao, Y. PLC-Cache: Endurable SSD Cache for Deduplication-based Primary Storage. [PDF]. In Proceedings of MSST 2014 (30th International Conference on Massive Storage Systems and Technology).

5.       Performance Diagnosis and Optimization of Big Data Systems

Pengfei Chen, Yong Qi, Di Hou, and Huachong Sun. InvarNet-X: A Comprehensive Invariant based Approach for Performance Diagnosis in Big Data Platform. [PDF]. The fifth workshop on big data benchmarks, performance optimization and emerging hardware, in conjunction with VLDB 2014, Hangzhou, China.

Chen, P., Qi, Y., Li, X., & Su, L. (2013, October). An ensemble MIC-based approach for performance diagnosis in big data platform. [PDF]. In Big Data, 2013 IEEE International Conference on (pp. 78-85). IEEE.

6.       Evaluating and Optimizing Big Data Systems Energy Efficiency

Zhou, R., Shi, Y., & Zhu, C. (2013, October). AxPUE: Application level metrics for power usage effectiveness in data centers. [PDF]. In Big Data, 2013 IEEE International Conference on (pp. 110-117). IEEE.

7.       Evaluation of Virtualization Systems

Ning, F., Weng, C., & Luo, Y. (2013, October). Virtualization I/O optimization based on shared memory. [PDF]. In Big Data, 2013 IEEE International Conference on (pp. 70-77). IEEE.

8.       Evaluating Programming Systems

Liang, Fan; Feng, Chen; Lu, Xiaoyi; Xu, Zhiwei, “Performance Characterization of Hadoop and Data MPI Based on Amdahl’s Second Law,” Networking, Architecture, and Storage (NAS), 2014 9th IEEE International Conference on, pp. 207-215, 6-8 Aug. 2014.

Liang Fan, Feng Chen, Lu Xiaoyi, and Xu Zhiwei. Performance Benefits of DataMPI: A Case Study with BigDataBench. BPOE-4, in conjunction with ASPLOS 2014.

9.       Resource Management and Scheduling

Yi Liang, Yufeng Wang, Minglu Fan, Chen Zhang, and Yuqing Zhu. Predoop: Preempting Reduce Task for job execution accelerations. [PDF]. The fifth workshop on big data benchmarks, performance optimization and emerging hardware, in conjunction with VLDB 2014, Hangzhou, China.

Selected Papers Citing BigDataBench

Zujie Ren, Weisong Shi, and Jian Wan. Towards Realistic Benchmarking in Cloud File Systems: Early Experiences. In Proceedings of the 2014 IEEE International Symposium on Workload Characterization (IISWC ’14), Raleigh, NC, Oct. 26-28, 2014.

Chowdhury, Badrul, Tilmann Rabl, Pooya Saadatpanah, Jiang Du, and Hans-Arno Jacobsen. University of Toronto. “A BigBench Implementation in the Hadoop Ecosystem.” In Advancing Big Data Benchmarks. Springer International Publishing.

Ammar, K., & Özsu, M. T. (2014). WGB: Towards a Universal Graph Benchmark. In Advancing Big Data Benchmarks (pp. 58-72). Springer International Publishing.

Afsin Akdogan, Hien To, Seon Ho Kim, and Cyrus Shahabi. University of Southern California. A Benchmark to Evaluate Mobile Video Upload to Cloud Infrastructures. The fifth workshop on big data benchmarks, performance optimization and emerging hardware, in conjunction with VLDB 2014, Hangzhou, China.

Jianfeng Zhan; Lei Wang; Xiaona Li; Weisong Shi; Chuliang Weng; Wenyao Zhang; Xiutao Zang, “Cost-Aware Cooperative Resource Provisioning for Heterogeneous Workloads in Data Centers,” Computers, IEEE Transactions on, vol. 62, no. 11, pp. 2155-2168, Nov. 2013.

Liu, S., Xu, J., Liu, Z., & Liu, X. (2013, October). Evaluating task scheduling in hadoop-based cloud systems. In Big Data, 2013 IEEE International Conference on (pp. 47-53). IEEE.

Chaitanya Baru, Milind Bhandarkar, Carlo Curino, Manuel Danisch, Michael Frank, Bhaskar Gowda, Hans-Arno Jacobsen, Huang Jie, Dileep Kumar, Raghu Nambiar, Meikel Poess, Francois Raab, Tilmann Rabl, Nishkam Ravi, Kai Sachs, Saptak Sen, Lan Yi, and Choonhan Youn. Discussion of BigBench: A Proposed Industry Standard Performance Benchmark for Big Data.

Nakuçi, E., Theodorou, V., Jovanovic, P., & Abelló, A. (2014, November). Bijoux: Data generator for evaluating ETL process quality. In Proceedings of the 17th International Workshop on Data Warehousing and OLAP (pp. 23-32). ACM.

Lu, G.; Zhan, J.; Wang, H.; Yuan, L.; Gao, Y.; Weng, C.; Qi, Y., “PowerTracer: Tracing Requests in Multi-tier Services to Reduce Energy Inefficiency,” Computers, IEEE Transactions on, vol. PP, no. 99, pp. 1-1, doi: 10.1109/TC.2014.2315625. URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6783685&isnumber=4358213

Lu, X., Wasi-ur-Rahman, M., Islam, N. S., & Panda, D. K. D. (2014). A Micro-benchmark Suite for Evaluating Hadoop RPC on High-Performance Networks. In Advancing Big Data Benchmarks (pp. 32-42). Springer International Publishing.

Ralf Teusner, Michael Perscheid, Malte Appeltauer, Jonas Enderlein, Thomas Klingbeil, and Michael Kusber. PopulAid: In-Memory Data Generation for Customized Benchmarks. In Proceedings of Fifth Workshop on Big Data Benchmarking, Potsdam, Germany, August 2014, Springer.