Big data has emerged as a strategic asset of nations and organizations, and there is a pressing need to generate value from it. However, the sheer volume of big data demands significant storage capacity, transmission bandwidth, computation, and power. Systems of unprecedented scale are expected to resolve the problems posed by the variety of big data and its daunting volumes. Nevertheless, without big data benchmarks, it is very difficult for big data owners to choose the system that best meets their specific requirements. They also face challenges in optimizing their systems and solutions for specific or even comprehensive workloads. Meanwhile, researchers are working on innovative data management systems, hardware architectures, operating systems, and programming systems to improve performance in dealing with big data.

This workshop, the seventh in its series, focuses on architecture and system support for big data systems, aiming to bring researchers and practitioners from the data management, architecture, and systems research communities together to discuss research issues at the intersection of these areas.

Call for Papers


The workshop seeks papers that address pressing issues in benchmarking, designing, and optimizing big data systems. Specific topics of interest include, but are not limited to:

  • Big data workload characterization and benchmarking
  • Performance analysis of big data systems
  • Workload-optimized big data systems
  • Innovative prototypes of big data infrastructures
  • Emerging hardware technologies in big data systems
  • Operating systems support for big data systems
  • Interactions among architecture, systems and data management
  • Hardware and software co-design for big data
  • Practice reports on evaluating and optimizing large-scale big data systems

Papers should present original research. As big data spans many disciplines, papers should provide sufficient background material to make them accessible to the broader community.

Download CFP

Paper Submissions

Papers must be submitted in PDF and be no more than 8 pages in the standard two-column SIGPLAN conference format, including figures and tables but excluding references. Shorter submissions are encouraged; submissions will be judged on the merit of the ideas rather than their length. Submissions must be made through the online submission site.

Submission site:

Important Dates

Papers due:                     February 16, 2017
Notification of acceptance:     February 28, 2017
Workshop session:               April 9, 2017



Opening Remarks



Keynote I: Scalable In-memory Computing: A Perspective from Systems Software

Speaker: Prof. Haibo Chen, Shanghai Jiao Tong University

Abstract: In-memory computing promises 1000X faster data access, which brings opportunities to boost transaction processing speed to a higher level. In this talk, I will describe our recent research efforts in providing speedy in-memory transactions at the scale of millions of transactions per second. Specifically, I will present how we leverage advanced hardware features such as HTM and RDMA to provide better single-node and distributed in-memory transactions and query processing, how operating systems and processor architectures can be refined to further ease and improve in-memory transaction processing, and how concurrency control protocols can be adapted accordingly to fit our needs.

Bio: Haibo Chen is a Professor at the School of Software, Shanghai Jiao Tong University, where he currently leads the Institute of Parallel and Distributed Systems (IPADS). Haibo’s main research interest is building scalable and dependable systems software by leveraging cross-layer approaches spanning computer hardware, system virtualization, and operating systems. He received best paper awards from ICPP, APSys, and EuroSys; a best paper nomination from HPCA; the Young Computer Scientist Award from the China Computer Federation; the Distinguished Ph.D. Thesis Award from the China Ministry of Education; and support from the National Youth Top-notch Talent Support Program of China. He has also received faculty research awards/fellowships from NetApp, Google, IBM, and MSRA. He is currently the steering committee co-chair of ACM APSys and the general co-chair of SOSP 2017, and serves on the program committees of SOSP 2017, ASPLOS 2017, Oakland 2017, EuroSys 2017, and FAST 2017, as well as on the editorial board of ACM Transactions on Storage.



Invited Talk I: Big Data Dwarfs

Speaker: Dr. Lei Wang, Ms. Wanling Gao, ICT, CAS

Abstract: TBD.

Bio: Lei Wang obtained his Ph.D. degree in 2016 from the Institute of Computing Technology, Chinese Academy of Sciences, and the University of Chinese Academy of Sciences. He is currently a senior engineer at the Institute of Computing Technology, Chinese Academy of Sciences. His current research interests include resource management for cloud systems.
Wanling Gao is a Ph.D. candidate in computer science at the Institute of Computing Technology, Chinese Academy of Sciences, and the University of Chinese Academy of Sciences. Her research interests focus on big data benchmarking and big data analytics. She received her B.S. degree in 2012 from Huazhong University of Science and Technology.


Tea Break



Invited Talk II: TBD

Speaker: Prof. Xu Liu, College of William and Mary

Abstract: TBD

Bio: Xu Liu is an assistant professor in the Department of Computer Science at the College of William and Mary. He obtained his Ph.D. from Rice University in 2014. His research interests are parallel computing, compilers, and performance analysis. Prof. Liu has worked on several open-source performance tools that are used worldwide at universities, DOE national laboratories, and in industry. He received HPC fellowships from NAG, Schlumberger, and BP while a Ph.D. candidate at Rice University. After joining W&M, he received the Best Paper Award at SC’15.


Regular Paper I: Benchmarking Kudu Distributed Storage Engine on High-Performance Interconnects and Storage Devices

Authors: Nusrat Sharmin Islam, Md. Wasi-ur-Rahman, Xiaoyi Lu, Dhabaleswar K. (DK) Panda (The Ohio State University)

Abstract: During the past several years, Hadoop MapReduce and Spark have proven to be the two most popular Big Data processing frameworks, and the Hadoop Distributed File System (HDFS) is the underlying file system for both of them. But data in HDFS is static; it does not allow any update operations on the stored data. Consequently, in order to accelerate Online Analytical Processing (OLAP) workloads, novel storage engines such as Kudu have emerged. All these Big Data processing frameworks, along with their underlying storage engines, are being increasingly used on High-Performance Computing (HPC) systems. It is, therefore, critical to understand the performance of Kudu for fast analytics on rapidly changing data. Moreover, as the amount of data increases, the performance of Kudu becomes bounded by both network and storage performance. In this paper, we evaluate Kudu operations over different interconnects and storage devices on HPC platforms and observe that the performance of Kudu improves by up to 21% when moved from 40GigE Ethernet to IP-over-InfiniBand (IPoIB) 100Gbps. Similarly, when the underlying storage device is switched from hard disk to SSD, Kudu operations show a speedup of up to 29%. To the best of our knowledge, this is the first study to analyze the performance of Kudu over high-performance interconnects and storage devices.


Lunch Break



Keynote II: TBD

Speaker: Prof. TBD, TBD

Abstract: TBD

Bio: TBD



Invited Talk III: TBD

Speaker: TBD, TBD

Abstract: TBD

Bio: TBD



Invited Talk IV: TBD

Speaker: TBD, TBD

Abstract: TBD.

Bio: TBD.


Regular Paper II: Page Table Walk Aware Cache Management for Efficient Big Data Processing

Authors: Eishi Arima, Hiroshi Nakamura (The University of Tokyo)

Abstract: The performance penalty of page table walks after TLB misses is serious for modern computer systems. In particular, it is more severe when processing big data workloads, because they generally experience TLB misses more frequently due to their larger memory footprints and lower access locality. To execute such workloads more efficiently, we need to revisit caches, because they are accessed during page table walks to fetch Page Table Entries (PTEs) but are not optimized accordingly. Thus, this paper proposes a novel cache management scheme that attempts to optimize the allocation of PTEs in caches. More specifically, we optimize the eviction priorities for PTEs and data at each level of the cache hierarchy to maximize performance.

Venue Information


Contact Information

Prof. Jianfeng Zhan:
Dr. Gang Lu:      
Mr. Xinhui Tian: 


Program Chairs: 

Prof. Jianfeng Zhan, ICT, Chinese Academy of Sciences and University of Chinese Academy of Sciences
Dr. Gang Lu, Beijing Academy of Frontier Science & Technology
Mr. Xinhui Tian, ICT, Chinese Academy of Sciences and University of Chinese Academy of Sciences

Web and Publicity Chairs: 

Wanling Gao, ICT, CAS


Program Committee (Confirmed)

  • Lei Wang, Institute of Computing Technology, Chinese Academy of Sciences
  • Xu Liu, College of William and Mary
  • Tilmann Rabl, Technical University of Berlin
  • Bingsheng He, National University of Singapore
  • Zhen Jia, Princeton University
  • Xiaoyi Lu, The Ohio State University


Previous Events





October 7, 2013

IEEE BigData Conference, San Jose, CA


October 31, 2013

CCF HPC China, Guilin, China


December 5, 2013

CCF Big Data Technology Conference 2013, Beijing, China


March 1, 2014

ASPLOS 2014, Salt Lake City, Utah, USA


September 5, 2014

VLDB 2014, Hangzhou, Zhejiang Province, China


September 4, 2015

VLDB 2015, Hilton Waikoloa Village, Kohala Coast, Hawai‘i


April 3, 2016

ASPLOS 2016, Atlanta, Georgia, USA