Home

News: A full-day Tutorial at HPCA'19. Bench19 CFP. BigDataBench 5.0 released. PACT'18 paper on Data Motifs. IISWC'18 paper on Big Data and AI Proxy Benchmarks.

Overview

The architecture, system, data management, and AI/machine learning communities pay increasing attention to innovative big data and AI algorithms, architectures, and systems. However, the complexity, diversity, and rapid evolution of big data and AI systems, together with frequently changing workloads, raise great challenges: there is a lack of simple but elegant abstractions that facilitate understanding of these important classes of modern workloads. First, for the sake of conciseness, benchmarking scalability, portability cost, reproducibility, and better interpretation of performance data, we need to understand which classes of units of computation are the most time-consuming in big data and AI workloads. Second, for the sake of fairness, the benchmarks must reflect the diversity of data and workloads. Third, for software and hardware co-design, the benchmarks should be consistent across different communities. Moreover, we need simple but elegant abstractions that achieve both efficiency and generality.

We specify the common requirements of Big Data and AI workloads only algorithmically, in a paper-and-pencil approach, reasonably divorced from individual implementations. We capture the differences and collaborations among IoT, edge, datacenter, and HPC in handling Big Data and AI workloads. We consider each big data and AI workload as a pipeline of one or more classes of units of computation performed on initial or intermediate data inputs, each of which we call a data motif. For the first time, across a wide variety of big data and AI workloads, we identify eight data motifs (PACT'18 paper), namely Matrix, Sampling, Logic, Transform, Set, Graph, Sort, and Statistic computation, each of which captures the common requirements of one class of unit of computation. Rather than creating a new benchmark or proxy for every possible workload, we propose using data motif-based benchmarks, i.e., combinations of one or more data motifs, to represent the diversity of big data and AI workloads.
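To make the abstraction concrete, the following toy Python sketch (not BigDataBench code; all names are illustrative) models a workload as a pipeline of motif functions applied to initial and intermediate data:

```python
from collections import Counter

def sort_motif(records):
    # Sort computation on an intermediate data input.
    return sorted(records)

def statistic_motif(records):
    # Basic statistic computation: count occurrences per token.
    return Counter(records)

def pipeline(data, motifs):
    # A workload is one or more motifs applied in sequence to the
    # initial data input and to intermediate results.
    for motif in motifs:
        data = motif(data)
    return data

# A toy WordCount-like workload: statistics over sorted tokens.
tokens = "a rose is a rose".split()
print(pipeline(tokens, [sort_motif, statistic_motif]))
```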

We release an open-source big data and AI benchmark suite, BigDataBench. The current version, BigDataBench 5.0, provides 13 representative real-world data sets and 44 benchmarks. The benchmarks cover seven workload types, including AI, online services, offline analytics, graph analytics, data warehouse, NoSQL, and streaming, from three important application domains: Internet services (including search engines, social networks, and e-commerce), recognition sciences, and medical sciences. Our benchmark suite includes micro benchmarks, each of which is a single data motif; component benchmarks, which consist of data motif combinations; and end-to-end application benchmarks, which are combinations of component benchmarks. Meanwhile, data sets have a great impact on workload behavior and performance (our CGO'18 paper). Hence, we consider the whole spectrum of data types, including structured, semi-structured, and unstructured data. Currently, the included data sources are text, graph, table, and image data. Using real data sets as seeds, our data generators (BDGS) produce synthetic data by scaling the seed data while preserving the characteristics of the raw data.
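As a rough illustration of the BDGS idea, the hedged sketch below scales a seed text corpus by sampling from its word distribution; the real generators are far more sophisticated, and this unigram model is purely an assumption for illustration:

```python
import random
from collections import Counter

def generate_synthetic(seed_text, n_words):
    # Estimate the seed corpus's unigram word distribution...
    counts = Counter(seed_text.split())
    words, weights = zip(*counts.items())
    # ...and sample an arbitrarily large synthetic corpus from it,
    # roughly preserving the seed's word frequencies at any scale.
    return " ".join(random.choices(words, weights=weights, k=n_words))

seed = "big data workloads stress storage and compute and memory"
print(generate_synthetic(seed, 20))  # scale the tiny seed to any target size
```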

To achieve consistency of the benchmarks across different communities, we adopt state-of-the-art algorithms from the machine learning community, taking the model's prediction accuracy into account. For the benchmarking requirements of the system and data management communities, we provide diverse implementations using state-of-the-art techniques. For offline analytics, we provide Hadoop, Spark, Flink, and MPI implementations. For graph analytics, we provide Hadoop, Spark GraphX, Flink Gelly, and GraphLab implementations. For AI, we provide TensorFlow and Caffe implementations. For data warehouse, we provide Hive, Spark-SQL, and Impala implementations. For NoSQL, we provide MongoDB and HBase implementations. For streaming, we provide Spark Streaming and JStorm implementations.
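For a flavor of these implementations, here is a minimal PySpark sketch of a WordCount-style micro benchmark; the HDFS paths and application name are assumptions, not the paths shipped with the suite:

```python
from pyspark import SparkContext

sc = SparkContext(appName="WordCount-sketch")
counts = (sc.textFile("hdfs:///data/wikipedia-entries")  # assumed input path
            .flatMap(lambda line: line.split())          # tokenize each line
            .map(lambda word: (word, 1))                 # emit (word, 1) pairs
            .reduceByKey(lambda a, b: a + b))            # sum counts per word
counts.saveAsTextFile("hdfs:///out/wordcount")           # assumed output path
sc.stop()
```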

For the architecture community, running a comprehensive benchmark suite is time-consuming, whether early in the architecture design process or later during system evaluation; the complex software stacks of big data and AI workloads aggravate this issue. To tackle this challenge, we propose data motif-based simulation benchmarks (IISWC'18 paper) for the architecture community, which run about 100 times faster while preserving accurate system and microarchitectural characteristics. We also propose another methodology to reduce benchmarking cost: we select a small number of representative benchmarks, called the BigDataBench subset, according to workload characteristics from an architecture perspective. We provide the BigDataBench architecture subset (IISWC'14 paper) as MARSSx86, gem5, and Simics simulator versions, respectively.
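A hedged sketch of how such a subset can be chosen from workload characteristics, assuming PCA plus K-means over per-benchmark microarchitectural metrics; the metric matrix below is randomly generated purely for illustration and stands in for measured data:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

names = ["Sort", "Grep", "WordCount", "PageRank", "K-means", "Read"]
rng = np.random.default_rng(0)
metrics = rng.random((len(names), 10))     # stand-in for measured metrics
                                           # (IPC, cache miss ratios, ...)
reduced = PCA(n_components=3).fit_transform(metrics)
labels = KMeans(n_clusters=3, n_init=10).fit_predict(reduced)
for c in range(3):
    members = [n for n, l in zip(names, labels) if l == c]
    print(f"cluster {c}: {members} -> keep {members[0]} as representative")
```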

Modern datacenter computer systems widely run mixed workloads to improve system utilization and save cost. However, the throughput of latency-critical workloads is dominated by their worst-case performance, i.e., tail latency. To model this important application scenario, we propose an end-to-end application benchmark, DCMix, which generates mixed workloads whose latencies range from microseconds to minutes, with four mixed execution modes.

Modern Internet services workloads are notoriously complex in terms of their industry-scale architectures fueled by machine learning algorithms. As joint work with Alibaba, we release an end-to-end application benchmark, E-commerce Search, to mimic complex modern Internet services workloads.

To measure and rank high performance AI computer systems (HPC AI systems) or AI supercomputers, we also release an HPC AI benchmark suite (AI500), consisting of micro benchmarks, each of which is a single data motif, and component benchmarks, e.g., ResNet-50. We will release an AI500 list at BenchCouncil conferences soon.

Together with several industry partners, including Telecom Research Institute Technology, Huawei, Intel (China), Microsoft (China), IBM CDL, Baidu, Sina, INSPUR, and ZTE, we also release China's first industry-standard big data benchmark suite, BigDataBench-DCA, which is a subset of BigDataBench 3.0.

Contributors

If you have any questions, please contact us via email:
  • gaowanling@ict.ac.cn
  • luochunjie@ict.ac.cn
  • wangle_2011@ict.ac.cn
  • zhanjianfeng@ict.ac.cn

Prof. Jianfeng Zhan, SSL, ICT, Chinese Academy of Sciences, and BenchCouncil
Dr. Lei Wang, ICT, Chinese Academy of Sciences    
Dr. Wanling Gao, ICT, Chinese Academy of Sciences    
Chunjie Luo, ICT, Chinese Academy of Sciences
Dr. Chen Zheng, ICT, Chinese Academy of Sciences and BenchCouncil    
Zihan Jiang, University of Chinese Academy of Sciences
Fei Tang, University of Chinese Academy of Sciences
Minghe Yu, University of Chinese Academy of Sciences
Dr. Zheng Cao, Alibaba    
Hainan Ye, Beijing Academy of Frontier Sciences and BenchCouncil    
Dr. Zhen Jia, Princeton University and BenchCouncil    
Daoyi Zheng, Baidu    
Shujie Zhang, Huawei    
Haoning Tang, Tencent    
Dr. Yingjie Shi
Zijian Ming, Tencent    
Yuanqing Guo, Sohu    
Yongqiang He, Dropbox
Kent Zhan, Tencent (Previously), WUBA(Currently)    
Xiaona Li, Baidu    
Bizhu Qiu, Yahoo!
Qiang Yang, Beijing Academy of Frontier Sciences
Jingwei Li, Beijing Academy of Frontier Sciences
Dr. Xinhui Tian, ICT, Chinese Academy of Sciences    
Dr. Gang Lu, Beijing Academy of Frontier Sciences
Xinlong Lin, Beijing Academy of Frontier Sciences
Rui Ren, ICT, Chinese Academy of Sciences    
Dr. Rui Han, Beijing Institute of Technology    

Benchmark Models

We provide three benchmark models for evaluating hardware, software systems, and algorithms, respectively.

(1) BigDataBench intact Model Division. This model is for hardware benchmarking. Users should run the provided implementations directly on their hardware without modification. The only allowed tuning covers hardware, OS, and compiler settings.

(2) BigDataBench constrained Model Division. This model is for software system benchmarking. The division specifies the model to be used and restricts the values of hyperparameters, e.g., batch size and learning rate. Users may implement the algorithms themselves on their own software platforms or frameworks (see the illustrative sketch after this list).

(3) BigDataBench free Model Division. This model is for algorithm benchmarking. Users must use the specified data set; the emphasis is on advancing the state of the art in algorithms.
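The sketch below illustrates, with assumed (not official) values, what the constrained Model Division pins down versus what it leaves free:

```python
# Hypothetical illustration only: these values are assumptions, not the
# official division settings.
CONSTRAINED_MODEL_DIVISION = {
    "benchmark": "Image classification",
    "model": "ResNet-50",            # the division specifies the model
    "hyperparameters": {             # restricted values (example numbers)
        "batch_size": 256,
        "learning_rate": 0.1,
    },
    # Left free: users may re-implement the algorithm on any stack.
    "free_choices": ["framework", "parallel strategy", "numerical libraries"],
}
print(CONSTRAINED_MODEL_DIVISION["model"])
```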

Metrics

For the BigDataBench intact Model Division, the metrics are the wall-clock time and energy efficiency of running the benchmarks.

For the BigDataBench constrained Model Division, the metrics are the wall-clock time and energy efficiency of running the benchmarks. In addition, the values of the hyperparameters should be reported for auditing.

For the BigDataBench free Model Division, the metrics include accuracy, as well as the wall-clock time and energy efficiency of running the benchmarks.
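As an illustration of how these metrics could be collected, the following sketch times a run and derives energy efficiency as work per joule; run_benchmark and read_energy_joules are hypothetical hooks, not part of the suite:

```python
import time

def measure(run_benchmark, read_energy_joules, work_units):
    # Wall-clock time: timestamps taken around the complete run.
    e0, t0 = read_energy_joules(), time.time()
    run_benchmark()
    t1, e1 = time.time(), read_energy_joules()
    # Energy efficiency: units of work completed per joule consumed.
    return {"wall_clock_s": t1 - t0,
            "work_per_joule": work_units / (e1 - e0)}
```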

Numbers

Benchmarking results will be available soon.

Benchmark Methodology

We specify the common requirements of Big Data and AI only algorithmically, in a paper-and-pencil approach, reasonably divorced from individual implementations. We capture the differences and collaborations among IoT, edge, datacenter, and HPC in handling Big Data and AI workloads. We consider each big data and AI workload as a pipeline of one or more classes of units of computation performed on initial or intermediate data inputs, each of which we call a data motif. Rather than creating a new benchmark or proxy for every possible workload, we propose using data motif-based benchmarks, i.e., combinations of the eight data motifs, to represent the diversity of big data and AI workloads. Figure 1 summarizes our data motif-based scalable benchmarking methodology.

Figure 1: BigDataBench Benchmarking Methodology.

Benchmarks Summary

BigDataBench is under rapid expansion and evolution. The current version, BigDataBench 5.0, includes real-world data sets and big data and AI workloads covering seven workload types. Table 1 summarizes the real-world data sets and scalable data generation tools included in BigDataBench 5.0, covering the whole spectrum of data types (structured, semi-structured, and unstructured) and different data sources (text, graph, image, audio, video, and table data). Table 2 and Table 3 present the micro benchmarks and component benchmarks in BigDataBench 5.0 from the perspectives of involved data motifs, application domain, workload type, data set, and software stack.

Table 1. The summary of data sets and data generation tools

No. | Data Set | Data Size | Scalable Data Generation Tool
1 | Wikipedia Entries | 4,300,000 English articles (unstructured text) | Text Generator of BDGS
2 | Amazon Movie Reviews | 7,911,684 reviews (semi-structured text) | Text Generator of BDGS
3 | Google Web Graph | 875,713 nodes, 5,105,039 edges (unstructured graph) | Graph Generator of BDGS
4 | Facebook Social Network | 4,039 nodes, 88,234 edges (unstructured graph) | Graph Generator of BDGS
5 | E-commerce Transaction Data | Table 1: 4 columns, 38,658 rows; Table 2: 6 columns, 242,735 rows (structured table) | Table Generator of BDGS
6 | ProfSearch Person Resumes | 278,956 resumes (semi-structured table) | Table Generator of BDGS
7 | CIFAR-10 | 60,000 color images with dimension 32x32 | Ongoing development
8 | ImageNet | ILSVRC2014 DET image data set (unstructured image) | Ongoing development
9 | LSUN | One million labeled images in 10 scene categories and 20 object categories | Ongoing development
10 | TED Talks | Translated TED talks provided by the IWSLT evaluation campaign | Ongoing development
11 | SoGou Data | Corpus and search query data from SoGou Labs (unstructured text) | Ongoing development
12 | MNIST | Handwritten digits database with 60,000 training and 10,000 test examples (unstructured image) | Ongoing development
13 | MovieLens Dataset | Users' movie ratings, with 9,518,231 training and 386,835 test examples (semi-structured text) | Ongoing development

Micro Benchmarks

Table 2. The summary of the micro benchmarks in BigDataBench 5.0.

Micro Benchmark | Involved Data Motif | Application Domain [1] | Workload Type | Data Set | Software Stack
Sort | Sort | SE, SN, EC, MP, BI | Offline analytics | Wikipedia entries | Hadoop, Spark, Flink, MPI
Grep | Set | SE, SN, EC, MP, BI | Offline analytics | Wikipedia entries | Hadoop, Spark, Flink, MPI
Grep | Set | SE, SN, EC, MP, BI | Streaming | Randomly generated text | Spark Streaming
WordCount | Basic statistics | SE, SN, EC, MP, BI | Offline analytics | Wikipedia entries | Hadoop, Spark, Flink, MPI
MD5 | Logic | SE, SN, EC, MP, BI | Offline analytics | Wikipedia entries | Hadoop, Spark, MPI
Connected Component | Graph | SN | Graph analytics | Facebook social network | Hadoop, Spark, Flink, GraphLab, MPI
RandSample | Sampling | SE, MP, BI | Offline analytics | Wikipedia entries | Hadoop, Spark, MPI
FFT | Transform | MP | Offline analytics | Two-dimensional matrix | Hadoop, Spark, MPI
Matrix Multiply | Matrix | SE, SN, EC, MP, BI | Offline analytics | Two-dimensional matrix | Hadoop, Spark, MPI
Read | Set | SE, SN, EC | NoSQL | ProfSearch resumes | HBase, MongoDB
Write | Set | SE, SN, EC | NoSQL | ProfSearch resumes | HBase, MongoDB
Scan | Set | SE, SN, EC | NoSQL | ProfSearch resumes | HBase, MongoDB
OrderBy | Set, Sort | EC | Data warehouse | E-commerce transaction | Hive, Spark-SQL, Impala
Aggregation | Set, Basic statistics | EC | Data warehouse | E-commerce transaction | Hive, Spark-SQL, Impala
Project | Set | EC | Data warehouse | E-commerce transaction | Hive, Spark-SQL, Impala
Filter | Set | EC | Data warehouse | E-commerce transaction | Hive, Spark-SQL, Impala
Select | Set | EC | Data warehouse | E-commerce transaction | Hive, Spark-SQL, Impala
Union | Set | EC | Data warehouse | E-commerce transaction | Hive, Spark-SQL, Impala
Convolution | Transform | SN, EC, MP, BI | AI | CIFAR, ImageNet | TensorFlow, Caffe
Fully Connected | Matrix | SN, EC, MP, BI | AI | CIFAR, ImageNet | TensorFlow, Caffe
Relu | Logic | SN, EC, MP, BI | AI | CIFAR, ImageNet | TensorFlow, Caffe
Sigmoid | Matrix | SN, EC, MP, BI | AI | CIFAR, ImageNet | TensorFlow, Caffe
Tanh | Matrix | SN, EC, MP, BI | AI | CIFAR, ImageNet | TensorFlow, Caffe
MaxPooling | Sampling | SN, EC, MP, BI | AI | CIFAR, ImageNet | TensorFlow, Caffe
AvgPooling | Sampling | SN, EC, MP, BI | AI | CIFAR, ImageNet | TensorFlow, Caffe
CosineNorm | Basic statistics | SN, EC, MP, BI | AI | CIFAR, ImageNet | TensorFlow, Caffe
BatchNorm | Basic statistics | SN, EC, MP, BI | AI | CIFAR, ImageNet | TensorFlow, Caffe
Dropout | Sampling | SN, EC, MP, BI | AI | CIFAR, ImageNet | TensorFlow, Caffe

[1] SE (Search Engine), SN (Social Network), EC (E-commerce), MP (Multimedia Processing), BI (Bioinformatics).

Component Benchmarks

Each component benchmark is specified with a problem statement, one or more data sets, algorithms, involved data motifs, implementations, and their contributors.

Image classification

Workload type: AI
Application domains:
Dataset: Russakovsky, O.; Deng, J.; Su, H.; Krause, J.; Satheesh, S.; Ma, S.; Huang, Z.; Karpathy, A.; Khosla, A.; Bernstein, M. S.; Berg, A. C.; Fei-Fei, L. (2015). ImageNet Large Scale Visual Recognition Challenge, International Journal of Computer Vision (IJCV).
Algorithm: He, K.; Zhang, X.; Ren, S; Sun, J. (2015), 'Deep Residual Learning for Image Recognition', CoRR abs/1512.03385.
Involved data motifs:
Software stacks:
Implementation Contributors:
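For a flavor of the TensorFlow implementation style, a minimal Keras sketch of ResNet inference on an ImageNet-style image follows; Keras's bundled ResNet50 and the image path stand in for the benchmark's actual code and data:

```python
import numpy as np
import tensorflow as tf

# Load a pretrained ResNet-50 (stand-in for the benchmark implementation).
model = tf.keras.applications.ResNet50(weights="imagenet")

# "cat.jpg" is an assumed local file; resize to the network's input shape.
img = tf.keras.preprocessing.image.load_img("cat.jpg", target_size=(224, 224))
x = tf.keras.applications.resnet50.preprocess_input(
    np.expand_dims(tf.keras.preprocessing.image.img_to_array(img), axis=0))

preds = model.predict(x)
print(tf.keras.applications.resnet50.decode_predictions(preds, top=3)[0])
```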

Image generation

Workload type: AI
Application domains:
Dataset: Fisher Yu, Yinda Zhang, Shuran Song, Ari Seff, and Jianxiong Xiao. LSUN: Construction of a large-scale image dataset using deep learning with humans in the loop. Corr, abs/1506.03365, 2015.
Algorithm: Arjovsky, Martin, Chintala, Soumith, and Bottou, Léon. Wasserstein GAN. arXiv preprint arXiv:1701.07875, 2017.
Involved data motifs:
Software stacks:
Implementation Contributors:

Text-to-Text Translation

Workload type: AI
Application domains:
Dataset: WMT English-German from Bojar, O.; Buck, C.; Federmann, C.; Haddow, B.; Koehn, P.; Monz, C.; Post, M.; Specia, L., ed. (2014), Proceedings of the Ninth Workshop on Statistical Machine Translation, Association for Computational Linguistics, Baltimore, Maryland, USA.
Algorithm: Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A. N.; Kaiser, L.; Polosukhin, I. (2017), 'Attention Is All You Need', CoRR abs/1706.03762.
Involved data motifs:
Software stacks:
Implementation Contributors:

Image-to-Text

Workload type: AI
Application domains:
Dataset: MS COCO dataset, http://cocodataset.org/
Algorithm: "Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge." Oriol Vinyals, Alexander Toshev, Samy Bengio, Dumitru Erhan. IEEE transactions on pattern analysis and machine intelligence (2016).
Involved data motifs:
Software stacks:
Implementation Contributors:

Image-to-Image

Workload type: AI
Application domains:
Dataset: M. Cordts, M. Omran, S. Ramos, T. Rehfeld, M. Enzweiler, R. Benenson, U. Franke, S. Roth, and B. Schiele. The cityscapes dataset for semantic urban scene understanding. In CVPR, 2016
Algorithm: Zhu, J.-Y.; Park, T.; Isola, P.; Efros, A. A. (2017), 'Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks', ICCV 2017.
Involved data motifs:
Software stacks:
Implementation Contributors:

Speech-to-Text

Workload type: AI
Application domains:
Dataset: Panayotov, V.; Chen, G.; Povey, D.; Khudanpur, S. (2015), Librispeech: An ASR corpus based on public domain audio books, in '2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)', pp. 5206-5210.
Algorithm: Amodei, D.; Anubhai, R.; Battenberg, E.; Case, C.; Casper, J.; Catanzaro, B.; Chen, J.; Chrzanowski, M.; Coates, A.; Diamos, G.; Elsen, E.; Engel, J.; Fan, L.; Fougner, C.; Han, T.; Hannun, A. Y.; Jun, B.; LeGresley, P.; Lin, L.; Narang, S.; Ng, A. Y.; Ozair, S.; Prenger, R.; Raiman, J.; Satheesh, S.; Seetapun, D.; Sengupta, S.; Wang, Y.; Wang, Z.; Wang, C.; Xiao, B.; Yogatama, D.; Zhan, J.; Zhu, Z. (2015), 'Deep Speech 2: End-to-End Speech Recognition in English and Mandarin', CoRR abs/1512.02595.
Involved data motifs:
Software stacks:
Implementation Contributors:

Face embedding

Workload type: AI
Application domains:
Dataset: G. B. Huang, M. Ramesh, T. Berg, and E. Learned-Miller. Labeled faces in the wild: A database for studying face recognition in unconstrained environments. Technical Report 07-49, University of Massachusetts, Amherst, October 2007.
Algorithm: Schroff, F.; Kalenichenko, D.; Philbin, J. (2015), 'FaceNet: A Unified Embedding for Face Recognition and Clustering', CVPR 2015.
Involved data motifs:
Software stacks:
Implementation Contributors:

Object detection

Workload type: AI
Application domains:
Dataset: Lin, T.-Y.; Maire, M.; Belongie, S. J.; Bourdev, L. D.; Girshick, R. B.; Hays, J.; Perona, P.; Ramanan, D.; Dollár, P.; Zitnick, C. L. (2014), 'Microsoft COCO: Common Objects in Context', CoRR abs/1405.0312.
Algorithm: Ren, S.; He, K.; Girshick, R.; Sun, J. (2015), 'Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks', NIPS 2015.
Involved data motifs:
Software stacks:
Implementation Contributors:

Recommendation

Workload type: AI
Application domains:
Dataset: Harper, F. M.; Konstan, J. A. (2015), 'The MovieLens Datasets: History and Context', ACM Trans. Interact. Intell. Syst. 5(4), 19:1--19:19.
Algorithm: Koren, Y., Bell, R.M., Volinsky, C. Matrix factorization techniques for recommender systems. IEEE Computer 42(8), 30–37 (2009)
Involved data motifs:
Software stacks:
Implementation Contributors:
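A toy NumPy sketch of the cited matrix-factorization approach, learning user and item factors by SGD on a fabricated rating list (the benchmark's MovieLens pipeline is far more complete):

```python
import numpy as np

# (user, item, rating) triples; fabricated stand-in for MovieLens data.
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (2, 1, 1.0)]
n_users, n_items, k, lr, reg = 3, 2, 4, 0.01, 0.1
rng = np.random.default_rng(0)
P = rng.normal(scale=0.1, size=(n_users, k))   # latent user factors
Q = rng.normal(scale=0.1, size=(n_items, k))   # latent item factors

for epoch in range(200):
    for u, i, r in ratings:
        p, q = P[u].copy(), Q[i].copy()
        err = r - p @ q                         # prediction error
        P[u] += lr * (err * q - reg * p)        # SGD step on user factors
        Q[i] += lr * (err * p - reg * q)        # SGD step on item factors

print("predicted rating (user 0, item 0):", P[0] @ Q[0])
```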

PageRank

Workload type: Graph Analytics
Application domains:
Dataset: Google web graph. http://snap.stanford.edu/data/web-Google.htm
Algorithm: L. Page, S. Brin, R. Motwani, and T. Winograd. The PageRank citation ranking: Bringing order to the web. Technical report, Stanford University, Stanford, CA, 1998.
Involved data motifs:
Software stacks:
Implementation Contributors:
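A compact NumPy sketch of the cited PageRank algorithm, damped power iteration on a toy graph (the benchmark runs it on the Google web graph):

```python
import numpy as np

links = {0: [1, 2], 1: [2], 2: [0]}        # toy directed graph: src -> [dst]
n, d = 3, 0.85                             # node count and damping factor
M = np.zeros((n, n))
for src, outs in links.items():
    for dst in outs:
        M[dst, src] = 1.0 / len(outs)      # column-stochastic transition matrix

rank = np.full(n, 1.0 / n)                 # start from a uniform distribution
for _ in range(50):
    rank = (1 - d) / n + d * (M @ rank)    # damped power-iteration update
print(rank)
```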

Graph Model

Workload type: Graph Analytics
Application domains:
Dataset: Wikipedia English articles. https://dumps.wikimedia.org/
Algorithm: D. M. Blei, A. Y. Ng, and M. I. Jordan. Latent dirichlet allocation. the Journal of machine Learning research, 3:993–1022, 2003.
Involved data motifs:
Software stacks:
Implementation Contributors:

Clustering

Workload type: Big Data
Application domains:
Dataset: Facebook social network. http://snap.stanford.edu/data/egonets-Facebook.html
Algorithm: Krishna, K., Murty, M. N. (1999). Genetic K-means algorithm. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 29(3), 433-439.
Involved data motifs:
Software stacks:
Implementation Contributors:

Classification

Workload type:
Application domains:
Dataset: Amazon movie review. http://snap.stanford.edu/data/web-Movies.html
Algorithm: Rish, I. (2001, August). An empirical study of the naive Bayes classifier. In IJCAI 2001 workshop on empirical methods in artificial intelligence (Vol. 3, No. 22, pp. 41-46). New York: IBM.
Involved data motifs:
Software stacks:
Implementation Contributors:

Feature Extraction

Workload type:
Application domains:
Dataset: ImageNet. http://www.image-net.org
Algorithm: Lowe, D. G. (2004). Distinctive image features from scale-invariant keypoints. International journal of computer vision, 60(2), 91-110.
Involved data motifs:
Software stacks:
Implementation Contributors:

Search Engine Indexing

Workload type:
Application domains:
Dataset: Wikipedia English articles. https://dumps.wikimedia.org/
Algorithm: Black, Paul E., inverted index, Dictionary of Algorithms and Data Structures, U.S. National Institute of Standards and Technology Oct 2006. Verified Dec 2006.
Involved data motifs:
Software stacks:
Implementation Contributors:

Application Benchmarks

DCMix

Modern datacenter computer systems widely run mixed workloads to improve system utilization and save cost. However, the throughput of latency-critical workloads is dominated by their worst-case performance, i.e., tail latency. To model this important application scenario, we propose an end-to-end application benchmark, DCMix, which generates mixed workloads whose latencies range from microseconds to minutes, with four mixed execution modes.

E-commerce search

Modern Internet services workloads are notoriously complex in terms of their industry-scale architectures fueled by machine learning algorithms. As joint work with Alibaba, we release an end-to-end application benchmark, E-commerce Search, to mimic complex modern Internet services workloads.

HPC AI Version

To measure and rank high performance AI computer systems (HPC AI systems) or AI supercomputers, we also release an HPC AI benchmark suite (AI500), consisting of micro benchmarks, each of which is a single data motif, and component benchmarks, e.g., ResNet-50. We will release an AI500 list at BenchCouncil conferences soon.

Evolution

As shown in Figure 2, the evolution of BigDataBench has gone through three major stages. At the first stage, we released three benchmark suites: BigDataBench 1.0 (6 workloads from search engines), DCBench 1.0 (11 workloads from data analytics), and CloudRank 1.0 (mixed data analytics workloads).

At the second stage, we merged the previous three benchmark suites and released BigDataBench 2.0, after investigating the three most important application domains of Internet services in terms of page views and daily visitors. BigDataBench 2.0 includes 6 real-world data sets and 19 big data workloads with different implementations, covering six application scenarios: micro benchmarks, Cloud OLTP, relational query, search engine, social networks, and e-commerce. Moreover, BigDataBench 2.0 provides several big data generation tools (BDGS) to generate scalable big data, e.g., at PB scale, from small-scale real-world data while preserving its original characteristics.

BigDataBench 3.0 was a multidisciplinary effort. It includes 6 real-world data sets, 2 synthetic data sets, and 32 big data workloads, covering micro and application benchmarks from typical application domains, e.g., search engine, social networks, and e-commerce. To generate representative and diverse big data workloads, BigDataBench 3.0 focuses on units of computation that frequently appear in Cloud OLTP, OLAP, and interactive and offline analytics.

BigDataBench 4.0 provides 13 representative real-world data sets and 47 benchmarks. Rather than creating a new benchmark or proxy for every possible workload, it uses data motif-based benchmarks, i.e., combinations of the eight data motifs, to represent the diversity of big data and AI workloads. The suite includes micro benchmarks, each of which is a single data motif; component benchmarks, which consist of data motif combinations; and end-to-end application benchmarks, which are combinations of component benchmarks.

Figure 2: BigDataBench Evolution

Previous releases

BigDataBench 4.0 http://prof.ict.ac.cn/BigDataBench/old/4.0/

BigDataBench 3.2 http://prof.ict.ac.cn/BigDataBench/old/3.2/

BigDataBench 3.1 http://prof.ict.ac.cn/BigDataBench/old/3.1/

BigDataBench 3.0 http://prof.ict.ac.cn/BigDataBench/old/3.0/

BigDataBench 2.0 http://prof.ict.ac.cn/BigDataBench/old/2.0/

BigDataBench 1.0 http://prof.ict.ac.cn/BigDataBench/old/1.0/

DCBench 1.0 http://prof.ict.ac.cn/DCBench/

CloudRank 1.0 http://prof.ict.ac.cn/CloudRank/

License

BigDataBench is available for researchers interested in big data. BigDataBench itself is open source under the Apache License, Version 2.0; please use all files in compliance with the License. Its software components are all available as open-source software and are governed by their own licensing terms; researchers intending to use BigDataBench are required to fully understand and abide by the licensing terms of the various components.

Software developed externally (not by the BigDataBench group) is governed by its own upstream licenses.

Software developed internally (by the BigDataBench group):

BigDataBench_4.0 License

BigDataBench_4.0 Suite. Copyright (c) 2013-2018, ICT, Chinese Academy of Sciences. All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

  • Redistributions of source code must comply with the license and notice disclaimers.
  • Redistributions in binary form must reproduce the above copyright notice, this list of conditions, and the following disclaimers in the documentation and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE ICT CHINESE ACADEMY OF SCIENCES BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.