Downloading user manuals

BigDataBench 3.0 user manual [user manual 3.0]

Downloading raw data sets

Table 1: The Summary of Data Sets

No.  Data set                            Download                Size
1    Wikipedia Entries                   Wiki.bz2                9.8GB
2    Amazon Movie Reviews                AMR.tar.gz              3.1GB
3    Google Web Graph                    GWG.bz2                 23MB
4    Facebook Social Network             FSN.bz2                 220KB
5    E-commerce Transaction Data         ECT.tar.gz              3MB
6    ProfSearch Person Resumes           PPR.tar.gz              182MB
7    CALDA Data (synthetic data)         Hive_benchmark.tar.gz   257KB
8    TPC-DS Web Data (synthetic data)    TPCDS.tar.gz            384KB
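
Once an archive from Table 1 has been downloaded, it has to be unpacked before use. The following is a minimal Python sketch, not part of BigDataBench. It assumes that each `.tar.gz` file is a gzip-compressed tar archive and that each `.bz2` file is a single bzip2-compressed file (not a tar archive); the function name `unpack` and the directory layout are hypothetical.

```python
import bz2
import shutil
import tarfile
from pathlib import Path


def unpack(archive: Path, dest: Path) -> Path:
    """Unpack a .tar.gz archive or a single-file .bz2 into dest.

    Returns the extraction directory for .tar.gz files, or the path of the
    decompressed file for .bz2 files.
    """
    dest.mkdir(parents=True, exist_ok=True)
    name = archive.name
    if name.endswith(".tar.gz"):
        # Use the safer "data" extraction filter where this Python supports it.
        kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall(dest, **kwargs)
        return dest
    if name.endswith(".bz2"):
        # Assumption: the .bz2 file wraps one plain file, so strip the suffix.
        target = dest / name[: -len(".bz2")]
        with bz2.open(archive, "rb") as src, open(target, "wb") as out:
            shutil.copyfileobj(src, out)  # streams in chunks, not all at once
        return target
    raise ValueError(f"unsupported archive type: {name}")
```

For example, `unpack(Path("AMR.tar.gz"), Path("data/AMR"))` would extract the Amazon Movie Reviews archive into `data/AMR`, provided the file is in the current directory. Streaming the bzip2 data matters for the larger files such as Wiki.bz2, which should not be read into memory in one piece.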

Downloading software packages

We provide two options: download the full software package in one go, or download the components one by one. Note that you must download and deploy the prerequisite software packages before using BigDataBench; refer to the user manual for details. Install the following packages first. The running platform is Linux.

Software    Version
Hadoop      1.0.2
HBase       0.94.5
Cassandra   1.2.3
MongoDB     2.4.1
Mahout      0.8
Hive        0.9.0
Spark       0.8.0
Shark       0.8.0
Impala      1.1.1
Boost       1_43_0
Scala       2.9.3
GCC         4.8.2
GSL         1.16
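
Before deploying BigDataBench, it can help to confirm that the prerequisite tools are reachable. The sketch below is an illustration only, not part of BigDataBench: it checks whether a launcher command for each package is on the `PATH`. The mapping `PREREQUISITES` is an assumption covering a subset of the list above (Boost and GSL are libraries, so they are left out), and the check does not verify version numbers.

```python
import shutil

# Assumed launcher command for each package; adjust to your installation.
PREREQUISITES = {
    "Hadoop": "hadoop",
    "HBase": "hbase",
    "Cassandra": "cassandra",
    "MongoDB": "mongod",
    "Mahout": "mahout",
    "Hive": "hive",
    "Scala": "scala",
    "GCC": "gcc",
}


def missing_tools(commands, which=shutil.which):
    """Return the package names whose command is not found on the PATH."""
    return [name for name, cmd in commands.items() if which(cmd) is None]


if __name__ == "__main__":
    missing = missing_tools(PREREQUISITES)
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All listed commands were found on PATH.")
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use the default `shutil.which` is enough.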

Full downloading

Full software packages of different implementations are available from the following links:

Separate downloading

You may download the individual components of BigDataBench from the following tables.

BDGS: Big Data Generator Suite in BigDataBench

Name  Description
BDGS  Generates big data on the basis of six raw data sets. Text: BigDataGeneratorSuite.tar.gz (size: 95MB)


BigDataBench workloads. Note that the shell scripts for generating data and running each workload are included in the distribution.


Application Scenarios           Application Type     Workloads                 Description
Cloud OLTP                      Micro Benchmarks     Read                      BasicDatastoreOperations.tar.gz (size: 95MB)
                                Applications         Search Server             Available soon
Offline Analytics               Micro Benchmarks     Sort                      MicroBenchmarks.tar.gz (Hadoop version: 4.8MB; MPI version: 1.4MB; Spark version: 300KB)
                                                     BFS                       MPI version: BFS_MPI.tar.gz (size: 4.7MB)
                                Analytics Workloads  Index                     SearchEngine.tar.gz
                                                     Kmeans                    SNS.tar.gz
                                                     Connected Components
                                                     Collaborative Filtering   E-commerce.tar.gz
                                                     Naive Bayes
OLAP and Interactive Analytics  Micro Benchmarks     Project                   InteractiveMicroBenchmark.tar.gz
                                                     Cross Product
                                Analytics Workloads  Join Query                InteractiveQuery.tar.gz
                                                     Select Query
                                                     Aggregation Query
                                                     Eight TPC-DS Web Queries  OLAP.tar.gz