
Prof. Hasso Plattner

A Course in
In-Memory Data Management
The Inner Mechanics
of In-Memory Databases

September 4, 2015

This learning material is part of the reading material for Prof. Plattner's
online lecture "In-Memory Data Management" taking place at
www.openHPI.de. If you have any questions or remarks regarding the
online lecture or the reading material, please send us a note at
openhpi-imdb@hpi.uni-potsdam.de. We are glad to further improve the material.

Research Group "Enterprise Platform and Integration Concepts",
http://epic.hpi.uni-potsdam.de

Chapter 5
A Blueprint of SanssouciDB

SanssouciDB is a prototypical database system for unified analytical and
transactional processing. The concepts of SanssouciDB build on prototypes
developed at the HPI and an existing SAP database system. SanssouciDB is
an SQL database and contains components similar to those of other
databases, such as a query builder, a plan executor, metadata, a
transaction manager, etc.

5.1 Data Storage in Main Memory

In contrast to traditional database management systems, the primary
persistence of SanssouciDB is main memory. Yet logging and recovery still
require disks as non-volatile data storage to ensure data consistency in
case of failures. All operators, e.g., find, join, or aggregation, can
anticipate that data resides in main memory. Thus, operators are
implemented differently, moving the focus from optimizing for disk access
towards optimizing for main memory access and CPU utilization (see
Chapter 4).
This apparently subtle difference of moving the primary persistence has
a vast impact on performance, even when disk-based databases are
completely memory-resident. Ailamaki et al. investigated such fully cached
disk-based databases and found that a large portion of query execution
time is spent on memory and resource stalls [ADHW99]. Those stalls are
mainly caused by in-page data placements that do not utilize the CPU
caches properly. In many cases, the actual computation accounts for less
than 40% of the execution time. In addition, Harizopoulos et al. found
that the buffer management of disk-based databases alone contributes 31%
to the overall instruction count [HAMS08].
Consequently, the performance advantages of in-memory over disk-based
databases derive from optimized data structures and algorithms that avoid
memory and resource stalls, together with the removal of additional
indirections.


5.2 Column-Orientation

Another concept used in SanssouciDB was invented more than two decades
ago: storing data column-wise [CK85] instead of row-wise. In
column-orientation, complete columns are stored in adjacent blocks. In
contrast, row-oriented storage keeps complete tuples (rows) in adjacent
blocks. Unlike row-oriented storage, column-oriented storage is well
suited for reading consecutive entries from a single column, which is
useful for aggregation and column scans. More details on
column-orientation and how it differs from row-orientation can be found
in Chapter 8.
To minimize the amount of data that needs to be transferred between
storage and processor, SanssouciDB uses several different data compression
techniques, which will be discussed in Chapter 7.
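
The difference between the two layouts can be made concrete with a small
sketch. The table, its attribute names, and the helper functions below are
hypothetical and serve only as an illustration; they are not the storage
format of SanssouciDB.

```python
from array import array

# Hypothetical three-attribute table (id, country, amount), for illustration only.
rows = [
    (1, "DE", 120.0),
    (2, "US", 80.5),
    (3, "DE", 42.0),
]

# Row-oriented layout: complete tuples are stored one after another.
row_store = list(rows)

# Column-oriented layout: all values of one attribute are stored together.
column_store = {
    "id": array("q", (r[0] for r in rows)),
    "country": [r[1] for r in rows],
    "amount": array("d", (r[2] for r in rows)),
}


def sum_amount_row_store(store):
    # Every tuple is visited although only one of its fields is needed.
    return sum(t[2] for t in store)


def sum_amount_column_store(store):
    # Only the contiguous "amount" column is read.
    return sum(store["amount"])
```

Both functions return the same sum; the column-oriented variant only
touches the values of the one attribute it aggregates.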

5.3 Implications of Column-Orientation

Column-oriented storage has become widespread in database systems
specifically developed for OLAP, as its advantage is clear for
quasi-sequential scanning of single attributes and set processing thereof.
If not all fields of a table are queried, column-orientation can be
exploited in transactional processing as well (avoiding "SELECT *").
An analysis of enterprise applications showed that there is actually no
application that uses all fields of a given tuple. For example, dunning
requires only 17 attributes out of a table that contains 300 attributes.
If only the 17 needed attributes are queried instead of the full tuple
representation of all 300 attributes, the amount of data to be scanned
immediately drops by a factor of 8 to 20.
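
How such a factor arises can be checked with a short calculation. The
attribute widths below (8 and 16 bytes) are assumptions chosen for
illustration; the text only gives the attribute counts (17 of 300), and
the helper function is hypothetical.

```python
def scan_reduction_factor(needed_widths, all_widths):
    """Bytes per tuple in the full table divided by bytes in the needed columns."""
    return sum(all_widths) / sum(needed_widths)


# Assumption: all 300 attributes are equally wide (8 bytes each).
uniform = scan_reduction_factor([8] * 17, [8] * 300)

# Assumption: the 17 needed attributes are wider than the rest
# (16 bytes each versus 8 bytes for the remaining 283 attributes).
wide_needed = scan_reduction_factor([16] * 17, [16] * 17 + [8] * 283)
```

With equally wide attributes the factor is simply 300/17, roughly 17.6.
If the needed attributes are wider than average, the factor shrinks,
which is one way to read the range quoted above.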
As disk is no longer the bottleneck, but access to main memory has to
be considered, an important aspect is to work on a minimal set of data. So
far, application programmers have been fond of "SELECT *" statements. In
row-oriented storage, the difference in runtime between selecting specific
fields and selecting all fields is often insignificant, and if a later
change to the application needs more fields, the data is already there
(which, incidentally, is a weak argument for using "SELECT *" and
retrieving unnecessary data). With column-orientation, however, the
penalty for "SELECT *" statements grows with table width. Especially if
tables grow in width during productive usage, the actual runtimes of
applications cannot be anticipated during programming.
With the column-store approach, the number of indices can be
significantly reduced. In a column store, every attribute can be used as
an index. Because all data is available in memory and the data of a column
is stored consecutively, the scanning speed is high enough that a full
sequential scan of an attribute is sufficient in most cases. If this is
not fast enough, dedicated indices can still be used in addition for
further speedup.
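
The two access paths described above, a full sequential scan of a column
and an optional dedicated index, can be sketched as follows. The function
names and the example column are hypothetical, and the dictionary-based
inverted index is only one possible index structure, chosen here for
brevity.

```python
from collections import defaultdict


def scan_positions(column, value):
    """Full sequential scan: row positions where the column equals value."""
    return [pos for pos, v in enumerate(column) if v == value]


def build_index(column):
    """Optional dedicated index: maps each value to its row positions."""
    index = defaultdict(list)
    for pos, v in enumerate(column):
        index[v].append(pos)
    return index


# Hypothetical column of a table, stored consecutively.
country = ["DE", "US", "DE", "FR", "US", "DE"]
index = build_index(country)
```

The scan needs no extra structure and no maintenance on inserts; the
index answers the same lookup without touching every entry, at the cost
of additional memory and upkeep.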

Storing data in columns instead of rows is challenging for workloads
with many data-modifying operations. Therefore, the concept of a
differential buffer was introduced: new entries are written to the
differential buffer first. In contrast to the main store, the differential
buffer is optimized for inserts. At a later point in time, depending on
thresholds such as the frequency of changes and new entries, the data in
the differential buffer is merged into the main store. More details about
the differential buffer and the merge process are provided in Chapter 25
and Chapter 27.
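
The interplay of main store, differential buffer, and merge can be
sketched for a single column. This is a minimal illustration under stated
assumptions, not the merge process of SanssouciDB: the class name, the
size-based threshold, and the sorted list standing in for a
"read-optimized" main store are all hypothetical.

```python
class ColumnWithDelta:
    """Single column with a main store and a differential buffer (sketch)."""

    def __init__(self, merge_threshold=4):
        self.main = []   # stand-in for the read-optimized main store (kept sorted)
        self.delta = []  # insert-optimized differential buffer (append-only)
        self.merge_threshold = merge_threshold  # assumed size-based threshold

    def insert(self, value):
        # New entries are always written to the differential buffer first.
        self.delta.append(value)
        if len(self.delta) >= self.merge_threshold:
            self.merge()

    def merge(self):
        # Rebuild the main store from old main plus delta, then empty the delta.
        self.main = sorted(self.main + self.delta)
        self.delta = []

    def scan(self, predicate):
        # Until the next merge, reads have to consult both structures.
        hits = [v for v in self.main if predicate(v)]
        hits += [v for v in self.delta if predicate(v)]
        return hits
```

Inserts stay cheap because they only append to the buffer; the cost of
reorganizing the main store is paid once per merge rather than once per
insert.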

5.4 Architecture Overview

The architecture shown in Figure 5.1 gives an overview of the components
of SanssouciDB.
SanssouciDB is split into three logical layers, each fulfilling specific
tasks inside the database system. The Management Layer handles the
communication with applications, creates query execution plans, stores
metadata, and contains the logic for database transactions. The main
working set of SanssouciDB is located in the main memory of a specific
machine. That working set is accessed during query execution and is stored
in a row-oriented, column-oriented, or hybrid data layout, depending on
the specific type of queries sent to the database tables. The non-volatile
memory in the durable storage area is used for logging and recovery
purposes, as well as for data aging and time travel. All those concepts
will be described in the subsequent sections.

[Figure 5.1, diagram not reproduced. Labels, top to bottom: OLTP & OLAP
Applications (Financials, Logistics, Manufacturing); Management Layer
(SQL Interface, Stored Procedures, Query Execution, Metadata, Sessions,
Transactions); Main Memory Storage (Hot Store (Master) with Main, Delta,
and Merge; Cold Store - 1; Cold Store - 2; Read-only Replicas; each store
with Attribute Vectors, Dictionaries, Index, and Aggregate Cache;
History); Durable Storage (Log, Checkpoints).]

Fig. 5.1: Schematic Architecture of SanssouciDB


5.5 References

[ADHW99] Anastassia Ailamaki, David J. DeWitt, Mark D. Hill, and
David A. Wood. DBMSs on a modern processor: Where does
time go? In Malcolm P. Atkinson, Maria E. Orlowska, Patrick
Valduriez, Stanley B. Zdonik, and Michael L. Brodie, editors,
VLDB, pages 266-277, San Francisco, CA, USA, 1999. Morgan
Kaufmann.
[CK85] George P. Copeland and Setrag N. Khoshafian. A Decomposition
Storage Model. SIGMOD Rec., 14(4):268-279, May 1985.
[HAMS08] Stavros Harizopoulos, Daniel J. Abadi, Samuel Madden, and
Michael Stonebraker. OLTP through the looking glass, and what
we found there. In Jason Tsong-Li Wang, editor, SIGMOD
Conference, pages 981-992. ACM, 2008.
