
One of the key challenges faced by organizations of any size, in any industry, is managing and analyzing the soaring quantity of data and harnessing that information to improve the business. IT is challenged by the high cost of purchasing and maintaining the hardware needed to accommodate large data volumes, while business users need quick access to information and analytics in order to react to changing market conditions.

SAP has always understood this challenge and has reacted by developing in-memory solutions
that specifically address the need to analyze large data volumes while reducing IT costs.

SAP is evolving its in-memory technology with the introduction of SAP In-Memory Appliance (SAP HANA) software, a flexible, multi-purpose, data-source-agnostic in-memory appliance that combines SAP software components optimized on hardware provided and delivered by SAP's leading hardware partners.

SAP HANA can enable organizations to analyze business operations based on large volumes of detailed information as the business develops. Organizations can instantly explore and analyze all of their transactional and analytical data from virtually any data source in real time. Operational data is captured in memory as business happens, and flexible views expose analytic information at the speed of thought. External data can be added to analytic models to expand analysis across the entire organization.

With SAP HANA, organizations can:

- Gain new ways to look at the business, based on instant, intuitive access to relevant information, along with greater ease of collaboration.
- Dramatically improve existing planning, forecasting, pricing optimization, and other processes.
- Operate with less hardware, maintenance, and testing, based on proven tools that are intuitive to implement whether delivered on-demand or via appliances.

OLAP cubes
OLAP (online analytical processing) cubes can be thought of as extensions to the two-dimensional array of a spreadsheet. For example, a company might wish to analyze some financial data by product, by time period, by city, by type of revenue and cost, and by comparing actual data with a budget. These additional methods of analyzing the data are known as dimensions.[3] Because there can be more than three dimensions in an OLAP system, the term hypercube is sometimes used.
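
To make the notion of dimensions concrete, here is a minimal sketch in Python (not from the original text; the records, field names, and the aggregate helper are invented for illustration) that sums a revenue measure along any chosen combination of dimensions:

from collections import defaultdict

# Hypothetical fact records: each carries dimension values and a numeric measure.
sales = [
    {"product": "Widget", "period": "2011-Q1", "city": "Berlin", "revenue": 1200.0},
    {"product": "Widget", "period": "2011-Q2", "city": "Berlin", "revenue": 1350.0},
    {"product": "Gadget", "period": "2011-Q1", "city": "Paris",  "revenue":  800.0},
    {"product": "Gadget", "period": "2011-Q2", "city": "Paris",  "revenue":  950.0},
]

def aggregate(records, dimensions, measure):
    """Sum a measure, grouped by any subset of dimensions (one 'slice' of the cube)."""
    cube = defaultdict(float)
    for rec in records:
        key = tuple(rec[d] for d in dimensions)
        cube[key] += rec[measure]
    return dict(cube)

# Revenue by product and period (two of the three available dimensions).
print(aggregate(sales, ["product", "period"], "revenue"))
# Revenue by city only (a coarser roll-up).
print(aggregate(sales, ["city"], "revenue"))

Each distinct combination of dimension values identifies one cell of the cube; adding a fourth or fifth dimension simply lengthens the key tuple, which is why the structure generalizes to a hypercube.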

Cube structure and data storage

The OLAP cube consists of numeric facts called measures, which are categorized by dimensions. The cube metadata (structure) may be created from a star schema or snowflake schema of tables in a relational database. Measures are derived from the records in the fact table, and dimensions are derived from the dimension tables.
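
The following sketch (illustrative only, with invented table contents and key values) mirrors that arrangement: the measure lives in a fact table whose rows reference dimension tables by key, and a query resolves those keys to group the measure by dimensions.

# Hypothetical star schema: one fact table and two dimension tables (invented data).
dim_product = {1: "Widget", 2: "Gadget"}      # product dimension table: key -> name
dim_city    = {10: "Berlin", 20: "Paris"}     # city dimension table: key -> name

# Fact table: each record carries foreign keys into the dimensions plus a measure.
fact_sales = [
    (1, 10, 1200.0),   # (product_key, city_key, revenue)
    (1, 20,  450.0),
    (2, 20,  800.0),
]

# Derive the measure grouped by dimensions by resolving the foreign keys.
revenue_by_product_city = {}
for product_key, city_key, revenue in fact_sales:
    key = (dim_product[product_key], dim_city[city_key])
    revenue_by_product_city[key] = revenue_by_product_city.get(key, 0.0) + revenue

print(revenue_by_product_city)
# {('Widget', 'Berlin'): 1200.0, ('Widget', 'Paris'): 450.0, ('Gadget', 'Paris'): 800.0}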
A database program must show its data as two-dimensional tables of columns and rows, but it stores that data as one-dimensional strings. For example, a database might have this table:


EmpId   Lastname   Firstname   Salary
1       Smith      Joe         40000
2       Jones      Mary        50000
3       Johnson    Cathy       44000

This simple table includes an employee identifier (EmpId), name fields (Lastname and
Firstname) and a salary (Salary).

This table exists in the computer's memory (RAM) and storage (hard drive). Although RAM and hard drives differ mechanically, the computer's operating system abstracts them. Still, the database must coax its two-dimensional table into a one-dimensional series of bytes for the operating system to write to RAM, the hard drive, or both.

A row-oriented database serializes all of the values in a row together, then the values in the next
row, and so on.

1,Smith,Joe,40000;
2,Jones,Mary,50000;
3,Johnson,Cathy,44000;

A column-oriented database serializes all of the values of a column together, then the values of
the next column, and so on.

1,2,3;
Smith,Jones,Johnson;
Joe,Mary,Cathy;
40000,50000,44000;
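
Assuming the simplified comma-and-semicolon text format used above, both layouts can be produced from the same employee table with a few lines of Python. This is an illustrative sketch, not how any particular database engine actually serializes its pages:

rows = [
    (1, "Smith", "Joe", 40000),
    (2, "Jones", "Mary", 50000),
    (3, "Johnson", "Cathy", 44000),
]

# Row-oriented layout: each record's values are stored contiguously.
row_store = ";".join(",".join(str(v) for v in row) for row in rows)

# Column-oriented layout: all values of one attribute are stored contiguously.
columns = zip(*rows)
column_store = ";".join(",".join(str(v) for v in col) for col in columns)

print(row_store)     # 1,Smith,Joe,40000;2,Jones,Mary,50000;3,Johnson,Cathy,44000
print(column_store)  # 1,2,3;Smith,Jones,Johnson;Joe,Mary,Cathy;40000,50000,44000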
 

This is a simplification. Partitioning, indexing, caching, views, OLAP cubes, and transactional mechanisms such as write-ahead logging or multiversion concurrency control all dramatically affect the physical organization. That said, online transaction processing (OLTP)-focused RDBMS systems are more row-oriented, while online analytical processing (OLAP)-focused systems are a balance of row-oriented and column-oriented.

Row-oriented versus column-oriented systems


Comparisons between row-oriented and column-oriented systems are typically concerned with the efficiency of hard-disk access for a given workload, as seek time is very long compared to the other delays in computers. Sometimes, reading a megabyte of sequentially stored data takes no more time than a single random access.[3] Further, because seek time is improving slowly relative to CPU power (see Moore's law), this focus will likely continue on systems reliant on hard disks for storage. Following is a set of over-simplified observations that attempt to paint a picture of the trade-offs between column- and row-oriented organizations; they apply unless the application can be reasonably assured to fit most or all of its data in memory, in which case huge optimizations are available from in-memory database systems.

1. Column-oriented systems are more efficient when an aggregate needs to be computed over many rows but only for a notably smaller subset of all columns of data, because reading that smaller subset of data can be faster than reading all data (see the sketch after this list).
2. Column-oriented systems are more efficient when new values of a column are supplied for all rows at once, because that column data can be written efficiently and replace old column data without touching any other columns for the rows.
3. Row-oriented systems are more efficient when many columns of a single row are required at the same time, and when row size is relatively small, as the entire row can be retrieved with a single disk seek.
4. Row-oriented systems are more efficient when writing a new row if all of the column data is supplied at the same time, as the entire row can be written with a single disk seek.
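
As a rough illustration of points 1 and 3, the following Python sketch models the two layouts purely in memory (invented structures, not a real storage engine): the columnar layout answers a single-column aggregate by touching only that column, while the row layout returns a complete record in one contiguous read.

# The same employee data, held once per layout (simplified in-memory model).
row_store = [
    (1, "Smith", "Joe", 40000),
    (2, "Jones", "Mary", 50000),
    (3, "Johnson", "Cathy", 44000),
]
column_store = {
    "EmpId":     [1, 2, 3],
    "Lastname":  ["Smith", "Jones", "Johnson"],
    "Firstname": ["Joe", "Mary", "Cathy"],
    "Salary":    [40000, 50000, 44000],
}

# Point 1: an aggregate over one column reads only that column in a column store...
total_salary_columnar = sum(column_store["Salary"])
# ...whereas a row store must scan every field of every row to reach the salaries.
total_salary_rowwise = sum(row[3] for row in row_store)

# Point 3: fetching one complete record is a single contiguous read in a row store...
employee_2 = row_store[1]
# ...but a column store must gather one value from each column.
employee_2_columnar = tuple(col[1] for col in column_store.values())

print(total_salary_columnar, total_salary_rowwise)  # 134000 134000
print(employee_2, employee_2_columnar)

On disk, the same difference shows up as fewer blocks read and fewer seeks for whichever layout matches the access pattern.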

In practice, row-oriented architectures are well suited for OLTP-like workloads, which are more heavily loaded with interactive transactions. Column stores are well suited for OLAP-like workloads (e.g., data warehouses), which typically involve a smaller number of highly complex queries over all data (possibly terabytes). However, there are a number of proven row-based OLAP RDBMSs that handle terabytes, or even petabytes, of data, such as Teradata.
