http://debajitb.wix.com/debajitbanerjee/apps/blog/comparison-between-row-and-column-store
With the new Business Suite, load times have been cut by a factor of 20 and batch programming eliminated
completely, he explained, and mathematical functions have been slimmed down in the quest for faster updates,
with a goal of making three-second response times the norm.
But if we look inside the database, why and how is it so fast?
- Exploitation of current hardware developments
- Main Memory is the New Disk
- Non-Uniform Memory Access (NUMA)
- Multi-core processor parallelism
- Efficient communication between the database layer and the application layer
- Pushing more application semantics into the data management layer
- Data compression achieves a reduction in disk space
- Different compression techniques, Light-Weight/Heavy-Weight
- Compression-aware query execution
- Data-Dependent Optimization
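The light-weight compression mentioned above is typically dictionary encoding: each distinct value of a column is stored once in a sorted dictionary, and the column itself becomes a vector of small integer value IDs. A minimal sketch in plain Python (the function names and sample symbols are illustrative, not HANA internals):

```python
# A minimal sketch of light-weight (dictionary) compression as used by
# column stores: column values are replaced by small integer value IDs
# that point into a sorted dictionary of the distinct values.

def dictionary_encode(column):
    """Return (dictionary, value_ids) for a list of column values."""
    dictionary = sorted(set(column))                 # distinct values, sorted
    position = {value: i for i, value in enumerate(dictionary)}
    value_ids = [position[v] for v in column]        # one small int per row
    return dictionary, value_ids

def dictionary_decode(dictionary, value_ids):
    """Reconstruct the original column from its encoded form."""
    return [dictionary[i] for i in value_ids]

if __name__ == "__main__":
    stock_symbols = ["AA", "ABB", "AA", "ABB", "ACE", "AA", "ABB"]
    dictionary, value_ids = dictionary_encode(stock_symbols)
    print(dictionary)   # ['AA', 'ABB', 'ACE'] - only 3 distinct strings stored
    print(value_ids)    # [0, 1, 0, 1, 2, 0, 1] - 7 small integers
    assert dictionary_decode(dictionary, value_ids) == stock_symbols
```

With many repeated values (typical for columns like stock symbols or document types), storing each string once plus a vector of small integers takes far less space than repeating the strings row by row.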
The SAP HANA database supports both a row store and a column store. Some of the HANA administrative tables
live in the row store (e.g. the SYS schema and the tables from the Statistics Server), whereas other administrative
tables live in the column store (e.g. the _SYS_BI, _SYS_BIC and _SYS_REPO schemas). Transactional data stored in
the physical tables of the SAP HANA database is used for analytical purposes. HANA analytical data modeling is
only possible for columnar tables; i.e. the Information Modeler works only with column-store tables. The
replication server (SLT) and Data Services create tables in the column store by default.
If transactional data is stored in a column-based table, it enables
- fast on-the-fly aggregations,
- ad-hoc reporting.
How is data stored in column format, and how does it differ from row format?
The diagram below will help you understand this very easily.
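The difference between the two layouts can also be sketched in plain Python. The sample symbols, dates and volumes below are illustrative (they are not the actual figures from the CDHDR tables):

```python
# Row-major vs column-major storage of the same table, sketched in plain
# Python. In a row store each record's fields sit together; in a column
# store all values of one field sit together, so a scan or aggregation
# over a single column touches only that column's data.

# Row store: one tuple per record (all fields of a record are adjacent).
row_store = [
    ("AA",  "2010-02-01", 2019600),
    ("AAP", "2010-02-01",  580500),
    ("ABB", "2010-02-01", 2928000),
]

# Column store: one list per field (all values of a field are adjacent).
column_store = {
    "symbol": ["AA", "AAP", "ABB"],
    "date":   ["2010-02-01", "2010-02-01", "2010-02-01"],
    "volume": [2019600, 580500, 2928000],
}

# An on-the-fly aggregation reads exactly one contiguous column:
total_volume = sum(column_store["volume"])

# The row-store equivalent must step through every full record:
total_volume_rows = sum(record[2] for record in row_store)

assert total_volume == total_volume_rows
```

This is why the column layout favors the fast on-the-fly aggregations and ad-hoc reporting mentioned above: an aggregate over one column never has to read the other columns at all.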
After loading data into CDHDR_ROW and CDHDR_COLUMN from a flat file, HANA shows the following results (I
didn't do anything extra):
Now you can see how much memory and disk size the row-stored CDHDR_ROW table occupies.
One thing to note: a row-store table is automatically loaded into memory when HANA starts up, and a row-store
table cannot be unloaded from memory.
By contrast, the columnar table CDHDR_COLUMN takes far less memory and disk size.
Later, I converted the row-store table CDHDR_ROW into a columnar table and recorded the output:
Now, both tables are in column store.
Now you can see that it occupies less memory and disk size than it did in its row-based state.
Demystifying the Column Store: Viewing Column Store Statistics in SAP HANA and Understanding the Benefits for
Business Insights
http://www.agilityworks.co.uk/our-blog/demystifying-the-column-store-%E2%80%93-viewing-
column-store-statistics-in-sap-hana-and-understanding-the-benefits-for-business-insights/
Although a number of articles have already been written about the column store in SAP HANA, it's still a
question that gets asked regularly, particularly around data reconstruction. The aim of this blog is to explain:
1. How we can access the statistics behind a table stored in column store format within SAP HANA, such as the
main memory size and compression rate.
2. The workings of a column store, including data reconstruction.
3. Some of the advantages of the column store for business insights and analytics.
In order to discuss these aims, let's work with an example of a row-based table from a typical relational database. We
will load the data into SAP HANA in order to store it in a column-store table within main memory.
A sample of the table we will work with is shown in Figure 1. In total we have 650,216 records of New York Stock
Exchange data: for each company that trades (stock symbol), it shows the prices for each day.
Figure 6 The original column for Stock Symbol and how it is now represented in column store format
The next question to answer is how we reconstruct a row when querying data in the column store. Let's suppose we
want to run a query showing us the sum of the Stock Volume for the Stock Symbol ABB. First we would read the
dictionary of the Stock Symbol column to get the value ID for ABB (value ID 2) and use the inverted index to access
the record IDs that use value ID 2 (records 1, 4 & 7). See Figure 7 for a visual reference of this.
Figure 7 Reading of the Stock Symbol Dictionary and Index for ABB records
Then, using the index of the Stock Volume column, we can map the records (1, 4 & 7) to the Stock Volume value IDs
(4, 5 & 6), and then map these to the actual Stock Volume values in the dictionary (see Figure 8). Finally we apply the
sum function to those values, giving us 8,804,800.
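The reconstruction walked through in Figures 7 and 8 can be sketched in plain Python, with dicts standing in for the engine's packed arrays. The record IDs and value IDs follow the article; the three individual volume values are illustrative, since the text only gives their sum (8,804,800):

```python
# Reconstructing a query result from column-store structures: the
# dictionary, the value-ID index, and the inverted index. Plain dicts
# stand in for the packed arrays a real column store would use.

# Stock Symbol column structures.
symbol_dictionary = {"AA": 0, "AAP": 1, "ABB": 2}   # value -> value ID
symbol_inverted_index = {2: [1, 4, 7]}              # value ID -> record IDs

# Stock Volume column structures. The three volumes are illustrative
# values chosen only so that their sum matches the article's 8,804,800.
volume_index = {1: 4, 4: 5, 7: 6}                   # record ID -> value ID
volume_dictionary = {4: 2928000, 5: 2817000, 6: 3059800}  # value ID -> value

def sum_volume_for_symbol(symbol):
    # Step 1: dictionary lookup - find the value ID for the symbol.
    value_id = symbol_dictionary[symbol]
    # Step 2: inverted index - find the records using that value ID.
    record_ids = symbol_inverted_index[value_id]
    # Step 3: map record IDs to Stock Volume value IDs via the index.
    volume_ids = [volume_index[r] for r in record_ids]
    # Step 4: decode the value IDs through the dictionary and aggregate.
    return sum(volume_dictionary[v] for v in volume_ids)

print(sum_volume_for_symbol("ABB"))  # 8804800
```

Note that the query never materializes full rows: it touches only the two columns involved, which is exactly why this access pattern suits analytical workloads.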