Abstract
Many organizations today have adopted business intelligence (BI) as a catalyst to meet specific
business needs and to improve organizational effectiveness. Although BI has become more
robust and pervasive, some organizations are still unable to maximize the return on their BI
investments. One contributing reason is the lack of a good guiding BI architecture to support
the implementation of such a system. Having a solid architecture can help organizations to
better control the implementation process as well as the operation of the entire BI
environment. A review of the existing literature shows that although the importance of a good
BI architecture is indisputable, research in this area is still lacking. To fill this gap, this paper
proposes a framework of BI architecture which consists of five layers: data source, ETL, data
warehouse, end user, and metadata layers. These five layers are essential to ensure high data
quality and smooth information flow in a BI system.
Copyright © 2011 In Lih Ong, Pei Hwa Siew and Siew Fan Wong. This is an open access article distributed
under the Creative Commons Attribution License unported 3.0, which permits unrestricted use,
distribution, and reproduction in any medium, provided that original work is properly cited. Contact
author In Lih Ong E-mail: ongil1@mail2.utar.edu.my
Communications of the IBIMA 2
whereas Baars and Kemper (2008) and Turban et al. (2008) include only the data warehouse. To address the operational data needs of an organization, it is essential to implement an operational data store (ODS) that provides current or near-current integrated information that users can access or update directly. In this way, decision makers can react faster to changing business environments and requirements. Furthermore, it is necessary to consider a data staging area in the ETL (Extract-Transform-Load) process. As most of the data from the data sources require cleansing and transformation, it is important to create temporary storage where data can reside before being loaded into the ODS or the data warehouse. Without this staging area, the process of working on the data from the data sources and the process of loading the data into the data warehouse can be very time-consuming and resource-intensive (Melchert et al., 2004).

The Proposed Framework of Business Intelligence Architecture

This paper proposes a framework of a five-layered BI architecture (see Figure 1), taking into consideration the value and quality of data as well as the information flow in the system. The five layers are the data source, ETL (Extract-Transform-Load), data warehouse, end user, and metadata layers. The rest of this section describes each of the layers.
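The role of the staging area discussed above can be illustrated with a minimal sketch. It is not taken from the paper: the record fields, the cleansing rules, and the function names (`extract`, `transform`, `load`) are assumptions chosen only to show raw data resting in temporary storage before cleansed records reach the warehouse.

```python
from dataclasses import dataclass


@dataclass
class SalesRecord:
    """Hypothetical cleansed record; the fields are illustrative only."""
    order_id: int
    customer: str
    amount: float


def extract(source_rows):
    """Copy raw rows from a data source into the staging area unchanged."""
    return list(source_rows)


def transform(staging_rows):
    """Cleanse staged rows; rows that cannot be parsed are set aside."""
    clean, rejected = [], []
    for row in staging_rows:
        try:
            record = SalesRecord(
                order_id=int(row["order_id"]),
                customer=row["customer"].strip().title(),
                amount=float(row["amount"]),
            )
        except (KeyError, ValueError, AttributeError):
            rejected.append(row)
            continue
        clean.append(record)
    return clean, rejected


def load(records, warehouse):
    """Append cleansed records to the target store and report the count."""
    warehouse.extend(records)
    return len(records)


# Hypothetical source data, including one row with an unusable amount.
source = [
    {"order_id": "1", "customer": "  alice ", "amount": "19.90"},
    {"order_id": "2", "customer": "BOB", "amount": "n/a"},
    {"order_id": "3", "customer": "carol", "amount": "5"},
]

staging_area = extract(source)            # temporary storage
clean_rows, rejected_rows = transform(staging_area)
warehouse = []
loaded = load(clean_rows, warehouse)
print(loaded, "rows loaded,", len(rejected_rows), "rejected")
```

Keeping the raw rows in a separate staging list means that cleansing can be rerun or audited without querying the source again, which is the benefit the text attributes to the staging area.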
marts to support them. A data mart is a subset of the data warehouse that is used to support the analytical needs of a particular business function or department (Bukhbinder et al., 2005). Like a data warehouse, it contains historical data that can help users to access and analyze different data trends (Ranjan, 2009). However, it can only keep data for 60 to 90 days. Therefore, the amount of data stored in a data mart is much smaller than the amount stored in a data warehouse. There can be many data marts inside an organization. Data warehouses and data marts are built on a multi-dimensional data model that consists of fact and dimension tables. A fact table contains quantitative data about business entities, such as sales amount, quantity, and price. A dimension table contains data (such as product, customer, date, and location) that describe the facts (Kimball et al., 2008).

Metadata Layer

Metadata refers to data about data. It describes where data are being used and stored, the source of the data, what changes have been made to the data, and how one piece of data relates to other information (Giovinazzo, 2003). A metadata repository is used to store technical and business information about data as well as business rules and data definitions (Davenport & Harris, 2007). Good management and use of metadata can reduce development time, simplify ongoing maintenance, and provide users with information about the data sources (Bryan, 2009). For instance, users do not have to redesign data structures (such as table names and data types) for data modelling, since the data structures needed have been stored as metadata. Users can simply query and retrieve these metadata from the repositories. Therefore, it is essential to ensure that the metadata in the repositories are maintained and updated regularly.

Many different types of metadata support a BI architecture, such as data source, ETL, reporting, OLAP, and data mining metadata. Data source metadata consist of information about access mode, structure of data sets (e.g., relational tables, views, stored procedures), and referential integrity constraints (Ma et al., 2011; Wang & Ye, 2010). As data are integrated into the data warehouse layer using ETL tools, an extraction log is maintained to record the changes made to data elements during the extraction process, in order to ensure the quality of data. This log is ETL metadata and is stored in the metadata repository. ETL metadata generally contain information about sources, targets, transformation rules, and mappings. The metadata repository is also used to document information about the data contained in the data warehouse layer. This includes descriptions of the data structure (schema, dimensions, and hierarchies) and definitions of conformed dimensions and conformed facts (Chaudhuri & Dayal, 1997; Sen & Sinha, 2005). These metadata guide the process of extracting, transforming, and loading data into the target repository (Shariat & Hightower, 2007). OLAP metadata describe the structure of cubes, dimensions, hierarchies, levels, and the types of drill paths being taken. Data mining metadata include descriptions of algorithms and queries (Nelson, 2008). Reporting metadata are XML-based and are used to store report templates and reporting descriptions such as report name, start date, and end date (Al-Noukari & Al-Hussan, 2008). These metadata also contain information about the structures of charts and queries.

End User Layer

The end user layer consists of tools that display information in different formats to different users. These tools can be grouped hierarchically in a pyramid shape (as shown in Figure 1). As one moves from the bottom to the top of the pyramid, the degree of comprehensiveness with which data are processed and presented increases. This accommodates the increasing complexity of decision-making as one moves up the organizational hierarchy. For instance, the highest level of the pyramid consists of analytical applications, which are usually used by top management, while the lowest level consists of query and reporting tools, which are used mostly at the operational management level.
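To make the fact and dimension tables described in this section concrete, the following is a minimal sketch of the kind of query that a reporting tool at the base of the pyramid might issue. It uses Python's built-in `sqlite3` module. The table names (`fact_sales`, `dim_product`, `dim_location`), the columns, and the sample rows are hypothetical and do not come from the paper.

```python
import sqlite3


def build_star_schema(conn):
    """Create one fact table and two dimension tables with sample rows."""
    conn.executescript(
        """
        CREATE TABLE dim_product (
            product_id INTEGER PRIMARY KEY,
            name TEXT,
            category TEXT
        );
        CREATE TABLE dim_location (
            location_id INTEGER PRIMARY KEY,
            city TEXT
        );
        -- The fact table holds the quantitative measures and points to
        -- the dimension rows that describe them.
        CREATE TABLE fact_sales (
            product_id INTEGER REFERENCES dim_product(product_id),
            location_id INTEGER REFERENCES dim_location(location_id),
            quantity INTEGER,
            sales_amount REAL
        );
        """
    )
    conn.executemany(
        "INSERT INTO dim_product VALUES (?, ?, ?)",
        [(1, "Laptop", "Hardware"), (2, "Mouse", "Hardware"),
         (3, "Antivirus", "Software")],
    )
    conn.executemany(
        "INSERT INTO dim_location VALUES (?, ?)",
        [(1, "Kuala Lumpur"), (2, "Penang")],
    )
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
        [(1, 1, 2, 6000.0), (2, 1, 10, 250.0),
         (3, 2, 5, 500.0), (1, 2, 1, 3000.0)],
    )


def sales_by_category(conn):
    """Summarize the facts along one attribute of the product dimension."""
    return conn.execute(
        """
        SELECT p.category, SUM(f.quantity), SUM(f.sales_amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        GROUP BY p.category
        ORDER BY p.category
        """
    ).fetchall()


conn = sqlite3.connect(":memory:")
build_star_schema(conn)
report = sales_by_category(conn)
for category, quantity, amount in report:
    print(category, quantity, amount)
```

Grouping by a different dimension attribute, for example `city` from `dim_location`, would give another view of the same facts without changing the fact table.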
One or more OLAP servers can manage data in the data warehouse layer for reporting, analysis, modelling, and planning to optimize the business (Ranjan, 2009). An OLAP server is a “data manipulation engine that is designed to support multi-dimensional data structures” (Reinschmidt & Francoise, 2000, p. 13). An OLAP server can provide multi-dimensional and summarized views of aggregated data. OLAP is a user-friendly graphical tool that allows users to quickly view and analyze business data from different perspectives. OLAP also allows users to compare different types of data easily and to perform complex computations.

To reduce query time, data in an OLAP server are organized in the form of data cubes instead of tables (rows and columns) as in the relational data model (Wang et al., 2005). Data cubes are dimensional models stored in multi-dimensional OLAP structures. They contain fact and dimension tables to store and manage multi-dimensional data so that users can analyze data easily and quickly (Prevedello et al., 2010). Four basic OLAP operations used in analyzing multi-dimensional data are (Chaudhuri & Dayal, 1997; Han & Kamber, 2006):

• Pivot: It enables users to rotate the axes of the data cube, that is, to swap the dimensions in order to get different views of the data.

Data Mining

Data mining can be achieved by integrating data warehouses and OLAP servers and performing further data analysis on OLAP cubes. Since the amount of data in an organization is growing rapidly, data mining is necessary to make decisions faster. Essentially, data mining is a process that automatically identifies useful information, such as unusual patterns, trends, and relationships, hidden within large amounts of data. This can be achieved by applying statistical techniques such as classification, time-series analysis, or clustering (Al-Noukari & Al-Hussan, 2008; Kerdprasop & Kerdprasop, 2007; Kimball et al., 2008). Data mining techniques have been used in many application areas, such as marketing, finance, medicine, and manufacturing, to predict future results and summarize details of data (Al-Noukari & Al-Hussan, 2008).
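As a small illustration of clustering, one of the data mining techniques named in this section, the sketch below applies a basic one-dimensional k-means procedure to a handful of purchase amounts. The paper does not prescribe a particular algorithm; k-means, the function name `kmeans_1d`, the sample values, and the starting centroids are all assumptions made for the example.

```python
def kmeans_1d(values, centroids, max_iter=100):
    """Group numbers around the nearest centroid until assignments settle."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        # Assignment step: put each value in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for value in values:
            nearest = min(
                range(len(centroids)),
                key=lambda i: abs(value - centroids[i]),
            )
            clusters[nearest].append(value)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            sum(cluster) / len(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # no centroid moved, so the grouping is stable
        centroids = new_centroids
    return centroids, clusters


# Hypothetical purchase amounts with two visibly different spending levels.
purchases = [12, 15, 14, 13, 210, 190, 205]
centers, groups = kmeans_1d(purchases, centroids=[0, 100])
print(groups)
```

With these inputs the procedure separates the small purchases from the large ones, which is the kind of hidden grouping the text describes data mining as uncovering automatically. In practice a library implementation with more careful initialization would normally be preferred.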
Gartner (2009b). “Business Intelligence Ranked Top Technology Priority by CIOs for Fourth Year in a Row,” [Online], [Retrieved November 27, 2010], http://www.gartner.com/it/page.jsp?id=888412.

Gartner (2011). “Gartner Forecasts Global Business Intelligence Market to Grow 9.7 Percent in 2011,” [Online], [Retrieved February 27, 2011], http://www.gartner.com/it/page.jsp?id=1553215.

Giovinazzo, W. A. (2003). Internet-enabled Business Intelligence, Prentice-Hall, Upper Saddle River, New Jersey.

Haag, S., Cummings, M. & Philips, A. (2007). Management Information Systems for the Information Age, McGraw-Hill, Boston.

Han, J. & Kamber, M. (2006). Data Mining: Concepts and Techniques, Morgan Kaufmann, San Francisco, California.

Hobbs, L. (2007). Oracle Database 10g Data Warehousing, Elsevier Digital Press, Burlington, Massachusetts.

Hoffer, J. A., Prescott, M. B. & McFadden, F. R. (2007). Modern Database Management, Pearson/Prentice Hall, Upper Saddle River, New Jersey.

IBM. (2009). “The New Voice of the CIO: Insights from the Global Chief Information Officer Study,” [Online], [Retrieved December 4, 2010], http://www-935.ibm.com/services/us/cio/ciostudy/pdf/midsize.pdf.

Imhoff, C., Galemmo, N. & Geiger, J. G. (2003). Mastering Data Warehouse Design: Relational and Dimensional Techniques, John Wiley & Sons, Indianapolis, Indiana.

Inmon, W. H. (2005). Building the Data Warehouse, Wiley, Indianapolis, Indiana.

Kimball, R. & Caserta, J. (2004). The Data Warehouse ETL Toolkit: Practical Techniques for Extracting, Cleaning, Conforming, and Delivering Data, Wiley, Indianapolis, Indiana.

Kimball, R., Ross, M., Thornthwaite, W., Mundy, J. & Becker, B. (2008). The Data Warehouse Lifecycle Toolkit, Wiley, New York.

Li, Z., Huang, Y. & Wan, S. (2007). “Model Analysis of Data Integration of Enterprises and E-commerce Based on ODS,” International Conference on Research and Practical Issues of Enterprise Information Systems (CONFENIS 2007), 1, 275-282.

Ma, Z., Wang, C. & Wang, Z. (2011). “Semantic Model Based on Three-layered Metadata for Oil-gas Data Integration,” Advances in Information Sciences and Service Sciences, 3(7), 216-224.

MarketResearch.com. (2010). “Business Intelligence Tools Market Forecasted to Grow $13 Billion by 2015,” [Online], [Retrieved December 3, 2010], http://www.marketwire.com/press-release/Business-Intelligence-Tools-Market-Forecasted-to-Grow-13-Billion-by-2015-1162837.htm.

Melchert, F., Winter, R. & Klesse, M. (2004). “Aligning Process Automation and Business Intelligence to Support Corporate Performance Management,” Proceedings of the Tenth Americas Conference on Information Systems, 4053-4063.

Negash, S. (2004). “Business Intelligence,” Communications of the Association for Information Systems, 13, 177-195.

Nelson, G. (2008). Metadata 100 Success Secrets 100 Most Asked Questions on Meta Data How-to Management, Repositories, Software, Standards, Tools and Databases, Emereo, London, UK.