Question: How many books could one store in one Terabyte of memory?
Question: How much does a 100 GB disk cost? What does a PC cost, and what is
its CPU performance?
[Figure: chart with a vertical axis from 0 to 12000 and a horizontal axis labeled Year, covering 1996 to 2003]
Question: What do the graphs in the last two slides tell us? What scales are used in
them? What was the pink line in the first graph?
Question: Why can OLTP not be used for sales forecasting and analysis?
Question: Compare the two sets of steps: the one given in the previous few slides and
the CRISP-DM approach. Which approach is better?
• Feature Selection
• Use sampling?
• Normalization
• Smoothing
• Dealing with duplicates, missing data
• Dealing with time-dependent data
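Several of the preprocessing steps listed above can be illustrated with a minimal Python sketch. It is only one possible reading of the bullets: the records, the function names, the choice of mean imputation for missing data, min-max scaling for normalization, and a trailing moving average for smoothing are all assumptions made for illustration, not methods prescribed by the slides. Feature selection, sampling, and time-dependent data are not covered.

```python
from statistics import mean


def drop_duplicates(rows):
    """Remove exact duplicate records, keeping the first occurrence of each."""
    seen = set()
    unique = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            unique.append(row)
    return unique


def fill_missing(values):
    """Replace None with the mean of the observed values (one simple choice)."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def moving_average(values, window=3):
    """Trailing moving average; early points average over the values available."""
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Hypothetical records: (customer id, monthly spend); None marks a missing value.
records = [(1, 10.0), (2, None), (1, 10.0), (3, 30.0), (4, 20.0)]

unique = drop_duplicates(records)             # the repeated (1, 10.0) is dropped
spend = fill_missing([s for _, s in unique])  # None becomes the mean of 10, 30, 20
normalized = min_max_normalize(spend)
smoothed = moving_average(spend, window=2)

print(unique)
print(spend)
print(normalized)
print(smoothed)
```

The order of the steps matters in this sketch: duplicates are removed before imputation so that a repeated record does not bias the mean used to fill missing values.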
• Dhar, V. and Stein, R., 1997, Seven methods for transforming corporate
data into business intelligence, Prentice Hall.
• Chen, M.S., Han, J. and Yu, P.S., 1996, Data Mining: An Overview from a
Database Perspective, IEEE Transactions on Knowledge and Data
Engineering, 8(6), pp 866-883.
• Berry, M. and Linoff, G., 1997, Data mining techniques for marketing,
sales and support, John Wiley & Sons.
• Berry, M. and Linoff, G., 1999, Mastering data mining, John Wiley &
Sons.