Data Mining
Master’s Thesis
by
Edward D’Costa
Thesis Supervisor
Prof. Ashay Dharwadker
Ansal Institute of Technology
15th April 2004
Abstract
This thesis is a survey of Data Mining techniques with an emphasis on
Time Series analysis. A brief history of the field is given, and its
usefulness in today’s world of large databases is discussed.
The architecture of a typical data mining system is studied with examples
and applications. Time series, Seasonal Variations, Exponentially Smoothed
Weighted Moving Averages (ESWMA), Partial Auto-Correlation Functions (PACF)
and the Box-Jenkins methodology are studied in detail, with examples. We
have implemented an ESWMA/PACF-based trend curve fitter application in
C++ and provide the source code of the software under the GNU public license
for noncommercial use on the accompanying CD.
Contents
1. What is Data Mining?
2. History of Data Mining
3. Some other terms for Data Mining
4. Functions of Data Mining
5. From ‘Data’ to ‘Knowledge Discovery’
6. Architecture of a Typical Data Mining System
7. Real-life Applications of Data Mining
7.1 An Example of a Data Mining Business Software at Eddie Bauer
7.2 Other Examples
8. Data Mining – On What Kind of Data?
8.1 Theoretical Concepts Involved
9. Time Series
9.1 What is Trend Measurement?
9.2 Method of Moving Averages (MA) to Determine Trend
9.3 Criteria for the Selection of Period for the Moving Average
10. Components of Time Series
11. Seasonal Variations
11.1 Measurement of Seasonal Variations
11.2 Uses of Seasonal Index
11.3 Ratio-to-Moving Average method of measuring Seasonal Variations
12. Exponentially Smoothed Weighted Moving Average (ESWMA)
12.1 Implementation into Software Application
12.2 Proposed Improvements in Current Version of ESWMA-based Trend Curve Fitter Application
13. References