Data Mining: Data Mining, And Knowledge Discovery Process

CHAPTER 1 INTRODUCTION

1.1 Data Mining (Knowledge Discovery Process)
Data Mining (DM), or Knowledge Discovery, is the extraction of implicit, previously unknown, and useful information, such as hidden trends, from data. DM research has adopted many techniques from research areas such as artificial intelligence, statistics, and machine learning.
Stages in Data Mining:
1. Selection of data: selecting the data to be analysed.
2. Preprocessing: the data are cleaned and brought into a consistent, common format. Unnecessary information that may slow down queries is removed and discarded, and missing values are treated. Because the data are collected from several sources, their formats may be inconsistent, so they are reconfigured to a single proper format; e.g. information about sex may be recorded as f or m in one source and as 1 or 0 in another.
3. Transformation of data: the data are transformed into forms appropriate for analysis, e.g. by normalization. Operations such as summarization or aggregation can also be performed.
4. Data mining: this stage is concerned with the extraction of patterns from the data.
5. Interpretation of patterns for decision …
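
The preprocessing and transformation stages above can be illustrated with a minimal Python sketch. The records, the field names, and the helper functions `preprocess` and `min_max_normalize` are hypothetical. The mapping of 1 to f and 0 to m is also only an assumption, since the text does not say which code stands for which value; min-max scaling is used here as one common form of normalization.

```python
# Hypothetical records merged from two sources that encode sex differently.
records = [
    {"sex": "f", "age": 30},
    {"sex": "m", "age": 50},
    {"sex": 1, "age": 40},
    {"sex": 0, "age": 20},
]

# Assumed mapping: the text does not say whether 1 means f or m.
SEX_CODES = {"f": "f", "m": "m", 1: "f", 0: "m"}


def preprocess(rows):
    """Preprocessing: recode the sex field into one common format."""
    return [{**row, "sex": SEX_CODES[row["sex"]]} for row in rows]


def min_max_normalize(values):
    """Transformation: rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are equal, so there is no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


cleaned = preprocess(records)
ages = min_max_normalize([row["age"] for row in cleaned])
```

After this step every record uses the same coding for sex, and the ages lie between 0 and 1, so attributes measured on different scales can be compared in the data mining stage.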

We can find and measure the strength of a relationship between two variables from the data, e.g. using covariance (or its normalized form, correlation). Regression analysis discovers relationships between dependent and independent variables; this type of analysis is used for prediction. Discriminant analysis is a classifier-based approach that categorizes the data based on the combination of features/attributes that maximally separates the classes. Clustering is an unsupervised classification of observations (data points, data items, or feature/attribute vectors) into groups or clusters. We can group the data points into clusters or classes so that objects within a cluster are more similar to one another and very dissimilar to objects in different
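
The covariance and regression ideas mentioned above can be sketched in plain Python. This is only an illustration under stated assumptions: the data are made up, the helper names `covariance` and `fit_line` are hypothetical, and the regression is the simplest case of one independent variable fitted by least squares.

```python
def mean(xs):
    return sum(xs) / len(xs)


def covariance(xs, ys):
    """Sample covariance of two equally long sequences (divides by n - 1)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    slope = covariance(xs, ys) / covariance(xs, xs)
    intercept = mean(ys) - slope * mean(xs)
    return slope, intercept


# Made-up data in which y depends on x exactly as y = 2x + 1.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]

cov_xy = covariance(x, y)            # positive: x and y increase together
slope, intercept = fit_line(x, y)    # relationship between the variables
predicted = slope * 5.0 + intercept  # prediction for an unseen x value
```

A positive covariance indicates that the two variables tend to increase together, and the fitted line can then be used to predict the dependent variable for new values of the independent one.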
