Knowledge Discovery in Databases


Abstract

Knowledge Discovery in Databases (KDD) is the process of searching for hidden knowledge in the massive amounts of data that we are technically capable of generating and storing. Data, in its raw form, is simply a collection of elements from which little knowledge can be gleaned. Data discovery techniques significantly increase the value of that data. A variety of methods are available to assist in extracting patterns that, when interpreted, provide valuable and possibly previously unknown insight into the stored data. This information can be predictive or descriptive in nature. Data mining, the pattern extraction phase of KDD, can take many forms; the choice depends on the desired results. KDD is a multi-step process that facilitates the conversion of data to useful information. Our increased ability to gain information from stored data raises the ethical dilemma of how the information should be treated and safeguarded.

Introduction

The desire and need for information has led to the development of systems and equipment that can generate and collect massive amounts of data. Many fields, especially those involved in decision making, are participants in the information acquisition game. Examples include finance, banking, retail sales, manufacturing, monitoring and diagnosis, health care, marketing, and science data acquisition. Advances in storage capacity and in digital data-gathering equipment, such as scanners, have made it possible to generate massive datasets, sometimes called data warehouses, that are measured in terabytes. For example, NASA's Earth Observing System is expected to return data at rates of several gigabytes per hour by the end of the century. Mod...

... middle of paper ...

... of data warehouses increase. New methods of analysis and pattern extraction are being developed and adapted to KDD. Which method is used depends on the domain and the results expected.
The accuracy of the recorded data must not be overlooked during the KDD process. Domain-specific knowledge assists with the subjective analysis of KDD results. Much attention has been given to the data mining phase of KDD, but earlier steps, such as data cleaning, play a significant role in the validity of the results. The potential benefits of discovery-driven data mining techniques in extracting valuable information from large, complex databases are substantial. Successful applications are surfacing in industries and areas where data retrieval is outpacing our ability to analyze its content effectively. Users must be aware of the potential moral conflicts in using sensitive information.
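
The point about data cleaning can be made concrete with a small sketch. The records, the field layout, and the 1000.0 plausibility threshold below are all hypothetical and chosen only for illustration; they do not come from any dataset discussed in this essay. The sketch shows how a simple descriptive result (an average) changes once duplicates, missing values, and implausible entries are removed before analysis.

```python
from statistics import mean

# Hypothetical records: (customer_id, purchase_amount). Illustration only.
raw_records = [
    (1, 20.0),
    (2, 25.0),
    (2, 25.0),    # exact duplicate of the previous record
    (3, None),    # missing amount
    (4, 9999.0),  # implausible value, assumed here to be an entry error
    (5, 30.0),
]


def clean(records, max_amount=1000.0):
    """Drop missing, implausible, and duplicate records.

    max_amount is an assumed domain rule, not a general-purpose default.
    """
    seen = set()
    cleaned = []
    for customer_id, amount in records:
        if amount is None:
            continue  # missing value
        if amount > max_amount:
            continue  # fails the assumed plausibility rule
        if (customer_id, amount) in seen:
            continue  # duplicate record
        seen.add((customer_id, amount))
        cleaned.append((customer_id, amount))
    return cleaned


def average_purchase(records):
    """Average of all recorded (non-missing) purchase amounts."""
    return mean(amount for _, amount in records if amount is not None)


print("average before cleaning:", average_purchase(raw_records))
print("average after cleaning: ", average_purchase(clean(raw_records)))
```

In this toy case the single implausible entry dominates the uncleaned average, while the cleaned average reflects the typical purchase. Deciding which records count as errors is itself a judgment that depends on domain-specific knowledge.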
