Data quality has been described as “an inexact science in terms of assessments and benchmarks” [93]. Similarly, high-quality data can be described as “data that is fit for use by data consumers” [94].
11.2. Origin of Bad Data
Erroneous data can originate from several sources. Data may become dirty if it is entered incorrectly, if it is received from an invalid external data source, or if good data is combined with outdated data and there is no way to distinguish between the two.
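The three origins named above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not part of the cited work: the `Record` fields, the `TRUSTED_SOURCES` set, the plausible age range, and the one-year staleness threshold are all hypothetical. The point is only that keeping a source and an "as of" date with each record is one way to tell current data from outdated data after the two are combined.

```python
from dataclasses import dataclass
from datetime import date
from typing import List

# Hypothetical list of sources considered valid (an assumption).
TRUSTED_SOURCES = {"crm", "billing"}


@dataclass
class Record:
    customer_id: str
    age: int
    source: str   # where the record came from
    as_of: date   # when the values were last known to be correct


def dirty_reasons(rec: Record, today: date, max_age_days: int = 365) -> List[str]:
    """Return the reasons (possibly none) why a record looks dirty."""
    reasons = []
    # 1. Mistaken entry: the value falls outside a plausible range.
    if not 0 <= rec.age <= 120:
        reasons.append("entry_error")
    # 2. Invalid external source: the record did not come from a trusted feed.
    if rec.source not in TRUSTED_SOURCES:
        reasons.append("invalid_source")
    # 3. Outdated data: the as_of date lets stale rows be told apart from
    #    current ones even after both have been merged into one data set.
    if (today - rec.as_of).days > max_age_days:
        reasons.append("outdated")
    return reasons
```

For example, a record with an age of 340, loaded from an unlisted source and last confirmed four years ago, would be flagged for all three reasons, while a recent, plausible record from a trusted source would return an empty list.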
11.3. Categories and Dimensions of Data Quality
In the past, data was treated as an organization's most valuable asset and was rarely shared. Now businesses, governments, and research organizations rely on the exchange and sharing of various forms of data. As interconnectivity among data producers and data consumers increases, interest in data quality grows steadily. Managing data quality is typically a complex job, and all aspects of data quality should be observed throughout the entire data management process. The following table lists the categories and dimensions of data quality [94]:
Table 11.1 Categories and Dimensions of Data Quality [94]
Categories          Dimensions
Intrinsic           Accuracy; Objectivity; Believability; Reputation
Contextual          Completeness; Timeliness; Relevancy; Value-added; Amount of data
Representational    Interpretability; Ease of understanding; Concise/consistent representation
Accessibility       Accessibility; Access security
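As a rough illustration of how the table might be used in practice, the sketch below stores the categories and dimensions as a lookup structure and computes one dimension, completeness, for a small set of rows. The function names and the completeness formula (the share of required fields that are filled in) are assumptions made for this example; the cited source [94] does not prescribe them.

```python
from typing import Dict, List, Optional

# Table 11.1 as a mapping from category to its dimensions.
DIMENSIONS: Dict[str, List[str]] = {
    "Intrinsic": ["Accuracy", "Objectivity", "Believability", "Reputation"],
    "Contextual": ["Completeness", "Timeliness", "Relevancy",
                   "Value-added", "Amount of data"],
    "Representational": ["Interpretability", "Ease of understanding",
                         "Concise/consistent representation"],
    "Accessibility": ["Accessibility", "Access security"],
}


def category_of(dimension: str) -> Optional[str]:
    """Return the category a dimension belongs to, or None if it is unknown."""
    wanted = dimension.strip().lower()
    for category, dims in DIMENSIONS.items():
        if wanted in (d.lower() for d in dims):
            return category
    return None


def completeness(rows: List[dict], fields: List[str]) -> float:
    """Illustrative metric (an assumption): fraction of required fields
    that are present and non-empty across all rows."""
    total = len(rows) * len(fields)
    if total == 0:
        return 1.0  # nothing was required, so nothing is missing
    filled = sum(
        1 for row in rows for f in fields if row.get(f) not in (None, "")
    )
    return filled / total
```

A quality report could then tag each finding with its dimension and use `category_of` to group findings by category.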
11.4. Classification of Data Quality Problems in Data Sources
Data quality problems are classified into two main categories: single-source problems and multi-source problems [95]. A brief view of the classification and sub-classification is shown in the figure below that shows som...
... middle of paper ...
...ng Hai Do, “Data Cleaning: Problems and Current Approaches,” Bulletin of the IEEE Computer Society Technical Committee on Data Engineering, 2000.
[96] M. Angélica Caro, Coral Calero, Ismael Caballero, Mario Piattini, “Data Quality in Web Applications: A State of the Art,” IADIS International Conference on WWW/Internet, 2005.
[97] Panos Vassiliadis, “Data Warehouse Modeling and Quality Issues,” National Technical University of Athens, Zographou, Athens, Greece, 2000.
[98] Larry P. English, “Information Stewardship: Accountability for Information Quality,” Information Impact International, Inc., 2006.
[99] Peter Block, “Stewardship: Choosing Service over Self-Interest,” San Francisco: Berrett-Koehler, 1993.
[100] M. Pamela Neely, “Data Quality Tools for Data Warehousing – A Small Sample Survey,” Center for Technology in Government, University at Albany / SUNY, 1998.