CHAPTER ONE
INTRODUCTION
1.0 Introduction
Data mining is the process of sorting through large data sets to identify patterns and establish relationships that help solve problems through data analysis. Data mining tools allow enterprises to predict future trends. It is an interdisciplinary subfield of computer science and statistics whose overall goal is to extract information (with intelligent methods) from a data set and transform that information into a comprehensible structure for further use (Clifton, 2010; Hastie et al., 2009). Data mining is the analysis step of the “knowledge discovery in databases” (KDD) process (Fayyad et al., 1996). Aside from the raw analysis step, it also involves database and data management aspects, data pre-processing, model and inference considerations, interestingness metrics, complexity considerations, post-processing of discovered structures, visualization, and online updating. Data analysis and data mining differ in purpose: data analysis summarizes what has already happened, such as the effectiveness of a marketing campaign, whereas data mining uses specific machine learning and statistical models to predict future outcomes and discover patterns in the data (Olson, 2007).
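The contrast between summarizing history and predicting the future can be illustrated with a minimal sketch. The campaign figures, the function names (summarize, fit_line, predict), and the choice of an ordinary least-squares line as the statistical model are all assumptions made for illustration; they are not taken from the cited works, and a real data mining task would use far more data and a model chosen for the problem.

```python
# Hypothetical campaign history: (ad spend in thousands, units sold).
# These numbers are invented for illustration only.
history = [(1.0, 12.0), (2.0, 19.0), (3.0, 29.0), (4.0, 37.0), (5.0, 45.0)]


def summarize(records):
    """Data analysis: describe what has already happened."""
    total_spend = sum(spend for spend, _ in records)
    total_sales = sum(sales for _, sales in records)
    return {
        "total_sales": total_sales,
        "sales_per_unit_spend": total_sales / total_spend,
    }


def fit_line(records):
    """Fit sales = slope * spend + intercept by ordinary least squares."""
    n = len(records)
    mean_x = sum(x for x, _ in records) / n
    mean_y = sum(y for _, y in records) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in records)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in records)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, spend):
    """Use the fitted model to estimate sales for an unseen spend level."""
    slope, intercept = model
    return slope * spend + intercept


summary = summarize(history)      # looks backward over recorded data
model = fit_line(history)         # learns a pattern from the data
forecast = predict(model, 6.0)    # looks forward to a spend level not yet seen

print(summary)
print(model, forecast)
```

Here summarize only reports on the recorded campaign, while fit_line and predict use a pattern learned from the same records to estimate an outcome that has not yet been observed.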
The term “data mining” is in fact a misnomer, because the goal is the extraction of patterns and knowledge from large amounts of data, not the extraction (mining) of the data itself (Han and Kamber, 2001). It is also a buzzword, frequently applied to any form of large-scale data or information processing (collection, extraction, warehousing, analysis, and statistics) as well as to any application of computer decision support systems, including artificial intelligence (e.g., machine learning) and business intelligence. The book Data Mining: Practical Machine Learning Tools and Techniques with Java (which covers mostly machine learning material) was originally to be named just Practical Machine Learning, and the term “data mining” was added only for marketing reasons (Bouckaert et al., 2010). Often the more general terms (large-scale) data analysis and analytics, or, when referring to actual methods, artificial intelligence and machine learning, are more appropriate.
DISTRIBUTED DATA MINING IN CREDIT CARD FRAUD DETECTION