Chapter 4 – Data Mining

Authors: Kai Puolamäki, Alessio Bertone, Roberto Theron, Otto Huisman, Jimmy Johansson, Silvia Miksch, Panagiotis Papapetrou, Salvo Rinzivillo

This chapter considers data mining, which is fundamental to the automated analysis components of visual analytics. Since today’s datasets are often extremely large and complex, combining human and automatic analysis is key to solving many information-gathering tasks. Case studies illustrate the use of knowledge discovery and data mining (KDD) in bioinformatics and climate change. The authors then ask whether industry is ready for visual analytics, citing examples from the pharmaceutical, software and marketing industries. The state-of-the-art section gives a comprehensive review of data mining and analysis tools, including statistical and mathematical tools, visual data mining tools, Web tools and packages. Some current data mining and visual analytics approaches are then described, with examples from the bioinformatics and graph visualisation fields. The chapter describes technical challenges specific to data mining, such as achieving data cleaning, integration and data fusion in real time, and providing the infrastructure needed to support data mining. It also discusses the challenge of integrating the human into the data process in order to move towards a visual analytics approach, together with issues regarding the evaluation of such an approach. Several opportunities are then identified, such as the need for generic tools and methods, the visualisation of models, and collaboration between the KDD and visualisation communities.


Download

Chapter 4 (2.0 MB) [Note: the images are low resolution to reduce the file size]