Data mining- Introduction

Data Mining


Introduction to Data Analysis

Assessment: coursework = 50%
One piece on data mining
Survey on an emerging topic within this area

See book: H. Du, Data Mining Techniques and Applications

Software used: WEKA

Data- isolated factual recordings of separate objects and events
Information- data made meaningful
Knowledge- using information for business

Data- supermarket sales
Information- coke is often purchased along with crisps
Knowledge- put them near each other in the store
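
A minimal sketch of how the coke and crisps information could be pulled out of raw sales data. The baskets are made up for illustration, and the helper names (pair_counts, support, confidence) are hypothetical, not taken from the module or from WEKA. It counts how often pairs of items appear in the same basket, then measures how often crisps are bought given that coke is bought.

```python
from collections import Counter
from itertools import combinations

# Hypothetical baskets, invented for illustration only (not real sales data).
baskets = [
    {"coke", "crisps", "bread"},
    {"coke", "crisps"},
    {"bread", "milk"},
    {"coke", "crisps", "milk"},
    {"coke", "bread"},
]


def pair_counts(baskets):
    """Count how many baskets contain each unordered pair of items."""
    counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts


def support(baskets, items):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(1 for b in baskets if items <= b) / len(baskets)


def confidence(baskets, antecedent, consequent):
    """Of the baskets containing `antecedent`, the fraction that also contain `consequent`."""
    a = set(antecedent)
    both = a | set(consequent)
    n_a = sum(1 for b in baskets if a <= b)
    n_both = sum(1 for b in baskets if both <= b)
    return n_both / n_a


counts = pair_counts(baskets)
best_pair, best_count = counts.most_common(1)[0]
print(best_pair, best_count)
print(support(baskets, {"coke", "crisps"}))
print(confidence(baskets, {"coke"}, {"crisps"}))
```

The raw baskets are the data, the most frequent pair is the information, and the decision to shelve the two items together is the knowledge.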


What is it?

Useful info
Non-trivial, implicit info- neither raw data nor the result of a simple data summary


Useful
Bank looking at credit card applicant information: salary or house?
Would want the house, so if they can't pay the bank can take it

Non-trivial information
Online analytical processing (OLAP)- interactive reporting

Data mining- discovery of hidden, embedded patterns

Real life databases
May be very big
Lots of different data types
Quality can be poor
Available on secondary storage media

Efficient algorithms
Use little memory and have a quick execution time
May not be 100% accurate

Objectives-
Classification
Estimation
Prediction
Description




WEKA
Investigative interactive data mining
Used on small data sets
WEKA explorer- main one we will use
WEKA knowledge flow- not covered
WEKA is available online (free)
