THIS PAGE IS NO LONGER UPDATED
E-Commerce Technology 2 – Database Systems, Data Warehousing, OLAP and Data Mining
Data mining: Study guide for 2003/04
This module is part of the
Postgraduate Diploma in E-Commerce, Spring Term 2004. It covers the fundamental
principles of modern database systems, data warehouses, on-line analytical
processing and data mining, and their impact on e-commerce. For details on the
databases part of the module, please visit Prof. Poulovassilis's web pages
(http://www.dcs.bbk.ac.uk/~ap/teaching/pdecDW2004/). This study guide covers
the data mining part of the module. Students should read this study guide
carefully and follow all the links to the accompanying material. The
information in this study guide is maintained by Dr George D. Magoulas, whom
you can contact by email at gmagoulas@dcs.bbk.ac.uk if you have any queries or
want to set up an appointment.
This document will be updated while
the course is in progress. Please be sure that you check it to find out
up-to-date information about the module.
Web-based business is generating a
vast amount of data on consumer transactions, browsing behaviours, usage times,
and preferences. Data mining is in general the task of extracting implicit,
previously unidentified and potentially useful information from data. In
certain cases, the data sets are characterised by incompleteness (missing
parameter values), incorrectness (systematic or random noise in the data),
sparseness (few and/or non-representative records available), and inexactness
(inappropriate selection of parameters for the given task). The aim of this
module is to present and discuss issues associated with data mining of
web-based applications. It will cover the basic concepts and techniques for
data mining and intelligent data analysis, including methods for knowledge
engineering, artificial neural networks and clustering. No particular
programming language knowledge is assumed and mathematical prerequisites are
kept to a minimum.
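As a small illustration of the incompleteness and sparseness mentioned above, the following Python sketch counts missing values in a handful of web-transaction records and shows how few records remain once incomplete ones are dropped. The records, field names and helper functions are invented for this example; they are not part of the module material or of the Clementine tool.

```python
# Toy web-transaction records. All field names and values are hypothetical,
# chosen only to illustrate missing parameter values (None).
records = [
    {"customer": "c1", "pages_viewed": 12, "minutes_on_site": 8.5, "basket_value": 40.0},
    {"customer": "c2", "pages_viewed": 3, "minutes_on_site": None, "basket_value": 0.0},
    {"customer": "c3", "pages_viewed": None, "minutes_on_site": 2.0, "basket_value": None},
    {"customer": "c4", "pages_viewed": 7, "minutes_on_site": 5.0, "basket_value": 15.5},
]


def missing_fraction(rows):
    """Return, for each field, the fraction of rows where the value is missing."""
    fields = rows[0].keys()
    return {
        field: sum(1 for row in rows if row.get(field) is None) / len(rows)
        for field in fields
    }


def complete_rows(rows):
    """Keep only the rows in which every field has a value."""
    return [row for row in rows if all(value is not None for value in row.values())]


fractions = missing_fraction(records)
usable = complete_rows(records)

for field, fraction in fractions.items():
    print(f"{field}: {fraction:.0%} missing")
print(f"{len(usable)} of {len(records)} records are complete")
```

Discarding incomplete records is only one possible way of handling missing values; here it halves an already small data set, which shows how incompleteness can lead to sparseness.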
· To introduce the basic concepts in data mining and intelligent data analysis.
· To demonstrate the process of knowledge discovery using practical examples.
· To provide hands-on experience with the Clementine data mining tool.
By the end of the module students must demonstrate the ability to:
· Discuss basic concepts of data mining and intelligent data analysis.
· Explain the process of knowledge discovery.
· Demonstrate the process of knowledge discovery in practical examples.
The module will be delivered through a
series of lectures and lab sessions. Lectures are on Wednesdays 18:00-21:00 in
room 153 in the Main Building. The Lab sessions are all in room 128. Room 128
is booked for this course for all Wednesdays, 18:00-21:00, until March 26th. It
is strongly recommended that you attend these labs.
The lecture programme is as follows:
Week 23 February – 27 February 2004 | Lecture 10: Data mining services |
Week 1 March – 5 March 2004 | Lecture 11: Tools for knowledge representation |
Week 8 March – 12 March 2004 | Lecture 13: Clustering for data mining | Lecture 14: Clustering algorithms |
Week 15 March – 19 March 2004 | Lab session 5: Clementine | Lab session 6: Clementine |
Week 22 March – 26 March 2004 | |
REVISION LECTURE: Wednesday 28th of April – Room 121
The exam paper of the module will have
five questions in total; students will choose three. Two out of the five
questions are on data mining.
How the examination relates to the learning outcomes
The examination relates to the basic
learning outcomes stated earlier in this document. Exam questions will cover
all aspects of the data mining part of the module, assessing the accomplishment
of ALL learning outcomes.
The following reading list is a
recommended source of course material.
“Learning from data”, Vladimir Cherkassky, Wiley, 1998, ISBN: 0-471-15493-8.
“Artificial Intelligence: a Guide to Intelligent Systems”, Michael Negnevitsky, Addison Wesley, 2002, ISBN: 0-201-71159.
“Data mining techniques: for marketing, sales, and customer support”, Michael J.A. Berry, Gordon Linoff, Wiley, 1997, ISBN: 0471179809.
“Data mining: concepts, models, methods, and algorithms”, Mehmed Kantardzic, Wiley-Interscience: IEEE Press, 2003, ISBN: 0471228524.
“A guide to neural computing applications”, Lionel Tarassenko, Arnold Publishers, 1998, ISBN: 0-340-705892.
“Neural Networks”, Picton P., 2nd edition, Palgrave, 2000, ISBN: 0-333-80287-X.
“Fundamentals of Neural Networks: Architectures, Algorithms and Applications”, Fausett L., Prentice Hall, 1994, ISBN: 0-13-042250-9.
“Pattern Recognition Using Neural Networks”, Looney C., Oxford University Press, 1997, ISBN: 0-19-507920-5.
“Mathematical classification and clustering”, Boris Mirkin, Kluwer Academic, 1996, ISBN: 0792341597.
“Data mining: practical machine learning tools and techniques with Java implementations”, Ian H. Witten, Eibe Frank, Morgan Kaufmann, 2000, ISBN: 1558605525.
Electronic Resources
On the following sites you can find
relevant material:
· http://portal.acm.org/portal.cfm
  The ACM digital library.
· http://www-us.ebsco.com/online/Reader.asp
  Links to several journals. You might need your ATHENS password to access it.
· http://www.elsevier.nl/locate/issn/08936080
  Elsevier Publisher. Access to several Neural Network journals. You might need
  your ATHENS password to access it.
· EEVL is an award-winning free service that provides quick and reliable access
  to the best engineering, mathematics, and computing information available on
  the Internet.
· http://www.bids.ac.uk/
  Links to several journals. You might need your ATHENS password to access it.
If you don't have an ATHENS
password please visit the Library's web page and follow instructions to get an
ATHENS password.
Journals
· ACM Transactions on Internet Technology (electronic access)
· Data & Knowledge Engineering (electronic access)
· Data Mining and Knowledge Discovery (electronic access)
· IEEE Transactions on Neural Networks
· Neurocomputing (electronic access)
· IEEE Intelligent Systems (electronic access)
· Expert Systems with Applications (electronic access)
· The Knowledge Engineering Review (electronic access)
· IEEE Transactions on Knowledge and Data Engineering (electronic access)
· Applied Intelligence (electronic access)
· IEEE Internet Computing (electronic access)
· Intelligent Data Analysis (electronic access)
Supplementary reading and study material
The reading list, above, is the
recommended source of course material. You are advised to acquire at least one
of the books, but should initially satisfy yourself as to the suitability of
each textbook. Use this study guide to assess the coverage in each book. Some
of the books will not cover the course entirely and may contain material not
covered in the course.
It is advisable to look in the library
or on the Web for further reading around the topic of the module; you will find
a lot of literature dealing with data mining, neural computing, and clustering.
Books included in the reading list, mentioned above, as well as related books
can be found in the Library. Feel free to buy a book of your own choice if it
is not included in the reading list, and use the library frequently. You will
find it contains lots of other material that will interest you.
Please refer to the PDEC course booklet; this is standard across the whole
college.