GCO5828 Applications of data mining - Semester 1, 2007 unit guide

Semester 1, 2007

Chief Examiner

Kai Ming Ting

Lecturers

Gippsland : Kai Ming Ting

Outline

In the modern corporate world, data is viewed not only as a necessity for day-to-day operation but also as a critical asset for decision making. However, raw data is of low value: succinct generalizations are required before data gains high value. Data mining produces knowledge from data, making sophisticated data-driven decision making feasible. This unit will provide students with an understanding of the major components of the data mining process, the various methods and operations for data mining, knowledge of the applications and technical aspects of data mining, and an understanding of the major research issues in this area.

Objectives

At the completion of this unit, students will have:

    Knowledge of:

  • characteristics of major components of the data mining process
  • current data mining methods, operations and major application areas

    Skills in:

  • using data mining tools to solve data mining problems

    Understanding of:

  • current major research issues

Prerequisites

Students who meet the entry requirements of Master of Business Systems, Master of Information Technology and Master of Multimedia Computing can take this subject.

Unit relationships

GCO5828 is an elective unit in Master of Business Systems, Master of Information Technology and Master of Multimedia Computing.

Texts and software

Required text(s)

Witten, I.H. & Frank, E. Data Mining: Practical Machine Learning Tools and Techniques, Morgan Kaufmann Publishers, second edition, 2005. ISBN: 0-12-088407-0

Textbook availability

Textbooks are available from the Monash University Book Shops. Availability from other suppliers cannot be assured. The Bookshop orders texts specifically for this unit. You are advised to purchase your textbook early.

Software requirements

1. WEKA, version 3.4.3

2. Magnum OPUS, version 3

Both are freeware. They are available on the GSIT CD-ROM and can also be downloaded from the websites listed on the relevant unit home page.

Hardware requirements

Students studying off-campus are required to have the minimum system configuration specified by the faculty as a condition of accepting admission, and regular Internet access. On-campus students, and those studying at supported study locations, may use the facilities available in the computing labs. Information about computer use for students is available from the ITS Student Resource Guide in the Monash University Handbook.

Recommended reading

1. Kennedy, R.L., Lee, Y., Roy, B.V., Reed, C.D. & Lippman, R.P., Solving Data Mining Problems through Pattern Recognition, Prentice Hall, 1998.

2. Cabena, P., Hadjinian, P., Stadler, R., Verhees, J. & Zanasi, A., Discovering Data Mining: from concept to implementation, Prentice Hall, 1997.

3. Berry, M.J.A. & Linoff, G. Data Mining Techniques for Marketing, Sales, and Customer Support, John Wiley & Sons, 1997.

4. Tan, P-N, Steinbach, M. & Kumar, V. Introduction to Data Mining, Addison Wesley, 2006.

5. Dunham, M.H., Data Mining: Introductory and Advanced Topics, Pearson Education, 2003.

6. Groth, R., Data Mining: Building competitive advantage, Prentice Hall, 2000.

7. Berson, A., Smith, S. & Thearling, K., Building Data Mining Applications for CRM, McGraw Hill, 2000.

8. Berry, M.J.A. & Linoff, G. Mastering Data Mining: The Art and Science of Customer Relationship Management, John Wiley & Sons, 2000.

9. Mena, J. Data Mining Your Website. Digital Press, 1999.

10. Westphal, C. & Blaxton, T. Data Mining Solutions, John Wiley & Sons, 1998.

11. Quinlan, J.R. C4.5: Programs for Machine Learning, Morgan Kaufmann, 1993.

12. Fayyad, U.M., Piatetsky-Shapiro, G., Smyth, P. & Uthurusamy, R. Advances in Knowledge Discovery and Data Mining, AAAI Press/MIT Press, 1996.

13. Weiss, S.M. & Indurkhya, N. Predictive Data Mining, Morgan Kaufmann, 1997.

Library access

You may need to access the Monash library, either in person or remotely, to complete the subject satisfactorily. Be sure to obtain a copy of the Library Guide and, if necessary, the instructions for remote access from the library website.

Study resources

Study resources for GCO5828 are:

A printed Unit Book containing the unit information and 12 Study Guides.

A CD-ROM sent at the start of the year, with software required for all units.


Unit website

http://muso.monash.edu.au/

Structure and organisation

Week Topics
1 The Need for Data Mining
2 Model Building
3 Model Representation
4 Data Mining Process
5 Performance Evaluation
6 Engineering the input and output
Non-teaching week
7 Algorithms
8 Market basket analysis
9 Implementation Issues
10 Case Studies
11 Data Mining Applications
12 Research Issues
13 Study Week

Timetable

The timetable for on-campus classes for this unit can be viewed in Allocate+

Assessment

Assessment weighting

Nominally, the assignments will have a weighting of 40% and the exam a weighting of 60%.

However, your final mark cannot be more than 10 marks higher than either your assignment percentage or your exam percentage, as shown in the assessment formula below.

Assessment Policy

To pass this unit, your final mark must be 50% or above, calculated according to the formula below.

Your score for the unit will be calculated by:

Final grade = min (A+10, E+10, E*R+A*(1-R))

Where A = overall assignment percentage

E = examination percentage

R = exam weighting (0.6)
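
As an illustration only, the formula above can be sketched in Python. The function name final_mark and the sample percentages are assumptions made for this example and are not part of the unit's official materials.

```python
def final_mark(a: float, e: float, r: float = 0.6) -> float:
    """Apply the unit guide formula: min(A+10, E+10, E*R + A*(1-R)).

    a: overall assignment percentage (A)
    e: examination percentage (E)
    r: exam weighting (R), 0.6 in this unit
    """
    weighted = e * r + a * (1 - r)
    # The final mark can exceed neither component by more than 10 marks.
    return min(a + 10, e + 10, weighted)


if __name__ == "__main__":
    # Hypothetical student: strong assignments (90%), weak exam (40%).
    # The weighted mark is about 60, but the E + 10 cap limits it to 50.
    print(final_mark(90, 40))

    # Hypothetical student with balanced results: neither cap applies,
    # so the weighted mark is used.
    print(final_mark(65, 75))
```

The sketch shows why a high mark in one component cannot fully compensate for a low mark in the other: whichever of the three terms is smallest determines the result.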

 

Assessment Requirements

Assessment                            Due Date                                  Weighting
Assignment 1                          4 April 2007                              10%
Assignment 2                          9 May 2007                                10%
Reading Assignment                    22 May 2007                               20%
Examination (3 hours, closed book)    Exam period (S1/07) starts on 07/06/07    60%

Assignment specifications will be made available on the unit home page.

Assignment Submission

Assignments will be submitted by electronic submission to http://wfsubmit.cc.monash.edu.au/

Extensions and late submissions

Late submission of assignments

Assignments received after the due date will be subject to a penalty of 20% a day. Assignments received later than one week after the due date will not normally be accepted.

This policy is strict because comments or guidance will be given on assignments as they are returned, and sample solutions may also be published and distributed, after assignment marking or with the returned assignment. 

Extensions

It is your responsibility to structure your study program around assignment deadlines, family, work and other commitments. Factors such as normal work pressures, vacations, etc. are seldom regarded as appropriate reasons for granting extensions. 

Requests for extensions must be made by email to the unit lecturer at least two days before the due date. You will be asked to forward original medical certificates in cases of illness, and may be asked to provide other forms of documentation where necessary.

Grading of assessment

Assignments, and the unit, will be marked and allocated a grade according to the following scale:

Grade Percentage/description
HD High Distinction - very high levels of achievement, demonstrated knowledge and understanding, skills in application and high standards of work encompassing all aspects of the tasks.
In the 80+% range of marks for the assignment.
D Distinction - high levels of achievement, but not of the same standards. May have a weakness in one particular aspect, or overall standards may not be quite as high.
In the 70-79% range.
C Credit - sound pass displaying good knowledge or application skills, but some weaknesses in the quality, range or demonstration of understanding.
In the 60-69% range.
P Pass - acceptable standard, showing an adequate basic knowledge, understanding or skills, but with definite limitations on the extent of such understanding or application. Some parts may be incomplete.
In the 50-59% range.
N Not satisfactory - failure to meet the basic requirements of the assessment.
Below 50%.

Assignment return

We will aim to have assignment results made available to you within two weeks after assignment receipt.

Feedback

Feedback to you

You will receive feedback on your work and progress in this unit. This feedback may be provided through your participation in tutorials and class discussions, as well as through your assignment submissions. It may come in the form of individual advice, marks and comments, or it may be provided as comment or reflection targeted at the group. It may be provided through personal interactions, such as interviews and on-line forums, or through other mechanisms such as on-line self-tests and publication of grade distributions.

Feedback from you

You will be asked to provide feedback to the Faculty through a Unit Evaluation survey at the end of the semester. You may also be asked to complete surveys to help teaching staff improve the unit and unit delivery. Your input to such surveys is very important to the faculty and the teaching staff in maintaining relevant and high quality learning experiences for our students.

And if you are having problems

It is essential that you take action immediately if you realise that you have a problem with your study. The semester is short, so we can help you best if you let us know as soon as problems arise. Regardless of whether the problem is related directly to your progress in the unit, if it is likely to interfere with your progress you should discuss it with your lecturer or a Community Service counsellor as soon as possible.

Unit improvements

Based on student feedback, additional tutorials and a study plan have been provided. This is in addition to updates to the unit content and study guides.

Plagiarism and cheating

Plagiarism and cheating are regarded as very serious offences. In cases where cheating has been confirmed, students have been severely penalised, from losing all marks for an assignment to facing disciplinary action at the Faculty level. While we would wish that all our students adhere to sound ethical conduct and honesty, we ask you to acquaint yourself with Student Rights and Responsibilities and the Faculty regulations that apply to students detected cheating, as these will be applied in all detected cases.

In this University, cheating means seeking to obtain an unfair advantage in any examination or any other written or practical work to be submitted or completed by a student for assessment. It includes the use, or attempted use, of any means to gain an unfair advantage for any assessable work in the unit, where the means is contrary to the instructions for such work. 

When you submit an individual assessment item, such as a program, a report, an essay, assignment or other piece of work, under your name you are understood to be stating that this is your own work. If a submission is identical with, or similar to, someone else's work, an assumption of cheating may arise. If you are planning on working with another student, it is acceptable to undertake research together, and discuss problems, but it is not acceptable to jointly develop or share solutions unless this is specified by your lecturer. 

Intentionally providing students with your solutions to assignments is classified as "assisting to cheat" and students who do this may be subject to disciplinary action. You should take reasonable care that your solution is not accidentally or deliberately obtained by other students. For example, do not leave copies of your work in progress on the hard drives of shared computers, and do not show your work to other students. If you believe this may have happened, please be sure to contact your lecturer as soon as possible.

Cheating also includes taking into an examination any material contrary to the regulations, including any bilingual dictionary, whether or not with the intention of using it to obtain an advantage.

Plagiarism involves the false representation of another person's ideas, or findings, as your own by either copying material or paraphrasing without citing sources. It is both professional and ethical to reference clearly the ideas and information that you have used from another writer. If the source is not identified, then you have plagiarised work of the other author. Plagiarism is a form of dishonesty that is insulting to the reader and grossly unfair to your student colleagues.

Communication

Communication methods

Questions related to the unit should always be posted in the relevant newsgroups. This allows all students of this unit to share and learn from the experiences of individual students.

If you have questions of a personal nature, contact your Unit Adviser via email. This is the preferred option for ease of contact and rapid turn-around of replies.

Notices

All students in this unit will have access to message areas known as newsgroups for unit discussion and information on the unit web site. You may post any questions you have to the appropriate newsgroup. You may also use these message areas to interact with staff and students. There will be at least three newsgroups, which are:

  • Notices
  • General
  • Assignments
All important announcements about the unit will be made in the first newsgroup. You can access these newsgroups at the unit web site at http://muso.monash.edu.au. You should visit these message areas at least weekly, to get the latest information about your studies.

Consultation Times

On-campus consultation hours:

Monday 2pm-3pm

Thursday 2pm-3pm

If you need to communicate directly with your unit adviser/lecturer or tutor outside of consultation periods, you may contact the lecturer and/or tutors at:

Associate Professor Kai Ming Ting
Director of Undergraduate Studies
Phone +61 3 9902 6241

All email communication to you from your lecturer will occur through your Monash student email address. Please ensure that you read it regularly, or forward your email to your main address. Also check that your contact information registered with the University is up to date in My.Monash.

Last updated: Mar 7, 2007