CSE4500 Information retrieval systems - Semester 1, 2008

Unit leader :

Maria Indrawan

Lecturer(s) :

Caulfield

  • Maria Indrawan

Tutor(s) :

Caulfield

  • Flora Salim

Introduction

Unit synopsis

This unit focuses on the theory and practice of information retrieval, in particular the retrieval of data from non-traditional databases. Examples of non-traditional databases are XML collections and text databases. The XML technologies covered include DTD, XML Schema, XPath and XSLT. The text retrieval issues include indexing, storing, retrieving and measuring performance. An introduction to the design and implementation of search engines is also provided.

Learning outcomes

On completion of the subject, students should:

  • have an understanding of the roles of XML in providing IT solutions to organisations.
  • have knowledge of the different XML technologies and their roles in providing XML solutions to organisational problems.
  • have knowledge of specific issues and requirements related to the adoption of XML technologies in organisations, in particular XML documents, DTD, XML Schema, XPath and XSLT.
  • be able to design and create a well-formed and valid XML document.
  • be able to retrieve and transform XML documents into a number of different presentation formats.
  • have an understanding of the different issues related to storing and retrieving textual data.
  • be able to identify the different components of a text retrieval system.
  • be able to evaluate the different techniques used in building text retrieval systems.
  • have an understanding of the different design and implementation issues related to search engines.

Workload

This unit expects students to spend a total of 8-10 hours per week:

  • 2 hours of lecture attendance
  • 2 hours of tutorial attendance
  • 4-8 hours of personal study time to prepare and complete exercises before attending the tutorial class, and to prepare and complete the specified assignments. The exact number of hours may vary from week to week.

Unit relationships

Prerequisites

Before attempting this unit you must have satisfactorily completed CSE9002 or equivalent.

You should have knowledge of

  • HTML
  • Relational database concepts, including SQL and indexing.
  • File organisation.
  • Understanding of a programming language

Relationships

CSE4500 is an elective unit in the MIT, MAIT and the MIT (Minor Thesis).

Continuous improvement

Monash is committed to 'Excellence in education' and strives for the highest possible quality in teaching and learning. To monitor how successful we are in providing quality teaching and learning, Monash regularly seeks feedback from students, employers and staff. Two of the formal ways that you are invited to provide feedback are through Unit Evaluations and through Monquest Teaching Evaluations.

Unit Evaluation Surveys are one of the key formal ways for students to provide feedback. It is Monash policy for every unit offered to be evaluated each year. Students are strongly encouraged to complete the surveys as they are an important avenue for students to "have their say". The feedback is anonymous and provides the Faculty with evidence of the aspects that students are satisfied with and the areas that need improvement.

Student Evaluations

The Faculty of IT administers the Unit Evaluation surveys online through the my.monash portal, although for some smaller classes there may be alternative evaluations conducted in class.

If you wish to view how previous students rated this unit, please go to http://www.monash.edu.au/unit-evaluation-reports/

Over the past few years the Faculty of Information Technology has made a number of improvements to its courses as a result of unit evaluation feedback. Some of these include systematic analysis and planning of unit improvements, and consistent assignment return guidelines.

Monquest Teaching Evaluation surveys may be used by some of your academic staff this semester. They are administered by the Centre for Higher Education Quality (CHEQ) and may be completed in class with a facilitator or online through the my.monash portal. The data provided to lecturers is completely anonymous. Monquest surveys provide academic staff with evidence of the effectiveness of their teaching and identify areas for improvement. Individual Monquest reports are confidential; however, you can see the summary results of Monquest evaluations for 2006 at http://www.adm.monash.edu.au/cheq/evaluations/monquest/profiles/index.html

Unit staff - contact details

Unit leader

Dr Maria Indrawan
Senior Lecturer
Phone +61 3 990 31916
Fax +61 3 990 31077

Contact hours : Monday, 10 AM - 12 PM

Lecturer(s) :

Dr Maria Indrawan
Senior Lecturer
Phone +61 3 990 31916
Fax +61 3 990 31077

Tutor(s) :

Ms Flora Salim

Teaching and learning method

The lectures will introduce the basic principles of each topic. Practical exercises during tutorial classes will reinforce these principles and explore each topic further.

The assignments will be used to consolidate the knowledge and skills learnt during lectures and tutorials. The assignments are problem-solving tasks that allow students to apply what they have learnt in the lectures and tutorials to a case study.

Tutorial allocation

On-campus students should register for tutorials/laboratories using Allocate+.

Communication, participation and feedback

Monash aims to provide a learning environment in which students receive a range of ongoing feedback throughout their studies. You will receive feedback on your work and progress in this unit. This may take the form of group feedback, individual feedback, peer feedback, self-comparison, verbal and written feedback, discussions (online and in class) as well as more formal feedback related to assignment marks and grades. You are encouraged to draw on a variety of feedback to enhance your learning.

It is essential that you take action immediately if you realise that you have a problem that is affecting your study. Semesters are short, so we can help you best if you let us know as soon as problems arise. Regardless of whether the problem is related directly to your progress in the unit, if it is likely to interfere with your progress you should discuss it with your lecturer or a Community Service counsellor as soon as possible.

Unit Schedule

Week  Topic                                              Key dates
1     Introduction to XML
2     Designing XML based data storage
3     XML Schema part 1
4     XML Schema part 2
      Mid semester break
5     XML namespace
6     XPath, XSLT part 1
7     XSLT part 2                                        Assignment 1 Due, Monday 12 Noon. Presentation will be conducted in tutorials
8     XSLT part 3
9     Introduction to Text Retrieval
10    Text Indexing and Storage
11    Text Retrieval Model and performance               Assignment 2 Due, Monday 12 Noon
12    Distributed Text Retrieval Systems/Search Engines
13    Revision

Unit Resources

Prescribed text(s) and readings

Recommended text(s) and readings

Dwight Peltzer, XML: Language Mechanics & Applications, Addison Wesley, 2004, ISBN 0-201-77168-3

Required software and/or hardware

XML Writer 2.6, Wattle Software

Software may be:

  • downloaded from www.xml.org (30-day evaluation copy)

Equipment and consumables required or provided

N/A

Study resources

The study resources we provide are available on the unit website, which is hosted on the Blackboard online learning system and accessible from my.monash.edu.au. The website contains:

  • lecture notes
  • tutorial exercises
  • assignment specifications
  • discussion groups
  • this Unit Guide, outlining the administrative information for the unit

Library access

The Monash University Library site contains details about borrowing rights and catalogue searching. To learn more about the library and the various resources available, please go to http://www.lib.monash.edu.au. Be sure to obtain a copy of the Library Guide, and if necessary, the instructions for remote access from the library website.

Monash University Studies Online (MUSO)

All unit and lecture materials are available through MUSO (Monash University Studies Online). Blackboard is the primary application used to deliver your unit resources. Some units will be piloted in Moodle.

You can access MUSO and Blackboard via the portal (http://my.monash.edu.au).

Click on the Study and enrolment tab, then Blackboard under the MUSO learning systems.

In order for your Blackboard unit(s) to function correctly, your computer needs to be correctly configured.

For example :

  • Blackboard supported browser
  • Supported Java runtime environment

For more information, please visit

http://www.monash.edu.au/muso/support/students/downloadables-student.html

You can contact MUSO Support by phone: (+61 3) 9903 1268

For further contact information including operational hours, please visit

http://www.monash.edu.au/muso/support/students/contact.html

Further information can be obtained from the MUSO support site:

http://www.monash.edu.au/muso/support/index.html

If your unit is piloted in Moodle, you will see a link from your Blackboard unit to Moodle at http://moodle.med.monash.edu.au.
From the Faculty of Information Technology category, click on the link for your unit.

Assessment

Unit assessment policy

To pass the unit, students need to obtain:

  • at least 50% of the total marks available in the unit, and
  • at least 40% of the available marks in the assignments and presentation, and
  • at least 40% of the available marks in the exam.

If a student does not achieve 40% or more in the unit examination or in the unit non-examination assessment, then a mark of no greater than 44-N will be recorded for the unit.
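
As an illustration only, the pass rule above can be sketched in Python. This is a minimal sketch and not an official calculator. The function name `unit_result` is hypothetical, the 50/50 split between non-examination assessment and the exam is taken from the weightings listed in this guide, and reading "no greater than 44" as capping the recorded mark at 44 is an assumption.

```python
from typing import Tuple

# Assumed from the weightings in this guide:
# assignments and presentation = 20 + 20 + 10 marks, exam = 50 marks.
NON_EXAM_TOTAL = 50.0
EXAM_TOTAL = 50.0
HURDLE = 0.40          # minimum share required in each component
PASS_MARK = 50.0       # minimum overall mark
CAPPED_MARK = 44.0     # assumed cap when a hurdle is missed


def unit_result(non_exam_marks: float, exam_marks: float) -> Tuple[float, bool]:
    """Return (recorded_mark, passed) for marks out of 50 in each component."""
    total = non_exam_marks + exam_marks
    met_hurdles = (
        non_exam_marks >= HURDLE * NON_EXAM_TOTAL
        and exam_marks >= HURDLE * EXAM_TOTAL
    )
    # If either hurdle is missed, the recorded mark is capped at 44.
    recorded = total if met_hurdles else min(total, CAPPED_MARK)
    passed = met_hurdles and total >= PASS_MARK
    return recorded, passed


if __name__ == "__main__":
    # 40/50 in assignments but only 15/50 in the exam: exam hurdle missed.
    print(unit_result(40, 15))
```

Under these assumptions, a student with a high assignment mark but an exam mark below 20 out of 50 does not pass, even if the raw total exceeds 50.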

Assignment tasks

  • Assignment Task

    Title : XML Design

    Description :

    Students will be required to design an XML data storage solution. The design covers the distribution of data across several XML documents and the XML schemas that allow the proposed XML documents to be validated.

    Weighting : 20%

    Criteria for assessment :

    The following components will be assessed in the assignment:

    • creation of XML schema(s) to represent the case study.
    • creation of XML document(s) that are valid according to the XML schema(s) created.
    • quality of the design of the XML schema(s) and XML document(s).
    • discussion of the design decisions made to produce the submitted work.

    Due date : Monday, 14th April 2008, 12 Noon

    Remarks :

    All hard copies of the assignment must be placed in the assignment box on level 6 of H building. The soft copy is to be packaged and submitted online via MUSO.

  • Assignment Task

    Title : XML Programming

    Description :

    Students will be required to write XSLT to retrieve data from a set of XML documents.

    Weighting : 20%

    Criteria for assessment :

    The following components will be assessed:

    • correctness of the XSLT scripts produced to meet the queries specified in the assignment.
    • design of the XSLT scripts

    Due date : Monday, 12th May 2008, 12 Noon

    Remarks :

    All hard copies of the assignment must be placed in the assignment box on level 6 of H building. The soft copy is to be packaged and submitted online via MUSO.

  • Assignment Task

    Title : Presentation

    Description :

    Students will be expected to present their design solution for assignment 1 during their tutorial in week 7.

    Weighting : 10%

    Criteria for assessment :

    Presentation will be marked based on content, structure and delivery of the presentation.

    Due date : Week 7 tutorials

Examinations

  • Examination

    Weighting : 50%

    Length : 2 hours

    Type : Closed book

Assignment submission

The XML Design and XML Programming assignments must be submitted in both printed and soft copy formats.

All hard copies must be submitted to the Caulfield School of IT assignment box, located on level 6 of H building, by 12 noon on the due date.

All soft copies are to be submitted via Blackboard.

Assignment coversheets

The assignment cover sheet is available electronically via Blackboard. A link to the cover sheet will be set up on the assessment page of the unit website.

University and Faculty policy on assessment

Due dates and extensions

The due dates for the submission of assignments are given in the previous section. Please make every effort to submit work by the due dates. It is your responsibility to structure your study program around assignment deadlines, family, work and other commitments. Factors such as normal work pressures, vacations, etc. are seldom regarded as appropriate reasons for granting extensions. Students are advised to NOT assume that granting of an extension is a matter of course.

Requests for extensions must be made to Maria Indrawan at least two days before the due date. You will be asked to forward original medical certificates in cases of illness, and may be asked to provide other forms of documentation where necessary. A copy of the email or other written communication of an extension must be attached to the assignment submission.

Late assignment

Assignments received after the due date will be subject to a penalty of 10% of the available mark per day. Submissions received later than one week after the due date will not normally be accepted.
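
As an illustration only, the late penalty can be worked through in Python. This is a minimal sketch under stated assumptions, none of which are confirmed by this guide: the function name `late_adjusted_mark` is hypothetical, each started day after the deadline counts as a full day, a submission more than one week late is treated as not accepted and scores zero, and the adjusted mark cannot fall below zero.

```python
import math
from datetime import datetime

PENALTY_PER_DAY = 0.10   # 10% of the available mark per day late
MAX_DAYS_LATE = 7        # later than one week: not normally accepted
SECONDS_PER_DAY = 86400


def late_adjusted_mark(raw_mark: float, available: float,
                       due: datetime, received: datetime) -> float:
    """Return the mark after applying the late penalty (assumptions above)."""
    if received <= due:
        return raw_mark
    # Assumption: any part of a day counts as a whole day late.
    days_late = math.ceil((received - due).total_seconds() / SECONDS_PER_DAY)
    if days_late > MAX_DAYS_LATE:
        return 0.0  # assumption: treated as not accepted
    penalty = PENALTY_PER_DAY * available * days_late
    return max(0.0, raw_mark - penalty)


if __name__ == "__main__":
    due = datetime(2008, 4, 14, 12, 0)        # Assignment 1 deadline
    received = datetime(2008, 4, 16, 9, 0)    # counted as two days late
    print(late_adjusted_mark(16.0, 20.0, due, received))
```

In this example an assignment worth 20 marks that earns 16 and is counted as two days late loses 4 marks. The penalty is calculated on the available mark, not on the mark earned.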

Return dates

Students can expect assignments to be returned within two weeks of the submission date or after receipt, whichever is later.

Assessment for the unit as a whole is in accordance with the provisions of the Monash University Education Policy at http://www.policy.monash.edu/policy-bank/academic/education/assessment/

We will aim to have assignment results made available to you within two weeks after assignment receipt.

Plagiarism, cheating and collusion

Plagiarism and cheating are regarded as very serious offences. In cases where cheating has been confirmed, students have been severely penalised, from losing all marks for an assignment to facing disciplinary action at the Faculty level. While we would wish that all our students adhere to sound ethical conduct and honesty, we ask you to acquaint yourself with Student Rights and Responsibilities (http://www.infotech.monash.edu.au/about/committees-groups/facboard/policies/studrights.html) and the Faculty regulations that apply to students detected cheating, as these will be applied in all detected cases.

In this University, cheating means seeking to obtain an unfair advantage in any examination or any other written or practical work to be submitted or completed by a student for assessment. It includes the use, or attempted use, of any means to gain an unfair advantage for any assessable work in the unit, where the means is contrary to the instructions for such work. 

When you submit an individual assessment item, such as a program, a report, an essay, assignment or other piece of work, under your name you are understood to be stating that this is your own work. If a submission is identical with, or similar to, someone else's work, an assumption of cheating may arise. If you are planning on working with another student, it is acceptable to undertake research together, and discuss problems, but it is not acceptable to jointly develop or share solutions unless this is specified by your lecturer. 

Intentionally providing students with your solutions to assignments is classified as "assisting to cheat" and students who do this may be subject to disciplinary action. You should take reasonable care that your solution is not accidentally or deliberately obtained by other students. For example, do not leave copies of your work in progress on the hard drives of shared computers, and do not show your work to other students. If you believe this may have happened, please be sure to contact your lecturer as soon as possible.

Cheating also includes taking into an examination any material contrary to the regulations, including any bilingual dictionary, whether or not with the intention of using it to obtain an advantage.

Plagiarism involves the false representation of another person's ideas, or findings, as your own by either copying material or paraphrasing without citing sources. It is both professional and ethical to reference clearly the ideas and information that you have used from another writer. If the source is not identified, then you have plagiarised work of the other author. Plagiarism is a form of dishonesty that is insulting to the reader and grossly unfair to your student colleagues.

Register of counselling about plagiarism

The university requires faculties to keep a simple and confidential register to record counselling to students about plagiarism (e.g. warnings). The register is accessible to Associate Deans Teaching (or nominees) and, where requested, students concerned have access to their own details in the register. The register is to serve as a record of counselling about the nature of plagiarism, not as a record of allegations; and no provision of appeals in relation to the register is necessary or applicable.

Non-discriminatory language

The Faculty of Information Technology is committed to the use of non-discriminatory language in all forms of communication. Discriminatory language is that which refers in abusive terms to gender, race, age, sexual orientation, citizenship or nationality, ethnic or language background, physical or mental ability, or political or religious views, or which stereotypes groups in an adverse manner. This is not meant to preclude or inhibit legitimate academic debate on any issue; however, the language used in such debate should be non-discriminatory and sensitive to these matters. It is important to avoid the use of discriminatory language in your communications and written work. The most common form of discriminatory language in academic work tends to be in the area of gender inclusiveness. You are, therefore, requested to check for this and to ensure your work and communications are non-discriminatory in all respects.

Students with disabilities

Students with disabilities that may disadvantage them in assessment should seek advice from one of the following before completing assessment tasks and examinations:

Deferred assessment and special consideration

Deferred assessment (not to be confused with an extension for submission of an assignment) may be granted in cases of extenuating personal circumstances such as serious personal illness or bereavement. Information and forms for Special Consideration and deferred assessment applications are available at http://www.monash.edu.au/exams/special-consideration.html. Contact the Faculty's Student Services staff at your campus for further information and advice.