Honours Stream of Bachelor of Software Engineering:

Monash University, Clayton School of IT

BCS, BSE and BBIS Honours Projects 2008

Honours projects for the Clayton School of IT, 2008

Note that some BSE projects are 12-pts, and that BCS/BBIS projects are 24-pts. Some projects come in 24- and 12-pt versions; the latter involves less work.

Supervisors as of Monday 11th February 2008.


A Smart Client for Solving Computational Inversion Problems (12 or 24 pts)
David Abramson
Project-id: Abramson-inverse
Inverse problems are very common in science and engineering. They can occur when we only know the outputs of a process and want to know the input values that caused them. In geology, an example might be to decide what parameter values, or initial conditions, are needed to drive a model of the Earth's crust to produce realistic output. Inverse problems are difficult to solve; however, most solutions combine non-linear optimization with computational modeling. We have developed a family of tools called Nimrod that can be used to solve inverse problems. Nimrod allows a user to specify some design goal, and the system searches the parameter space automatically.
Normally, it is necessary to convert the output of the computation to a single objective cost function value. However, many of our users have indicated that whilst they cannot objectively give a performance metric for a given solution, they recognise a good solution when they see one. This suggests that subjective analysis by users might be useful as part of the design approach. We have been experimenting with the design of such systems in which the user sees a number of solutions, and then ranks them for evaluation by a rank-driven optimization algorithm. Thus, the expert user is an integral part of the evaluation process rather than blindly using a deterministic mathematical equation.
In this project we plan to build a Smart Client that uses Nimrod to generate multiple solutions using computational models, visualizes them, and then displays them for a user to assess.
 
High-performance protein crystallography using GRID computing (12 or 24 pts)
David Abramson and Ashley Buckle (Medicine)
Project-id: Abramson-Buckle
X-ray crystallography is the most powerful technique for determining the 3-dimensional structures of proteins, but is increasingly dependent on high-performance computing. The aim of this project is to develop solutions that will enable a wide range of crystallographic calculations to be performed on available GRID resources.

Simulation in Geostatistics (12pt or 24pt)
David Albrecht
Project-id: Alb-geosim

Consider the problem of determining the amount of soil contamination
given several samples in a contaminated site. If you also have a
geostatistical model of how the contamination varies in the site, then
you could use simulation to estimate the distribution of the soil
contamination. In this project we will investigate various methods of
doing simulation in geostatistics.

Modeling Geostatistics Data (12pt or 24pt)
David Albrecht
Project-id: Alb-geomodel

Geostatistics is an area of research concerned with the study of
data distributed in space or in time and space. Examples
include:

* ore grades in a mineral deposit,
* depth and thickness of a geological layer,
* density of trees in a forest,
* rainfall over a catchment area, and
* concentrations of pollutants in a contaminated site.

One standard approach to model the variation of the data is the use of
variograms. In this project we will investigate different approaches to
fit variograms to data.
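
As a rough illustration of what a variogram is fitted to, the sketch below computes an empirical semivariogram with the classical estimator (half the mean squared difference between sample values, grouped by separation distance). The function name, the 2-D point format and the fixed bin width are assumptions made for this example; they are not part of the project description.

```python
from collections import defaultdict


def empirical_variogram(points, values, bin_width):
    """Return {bin_index: semivariance}; bin b covers distances [b*w, (b+1)*w)."""
    sums = defaultdict(float)   # sum of squared differences per distance bin
    counts = defaultdict(int)   # number of sample pairs per distance bin
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            dx = points[i][0] - points[j][0]
            dy = points[i][1] - points[j][1]
            dist = (dx * dx + dy * dy) ** 0.5
            b = int(dist // bin_width)
            sums[b] += (values[i] - values[j]) ** 2
            counts[b] += 1
    # Semivariance is half the mean squared difference in each bin.
    return {b: sums[b] / (2 * counts[b]) for b in sorted(counts)}


if __name__ == "__main__":
    # Three hypothetical samples along a line, one unit apart.
    pts = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    vals = [1.0, 2.0, 4.0]
    print(empirical_variogram(pts, vals, bin_width=1.0))
```

A parametric variogram model (for example spherical or exponential) would then be fitted to these binned values; choosing and comparing such fitting methods is the subject of the project itself.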

 


Approximation-Algorithms for Strict Minimum Message Length Code-Books (12 or 24-pts)
Lloyd Allison and Graham Farr
Project-id: Allison-Farr-MML
Strict Minimum Message Length (SMML) inference is an information-theoretic criterion for machine learning. It was introduced by Wallace and Boulton and has several desirable statistical properties. The SMML code-book problem is to create an optimal code-book that gives the greatest compression of future data. There is a quadratic-time algorithm for data drawn from a binomial distribution, and a good heuristic for the trinomial distribution, but the code-book problem is NP-hard (very probably requires exponential time) in general [FW02]. The project is to devise fast approximation-algorithms, that is algorithms that run quickly and create good, but not necessarily optimal, code books. The first cases to consider will be essentially unbounded one-dimensional in character such as the geometric and Poisson distributions. Optimisations may be possible if the prior on the parameter(s) is "nice". The Gaussian (normal) and related distributions are important for continuous data. The project involves and relates to algorithm analysis, complexity, data compression, and machine learning. You should have good results in mathematics, algorithms and data structures, and formal methods, and must have discussed the project with LA or GF before submitting "the form".

[FW02] G. E. Farr and C. S. Wallace.
The complexity of strict minimum message length inference.
Computer Journal, 45(3), pp.285-292, 2002 [doi:10.1093/comjnl/45.3.285]

Bayesian Evolutionary Trees (12 or 24-pts)
Lloyd Allison and Kevin Korb
Project-id: Allison-Korb-BET

Given K DNA sequences, or protein sequences, the multiple-alignment problem and the evolutionary-tree problem are "chicken and egg" problems: Given an optimal evolutionary-tree one can search for an optimal multiple-alignment of the K sequences (and we know how to do that [LW94]), and given an optimal multiple-alignment of the K sequences one can search for an optimal evolutionary-tree.

An evolutionary-tree is a special case of a Bayesian net [KN04]: a tree rather than a general DAG, with discrete variables (over {A,C,G,T} if DNA) at each node, known values at all leaves, and missing values at all internal (ancestral) nodes, e.g., see fig.4 [LW94].

The project is to modify the CaMML algorithm for Bayesian nets to infer evolutionary trees given a multiple alignment. If that is successful, the feasibility of simultaneously solving the multiple-alignment and evolutionary-tree problems will be investigated.

The project involves algorithms, probability and machine learning. You should have good results in mathematics, algorithms and data structures, and formal methods, and must have discussed the project with LA or KK before submitting "the form".

References:

[LW94] L. Allison and C. S. Wallace.
The Posterior Probability Distribution of Alignments and
its Application to Parameter Estimation of Evolutionary Trees and
to Optimisation of Multiple Alignments.
Jrnl. Molec. Evol., 39(4), pp.418-430, 1994
http://www.csse.monash.edu.au/~lloyd/tildeStrings/Multiple/94.JME/
or
http://dx.doi.org/10.1007/BF00160274

[KN04] K. B. Korb and A. E. Nicholson.
Bayesian Artificial Intelligence.
Chapman and Hall / CRC, 2004.



Artificial Vision and Perception (12 or 24-pts)
Damminda Alahakoon
Project-id: Alahakoon-vis
Vision is indispensable. Building automated machines with human vision capabilities is a formidable challenge in both industry and academia. Much of human learning can be viewed as an unsupervised incremental learning process, and investigating it can contribute significantly to machine learning. In this project our main focus is to investigate how humans accumulate knowledge incrementally from a series of objects, induce a hierarchy of concepts that summarize and organize such observations, and use them in classifying future experiences.

Antepartum Fetus Health Investigations
Damminda Alahakoon
Project-id: Alahakoon-fetus
Pre-labour fetal health is vital for a successful birth. Despite the maternal involvement in assuring fetal health, there are risky deliveries where intervention is mandatory. The only external measurement that relates directly to fetal health is the fetal heart rate pattern. Analysis of these patterns can be used to identify ominous instances. However, the patterns are often misinterpreted by human observers, leading to unnecessary surgical procedures. This project presents a cognitive approach to fetal heart rate interpretation. Artificial knowledge acquisition means are revamped with emphasis on understanding rather than enforcing knowledge.

Bioinformatics
Damminda Alahakoon
Project-id: Alahakoon-bio
Bioinformatics research involves retrieval and analysis of a large amount of biological data such as gene expressions, DNA, RNA, and proteins. Biological data is characterised by its high volume, high dimensionality and complexity, in terms of its structure, its evolving nature and the biological processes involved. Machine learning algorithms have been widely used for solving problems in bioinformatics such as sequence alignment, gene finding and protein domain analysis. This research is an attempt to develop a connectionist system that identifies dimensional changes in biological data.

Biometric Cryptosystem (12 or 24-pts)
Nandita Bhattacharjee and Bala Srinivasan
Project-id: Nandita-bio
Biometric information can serve not only for access or authentication, but also for data protection. It is possible to generate a key from the biometric information that can be used with a cryptographic algorithm to encipher and decipher data. In this project we shall study methods of generating keys from fingerprints or iris codes, addressing the fundamental problems of biometric cryptography: key change and distortion tolerance. Given the biometric, a set of reliable features is extracted. We shall design a system in which we do not need to store a biometric template, but only a string of error-correction data from which the biometric cannot be derived, and from which the key cannot be derived either unless the biometric is present.
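
One way to picture the "error-correction data" idea is a toy fuzzy-commitment sketch: the key is expanded with an error-correcting code and XORed with the biometric bits, and only the result plus a hash of the key is stored. Everything below is an assumption for illustration (bit lists as inputs, a 5-fold repetition code, SHA-256 as the check, and all function names); a repetition code is far too weak for real use and is not the scheme the project will necessarily adopt.

```python
import hashlib

REP = 5  # repetition factor of the toy error-correcting code (hypothetical choice)


def encode(key_bits):
    """Repeat every key bit REP times."""
    return [b for b in key_bits for _ in range(REP)]


def decode(code_bits):
    """Majority vote within each block of REP bits."""
    return [int(sum(code_bits[i:i + REP]) * 2 > REP)
            for i in range(0, len(code_bits), REP)]


def enroll(key_bits, biometric_bits):
    """Return (helper data, key digest); the biometric itself is not stored."""
    if len(biometric_bits) != len(key_bits) * REP:
        raise ValueError("biometric must have len(key_bits) * REP bits")
    helper = [c ^ b for c, b in zip(encode(key_bits), biometric_bits)]
    digest = hashlib.sha256(bytes(key_bits)).hexdigest()
    return helper, digest


def recover(helper, biometric_bits, digest):
    """Recover the key from a noisy reading, or return None if the check fails."""
    candidate = decode([h ^ b for h, b in zip(helper, biometric_bits)])
    if hashlib.sha256(bytes(candidate)).hexdigest() == digest:
        return candidate
    return None


if __name__ == "__main__":
    key = [1, 0, 1, 1]
    reading = [0, 1] * 10             # enrolment reading (20 bits)
    helper, digest = enroll(key, reading)
    noisy = list(reading)
    noisy[0] ^= 1                     # one flipped bit in block 0
    noisy[7] ^= 1                     # one flipped bit in block 1
    print(recover(helper, noisy, digest))
```

The majority vote tolerates up to two flipped bits per block here, which stands in for the distortion tolerance mentioned above; changing the key only requires re-running the enrolment step with the same biometric.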
References
1. Uludag, U.; Pankanti, S.; Prabhakar, S.; Jain, A.K.; "Biometric cryptosystems: issues and challenges", Proceedings of the IEEE, Volume 92, Issue 6, June 2004, Page(s): 948-960.
2. Hao F, Anderson R and Daugman J, "Combining cryptography with biometrics effectively", Technical report number 640, Computer Laboratory, University of Cambridge, July 2005.

Slime Mould Computer Graphics Model and Agent-Based Simulation (12 or 24-pts)
Alan Dorin
Project-id: Dorin-Slime

The name Slime Mould doesn't give an organism a very appealing ring to it... but these "creatures" are really fascinating! There are two types of slime moulds: plasmodial slime moulds and cellular slime moulds. This project is interested in the latter. The trait that makes these slimes so amazing is their diverse range of behaviours during a complex life cycle. They have several "phases" within a motile (moving) period that give the mould the appearance of an animal, and an immotile (static) period that gives them the appearance of a plant or fungus.

Individual cells of the mould start their lives scattered around an area. At a chemical signal triggered by starvation, they all coagulate into a single blob. This then becomes a slug called a plasmodium that may be a few inches long. This slug then wanders around seeking nutrients. When it finds a good spot, the slug turns into a reproductive structure called a fruiting body. This is a fungus-like structure with a stalk and a ball on top that produces spores. These spores are then scattered... and the life cycle starts over again.

The purpose of this project is to build an agent-based simulation of the slime mould's life cycle (to some level of abstraction) and to visualise it using computer graphics techniques. The look of the model is important! The aim is to produce an attractive but true-to-life representation of the organism's appearance and behaviour. A movie of the slime mould plasmodium is online: http://www.youtube.com/watch?v=96U-6iU8W_A Some decent illustrations of the various phases of the life cycle are online too: http://universe-review.ca/R10-18-slimemoulds.htm


Rating and ranking players and teams using MML (12 or 24 pts)
David Dowe
Project-id: Dowe-ranking

Ratings are used in chess, tennis, golf, other sports, etc. in order
to rank both teams and individual players. A variety of systems are
used to do the ratings and rankings - including Elo, Glicko, Sonas and
systems more concerned with prize money (over the past 12 months).
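
For orientation, the standard Elo update can be sketched as follows. This is the conventional baseline, not the MML-based system the project aims to develop; the function names and the K-factor of 32 are assumptions chosen for the example.

```python
def expected_score(rating_a, rating_b):
    """Expected score of player A against player B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(rating_a, rating_b, score_a, k=32):
    """Return new ratings after one game; score_a is 1 (win), 0.5 (draw) or 0 (loss)."""
    e_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - e_a)
    new_b = rating_b + k * ((1 - score_a) - (1 - e_a))
    return new_a, new_b


if __name__ == "__main__":
    # Two equally rated players; A wins.
    print(update(1500, 1500, 1))
```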

We can re-visit and improve these systems using the Minimum Message
Length (MML) principle (Wallace & Boulton, 1968)(Wallace & Freeman,
1987)(Wallace & Dowe, 1999a, which has been the Computer Journal's most
downloaded article)(Wallace, posthumous, 2005)(Comley and Dowe, 2005)
to arrive at a comparatively simple model with an improved and
near-optimal predictive accuracy on future games and contests.

This project will require strong mathematics - calculus (partial
derivatives, second-order partial derivatives, integration by parts,
determinants of matrices, etc.).

ComleyDowe2005 Comley, Joshua W. and D.L. Dowe (2005). Minimum
Message Length, MDL and Generalised Bayesian Networks with Asymmetric
Languages, Chapter 11 (pp265-294) in "Advances in Minimum Description
Length: Theory and Applications", M.I.T. Press, April 2005.
[Final camera-ready copy submitted Oct. 2003.]

DoweGardnerOppy2007 Dowe, D. L., S. Gardner and G. Oppy (2007).
Bayes not bust! Why simplicity is no problem for Bayesians. British
J. Philos. Sci., pp709-754.

Wallace2005

WallaceBoulton1968

WallaceDowe1999a

WallaceFreeman1987

Minimum Message Length analysis in the Higgs boson search (24 pts)
David Dowe
Project-id: Dowe-hbs

The existence of the Higgs boson is central to the structure of the standard
model of elementary particles in Physics. The current search for the Higgs
boson involves experiments at the Fermilab Tevatron and at the CERN Large
Hadron Collider (LHC).
Simulation software is available for the outputs of the Tevatron and the LHC
for the standard particle model and for various other competing models. This
enables us to statistically analyze the signals of the presence of a Higgs
boson in the data, in advance.

The main statistical approach considered would be the Minimum Message
Length (MML) principle, an approach from Bayesian information theory which
quantitatively trades off the simplicity of a hypothesis against its
goodness of fit to the observed data. Our search problem is partly a problem
in mixture modelling (or clustering), where we wish to identify whether or not
there is sufficient evidence that there is a component of the observed data
which is best explained by the presence of the Higgs boson.

The student should have a reasonably strong background in mathematics (partial
derivatives, integration, matrix determinants, etc.) and advanced quantum
physics.

References:
D. L. Dowe, S. Gardner and G.R. Oppy (2007) "Bayes not Bust! Why Simplicity
is no Problem for Bayesians", British Journal for the Philosophy of Science
(BJPS), Dec. 2007, pp709-754.

Wallace2005

An MML-guided approach to reconstruct images from asymmetric sets of noisy discrete projections (24 pts)
A./Prof. David Dowe (Clayton I.T.) and Dr Imants Svalbe (Physics)
Project-id: dld-tomog

The optimal reconstruction of images from traditional computer
tomography (CT) projections in the presence of real noise is a mature
problem with several practical solutions. This project proposes use
of a combination of symmetric (Finite Radon Transform) and asymmetric
(Mojette Transform) discrete projection techniques to reconstruct
images of test data and real images, and uses Minimum Message Length
techniques to optimise the image reconstruction for a variety of assumed
noise models. It also aims to use MML-related techniques to guide the
selection of the best views to capture the minimal amount of
sufficient real data required to reconstruct variable resolution images.

References:

Comley, J. W. and D. L. Dowe (2005). Minimum Message Length and
Generalized Bayesian Networks with Asymmetric Languages, Chapter 11
(pp265-294) in "Advances in Minimum Description Length: Theory and
Applications", M.I.T. Press, April 2005.
[Final camera-ready copy submitted Oct. 2003.]

D. L. Dowe, S. Gardner and G. R. Oppy (2007) "Bayes not Bust! Why
Simplicity is no Problem for Bayesians", Brit. Journal Philos. Sci.
(BJPS), Dec. 2007, pp709-754.

Guedon, JP and Normand, N., The mojette transform: the first ten years,
LNCS 3429 (2005) 79-91

Matus, F. and Flusser, J., Image representation via a finite Radon
transform, IEEE PAMI 15(10) (1993) 996-1006

Wallace2005

Minimum Message Length Support Vector Machines (24 pts)
David Dowe
Project-id: dld-mmidvm

Support Vector Machines (SVMs) are a popular approach to classification
in machine learning and "data mining". They are usually only used to
divide between two classes ("yes"/"no" or "positive"/"negative"), and
they are typically unable to give probabilities with their
predictions. They also have some arbitrariness in the choice of "kernel"
functions for specifying non-linear boundaries. Using Minimum Message
Length (MML) approaches such as those in Tan & Dowe (2004), other notions
intimated in Dowe (2007) and some previously overlooked coding
inefficiencies, we will be able to overcome all these shortcomings.
This will enable us to come up with comparatively simple SVMs which give
excellent (probabilistic) predictions on multi-class problems, possibly
using non-linear cuts.

The mathematics in this project will not be trivial.

References:

ComleyDowe2005

D. L. Dowe (2007), Discussion following "Hedging Predictions in Machine
Learning, A. Gammerman and V. Vovk", Computer Journal, Vol. 50, No. 2,
March 2007, pp167-168

D. L. Dowe, S. Gardner and G. R. Oppy (2007) "Bayes not Bust! Why Simplicity
is no Problem for Bayesians", Brit. Journal Philos. Sci. (BJPS), Dec. 2007,
pp709-754.

P. J. Tan and D. L. Dowe (2004). MML Inference of Oblique Decision Trees,
Proc. 17th Australian Joint Conf. on Artificial Intelligence (AI'04),
Dec. 2004, Lecture Notes in Artificial Intelligence (LNAI) 3339, Springer,
pp1082-1088.

Wallace2005

Limits in inference of games and markets (12 or 24 pts)
David Dowe
Project-id: dld-games

In games such as "rock, paper, scissors", we typically first seek
to maximise our returns against an opponent's best strategy, and then
we seek to exploit any systematic sub-optimalities in our opponents'
play. There is a balance between trying simultaneously to play an
optimal strategy and learn a model for the opponents' strategies.
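
The exploitation half of that balance can be sketched with a simple frequency model: estimate the opponent's move probabilities from past play and pick the move with the highest expected payoff. This is only an illustration; the Laplace smoothing, the payoff values and the function names are assumptions, and it is not the MML inference the project proposes.

```python
from collections import Counter

MOVES = ("rock", "paper", "scissors")
BEATS = {"rock": "paper", "paper": "scissors", "scissors": "rock"}  # BEATS[x] beats x


def payoff(mine, theirs):
    """+1 for a win, 0 for a tie, -1 for a loss."""
    if mine == theirs:
        return 0
    return 1 if BEATS[theirs] == mine else -1


def best_response(history):
    """Pick the move with the highest expected payoff against the observed frequencies."""
    counts = Counter(history)
    total = len(history) + len(MOVES)
    # Add-one smoothing so unseen moves keep a non-zero probability.
    probs = {m: (counts[m] + 1) / total for m in MOVES}
    return max(MOVES, key=lambda mine: sum(p * payoff(mine, theirs)
                                           for theirs, p in probs.items()))


if __name__ == "__main__":
    # A hypothetical opponent who plays rock far too often.
    seen = ["rock"] * 8 + ["paper"] + ["scissors"]
    print(best_response(seen))
```

A player who always follows such a best response becomes predictable in turn, which is the tension between optimal play and opponent modelling described above.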

The project will start with just one other player and then build
up to games with several other players and games where the number of
other players is not known. We shall eventually seek to infer the
total number of players and their strategies. This has relevance to
financial markets. The ability of MML to infer when the amount of
data per parameter is very limited (e.g., Dowe & Wallace (1997)) will
be central here.

The mathematics in this project will not be trivial. An interest
in game theory is essential; a knowledge of game theory is desirable.

References:

ComleyDowe2005

D. L. Dowe, S. Gardner and G. R. Oppy (2007) "Bayes not Bust! Why Simplicity
is no Problem for Bayesians", Brit. Journal Philos. Sci. (BJPS), Dec. 2007,
pp709-754.

Dowe, D. L. and C. S. Wallace (1997). Resolving the Neyman-Scott Problem by
Minimum Message Length. Computing Science and Statistics (Vol. 28), Proc.
28th Symposium on the interface, Sydney, Australia, pp614-618, 1997.

Wallace2005

WallaceDowe1999a

MML inference of systems of differential equations (24 pts)
David Dowe
Project-id: dld-diffeq

Many real-world systems (aerodynamic, biological, economic, financial,
meteorological, etc.) are modelled by systems of one or more
differential equations - such as Bernoulli, Navier-Stokes, etc.
Such systems are often well-known, but even then they are typically
inexact. One such example is turbulence and the difficulty in
properly accounting for it.

This project will start off simply with data from which we will infer
a differential equation with only one unknown. We will then introduce
a noise term which will enable us to model data which we cannot fit
exactly - due to any number of effects, ranging from quantum mechanical
to measurement inaccuracies to the fact that the system is simply too
complicated for our model.

Our modelling will include Minimum Message Length (MML) - variously
because of a variety of optimality properties and its ability to use
simple models that fit more accurately than the more complex models
of other approaches.

The mathematics in this project will not be trivial.

References:

Comley, J. W. and D. L. Dowe (2005). Minimum Message Length and
Generalized Bayesian Networks with Asymmetric Languages, Chapter 11
(pp265-294) in "Advances in Minimum Description Length: Theory and
Applications", M.I.T. Press, April 2005.
[Final camera-ready copy submitted Oct. 2003.]

D. L. Dowe, S. Gardner and G. R. Oppy (2007) "Bayes not Bust! Why Simplicity
is no Problem for Bayesians", Brit. Journal Philos. Sci. (BJPS), Dec. 2007,
pp709-754.

Wallace2005

WallaceDowe1999a

Modelling language change (24 points)

Project-id: dld-ling

Supervisors:
A./Prof. David Dowe (IT) and Dr Simon Musgrave (Linguistics)

Languages spoken by humans change over time. For example, French, Italian and
many European languages originate from Latin, but each diverges from Latin in
rather different ways. English is a more complex case, with influences from
Latin and French overlaying its basic Germanic nature. Representing language
by the sounds of spoken words, we use Minimum Message Length (MML) to model
how languages might have developed. Our aim is to reach a model which best
approximates the known history of a language group by running simulations
which assign different probabilities to different types of change, e.g.
internal variation, development of regularity by analogy with existing forms,
and external influences (borrowing). This project will be quite mathematical.

References:
Cangelosi, Angelo and Domenico Parisi (eds.) (2002). Simulating the Evolution
of Language. Berlin: Springer-Verlag.

Dowe, Gardner & Oppy (2007)

Kosmidis, Kosmas, John M. Halley, and Panos Argyrakis (2005). Language
evolution and population dynamics in a system of two interacting species.
Physica A 353: 595-612.

Nowak, Martin A., Natalia Komarova, and Partha Niyogi (2002). Computational
and evolutionary aspects of language. Nature 417: 611-617

Ooi, J.N. and D. L. Dowe (2005). Inferring Phylogenetic Graphs of
Natural Languages using Minimum Message Length, Proc. CAEPIA
2005 (11th Conf. Spanish Assoc. for Artificial Intelligence), Vol. 1,
pp I:143 - I:152, Nov. 2005; ISBN 84-96474-13-5

 

Determination of predictors of emergency patient outcome via MML (12 or 24 pts)
A./Prof. David Dowe and Dr Karen Smith (MAS)
Project-id: dld-ambo

Ambulance patients and their vital signs are often regularly monitored while
en route to hospital. Depending on the length of the trip, there might be one
to several to many such measurements of indicators such as blood pressure
(diastolic and systolic), etc. Other measured variables include respiratory
rate, pulse rate, neurological status and temperature. We will use this data
to predictively model the impact of patient physiological, demographic and
treatment factors on outcome for key clinical sub-groups such as cardiac
arrest, stroke, trauma and respiratory distress. Like many real-world
problems, there are many potentially relevant variables, but there is not as
much data as we would like for individual patients, and measurements are often
noisy. Such data is ideally suited to Minimum Message Length (MML) inference,
which we will use to arrive at relatively simple models which fit the available
data about as well as is possible.

References:
Comley, J.W. and D.L. Dowe (2005). Minimum Message Length and Generalized
Bayesian Networks with Asymmetric Languages, Chapter 11 (pp265-294) in
"Advances in Minimum Description Length: Theory and Applications", M.I.T.
Press, April 2005. [Final camera-ready copy submitted Oct. 2003.]

Dowe, D. L., S. Gardner and G. Oppy (2007). Bayes not bust! Why simplicity
is no problem for Bayesians. Brit. J. Philos. Sci., pp709-754.

Wallace2005

WallaceBoulton1968

WallaceDowe1999a

WallaceFreeman1987


[See: Allison & Farr project]

How many rules are needed to express propagation? (24-pts)
Mark Wallace and Maria Garcia de la Banda
Project-id: WGB-prop
Constraint inference is widely applied for efficiently solving combinatorial problems. Typically, during problem solving, inference is performed at each step in the search for a solution. Instead of running the generic inference process, it is theoretically possible simply to execute a single inference rule tailored to the constraint of interest in order to achieve the same inference. However, the number of alternatives needed for each inference rule is believed to be exponential in the number of variables in the constraint. In this project we explore whether this assumption is true. In particular, we want to explore how many rules are actually needed under different assumptions about the power of the inference. There are to date no published results on this topic, so we can hope that this project will produce a new and publishable result. This also means candidate students should have a solid mathematical background.

References

Le Provost, T., Wallace, M., Domain independent propagation.
Proc. Int. Conf. on 5th Gen. Comp. Systems, pages 1004-1011, 1992
http://citeseer.ist.psu.edu/leprovost92domain.html

M. Maher. Propagation completeness of reactive constraints.
Proc. ICLP'02, pages 148-162, 2002

 
The Best Way to Schedule (12 or 24-pts)
Maria Garcia de la Banda and Mark Wallace
Project-id: GBW-schedule

Scheduling problems appear in all walks of life: from airlines to hospitals, and from offices to sports leagues. They involve performing a set of jobs, each requiring a sequence of tasks to be completed using a set of resources under some time constraints. The specific resource and duration for each task is fixed, and two tasks can't use the same resource at the same time. As a simple example, consider a computer game in which the components of a gravity gun, a radiation suit and a bullet-time disruptor need to be built according to the following schedule:

           |     Task_1     |     Task_2     |    Task_3
-------------------------------------------------------------
Grav. Gun  |  6 hours at M1 | 50 hours at M4 | 17 hours at M3
Rad. Suit  | 25 hours at M2 | 15 hours at M1 | 10 hours at M4
Disruptor  | 16 hours at M1 |  5 hours at M2 | 20 hours at M3

where M1,M2,M3, and M4 are four machines. The aim is to schedule the tasks to complete all jobs in the minimum amount of time. These problems can be modelled by associating a variable with the start time of each task (9 variables in our example above), and searching for the best compatible start times. Alternatively, they can be modelled by associating a variable with each pair of tasks on each resource (6 variables), and searching for the best task-orders. This project addresses the assumption that underlies many scheduling algorithms: that deciding task-orders is better. When, if ever, is this assumption false?
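
The task-order model can be sketched for the small example above: fix an order of tasks on every machine, derive start times by a longest-path pass, and try every combination of orders. This brute-force search is only feasible for tiny instances and is meant as an illustration; the data layout and function names are assumptions, and it is not the Carlier and Pinson algorithm.

```python
from itertools import permutations, product

# Each job is a sequence of (machine, hours) tasks, taken from the table above.
JOBS = {
    "Grav. Gun": [("M1", 6), ("M4", 50), ("M3", 17)],
    "Rad. Suit": [("M2", 25), ("M1", 15), ("M4", 10)],
    "Disruptor": [("M1", 16), ("M2", 5), ("M3", 20)],
}


def makespan(machine_orders):
    """Completion time for fixed task orders on each machine, or None if they conflict."""
    preds = {(j, i): [] for j, tasks in JOBS.items() for i in range(len(tasks))}
    for j, tasks in JOBS.items():
        for i in range(1, len(tasks)):
            preds[(j, i)].append((j, i - 1))      # order within a job
    for order in machine_orders:
        for first, second in zip(order, order[1:]):
            preds[second].append(first)           # chosen order on a machine
    end = {}
    remaining = set(preds)
    while remaining:
        ready = [t for t in remaining if all(p in end for p in preds[t])]
        if not ready:
            return None                           # cyclic orders: no valid schedule
        for job, idx in ready:
            start = max((end[p] for p in preds[(job, idx)]), default=0)
            end[(job, idx)] = start + JOBS[job][idx][1]
            remaining.remove((job, idx))
    return max(end.values())


def best_makespan():
    """Try every combination of per-machine task orders and keep the best."""
    by_machine = {}
    for j, tasks in JOBS.items():
        for i, (machine, _) in enumerate(tasks):
            by_machine.setdefault(machine, []).append((j, i))
    choices = [list(permutations(ts)) for ts in by_machine.values()]
    spans = [makespan(combo) for combo in product(*choices)]
    return min(s for s in spans if s is not None)


if __name__ == "__main__":
    print(best_makespan())
```

For this instance the search space is 3! x 2 x 2 x 2 = 48 order combinations, so exhaustive enumeration is instant; the start-time model would instead search over nine integer start times.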

Ref: An Algorithm for Solving the Job Shop Problem, Carlier and Pinson, Management Science, Feb. 1989.

Automatic analysis of "no holes" in constraint programs (12 or 24-pts)
Maria Garcia de la Banda, Kim Marriott
Project-id: GBM-constraint
Constraint satisfaction problems involve a set of variables, their domains (i.e., their set of possible values), and a set of constraints determining the allowed combination of values. For example, the N-queens problem tries to place N queens on an NxN chessboard so that they cannot take each other. It can be modelled with Q1,...,QN variables (the N queens), each Qi with domain {1,...,N} (the N columns in which Qi can appear), and constraints ensuring no two queens are in the same column or diagonal.
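
A minimal sketch of the N-queens model just described, using plain backtracking rather than a finite domain solver: each row holds one queen, its variable ranges over the columns, and a value is rejected if it shares a column or diagonal with an earlier queen. Columns are numbered from 0 here for convenience, and the function name is an assumption made for this example.

```python
def count_queens_solutions(n):
    """Count placements of n queens, one per row, with no shared column or diagonal."""
    def place(row, cols, diag1, diag2):
        if row == n:
            return 1
        total = 0
        for col in range(n):  # the domain of this row's queen
            if col in cols or (row - col) in diag1 or (row + col) in diag2:
                continue      # violates a column or diagonal constraint
            total += place(row + 1,
                           cols | {col},
                           diag1 | {row - col},
                           diag2 | {row + col})
        return total

    return place(0, frozenset(), frozenset(), frozenset())


if __name__ == "__main__":
    print(count_queens_solutions(8))
```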
Solvers collect the constraints and determine their effect on the variables. Finite domain (FD) solvers handle variables with finite domains (e.g., each Qi above has a finite domain of N elements). FD solvers often handle constraints by using either domain or bounds propagation. While the latter is usually less powerful, it is also faster. [1] has shown that domain propagation can be replaced by the more efficient bounds propagation without loss of power if no constraint creates a "hole" in the domain of any variable. While [1] proposes a program analysis to automatically infer the "no hole" property from the problem, the analysis was not implemented. Thus, it is not known how effective it is in practice.
The aim of this project is to implement the analysis, systematically test it on a wide range of problems, determine whether it is accurate enough to effectively guide a domain-to-bounds transformation, and if not, study possible extensions.
[1] Schulte and Stuckey. When do bounds and domain propagation lead to the same search space. ACM. PPDP 2001.

Expert Course Advice - Mechanically (12 or 24-pts)
John Hurst
Project-id: Hurst-Expert

As part of an on-going process to better utilize the advantages of information technology in a university learning environment, a recent student project looked at ways of extracting course and unit information from publicly available handbooks, and turning this into an "expert system" that could provide course advice to students.

One outcome from this project was the automatic rendition of a Prolog program that was tailored to a given student at some arbitrary point in his or her course. Knowing what units had been completed, what units the course structure required, and what units might be undertaken in the next semester, the program would not only supply the student with a list of possible units that could be taken over the next semester, but also would show what units had to be taken in order to complete the course with (say) a specialization in a particular field.

The earlier project showed the feasibility of this approach, but did not deliver a workable prototype. The present project would take the existing work and (re)engineer it to the point of completing a workable prototype that could be deployed via, say, a web interface or faculty kiosk.

Knowledge of Prolog would be useful, although not essential. A copy of the thesis may be found at http://www.csse.monash.edu.au/~ajh/research/PfreyThesis.pdf


Multi-dimensional Graph Neuron (GN) Array for Wireless Sensor Networks (WSN) (12 or 24 points)
Asad Khan
Project-id: Khan-gn

GN is a highly-scalable associative memory algorithm [1] capable of handling multiple streams of input, which are processed and matched with the historical data (available within the network) in real time. It is capable of utilising a large number of low performance processors in connected configurations. Hence GN is highly suited for resource constrained devices such as Berkeley motes. The project will involve generalising the existing GN implementation, which handles 1-dimensional sensory inputs, to a multi-dimensional version capable of analysing 3 or higher dimensional sensory inputs to a WSN. The 24 point project will also investigate whether the algorithm suffers from the curse of dimensionality. This project is available to students with programming background in C/C++ or Java.
[1] Nasution and Khan, A Hierarchical Graph Neuron Scheme for Real-Time Pattern Recognition, http://ieeexplore.ieee.org/iel5/72/4359168/04359217.pdf?arnumber=4359217

Implementation of a Password Capability Unix Filesystem (12 or 24-pts)
Carlo Kopp
Project-id: Kopp-walnut
The Walnut password capability operating system has demonstrated the utility and unique access control and security properties inherent in this class of operating system. The aim of this project is to exploit earlier research effort in this area to implement and test an alternative filesystem for Unix (Linux) platforms. This project is suitable for a student with prior Linux kernel experience.
Needs a good Linux-internals background. BCS/BDS student, or a strong BSE student.
Suburban Ad Hoc Networks (24-pts)  
Carlo Kopp & Ron Pose
Project-id: Kopp-sahn
The Suburban Ad Hoc Networking group focusses its research activities on techniques for implementing Suburban Ad Hoc Networks. These are self organising, quasi-static ad hoc (typically wireless) networks which provide an alternative technology for providing high speed digital connectivity to households, small businesses and distributed campuses. Specific areas of research interest include security, low level routing protocols, access controls and propagation behaviour. Given the broad scope of the research performed in this area, there is considerable choice in project topics. Students should consult Dr Ronald Pose or Dr Carlo Kopp for suitable project topics.
See [/research/san/].
Enhancement of Point to Point Protocol with Forward Error Control capability. (24-pts)
Carlo Kopp
Project-id: Kopp-ptp
The topic also has the potential to produce a good journal paper if done properly. The project requires some theoretical work to survey the plethora of FEC codes and codecs, and modification of existing open source PPP code to include LCP changes and the FEC codec. C language skills and some comms understanding required. The main requirement is a smart student. Potential for code release into the GPL domain - PPP is widely used in various open source operating systems.
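
As a concrete illustration of what a forward error control code does, the following is a minimal Python sketch of the Hamming(7,4) code, one of the simplest FEC codes: it adds three parity bits to four data bits so that the receiver can correct any single flipped bit without retransmission. It is not taken from the project or from any PPP implementation, and the function names are hypothetical.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit codeword."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    # Codeword layout (positions 1..7): p1 p2 d1 p3 d2 d3 d4
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct at most one flipped bit and return the 4 data bits."""
    c = list(codeword)
    # Each syndrome bit re-checks one parity group
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    error_pos = s1 + 2 * s2 + 4 * s3  # 0 means no error detected
    if error_pos:
        c[error_pos - 1] ^= 1  # flip the offending bit back
    return [c[2], c[4], c[5], c[6]]
```

A codec intended for a real link would most likely use a stronger code chosen from the survey; this sketch only shows the encode, corrupt, correct cycle.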

Please note Kevin Korb is away on OSP in Semester 2, 2008 and will not be offering individual projects.
 
See: Nicholson and Korb and Allison and Korb projects (which are being offered).

 
 
Jon McCormack's Projects:

Jon McCormack's projects are in the area of computer graphics, human-computer interaction, adaptive intelligence and evolutionary computing. Current project offerings:

 
• Adaptive Parameter Mapping for Complex User Interfaces
• Developmental Modelling with Generalised Cylinders
• Computational Models of Artistic Creativity
 
Please see Jon McCormack's honours project page for further details on his honours projects.

Probability Density Estimation by Minimum Message Length (12 or 24 pts)
Enes Makalic and Daniel Schmidt
Project-id:Makalic-MML

Density estimation is the process of building approximations to an unknown
probability density function on the basis of some observed data. This
includes such familiar approaches as histogram estimation, as well as more
complicated 'smoothing' methods based on kernels. The task of density
estimation is an important topic and finds common use in fields ranging from
physics to the social sciences.
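
To make the idea of histogram estimation concrete, here is a minimal Python sketch of an equal-width histogram density estimate. It assumes the number of bins is simply given (choosing it well is exactly the kind of model selection problem a criterion such as MML would address), and the function name is hypothetical.

```python
def histogram_density(data, num_bins):
    """Equal-width histogram density estimate.

    Returns (edges, heights), where heights[i] is the estimated density
    on the interval [edges[i], edges[i + 1]].
    Assumes the data are not all identical.
    """
    lo, hi = min(data), max(data)
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for x in data:
        # The maximum value falls in the last bin rather than past it
        idx = min(int((x - lo) / width), num_bins - 1)
        counts[idx] += 1
    n = len(data)
    # Dividing by n * width makes the bars integrate to one
    heights = [c / (n * width) for c in counts]
    edges = [lo + i * width for i in range(num_bins + 1)]
    return edges, heights
```

For example, histogram_density([0.0, 0.1, 0.2, 0.9, 1.0], 2) puts three points in the first bin and two in the second, and the resulting bar areas sum to one.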

This project entails the application of Wallace's Minimum Message Length (MML)
principle, a state of the art model selection criterion developed at Monash
University, to the process of `learning' the underlying probability density
in the form of histograms or smooth curves. The MML principle is grounded in
information theory and presents the process of learning from data as one of
communicating the learnable information in the form of a message.

The potential student should have good results in mathematics and must have
discussed the project with Daniel Schmidt or Enes Makalic before
submitting "the form".

References:

[1] Rissanen, J.; Speed, T. P. & Yu, B.
Density estimation by stochastic complexity
IEEE Transactions on Information Theory, 1992, 38, 315-323

[2] Kontkanen, P. & Myllymaki, P. Meila, M. & Shen, X. (ed.)
MDL Histogram Density Estimation
Proceedings of AISTATS 2007, 2007

[3] Wallace, C. S.
Statistical and Inductive Inference by Minimum Message Length
Springer, 2005


Human Comprehension of Organisation Charts (24-pts)
Kim Marriott
Project_id: Marriott-Charts
Organisation charts are one of the most common kinds of diagram. However, comparatively little is known about how people understand or use them. In this project eye-tracking equipment will be used to determine how the focus of attention moves around the elements in an organisation chart when it is first seen, and then when it is used in subsequent tasks. Based on this, the project will develop a model of how the focus of attention changes when viewing organisation charts and test the model with eye-tracking data. [The project can also be offered for other sorts of visual notations such as UML notations, advanced mathematics, etc.]
 
Automatic Program Code Layout (24-pts)
Kim Marriott and Peter Moulder
Project_id: Marriott-Figure
Automatic document layout is an increasingly important topic because of the need to adapt a document's layout to different viewing environments and to dynamically generated content. An important part of technical document layout is how to lay out program code, i.e. where to break a line of code if it is too long. The project will develop a function for measuring the quality of a code layout and incorporate this into an automatic document layout tool.
 
Also see [G de la B] & M.

Bernd Meyer's Projects:
BM1 Recruitment in Ant Colonies (biology)
BM2 Communication in Ant Colonies (biology)
BM3 Adaptiveness of Slime Molds (biology)
BM4 Self-organizing Stable Marriages
BM5 Physarum Solver made useful
BM6 Nature-inspired FPGA task allocation
Please see Assoc. Prof Bernd Meyer's honours project page for details of his projects. These projects are 24 pts but can potentially be modified into 12 pts (to be negotiated on a case by case basis).

Knowledge Engineering Dynamic Bayesian Networks for Ecological Risk Assessment (12 or 24 pts)
Ann Nicholson and Carmel Pollino (ANU)
Project-id: ich-Poll-ecology
Bayesian Networks (BNs) are graphical models for probabilistic reasoning, which are now widely accepted in the AI community as intuitively appealing and practical representations for reasoning under uncertainty. One use of BNs is for prediction, and within that general task, for the problem of risk assessment. We have done two recent projects in ecological risk assessment: (a) assessing the risk for native fish populations of water management interventions and (b) risk management of tropical seagrass. However, to date, the BN modeling in these projects has been done with so-called "static" BNs, where there is no explicit representation of time. Clearly, temporal modelling in such prediction tasks is very important, and it could (and should) be done using an existing extension to BNs, called Dynamic Bayesian networks. The aim of this project is to develop knowledge engineering techniques and methodologies for Dynamic Bayesian networks, using the ecological risk assessment domains as case studies.
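
To illustrate what adding an explicit representation of time means, the following minimal Python sketch performs one step of forward filtering in the simplest possible Dynamic Bayesian network: a single hidden state variable linked to its own copy in the next time slice, plus one observed variable per slice. The function name and all probability tables are hypothetical, not taken from the ecological case studies.

```python
def filter_step(belief, transition, likelihood):
    """One time slice of forward filtering in a two-slice DBN.

    belief:      dict state -> P(state at t-1 | evidence so far)
    transition:  dict prev_state -> dict state -> P(state | prev_state)
    likelihood:  dict state -> P(current observation | state)
    """
    # Predict: push the belief through the temporal arc
    predicted = {
        s: sum(belief[p] * transition[p][s] for p in belief)
        for s in likelihood
    }
    # Update: weight by the new evidence and renormalise
    unnorm = {s: predicted[s] * likelihood[s] for s in predicted}
    total = sum(unnorm.values())
    return {s: v / total for s, v in unnorm.items()}
```

A static BN corresponds to applying only the update step once; the transition table is what the dynamic extension adds.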
Autonomous intelligent systems (mobile robotics: building maps) (12 or 24 pts)
Ann Nicholson and Bijan Shirinzadeh (Robotics and Mechatronics Research Laboratory, Engineering)
Project-id: Nich-Shir-robotics
Working primarily with mobile robots and systems, the aim of the research is to establish methodologies for autonomous multi-agent systems. Multi-agent systems would work in a complex interactive network to achieve a common task in an optimized manner. Applications include: cooperative cleaning, exploration of unknown areas, security patrol, search for dangerous targets, search-and-rescue, etc.
Autonomous aerial vehicle (fixed wing and helicopter: sensing and flight control) (12 or 24 pts)
Ann Nicholson and Bijan Shirinzadeh (Robotics and Mechatronics Research Laboratory, Engineering)
Project-id: Nich-Shir-flight
These on-going projects focus on research and development of autonomous flying drones - i.e. fully (autonomous) unmanned aerial vehicles. These projects aim to establish methodologies for integrated sensing capabilities with inertial measurement unit (IMU), global positioning system (GPS), and vision, together with optimized intelligent control, navigation and mission planning methodologies for such aerial vehicles.

Bayesian Poker (12 or 24-pts)
Ann Nicholson and Kevin Korb
Project-id: Nic-Korb-Poker
The Monash Bayesian poker project was started by Kevin Korb in 1993. Since then 5 honours students have worked on various aspects of the project. Our Bayesian poker player (BPP) was originally developed to play 5-card stud poker, using Bayesian network technology. Most recently, BPP has been converted to play Texas Hold'em Poker, the main online form of poker, and rewritten in Python. BPP has been entered in the inaugural world automated poker playing competition at the main American Artificial Intelligence conference AAAI-2006 and we plan to compete again in 2007. We have developed a simple GUI interface that will allow people to play against BPP via the internet. This project offers lots of options for making BPP a better poker player, including improving its bluffing strategies and its opponent modelling. We have access to the AAAI-2006 tournament game logs, which provide a rich source of data for automated learning of opponent modelling. Or you might be interested in developing a much more interesting playing interface.

See http://www.csse.monash.edu.au/bai/poker/poker.html for links to Honours projects, online versions of research papers, and to play against the latest version of BPP online.

Causal Discovery (12 or 24-pts)
Kevin Korb & Ann Nicholson
Project-id: Korb-Nich-cause
Causal discovery algorithms learn causal Bayesian networks from data. The oldest of them dates from 1991. At Monash we have developed CaMML (Causal discovery via Minimum Message Length), which "data mines" observational data to find the causal model most probable in light of the data. In this honours project you may choose any one of a number of possible specific problems to investigate, including:
- Extending CaMML to learn Dynamic Bayesian Network (DBN) structures
- Incorporating expert knowledge as priors for the causal discovery algorithm.
- The proper evaluation of causal discovery algorithms. The common uses of predictive accuracy, unweighted edit distance and Kullback-Leibler divergence are all demonstrably inadequate. We have an alternative "Causal Kullback-Leibler" which is superior, but needs further work. This is also related to metrics of causal power that we are developing, which assess how much causal influence one variable has over another.
- Integrating latent (unobserved) variables into our causal discovery algorithms.
- The philosophy of token causality explicated in terms of causal models (Twardy & Korb "A Criterion of Probabilistic Causation" Phil of Sci, 2004; Korb, Twardy, Handfield and Oppy "Causal Reasoning with Causal Models" Synthese, submitted)
- The mathematics of causal intervention (Korb & Nyberg "The Power of Intervention" Minds and Machines, 2006; Korb, Hope, Nicholson and Axnick "Varieties of Causal Intervention" PRICAI, 2004)
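
For readers unfamiliar with the evaluation metrics mentioned in the list above, here is a minimal Python sketch of the ordinary Kullback-Leibler divergence between two discrete distributions. It is the standard definition only, not the "Causal Kullback-Leibler" variant, and the function name is hypothetical.

```python
import math


def kl_divergence(p, q):
    """D_KL(p || q) in bits for two discrete distributions given as lists.

    Assumes q[i] > 0 wherever p[i] > 0.
    """
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

The measure is zero when the two distributions agree and is not symmetric in its arguments, so the order of "true" and "learned" distribution matters.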


Integrating auditory and visual stimuli using Self-Organizing Cortical Neural Networks (SoCNN) (12 or 24 pts)
Andrew Paplinski
Project-id:Paplinski-SoCNN
The project continues the development of a computer model of the integration of audio-visual information by the human brain using Self-Organizing Cortical Neural Networks (SoCNN). A good starting point for the problem specification is given in our recent report:
http://www.csse.monash.edu.au/publications/2007/tr-2007-210-full.pdf
and the paper:
A P Paplinski and L Gustafsson: Feedback in multimodal self-organizing networks enhances perception of corrupted stimuli, Lecture Notes in Computer Science, Springer-Verlag, vol. 4304, pp 19-28, 2006.
The human cortex can be thought of as a linked two-dimensional hierarchical associative memory system. Such a structure is modeled by our SoCNN. By now we are able to model integration of phonemes and related phonetic symbols/letters. This project will concentrate on the integration of spoken and written words using SoCNN.

Online procurement
Md Mahbubur Rahim
Project-id: Rahim-epro
Online procurement represents an important business-to-business (B2B) e-commerce initiative. In recent years, a large automotive company in Australia has implemented an internet-based supplier portal which facilitates exchange of procurement related documents in XML and EDIFACT format. This project aims at determining the success of the portal from the suppliers’ perspective. This aim is addressed by developing a theoretical framework and collecting empirical data from the suppliers.
Internet-Banking
Md Mahbubur Rahim
Project-id: Rahim-ebanking
Internet banking refers to the practice of conducting financial transactions by customers over the Internet through a bank’s website. It has attracted considerable adoption by the retail banking customers in Australia. This project aims at measuring the satisfaction of customers with internet banking in Australia and will identify the factors that facilitate their usage.

Co-movements and heterogeneity in the APEC Stock market
Nasrin Rahmati
Project-id: Rahmati-apec
The co-movements of the world's national equity market index returns have long been a popular research topic in stock market modeling literature. Low correlation between national stock markets is often presented as evidence in support of the benefit of global portfolio diversification. Although the co-movements of the world's major stock markets have been studied extensively, the co-movements of stock markets in a common market or free trade area have not received sufficient attention. This project will provide empirical evidence on the co-movements of the APEC stock markets by using two relevant statistical methods to study the portfolio diversification prospects of national stock markets.

Dependable Software Services (12 or 24-pts)
Sita Ramakrishnan
Project-id: SR-services
As Service-oriented architecture (SOA) is being considered increasingly in critical applications by distributed enterprises, the provision of dependable services is becoming an important area of research. In this project, we consider dependability attributes of a software service in a sustainable manufacturing domain. Sustainable manufacturing is the employment of eco-efficient technologies and industry standards for engineering manufacturing systems to minimise the environmental burdens of greenhouse emissions and energy use. Dependability attributes can be expressed in terms of timeliness, availability and reliability of the correct service, and trustworthiness attributes such as confidentiality, integrity and maintainability of deployed services.
Providing a predictable level of dependability for services which have been composed from services from various providers is an important requirement. Standards such as SOAP, UDDI and WSDL [1] are being adopted by major web service providers. These services need to establish and adhere to standards and hence the dependability of a service will become a differentiating point for services. Existing standards do not provide sufficient information to make decisions about how dependable and available the various services and components are, and how they fail. Some of the challenges that we will be dealing with, with respect to component interactions and composition of services in the sustainable manufacturing domain, are: how to ensure trust in correctness when dealing with global partners and services composed using their services; how to ensure reliability when composing services from these partners, who are outside the control of the system; ...[see SR for more information]

Feature Weighting in Content Based Image Retrieval (24 or 12 pts)
Sid Ray
Project-id: Ray-feature
The purpose of a Content-Based Image Retrieval (CBIR) system is to retrieve images from a database such that the visual contents of the retrieved images are similar to those of a query image. Often the image similarities are computed from the representation of visual contents such as colour, texture and shape, in terms of a multi-dimensional feature vector. Understandably, the features cannot be assumed to have equal weights. More important features deserve more weight and less important features deserve less weight. These weights can be assigned fully automatically from feature variation statistics in the database images. Better still is to use the so-called relevance feedback (RF) mechanism, in which the weights are modified iteratively based on users' responses regarding the relevance of the retrieved images. The aim of this project is to investigate the feature weighting problem for both cases, namely, without RF and with RF.
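
As an illustration of the case without RF, the following minimal Python sketch assigns weights automatically from feature variation statistics and uses them in retrieval. The particular scheme, weighting each feature by the inverse of its variance over the database, is an assumption chosen for simplicity, and all function names are hypothetical.

```python
import math


def inverse_variance_weights(database):
    """Weight each feature by 1/variance over the database (assumed scheme)."""
    n = len(database)
    dims = len(database[0])
    weights = []
    for j in range(dims):
        column = [row[j] for row in database]
        mean = sum(column) / n
        var = sum((v - mean) ** 2 for v in column) / n
        # A constant feature carries no information, so give it zero weight
        weights.append(1.0 / var if var > 0 else 0.0)
    return weights


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two feature vectors."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def retrieve(query, database, weights, k):
    """Return the indices of the k database images closest to the query."""
    order = sorted(
        range(len(database)),
        key=lambda i: weighted_distance(query, database[i], weights),
    )
    return order[:k]
```

With these weights a feature measured on a large numeric scale no longer dominates the distance. A relevance feedback variant would adjust the weights after each round of user responses instead of fixing them in advance.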
Small Sample Size Effects on CBIR Accuracy (24 or 12 pts)
Sid Ray
Project-id: Ray-sample
In recent years Content-Based Image Retrieval (CBIR) has become one of the most active areas of research in image data mining. In studies involving the design of a CBIR system often the question arises whether the data set sizes used are big enough to rely on the accuracy achieved. The aim of this second project in CBIR is to study the impact, on accuracy, of the number of samples in the database, the number of samples per semantic category, the number of retrieved images returned to user for RF, the number of features used, and their relative sizes.
Hierarchical Feature Selection for Pattern Recognition (24 or 12 pts)
Sid Ray
Project-id: Ray-hierarchy
In statistical pattern recognition, patterns are often represented by a large number of numerical features. Although there is no conceptual justification for reducing the number of features to a small number, in practical problem solving this becomes a necessary step due to the well-known phenomenon of the 'curse of dimensionality' of the feature vector on the complexity of the pattern classifier. The aim of the proposed project is to develop a hierarchical feature selection paradigm that is well-suited to multi-class pattern recognition problems.
The project will comprise (i) study of some existing feature selection criteria followed by an experimental investigation of them, (ii) analysis of the above results leading to, hopefully, a hierarchical feature selection paradigm, and (iii) development of an interactive software tool for the above paradigm.
The methodology developed will be such that depending on the classification accuracy arrived at a certain stage, the user will have the option of increasing or decreasing the dimensionality value. The software tool will include procedures for displaying the distribution of pattern samples in different feature spaces obtained by different feature selection methods.
Texture Analysis of Images (24 pts)
Sid Ray
Project-id: Ray-texture
Texture plays an important role in both human interpretation of visual scenes and computer analysis of images. Textural cues are of particular relevance in two different, but related, image analysis problems, namely, the problems of segmentation and classification of images. The proposed project will deal with both of these problems. It will involve (i) investigating existing texture analysis methods from the point of view of their theoretical soundness as textural measures and (ii) investigating their practical applicability.
Individual dolphin dorsal fin identification (24 or 12 pts)
Sid Ray, David Dowe and Kate Charlton (School of Biological Sciences)
Project-id: Ray-fin
Port Phillip Bay, Victoria, is home to a small and genetically unique population of bottlenose dolphins, Tursiops sp. Individuals within this population have been identified via photo-identification of the dolphins' dorsal fins. Dorsal fins show a unique shape and through time develop permanent marks and notches. Currently, the digital images are being assessed for these permanent marks and assigned an identity by human 'eye'. This can be incredibly time consuming, and it requires permanent notches on the fin and a trained eye to pick up small notches. The aim of the project is to create a methodology/software that uses not only the notches but also the shape of the fin to assign identity with high levels of accuracy.
Some references:
1. http://www.dolphinresearch.org.au

2. J. D. Adams, T. Speakman, E. Zolman, and L. H. Schwacke, "Automatic
Image Matching, Cataloging, and Analysis for Photo-Identification
Research," Aquatic Mammals, Vol. 32, No. 3, pp. 374-384, 2006.
(pdf file available from Sid on request)

3. C. Gope, N. Kehtarnavaz, G. Hillman, and B. Wursig, "An Affine
Invariant curve matching method for photo-identification of marine
mammals," Pattern Recognition, Vol. 38, pp. 125-132, 2005.
(pdf file available from Sid on request)


Development of collection-specific texture features for image retrieval (12 or 24-pts).
David Squire
Project-id: Squire

Independent Component Analysis (ICA) is a statistical technique for discovering hidden factors underlying a set of random variables. ICA can be used to discover filters that can be used to characterize visual textures in a set of images. The filters produced by ICA are adapted to the statistics of the set of images used to derive them. It should thus be possible to use ICA to derive a set of filters specifically tuned to the textures in a collection of images from a restricted domain. We will investigate the discovery and selection of a set of ICA filters for image collections from domains such as dermatology.

The system will be developed within the framework of the GNU Image Finding Tool (GIFT) (http://www.gnu.org/software/gift/). The GIFT is an open framework for content-based image retrieval. Use of the GIFT framework will mean that researchers working on a project can focus on the problems of immediate interest and importance to the project, rather than having to develop an entire image retrieval system, including user interface, feature extractors, indexing tools, database accessor, and feedback/learning system, from scratch.

[See also Squire/Tischer project]

 


Hierarchical Segmentation for Interactive Image Manipulation (12 or 24pts)
Peter Tischer and David Squire
Project-id: Tischer-iim

The fundamental problem in getting computer programs to process images
is to recognise which groups of pixels belong together because they
represent some important element in the image. We term such a group of
pixels, a cluster. If the pixels in a cluster are spatially connected,
we term such a cluster, a segment. In a hierarchical segmentation,
we recognise that some segments may be merged with neighbouring segments
to form a larger segment which may be merged with other segments in turn.

Often when an image is to be processed, the user is interested in specifying
different regions in the image and actions whose effects will be confined to
those regions. The aim of this project is to allow a user to change part of
an image using a hierarchical segmentation of the image to identify
different regions within the image. This kind of approach can be applied to
identifying possible sites for skin cancer lesions in images of skin
and to problems like identifying how much of an image of the sea
is covered by sea-ice or open water.

Triangle-Projection and its Use in Scientific Visualisation (12 or 24-pts)
Peter Tischer
Project-id: Tischer-vis

In Scientific Visualisation we are interested in mapping large amounts of
high-dimensional data to a lower dimensional space so that a human observer
can look for interesting patterns in the original data. The aim of this
project is to use an approach which takes a small number of points in
original space and maps them to a lower dimensional space in such a way that
relationships between the points in the original space are preserved.

In particular, with triangle projection a group of 3 separate points
in a high dimensional space that do not form a straight line, form a
triangle. We can project those three points to a 2D space so that the
projections of the points form a triangle that is congruent to the triangle
formed using the original three points.
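
The three-point case described above can be sketched in a few lines of Python. The construction below (first point at the origin, second on the x-axis, third placed by the law of cosines) is one straightforward way to obtain a congruent triangle and is an assumption for illustration; the function names are hypothetical.

```python
import math


def distance(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def project_triangle(p1, p2, p3):
    """Map three non-collinear points of any dimension to 2D so that all
    three pairwise distances are preserved (a congruent triangle)."""
    d12 = distance(p1, p2)
    d13 = distance(p1, p3)
    d23 = distance(p2, p3)
    # Place p1 at the origin and p2 on the positive x-axis
    q1 = (0.0, 0.0)
    q2 = (d12, 0.0)
    # The law of cosines gives the x-coordinate of the third vertex
    x = (d12 ** 2 + d13 ** 2 - d23 ** 2) / (2 * d12)
    # Guard against tiny negative values caused by rounding
    y = math.sqrt(max(d13 ** 2 - x ** 2, 0.0))
    return q1, q2, (x, y)
```

The third vertex could equally be placed at (x, -y); which side of the base to use is a free choice that a projection scheme has to make.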

With triangle projection, each point is projected by forming a new triangle
with two points which have already been projected. The aim of this project
is to investigate the use of a number of different schemes that use triangle
projection in exploring high dimensional data sets. Although the rendering
of the projected data will involve some use of 2D, and possibly, 3D computer
graphics, it will not be necessary to have taken a graphics unit in order
to carry out this project.

Minimal Cost Spanning Graphs for hierarchical clustering and segmentation (12 or 24-pts)
Peter Tischer
Project-id: Tischer-mcst

Minimal Cost Spanning Trees (MCSTs) have many uses, including the
hierarchical clustering of data and the hierarchical segmentation of
images. MCSTs can be computed efficiently enough for them to be used in
interactive image segmentation. Minimal Cost Spanning Graphs (MCSGs) are a
generalisation of MCSTs and they share many of the useful properties that
MCSTs have. In particular, they can be computed quickly.

This project will investigate the use of MCSGs for hierarchical clustering
and for interactive image segmentation.
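
As background on how an MCST yields a hierarchical clustering, here is a minimal Python sketch that runs Kruskal's algorithm and stops once k components remain, which gives a single-linkage clustering. It covers ordinary MCSTs only (not MCSGs), and the function name is hypothetical.

```python
def mst_clusters(num_points, edges, k):
    """Kruskal's algorithm stopped early: adding MCST edges in increasing
    cost order until k components remain yields k single-linkage clusters.

    edges: list of (cost, u, v) tuples over vertices 0..num_points-1
    Returns (labels, merges); merges lists the accepted edges in order.
    """
    parent = list(range(num_points))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    components = num_points
    merges = []  # the order of the merges is the cluster hierarchy
    for cost, u, v in sorted(edges):
        if components == k:
            break
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            components -= 1
            merges.append((cost, u, v))
    labels = [find(i) for i in range(num_points)]
    return labels, merges
```

For image segmentation the vertices would be pixels and the edges would join neighbouring pixels, with a cost based on how different the two pixels are.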

Bi-Clustering of Microarray Data (12 or 24-pts)
Peter Tischer
Project-id: Tischer-cluster

In DNA microarrays the data is a two-dimensional array of values.
The values along a row might all belong to one subject while those along a
column might be for a particular gene. Thus, the brightness of a spot in a
certain row and column represents the activity of a particular gene
for a particular person. The aim of a bi-clustering is to recognise clusters
of people who have clusters of genes that show the same kind of behaviour.
One way to approach this problem is to re-order the rows and columns
so that the value in a particular row and column is most likely to be similar
to values in adjacent rows and columns.

Bi-clustering is a new kind of clustering problem and research in this area
is still at a preliminary stage. The project will involve investigating
the literature to survey existing methods and implementing an approach
for re-ordering rows and columns.

Coding Segment Maps (12 or 24-pts)
Peter Tischer
Project-id: Tischer-maps

A segment in an image is a group of spatially connected pixels that share
some homogeneity property. Often segments represent objects, or parts of
objects, in a scene. A segment map is something that allows us to determine
for each pixel, which segment it is in. Often a segment map can be regarded
as a 2D array with as many rows and columns as the image and where the entry
in a particular row and column gives the identity of the segment to which the
corresponding pixel in the image belongs.

In a number of areas of image processing such as fractal image coding or
content-based image retrieval, it is necessary to store segment maps.
The aim of this project is to investigate a way of storing segment maps
which retains all the segment information but takes as few bits as possible.
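
As a point of reference, the following minimal Python sketch stores a segment map losslessly using run-length encoding in raster order. This is only a simple baseline chosen for illustration, not the method the project would develop, and the function names are hypothetical.

```python
def rle_encode(segment_map):
    """Run-length encode a segment map (a list of rows of segment ids)
    in raster order as (segment_id, run_length) pairs."""
    runs = []
    for row in segment_map:
        for label in row:
            if runs and runs[-1][0] == label:
                runs[-1][1] += 1  # extend the current run
            else:
                runs.append([label, 1])  # start a new run
    return [tuple(r) for r in runs]


def rle_decode(runs, width):
    """Rebuild the 2D segment map from its runs and the image width."""
    flat = [label for label, length in runs for _ in range(length)]
    return [flat[i:i + width] for i in range(0, len(flat), width)]
```

Because segments are spatially connected, runs tend to be long, but this baseline ignores the vertical coherence between rows that a better coding scheme could exploit.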

 

Inferring biological function from the evolutionary tree (24-pts)
Geoff Webb
Project-id: Webb
This project will involve working with a team of computer scientists and biologists to analyse biological data about proteins and how they have evolved in order to infer how they function. This is an advanced data mining project that will provide a stepping stone toward a career in commercial data mining (such as developing next generation web search technologies), or in data mining or bioinformatics research.
Data mining of protein refolding databases (24-pts)
Geoff Webb and Ashley Buckle (Medicine)
Project-id: Webb-Buckle

The aim of this project is to analyse a large dataset of proteins that are used in refolding experiments, in order to improve our understanding of the refolding process. We hope to discover relationships between a protein's characteristics and its behaviour in the laboratory. We are also very interested in using this information to provide scientists with real experimental protocols in a completely automated fashion. See http://refold.med.monash.edu.au. This is an advanced data mining project and will provide a stepping stone toward a career in commercial data mining (such as developing next generation web search technologies), or in data mining or bioinformatics research.

Last update: Wednesday, March 5, 2008 2:36 PM