Recent presentation at SPIE DCS 2016 conference

We have presented a paper at the following SPIE Defense + Commercial Sensing (DCS) conference:

Automatic Target Recognition XXVI

Our conference proceedings paper may be found through the link above or downloaded here (see below):

Learned Filters for Object Detection in Multi-object Visual Tracking
Victor Stamatescu, Sebastien Wong, Mark D. McDonnell, David Kearney


Conference attendances in 2014-2015

We have recently presented our research in the areas of multi-object visual tracking, heterogeneous computing and feature selection at the following conferences:

SPIE Defense + Security 2015: Automatic Target Recognition XXV

This SPIE proceedings paper may be found through the following link or downloaded here (see below):

Mutual information for enhanced feature selection in visual tracking
Victor Stamatescu, Sebastien Wong, David Kearney, Ivan Lee, Anthony Milton

Winner of the Lockheed Martin 2015 ATR Best Paper Award


Image and Vision Computing New Zealand (IVCNZ) Conference 2014

These two ACM proceedings papers may be found through the following link:

A Competitive Attentional Approach to Mitigating Model Drift in Adaptive Visual Tracking
Sebastien Wong, Adam Gatt, David Kearney, Anthony Milton, Victor Stamatescu
Pages: 1-6

The CACTuS multi-object visual tracking algorithm on a heterogeneous computing system
Anthony Milton, Sebastien Wong, David Kearney, Simon Lemmo
Pages: 19-24


Honours projects in 2015

Tracking multiple objects with UAV and 360° cameras

Prerequisites: Coding skills in C/C++ are required, knowledge of MATLAB a plus

The Reconfigurable Computing Lab is offering two related honours research projects that involve collecting video data with a quad-copter-mounted camera or with a 360° Bublcam for use in multi-object tracking. Visual tracking systems work by adaptively learning an object’s position, velocity and shape. The following video demonstrates the capabilities of the CACTuS-FL visual tracker:


Panoramic vision: tracking multiple objects with a 360° camera
New data sets for visual tracking will be obtained using a Bublcam with a 360° (full spherical) field of view. The project involves collecting positional and video data, interfacing with the camera via its API, generating ground truth annotations (e.g. bounding boxes around each object of interest), and applying multi-object tracking software to the data. An additional objective will be the development of new semi-automated ground truth annotation tools.
Video Deblurring for Unmanned Aerial Vehicles
Video captured from unmanned aerial vehicles (UAVs) often suffers from motion blur caused by the movement of the vehicle (see [1] for examples). The objective of this project is to investigate motion deblurring and video stabilisation techniques for UAVs. The student will initially be provided with a pre-recorded video captured on a UAV and will apply deblurring algorithms such as [2] to improve video quality. The student will then apply the deblurring algorithm to an off-the-shelf quad-copter and a head-mounted display to demonstrate improved first-person view (FPV) video.
[1] Jinhai Cai and Ivan Lee, “The stitching of aerial videos from UAVs,” 2013 28th International Conference of Image and Vision Computing New Zealand (IVCNZ), pp. 448-452, 27-29 Nov. 2013. doi: 10.1109/IVCNZ.2013.6727056
[2] Sunghyun Cho, Jue Wang and Seungyong Lee, “Video deblurring for hand-held cameras using patch-based synthesis,” ACM Transactions on Graphics (TOG), vol. 31, no. 4, article 64, 2012.

Lab presents at DSTO Mini-Conference

Members of the Reconfigurable Computing Lab (RCL) presented their recent research and results at a collaborative mini-conference held at DSTO Edinburgh today. Lab alumnus Dr Adam Gatt introduced the CACTuS tracker and the feature learning variant he developed during his time in the lab, while PhD candidate Anthony Milton presented his research into developing a heterogeneous computer architecture for CACTuS. Following these two presentations, lab director David Kearney presented introductory material for a reading group on deep learning.

Copies of the presentations can be found under the presentations and publications section of this website. For more information on deep learning, see David’s ‘Deep Learning – An Annotated Bibliography’.

Deep Learning – An Annotated Bibliography

RCL director Assoc. Prof. David Kearney has made the following information available in preparation for the deep learning reading group to be held on Wednesday 15th January 2014 at DSTO Edinburgh.


Deep Learning – An annotated bibliography by David Kearney

This is a work in progress. Please send feedback to David dot Kearney at unisa dot edu dot au


This bibliography focuses on material that provides conceptual understanding without assuming knowledge of advanced probability and professional mathematical concepts, which not everyone who wants to understand deep learning will have. I also emphasise simple software tools and video presentations that are easily accessible to those without specialist mathematical backgrounds. This is not to underrate mathematical treatments, but to recognise that those in the literature are often completely inaccessible without years of study.

What’s in a name?

There are a number of words that have entered the jargon of deep learning. These include:

Deep belief networks, HMAX, Deep architectures, SIFT, hierarchical models, deep networks, structural SVMs, Convolutional networks, Hierarchical Temporal Memory, hierarchical sparse coding.


The Coursera course from Geoff Hinton and his group is highly recommended. Although the course has finished, you can still enrol and watch the videos. Depending on your current knowledge of neural networks, you could skip the early lectures and start in the middle with Hopfield nets, which are more relevant to deep learning.

If you want to hear from the experts in the field all in one place then you should go to the UCLA Institute for Pure and Applied Mathematics Graduate Summer School: Deep Learning, Feature Learning July 9 – 27 2012.

Andrew Ng from Stanford has a good introductory lecture here.

I also found this tutorial helpful.

Many other videos are listed on the deep learning web site:


Dedicated textbooks on deep learning have yet to appear. The textbook that receives high ratings on probabilistic machine learning has a single final chapter on deep learning:

Historical Perspective

Geoff Hinton has provided a historical introduction to deep learning which contains good conceptual insights and almost no mathematics.

[To Recognize Shapes, First Learn to Generate Images - Geoffrey Hinton]

Specific topics

Hopfield nets

The Hebbian learning rule “fire together, wire together” used in Hopfield nets is explained well in these slides:

Restricted Boltzmann Machines (RBMs)

There is a good explanation of the key algorithms in probabilistic machine learning. This is the best conceptual description of Gibbs sampling that I have seen so far:

Convolutional deep belief networks

Convolution is introduced as a means of coding images so they are shift-invariant before they are presented to the restricted Boltzmann machine. Unfortunately, I have yet to find a good conceptual description of these types of deep learning networks. You can read an original paper by Andrew Ng’s team:

[Convolutional Deep Belief Networks for Scalable Unsupervised Learning of Hierarchical Representations - Honglak Lee, Roger Grosse, Rajesh Ranganath, Andrew Y. Ng]
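To make the shift idea concrete, here is a minimal NumPy sketch. It is not taken from the paper above: the 3x3 "plus" shape, the 8x8 images and the helper `correlate2d_valid` are all hypothetical choices for illustration. Strictly, the filter response map moves with the object, and it is a pooling step (here a global maximum) that gives a code which no longer depends on position.

```python
import numpy as np

def correlate2d_valid(image, kernel):
    """Slide the kernel over the image (no padding) and record each response."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

# Hypothetical 3x3 "plus" shape, used both as the object and as the filter.
shape = np.array([[0, 1, 0],
                  [1, 1, 1],
                  [0, 1, 0]], dtype=float)

# The same object placed at two different positions in an 8x8 image.
image_a = np.zeros((8, 8))
image_a[1:4, 1:4] = shape
image_b = np.zeros((8, 8))
image_b[4:7, 3:6] = shape

response_a = correlate2d_valid(image_a, shape)
response_b = correlate2d_valid(image_b, shape)

# The location of the strongest response follows the object...
peak_a = tuple(int(i) for i in np.unravel_index(np.argmax(response_a), response_a.shape))
peak_b = tuple(int(i) for i in np.unravel_index(np.argmax(response_b), response_b.shape))
print("peak positions:", peak_a, peak_b)

# ...while the pooled (maximum) response is the same for both images.
print("pooled responses:", response_a.max(), response_b.max())
```

The loop-based filter is deliberately slow and simple so that each step is visible; a real network would learn many such filters rather than use a hand-made one.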

Training a deep belief network

Stacking several RBMs on top of one another and using the output of one layer as the input to the next is the main way that deep belief networks gain their classification power.

“A more general solution (for training) was proposed by Hinton and collaborators (Hinton et al., 2006). They showed that a deep network can be trained in two steps. First, each layer is trained in sequence by using an unsupervised algorithm to model the distribution of the input. Once a layer has been trained, it is used to produce the input to train the layer above. After all layers have been trained in an unsupervised way, the whole network is trained by traditional back-propagation of the error (e.g., classification error), but the parameters are initialized using the weights learned in the first phase. Since the parameters are nicely initialized, the optimization of the whole system can be carried out successfully.”

[Learning Feature Hierarchies for Object Recognition - Koray Kavukcuoglu  (dissertation)]

[Unsupervised Learning of Feature Hierarchies - Marc’Aurelio Ranzato (dissertation)]
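The first (unsupervised) phase described in the quotation can be sketched in NumPy as follows. This is an illustration under stated assumptions, not code from either dissertation: the random binary data, the layer sizes, the function names `train_rbm`, `pretrain_stack` and `forward`, and the use of one-step contrastive divergence (CD-1) to train each layer are all my own choices. The second phase, fine-tuning by back-propagation, is not shown.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, epochs=500, learning_rate=0.1, seed=0):
    """Train one RBM with CD-1 and return its (weights, hidden_bias)."""
    rng = np.random.default_rng(seed)
    n_visible = data.shape[1]
    W = 0.1 * rng.standard_normal((n_visible, n_hidden))
    b_v = np.zeros(n_visible)
    b_h = np.zeros(n_hidden)
    for _ in range(epochs):
        h0 = sigmoid(data @ W + b_h)                      # hidden probabilities given data
        h_sample = (rng.random(h0.shape) < h0).astype(float)
        v1 = sigmoid(h_sample @ W.T + b_v)                # reconstruction of the input
        h1 = sigmoid(v1 @ W + b_h)                        # hidden probabilities given reconstruction
        W += learning_rate * (data.T @ h0 - v1.T @ h1) / len(data)
        b_v += learning_rate * (data - v1).mean(axis=0)
        b_h += learning_rate * (h0 - h1).mean(axis=0)
    return W, b_h

def pretrain_stack(data, layer_sizes):
    """Greedy layer-wise training: each layer learns from the output of the one below."""
    layers = []
    layer_input = data
    for depth, n_hidden in enumerate(layer_sizes):
        W, b_h = train_rbm(layer_input, n_hidden, seed=depth)
        layers.append((W, b_h))
        layer_input = sigmoid(layer_input @ W + b_h)      # becomes the input to the next layer
    return layers

def forward(layers, data):
    """Pass data up through the trained stack."""
    activations = data
    for W, b_h in layers:
        activations = sigmoid(activations @ W + b_h)
    return activations

# Hypothetical data: 20 random binary vectors with 8 visible units each.
rng = np.random.default_rng(42)
data = (rng.random((20, 8)) < 0.5).astype(float)

layers = pretrain_stack(data, layer_sizes=[6, 3])
features = forward(layers, data)
print("top-layer feature shape:", features.shape)
```

In a complete system the weights in `layers` would be used to initialise a feed-forward network, which is then trained with back-propagation on labelled data.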

Introducing temporal information into deep learning

Again, it is hard to find a good, easy-to-grasp conceptual explanation of how temporal information is included in an RBM and thus a deep belief network. The best available seems to be:

[Modeling human motion using binary latent variables. Advances in Neural Information Processing Systems - Taylor, G. W., Hinton, G. E. and Roweis, S]

Conditional Restricted Boltzmann Machines

See Taylor, Hinton and Roweis NIPS 2006, JMLR 2011:

There is a set of slides and a presentation from the IPAM grad course:

You can watch the video at:


Simple software examples:

Hopfield nets

There is a Java-based simulation of a Hopfield net that illustrates its ability to store patterns and recover input patterns contaminated by noise:
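Independently of that simulation, the same behaviour can be sketched in a few lines of Python with NumPy. This is a minimal illustration, assuming two hypothetical 8-unit patterns of +1/-1 values; the function names `train_hebbian` and `recall` are my own. The weights are set with the Hebbian rule, and a stored pattern with one corrupted unit is then recovered by updating the units one at a time.

```python
import numpy as np

def train_hebbian(patterns):
    """Build a Hopfield weight matrix from +1/-1 patterns using the Hebbian rule."""
    patterns = np.asarray(patterns, dtype=float)
    n_units = patterns.shape[1]
    # Units that take the same value in a pattern strengthen their connection.
    weights = patterns.T @ patterns / n_units
    np.fill_diagonal(weights, 0.0)  # no self-connections
    return weights

def recall(weights, state, max_sweeps=10):
    """Update units one at a time until a full sweep changes nothing."""
    state = np.array(state, dtype=float)
    for _ in range(max_sweeps):
        previous = state.copy()
        for i in range(len(state)):
            state[i] = 1.0 if weights[i] @ state >= 0 else -1.0
        if np.array_equal(state, previous):
            break
    return state

# Two hypothetical patterns, chosen to be orthogonal so they do not interfere.
stored = np.array([
    [1, 1, 1, 1, -1, -1, -1, -1],
    [1, -1, 1, -1, 1, -1, 1, -1],
])
weights = train_hebbian(stored)

# Corrupt one unit of the first pattern, then let the network settle.
noisy = stored[0].copy()
noisy[0] = -noisy[0]
recovered = recall(weights, noisy)
print("recovered the stored pattern:", np.array_equal(recovered, stored[0]))
```

With more stored patterns, or patterns that overlap heavily, recall becomes less reliable; the sketch uses a deliberately easy case.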

Restricted Boltzmann Machines

A simple Python example of a Restricted Boltzmann machine learning movie preferences is provided here:
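For readers who want to see the moving parts before following the link, here is a separate minimal sketch in NumPy. It is not the linked example: the class `RBM`, the tiny ratings table (six users, six movies, 1 = liked) and the training settings are hypothetical, and the training rule assumed is one-step contrastive divergence (CD-1), which uses a single Gibbs sampling step per update.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = np.random.default_rng(seed)
        self.W = 0.1 * self.rng.standard_normal((n_visible, n_hidden))
        self.b_visible = np.zeros(n_visible)
        self.b_hidden = np.zeros(n_hidden)

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.b_hidden)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b_visible)

    def cd1_update(self, v0, learning_rate=0.1):
        """One CD-1 update on a batch; returns the mean squared reconstruction error."""
        # Positive phase: hidden units driven by the data.
        h0_prob = self.hidden_probs(v0)
        h0 = (self.rng.random(h0_prob.shape) < h0_prob).astype(float)
        # One Gibbs step: reconstruct the visible units, then recompute the hidden units.
        v1_prob = self.visible_probs(h0)
        h1_prob = self.hidden_probs(v1_prob)
        n = v0.shape[0]
        self.W += learning_rate * (v0.T @ h0_prob - v1_prob.T @ h1_prob) / n
        self.b_visible += learning_rate * (v0 - v1_prob).mean(axis=0)
        self.b_hidden += learning_rate * (h0_prob - h1_prob).mean(axis=0)
        return float(np.mean((v0 - v1_prob) ** 2))

# Hypothetical data: rows are users, columns are movies.
# The first three movies form one genre and the last three another.
ratings = np.array([
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 1, 1],
], dtype=float)

rbm = RBM(n_visible=6, n_hidden=2)
errors = [rbm.cd1_update(ratings) for _ in range(3000)]
print(f"reconstruction error: first {errors[0]:.3f}, last {errors[-1]:.3f}")
```

After training, `rbm.hidden_probs(ratings)` gives each user's activation of the two hidden units, which can be inspected to see whether they have picked up the two genres.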

Complete software examples

There is a lot of software available, but each item often requires complex installation and support. The following two have been tried out (on a Macintosh running OS X 10.9) and found to be relatively straightforward to install or to have adequate installation instructions.

This tracking example requires MATLAB and uses numerous open source packages, which are supplied with the code. It is of interest here because it is a tracking example. However, understanding it requires extra knowledge that this bibliography does not attempt to cover.

The deep learning site has a complete example of a stacked set of restricted Boltzmann machines (also known as a Deep Belief Net or DBN) in Python:

Training this in a reasonable time requires configuring your GPU to work with the software, which is not covered well in the documentation. The example also requires the Theano Python expression compiler.