Currently offered Projects, Fall 2011 (updated September 5, 2011)
Building an autonomous motorboat
Athenians Data Project
Three-Dimensional Context from Linear Perspective for Video Surveillance Systems
Estimating Pedestrian and Vehicle Flows from Surveillance Video
Tandem repeat detection using spectral methods
Touch- and Gesture-based Text Entry With Automatic Error Correction
Early Breast Cancer Detection based on MRI’s.
Developing Fast Speech Recognition Engine using GPU
Solving Polynomials
MF7114 Assembler
MF7114 Debugger
Web Crawlers Behaving Like Humans: Are We There Yet!?
GFI Sandbox Analysis of Malware for DDoS
Network analysis of EEG data: Understanding connections in the brain
An Open Source Structural Equation Modeling Path Diagram to Syntax Application
YUsend Thermal Vacuum (TVAC) Test Manager

Currently offered Projects, Fall 2011 (updated September 5, 2011)

(Listed in order received.)

Building an autonomous motorboat

Supervisor: Michael Jenkin

Required Background: General CSE408x prerequisites

Recommended Background: Robotics

Description An opportunity exists for a small number of students to build an autonomous motorboat, using an RC motorboat as a base and integrating computation and control in the form of a Beagleboard. Students will participate in lectures and labs associated with CSE6324 (Part I). Interested students should attend the first lecture of CSE6324. See the departmental schedule for time and place.

:

Athenians Data Project

Supervisor: Nick Cercone

Required Background: General CSE408x prerequisites

Recommended Background: Data Mining

Description The Athenians Project is a multi-year, ongoing project of compiling, computerizing and studying data about the persons of ancient Athens. Possible project ideas for this term range from simpler ones, such as how to present data in the best possible way, add spatial characteristics to existing data, add multimedia data, or improve text searching, to more complex ones, such as filling in the missing parts of the “broken” words on the existing inscriptions. Filling in text for the broken words has been done in the past using expert knowledge. Those experts have established certain rules and guidelines that could perhaps be captured in what IT terminology would call an expert system. Furthermore, any hypothesis on word completion enters the database with some likelihood. Associating probabilities with hypotheses introduces another opportunity for research projects.

:

Three-Dimensional Context from Linear Perspective for Video Surveillance Systems

Supervisor: James Elder

Requirements: Good facility with applied mathematics

Description

To provide visual surveillance over a large environment, many surveillance cameras are typically deployed at widely dispersed locations. Making sense of activities within the monitored space requires security personnel to map multiple events observed on two-dimensional security monitors to the three-dimensional scene under surveillance. The cognitive load entailed rises quickly as the number of cameras, the complexity of the scene and the amount of traffic increase.

This problem can be addressed by automatically pre-mapping two-dimensional surveillance video data into three-dimensional coordinates. Rendering the data directly in three dimensions can potentially lighten the cognitive load of security personnel and make human activities more immediately interpretable.

Mapping surveillance video to three-dimensional coordinates requires construction of a virtual model of the three-dimensional scene. Such a model could be obtained by survey (e.g., using LIDAR), but the cost and time required for each site would severely limit deployment. Wide-baseline uncalibrated stereo methods are developing and have potential utility, but require careful sensor placement, and the difficulty of the correspondence problem limits reliability.

This project will investigate a monocular method for inferring three-dimensional context for video surveillance. The method will make use of the fact that most urban scenes obey the so-called “Manhattan-world” assumption, viz., a large proportion of the major surfaces in the scene are rectangles aligned with a three-dimensional Cartesian grid (Coughlan & Yuille, 2003). This regularity provides strong linear perspective cues that can potentially be used to automatically infer three-dimensional models of the major surfaces in the scene (up to a scale factor). These models can then be used to construct a virtual environment in which to render models of human activities in the scene.

Although the Manhattan world assumption provides powerful constraints, there are many technical challenges that must be overcome before a working prototype can be demonstrated. The prototype requires six stages of processing: 1) The major lines in each video frame are detected. 2) These lines are grouped into quadrilaterals projecting from the major surface rectangles of the scene. 3) The geometry of linear perspective and the Manhattan world constraint are exploited to estimate the three-dimensional attitude of the rectangles from which these quadrilaterals project. 4) Trihedral junctions are used to infer three-dimensional surface contact and ordinal depth relationships between these surfaces. 5) The estimated surfaces are rendered in three dimensions. 6) Human activities are tracked and rendered within this virtual three-dimensional world.
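As a rough illustration of the linear perspective cue used in stage 3: the images of parallel scene lines meet at a vanishing point, which can be computed with cross products in homogeneous coordinates. The sketch below is a minimal example under that assumption only; the function names and the sample coordinates are hypothetical and are not taken from the project's code.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def line_through(p, q):
    """Homogeneous line through two image points p = (x, y) and q = (x, y)."""
    return cross((p[0], p[1], 1.0), (q[0], q[1], 1.0))


def vanishing_point(seg1, seg2, eps=1e-9):
    """Intersect the lines containing two image segments.

    Returns (x, y), or None when the two lines are parallel in the image
    (the intersection is then a point at infinity).
    """
    x, y, w = cross(line_through(*seg1), line_through(*seg2))
    if abs(w) < eps:
        return None
    return (x / w, y / w)


# Hypothetical segments, e.g. two edges of a wall receding from the camera.
edge_a = ((0.0, 0.0), (2.0, 1.0))
edge_b = ((0.0, 4.0), (2.0, 3.0))
print(vanishing_point(edge_a, edge_b))
```

In practice many detected lines would vote for each of the three Manhattan directions; this two-segment version only shows the underlying geometry.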

The student will work closely with graduate students and postdoctoral fellows at York University, as well as researchers at other institutions involved in the project. The student will develop skills in using MATLAB, a very useful mathematical programming environment, and develop an understanding of basic topics in image processing and vision.

For more information on the laboratory: http://www.elderlab.yorku.ca

:

Estimating Pedestrian and Vehicle Flows from Surveillance Video

Supervisor: James Elder

Requirements: Good facility with applied mathematics

Description

Facilities planning at both city (e.g., Toronto) and institutional (e.g., York University) scales requires accurate data on the flow of people and vehicles throughout the environment. Acquiring these data can require the costly deployment of specialized equipment and people, and this effort must be renewed at regular intervals for the data to be relevant.

The density of permanent urban video surveillance camera installations has increased dramatically over the last several years. These systems provide a potential source of low-cost data from which flows can be estimated for planning purposes.

This project will explore the use of computer vision algorithms for the automatic estimation of pedestrian and vehicle flows from video surveillance data. The ultimate goal is to provide planners with accurate, continuous, up-to-date information on facility usage to help guide planning.

The student will work closely with graduate students and postdoctoral fellows at York University, as well as researchers at other institutions involved in the project. The student will develop skills in using MATLAB, a very useful mathematical programming environment, and develop an understanding of basic topics in image processing and vision.

For more information on the laboratory: http://www.elderlab.yorku.ca

:

Tandem repeat detection using spectral methods

Supervisor: Suprakash Datta

Required Background: The student should have completed undergraduate courses in Algorithms and Signals and Systems.

Recommended Background: Some background in Statistics is desirable but not essential.

Description DNA sequences of organisms have many repeated substrings. These are called repeats in Biology, and include both exact and approximate repeats. Repeats are of two main types: interspersed repeats, which are spread across a genome, and tandem repeats, which occur next to each other. Tandem repeats play important roles in gene regulation and are also used as markers that have several important uses, including human identity testing.

Finding tandem repeats is an important problem in Computational Biology. The techniques that have been proposed for it fall into two classes: string matching algorithms and signal processing techniques. In this project, we will explore fast, accurate algorithms for detecting tandem repeats and evaluate the algorithms studied by comparing their outputs with those of available packages, including mreps (http://bioinfo.lifl.fr/mreps/), SRF (http://www.imtech.res.in/raghava/srf/) and TRF (http://tandem.bu.edu/trf/trf.html).

The student will implement existing spectral algorithms based on Fourier Transforms and on an autoregressive model. The student will then make changes suggested by the supervisor and evaluate the effect of the modifications. Throughout the course, the student is required to maintain a course Web site to report progress and details about the project.
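To give a feel for the Fourier-based approach, here is a minimal sketch under simple assumptions: the DNA string is mapped to four base-indicator sequences, their power spectra are summed, and the strongest nonzero frequency k is read off as a period of n/k. This is an illustration only, not one of the published algorithms the project refers to; the function names are hypothetical and the direct O(n²) transform is chosen for clarity rather than speed.

```python
import cmath


def spectrum(seq):
    """Summed power spectrum of the four base-indicator sequences of seq."""
    n = len(seq)
    power = [0.0] * n
    for base in "ACGT":
        indicator = [1.0 if c == base else 0.0 for c in seq]
        for k in range(n):
            s = sum(indicator[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n))
            power[k] += abs(s) ** 2
    return power


def dominant_period(seq):
    """Period n / k of the strongest frequency k in the range 1..n//2."""
    n = len(seq)
    power = spectrum(seq)
    k = max(range(1, n // 2 + 1), key=lambda i: power[i])
    return n / k


print(dominant_period("ACG" * 8))
```

An exact tandem repeat concentrates its energy at multiples of n divided by the repeat length, so harmonics can tie with the fundamental for some period lengths; a real detector would also need a sliding window and a significance test to handle approximate repeats.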

:

Touch- and Gesture-based Text Entry With Automatic Error Correction

Supervisor: Scott Mackenzie

Required Background: CSE3461 (or equivalent), CSE3311 (or equivalent), CSE4441 (or equivalent). A student wishing to do this project must be well versed in Java, Eclipse, and developing Java code for the Android operating system.

Recommended Background: Possession of an Android touch-based phone or tablet would be an asset, but is not essential.

Description This project involves extending a touch-based text entry method to include automatic error correction. The method, as is, uses Graffiti strokes entered via a finger on a touch-based Android tablet. The stroke recognizer works well, but it is not perfect. Some strokes are misrecognized while others are unrecognized. The fault is sometimes attributable to the recognizer, but often the fault is simply that the user's input was sloppy. The work involves developing, integrating, and testing software. The core software is already written, but automatic error correction is lacking. The primary task of the added software is to receive a sequence of characters representing a word and match the sequence against words in a dictionary. If a match is found, all is well (presumably). If a match is not found, the search is extended to find a set of candidate words that are “close” to the entered sequence. “Close”, here, involves using a minimum string distance algorithm (provided). The user interface must be modified to present the user with alternative words in the event that an error occurred. The user selects the desired word by tapping on a word in the list. The project will involve testing the new input method in a small user study and writing up a report describing the work and presenting the results of the user study.
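The dictionary lookup described above can be sketched as follows. This is a minimal illustration that assumes the minimum string distance is the standard Levenshtein distance; the algorithm actually provided with the project may differ, the project itself is written in Java, and the names `msd`, `candidates`, `max_distance` and `limit` are hypothetical.

```python
def msd(a, b):
    """Minimum string distance (Levenshtein): the smallest number of
    insertions, deletions and substitutions that turns a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def candidates(entered, dictionary, max_distance=2, limit=5):
    """Return the entered word if it is in the dictionary; otherwise the
    closest dictionary words, nearest first (ties in alphabetical order)."""
    if entered in dictionary:
        return [entered]
    scored = sorted((msd(entered, word), word) for word in dictionary)
    return [word for dist, word in scored if dist <= max_distance][:limit]


words = {"hello", "help", "world"}
print(candidates("helo", words))
```

Scanning the whole dictionary is fine for a small word list; a larger dictionary would likely need an index or a length-based prefilter before computing distances.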

:

Early Breast Cancer Detection based on MRI’s.

Supervisor: Amir Asif

Required Background: General CSE408x prerequisites

Recommended background: Signal processing, i.e. CSE3451

Project Description: This research will develop advanced computer-aided signal processing techniques for early detection of breast cancer using the available modalities. In particular, we propose to develop a time reversal beamforming imager, based on our earlier work in time reversal signal processing, for detecting early-stage breast cancer tumours from MRI data. Our preliminary work has illustrated the type of results that are possible for breast cancer detection by applying time reversal signal processing to MRI breast data. In this research, we propose to extend these results to provide a quantitative understanding of the practical gains provided by time reversal in MRI-based breast cancer detection, and of its limitations. This will be accomplished in two steps. The first step is to obtain datasets from a local hospital and run our algorithms on these datasets; this step is important for checking the validity of our algorithms. The second step is to compare the estimated locations of the tumours (as derived with our algorithms) to their precise locations as identified by the pathologists; this step will quantify the accuracy of our estimation algorithms.

:

Developing Fast Speech Recognition Engine using GPU

Supervisor: Hui Jang

Required Background: General prerequisites

Description

Recently, Graphics Processing Units (GPUs) have been widely used as an extremely fast computing vehicle for a variety of real-world applications. Many software programs have been developed for GPUs to take advantage of their multi-core parallel computing architecture (see gpgpu.org). In the past few years, we have developed a state-of-the-art speech recognition engine using ANSI C at York, and it runs very well on a normal CPU-based platform. In this project, you are required to port this engine (the C source code is available) using the standard CUDA or OpenCL library so that it runs on GPUs. It has been reported that this may lead to a speedup of at least 10 times in many speech recognition tasks [1][2].

In recent years, there has been increasing demand in the job market for programmers who can use GPUs for general purpose computing tasks. This project will serve as a perfect vehicle for you to learn such a cutting-edge programming skill.

References

[1] Kisun You, Jike Chong, Youngmin Yi, Ekaterina Gonina, C. J. Hughes, Yen-Kuang Chen, Wonyong Sung, Kurt Keutzer, “Parallel Scalability in Speech Recognition: inference engines in large vocabulary continuous speech recognition,” IEEE Signal Processing Magazine, Vol. 26, No. 6, pp. 124-135, Nov. 2009.

[2] Jike Chong, Ekaterina Gonina, Youngmin Yi, Kurt Keutzer, “A Fully Data Parallel WFST-based Large Vocabulary Continuous Speech Recognition on a Graphics Processing Unit,” Proc. of Interspeech 2009, Brighton, UK, 2009.

:

Solving Polynomials

Supervisor: Mike McNamee

Required Background: General prerequisites plus a course in Numerical Methods, and knowledge of programming, preferably Fortran

Description

In this project you will compare several efficient methods for solving polynomials, that is, for finding their roots numerically.
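The description does not name the methods to be compared. As one familiar point of reference, the sketch below pairs Horner's rule with Newton's iteration to find a single real root; it is an assumption chosen for illustration, not necessarily one of the methods the supervisor has in mind, and it is written in Python rather than Fortran for brevity. The function names are hypothetical.

```python
def horner(coeffs, x):
    """Evaluate p(x) and p'(x) together by Horner's rule.

    coeffs lists the coefficients from the highest degree down,
    e.g. [1, 0, -2] represents x**2 - 2.
    """
    p, dp = 0.0, 0.0
    for c in coeffs:
        dp = dp * x + p
        p = p * x + c
    return p, dp


def newton_root(coeffs, x0, tol=1e-12, max_iter=100):
    """Refine the starting guess x0 to a root of the polynomial."""
    x = x0
    for _ in range(max_iter):
        p, dp = horner(coeffs, x)
        if dp == 0:
            raise ZeroDivisionError("derivative vanished; try another x0")
        step = p / dp
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton iteration did not converge")


print(newton_root([1, 0, -2], 1.0))  # approximates the square root of 2
```

A comparison of methods would typically look at iteration counts, accuracy on ill-conditioned polynomials, and behaviour near multiple roots, where plain Newton iteration slows down.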

:

MF7114 Assembler

Supervisor: Zbigniew Stachniak

Required Background: Some knowledge of microprocessor architecture and assembly programming

Description

Every microprocessor is supported by a variety of software tools, such as assemblers, disassemblers, and debuggers to allow the development and testing of application programs destined for that microprocessor. The purpose of an assembler is to translate a program written in the target CPU's assembly language into that CPU's machine language. The objective of this project is to write an assembler for the MF7114 microprocessor and test it on a recently written MF7114 emulator.
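The translation step can be pictured with a toy two-pass assembler: the first pass assigns an address to every label, and the second pass emits the opcode and operand bytes. The sketch below is illustrative only. Its four-instruction opcode table, its syntax (labels ending in a colon, comments starting with a semicolon) and its one-byte operands are all hypothetical and are NOT the MF7114 instruction set, which must be taken from the preserved technical documentation.

```python
# Hypothetical opcode table: mnemonic -> (opcode byte, number of operand bytes).
# For illustration only; this is NOT the MF7114 instruction set.
OPCODES = {"NOP": (0x00, 0), "LDI": (0x10, 1), "JMP": (0x20, 1), "HLT": (0xFF, 0)}


def assemble(source):
    """Translate assembly text into machine code bytes in two passes."""
    labels, parsed, address = {}, [], 0

    # Pass 1: strip comments, record label addresses, size each instruction.
    for raw in source.splitlines():
        text = raw.split(";", 1)[0].strip()
        if not text:
            continue
        if text.endswith(":"):
            labels[text[:-1]] = address
            continue
        parts = text.split()
        mnemonic, operands = parts[0].upper(), parts[1:]
        opcode, n_operands = OPCODES[mnemonic]
        if len(operands) != n_operands:
            raise ValueError("wrong operand count for " + mnemonic)
        parsed.append((opcode, operands))
        address += 1 + n_operands

    # Pass 2: emit bytes, resolving labels that may be defined later on.
    out = bytearray()
    for opcode, operands in parsed:
        out.append(opcode)
        for operand in operands:
            value = labels[operand] if operand in labels else int(operand, 0)
            out.append(value & 0xFF)
    return bytes(out)


program = """
start:
    LDI 0x2A    ; load an immediate value
    JMP end     ; forward reference, resolved in pass 2
    NOP
end:
    HLT
"""
print(assemble(program).hex())
```

Two passes are needed because a jump may refer to a label defined further down the file, as `JMP end` does here.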

Background Information: The MF7114 CPU was the first microprocessor designed and manufactured in Canada (by Microsystems International Ltd, or MIL) and one of the earliest microprocessors ever produced. The microprocessor was used, among other applications, as the CPU of the CPS-1 microcomputer. Although none of the CPS-1 computers (nor any MF7114 software) have survived, technical information about the microprocessor and the CPS-1 has been preserved. This makes the design and implementation of an assembler possible. More information is available at:

http://www.cse.yorku.ca/museum/collections/MIL/MIL.htm

:

MF7114 Debugger

Supervisor: Zbigniew Stachniak

Required Background: Some knowledge of microprocessor architecture and assembly programming

Description

Every microprocessor is supported by a variety of software tools, such as assemblers, disassemblers, and debuggers to allow the development and testing of application programs destined for that microprocessor. The purpose of an MF7114 debugger is to debug programs written in the assembly language of the MF7114 microprocessor. The objective of this project is to write an MF7114 debugger and test it on a recently written MF7114 emulator.

Background Information: The MF7114 CPU was the first microprocessor designed and manufactured in Canada (by Microsystems International Ltd, or MIL) and one of the earliest microprocessors ever produced. The microprocessor was used, among other applications, as the CPU of the CPS-1 microcomputer. Although none of the CPS-1 computers (nor any MF7114 software) have survived, technical information about the microprocessor and the CPS-1 has been preserved. This makes the design and implementation of a debugger possible. More information is available at:

http://www.cse.yorku.ca/museum/collections/MIL/MIL.htm

:

Web Crawlers Behaving Like Humans: Are We There Yet!?

Supervisor: Natalija Vlajic

Required Background: General prerequisites

Description

Distributed Denial of Service (DDoS) attacks are recognized as one of the most serious threats to today's Internet due to the relative simplicity of their execution and their ability to severely degrade the quality at which Web-based services are offered to end users. An especially challenging form of DDoS attack is the so-called Application-Layer DDoS attack. Namely: 1) In Application-Layer DDoS attacks, the attackers utilize a flood of legitimate-looking Layer-7 network sessions (i.e., sessions that are generally hard for a firewall or an IDS to detect and/or filter out); 2) Increasingly, these sessions comprise HTTP requests generated by a cleverly programmed crawler that executes a semi-random walk over the web site's links, thereby attempting to appear as a legitimate human visitor.

The goal of this project is to investigate the state of the art in malicious web crawler design. In particular, the project will look into the challenges of designing a smart DDoS crawler from the attacker's point of view, one of these challenges being the estimation of web-page popularity assuming no a priori access to the web logs of the victim web site.

:

GFI Sandbox Analysis of Malware for DDoS

Supervisor: Natalija Vlajic

Required Background: General prerequisites.

Description

GFI Sandbox is a sophisticated industry-leading tool for quick and safe analysis of malware behaviour. The goals of this project are to: 1) familiarize yourself with the operation of GFI Sandbox; 2) using readily available GFI Sandbox Feeds (i.e., ThreatTrack Feeds), build a database of malware designed specifically for the execution of DDoS attacks, the so-called botnet malware; 3) examine the behaviour of the collected malware 'upon execution'; 4) propose and build an environment, comprising the standard freeware security tools, for longer-term (beyond immediate execution) analysis of the collected malware.

:

Network analysis of EEG data: Understanding connections in the brain

Supervisor: Andrew Eckford

Required Background: CSE 3213 (Computer Networks), CSE 3451 (Signals and Systems), and MATH 2030 (Elementary Probability); or equivalents

Preferred: At least a B in all of the above courses

Description Electroencephalogram (EEG) data indicates electrical activity at particular locations in the brain. Using EEG data from multiple sensors, it is possible to find correlations among the measurements, and identify “networks” of activity in the brain. These networks help researchers to determine exactly how the brain processes various stimuli.

The tools that are used to analyze communication networks can also be used to analyze brain networks. In this interdisciplinary project, you will work with a collection of EEG data to identify correlated measurements, and determine network-type relationships based on those measurements. To do so, you will apply skills you learned in courses on Signals and Systems, Computer Networks, and Probability. Your work may lead to a research publication.
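A minimal sketch of the basic idea, assuming Pearson correlation as the similarity measure and a fixed threshold for declaring a link between two sensors. Both choices, as well as the channel names and sample values, are hypothetical illustrations and not the analysis method prescribed for the project.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant signal carries no correlation information
    return sxy / sqrt(sxx * syy)


def correlation_network(channels, threshold=0.8):
    """Return edges (a, b, r) for channel pairs with |r| >= threshold."""
    names = sorted(channels)
    edges = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(channels[a], channels[b])
            if abs(r) >= threshold:
                edges.append((a, b, r))
    return edges


# Made-up readings from three hypothetical sensors.
readings = {
    "C3": [1, 2, 3, 4, 5],
    "C4": [2, 4, 6, 8, 10],
    "Oz": [1, -1, 1, -1, 1],
}
print(correlation_network(readings))
```

The resulting edge list is an undirected graph, so the usual tools for communication networks (degree, connectivity, shortest paths) can then be applied to it.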

:

An Open Source Structural Equation Modeling Path Diagram to Syntax Application

Supervisor: Jeff Edmonds

Required Background: Java

Recommended Background: GUI Development

Description The software required is an application that allows researchers to define their hypothesized models visually and outputs the correct syntax for the analytical software of their choosing.

To date, a promising functional application has been developed in Java by a Computer Science student as a 4080 project. The existing software allows the user to draw a path diagram and outputs code for the R package sem. There are a number of improvements to be made (refinements and additions to the graphical user interface), and then the application needs to be extended to output syntax appropriate for additional software applications (OpenMx, Mplus and EQS).

This is a cross-disciplinary project with the Quantitative Methods division of the Department of Psychology. As such, the student will be working with individuals who have expertise in the relevant statistics but are not themselves software developers, which is reflective of real-world situations. The student is not expected to have any familiarity with statistics or the software packages mentioned above; this background will be provided.

:

YUsend Thermal Vacuum (TVAC) Test Manager

Supervisor: Rob Allison (co-supervised with Hugh Chesser, Space Engineering)

Required Background: General CSE408x prerequisites, familiarity with C++ and Windows software tools

Description The YUsend (York University Space Engineering Nanosatellite Demonstration) Lab has procured a Windows XP-based industrial computer and a temperature acquisition card (as well as other hardware) for performing TVAC testing of nanosatellites in the CSIL Lab (PSE 003). A “TVAC Test Manager” application written using LabVIEW's G programming language will oversee the acquisition of temperatures (thermal test outputs) and the control of IR lamps (thermal test inputs) during the rather long periods (4 or more days, 24 hours a day) of a TVAC test.

Specific tasks include: 1. Write temperature acquisition card (OMEGA Engineering CIO-DAS-Temp) drivers for LabVIEW; these should be written in Visual C++ or similar and compiled into SubVI format. 2. Write LabVIEW VIs (“Virtual Instruments”) to (a) perform test set-up activities: checkout of sensors and lamps, assigning mnemonics to temperature sensors, and setting alarm conditions for sensors and lamps; (b) acquire and monitor temperature data and control lamp voltage during the test, raising operator alarms for anomalous temperature or IR lamp conditions as required; (c) store temperature and control data for subsequent analysis and reporting. 3. (Optional) Interface the Test Manager with an orbital simulation tool that would be used to compute IR lamp inputs based on a simulation of the nanosatellite's orbital position and attitude (e.g., in the sun, lamps on; in eclipse, lamps off). The simulation tool is a package called Satellite Toolkit (STK), which has a TCP/IP-based API.

:

CSE4080
