Parallel Computation in Learning
We explore ways to implement large-scale learning algorithms
as parallel computations.
We are currently developing parallelization approaches that increase the
ability of Support Vector Machines (SVMs) to solve large-scale
problems. As target systems, we consider shared-memory processors,
clusters of processors, vector processors, and SIMD (Single
Instruction, Multiple Data) processors. On a given system, the speed of
an SVM is limited by the compute performance of the processor as well
as by the size of the memory. Efficient parallelizations have to
overcome both of these limitations without getting bogged down in
communication overhead.
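As a rough illustration of the data-parallel idea, the sketch below
splits the training points into chunks and lets each worker compute
the kernel values for its own chunk, so that only the short list of
results has to be communicated back. It is an assumption-laden toy,
not the approach developed here: the RBF kernel, the names
rbf_kernel, split_into_chunks and kernel_row_parallel, and the use of
threads are all chosen for illustration. In CPython, threads will not
speed up pure-Python arithmetic; the code only shows the structure of
the decomposition.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def rbf_kernel(x, z, gamma=0.5):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def split_into_chunks(items, n_chunks):
    """Split a list into n_chunks contiguous pieces of near-equal size."""
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def kernel_row_parallel(x, data, n_workers=4):
    """Compute [k(x, z) for z in data], one chunk of data per worker."""
    chunks = split_into_chunks(data, n_workers)

    def partial_row(chunk):
        # Each worker touches only its own chunk of the training data.
        return [rbf_kernel(x, z) for z in chunk]

    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # map() returns results in chunk order, so concatenation is safe.
        partial_rows = list(pool.map(partial_row, chunks))
    return [value for part in partial_rows for value in part]
```

On a cluster, each chunk would live in the memory of a different
node, which is what relieves the memory limit; the price is the
communication needed to gather the partial rows.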
Online Learning
Online learning promises to handle very large datasets
because data can be streamed from a disk or another
source, so the entire dataset never has to be held in
memory. We investigate fast online algorithms that
achieve good generalization ability after only one pass
over the data.
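To make the single-pass setting concrete, here is a minimal sketch
of a linear classifier trained with a Pegasos-style stochastic
subgradient update on the hinge loss. This is a stand-in technique
chosen for illustration, not necessarily one of the algorithms
studied here. The generator stream_examples is a hypothetical
substitute for reading examples from disk, and the synthetic data,
the regularization constant lam, and all function names are
assumptions.

```python
import random


def stream_examples(n_draws, seed=0):
    """Yield (features, label) pairs one at a time, like a disk stream."""
    rng = random.Random(seed)
    for _ in range(n_draws):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        score = x[0] + x[1]
        if abs(score) < 0.2:
            continue  # skip points near the boundary to leave a margin
        yield x, (1 if score > 0 else -1)


def train_one_pass(stream, dim, lam=0.01):
    """Single pass over the stream; only the weight vector is kept."""
    w = [0.0] * dim
    for t, (x, y) in enumerate(stream, start=1):
        eta = 1.0 / (lam * t)  # decaying step size
        margin = y * sum(wi * xi for wi, xi in zip(w, x))
        # Shrink the weights (regularization step).
        w = [(1.0 - eta * lam) * wi for wi in w]
        if margin < 1.0:
            # Hinge loss is active: move toward classifying x correctly.
            w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w


def accuracy(w, examples):
    """Fraction of examples whose sign(w . x) matches the label."""
    examples = list(examples)
    correct = sum(
        1 for x, y in examples
        if y * sum(wi * xi for wi, xi in zip(w, x)) > 0
    )
    return correct / len(examples)
```

A call such as train_one_pass(stream_examples(2000), dim=2) sees
each example exactly once and stores nothing but the weight vector,
so memory use does not grow with the size of the dataset.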
Large Scale Transduction
Transduction and semi-supervised learning methods can help
improve generalization ability in learning problems by using
the unlabeled test examples, or other unlabeled data, during
learning. However, many such
algorithms are infeasibly slow. We investigate how to scale
algorithms in this domain to large problems.