The department has developed neural network learning algorithms for over a decade, and several of the most successful algorithms in use today were created here. These include original algorithms for image and video interpretation as well as for text analysis.
The main focus is on flexible algorithms that can handle many types of data and scale to large problems.
Our deep learning architectures handle a range of natural language processing tasks, including parsing, part-of-speech tagging, chunking, named entity recognition, and semantic role labeling, matching or exceeding state-of-the-art performance. Instead of depending on hand-crafted features engineered for specific tasks, our system learns internal representations from mostly unlabeled training data.
This makes the system very flexible, since it can be adapted to a different language simply by training it on text data from that language. So far we have developed systems for English, Japanese and Chinese, achieving state-of-the-art or better performance in all cases.
Semantic Analysis and Reasoning
We develop several types of algorithms for high-level semantic analysis, used for tasks such as scene interpretation, document retrieval and question answering. For text interpretation, a syntactic analysis first extracts the relevant elements, and concept interpretation follows. To combine different data types, a data-specific module first generates metadata representations, which are then integrated in a deep learning network.
Supervised Sequence Embedding (SSE) is a simple and efficient technique for interpreting shorter segments of text, such as product reviews or e-mail messages. Short phrases (n-grams) are modeled in a latent space. The phrases are then combined to form document-level latent representations, where the position of each n-gram in the document is used to compute its combining weight. The resulting two-stage supervised embedding is coupled with a classifier to form an end-to-end system, which we apply to large-scale sentiment classification. SSE does not require feature selection, and its parameter space grows only linearly with the size of the n-grams.
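The two-stage structure described above can be illustrated with a minimal forward-pass sketch. It is not the SSE implementation: the tanh projection, the softmax over a linear function of relative position, the logistic output, and all names (`init_params`, `embed_ngram`, `position_weights`, `predict`) are assumptions chosen for illustration, the parameters are random rather than trained, and the document is assumed to contain at least n tokens.

```python
import math
import random

def init_params(vocab_size, word_dim, latent_dim, n, seed=0):
    """Random (untrained) parameters; all shapes are illustrative assumptions."""
    rng = random.Random(seed)
    def mat(rows, cols):
        return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]
    return {
        "E": mat(vocab_size, word_dim),      # word embedding table
        "W": mat(latent_dim, n * word_dim),  # n-gram projection: size is linear in n
        "pos": (rng.gauss(0.0, 0.1), rng.gauss(0.0, 0.1)),  # hypothetical position scorer
        "w_out": [rng.gauss(0.0, 0.1) for _ in range(latent_dim)],
        "b_out": 0.0,
    }

def embed_ngram(params, gram):
    """Stage 1: map one n-gram (list of token ids) into the latent space."""
    x = [v for tok in gram for v in params["E"][tok]]  # concatenate word vectors
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in params["W"]]

def position_weights(params, num_grams):
    """Combining weights from relative position (softmax of a linear score)."""
    a, b = params["pos"]
    denom = max(num_grams - 1, 1)
    scores = [a * (i / denom) + b for i in range(num_grams)]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict(params, token_ids, n):
    """Stage 2: weighted sum of n-gram embeddings, then a logistic classifier."""
    grams = [token_ids[i:i + n] for i in range(len(token_ids) - n + 1)]
    phrases = [embed_ngram(params, g) for g in grams]
    weights = position_weights(params, len(grams))
    latent_dim = len(params["w_out"])
    doc = [sum(w * p[k] for w, p in zip(weights, phrases)) for k in range(latent_dim)]
    logit = sum(w * d for w, d in zip(params["w_out"], doc)) + params["b_out"]
    return 1.0 / (1.0 + math.exp(-logit)), weights

params = init_params(vocab_size=50, word_dim=4, latent_dim=3, n=2)
prob, weights = predict(params, [1, 5, 7, 9, 2], n=2)
```

In this sketch only the projection matrix `W` depends on n, and its width is `n * word_dim`, which mirrors the linear growth of the parameter space noted above. In an end-to-end system all parameters would be trained jointly against the classification loss.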
High-Order Feature Learning
High-order feature interactions can capture intuitively understandable structures in the data of interest. ‘High-Order Parametric Embedding’ (HOPE) is an efficient algorithm for determining high-order features and generating data embeddings suitable for visualization. Compared to deep embedding models with complicated architectures, HOPE is considerably more effective at learning high-order feature mappings, and it can also synthesize a small number of exemplars that represent the entire data set in a low-dimensional space. To compute this efficiently, we have developed novel techniques based on tensor factorization.
This finds applications in a wide variety of problems where interpretation of the trained models is important.
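To show why tensor factorization makes high-order interactions affordable, the following sketch compares a third-order feature computed from a dense interaction tensor with the same feature computed from rank-one factors. It is a generic illustration, not the HOPE algorithm: the third-order form, the factor triples, and the function names are assumptions.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def factored_feature(factors, x):
    """Third-order feature from F rank-one factors: sum_f (a_f.x)(b_f.x)(c_f.x).

    Cost is O(F * d) for a d-dimensional input.
    """
    return sum(dot(a, x) * dot(b, x) * dot(c, x) for a, b, c in factors)

def dense_feature(factors, x):
    """Same feature via the explicit tensor T[i][j][k] = sum_f a_f[i] b_f[j] c_f[k].

    Cost is O(d^3) per input once T is materialized; shown only for comparison.
    """
    d = len(x)
    total = 0
    for i in range(d):
        for j in range(d):
            for k in range(d):
                t_ijk = sum(a[i] * b[j] * c[k] for a, b, c in factors)
                total += t_ijk * x[i] * x[j] * x[k]
    return total

factors = [([1, 0], [0, 1], [1, 1]),
           ([1, 1], [1, 0], [0, 1])]
x = [2, 3]
same = factored_feature(factors, x) == dense_feature(factors, x)
```

Both functions return the same value for any input, but the factored form never builds the d x d x d tensor, which is what keeps high-order features tractable as the input dimension grows.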
In problems with a large number of labels, most multilabel and multiclass techniques incur a significant computational burden at test time. This is because, for each test instance, they must evaluate every label to decide whether it is relevant for that instance. We address this problem by designing computationally efficient label filters that eliminate the majority of the labels from consideration before the base multiclass or multilabel classifier is applied. The proposed label filter projects a test instance onto a filtering line and eliminates all labels that had no training instances falling in the vicinity of this projection. The filter is learned directly from data by solving a constrained optimization problem, and it is independent of the base multilabel classifier. Experiments show that the proposed label filters can speed up prediction by orders of magnitude without significant impact on performance.
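The filtering step can be sketched as follows. This is a simplified illustration under stated assumptions: the filtering direction `w` is given rather than learned by constrained optimization, and "vicinity" is approximated by a per-label interval covering the projections of that label's training instances, widened by a hypothetical `margin`. The function names are illustrative.

```python
def project(w, x):
    """Project instance x onto the filtering line with direction w."""
    return sum(wi * xi for wi, xi in zip(w, x))

def fit_intervals(w, X, Y, margin=0.0):
    """Record, per label, the range of projections seen in training.

    X is a list of feature vectors; Y is a list of label sets, one per instance.
    """
    bounds = {}
    for x, labels in zip(X, Y):
        z = project(w, x)
        for label in labels:
            lo, hi = bounds.get(label, (z, z))
            bounds[label] = (min(lo, z), max(hi, z))
    return {label: (lo - margin, hi + margin) for label, (lo, hi) in bounds.items()}

def candidate_labels(w, intervals, x):
    """Keep only labels whose training interval contains the projection of x."""
    z = project(w, x)
    return {label for label, (lo, hi) in intervals.items() if lo <= z <= hi}

X = [[0.0, 1.0], [0.2, 0.9], [5.0, 0.1], [5.5, 0.0]]
Y = [{"cat"}, {"cat", "pet"}, {"car"}, {"car"}]
w = [1.0, 0.0]  # assumed direction; the real filter learns this from data
intervals = fit_intervals(w, X, Y, margin=0.5)
survivors = candidate_labels(w, intervals, [0.1, 3.0])
```

Only the surviving labels are passed to the base classifier, which is why the filter is independent of it. The linear scan over labels here is for clarity; with many labels, sorted interval endpoints would allow a much faster lookup.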
Accurate and fast diagnosis based on histological samples is crucial for prevention, early detection and treatment of cancer. NEC has developed a digital pathology system, where images of tissues are analyzed with machine learning algorithms to assist in cancer diagnosis.
This system is used in major diagnostics laboratories in Japan for quality control to ensure the correct pathological diagnosis.
Machine Learning Parallelization
Large-scale data analytics is compute-intensive and requires parallelization of algorithms as well as optimization of the data flow. We develop various types of parallelization for multi-core systems and clusters. We also work with heterogeneous systems that include GPUs or vector processors.
MALT is one of our projects to enable parallelization over a large number of processors through virtual shared memory. MALT provides abstractions for fine-grained in-memory updates using one-sided RDMA, limiting data movement costs during incremental model updates. Developers specify the dataflow, while MALT takes care of communication and representation optimizations. It supports ML applications written in C, C++ and Lua that are based on SVM, matrix factorization and deep learning. Besides speedup, MALT also provides fault tolerance and guarantees network efficiency. We are implementing new distributed optimization algorithms on MALT, such as RWDDA, as well as support for multiple GPUs.
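The general pattern of replicas exchanging incremental model updates can be illustrated with a toy, single-process simulation of data-parallel gradient descent. It does not use MALT or RDMA, and nothing here reflects MALT's API: the `inbox` lists merely stand in for memory regions that peers would write into directly, and the all-to-all exchange, the 1-D linear model, and the function names are assumptions.

```python
def local_gradient(w, shard):
    """Gradient of the mean squared error of the 1-D model y = w * x on one shard."""
    return sum(2.0 * (w * x - y) * x for x, y in shard) / len(shard)

def train(shards, steps=200, lr=0.05):
    """Each replica trains on its own shard and averages gradients from all peers."""
    num = len(shards)
    weights = [0.0] * num  # one model copy per replica
    for _ in range(steps):
        # inbox[p][r] holds the gradient that replica r wrote into peer p's buffer.
        inbox = [[0.0] * num for _ in range(num)]
        for r in range(num):
            g = local_gradient(weights[r], shards[r])
            for peer in range(num):
                inbox[peer][r] = g  # stands in for a one-sided remote write
        for r in range(num):
            weights[r] -= lr * sum(inbox[r]) / num  # apply the averaged update locally
    return weights

# Three shards of data drawn from y = 2x.
shards = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0)], [(0.5, 1.0), (1.5, 3.0)]]
replica_weights = train(shards)
```

Because every replica applies the same averaged update, the model copies stay identical while each replica only ever reads its own shard. A real system would overlap communication with computation and could restrict which peers exchange updates, which is the kind of dataflow choice the text leaves to the developer.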
Machine Learning Development Environment
Data analytics requires a wide range of tools for handling tasks from data collection, cleaning and labeling to model training and testing and, finally, presenting the results.
Our system is designed for the highest performance, but also for ease of use. The software architecture is cleanly structured into layers: the lowest one provides math and statistics functions, followed by a layer of machine learning and data handling tools. The next layer contains templates for application domains, implemented in a scripting language (Lua). Field engineers adapt these templates to customer requirements when deploying applications at a customer's premises.
Torch 7: Open Source System
Torch 7 provides a powerful environment for state-of-the-art machine learning algorithms. It is easy to use and very efficient, thanks to a simple, fast scripting language (Lua) and an underlying C implementation.