Classification is a fundamental supervised machine learning task where the goal is to assign predefined labels or categories to input data points. The learning process involves training a model on a labeled dataset, and the trained model can then predict the class of new, unseen instances. Classification is used in various applications, including image recognition, spam filtering, and medical diagnosis.
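As a concrete illustration, the snippet below shows a minimal classification workflow with scikit-learn; the dataset and classifier are arbitrary choices for demonstration and are not tied to any of the papers listed here.

```python
# Minimal classification example (assumes scikit-learn is installed).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)                       # features and predefined labels
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)   # train on labeled data
y_pred = clf.predict(X_test)                                     # predict unseen instances
print("test accuracy:", accuracy_score(y_test, y_pred))
```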

Posts

Matching Confidences and Softened Target Occurrences for Calibration

The problem of calibrating deep neural networks (DNNs) is gaining attention, as these networks are becoming central to many real-world applications. Different attempts have been made to counter the poor calibration of DNNs. Amongst others, train-time calibration methods have emerged as an effective class of approaches for improving model calibration. Motivated by this, we propose a novel train-time calibration method built on a new auxiliary loss formulation, namely multiclass alignment of confidences with the gradually softened ground truth occurrences (MACSO). It is developed on the intuition that, for a class, the gradually softened ground truth occurrences distribution is a suitable non-zero-entropy signal whose better alignment with the predicted confidences distribution is positively correlated with reducing the model calibration error. In our train-time approach, besides simply aligning the two distributions, e.g., via their means or KL divergence, we propose to quantify the linear correlation between the two distributions, which preserves the relations among them, thereby further improving the calibration performance. Finally, we also reveal that MACSO possesses desirable theoretical properties. Extensive results on several challenging datasets, featuring in- and out-of-domain scenarios, a class imbalance problem, and a medical image classification task, validate the efficacy of our method against state-of-the-art train-time calibration methods.
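The exact MACSO formulation is not reproduced in this abstract. The sketch below only illustrates the general idea of an auxiliary loss that aligns predicted confidences with a softened target distribution via a mean term and a linear (Pearson) correlation term. The function names (`soften_targets`, `alignment_loss`), the use of plain label smoothing as the softening, and the `smooth` parameter are illustrative assumptions, not the paper's definitions.

```python
import torch
import torch.nn.functional as F

def soften_targets(labels, num_classes, smooth=0.1):
    """Softened one-hot occurrences (plain label smoothing, used here as a
    stand-in for the paper's gradual softening schedule)."""
    one_hot = F.one_hot(labels, num_classes).float()
    return one_hot * (1.0 - smooth) + smooth / num_classes

def alignment_loss(logits, labels, smooth=0.1):
    """Auxiliary loss aligning predicted confidences with softened targets
    via a mean-difference term plus (1 - Pearson correlation) per sample."""
    probs = F.softmax(logits, dim=1)                      # predicted confidences
    targets = soften_targets(labels, logits.size(1), smooth)
    mean_term = (probs.mean(dim=0) - targets.mean(dim=0)).abs().sum()

    # Linear (Pearson) correlation between the two per-sample distributions.
    p = probs - probs.mean(dim=1, keepdim=True)
    t = targets - targets.mean(dim=1, keepdim=True)
    corr = (p * t).sum(dim=1) / (p.norm(dim=1) * t.norm(dim=1) + 1e-8)
    corr_term = (1.0 - corr).mean()
    return mean_term + corr_term
```

In a train-time setup, a term like this would typically be added to the standard cross-entropy loss with a weighting coefficient.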

Real-time Intrusion Detection and Impulsive Acoustic Event Classification with Fiber Optic Sensing and Deep Learning Technologies over Telecom Networks

We review various use cases of distributed-fiber-optic-sensing and machine-learning technologies that offer advantages to telecom fiber networks on existing fiber infrastructures. By leveraging an edge-AI platform, perimeter intrusion detection and impulsive acoustic event classification can be performed locally on the fly, ensuring real-time detection with low latency.

Ordinal Quadruplet: Retrieval of Missing Labels in Ordinal Time Series

In this paper, we propose an ordered time series classification framework that is robust against missing classes in the training data, i.e., during testing we can prescribe classes that are missing during training. This framework relies on two main components: (1) our newly proposed ordinal quadruplet loss, which forces the model to learn a latent representation that preserves the ordinal relation among labels, and (2) a testing procedure, which utilizes the order-preserving property of the latent representation. We conduct experiments on real-world multivariate time series data and show a significant improvement in the prediction of missing labels, even when 40% of the classes are missing from training. Compared with the well-known triplet loss optimization augmented with interpolation for missing information, in some cases we nearly double the accuracy.
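The precise ordinal quadruplet loss is defined in the paper; the following is only a sketch of a quadruplet-style objective that pushes latent distances to respect label order. It assumes a sampler that provides, for each anchor, a positive of the same class, a negative whose label is ordinally close, and a negative whose label is ordinally farther; the margins `m1` and `m2` are illustrative hyperparameters.

```python
import torch
import torch.nn.functional as F

def ordinal_quadruplet_loss(anchor, positive, near_neg, far_neg, m1=0.5, m2=0.5):
    """Illustrative quadruplet-style loss: the positive shares the anchor's label,
    near_neg has a label ordinally closer to the anchor than far_neg, and latent
    distances are pushed to respect that ordering."""
    d_pos = F.pairwise_distance(anchor, positive)
    d_near = F.pairwise_distance(anchor, near_neg)
    d_far = F.pairwise_distance(anchor, far_neg)
    # Positive closer than the near negative; near negative closer than the far one.
    loss = F.relu(d_pos - d_near + m1) + F.relu(d_near - d_far + m2)
    return loss.mean()
```

The order-preserving latent space is what the abstract credits for retrieving missing labels at test time; the actual retrieval procedure follows the paper and is not sketched here.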

Spatio-Temporal Attentive RNN for Node Classification in Temporal Attributed Graphs

Node classification in graph-structured data aims to classify the nodes where labels are only available for a subset of nodes. This problem has attracted considerable research efforts in recent years. In real-world applications, both graph topology and node attributes evolve over time. Existing techniques, however, mainly focus on static graphs and lack the capability to simultaneously learn both temporal and spatial/structural features. Node classification in temporal attributed graphs is challenging for two major reasons. First, effectively modeling the spatio-temporal contextual information is hard. Second, as temporal and spatial dimensions are entangled, to learn the feature representation of one target node it is desirable yet challenging to differentiate the relative importance of different factors, such as different neighbors and time periods. In this paper, we propose STAR, a spatio-temporal attentive recurrent network model, to deal with the above challenges. STAR extracts the vector representation of the neighborhood by sampling and aggregating local neighbor nodes. It further feeds both the neighborhood representation and node attributes into a gated recurrent unit network to jointly learn the spatio-temporal contextual information. On top of that, we take advantage of the dual attention mechanism to perform a thorough analysis of model interpretability. Extensive experiments on real datasets demonstrate the effectiveness of the STAR model.
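As a rough sketch of the pipeline the abstract describes (neighborhood aggregation fed jointly with node attributes into a gated recurrent unit), the module below mean-aggregates sampled neighbor features per time step and runs a GRU over the sequence. The dual attention mechanism and the exact aggregation used by STAR are omitted, and the class and parameter names are illustrative.

```python
import torch
import torch.nn as nn

class SpatioTemporalRNN(nn.Module):
    """Minimal sketch: aggregate sampled neighbor features at each time step,
    concatenate with the target node's attributes, and run a GRU over time.
    The dual attention of the actual STAR model is omitted here."""
    def __init__(self, attr_dim, hidden_dim, num_classes):
        super().__init__()
        self.gru = nn.GRU(2 * attr_dim, hidden_dim, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, node_attrs, neighbor_attrs):
        # node_attrs:     (batch, time, attr_dim)      target node attributes
        # neighbor_attrs: (batch, time, k, attr_dim)   k sampled neighbors per step
        neigh = neighbor_attrs.mean(dim=2)              # mean-aggregate the neighborhood
        x = torch.cat([node_attrs, neigh], dim=-1)      # joint spatio-temporal input
        out, _ = self.gru(x)
        return self.classifier(out[:, -1])              # classify from the last hidden state
```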

Baseline Needs More Love: On Simple Word-Embedding-Based Models and Associated Pooling Mechanisms

Many deep learning architectures have been proposed to model the compositionality in text sequences, requiring a substantial number of parameters and expensive computations. However, there has not been a rigorous evaluation regarding the added value of sophisticated compositional functions. In this paper, we conduct a point-by-point comparative study between Simple Word-Embedding-based Models (SWEMs), consisting of parameter-free pooling operations, and word-embedding-based RNN/CNN models. Surprisingly, SWEMs exhibit comparable or even superior performance in the majority of cases considered. Based upon this understanding, we propose two additional pooling strategies over learned word embeddings: (i) a max-pooling operation for improved interpretability; and (ii) a hierarchical pooling operation, which preserves spatial (n-gram) information within text sequences. We present experiments on 17 datasets encompassing three tasks: (i) (long) document classification; (ii) text sequence matching; and (iii) short text tasks, including classification and tagging.
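Below is a minimal sketch of the parameter-free pooling operations described above, assuming pre-computed word embeddings of shape (batch, sequence length, embedding dimension); the function name and window size are illustrative choices.

```python
import torch

def swem_pool(embeddings, window=3):
    """Parameter-free pooling over word embeddings (batch, seq_len, dim):
    average, max, and hierarchical (window-average then max) pooling."""
    avg_pool = embeddings.mean(dim=1)
    max_pool = embeddings.max(dim=1).values
    # Hierarchical: average within local n-gram windows, then max over windows.
    windows = embeddings.unfold(1, window, 1).mean(dim=-1)   # (batch, n_windows, dim)
    hier_pool = windows.max(dim=1).values
    return avg_pool, max_pool, hier_pool
```

The hierarchical variant keeps some word-order (n-gram) information that plain average or max pooling discards, at no extra parameter cost.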