Posts

Provable Membership Inference Privacy

In applications involving sensitive data, such as finance and healthcare, the necessity of preserving data privacy can be a significant barrier to machine learning model development. Differential privacy (DP) has emerged as one canonical standard for provable privacy. However, DP's strong theoretical guarantees often come at the cost of a large drop in utility for machine learning, and DP guarantees themselves are difficult to interpret. In this work, we propose a novel privacy notion, membership inference privacy (MIP), as a step towards addressing these challenges. We give a precise characterization of the relationship between MIP and DP, and show that in some cases MIP can be achieved with less randomness than is required for guaranteeing DP, leading to a smaller drop in utility. MIP guarantees are also easily interpretable in terms of the success rate of membership inference attacks in a simple random subsampling setting. As a proof of concept, we also provide a simple algorithm for guaranteeing MIP without needing to guarantee DP.
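To make the random-subsampling interpretation concrete, the sketch below empirically estimates the success rate of a simple membership inference attack. This is purely illustrative and is not the paper's algorithm or its MIP mechanism; the synthetic data, the logistic regression model, and the loss-thresholding attack are all assumptions made for the example.

```python
# Illustrative sketch: estimating the success rate of a loss-thresholding
# membership inference attack in a random-subsampling setting.
# (Not the paper's algorithm; all modeling choices here are assumptions.)
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic candidate pool: each record may or may not end up in training data.
n, d = 2000, 20
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = (X @ w_true + 0.5 * rng.normal(size=n) > 0).astype(int)

# Random subsampling: each record is included in the training set with prob. 1/2.
member = rng.random(n) < 0.5
model = LogisticRegression(max_iter=1000).fit(X[member], y[member])

# Attack: guess "member" when the per-record loss is below a threshold,
# since training points tend to be fit more closely than held-out points.
p = model.predict_proba(X)[:, 1]
eps = 1e-12
loss = -(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
guess_member = loss < np.median(loss)  # simple illustrative threshold

# Attack success rate; 0.5 is the random-guessing baseline, and an MIP-style
# guarantee bounds how far above 0.5 this rate can go.
print(f"membership inference success rate: {np.mean(guess_member == member):.3f}")
```

In this setting, an MIP guarantee can be read directly as a cap on the attack success rate above the 50% baseline, which is what makes the notion easy to interpret.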

Monitoring AI-Modified Content at Scale: A Case Study on the Impact of ChatGPT on AI Conference Peer Reviews

We present an approach for estimating the fraction of text in a large corpus that is likely to have been substantially modified or produced by a large language model (LLM). Our maximum likelihood model leverages expert-written and AI-generated reference texts to accurately and efficiently examine real-world LLM use at the corpus level. We apply this approach to a case study of scientific peer review in AI conferences that took place after the release of ChatGPT: ICLR 2024, NeurIPS 2023, CoRL 2023, and EMNLP 2023. Our results suggest that between 6.5% and 16.9% of text submitted as peer reviews to these conferences could have been substantially modified by LLMs, i.e., beyond spell-checking or minor writing updates. The circumstances in which generated text occurs offer insight into user behavior: the estimated fraction of LLM-generated text is higher in reviews that report lower confidence, that were submitted close to the deadline, and that come from reviewers who are less likely to respond to author rebuttals. We also observe corpus-level trends in generated text that may be too subtle to detect at the individual level, and discuss the implications of such trends for peer review. We call for future interdisciplinary work to examine how LLM use is changing our information and knowledge practices.
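As a rough illustration of the corpus-level maximum likelihood idea, the sketch below estimates the mixture fraction of AI-modified documents, assuming per-document likelihoods under "human" and "AI" reference distributions are already available. How those reference distributions are built from expert-written and AI-generated texts (the key ingredient of the paper) is not reproduced here, and the toy numbers are purely synthetic.

```python
# Simplified sketch of corpus-level maximum likelihood estimation of the
# fraction alpha of LLM-modified documents, given per-document log-likelihoods
# under human and AI reference distributions (assumed inputs, not derived here).
import numpy as np
from scipy.optimize import minimize_scalar

def estimate_alpha(loglik_human, loglik_ai):
    """Maximize sum_i log((1 - a) * p_human_i + a * p_ai_i) over a in [0, 1]."""
    loglik_human = np.asarray(loglik_human)
    loglik_ai = np.asarray(loglik_ai)

    def neg_loglik(a):
        # Per-document mixture log-likelihood via log-sum-exp for stability.
        m = np.maximum(loglik_human, loglik_ai)
        mix = np.log((1 - a) * np.exp(loglik_human - m)
                     + a * np.exp(loglik_ai - m)) + m
        return -np.sum(mix)

    res = minimize_scalar(neg_loglik, bounds=(0.0, 1.0), method="bounded")
    return res.x

# Toy corpus: about 10% of documents are better explained by the AI reference
# distribution (synthetic numbers chosen only for illustration).
rng = np.random.default_rng(0)
is_ai = rng.random(5000) < 0.10
loglik_human = np.where(is_ai, -13.0, -8.0) + rng.normal(0, 0.5, 5000)
loglik_ai = np.where(is_ai, -8.0, -13.0) + rng.normal(0, 0.5, 5000)
print(f"estimated alpha: {estimate_alpha(loglik_human, loglik_ai):.3f}")
```

The point of estimating alpha at the corpus level, rather than classifying each review individually, is that aggregate shifts can be detected reliably even when no single document can be flagged with confidence.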

NEC Labs America Team Heading to NeurIPS23 in New Orleans

NEC Labs America is proud to be a Silver Sponsor for NeurIPS 2023 in New Orleans from December 10-16. Visit our booth to meet our team and learn about our intern opportunities in machine learning, data science, media analytics, and integrated systems. Also, our Vijay Kumar B.G., Samuel Schulter, and Manmohan Chandraker, along with Zaid Khan (Northeastern University) and Yun Fu (UC San Diego), will present a paper, Exploring Question Decomposition for Zero-Shot VQA.