Cornell University is a leading Ivy League research institution known for excellence in engineering, agriculture, and computer science. Its interdisciplinary focus and global partnerships drive innovation with societal impact. NEC Labs America collaborates with Cornell University on AI for telecommunications, anomaly detection, and scalable graph learning. Our work informs intelligent infrastructure monitoring and automated network analysis. Please read about our latest news and collaborative publications with Cornell University.

Posts

Top 10 Most Legendary College Pranks of All-Time for April Fools’ Day

At NEC Labs America, we celebrate innovation in all forms—even the brilliantly engineered college prank. From MIT’s police car on the Great Dome to Caltech hacking the Rose Bowl, these legendary stunts showcase next-level planning, stealth, and technical genius. Our Top 10 list honors the creativity behind pranks that made history (and headlines). This April Fools’ Day, we salute the hackers, makers, and mischief-makers who prove that brilliance can be hilarious.

Understanding Transcriptional Regulatory Redundancy by Learnable Global Subset Perturbations

Transcriptional regulation through cis-regulatory elements (CREs) is crucial for numerous biological functions, with its disruption potentially leading to various diseases. It is well-known that these CREs often exhibit redundancy, allowing them to compensate for each other in response to external disturbances, highlighting the need for methods to identify CRE sets that collaboratively regulate gene expression effectively. To address this, we introduce GRIDS, an in silico computational method that approaches the task as a global feature explanation challenge to dissect combinatorial CRE effects in two phases. First, GRIDS constructs a differentiable surrogate function to mirror the complex gene regulatory process, facilitating cross-translations in single-cell modalities. It then employs learnable perturbations within a state transition framework to offer global explanations, efficiently navigating the combinatorial feature landscape. Through comprehensive bench marks, GRIDS demonstrates superior explanatory capabilities compared to other leading methods. Moreover, GRIDS s global explanations reveal intricate regulatory redundancy across cell types and states, underscoring its potential to advance our understanding ofcellular regulation in biological research.

NodeMerge: Template Based Efficient Data Reduction For Big-Data Causality Analysis

Today’s enterprises are exposed to sophisticated attacks, such as Advanced Persistent Threats~(APT) attacks, which usually consist of stealthy multiple steps. To counter these attacks, enterprises often rely on causality analysis on the system activity data collected from a ubiquitous system monitoring to discover the initial penetration point, and from there identify previously unknown attack steps. However, one major challenge for causality analysis is that the ubiquitous system monitoring generates a colossal amount of data and hosting such a huge amount of data is prohibitively expensive. Thus, there is a strong demand for techniques that reduce the storage of data for causality analysis and yet preserve the quality of the causality analysis. To address this problem, in this paper, we propose NodeMerge, a template based data reduction system for online system event storage. Specifically, our approach can directly work on the stream of system dependency data and achieve data reduction on the read-only file events based on their access patterns. It can either reduce the storage cost or improve the performance of causality analysis under the same budget. Only with a reasonable amount of resource for online data reduction, it nearly completely preserves the accuracy for causality analysis. The reduced form of data can be used directly with little overhead. To evaluate our approach, we conducted a set of comprehensive evaluations, which show that for different categories of workloads, our system can reduce the storage capacity of raw system dependency data by as high as 75.7 times, and the storage capacity of the state-of-the-art approach by as high as 32.6 times. Furthermore, the results also demonstrate that our approach keeps all the causality analysis information and has a reasonably small overhead in memory and hard disk.