Princeton University is a prestigious Ivy League research institution renowned for its pioneering work in physics, computer science, and public policy. It cultivates fundamental discovery with societal and global impact. NECLA researchers worked with Princeton University to develop enhanced negative-sample generation techniques for language-and-vision models. The collaboration creates more informative contrastive learning signals, leading to better alignment between visual inputs and language representations in multimodal AI systems.

Posts

DISC: Dynamic Decomposition Improves LLM Inference Scaling

Inference scaling methods for LLMs often rely on decomposing problems into steps (or groups of tokens), followed by sampling and selecting the best next steps. However, these steps and their sizes are often predetermined or manually designed based on domain knowledge. We propose dynamic decomposition, a method that adaptively and automatically partitions solution and reasoning traces into manageable steps during inference. By more effectively allocating compute — particularly through subdividing challenging steps and prioritizing their sampling — dynamic decomposition significantly improves inference efficiency. Experiments on benchmarks such as APPS, MATH, and LiveCodeBench demonstrate that dynamic decomposition outperforms static approaches, including token-level, sentence-level, and single-step decompositions, reducing the pass@10 error rate by 5.0%, 6.7%, and 10.5% respectively. These findings highlight the potential of dynamic decomposition to improve a wide range of inference scaling techniques.
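The core loop described above — sample candidate next steps, detect which steps are hard, subdivide those and sample them more — can be illustrated with a minimal sketch. This is not the paper's implementation: the `sample_step` stand-in, the disagreement-based difficulty estimate, the 0.05 threshold, and the fixed fan-out of 4 are all illustrative assumptions.

```python
import random

def sample_step(prefix, step_size, rng):
    """Hypothetical stand-in for an LLM sampling `step_size` tokens."""
    return prefix + [rng.random() for _ in range(step_size)]

def step_difficulty(candidates):
    """Proxy for step difficulty: disagreement (variance) among the
    candidates' final values. A real system would score with a verifier
    or reward model."""
    lasts = [c[-1] for c in candidates]
    mean = sum(lasts) / len(lasts)
    return sum((x - mean) ** 2 for x in lasts) / len(lasts)

def dynamic_decompose(prefix, budget, step_size, rng, min_step=1):
    """Adaptively partition a trace: when a step looks hard (high
    disagreement), halve the step size and re-sample, spending more of
    the compute budget there; otherwise commit to the best candidate."""
    trace = list(prefix)
    while budget > 0:
        candidates = [sample_step(trace, step_size, rng) for _ in range(4)]
        budget -= 4
        if step_difficulty(candidates) > 0.05 and step_size > min_step:
            step_size = max(min_step, step_size // 2)  # subdivide the hard step
            continue
        trace = max(candidates, key=lambda c: c[-1])  # keep the best next step
    return trace
```

The key contrast with static decomposition is that `step_size` is not fixed in advance: it shrinks exactly where the candidates disagree, so challenging regions of the trace receive finer-grained partitioning and more samples.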

Integrated Optical-to-Optical Gain in a Silicon Photonic Modulator Neuron

Silicon photonic neural networks can achieve higher throughputs and lower latencies than digital electronic alternatives. However, recently reported implementations of such networks have lacked integrated signal gain, instead utilizing off-chip amplifiers or co-processors to complete the signal processing pipeline. Photonic neural networks without gain face substantial limitations in network depth and inter-layer fan-out. Here, we demonstrate a fully integrated silicon photonic modulator neuron capable of up to 14.1 dB gain, achieved by modeling and addressing self-heating behavior in our output PN-junction micro-ring modulator. We use our experimental neuron to emulate a small network subject to high loss, achieving superior accuracy on an automated modulation classification benchmark to that of an optimal linear system. Our high-gain neuron can serve as a building block vastly expanding the range of neural network architectures that can be implemented with silicon photonics.

NeurIPS 2025 in San Diego from November 30th to December 5th, 2025

NEC Laboratories America is heading to San Diego for NeurIPS 2025, where our researchers will present cutting-edge work spanning optimization, AI systems, language modeling, and trustworthy machine learning, including multi-agent coordination, scalable training, efficient inference, and techniques for detecting LLM-generated text.

Quantitative Bounds for Length Generalization in Transformers

We study the problem of length generalization (LG) in transformers: the ability of a model trained on shorter sequences to maintain performance when evaluated on much longer, previously unseen inputs. Prior work by Huang et al. (2025) established that transformers eventually achieve length generalization once the training sequence length exceeds some finite threshold, but left open the question of how large it must be. In this work, we provide the first quantitative bounds on the required training length for length generalization to occur. Motivated by previous empirical and theoretical work, we analyze LG in several distinct problem settings: error control vs. average error control over an input distribution, infinite-precision softmax attention vs. finite-precision attention (which reduces to an argmax) in the transformer, and one- vs. two-layer transformers. In all scenarios, we prove that LG occurs when the internal behavior of the transformer on longer sequences can be “simulated” by its behavior on shorter sequences seen during training. Our bounds give qualitative estimates for the length of training data required for a transformer to generalize, and we verify these insights empirically. These results sharpen our theoretical understanding of the mechanisms underlying extrapolation in transformers, and formalize the intuition that richer training data is required for generalization on more complex tasks.

Scalable Photonic Neurons for High-speed Automatic Modulation Classification

Automatic modulation classification (AMC) is becoming increasingly critical in the context of growing demands for ultra-wideband, low-latency signal intelligence in 5G/6G systems, with photonics addressing the bandwidth and real-time adaptability limitations faced by traditional radio-frequency (RF) electronics. This paper presents the first experimental photonic implementation of AMC, achieved through a fully functional photonic neural network built from scalable microring resonators that co-integrate electro-optic modulation and weighting. This work also represents a system-level deployment of such compact photonic neurons in a real photonic neural network, demonstrating the significant potential of photonic computing for large-scale, complex RF intelligence for next-generation wireless communication systems.

Neuromorphic Photonics-Enabled Near-Field RF Sensing with Residual Signal Recovery and Classification

We present near-field radio-frequency (RF) sensing using a microwave photonic canceler (MPC) for residual signal recovery, together with a neuromorphic photonic recurrent neural network (PRNN) chip and FPGA hardware that implement machine learning for high-bandwidth, low-latency classification.

Eric Blow Presents at the IEEE Photonics Conference Singapore on November 10th & 13th

Eric Blow of NEC Labs will address how machine-learning methods applied to distributed acoustic-sensing data can monitor facility perimeters and detect intrusion via walk, dig, or drive events over buried optical fiber, for example achieving ~90% classification accuracy.

Emerging Integrated Photonic Technologies Leveraging Multimaterial Integration for AI and Datacenter Applications

Since the inception of integrated photonics, multimaterial integration has served as a primary avenue for new technology innovations. Now, with an ever-increasing demand for integrated photonics as a platform for both high-performance links from/within datacenters and AI acceleration, multimaterial integration has begun to play an even more critical role in pushing capabilities beyond their current limits. In this work, we review photonics for AI and datacenter applications, the current landscape of multimaterial integration in photonics, and the ways in which multimaterial integration techniques have been recently utilized to push the performance of modulators on silicon and chip-scale optical frequency combs.

Quantitative Bounds for Length Generalization in Transformers

We provide quantitative bounds on the length of sequences required to be observed during training for a transformer to length generalize, i.e., to continue to perform well on sequences unseen during training. Our results improve on Huang et al. [8], who show that there is a finite training length beyond which length generalization is guaranteed, but who do not provide quantitative bounds on that length.

DISC: Dynamic Decomposition Improves LLM Inference Scaling (SSI-FM)

Inference scaling methods often rely on decomposing problems into steps, followed by sampling and selecting the best next steps. However, these steps and their sizes are typically fixed or depend on domain knowledge. We propose dynamic decomposition, a method that adaptively and automatically breaks down solution and reasoning traces into manageable steps during inference. By allocating compute more effectively, particularly by subdividing challenging steps and sampling them more frequently, dynamic decomposition significantly enhances inference efficiency. Experiments on benchmarks such as APPS, MATH, and LiveCodeBench demonstrate that dynamic decomposition outperforms static approaches, including token-level, sentence-level, and single-step decompositions. These findings highlight the potential of dynamic decomposition to improve a wide range of inference scaling techniques.