Open-SAT: LLM-Guided Query Embedding Refinement for Open-Vocabulary Object Retrieval in Satellite Imagery

In satellite applications, user queries often take the form of open-ended natural language, extending beyond a fixed set of predefined categories. This open-vocabulary nature poses significant challenges for retrieving relevant image tiles, as the retrieval system must generalize to a wide range of unseen objects and concepts. While vision-language models (VLMs) such as CLIP are widely used for text-image retrieval, even fine-tuned variants often struggle to accurately align such queries with satellite imagery. To address this, we propose Open-SAT, a training-free query embedding refinement algorithm that operates at inference time to improve alignment between user queries and satellite image content. Open-SAT uses VLMs to compute embeddings for image tiles, which are stored in a vector database for efficient retrieval. At query time, it leverages Large Language Models (LLMs) to refine the text embeddings by incorporating contextual information about objects of interest and their surroundings. A threshold-free retrieval mechanism further enhances accuracy and efficiency. Experimental results on three public benchmarks show that Open-SAT improves the F1 score by up to 16.04%, while retrieving a comparable number of image tiles. These results demonstrate the effectiveness of Open-SAT in open-vocabulary satellite image retrieval, leveraging LLM guidance without the need for additional training or supervision.
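
The refinement step described above can be illustrated with a minimal sketch. It assumes that the LLM-generated context has already been embedded by the same text encoder and that refinement is a simple convex blend of the query embedding with the mean context embedding; the blend rule, the `alpha` weight, and all function names are hypothetical and are not taken from Open-SAT. The threshold-free retrieval step is not covered here.

```python
import math


def normalize(vec):
    """Scale a vector to unit length; reject the zero vector."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def refine_query(query_emb, context_embs, alpha=0.5):
    """Blend the query embedding with the mean context embedding.

    Assumption: a convex combination with weight `alpha` stands in for
    whatever refinement rule the actual method uses.
    """
    q = normalize(query_emb)
    if not context_embs:
        return q
    ctx = [normalize(c) for c in context_embs]
    mean_ctx = [sum(c[i] for c in ctx) / len(ctx) for i in range(len(q))]
    return normalize([(1 - alpha) * a + alpha * b for a, b in zip(q, mean_ctx)])


def rank_tiles(query_emb, tile_embs):
    """Return tile ids sorted by cosine similarity to the query, best first."""
    q = normalize(query_emb)
    scores = {
        tile_id: sum(a * b for a, b in zip(q, normalize(emb)))
        for tile_id, emb in tile_embs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)
```

In a toy setting with two tiles, a query embedding that sits between them can be pulled toward the intended tile once a context embedding is blended in, which is the behaviour the abstract attributes to LLM guidance.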

Event Classification by Physics-Informed Inpainting for Distributed Multichannel Acoustic Sensor with Partially Degraded Channels

Distributed multichannel acoustic sensing (DMAS) enables large-scale sound event classification (SEC), but performance drops when many channels are degraded and when sensor layouts at test time differ from training layouts. We propose a learning-free, physics-informed inpainting frontend based on reverse time migration (RTM). In this approach, observed multichannel spectrograms are first back-propagated on a 3D grid using an analytic Green’s function to form a scene-consistent image, and then forward-projected to reconstruct inpainted signals before log–mel feature extraction and transformer-based classification. We evaluate the method on ESC-50 with 50 sensors and three layouts (circular, linear, right-angle), where per-channel SNRs are sampled from -30 to 0 dB. Compared with an AST baseline, scaling-sparsemax channel selection, and channel-swap augmentation, the proposed RTM frontend achieves the best or competitive accuracy across all layouts, improving accuracy by 13.1 points on the right-angle layout (from 9.7% to 22.8%). Correlation analyses show that spatial weights align more strongly with SNR than with channel–source distance, and that higher SNR–weight correlation corresponds to higher SEC accuracy. These results demonstrate that a reconstruct-then-project, physics-based preprocessing effectively complements learning-only methods for DMAS under layout-open configurations and severe channel degradation.
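
The back-propagate-then-forward-project idea can be sketched at a single frequency. The code below is a simplified matched-filter stand-in, not the RTM frontend of the paper: it assumes a free-space Green's function, one complex value per channel, a normalized correlation image, and a forward projection from the strongest grid point only. All function names are hypothetical, and grid points must not coincide with sensor positions.

```python
import cmath
import math


def green(src, rcv, k):
    """Free-space Green's function exp(-jkr) / (4*pi*r) between two 3D points."""
    r = math.dist(src, rcv)
    return cmath.exp(-1j * k * r) / (4 * math.pi * r)


def steering(point, sensors, k):
    """Green's function from one grid point to every sensor, plus its norm."""
    vec = [green(point, s, k) for s in sensors]
    norm = math.sqrt(sum(abs(g) ** 2 for g in vec))
    return vec, norm


def back_propagate(observed, sensors, grid, k):
    """Correlate the observations with each grid point's normalized steering vector."""
    image = []
    for point in grid:
        vec, norm = steering(point, sensors, k)
        image.append(sum(g.conjugate() * p for g, p in zip(vec, observed)) / norm)
    return image


def forward_project(image, sensors, grid, k):
    """Re-synthesize all channels from the strongest grid point (a simplification)."""
    best = max(range(len(grid)), key=lambda i: abs(image[i]))
    vec, norm = steering(grid[best], sensors, k)
    return [image[best] * g / norm for g in vec]
```

For clean observations from a source located on the grid, the normalized image peaks at the source by the Cauchy-Schwarz inequality, and the forward projection reproduces the observed channel values. With degraded channels that guarantee no longer holds, which is where the full method and its evaluation come in.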

Learning to Tune Optical WANs: A Field Deployment of Noise Models in Optical Networks

Accurately modeling optical signal transmission is critical for optimizing network performance, particularly in large-scale fiber optic networks operated by Internet Service Providers. In this work, we develop a Gaussian Noise model for a New York state ISP’s optical backbone. Our model accounts for all major network components, including amplifiers, fiber spans, reconfigurable optical add-drop multiplexers, and transceivers. By accurately predicting end-to-end signal-to-noise ratio, our model provides a foundation for network performance analysis and optimization. Then, we leverage hyperparameter search techniques, commonly used in machine learning, to identify amplifier gain settings that improve signal quality. By treating the model as an opaque box, we systematically search for amplifier configurations that maximize the predicted end-to-end SNR while maintaining practical network constraints. We validate our approach through a field deployment by applying optimized amplifier gain settings in a live ISP network. Our results show a significant improvement in optical signal quality, achieving a 2 dB increase in SNR on a single wavelength.
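
The opaque-box search can be illustrated with a toy model. The sketch below assumes a three-span link in which each amplifier setting is represented by the launch power it produces, and in which each span contributes a fixed ASE noise term plus a nonlinear term that grows with the cube of launch power. The coefficients, the function names, and the use of plain random search are illustrative assumptions, not the paper's model or search procedure.

```python
import math
import random

# Hypothetical per-span parameters (not measured values).
ASE_MW = [2e-4, 3e-4, 2.5e-4]   # amplifier noise power per span, in mW
ETA = [1e-3, 8e-4, 1.2e-3]      # nonlinear interference coefficient, in 1/mW^2


def predicted_snr_db(launch_dbm):
    """Toy end-to-end SNR: per-span inverse SNRs add up along the path."""
    inv_snr = 0.0
    for p_dbm, ase, eta in zip(launch_dbm, ASE_MW, ETA):
        p_mw = 10 ** (p_dbm / 10)
        inv_snr += (ase + eta * p_mw ** 3) / p_mw
    return -10 * math.log10(inv_snr)


def random_search(objective, baseline, low, high, trials=500, seed=0):
    """Opaque-box random search that never returns worse than the baseline."""
    rng = random.Random(seed)
    best_cfg, best_val = list(baseline), objective(baseline)
    for _ in range(trials):
        candidate = [rng.uniform(low, high) for _ in baseline]
        value = objective(candidate)
        if value > best_val:
            best_cfg, best_val = candidate, value
    return best_cfg, best_val
```

Calling `random_search(predicted_snr_db, [0.0, 0.0, 0.0], -10.0, 5.0)` returns a configuration whose predicted SNR is at least that of the baseline. Because the search only calls the objective, the toy function could be swapped for any other SNR predictor without changing the search code.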

Mix-CLAP: Adaptive Fusion of Knowledge-Distilled Audio Embeddings for Noise-Aware Audio-Language Models

Real-world deployment requires sound event and acoustic scene classification systems to remain reliable in noisy, diverse environments on resource-constrained devices. Although contrastive language-audio pretraining (CLAP) models with Transformer-based audio encoders achieve strong zero-shot performance, their computational cost hinders deployment. In this paper, we propose Mix-CLAP, a computationally efficient, noise-aware CLAP model with knowledge-distilled audio encoders. Our method includes: (1) a two-stage knowledge distillation from teacher embeddings to two lightweight student encoders (one on clean audio, the other on noisy audio), and (2) adaptive inference that combines their embeddings together with a fusion parameter and minimizes the parameterized entropy at test time. Experiments show that Mix-CLAP with MobileNetV3-based audio encoders greatly improves computational efficiency, while achieving a comparable average accuracy of 52.58% to the Transformer-based CLAP model at 52.83% on the recorded ESC50 datasets with different devices including microphones and fiber-optic distributed acoustic sensors under diverse conditions, making it suitable for real-world, resource-constrained applications.
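
The adaptive inference step can be sketched as a one-dimensional search over the fusion parameter. The code below assumes a linear blend of the two student embeddings, zero-shot scoring by cosine similarity against class text embeddings, and a coarse grid search in place of whatever optimizer the paper uses. The function names, the temperature, and the grid resolution are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    if na == 0 or nb == 0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)


def softmax(logits):
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats, skipping zero-probability terms."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def fuse_and_classify(clean_emb, noisy_emb, text_embs, temperature=0.1, steps=11):
    """Pick the fusion weight whose class distribution has the lowest entropy."""
    best = None
    for i in range(steps):
        lam = i / (steps - 1)
        fused = [lam * c + (1 - lam) * n for c, n in zip(clean_emb, noisy_emb)]
        probs = softmax([cosine(fused, t) / temperature for t in text_embs])
        h = entropy(probs)
        if best is None or h < best[0]:
            best = (h, lam, probs)
    return best[1], best[2]
```

The function returns the selected weight and the corresponding class probabilities. Entropy minimization of this kind favours whichever mix of the two encoders gives the most confident prediction for the current input.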

GNPy as a Benchmark for Open and Disaggregated Optical Networks

The evolution toward open and partially disaggregated optical networks has introduced new, to our knowledge, requirements on how transmission performance is evaluated and compared across technologies, vendors, and deployment scenarios. In this context, sound benchmarking practices are essential to ensure that quality-of-transmission (QoT) assessments are reproducible, transparent, and meaningful beyond isolated experimental demonstrations. QoT estimation plays a central role in these practices, as it directly impacts network planning, commissioning, automation, and long-term technology selection in heterogeneous optical infrastructures. This paper discusses benchmarking practices for optical transmission in open networks using the open-source GNPy library as a reference digital model. The contribution of this work lies in formalizing how a transparent, vendor-agnostic QoT estimator can be used as a common benchmarking baseline across research and industry. Representative experimental validations spanning short-reach, multiband, and multi-vendor flex-grid transmission scenarios are reviewed and reframed as benchmarking baselines, establishing evidence-based expectations on achievable accuracy and applicability limits under realistic operating conditions. Finally, the paper illustrates how reference QoT models are employed in industry-facing benchmarking workflows, including closed-loop interactions with standardization bodies, multi-vendor planning and automation, procurement processes and strategic network evolution toward emerging architectures.

Quantitative Bounds for Length Generalization in Transformers

We study the problem of length generalization (LG) in transformers: the ability of a model trained on shorter sequences to maintain performance when evaluated on much longer, previously unseen inputs. Prior work by Huang et al. (2024) established that transformers eventually achieve length generalization once the training sequence length exceeds some finite threshold, but left open the question of how large it must be. In this work, we provide the first quantitative bounds on the required training length for length generalization to occur. Motivated by previous empirical and theoretical work, we analyze LG in several distinct problem settings: error control vs. average error control over an input distribution, infinite-precision softmax attention vs. finite-precision attention (which reduces to an argmax) in the transformer, as well as for one- or two-layer transformers. In all scenarios, we prove that LG occurs when the internal behavior of the transformer on longer sequences can be “simulated” by its behavior on shorter sequences seen during training. Our bounds give qualitative estimates for the length of training data required for a transformer to generalize, and we verify these insights empirically. These results sharpen our theoretical understanding of the mechanisms underlying extrapolation in transformers, and formalize the intuition that richer training data is required for generalization on more complex tasks.

Agentic Placement of Microservices on the Computing Continuum

Deploying microservices across the computing continuum (edge–cloud) requires placement decisions that adapt to workload variation and heterogeneous infrastructure, yet existing solutions often rely on static policies or opaque heuristics. We present Bellona, a system for reliable and auditable Large Language Model (LLM)-driven workflow execution that combines a declarative specification language with a runtime that orchestrates tool calls, conditional control flow, and structured LLM reasoning. Using Bellona, we implement an agentic placement workflow that automatically recommends edge or cloud execution. The workflow uses structured prompts and verifiable tool interactions to (i) parse placement and latency-report instructions, (ii) update the latency log, and (iii) select placements based on measured latency improvement thresholds. We evaluate the resulting agent on two representative microservices-based video analytics applications (human-attributes detection and face recognition) over two days of varying workload. Across 1,440 placement decisions per service, the agent achieves accuracies of 94.66%/84.94% (human-attributes detection, Day1/Day2) and 80.91%/96.53% (face recognition, Day1/Day2) with GPT-4o; with GPT-5, accuracy increases to 98.82%/99.45% (human-attributes detection) and 99.31%/99.8% (face recognition). These results demonstrate that Bellona can support practical, self-improving agentic control for placement of microservices on the computing continuum.
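
Steps (ii) and (iii) of the workflow can be sketched as the kind of deterministic tool logic an agent might call. The code below assumes a sliding-window latency log per placement and a relative-improvement threshold for switching. The class, the function, the window size, and the 10% threshold are hypothetical and do not reflect Bellona's specification language or runtime.

```python
from collections import defaultdict, deque


class LatencyLog:
    """Keeps the most recent latency samples for each placement."""

    def __init__(self, window=5):
        self._samples = defaultdict(lambda: deque(maxlen=window))

    def update(self, placement, latency_ms):
        self._samples[placement].append(latency_ms)

    def mean(self, placement):
        samples = self._samples[placement]
        return sum(samples) / len(samples) if samples else None


def select_placement(log, current, min_improvement=0.10):
    """Switch placement only if the alternative is faster by the given fraction."""
    other = "cloud" if current == "edge" else "edge"
    cur, alt = log.mean(current), log.mean(other)
    if cur is None or alt is None or cur <= 0:
        return current  # not enough evidence to justify a move
    improvement = (cur - alt) / cur
    return other if improvement >= min_improvement else current
```

Keeping the threshold check in a plain function makes each recommendation reproducible from the log alone, which is one way to read the abstract's emphasis on verifiable tool interactions.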

Learning to Route: A Rule-Driven Agent Framework for Hybrid-Source Retrieval-Augmented Generation

Large Language Models (LLMs) have shown remarkable performance on general Question Answering (QA), yet they often struggle in domain-specific scenarios where accurate and up-to-date information is required. Retrieval-Augmented Generation (RAG) addresses this limitation by enriching LLMs with external knowledge, but existing systems primarily rely on unstructured documents, while largely overlooking relational databases, which provide precise, timely, and efficiently queryable factual information, serving as indispensable infrastructure in domains such as finance, healthcare, and scientific research. Motivated by this gap, we conduct a systematic analysis that reveals three central observations: (i) databases and documents offer complementary strengths across queries, (ii) naively combining both sources introduces noise and cost without consistent accuracy gains, and (iii) selecting the most suitable source for each query is crucial to balance effectiveness and efficiency. We further observe that query types show consistent regularities in their alignment with retrieval paths, suggesting that routing decisions can be effectively guided by systematic rules that capture these patterns. Building on these insights, we propose a rule-driven routing framework designed specifically for hybrid-source RAG. A routing agent scores candidate augmentation paths based on explicit rules and selects the most suitable one; a rule-making expert agent refines the rules using QA feedback to produce more comprehensive and reliable decision criteria; and a path-level meta-cache reuses past routing decisions for semantically similar queries to reduce latency and cost. Experiments on three QA datasets demonstrate that our framework consistently outperforms static strategies and learned routing baselines, achieving higher accuracy while maintaining moderate computational cost.
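
The routing agent and the path-level meta-cache can be sketched in a few lines. The code below assumes three candidate paths, a handful of hand-written regular-expression rules with weights, and token-set Jaccard overlap as a crude stand-in for semantic similarity. The rules, weights, path names, and threshold are all hypothetical; in the paper the rules are refined by a separate expert agent, which is not modelled here.

```python
import re

# The first path doubles as the default when no rule fires.
PATHS = ("documents", "database", "both")

# Hypothetical rules: (pattern, favoured path, weight).
RULES = [
    (r"\b(how many|average|total|latest|price)\b", "database", 2.0),
    (r"\b(why|explain|describe|background)\b", "documents", 2.0),
    (r"\b(compare|trend)\b", "both", 1.5),
]


def tokens(query):
    """Lowercased alphanumeric tokens of a query, as a set."""
    return frozenset(re.findall(r"[a-z0-9]+", query.lower()))


def score_paths(query):
    """Sum the weights of all rules that match the query, per path."""
    scores = {path: 0.0 for path in PATHS}
    for pattern, path, weight in RULES:
        if re.search(pattern, query.lower()):
            scores[path] += weight
    return scores


class Router:
    """Rule-scored routing with reuse of decisions for similar past queries."""

    def __init__(self, reuse_threshold=0.8):
        self.reuse_threshold = reuse_threshold
        self.cache = []  # list of (token set, chosen path)

    def route(self, query):
        q = tokens(query)
        for cached, path in self.cache:
            union = q | cached
            if union and len(q & cached) / len(union) >= self.reuse_threshold:
                return path, True  # cache hit: reuse the earlier decision
        scores = score_paths(query)
        path = max(PATHS, key=lambda p: scores[p])
        self.cache.append((q, path))
        return path, False
```

`route` returns the chosen path together with a flag indicating whether the decision came from the cache. A cache hit skips rule scoring entirely, which is the latency and cost saving the abstract attributes to the meta-cache.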

Uncertainty-Guided Latent Diagnostic Trajectory Learning for Sequential Clinical Diagnosis

Clinical diagnosis requires sequential evidence acquisition under uncertainty. However, most Large Language Model (LLM) based diagnostic systems assume fully observed patient information and therefore do not explicitly model how clinical evidence should be sequentially acquired over time. Even when diagnosis is formulated as a sequential decision process, it is still challenging to learn effective diagnostic trajectories. This is because the space of possible evidence-acquisition paths is relatively large, while clinical datasets rarely provide explicit supervision information for desirable diagnostic paths. To this end, we formulate sequential diagnosis as a Latent Diagnostic Trajectory Learning (LDTL) framework based on a planning LLM agent and a diagnostic LLM agent. For the diagnostic LLM agent, diagnostic action sequences are treated as latent paths and we introduce a posterior distribution that prioritizes trajectories providing more diagnostic information. The planning LLM agent is then trained to follow this distribution, encouraging coherent diagnostic trajectories that progressively reduce uncertainty. Experiments on the MIMIC-CDM benchmark demonstrate that our proposed LDTL framework outperforms existing baselines in diagnostic accuracy under a sequential clinical diagnosis setting, while requiring fewer diagnostic tests. Furthermore, ablation studies highlight the critical role of trajectory-level posterior alignment in achieving these improvements.

Leveraging Deployed Telecom Cables for Distributed Fiber Sensing Topologies and Applications

Distributed fiber optic sensing (DFOS) has emerged as a promising technology for wide-area monitoring by utilizing existing telecom cables as large-scale sensing media. This paper explores three sensing modalities, backscattering-based sensing, forward-transmission-based sensing, and hybrid sensing, and discusses their respective benefits, challenges, and application domains. Backscattering sensing demonstrates strong potential for applications such as road traffic monitoring, pavement condition assessment, intrusion detection, and cable damage prevention but is constrained in amplified dense wavelength division multiplexing (DWDM) networks. Forward-transmission sensing enables sensing over operational telecom links with in-line amplification, extending sensing reach, although it involves trade-offs in spatial resolution and localization accuracy. To address these challenges, a hybrid sensing architecture that integrates backscattering and forward-transmission techniques is introduced, achieving enhanced sensing distance while maintaining high sensitivity and localization performance. In addition, this work incorporates artificial intelligence (AI) through a locally adaptive anomaly detection (LAAD) framework based on self-supervised representation learning. By leveraging location-based pretext tasks and unlabeled data, the proposed AI approach enables efficient adaptation across heterogeneous fiber routes and operational environments, significantly reducing reliance on labeled data while improving cross-domain generalization. Field trials over deployed telecom networks validate the feasibility and effectiveness of the proposed sensing and AI framework, demonstrating scalable, telecom-compatible DFOS for real-world infrastructure monitoring and intelligent network operations.