Latest Stories
arXiv:2606.23699v1 Announce Type: new
Abstract: Instrumented bicycle studies have produced direct field evidence on vehicle passing behavior, but extracting overtaking events from continuous rear-facing video has remained dependent on manual, frame-by-frame annotation. This bottleneck constrains sample sizes and limits naturalistic cycling safety research. We present a geometry-informed computer vision pipeline that automates overtaking event detection from a single bicycle-mounted camera wi...
arXiv:2606.23698v1 Announce Type: new
Abstract: NVIDIA's Blackwell Ultra (B300) cuts FP64 vector throughput to ~1.3 TFLOPS per GPU, roughly 30x below B200 and well below the level at which bandwidth-limited FP64 workloads stay memory-bound. The Ozaki Scheme II framework recovers FP64-equivalent throughput by routing dense matrix multiply through FP8 tensor cores with a mantissa-sliced Chinese-remainder reconstruction. A companion Part (1) paper covers dense GEMM, batched GEMV, stencils, and ...
arXiv:2606.23697v1 Announce Type: new
Abstract: Semantic segmentation of code written in a C-family language remains a challenging problem, due to the language's complex syntax, macro expansion, and irregular structural patterns. Existing chunking methods, such as fixed-sized windows, heuristic splitting, and syntax-based tools, often fail to capture meaningful functional units, limiting the efficacy of retrieval and other downstream LLM driven tasks.
In this paper, we address the problem ...
arXiv:2606.23696v1 Announce Type: new
Abstract: Privacy obligations under GDPR increasingly shape software engineering. We synthesize 90 studies from 2018 to 2025 using a systematic review with thematic synthesis to chart privacy engineering. Thirteen dimensions form two recurrent cores: Privacy Enhancing Technologies (PETs) with Privacy Metrics (PM) and Verification and Testing (VT) and Governance and Accountability (GA) with Transparency and Communication (TC) and Organizational Measures (...
arXiv:2606.23695v1 Announce Type: new
Abstract: Retrieval-Augmented Generation (RAG) grounds Large Language Models in external knowledge, yet current evaluations rely on discrete heuristics that suffer from ''epistemic blindness'' - failing to distinguish genuine contextual information extraction from parametric memory recall. To address this, we introduce the Normalized Context Utilization (NCU) metric, leveraging continuous token log-probabilities across zero-shot, oracle, and adversarial ...
arXiv:2606.23694v1 Announce Type: new
Abstract: Graph-based text classification models typically rely on local neighborhood aggregation and overlook global community structure, despite semantic document graphs exhibiting strong class-consistent clustering. Ignoring this can blur class boundaries and lead to over-smoothing. We propose ModTGCN, a modularity-aware graph neural network for text classification that jointly optimizes cross-entropy and a modularity-based auxiliary objective to prom...
arXiv:2606.23693v1 Announce Type: new
Abstract: Text-to-SQL enables users to query databases using natural language by generating executable SQL queries. Recent methods have increasingly adopted Large Language Models based reinforcement learning (RL) to leverage execution feedback for training. However, existing RL methods assign uniform query-level rewards to all clauses in a SQL query, treating correct and incorrect clauses equally. This coarse-grained reward design leads to insufficient l...