Latest Stories
"Fix" MacBook Neo Cursor Lag: Record 1 Pixel of the Screen Every 10 Seconds
San Diego photologs from the 1970s
Show HN: An ASCII 3D Rendering Engine
Sakana Fugu: a multi-agent system delivered as one model
US AI stock sell-off shakes markets from Wall Street to Asia
LineShine Debuts at No. 1 as the TOP500 Enters a New Global Exascale Era
The Teensy Executable Revisited
A Geometry-Informed Computer Vision Method for Detecting and Examining Overtaking Vehicles From A Bicycle
arXiv:2606.23699v1 Announce Type: new Abstract: Instrumented bicycle studies have produced direct field evidence on vehicle passing behavior, but extracting overtaking events from continuous rear-facing video has remained dependent on manual, frame-by-frame annotation. This bottleneck constrains sample sizes and limits naturalistic cycling safety research. We present a geometry-informed computer vision pipeline that automates overtaking event detection from a single bicycle-mounted camera wi...
FP8 is All You Need (Part 2): Efficient Ozaki-Bailey Style FFT Through Tensor-core Garner Reformulation and Kulisch Escape Route
arXiv:2606.23698v1 Announce Type: new Abstract: NVIDIA's Blackwell Ultra (B300) cuts FP64 vector throughput to ~1.3 TFLOPS per GPU, roughly 30x below B200 and well below the level at which bandwidth-limited FP64 workloads stay memory-bound. The Ozaki Scheme II framework recovers FP64-equivalent throughput by routing dense matrix multiply through FP8 tensor cores with a mantissa-sliced Chinese-remainder reconstruction. A companion Part (1) paper covers dense GEMM, batched GEMV, stencils, and ...
SemChunk-C: Semantic Segmentation for C Code
arXiv:2606.23697v1 Announce Type: new Abstract: Semantic segmentation of code written in a C-family language remains a challenging problem, due to the language's complex syntax, macro expansion, and irregular structural patterns. Existing chunking methods, such as fixed-sized windows, heuristic splitting, and syntax-based tools, often fail to capture meaningful functional units, limiting the efficacy of retrieval and other downstream LLM driven tasks. In this paper, we address the problem ...
Privacy Engineering: A Systematic Literature Review
arXiv:2606.23696v1 Announce Type: new Abstract: Privacy obligations under GDPR increasingly shape software engineering. We synthesize 90 studies from 2018 to 2025 using a systematic review with thematic synthesis to chart privacy engineering. Thirteen dimensions form two recurrent cores: Privacy Enhancing Technologies (PETs) with Privacy Metrics (PM) and Verification and Testing (VT) and Governance and Accountability (GA) with Transparency and Communication (TC) and Organizational Measures (...
Quantifying Prior Dominance in RAG Systems
arXiv:2606.23695v1 Announce Type: new Abstract: Retrieval-Augmented Generation (RAG) grounds Large Language Models in external knowledge, yet current evaluations rely on discrete heuristics that suffer from "epistemic blindness", failing to distinguish genuine contextual information extraction from parametric memory recall. To address this, we introduce the Normalized Context Utilization (NCU) metric, leveraging continuous token log-probabilities across zero-shot, oracle, and adversarial ...
ModTGCN: Modularity-aware Graph Neural Networks for Text Classification
arXiv:2606.23694v1 Announce Type: new Abstract: Graph-based text classification models typically rely on local neighborhood aggregation and overlook global community structure, despite semantic document graphs exhibiting strong class-consistent clustering. Ignoring this can blur class boundaries and lead to over-smoothing. We propose ModTGCN, a modularity-aware graph neural network for text classification that jointly optimizes cross-entropy and a modularity-based auxiliary objective to prom...
EXPO-SQL: Execution-based Clause-level Policy Optimization for Text-to-SQL
arXiv:2606.23693v1 Announce Type: new Abstract: Text-to-SQL enables users to query databases using natural language by generating executable SQL queries. Recent methods have increasingly adopted Large Language Model-based reinforcement learning (RL) to leverage execution feedback for training. However, existing RL methods assign uniform query-level rewards to all clauses in a SQL query, treating correct and incorrect clauses equally. This coarse-grained reward design leads to insufficient l...
From Heuristics to Transformers: A Comprehensive Survey of Type Inference from Stripped Binaries
arXiv:2606.23692v1 Announce Type: new Abstract: The recovery of high-level type information from stripped binaries (executables devoid of symbol tables and debugging information) is a cornerstone of software reverse engineering, vulnerability analysis, and decompilation. This survey tracks the evolution of binary type inference from early rule-based heuristics and static analysis to modern deep learning architectures. We analyze the shift from "duck typing" and constraint-solving techniques (e...
Exact vs approximate second-order derivatives in vertically-integrated ice sheet models
arXiv:2606.23691v1 Announce Type: new Abstract: Second order derivatives of model outputs with respect to input parameters are key to several applications in ice sheet modelling. For example, the ability to compute Hessian-vector products broadens the list of available optimisation methods, and facilitates certain kinds of parametric uncertainty quantification. Some modern ice sheet models are built on frameworks supporting algorithmic differentiation (AD), allowing for the computation of hi...
Beyond the Autoregressive Horizon: A Comprehensive Survey of Diffusion Models, World Modelling, and State Space Models for Code
arXiv:2606.23690v1 Announce Type: new Abstract: Autoregressive (AR) language models have driven significant progress in automated software engineering, enabling powerful code generation and assistance systems. However, the next-token prediction paradigm introduces structural limitations for code reasoning, including restricted global planning, challenges in maintaining long-range dependencies, and limited grounding in program execution semantics. Noting the heavy skewness of existing literat...
Qwen-AgentWorld: Language World Models for General Agents
I Read the Palantir Manifesto
DiffusionBench: Towards Holistic Evaluation of Generative Diffusion Transformers