Researchers from Cornell and Google introduce a unified Regression Language Model (RLM) that predicts numerical outcomes directly from code strings: GPU kernel latencies, program memory usage, and even neural network accuracy and latency, all without hand-engineered features. The ~300M-parameter encoder-decoder, initialized from T5Gemma, uses a single text-to-number decoder that emits digits under constrained decoding and achieves strong rank correlations across diverse tasks and languages.
What is really new?
- Unified code-to-metric regression: a single RLM predicts (i) peak memory from high-level code (Python, C, C++, and more), (ii) latency for Triton GPU kernels, and (iii) the accuracy and hardware-specific latency of neural networks, by reading the text representation of their ONNX graphs and decoding the numbers directly. No feature engineering, graph encoders, or zero-cost proxies are required.
- Concrete results: reported correlations include Spearman ρ ≈ 0.93 on APPS/LeetCode memory, ρ ≈ 0.52 for Triton kernel latency, ρ > 0.5 on average across 17 CodeNet languages, and Kendall τ ≈ 0.46 across five classic NAS search spaces, in some cases surpassing graph-based predictors.
- Multi-objective decoding: because the decoder is autoregressive, the model conditions later metrics on earlier ones (e.g., accuracy → per-device latency), capturing realistic trade-offs along Pareto fronts.
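Once the model has sampled multiple (accuracy, latency) pairs per candidate, extracting the Pareto front is plain set filtering. The sketch below is not from the paper; the sampled values are invented for illustration, and it simply keeps the points that no other point dominates (higher accuracy and lower latency):

```python
# Toy Pareto-front filter over sampled (accuracy, latency) decodes.
# The sample values below are made up for illustration.

def pareto_front(points):
    """Keep points not dominated by any other (maximize acc, minimize lat)."""
    front = []
    for acc, lat in points:
        dominated = any(
            (a >= acc and l <= lat) and (a > acc or l < lat)
            for a, l in points
        )
        if not dominated:
            front.append((acc, lat))
    return sorted(front)

samples = [(0.72, 12.0), (0.75, 15.0), (0.70, 9.0), (0.75, 20.0), (0.68, 9.0)]
print(pareto_front(samples))  # → [(0.70, 9.0), (0.72, 12.0), (0.75, 15.0)]
```

Because the decoder emits accuracy before latency, each sampled pair is internally consistent, which is what makes a front built from such samples meaningful.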

Why is this important?
Compilers, GPU kernel selection, and NAS performance-prediction pipelines usually rely on bespoke features, syntax trees, or GNN encoders that are brittle to new ops and languages. Treating regression as next-token prediction over numbers simplifies the stack: tokenize the input as plain text (source code, Triton IR, ONNX), then decode calibrated numeric strings digit by digit with constrained sampling. This reduces maintenance cost and improves transfer to new tasks via fine-tuning.
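Constrained numeric decoding can be sketched in a few lines. The token format below (sign, three mantissa digits, exponent marker, exponent sign, one exponent digit) is a simplification assumed for illustration, not the paper's exact vocabulary; the point is that at each step only grammar-valid tokens receive probability mass:

```python
# Minimal sketch of constrained digit-by-digit numeric decoding.
# Token format here is assumed: <sign> d d d E <sign> d  (d.dd * 10^e).

import math
import random

DIGITS = [str(d) for d in range(10)]
SIGNS = ["+", "-"]

def allowed_tokens(position):
    """Which tokens the number grammar permits at each output position."""
    if position == 0:          # mantissa sign
        return SIGNS
    if 1 <= position <= 3:     # mantissa digits
        return DIGITS
    if position == 4:          # exponent marker
        return ["E"]
    if position == 5:          # exponent sign
        return SIGNS
    return DIGITS              # exponent digit

def sample_number(logits_fn, rng):
    """Sample a 7-token number; logits_fn stands in for the model."""
    tokens = []
    for pos in range(7):
        valid = allowed_tokens(pos)
        scores = logits_fn(tokens, valid)      # scores for VALID tokens only
        z = [math.exp(s) for s in scores]      # softmax over the valid set
        total = sum(z)
        r, acc = rng.random() * total, 0.0
        for tok, w in zip(valid, z):
            acc += w
            if r <= acc:
                tokens.append(tok)
                break
    return tokens

def decode_value(tokens):
    """Turn the token sequence back into a float."""
    sign = 1.0 if tokens[0] == "+" else -1.0
    mantissa = int("".join(tokens[1:4])) / 100.0          # d.dd
    esign = 1 if tokens[5] == "+" else -1
    return sign * mantissa * 10.0 ** (esign * int(tokens[6]))

uniform = lambda tokens, valid: [0.0] * len(valid)  # stand-in for real logits
toks = sample_number(uniform, random.Random(0))
print(toks, decode_value(toks))
```

Repeating `sample_number` yields a distribution over predicted values, which is how sampling doubles as an uncertainty estimate.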
Data and benchmarks
- Code-Regression dataset (HF): curated to cover APPS/LeetCode submission memory, Triton kernel latency (derived from KernelBook), and CodeNet memory footprints.
- NAS/ONNX suite: NAS-Bench-101 and related search spaces, serialized as ONNX text, for predicting accuracy and device-specific latency.
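The key input-side idea is that an architecture becomes just another string. The toy serializer below is invented for illustration (the actual work feeds ONNX's own textual graph representation); it only shows the flavor of flattening a small graph into operator text a language model can read:

```python
# Toy serialization of a tiny network graph into operator-list text,
# loosely in the spirit of feeding ONNX text to the model. The format
# (name = Op(inputs)[attrs]) is invented for this sketch.

graph = [
    ("conv1", "Conv",    ["input"], {"kernel": 3, "filters": 32}),
    ("relu1", "Relu",    ["conv1"], {}),
    ("pool1", "MaxPool", ["relu1"], {"kernel": 2}),
    ("fc1",   "Gemm",    ["pool1"], {"units": 10}),
]

def serialize(graph):
    """One line per node: name = Op(inputs)[sorted attrs]."""
    lines = []
    for name, op, inputs, attrs in graph:
        attr_s = ",".join(f"{k}={v}" for k, v in sorted(attrs.items()))
        lines.append(f"{name} = {op}({', '.join(inputs)})[{attr_s}]")
    return "\n".join(lines)

print(serialize(graph))
```

A text encoding like this is what lets one tokenizer handle source code, Triton IR, and architectures uniformly, with no per-domain graph encoder.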

How does this work?
- Backbone: an encoder-decoder initialized from a T5Gemma encoder (~300M parameters). Inputs are raw strings (code or ONNX text). Numbers are emitted as sign/exponent/mantissa digit tokens; constrained decoding enforces valid numerals, and sampling provides uncertainty estimates.
- Ablations: (i) language pretraining accelerates convergence and improves Triton latency prediction; (ii) decoder-based numeric emission outperforms an MSE regression head, even with y-normalization; (iii) the learned tokenizer is especially effective for ONNX operators; (iv) longer contexts help; (v) scaling to a larger Gemma encoder improves correlation, given adequate tuning.
- Training code: the regress-lm library provides text-to-text regression utilities, constrained decoding, and multi-task pretraining/fine-tuning recipes.
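The sign/exponent/mantissa target encoding can be illustrated with a small standalone tokenizer. The precision here (4 mantissa digits, 2-digit signed exponent) is an assumption for the sketch, not the paper's exact scheme:

```python
# Toy sign/exponent/mantissa digit tokenizer for regression targets.
# Precision choices (4 mantissa digits, 2 exponent digits) are assumed.

import math

def tokenize_float(x, mantissa_digits=4):
    """Encode a float as [sign, d, d, d, d, 'E', esign, e, e] tokens."""
    if x == 0:
        return ["+"] + ["0"] * mantissa_digits + ["E", "+", "0", "0"]
    sign = "+" if x > 0 else "-"
    x = abs(x)
    exp = math.floor(math.log10(x))
    mant = round(x / 10 ** exp * 10 ** (mantissa_digits - 1))
    if mant >= 10 ** mantissa_digits:   # rounding overflow, e.g. 9.9996 -> 10.00
        mant //= 10
        exp += 1
    esign = "+" if exp >= 0 else "-"
    return ([sign] + list(str(mant).zfill(mantissa_digits))
            + ["E", esign] + list(str(abs(exp)).zfill(2)))

print(tokenize_float(0.00731))   # a latency-like value in seconds
print(tokenize_float(-412.5))
```

Every target, whether megabytes of memory or milliseconds of latency, lands in the same small vocabulary, so one decoder head serves all tasks.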
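Multi-task training in this framing reduces to formatting (task, code, value) triples as text pairs. The sketch below is standalone and does not use regress-lm's real API; the task-prefix convention and field names are invented for illustration:

```python
# Hypothetical text-to-text example formatting for multi-task regression.
# Task prefixes ("<memory>", "<triton_latency>") are made up for this sketch.

def make_example(task, source_text, value):
    """Pair a task-prefixed code string with a serialized numeric target."""
    prompt = f"<{task}> {source_text}"
    target = f"{value:.3e}"            # numeric target serialized as text
    return {"input": prompt, "target": target}

examples = [
    make_example("memory", "def f(n):\n    return [0] * n", 8.2e4),
    make_example("triton_latency",
                 "@triton.jit\ndef add_kernel(x, y, out, N): ...", 7.31e-3),
]
for ex in examples:
    print(ex["input"][:30], "->", ex["target"])
```

Mixing such examples across tasks in one batch is what gives a single model its transfer behavior across metrics.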
Statistics
- APPS (Python) memory: Spearman ρ > 0.9.
- CodeNet (17 languages) memory: average ρ > 0.5; the strongest languages include C/C++ (~0.74–0.75).
- Triton kernel (RTX A6000) latency: ρ ≈ 0.52.
- NAS ranking: average Kendall τ ≈ 0.46 across NASNet, Amoeba, PNAS, ENAS, and DARTS spaces; competitive with FLAN and GNN baselines.
Key takeaways
- Unified code-to-metric regression works. A single ~300M-parameter, T5Gemma-initialized model (the “RLM”) predicts (a) memory from high-level code, (b) Triton GPU kernel latency, and (c) model accuracy plus device latency from ONNX text, with no hand-engineered features.
- Reported results: Spearman ρ > 0.9 on APPS memory, ≈ 0.52 on Triton latency, > 0.5 average across 17 CodeNet languages, and Kendall τ ≈ 0.46 across five NAS spaces.
- Numbers are decoded as text, under constraints. Instead of a regression head, the RLM emits numeric tokens with constrained decoding, enabling multi-metric autoregressive output (e.g., accuracy followed by multi-device latency) and uncertainty estimates via sampling.
- The Code-Regression dataset unifies APPS/LeetCode memory, Triton kernel latency, and CodeNet memory; the regress-lm library provides the training/decoding stack.
It is striking how this work reframes performance prediction as text-to-number generation: a compact T5Gemma-initialized RLM reads source code (Python/C++), Triton kernels, or ONNX graphs and emits calibrated numbers via constrained decoding. The reported correlations (APPS memory ρ > 0.9, Triton latency on an RTX A6000 ≈ 0.52, NAS Kendall τ ≈ 0.46) are strong enough to be useful for compiler heuristics, kernel pruning, and multi-objective search. The open dataset and library make replication straightforward and lower the barrier to fine-tuning on new hardware or languages.
Check out the paper, GitHub page, and dataset card.
Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of artificial intelligence for social good. His most recent endeavor is the launch of the artificial intelligence media platform Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable by a broad audience. The platform boasts over 2 million monthly views, reflecting its popularity among readers.