LLM Distillation Multi-Level Tutorial

Multi-token prediction technique triples LLM inference speed without auxiliary draft models

With reported 3x speed gains and limited degradation in output quality, the method targets one of the biggest pain points in production AI systems: latency at scale.

Researchers baked 3x inference speedups directly into LLM weights — without speculative decoding

Researchers from the University of Maryland, Lawrence Livermore, Columbia and TogetherAI have developed a training technique that triples LLM inference speed without auxiliary models or infrastructure ...

Ars Technica

Attackers prompted Gemini over 100,000 times while trying to clone it, Google says

On Thursday, Google announced that “commercially motivated” actors have attempted to clone knowledge from its Gemini AI chatbot by simply prompting it. One adversarial session reportedly prompted the ...

Inc

Hackers Are Hammering Google’s Gemini With Prompts to Steal the LLM. Every AI Company Should Be Worried

Google called the attacks “model extraction,” a process Medium defines as: “an attacker distills the knowledge from your expensive model into a new, cheaper one they control.” It’s becoming an ...

Northwestern Media

Red Sift Brings Expert-Level Security Analysis to Any Team with Free LLM

Radar Lite delivers prioritized email, domain and web security assessments with clear fix guidance in under a minute. LONDON, UNITED KINGDOM, January 12, 2026 ...

InfoWorld

Researchers propose a self-distillation fix for ‘catastrophic forgetting’ in LLMs

LLMs tend to lose prior skills when fine-tuned for new tasks. A new self-distillation approach aims to reduce regression and simplify model management. A new fine-tuning technique aims to solve ...

Bleeping Computer

Google says hackers are abusing Gemini AI for all attack stages

State-backed hackers are using Google's Gemini AI model to support all stages of an attack, from reconnaissance to post-compromise actions. Bad actors from China (APT31, Temp.HEX), Iran (APT42), North ...

TechRadar

AI malware, Gemini lures and more: Google reveals how hackers are actually using AI

GTIG finds threat actors are cloning mature AI models using distillation attacks. Sophisticated malware can use AI to manipulate code in real time to avoid detection. State-sponsored groups are creating ...

CSOonline

Google fears massive attempt to clone Gemini AI through model extraction

Google detected and blocked a campaign involving more than 100,000 prompts that it claimed were designed to copy the proprietary reasoning capabilities of its Gemini AI model, according to a quarterly ...

Nature

Machine learning articles from across Nature Portfolio

Machine learning is the ability of a machine to improve its performance based on previous results. Machine learning methods enable computers to learn without being explicitly programmed and have ...
