Google Dev: MaxText Expands Post-Training with the Introduction of SFT
MaxText enhances its post-training capabilities by introducing Supervised Fine-Tuning (SFT) for LLMs.
Harness Gemini Embedding 2 to create sophisticated agentic multimodal RAG systems for advanced AI applications.
Achieve a threefold increase in LLM inference speed by leveraging Google TPUs for optimized machine learning performance.
The release of Gemma 4 MTP signifies a potential advancement in AI model capabilities and architecture.
A detailed quality comparison of Qwen 3.6 27B quantizations, including BF16, explores performance trade-offs in large language models.
Achieve a significant speed-up in Large Language Model inference using Qwen 3.6 27B with the MTP optimization technique.
Learn why letting LLMs edit your .bib files can be detrimental and how to prevent it.
Showcasing Hallucinopedia, a new tool designed to effectively manage and curate information from AI models.
Explore how Gemma 4 achieves faster inference with innovative multi-token prediction techniques, boosting LLM performance.
A comprehensive guide to the data, compute, and architectural considerations involved in building your own Large Language Model.
Don't let massive LLMs cripple your compute budget: Intel's AutoRound quantization algorithm compresses models for efficient, performant inference.
A deep dive into Grok 4.3, x.ai's new model, dissecting its technical advancements, API changes, and what developers should know.
An AI code generator refusing requests or charging extra for specific keywords highlights opaque vendor policies and the hidden costs of AI tools.
Unpacking the hidden mechanics of how ChatGPT delivers ads, the attribution loop behind them, and what it means for developers, users, and the future of AI.