The Kaitchup – AI on a Budget
Tutorials
Run and Serve Faster VLMs Like Pixtral and Phi-3.5 Vision with vLLM
Understanding how much memory you need to serve a VLM
Sep 19 • Benjamin Marie
Multimodal RAG with ColPali and Qwen2-VL on Your Computer
Retrieve and exploit information from PDFs without OCR
Sep 16 • Benjamin Marie
GuideLLM: Is Your Server Ready for LLM Deployment?
Simulate real-world inference workloads with GuideLLM
Sep 12 • Benjamin Marie
Falcon Mamba, Jamba, RWKV... Can You Use Them on Your Computer?
A close look at quantization and parameter-efficient fine-tuning (LoRA/QLoRA) for SSMs, RWKV, and hybrid models
Sep 5 • Benjamin Marie
Run Qwen2-VL on Your Computer with Text, Images, and Video, Step by Step
Your local multimodal chat model
Sep 2 • Benjamin Marie
Run Llama 3.1 70B Instruct on Your GPU with ExLlamaV2 (2.2, 2.5, 3.0, and 4.0-bit)
Is ExLlamaV2 Still Good Enough?
Aug 29 • Benjamin Marie
Mistral-NeMo: 4.1x Smaller with Quantized Minitron
How Pruning, Knowledge Distillation, and 4-Bit Quantization Can Make Advanced AI Models More Accessible and Cost-Effective
Aug 26 • Benjamin Marie
Fine-tuning Phi-3.5 MoE and Mini on Your Computer
With code to quantize the models with bitsandbytes and AutoRound
Aug 22 • Benjamin Marie
QLoRA with AutoRound: Cheaper and Better LLM Fine-tuning on Your GPU
Bitsandbytes is not your only option
Aug 19 • Benjamin Marie
SmolLM: Full Fine-tuning and Aligning Tiny LLMs on Your Computer
With supervised fine-tuning and distilled DPO
Aug 8 • Benjamin Marie
Multi-GPU Fine-tuning for Llama 3.1 70B with FSDP and QLoRA
What you can do with only 2x24 GB GPUs, and a lot of CPU RAM
Aug 5 • Benjamin Marie
Serve Multiple LoRA Adapters with vLLM
Without any increase in latency
Aug 1 • Benjamin Marie