Your system is slow. Something is stuck. But what?
vmstat and iostat answer this question from different angles. vmstat watches where time goes—CPU, memory, I/O wait. iostat watches what the disks are doing—throughput, queue depth, utilization. Neither tool alone tells the whole story. Together, they triangulate bottlenecks.
vmstat: Where Is Time Going?
Run vmstat with an interval:
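```bash
vmstat 5
```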
This updates every 5 seconds. But here's something most people miss: the first line lies. It shows averages since boot—ancient history. Every line after that shows what's happening right now. Ignore line one.
The output tells you:
Procs (processes):
- r: Runnable processes waiting for CPU
- b: Blocked processes waiting on I/O
Memory:
- swpd: Swap space in use
- free: Idle memory
- buff/cache: Memory used for buffers and cache
Swap:
- si: Memory swapped in from disk (per second)
- so: Memory swapped out to disk (per second)
I/O:
- bi: Blocks read from disk (per second)
- bo: Blocks written to disk (per second)
CPU:
- us: User time (your applications)
- sy: System time (the kernel)
- id: Idle time
- wa: Waiting for I/O
- st: Stolen time (in VMs)
Reading vmstat: What Each Pattern Means
High r (runnable processes): CPU saturation. If r consistently exceeds your core count, processes are fighting for CPU time.
High b (blocked processes): I/O bottleneck. Processes are stuck waiting for disk or network.
Non-zero si/so (swap activity): Memory pressure. The system is shuffling data between RAM and disk. This murders performance. Any sustained swapping means you need more RAM or something is leaking memory.
High wa (I/O wait): The CPU is idle because it's waiting for I/O to complete. This is where vmstat and iostat work together—high wa tells you I/O is the problem, iostat tells you which device.
High st (steal time): In virtual machines, this means the hypervisor is giving your CPU time to other VMs. You're being throttled at the virtualization layer.
iostat: What Are the Disks Doing?
iostat requires the sysstat package:
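```bash
# Debian/Ubuntu
sudo apt install sysstat

# RHEL/Fedora
sudo dnf install sysstat
```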
Run with extended statistics:
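```bash
iostat -x 5
```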
The metrics that matter:
- r/s, w/s: Reads and writes per second
- rkB/s, wkB/s: Throughput in kilobytes per second
- avgqu-sz: Average queue depth (requests waiting)
- await: Average time requests spend waiting (milliseconds)
- %util: How busy the device is (0-100%)
Reading iostat: Identifying Disk Bottlenecks
%util near 100%: The disk is saturated. It can't handle more requests.
High await times: Requests are waiting too long. Either the disk is slow or the queue is backed up.
High avgqu-sz with plateaued throughput: You've hit the device's limit. More requests are arriving than it can process.
High utilization with low throughput: The disk is busy but not moving much data. This usually means random I/O—lots of small requests scattered across the disk. SSDs handle this better than spinning drives.
The Real Power: Cross-Referencing Both Tools
Run them side by side:
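A simple way is two terminals (or tmux panes), both sampling at the same interval so each line covers the same window:

```bash
# terminal 1
vmstat 5

# terminal 2
iostat -x 5
```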
Now you can triangulate:
vmstat shows high wa, iostat shows high %util: Confirmed disk bottleneck. The CPU is waiting because the disk can't keep up.
vmstat shows high wa, iostat shows LOW %util: The bottleneck isn't the disk. It's probably network I/O—NFS mounts, database connections to remote servers, network storage. Disk statistics won't show network waits.
vmstat shows high r, iostat shows low activity: CPU-bound workload. The disks aren't the problem; you need more CPU.
vmstat shows swap activity, iostat shows disk activity: Memory pressure is causing disk I/O. Adding RAM would reduce both problems.
Memory Deep Dive
For detailed memory statistics:
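```bash
# one-shot summary table; no interval needed
vmstat -s
```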
This prints a one-shot summary: current memory and swap totals alongside event counters accumulated since boot. Useful for understanding capacity, not real-time behavior.
To see active vs. inactive memory:
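```bash
# -a swaps the buff/cache columns for active and inactive memory
vmstat -a 5
```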
Active memory is in use right now. Inactive memory was recently used and can be reclaimed if needed.
Watching Specific Devices
Focus iostat on particular disks:
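```bash
# restrict output to specific devices (sda and sdb here are just examples)
iostat -x sda sdb 5
```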
Useful when you have multiple disks and need to identify which one is struggling.
Collecting Data for Analysis
Capture a specific time window:
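```bash
# 60 samples at 5-second intervals (5 minutes); the file names are arbitrary
vmstat 5 60 > vmstat-$(date +%F-%H%M).log &
iostat -x 5 60 > iostat-$(date +%F-%H%M).log &
wait
```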
Establish baselines during normal operation:
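```bash
# capture during a known-quiet period to compare against later
vmstat 5 120 > baseline-vmstat.log
iostat -x 5 120 > baseline-iostat.log
```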
When something goes wrong, compare current output to your baseline. Deviations point to what changed.
Historical Analysis with sar
The sar command (also from sysstat) logs statistics over time:
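```bash
# enable the periodic collector (service and timer names vary by distribution)
sudo systemctl enable --now sysstat

# or take ad-hoc samples: CPU statistics every 5 seconds, 12 times
sar -u 5 12
```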
Replay historical data:
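```bash
# read a past day's archive; the directory (/var/log/sa or /var/log/sysstat)
# and the day number in the file name (sa15) depend on your system
sar -u -f /var/log/sa/sa15 -s 03:00:00 -e 04:00:00
```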
This lets you analyze performance during past incidents—what was happening at 3 AM when alerts fired?
Quick Reference: Common Scenarios
| Symptom | vmstat shows | iostat shows | Likely cause |
|---|---|---|---|
| Slow everything | High wa | High %util | Disk bottleneck |
| Slow everything | High wa | Low %util | Network I/O bottleneck |
| Slow computation | High r | Low activity | CPU saturation |
| Periodic slowdowns | si/so activity | Spike in I/O | Memory pressure, swapping |
| VM performance issues | High st | Normal | Hypervisor contention |