VOD Deep Dive Part 6: Adaptive Bitrate — How Players Auto-Switch Quality

How ABR works under the hood: throughput-based, buffer-based (BBA), BOLA, MPC, and Pensieve algorithms. Plus practical engineering advice for bitrate ladders and short-form video.

zhuermu · · 20 min
vodstreamingabrbolaadaptive-bitrate

This is Part 6 of the VOD Streaming Deep Dive series.


Why ABR Matters

Imagine watching a show on the subway. Your connection fluctuates between 4G, weak 4G, and 5G:

  • If the app locks to 1080p → it freezes and buffers every time the signal dips.
  • If the app locks to 360p → you get a blurry image even when 5G is available.

The ideal: the app automatically adjusts quality based on network conditions, letting you watch without interruption. That’s ABR (Adaptive Bitrate).


The ABR Decision Loop

After downloading each segment, the player asks itself:

┌──────────────────────────┐
│  How fast did the last    │
│  segment download?        │
│  How much buffer remains? │
│  What's my bandwidth      │
│  estimate?                │
│  Can the device handle    │
│  the current tier?        │
└──────────────────────────┘


       ┌──────────┐
       │ Pick next │
       │ tier      │
       └──────────┘


     Download next segment

This loop runs every 1–4 seconds (triggered after each segment download).

Key signals: recent throughput, buffer level (seconds of pre-downloaded content), current bitrate, and the available bitrate ladder from the manifest.


Strategy 1: Throughput-Based

The most intuitive approach:

Estimated bandwidth = average download speed of recent N segments
Next tier bitrate ≈ estimated bandwidth × 0.8   (20% safety margin)

Example: Recent speed is 5 Mbps → 5 × 0.8 = 4 Mbps → from [360p(0.5) / 720p(2.5) / 1080p(5) / 4K(15)], pick the highest tier ≤ 4 Mbps → 720p.

Pros: Simple and intuitive.

Cons: HTTP/TCP throughput measurements are noisy (slow start, head-of-line blocking, TLS handshake). Leads to frequent quality oscillation — “clear one moment, blurry the next.”

Used in: early hls.js bandwidthFraction strategy.


Strategy 2: Buffer-Based (BBA)

A different philosophy: ignore throughput entirely, just look at buffer level.

Buffer < 10s  → download lowest tier (prevent stalling)
Buffer 10-30s → download mid tier
Buffer > 30s  → download highest tier (plenty of runway)
Bitrate tier

Max │                    ┌────────
    │                   ╱
Mid │              ┌───╱
    │             ╱
Min │────────────┘
    └─────────────────────────► Buffer (seconds)
    0   10s      30s      60s

Proposed by Huang et al. at Stanford/Netflix in their 2014 SIGCOMM paper A Buffer-Based Approach to Rate Adaptation.

Pros: Immune to throughput noise, stable quality (fewer switches).

Cons: Too aggressive at dropping quality during startup (buffer is initially empty); doesn’t fully utilize available bandwidth.


Strategy 3: BOLA (Lyapunov Optimization)

BOLA = Buffer Occupancy based Lyapunov Algorithm

Published in 2016 INFOCOM by Spiteri, Urgaonkar, and Sitaraman. One of the default algorithms in dash.js.

Core Idea

BOLA models ABR as a multi-objective optimization problem:

maximize:  average_quality  -  V × (stall_penalty + switching_penalty)

V is a trade-off parameter: larger V = more conservative (prioritize avoiding stalls); smaller V = more aggressive (prioritize quality).

Using Lyapunov optimization theory, the authors prove that looking only at the current buffer level yields near-optimal decisions.

Simplified Pseudocode

def bola_select_bitrate(buffer_level, bitrates, V):
    best_bitrate = None
    best_score = float('-inf')

    for r in bitrates:
        utility = log(r)  # Diminishing returns on quality
        score = buffer_level * utility - V * r
        if score > best_score:
            best_score = score
            best_bitrate = r

    return best_bitrate

In Practice

dash.js dynamically blends BOLA with throughput rules: use throughput-based during startup (buffer low), switch to BOLA once the buffer is healthy.


Strategy 4: MPC (Model Predictive Control)

Published in 2015 SIGCOMM by Yin, Jindal, Sekar, and Sinopoli.

Idea

Instead of deciding one segment at a time, predict future bandwidth and jointly optimize the next K segments:

Currently at segment n.
Predict bandwidth for the next 5 segments: T1, T2, T3, T4, T5.
Try all bitrate combinations and pick the one that maximizes:
  (total quality - stall penalty - switching penalty)

Bandwidth prediction typically uses harmonic mean (more sensitive to slow samples than arithmetic mean) or EWMA (exponentially weighted moving average).

Pros: Smarter than BOLA (looks ahead). Cons: Slightly more compute; garbage in, garbage out when predictions are wrong.


Strategy 5: Pensieve (AI-Driven ABR)

Published in 2017 SIGCOMM by Mao, Netravali, and Alizadeh (MIT).

Idea

Train a neural network with reinforcement learning (A3C) to make ABR decisions.

Input features: past K seconds of throughput, current buffer, last bitrate, remaining segments, available bitrate ladder.

Output: Which bitrate tier for the next segment.

Training: Simulate millions of sessions on real/synthetic network traces until the network learns an optimal policy.

Outperforms BOLA and MPC on specific test sets, but has limitations: generalization to unseen network patterns is uncertain, interpretability is low, and continuous online fine-tuning is needed.

Top-tier companies (Netflix, YouTube) have built similar learning-based ABR internally. For most platforms, BOLA or MPC is more than sufficient.


What the Industry Actually Does

  • Netflix: Client-side BBA-like strategy + server-side hints (“CDN is loaded, please downgrade”) + VMAF-optimized bitrate ladder.
  • YouTube: Custom algorithm incorporating watch-time prediction; extensive A/B testing.
  • Everyone else: Use the default ABR in open-source players (hls.js, Shaka Player, ExoPlayer) with parameter tuning.

Short-Form Video: Special ABR Considerations

Short-form vertical video differs from long-form VOD in key ways:

  • Time to First Frame (TTFF) is paramount — if users see no image within 300ms of swiping, they leave.
  • Episodes are short (60–90s): ABR may not have time to ramp up before the episode ends.
  • Pre-loading multiple episodes: The app may download segments for N+1 and N+2 in the background.

Common Adjustments

  • Lower startup tier (fast initial playback)
  • Slower upward switching (prevent stall right after startup)
  • Pre-load at low quality (save bandwidth for background downloads)
  • Currently playing episode gets high quality (foreground priority)

Common ABR Problems and Engineering Advice

Why does quality suddenly drop mid-playback?

Usually: bandwidth measurement dipped (cellular signal change, router channel switch), CDN edge node issue, or device thermal throttling.

Why is it stuck on low quality even on good Wi-Fi?

Startup strategy too conservative, throughput history contaminated by old slow samples, or the manifest doesn’t have high tiers (transcoding didn’t generate 1080p).

Why does quality oscillate constantly?

Bitrate ladder tiers too close together, algorithm oversensitive to throughput noise. Fix: space tiers at ~1.5× apart, add switching penalty, or switch from throughput-based to BOLA.

The most important ABR configuration isn’t the algorithm — it’s the bitrate ladder.

A bad ladder defeats any algorithm. Design principles:

  • Adjacent tiers should be 1.5–2× apart in bitrate (too close = meaningless switches; too far = jarring jumps)
  • Lowest tier must be low enough for 2G/weak 3G users
  • Highest tier shouldn’t waste bandwidth (1080p at 8 Mbps looks almost identical to 5 Mbps on a phone)

Per-title encoding customizes the optimal ladder per title.


Key Takeaways

  1. ABR = adaptive bitrate. After each segment, the player decides which tier to download next.
  2. Five classic strategies:
    • Throughput-based: Simple but noisy
    • Buffer-based (BBA): Stable but conservative
    • BOLA: Near-optimal, production-proven (dash.js default)
    • MPC: Looks ahead, smarter but prediction-dependent
    • Pensieve: AI-driven, used by top-tier companies
  3. The bitrate ladder matters more than the algorithm.
  4. Short-form video ABR must be tuned for TTFF and pre-loading.
  5. Tier spacing ~1.5–2×, low enough minimum, reasonable maximum.

Previous: Part 5: Streaming Protocols

Next: Part 7: CDN Distribution