Jellyfin Transcode Benchmarks
- Updated: added an AMD RDNA3 iGPU system using the Z1 Extreme (7840U), an Apple Silicon system with the M1 Pro (1st gen, one encoder), and an Apple M4 Mac mini.
- Primarily using 4K 120Mbps (Unlimited) and “720p” (actually 1080p) 8Mbps presets
- Use HDR tone mapping by default
- Using the default CRF of 23 for AVC and 28 for HEVC.
- I prefer subtitle burn-in since it eliminates subtitle sync issues in browsers. For HDR content I prefer direct streaming through 3rd-party apps, which deal with the subtitles themselves.
There’s a myth that the encoder and decoder alone determine transcoding efficiency. However, with modern 4K HDR videos and styled/image subtitles, transcoding is in many cases a combined workload for the CPU, GPU and media engine, especially on low-powered NAS CPUs. Here I have several test clips on my NAS, transcoded via Jellyfin installed on these machines:
- Low-powered machine with a 4 E-core Intel N100 CPU, 8GB DDR4 memory and the powerful Intel Gen12 media encoders.
- High-powered machine with a 16-core AMD Ryzen 9 7950X3D CPU, 64GB DDR5 memory and an NVIDIA RTX 4090 graphics card, eliminating any possible bottleneck on the CPU and GPU 3D engine.
- ROG Ally with AMD Ryzen Z1 Extreme, 16GB LPDDR5 memory and Radeon 780M RDNA3 GPU to simulate 7840U-class AMD mini PCs. I only did a quick test of JW4 and SW2 4Kto4K subtitle burn-in on this setup.
- Apple MacBook Pro 2021 with M1 Pro, 32GB LPDDR5 memory and 14-core Apple GPU (single encoder) to simulate M1 Mac minis.
- Apple Mac mini with M4, 16GB LPDDR5 memory and 10-core Apple GPU.
Takeaway
The result is somewhat unexpected: the Intel N100’s fast media engines are constantly underloaded when burning subtitles in Linux, and in virtually every scenario in Windows. In Windows, the ffmpeg process can easily saturate the CPU just pulling files via SMB. On the other hand, the Intel media engine, even fully loaded, is 3-4x slower than the latest NVENC, and that is only one of the two NVENC engines on the card.
While Intel’s low-power iGPU is still my go-to for always-on home media servers, a beefier setup is certainly useful if you want to serve multiple users simultaneously.
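To put "multiple users" into numbers, transcoding throughput can be converted into a rough count of simultaneous real-time streams. The sketch below is a back-of-the-envelope estimate, not a measurement. It assumes a 24fps source and that throughput divides evenly across sessions, neither of which was tested here. The function name `max_realtime_streams` is made up for illustration, and the fps figures are the SW2 4K to 1080p 8Mbps burn-in results listed further down.

```python
import math


def max_realtime_streams(transcode_fps: float, source_fps: float = 24.0) -> int:
    """Estimate how many streams can be transcoded at real-time speed or faster.

    Assumption: total throughput splits evenly between sessions, which is
    an idealized upper bound rather than something measured in this post.
    """
    if transcode_fps < 0 or source_fps <= 0:
        raise ValueError("fps values must be positive")
    return math.floor(transcode_fps / source_fps)


# SW2, 4K to 1080p 8Mbps, subtitle burn-in (fps taken from the result sections)
sw2_to_1080p = {
    "Intel N100 (Linux)": 94,
    "Intel N100 (Windows)": 52,
    "NVIDIA RTX 4090": 400,
}

streams = {name: max_realtime_streams(fps) for name, fps in sw2_to_1080p.items()}

for name, count in streams.items():
    print(f"{name}: about {count} simultaneous 24fps streams")
```

Under these assumptions the N100 on Linux covers roughly three such streams, which fits a small household, while the RTX 4090 has headroom for many more.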
The AMD setup is somewhere in between. It has a powerful 8-core Zen 4 CPU and a 12CU RDNA3 GPU, which eliminates any other bottlenecks. It is nearly 2x as fast as the N100 setup, though the media engine of the latter is not fully utilized due to the GPU bottleneck.
Apple Silicon is the most fascinating hardware here, and also a black box. It’s blazing fast and wildly efficient. On the same transcoding tasks, the M4 Mac mini yields over twice the framerate of the N100 system while consuming less than half the power. However, a persistent myth is somewhat validated: Apple’s Media Engine tends to output poor-quality footage. At the 8Mbps preset, its image quality barely outperforms Intel’s at 6Mbps. Apple’s Media Engine can output 1440p at presets of 8Mbps and beyond, while other systems typically only output 1080p or 4K. This may give it an edge at higher presets such as 15Mbps, or it may not, if the low quality is beyond redemption. One more thing worth mentioning: it is significantly faster at encoding HEVC than H.264.
For N100
- VAAPI makes no difference
- Low-power encoder is slightly faster
- VPP tone-mapping is useless unless you want some power saving
- Tone mapping and subtitle burn-in may cause a GPU 3D engine bottleneck and impact performance badly.
- Audio transcoding will also saturate the CPU, capping speed at 5-6x.
- Encoding in HEVC has a 10%-20% performance hit compared to H.264. However, it’s only a pain point if your browser supports HEVC and you want to watch 4K HDR tone-mapped content on your home network.
- For some videos FFmpeg has to probe through the entire video if one subtitle stream has no offset (common for ASS or SRT streams), so it is better to turn on burn-in for small videos only.
- Not able to handle 4K in Windows.
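The VPP trade-off in the list above can be checked against the JW4 4Kto4K burn-in figures from the Linux results further down (27fps at ~1W GPU and ~7W CPU with VPP, 43fps at ~1.8W GPU and ~10W CPU without). The sketch below is simple arithmetic under one assumption of mine: that GPU and CPU power can be added to approximate the total draw. The readings are approximate, so small differences should not be over-interpreted.

```python
def fps_per_watt(fps: float, gpu_watts: float, cpu_watts: float) -> float:
    """Frames per second delivered per watt of combined GPU + CPU power.

    Assumption: the two power readings are additive and cover the whole
    transcoding cost; the post only reports them separately.
    """
    total_watts = gpu_watts + cpu_watts
    if total_watts <= 0:
        raise ValueError("total power must be positive")
    return fps / total_watts


# JW4, 4Kto4K with subtitle burn-in on the N100 under Linux
VPP = {"fps": 27, "gpu_watts": 1.0, "cpu_watts": 7.0}
NON_VPP = {"fps": 43, "gpu_watts": 1.8, "cpu_watts": 10.0}

vpp = fps_per_watt(**VPP)
non_vpp = fps_per_watt(**NON_VPP)

# Relative power saved and relative speed lost by switching to VPP
power_saving = 1 - (VPP["gpu_watts"] + VPP["cpu_watts"]) / (
    NON_VPP["gpu_watts"] + NON_VPP["cpu_watts"]
)
speed_loss = 1 - VPP["fps"] / NON_VPP["fps"]

print(f"VPP:     {vpp:.2f} fps/W")
print(f"non-VPP: {non_vpp:.2f} fps/W")
print(f"VPP saves {power_saving:.0%} power but loses {speed_loss:.0%} speed")
```

With these inputs VPP lowers absolute power by about a third while giving up slightly more than a third of the speed, so it saves power without improving efficiency per frame.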
For Nvidia Cards
- The RTX 4090 can pull over 150W when transcoding and hit its maximum frequency.
- Since there is no 3D or CPU bottleneck, one NVENC should constantly hit full load. On GPUs with multiple NVENC engines, since this is a single session and NVENC split-frame encoding is for HEVC and AV1 only, one should not expect to see full utilization.
For AMD 7840U-like SoCs:
- No bottleneck on CPU or GPU
- Draws ~25W with the CPU at 3.2GHz (turbo boost off) and the GPU at 0.8-1.6GHz.
For Apple M1 family
- Slightly faster than the N100, with the same low level of power consumption.
- Not cost-effective, but certainly power-saving.
- Faster when encoding to HEVC.
For Apple M4
- Up to 2x faster than N100, while consuming significantly less power. Light years ahead in terms of efficiency.
- 1080p transcoding performance does not scale linearly. Also, the Media Engine prefers to output 1440p footage.
- Poor image quality makes low-bitrate presets barely usable.
- When bandwidth is not very limited and you are fine with the macOS bundled Jellyfin app, this chip is the prime choice for Jellyfin transcoding.
Intel N100 with Intel UHD Graphics, 24EUs@750MHz, Linux
JW4
Bitrate: 67.4Mbps
Purpose: Remux 4K HDR10
4Kto4K, VPP tone mapping, no sub burn-in: ~50fps
4Kto4K, non-VPP tone mapping, no burn-in: 57fps
4Kto4K, VPP tone mapping, burn-in: 27fps
- Lower CPU and GPU (both 3D and codec) usage
- ~1W GPU, ~7W CPU power
4Kto4K, non-VPP tone mapping, burn-in: 43fps
- ~1.8W GPU, ~10W CPU power
4Kto1080p, subtitle burn-in: 82fps
GunBuster
Bitrate: 11.7Mbps
Purpose: 1080 * 1440 video
Original: ~200fps
1080pto1080p 8Mbps: ~270fps
Mazouku
Bitrate: 3.7Mbps
Purpose: 1080p SDR WEBDL anime, with pre-burnt-in subtitles
Original: ~230fps
Madoka
Bitrate: 11.7Mbps
Purpose: High bitrate Bluray Anime
Original: 117fps
to 8Mbps: 114fps
SW2
Bitrate: 41.5Mbps
Purpose: High Bitrate Bluray 4K HDR10
4Kto4K, sub burn-in: ~50fps
4Kto4K, no subtitle: ~75fps
4Kto1080p 8Mbps, sub burn-in: 94fps
Euphoria
Bitrate: 22.7Mbps
Purpose: High Bitrate HDR10 TV show
4Kto4K, sub burn-in: ~50fps
4Kto4K, no sub: ~70fps
to 1080p 8Mbps, with sub: ~110fps
LongSeason (No subtitle)
Bitrate: 7.1Mbps
Purpose: Low bitrate WEBDL 4K HDR (DV)
4Kto4K: ~55fps (3D engine bottleneck, video render at ~60%)
to 1080p 6Mbps: ~130fps
Cyberpunk
Bitrate: 5.7Mbps
Purpose: 1080p HDR
Original, with sub burn-in: ~120fps
without sub burn-in: ~220fps
Intel N100 with Intel UHD Graphics, 24EUs@750MHz, Windows
Notes
- Windows 11 23H2 fresh install
- New Intel drivers break tone mapping, using driver 5085.
JW4
Bitrate: 67.4Mbps
Purpose: Remux 4K HDR10
4Kto4K, VPP tone mapping, no sub burn-in: ~30fps
4Kto4K, burn-in: 17fps
4Kto1080p, subtitle burn-in: ~40fps
GunBuster
Bitrate: 11.7Mbps
Purpose: 1080 * 1440 video
Original: ~200fps
1080pto1080p 8Mbps: ~210fps
Mazouku
- Bitrate: 3.7Mbps
- Purpose: 1080p WEBDL anime, with pre-burnt-in subtitles
- Original: ~160fps
Madoka
Bitrate: 11.7Mbps
Purpose: High bitrate Bluray Anime
Original, burn-in: 89fps
to 6Mbps: 92fps
SW2
Bitrate: 41.5Mbps
Purpose: High Bitrate Bluray 4K HDR10
4Kto4K, sub burn-in: 24fps
4Kto4K, no subtitle: 36fps
4Kto1080p 8Mbps, sub burn-in: 52fps
Euphoria
Bitrate: 22.7Mbps
Purpose: High Bitrate HDR10 TV show
4Kto4K, sub burn-in: ~20fps
4Kto4K, no sub: ~30fps
to 1080p 8Mbps, with sub: ~50fps
LongSeason (No subtitle)
Bitrate: 7.1Mbps
Purpose: Low bitrate WEBDL 4K HDR (DV)
4Kto4K: 11fps
to 1080p 6Mbps: 21fps
Cyberpunk
Bitrate: 5.7Mbps
Purpose: 1080p HDR
Original, with sub burn-in: 53fps
without sub burn-in: ~100fps
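The Linux and Windows sections above can be lined up to see how much the N100 loses under Windows. The sketch below only pairs 4K HDR tests that appear with the same label in both sections. The figures are copied from the results, the names are hypothetical, and the "~" approximations are treated as exact values.

```python
# (test label, Linux fps, Windows fps), copied from the two N100 sections
N100_RESULTS = [
    ("JW4 4Kto4K, VPP tone mapping, no burn-in", 50, 30),
    ("JW4 4Kto1080p, burn-in", 82, 40),
    ("SW2 4Kto4K, burn-in", 50, 24),
    ("SW2 4Kto4K, no subtitle", 75, 36),
    ("SW2 4Kto1080p 8Mbps, burn-in", 94, 52),
    ("Euphoria 4Kto4K, burn-in", 50, 20),
    ("Euphoria 4Kto4K, no sub", 70, 30),
    ("Euphoria to 1080p 8Mbps, with sub", 110, 50),
    ("LongSeason 4Kto4K", 55, 11),
    ("LongSeason to 1080p 6Mbps", 130, 21),
]


def linux_over_windows(results):
    """Map each test label to its Linux fps divided by its Windows fps."""
    return {label: linux / windows for label, linux, windows in results}


ratios = linux_over_windows(N100_RESULTS)

# Print from the smallest gap to the largest
for label, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{ratio:4.1f}x  {label}")
```

For these pairs Linux is faster in every case, and the low-bitrate LongSeason clip shows the widest gap by a large margin.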
NVIDIA RTX 4090, 16384 CUDA cores @2835MHz
JW4
Bitrate: 67.4Mbps
Purpose: Remux 4K HDR10
4Kto4K, tone mapping, no sub burn-in: ~200fps
4Kto1080p, subtitle burn-in: ~240fps
GunBuster
Bitrate: 11.7Mbps
Purpose: 1080 * 1440 video
1080pto1080p: ~800fps
1080pto1080p 8Mbps: ~900fps
Mazouku
- Bitrate: 3.7Mbps
- Purpose: 1080p WEBDL anime, with pre-burnt-in subtitles
- Original: 837fps
Madoka
- Bitrate: 11.7Mbps
- Purpose: High bitrate Bluray Anime
- Original, burn-in: ~650fps
- to 6Mbps: ~700fps
SW2
Bitrate: 41.5Mbps
Purpose: High Bitrate Bluray 4K HDR10
4Kto4K, sub burn-in: ~250fps
4Kto4K, no subtitle: ~260fps
4Kto1080p 8Mbps, sub burn-in: ~400fps
Euphoria
Bitrate: 22.7Mbps
Purpose: High Bitrate HDR10 TV show
4Kto4K, sub burn-in: ~250fps
4Kto4K, no sub: ~260fps
to 1080p 8Mbps, with sub: ~300fps
LongSeason (No subtitle)
Bitrate: 7.1Mbps
Purpose: Low bitrate WEBDL 4K HDR (DV)
4Kto4K: 370fps
to 1080p 6Mbps: 710fps
Cyberpunk
Bitrate: 5.7Mbps
Purpose: 1080p HDR
Original, with sub burn-in: 860fps
without sub burn-in: 910fps
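The "3-4x slower than NVENC" estimate from the takeaway can be checked against the 4K results without subtitle burn-in, where the N100 is least held back by subtitle rendering. The sketch below pairs the N100 Linux figures with the RTX 4090 figures. For JW4 it uses the non-VPP 57fps result, which is my choice of the N100's best case. Names are hypothetical.

```python
# test label -> (Intel N100 Linux fps, RTX 4090 fps), copied from the results
NO_SUBTITLE_4K = {
    "JW4 4Kto4K, tone mapping, no burn-in": (57, 200),
    "SW2 4Kto4K, no subtitle": (75, 260),
    "Euphoria 4Kto4K, no sub": (70, 260),
}

# How many times faster the RTX 4090 is on each test
speedups = {
    label: nvidia_fps / intel_fps
    for label, (intel_fps, nvidia_fps) in NO_SUBTITLE_4K.items()
}

for label, speedup in speedups.items():
    print(f"{label}: RTX 4090 is {speedup:.1f}x faster")
```

All three pairs land between 3x and 4x. With burn-in enabled the gap widens (for example SW2 at ~50fps versus ~250fps), which is consistent with the 3D engine bottleneck described for the N100.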
AMD Ryzen Z1 Extreme, 12CU RDNA3@0.8-2.7GHz
JW4
Bitrate: 67.4Mbps
Purpose: Remux 4K HDR10
4Kto4K, tone mapping, sub burn-in: 76fps
SW2
- Bitrate: 41.5Mbps
- Purpose: High Bitrate Bluray 4K HDR10
- 4Kto4K, sub burn-in: ~110fps
Apple M1 Pro, 14-core Apple GPU
JW4
Bitrate: 67.4Mbps
Purpose: Remux 4K HDR10
4Kto4K, tone mapping, sub burn-in: 48fps (to AVC), 70fps (to HEVC)
SW2
- Bitrate: 41.5Mbps
- Purpose: High Bitrate Bluray 4K HDR10
- 4Kto4K, sub burn-in: ~70fps (to AVC), ~90fps (to HEVC)
Apple M4, 10-core Apple GPU
JW4
Bitrate: 67.4Mbps
Purpose: Remux 4K HDR10
4Kto4K, tone mapping, sub burn-in, CPU only, AVC: 58fps
4Kto4K, tone mapping, sub burn-in, GPU, AVC: 61fps
4Kto4K, tone mapping, sub burn-in, GPU, Apple Video Toolbox tone mapping, AVC: 61fps
4Kto4K, tone mapping, sub burn-in, GPU, HEVC: 102fps
4Kto4K, tone mapping, sub burn-in, GPU, Apple Video Toolbox tone mapping, HEVC: 102fps
4Kto1080p, subtitle burn-in: ~140fps
SW2
- Bitrate: 41.5Mbps
- Purpose: High Bitrate Bluray 4K HDR10
- 4Kto4K, sub burn-in, HEVC: ~130fps
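The HEVC advantage on Apple Silicon can be quantified from the runs where both codecs were measured on the same clip. The sketch below uses the M1 Pro results for JW4 and SW2 and the M4 GPU result for JW4. The M4 SW2 run was HEVC only, so it is left out. Names are hypothetical.

```python
# (chip, clip, AVC fps, HEVC fps) for 4Kto4K with tone mapping and burn-in
APPLE_RESULTS = [
    ("M1 Pro", "JW4", 48, 70),
    ("M1 Pro", "SW2", 70, 90),
    ("M4", "JW4", 61, 102),
]


def hevc_speedup(avc_fps: float, hevc_fps: float) -> float:
    """Return how many times faster the HEVC encode ran than the AVC encode."""
    return hevc_fps / avc_fps


speedups = {
    (chip, clip): hevc_speedup(avc, hevc)
    for chip, clip, avc, hevc in APPLE_RESULTS
}

for (chip, clip), speedup in speedups.items():
    print(f"{chip} on {clip}: HEVC is {speedup:.2f}x the AVC framerate")
```

On these numbers the gain is roughly 1.3x to 1.5x on the M1 Pro and larger on the M4, the opposite of the 10%-20% HEVC penalty seen on the N100.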
Euphoria
Bitrate: 22.7Mbps
Purpose: High Bitrate HDR10 TV show
4Kto4K, sub burn-in:
4Kto4K, no sub:
to 1080p 8Mbps, with sub
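For anyone who wants to chart or compare these results, the result lines follow a loose "label: ~NNfps" pattern that is easy to parse. The sketch below is one way to do it, not a tool used for this post. `Result` and `parse_result` are hypothetical names, and lines with two figures (such as the M1 Pro AVC/HEVC entries) only yield the first one.

```python
import re
from typing import NamedTuple, Optional


class Result(NamedTuple):
    label: str         # test description before the colon
    fps: float         # reported frames per second
    approximate: bool  # True when the figure was written with a leading "~"


# Optional "- " bullet, a label up to the first colon, then "~50fps" or "50fps"
_RESULT_LINE = re.compile(
    r"^(?:-\s*)?(?P<label>.+?):\s*(?P<approx>~)?(?P<fps>\d+(?:\.\d+)?)\s*fps\b"
)


def parse_result(line: str) -> Optional[Result]:
    """Parse one result line, or return None for metadata and unfilled lines."""
    match = _RESULT_LINE.match(line.strip())
    if match is None:
        return None
    return Result(
        label=match["label"].strip(),
        fps=float(match["fps"]),
        approximate=match["approx"] is not None,
    )


sample_lines = [
    "4Kto4K, sub burn-in: ~50fps",
    "- Original: 837fps",
    "Bitrate: 67.4Mbps",     # metadata, not a result
    "4Kto4K, sub burn-in:",  # measurement not filled in yet
]
parsed = [parse_result(line) for line in sample_lines]
print(parsed)
```

Metadata lines such as the bitrate and entries without a measurement return `None`, so they can be filtered out before any comparison.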
Unless otherwise stated, all articles on this blog are licensed under CC BY-SA 4.0. Please credit the source when reposting!