Local LLM server test results

Models

  • Qwen family: 7B, 14B, 32B
  • Llama: 8B, 70B
  • Phi-4

Machines

  • Apple M4: 10C Apple GPU (family 9), 120GB/s memory
  • Apple M1 Pro: 14C Apple GPU (family 7), 200GB/s memory
  • NVIDIA RTX 4090 GPU: 128 Ada SMs, ~1000GB/s memory
  • AMD Ryzen R9 7950X3D CPU: 16C CPU, ~90GB/s memory
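Single-stream decoding is usually memory-bandwidth bound: generating each token streams the full set of model weights from memory once, so memory bandwidth divided by weight size gives a rough tokens/s ceiling for each machine. A minimal sketch of that estimate, assuming a 7B model quantized to roughly 4 bits (~4.5GB of weights, an assumed figure):

```python
def max_tokens_per_s(bandwidth_gb_s: float, weights_gb: float) -> float:
    """Rough decode ceiling: one full weight read per generated token."""
    return bandwidth_gb_s / weights_gb

# Memory bandwidths (GB/s) from the machine list above.
machines = {"M4": 120, "M1 Pro": 200, "RTX 4090": 1000, "7950X3D": 90}
q4_7b_gb = 4.5  # ~7B params at ~4-bit quantization (assumed size)

for name, bw in machines.items():
    print(f"{name}: ~{max_tokens_per_s(bw, q4_7b_gb):.0f} tok/s ceiling for a 7B Q4 model")
```

The M4's ~27 tok/s ceiling lines up with the measured 20-22 tok/s below, which supports the bandwidth-bound view; the 4090 and 7950X3D fall further below their ceilings, suggesting compute or software overheads matter more there.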

Tokens/s vs power

Power figures are chip power draw only.

Values are tokens/s.

| Machine       | 7B/8B | 14B | 32B | 70B | Power |
|---------------|-------|-----|-----|-----|-------|
| M4 16G        | 20-22 | 8   | -   | -   | 15W   |
| M1 Pro 32G    | 23-26 | 11  | 6   | -   | 21W   |
| RTX 4090 24G  | ~90   | 57  | 35  | -   | ~350W |
| 7950X3D 64G   | 15    | -   | -   | ~3  | ~100W |
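Dividing throughput by chip power gives an efficiency comparison for the 7B/8B column. A short sketch using the measured values from the table above (range midpoints taken where a range was reported):

```python
# (tokens/s, watts) for the 7B/8B column; tok/s are midpoints of reported ranges.
results = {
    "M4 16G":       (21.0, 15),
    "M1 Pro 32G":   (24.5, 21),
    "RTX 4090 24G": (90.0, 350),
    "7950X3D 64G":  (15.0, 100),
}

for name, (tok_s, watts) in results.items():
    print(f"{name}: {tok_s / watts:.2f} tok/s per W")
```

By this measure the Apple chips are far more efficient per watt, while the 4090 wins on raw throughput.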

All articles on this blog are licensed under CC BY-SA 4.0 unless otherwise stated. Please credit the source when republishing!