I see the rise in popularity of Linux laptops, where hardware compatibility is ready out of the box.
However, I wonder how I would build a PC right now with a budget to high-end specification. For now I'm thinking:
Case: does not matter
Fans: does not matter
PSU: does not matter
RAM: does not matter I guess?
Disks: does not matter I guess?
CPU: AMD / Intel - does not matter but I would prefer AMD
GPU: AMD / Intel / Nvidia - for gaming and Wayland: AMD; for AI, ML, CUDA and other technologies that get first-class support there: Nvidia.
And now the most confusing part for me: the motherboard... Is there even a coreboot or libreboot motherboard for a PC that supports "high end" hardware?
Let me also state the purpose of this Linux PC. Choose any of these:
Blender 3D Animation rendering
Gaming
Local LLM running
If you have some good resources on this, let me know as well.
Basically the only thing that matters for LLM hosting is VRAM capacity. Hence AMD GPUs can be OK for running LLMs, especially if a used 3090/P40 isn't an option for you. They work fine, and the 7900 XTX (24GB) and 6800/7800 XT (16GB) are pretty much the only sanely priced 24GB/16GB cards out there.
I have a 3090, and it's still a giant pain with Wayland, so much so that I use my AMD IGP for display output, and Nvidia still somehow breaks things. Hence I just do all my gaming in Windows TBH.
CPU doesn't matter for running LLMs, so cheap out with a 12600K, 5600, 5700X3D or whatever. And the single-CCD X3D chips are still king for gaming AFAIK.
Basically the only thing that matters for LLM hosting is VRAM capacity
I'll also add that some frameworks and backends still require CUDA. This is improving, but before you go and buy an AMD card, make sure the things you want to run will actually run on it.
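One quick sanity check, if the thing you want to run is built on PyTorch, is to ask PyTorch which GPU backend it was built for. This is only a rough sketch: `describe_backend` is a helper name made up for this post, and it relies on ROCm builds of PyTorch exposing the GPU through `torch.cuda` and setting `torch.version.hip`. It tells you PyTorch can see the card, not that any particular framework will work on it.

```python
def describe_backend(torch_module):
    """Return "cpu", "rocm" or "cuda" for the given PyTorch module.

    Hypothetical helper: takes the imported torch module as an argument
    so it can be exercised without a GPU.
    """
    # ROCm builds of PyTorch reuse the torch.cuda namespace, so this is
    # True for both Nvidia (CUDA) and AMD (ROCm) GPUs.
    if not torch_module.cuda.is_available():
        return "cpu"
    # torch.version.hip is a version string on ROCm builds and None otherwise.
    if getattr(torch_module.version, "hip", None):
        return "rocm"
    return "cuda"


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed")
    else:
        print("PyTorch backend:", describe_backend(torch))
```

If this prints "cpu" on a machine with a GPU, the installed PyTorch build most likely doesn't match the card (for example a CUDA wheel on an AMD system).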
For local LLM hosting, basically you want exllama, llama.cpp (and derivatives) and vLLM, and ROCm support for all of them is just fine. It's absolutely worth having a 24GB AMD card over a 16GB Nvidia one, if that's the choice.
The big sticking point I'm not sure about is flash attention for exllama/vLLM, but I believe the Triton branch of flash attention works fine with AMD GPUs now.
These days, there are amazing "middle sized" models like Qwen 14B, InternLM 20B and Mistral/Codestral 22B that are such a massive step over 7B-9B ones you can kinda run on CPU. And there are even 7Bs that support a really long context now.
IMO it's worth reaching for >6GB of VRAM if running LLMs is a consideration at all.
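To put rough numbers on the VRAM point, here is a back-of-the-envelope sketch. It only counts the model weights (parameters times bits per weight) and then adds a flat allowance for KV cache and runtime overhead. The function names and the 2 GiB allowance are assumptions for illustration, not measurements; real usage depends on the backend, quantization format and context length.

```python
def estimate_weight_vram_gib(params_billion, bits_per_weight):
    """Approximate VRAM needed for the weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits_in_vram(params_billion, bits_per_weight, vram_gib, overhead_gib=2.0):
    """Rough check: weights plus an assumed flat overhead fit on the card.

    overhead_gib is a guessed allowance for KV cache and runtime buffers;
    long contexts can need a lot more.
    """
    needed = estimate_weight_vram_gib(params_billion, bits_per_weight) + overhead_gib
    return needed <= vram_gib


if __name__ == "__main__":
    # Model sizes mentioned above, at 4-bit quantization and at 16-bit.
    for name, size_b in [("14B", 14), ("20B", 20), ("22B", 22)]:
        for bits in (4, 16):
            gib = estimate_weight_vram_gib(size_b, bits)
            print(f"{name} @ {bits}-bit: ~{gib:.1f} GiB of weights, "
                  f"fits in 24 GiB: {fits_in_vram(size_b, bits, 24)}")
```

By this estimate, a 4-bit 14B model needs about 6.5 GiB for weights alone, which is why the mid-sized models are out of reach for small cards once context is added.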
Yeah, AMD is lagging behind Nvidia in machine learning performance by like a full generation, maybe more. Similar with raytracing.
If you want absolute top-tier performance, then the RTX 4090 is the best consumer card out there, period. Considering the price and power consumption, this is not surprising. It's hardly fair to compare AMD's top-end to Nvidia's top-end when Nvidia's is over twice the price in the real world.
If your budget for a GPU is <$1600, the 7900 XTX is probably your best bet if you don't absolutely need CUDA. Any performance advantage Nvidia has goes right out the window if you can't fit your whole model in VRAM. I'd take a 24GB AMD card over a 16GB Nvidia card any day.
You could also look at an RTX 3090 (which also has 24GB), but then you'd take a big hit to gaming/raster performance, and it'd still probably cost you more than a 7900 XTX. I'm not really sure how a 3090 compares to a 7900 XTX in Blender. Anyway, that's probably a fairer comparison if you care about VRAM and price.