32-thread build on a budget

With a long-standing interest in tech in general, and computer hardware in particular, I am always very enthusiastic when I get to build something.
And twice as much when it has two CPUs. Not only is the whole process very enjoyable, from choosing and buying parts to assembling them, it is also a rather useful endeavor, and the result gets put to work (mostly rendering in Corona) from the moment it’s finished.

We use quite a few machines in the office, ranging from common 6-core i7s (a bunch of 3930K/4930Ks) to dual 10-core Xeons (2680 v2). The latter were originally intended as nodes but were repurposed as general workstations once I realized how enjoyable quick render tests can be when you don’t even need to wait for distributed rendering to kick in. This week I completed another dual Xeon build for a very affordable price, which this post will detail, and I am already looking forward to building another dual Xeon monster in the next two weeks, although of a very different sort.

Consider everything in this blog post to be my personal opinion, weighted by personal preferences, and not something set in stone or necessarily the best for you, although I strive to get as close to that as I can.
Since I live in the European market, and to keep this guide reasonably universal, prices are average after-market prices with the common 20% VAT included. Keep in mind that you may be able to build it cheaper depending on where you live and whether you buy through a business. With that in mind, let’s start.


A powerful 16-core/32-thread rendering node on a sub-900 euro budget.

Recently I came across, rather randomly, a Reddit post (link) detailing a flood of cheap Xeon units from decommissioned data centers (a rumor points towards Facebook). The phenomenon sprang up in October last year, and prices kept falling to their current minimum; they probably won’t go lower. There is no need: we’re talking about 8-core, 2.6 GHz (3.0 GHz effective clock thanks to all-core turbo) E5-2670 (v1) Xeons going for 65 euros apiece.

Their performance (a Cinebench R15 score of about 1000 points per CPU) puts them in the vicinity of 500 euro six-core i7s, but you can use two of them as a pair, making this an extremely budget-friendly, high-value option. Due to their lower single-thread performance, they don’t hold up as well as an i7 in workstation tasks, although they wouldn’t do a bad job there either. For this very reason, I only suggest them as the perfect budget node.
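To put that value claim into rough numbers, here is a back-of-the-envelope sketch in Python. It uses only the figures quoted above (65 euros and about 1000 R15 points per Xeon, 500 euros for a six-core i7) and assumes the i7 scores the same 1000 points, which is a rounding of "in the vicinity". It compares CPU prices only and ignores the extra cost of a dual-socket board.

```python
# Figures quoted in the post. The i7 score is an assumption: the post only
# says a single E5-2670 lands "in the vicinity" of a 500 euro six-core i7.
XEON_PRICE_EUR = 65
XEON_R15_POINTS = 1000
I7_PRICE_EUR = 500
I7_R15_POINTS = 1000  # assumed equal to one Xeon


def points_per_euro(points: float, price_eur: float) -> float:
    """Cinebench R15 points bought per euro spent on the CPU(s) alone."""
    return points / price_eur


# Two Xeons in one dual-socket machine versus a single six-core i7.
dual_xeon = points_per_euro(2 * XEON_R15_POINTS, 2 * XEON_PRICE_EUR)
single_i7 = points_per_euro(I7_R15_POINTS, I7_PRICE_EUR)

print(f"Dual E5-2670: {dual_xeon:.1f} points per euro")
print(f"Six-core i7:  {single_i7:.1f} points per euro")
print(f"Ratio:        {dual_xeon / single_i7:.1f}x")
```

On CPU price alone the pair of Xeons comes out several times ahead, which is the whole appeal of this build.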

If you feel uneasy about buying used CPUs (which I presume have seen about 3 years of use), don’t. Intel CPUs have, for quite some time now, been practically indestructible. All the more so given that these spent their former life in cooled server rooms, with no option of overclocking, running at conservative voltages and a low 115W consumption. In short, they will easily keep running for another 10 years. Mine arrived in the post looking like new.

So what are these? They’re first-generation LGA 2011 Sandy Bridge-EP units of C1/C2 stepping (“revision level”), so late production. Depending on the seller, you will be able to buy C1 (SR0H8), which lacks Intel VT-d (virtualization tech), or C2 (SR0KX), which is full-featured. For our purpose (3D rendering) the lack of virtualization doesn’t matter, but depending on your intended use you might want to look for a seller offering the latter. In my build I use C2/SR0KX. Both go for the same price.
Buying advice: You don’t need to buy these as a matched pair, but for peace of mind it’s ideal to get both from the same seller and in the same stepping. That way you are most likely to get CPUs from the same batch that saw the same length of use.


My build and your build

First things first: my particular build isn’t the very cheapest you can go for. Not only did I want to build just a single unit out of curiosity, I also heavily prefer the peace of mind of knowing that utmost reliability was the primary factor, not value, along with perfect acoustics, given that our PCs don’t reside in a separate room. This might seem to contradict the budget nature of this phenomenon a little, but the difference is small, and I will list all the available options so you can choose how cheap you personally want to go. I also already had some spare parts lying around, which dictated the choice of motherboard. So here it is:

2x Xeon E5-2670 v1, LGA 2011, C2 (SR0KX)
Asus Z9PE-D8 WS (motherboard)
4x 16GB DDR3 1600MHz DIMM*, Crucial ECC Registered (* choice of ECC or non-ECC will depend on your choice of motherboard)
256GB SSD Samsung 840 EVO
EVGA 760W GQ 80+ Gold
Nvidia GT 720, passively cooled
2x Noctua U14S
Fractal Arc XL

The price of this build came in under 900 euros (including the parts I already had) because I was able to source the motherboard rather cheaply, and that will be the biggest obstacle for you too. It requires a bit of patience and shopping around. This build is, after all, for adventurous types ;-).


What parts to choose

Motherboard “Cheapest one you can get”:
There are quite a few options here, both workstation-oriented and server-oriented. The difference is that server boards can be bought much cheaper, but may lack features such as USB 3.0, an audio chip, or enough PCI-Express slots. This matters very little for a node, but it does for a workstation, where you would have to buy additional PCI cards to supply the missing features. Lastly, they require ECC memory (registered or unbuffered) in most cases. Thankfully, it’s possible to buy extremely cheap unbuffered DDR3 ECC memory from decommissioned servers as well, going as low as 100 euros for 64GB of memory sticks, which is incredible value for the price. Some choices you can go with:

Workstation boards: Asus Z9PE-D8 WS. Expensive unless you can source it used or open-box. It is the best choice if you would like to use the machine as a workstation, and I bought it because I found it rather cheaply. This is the best board of the bunch: it has all the features you can imagine, plus a conventional consumer layout that allows any cooler type and size and a great multi-GPU setup.

Server boards: Asus Z9PA-D8, ASRock EP2C602 & EP2C602-4L/D16, Intel S2600CP family (S2600CP2 / S2600CP2J / S2600CPIOC), Supermicro C602 family (X9DAi, X9DRH-iF, etc.). Do a bit of shopping around; depending on where you live, only limited options might be available to you. Go for the cheapest you can find, there is quite a battle raging for them :-).

The boards I name (and link) above are all ATX, E-ATX, or SSI EEB/CEB in format (the latter being size- and layout-compatible with E-ATX, with the exception of two standoffs that you can ignore when mounting). There are quite a few more types, many of them proprietary or rack-specific, and these should be avoided unless you know how to put them to use (which may require owning a very specific case, or some modifications).

CPU Coolers
So what can keep two 115W Xeons running at good temperatures? Pretty much anything. A solid narrow heatsink with a 90-120mm fan from a reputable brand will do just fine, since they don’t output a lot of heat.

It’s important to check case and board compatibility. The case dictates the maximum height of the cooler (how far the tower heatsink or its fans can reach), but the board and its layout can also affect the size and type of cooler you can use. Compact ATX-sized server boards may feature the narrow socket layout, which requires a different mounting plate (see link for the difference), with memory slots running parallel instead of vertically (server board layout) and sitting very close to the CPU sockets. Bigger heatsinks (and their heat pipes) or fans could obstruct memory modules or get in each other’s way. Buying low-profile memory (most server memory is) helps, as do coolers whose heatsink sits high enough to clear such memory, but you might opt for compact coolers like the Noctua U12DX or U9DX, which are designed specifically for server boards, come with the correct mounting hardware (for both Square and Narrow ILM), and are guaranteed to fit.

With the conventional layout of a workstation board like the Asus Z9PE-D8 WS, which spaces the CPU sockets far apart from each other and from the memory slots, you have the luxury of using any cooler on the market. Since acoustics are important to me, I went for high-quality Noctua coolers, whose bearings have a great lifespan and which will run silently for years. The powerful 150mm fans of the Noctua U14S in my build (my E-ATX board and big tower have plenty of clearance for them) can spin at minimal RPM and be practically noiseless. For narrower cases, the U12S would be an equally good contender.

In either category there are a lot of cheaper options you can go with, although I am not versed enough to say which ones to endorse most. I might add a list here later.


Memory “Only the quantity you want matters”
Let’s start with the choice between ECC and non-ECC memory. While ECC (regardless of type) will go into any of the boards mentioned, non-ECC will only work in workstation-grade boards that state so on their website. To play it safe, buy cheap unbuffered ECC memory at auction: DDR3 UDIMM, 1333-1600MHz. The brand doesn’t matter, and neither do latency/timings.
Another thing to account for is memory channels. LGA 2011 is a quad-channel platform, and to make use of that, memory modules have to be installed in multiples of 4 (4x 8GB, for example, in the case of an i7). For dual-socket this doubles, since each CPU needs its own channels populated (although the CPUs do share the total capacity). So to use quad-channel in a dual-socket Xeon build you need at least 8 memory modules (8x 8GB sticks for 64GB, 8x 4GB for 32GB). Don’t worry about this too much, though, since the performance difference, at least in actual multi-threaded crunching, is minimal or downright non-existent. I couldn’t find a discernible difference between dual-channel and quad-channel across many back-to-back tests. So if you are only able to source 4 sticks, your build will run in dual-channel just fine.
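The channel arithmetic above can be sketched in a few lines of Python. This is only an illustration of the counting, assuming one stick per channel and sticks split evenly between the sockets; the function names are made up for this example.

```python
CHANNELS_PER_CPU = 4  # LGA 2011 is a quad-channel platform


def sticks_for_full_channels(sockets: int) -> int:
    """Minimum number of sticks to populate every channel on every CPU."""
    return sockets * CHANNELS_PER_CPU


def stick_size_gb(total_gb: int, sockets: int) -> float:
    """Size of each stick when the total capacity is spread over all channels."""
    return total_gb / sticks_for_full_channels(sockets)


def channels_used_per_cpu(sticks: int, sockets: int) -> int:
    """Channels populated per CPU, assuming an even split and one stick per channel."""
    return min(sticks // sockets, CHANNELS_PER_CPU)


print(sticks_for_full_channels(2))   # dual-socket build
print(stick_size_gb(64, 2))          # stick size for 64GB in total
print(stick_size_gb(32, 2))          # stick size for 32GB in total
print(channels_used_per_cpu(4, 2))   # what only 4 sticks give each CPU
```

With only 4 sticks in a dual-socket board, each CPU ends up with two populated channels, which is the dual-channel case described above.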

Hard drive
We’re living in the age of solid-state drives, and there’s good reason for that. They’re fast, with near-instant access times, and noise-free. They are also getting very affordable, to the point where I am using them in our file server as well.
For a node, 120GB will suffice. After all, it will only hold a few software installations and the files that get transferred during distributed network rendering. But if you want to play it safe, or be able to repurpose the machine as a workstation when necessary, opt for 180-256GB. The price difference is minimal. Buy the brand you prefer; for me that is the Samsung EVO series.

Graphics card
For a node, you need one only if your motherboard doesn’t have graphics built in, which most server boards do. Since mine did not (WS boards never do), I bought a cheap 30 euro passively cooled card. I still opted for the 2GB version, just in case I ever want to open a scene on this machine.
While this guide doesn’t delve into workstation build advice, there are a lot of choices currently on the market to satisfy everyone’s needs. I will prepare another article that focuses on the choice of GPU and features a comparison between popular options (from the 750 Ti to the Titan X / GTX 1080 and, of course, Quadros).

PSU
This is a bit tricky. A 2x115W machine will consume very little, but the two CPUs require two 8-pin EPS connectors (don’t confuse them with 6+2-pin PCI-E connectors, they’re not interchangeable), which common consumer-grade ATX PSUs often only offer at higher capacities, roughly from 760W up.
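As a rough illustration of why the connector count, not the wattage, ends up being the deciding factor, here is a small Python sketch. Only the 115W per CPU comes from this post. The 100W for the rest of the system and the 1.5x headroom factor are assumptions picked for the example, and the two PSUs are hypothetical units, not real products.

```python
from dataclasses import dataclass

CPU_POWER_W = 115        # per E5-2670, as quoted in the post
CPU_COUNT = 2
REST_OF_SYSTEM_W = 100   # assumption: board, RAM, SSD, fans, passive GPU
HEADROOM = 1.5           # assumption: keep the PSU well below full load


@dataclass
class Psu:
    name: str
    watts: int
    eps_8pin: int  # number of 8-pin EPS (CPU power) connectors


def estimated_load_w() -> int:
    """Very rough full-load estimate for the whole node."""
    return CPU_COUNT * CPU_POWER_W + REST_OF_SYSTEM_W


def is_suitable(psu: Psu) -> bool:
    """A PSU needs enough wattage AND one EPS connector per CPU."""
    enough_power = psu.watts >= estimated_load_w() * HEADROOM
    enough_eps = psu.eps_8pin >= CPU_COUNT
    return enough_power and enough_eps


# Hypothetical units for illustration only.
candidates = [
    Psu("hypothetical 550W, one EPS", 550, 1),
    Psu("hypothetical 760W, two EPS", 760, 2),
]

for psu in candidates:
    print(psu.name, "->", "ok" if is_suitable(psu) else "not suitable")
```

Under these assumptions the smaller unit has plenty of wattage and still fails, simply because it cannot power the second CPU.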

I personally never cheap out on the PSU; in my humble opinion it is the single most important part. I always go for 80+ Gold units from reputable vendors (Seasonic, Super Flower, Enermax, SilverStone, EVGA, the CWT- and Seasonic-made Corsair series, etc.). Not because of efficiency, which matters little here, but because of the quality of internals like capacitors, which contributes to the high reliability of the whole node, something that will matter to you when it is crunching 24/7 while you’re away on vacation for a month. A good PSU can last many years. There is no need to go high-end here, but buy at least a Bronze unit from a reputable vendor (Seasonic and Super Flower are my top picks, but Corsair (Seasonic- or CWT-made), EVGA, SilverStone, be quiet!, etc. are equally good choices); you can get them for well under 100 euros.

Case, last but not least

There is only one thing to look for in a part that’s otherwise judged by personal preference. The boards for this build range between ATX and E-ATX in size (SSI CEB being compatible with the E-ATX standard). While a big tower can take all of these, mid towers vary, with some having the option for E-ATX and others not. A few of those that don’t list it can still fit it, but you have to do the research yourself.

I personally chose the Fractal Arc XL, which is a proper big tower, solid and good-looking, with the high build quality that is standard for Fractal, yet at a lower price than the Define XL series (100 vs 140 euros).

If your server board is ATX-sized only (like the Asus Z9PA-D8), you can go for cases as cheap as 40 euros. Do consider, though, that more expensive cases already include up to 3 fans in the price, often making for better value.


General building notes

You can see from the pictures that I always go for a horizontal push-pull airflow setup, with fans mounted only at the front and back of the case. This is the quieter option, and I run it even in the hottest machines we have (an i7 clocked at 4.5GHz with a Titan X) without any issue. I am very sensitive to any kind of PC noise and like my machines to be practically inaudible.
I also strip the case of all drive cages; there will be no DVD drive or HDD. The 2.5” SSD can be mounted at the back of the case, leaving everything nice and tidy, with superior airflow.
I regulate the fans to run at low RPM through voltage (5V for case fans, which yields roughly 500 rpm, down from 1200 at 12V) while still maintaining temperatures below 55 degrees Celsius during crunching.
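The 5V figure lines up with a simple proportional estimate, sketched below in Python. The linear relation between voltage and speed is an assumption for illustration only; real fans deviate from it, and some will not start at all at low voltage.

```python
def estimate_rpm(rated_rpm: float, volts: float, rated_volts: float = 12.0) -> float:
    """Rough fan speed at reduced voltage, assuming speed scales linearly with voltage."""
    return rated_rpm * volts / rated_volts


# A 1200 rpm case fan (rated at 12V) run at 5V, as in this build.
print(round(estimate_rpm(1200, 5)))
```

For a 1200 rpm fan this gives 500 rpm at 5V, in line with what I see on my case fans.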

Benchmarks

Rather solid and self-explanatory. Not bad for something at this price. I would just add that Cinebench (link) misreports two things. The CPUs are in fact running at an all-core turbo of 3.0 GHz (2.99...), which the Corona benchmark (link) reported correctly, and I use Windows 10, not 8.
The score of 2000 isn’t the highest you can achieve today, but it is very favourable considering that for a similar price you can only build a 4-core/8-thread i7, which even when overclocked to 4.5+ GHz doesn’t reach half the multi-threaded performance of this build.
While I built this node mostly out of interest in this phenomenon, it’s a nice addition to my ever-growing (but with fewer and fewer machines) farm, currently approaching 16 000 cumulative R15 points.

R15

CoronaBench

Last words

This is not an all-encompassing guide, rather just a general commentary on an interesting build I did. It doesn’t dive into extensive comparisons and in-depth explanations, which would only further confuse most people; that is something I plan for a different occasion. I might even proofread it one day (after I am done revising it). For any questions, please feel free to ask under the post on Facebook, or better yet in the Corona forum thread in the Hardware section (link).