
HPC Server Clusters: Build High-Performance Systems with Refurbished Hardware

High-Performance Computing (HPC) server clusters power cutting-edge AI, simulations, and data-intensive workloads. Whether you're training machine learning models, running financial simulations, or supporting scientific research, a well-planned HPC infrastructure ensures maximum performance and cost efficiency.



This article provides a step-by-step approach, covering:


  • Hardware selection (GPUs, CPUs, memory, storage, networking)

  • Performance metrics (teraflops, bandwidth, latency)

  • Refurbished vs. new options for cost efficiency

  • Deployment and optimization best practices


Build a high-performance HPC cluster with expert guidance from server-parts.eu. Learn how to choose the best GPUs, CPUs, memory, and storage for AI, machine learning, and scientific computing. Compare new vs. refurbished HPC servers to maximize performance and cost efficiency. Discover key performance metrics like teraflops, bandwidth, and latency to optimize your enterprise HPC infrastructure.

 

Step 1: Define HPC Server Cluster Requirements


Before choosing hardware, outline your workload needs.


Key Questions to Answer:

✔ What is the primary use case? (AI, simulations, rendering, etc.)

✔ How much compute power do you need? (FLOPS, core count, memory bandwidth; see the sizing sketch below the table)

✔ Will you run workloads on-premise or use cloud HPC?

✔ What is your budget, and are you open to refurbished hardware?

✔ How will you handle cooling and power requirements?

| HPC Workload Type | Recommended Hardware |
|---|---|
| Deep Learning & AI | NVIDIA H100, A100, AMD MI300 |
| Scientific Computing | Intel Xeon Platinum, AMD EPYC, NVIDIA A40 |
| Financial Modeling | NVIDIA A30, V100, or AMD Instinct GPUs |
| Rendering & Simulation | RTX 6000 Ada, Quadro GPUs |
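
To turn the compute question above into a concrete node count, here is a minimal Python sketch that estimates cluster size from a target throughput. The target, per-GPU peak, and utilization factor are assumptions to replace with your own workload measurements.

```python
# Rough cluster-sizing sketch: how many GPU nodes does it take to hit a
# target sustained FP16 throughput? All input values are hypothetical examples.

TARGET_PFLOPS_FP16 = 10.0      # desired sustained throughput (assumption)
GPU_PEAK_TFLOPS_FP16 = 312.0   # dense FP16 peak of a single A100-class GPU
GPUS_PER_NODE = 4              # typical refurbished 4-GPU node
UTILIZATION = 0.45             # sustained fraction of peak in practice (assumption)

sustained_tflops_per_node = GPU_PEAK_TFLOPS_FP16 * GPUS_PER_NODE * UTILIZATION
nodes_needed = TARGET_PFLOPS_FP16 * 1000 / sustained_tflops_per_node

print(f"Sustained throughput per node: {sustained_tflops_per_node:.0f} TFLOPS")
print(f"Nodes needed for {TARGET_PFLOPS_FP16:.0f} PFLOPS: {nodes_needed:.1f}")
```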


 

Step 2: Select Compute Hardware for HPC Server Clusters


1. CPU Selection: Intel vs. AMD

HPC clusters require high core count CPUs optimized for parallel processing.

| Feature | Intel Xeon | AMD EPYC |
|---|---|---|
| Max Cores | 60 (Xeon Sapphire Rapids) | 96 (EPYC Genoa), 128 (EPYC Bergamo) |
| Memory Bandwidth | 480 GB/s | 460 GB/s |
| PCIe Lanes | 80+ | 128 |
| Power Efficiency | Lower | Higher |
| HPC Use Cases | Scientific computing, traditional workloads | AI, data analytics, virtualization |

Best choice: AMD EPYC for AI & data workloads, Intel Xeon for traditional HPC.
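
For a back-of-the-envelope comparison of the two CPU families, theoretical peak FP64 throughput is cores × clock × FLOPs per cycle. A minimal sketch follows; the clock speeds and FLOPs-per-cycle figures are assumptions, not vendor-verified specs for any particular SKU.

```python
# Theoretical peak FP64 throughput per CPU socket:
#   peak = cores * clock (GHz) * FP64 FLOPs per cycle per core
# The clock and FLOPs-per-cycle values below are illustrative assumptions;
# check vendor documentation for the exact SKUs you plan to buy.

def peak_fp64_tflops(cores: int, clock_ghz: float, flops_per_cycle: int) -> float:
    return cores * clock_ghz * flops_per_cycle / 1000

cpus = {
    "Intel Xeon (60 cores, assumed 2.0 GHz, 32 FLOPs/cycle)": peak_fp64_tflops(60, 2.0, 32),
    "AMD EPYC (96 cores, assumed 2.4 GHz, 16 FLOPs/cycle)":   peak_fp64_tflops(96, 2.4, 16),
}

for name, tflops in cpus.items():
    print(f"{name}: ~{tflops:.1f} TFLOPS FP64 peak")
```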


 

2. GPU Selection: NVIDIA vs. AMD

GPUs significantly accelerate AI, ML, and parallel workloads.

| GPU Model | Memory | TFLOPS (FP64) | Bandwidth | Use Case |
|---|---|---|---|---|
| NVIDIA H100 | 80GB HBM3 | 60 | 3.3 TB/s | AI/ML, HPC |
| NVIDIA A100 | 40/80GB HBM2 | 19.5 | 2.0 TB/s | AI, cloud HPC |
| AMD MI300 | 128GB HBM3 | 47 | 5.3 TB/s | AI, simulations |
| NVIDIA A40 | 48GB GDDR6 | 7.1 | 696 GB/s | Visualization, VDI |
| RTX 6000 Ada | 48GB GDDR6 | 5.8 | 960 GB/s | Engineering, CAD |

Best choice: NVIDIA H100 for AI, AMD MI300 for memory-heavy workloads.
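
A simple way to weigh the GPUs in the table above for a refurbished build is performance per euro. The sketch below ranks models by FP64 TFLOPS per euro; all prices are hypothetical placeholders, not quotes.

```python
# Rank GPU options by FP64 TFLOPS per euro spent.
# TFLOPS and bandwidth follow the table above; prices are hypothetical
# placeholders to be replaced with actual refurbished quotes.

gpus = [
    # (model, fp64_tflops, bandwidth_tb_s, assumed_price_eur)
    ("NVIDIA H100", 60.0, 3.3, 28000),
    ("NVIDIA A100", 19.5, 2.0, 12000),
    ("AMD MI300",   47.0, 5.3, 20000),
    ("NVIDIA A40",   7.1, 0.696, 4500),
]

for model, tflops, bw_tb_s, price in sorted(gpus, key=lambda g: g[1] / g[3], reverse=True):
    print(f"{model:12s} {tflops / price * 1000:5.2f} TFLOPS/kEUR, "
          f"{bw_tb_s / price * 1000:5.3f} TB/s per kEUR")
```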


 

3. Memory Considerations

✔ Use ECC DDR5 RAM for reliability.

✔ AI workloads need 512GB+ RAM per node.

✔ Scientific computing prefers 1TB+ RAM for large datasets.
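
As a worked example behind the 512GB+ guideline, the sketch below estimates the state footprint of mixed-precision training with an Adam-style optimizer (roughly 16 bytes of state per parameter) plus host-side headroom for data staging; the model size and headroom factor are assumptions.

```python
# Rough memory-footprint estimate for mixed-precision training with an
# Adam-style optimizer: about 16 bytes of state per parameter (FP16 weights
# and gradients plus FP32 master weights and two optimizer moments).
# Model size and headroom factor are illustrative assumptions.

params = 13e9                # hypothetical 13B-parameter model
bytes_per_param = 16
host_headroom = 2.0          # extra host RAM for data loading and staging

state_gb = params * bytes_per_param / 1e9
print(f"Training state: ~{state_gb:.0f} GB spread across a node's GPUs")
print(f"Suggested host RAM: ~{state_gb * host_headroom:.0f} GB or more per node")
```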


 

Step 3: Storage & Networking for HPC Server Cluster


1. Storage Solutions

✔ NVMe SSDs are essential for high-speed data access.

✔ Use parallel file systems (Lustre, GPFS) for scalability.

✔ Hybrid setups combine SSD caching + HDD storage for cost savings.

| Storage Type | Read Speed | Write Speed | Best For |
|---|---|---|---|
| NVMe SSD (PCIe 4.0) | 7,000 MB/s | 6,500 MB/s | AI/ML, databases |
| SAS SSD (12Gb/s) | 2,100 MB/s | 1,800 MB/s | Mixed workloads |
| HDD (SAS 12Gb/s) | 250 MB/s | 250 MB/s | Bulk storage |

Best choice: NVMe SSD for compute nodes, SAS SSDs for storage nodes.
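
One quick sanity check for the storage tier is checkpoint write pressure: how fast the storage must be so that periodic checkpoints stay within a small fraction of wall-clock time. A minimal sketch; the checkpoint size, interval, and stall budget are assumptions.

```python
# Estimate the sustained write bandwidth needed so checkpointing costs at
# most a chosen fraction of wall-clock time.
# Checkpoint size, interval, and stall budget are illustrative assumptions.

checkpoint_gb = 400          # e.g. full model + optimizer state
interval_min = 30            # write a checkpoint every 30 minutes
max_stall_fraction = 0.05    # spend at most 5% of runtime on checkpoints

write_window_s = interval_min * 60 * max_stall_fraction
required_mb_s = checkpoint_gb * 1000 / write_window_s

print(f"Write window per checkpoint: {write_window_s:.0f} s")
print(f"Required sustained write speed: ~{required_mb_s:.0f} MB/s")
```

At numbers like these, only the NVMe tier in the table above keeps up, which is why compute nodes get NVMe while bulk data lands on SAS SSDs or HDDs.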


 

2. Networking for HPC

✔ InfiniBand HDR (200Gb/s) or NDR (400Gb/s) for low latency.

✔ Ethernet 100GbE for cost-effective setups.

✔ Use RoCE (RDMA over Converged Ethernet) for lower CPU overhead.

| Network Type | Speed | Latency | Use Case |
|---|---|---|---|
| InfiniBand NDR | 400Gb/s | <1 μs | AI, real-time HPC |
| InfiniBand HDR | 200Gb/s | <2 μs | General HPC |
| Ethernet 100GbE | 100Gb/s | ~10 μs | Budget-friendly |

Best choice: InfiniBand NDR for top-tier AI, 100GbE for budget setups.
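
To see what the interconnect tiers mean for distributed training, the sketch below estimates ring all-reduce time for a gradient buffer using the standard 2(N-1)/N traffic factor; per-hop latency is ignored, so treat it as an optimistic lower bound, and the buffer size and node count are assumptions.

```python
# Estimate ring all-reduce time for a gradient buffer across N nodes.
# Each node moves roughly 2 * (N - 1) / N times the buffer size over its link.
# Link speeds follow the table above; buffer size and node count are assumptions.

def allreduce_seconds(buffer_gb: float, nodes: int, link_gbit_s: float) -> float:
    volume_gb = 2 * (nodes - 1) / nodes * buffer_gb
    return volume_gb * 8 / link_gbit_s      # GB -> Gbit, then divide by Gbit/s

buffer_gb, nodes = 2.5, 16                  # e.g. FP16 gradients of a ~1.3B model
for name, gbit_s in [("InfiniBand NDR", 400), ("InfiniBand HDR", 200), ("Ethernet 100GbE", 100)]:
    ms = allreduce_seconds(buffer_gb, nodes, gbit_s) * 1000
    print(f"{name:16s}: ~{ms:.0f} ms per all-reduce")
```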


 

Step 4: Power, Cooling & Cluster Management for HPC Server Cluster


✔ HPC racks require 3-phase power (208V or 400V).

✔ Liquid cooling is essential for high-density GPU clusters.

✔ Use SLURM, Kubernetes, or OpenStack for job scheduling.

Best choice: Immersion cooling for AI clusters, air cooling for standard HPC.
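
Before finalizing a rack layout, it is worth adding up node power draw against the feed capacity. A minimal sketch follows; the per-node draw, node count, feed rating, and derating are assumptions to replace with nameplate values and your facility's actual electrical specification.

```python
import math

# Rack power budget: does the planned node count fit on one 3-phase feed?
# Per-node draw, node count, feed rating, and derating are illustrative assumptions.

node_watts = 3200            # e.g. a 4-GPU node under full load
nodes_per_rack = 8
overhead = 1.10              # in-rack fans, switches, PDU losses

rack_kw = node_watts * nodes_per_rack * overhead / 1000
feed_kw = math.sqrt(3) * 400 * 32 / 1000 * 0.9   # 400V 3-phase, 32A, 90% derating

print(f"Estimated rack load: {rack_kw:.1f} kW")
print(f"Approximate feed capacity: {feed_kw:.1f} kW")
print("Fits on one feed" if rack_kw <= feed_kw else "Needs a second feed or fewer nodes per rack")
```

Dense GPU racks routinely exceed a single feed at full load, which is also where liquid or immersion cooling pays off.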


 

Step 5: Optimize Costs with Refurbished HPC Server Cluster Hardware


Enterprises can reduce costs by up to 80% using refurbished servers. Ensure:


✔ Minimum 3-year warranty

✔ Tested performance benchmarks

✔ Trusted supplier

✔ No upfront payment: test first, pay later

✔ Fast availability: avoid long vendor lead times
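
To put the savings claim in perspective, the sketch below compares cost per FP16 TFLOPS for a new flagship node against a refurbished one. All prices are hypothetical placeholders, not quotes from server-parts.eu or any vendor; the TFLOPS figures are approximate dense FP16 peaks.

```python
# Compare cost per FP16 TFLOPS for a new vs. a refurbished GPU node.
# Prices are hypothetical placeholders; TFLOPS are approximate dense FP16 peaks.

configs = {
    "New 4x H100 node":         {"tflops_fp16": 4 * 989, "price_eur": 250_000},
    "Refurbished 4x A100 node": {"tflops_fp16": 4 * 312, "price_eur": 45_000},
}

for name, cfg in configs.items():
    eur_per_tflops = cfg["price_eur"] / cfg["tflops_fp16"]
    print(f"{name}: ~{eur_per_tflops:.0f} EUR per FP16 TFLOPS")
```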


Recommended Refurbished Models:

Dell PowerEdge XE8545 (4U, AMD EPYC, NVIDIA A100/H100, NVLink)

  • Best for AI/ML training, deep learning, and simulations

  • Up to 4x NVIDIA A100 80GB or H100 GPUs, delivering over 3.2 petaflops FP16 compute

  • Direct NVLink support for ultra-fast GPU communication


 

HPE Apollo 6500 Gen10 Plus (4U, HPC & AI GPU Server)

  • Designed specifically for HPC and AI clusters

  • Up to 8x NVIDIA A100, H100, or AMD Instinct MI250X GPUs

  • Dual AMD EPYC 7003/9004 or Intel Xeon Scalable CPUs

  • High-bandwidth PCIe 4.0/5.0 with liquid cooling options


 

Lenovo ThinkSystem SR670 V2 (GPU-Dense HPC Node, 2U)

  • Perfect for AI supercomputing and scientific research

  • Supports up to 8x NVIDIA H100, A100, or AMD MI250X GPUs

  • Designed for scalable HPC clusters, optimized for FLOPS-intensive workloads


 

Supermicro 9029GP-TNVRT (8x GPU, 10 PetaFLOPS AI Training Server)

  • One of the highest teraflops-per-U solutions for HPC

  • Up to 10 petaflops FP16 compute with NVIDIA H100/A100 GPUs

  • Optimized for AI training, CFD, and large-scale simulations


 

Step 6: Deployment & Benchmarking of HPC Server Clusters


✔ Install Rocky Linux, AlmaLinux, or Ubuntu LTS.

✔ Use LINPACK and MLPerf to benchmark performance.

✔ Tune the BIOS for HPC mode (disable C-states, enable NUMA).
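
After the first LINPACK run, a quick health check is HPL efficiency: measured Rmax divided by the theoretical Rpeak of the cluster. A minimal sketch with assumed example numbers; well-tuned clusters typically land in the 60-80% range.

```python
# HPL (LINPACK) efficiency check: measured Rmax vs. theoretical Rpeak.
# Node count, per-node peak, and the measured result are illustrative assumptions.

nodes = 16
fp64_tflops_per_node = 4 * 19.5      # e.g. 4x A100, FP64 Tensor Core peak
rpeak_tflops = nodes * fp64_tflops_per_node

rmax_tflops = 870.0                  # hypothetical measured HPL result

efficiency = rmax_tflops / rpeak_tflops
print(f"Rpeak: {rpeak_tflops:.0f} TFLOPS, Rmax: {rmax_tflops:.0f} TFLOPS")
print(f"HPL efficiency: {efficiency:.0%}")
```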


 



