Platform Architecture

Engineered for Performance,
Designed for Resilience

Two compute tiers. Two storage models. One design principle: performance where it matters, affordability where it fits, and predictable recovery backed by a strict 50% maximum-provisioning ceiling.

MCC (Gen4) · GPC (Gen3) · Proxmox VE · Ceph + HPE Alletra

Inside the Cloud Propeller Platform

Cloud Propeller is a purpose-built enterprise cloud that favors predictability over scale.

What runs beneath the VMs matters. This page walks through the layers of Cloud Propeller’s platform architecture: Proxmox VE at the foundation, the host architecture with data, control, and storage planes distributed across four NICs for layered redundancy, and the compute and storage tiers that deliver two distinct price/performance options under one shared design philosophy.

Cloud Propeller Stack

  • Cloud Manager: Single-Pane Management Portal · MultiPortal
  • Tenant VMs: virtualized via KVM · Containers coming soon…
  • Proxmox VE 9.1.x: Web UI · REST API · Cluster Manager (corosync) · KVM Hypervisor · LXC Engine
  • Linux Kernel: bonding · bridging · multipath · Ceph · iSCSI
  • MCC · Gen 4 Mission Critical: Xeon 6745P · DDR5 ECC · 400G/host (4 × 100 Gbps) · NVMe Ceph Cluster, triple-mirror replication
  • GPC · Gen 3 General Purpose: Xeon 6246R · DDR4 ECC · 40G/host (4 × 10 Gbps) · HPE Alletra iSCSI SAN, triple-parity RAID
  • Layer 2 + Layer 3 Network: Edge routing · peering · ToR fabric

Open, Debian-Based · No Per-Socket Tax · Premium Enterprise Support

Hypervisor Foundation

After Nine Years on VMware, We Rebuilt on Something Better

For Cloud Propeller’s first nine years, every host we operated ran VMware ESXi. In mid-2024, just a few months after Broadcom’s acquisition of VMware, we set ESXi aside and re-tested every serious hypervisor platform on the market. What followed was a twelve-month evaluation across raw performance, operational behavior under load, licensing economics, and day-to-day operator experience — the kind of test cycle most providers never run unless they are forced to find a replacement.

Proxmox VE emerged as the clear winner — not just in our testing, but in the real-world test workloads our clients helped validate.

One of Cloud Propeller’s core differentiators has always been our focus on high-frequency CPU performance over hyperscale-style density. When we built our previous-generation platform (Gen3), now serving as our General Purpose Compute (GPC) tier, we chose Intel® Xeon® Gold 6246R processors because they delivered an unusually high 3.4 GHz base clock for a 16-core server CPU — exactly the kind of per-core performance profile we wanted, while still keeping core counts practical under VMware’s core-based licensing model.

The move to Proxmox VE came at exactly the right time. As we were designing our next-generation Mission Critical Compute (MCC) platform, Intel introduced the kind of processor we had been waiting for: the Intel® Xeon® 6745P, a 32-core, all-performance-core CPU capable of running at a 3.6 GHz base clock in Intel® SST-PP compute-optimized mode. In the server world, that combination is exceptional: high core count without giving up high base frequency. VMware’s per-core licensing would have penalized that choice precisely because it delivered more cores. Proxmox VE turned that equation around, letting us choose the CPU architecture we actually wanted.

That combination — Proxmox VE’s licensing model and the 6745P’s performance profile — lets our MCC platform deliver exceptional per-core performance, more total compute headroom, and better economics without falling back to slower, density-first CPU choices designed around provider consolidation instead of client performance.

On top of that foundation, our Cloud Manager portal, based on MultiPortal, gives clients a cleaner single pane of glass for everyday cloud operations: faster provisioning, simpler VM lifecycle management, clearer resource visibility, and a more agile operating experience than the legacy VMware Cloud Director model allowed.

Host Architecture

Four NICs, Three Planes, One Design

Every Cloud Propeller host has four NICs — deliberately wired so the busiest, most critical traffic gets the most bandwidth and fault-tolerance, and any failure that does occur stays small, bounded, and predictable by design.

Both platforms, Mission Critical Compute (MCC) and General Purpose Compute (GPC), are wired identically. What differs between them (besides compute) is NIC speed (100 Gbps vs. 10 Gbps) and primary storage (Ceph vs. iSCSI).

The first two NICs are bonded into bond0, an LACP bond spanning two Arista DCS-7060CX2-32S top-of-rack switches via MLAG. Bond0 carries the most critical and busiest traffic on the host — VM tenant data, the Management VLAN and Cluster Heartbeat Ring 0, and (on MCC) the Ceph Public VLAN — all on the same high-bandwidth, dual-NIC fault-tolerant path.
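
For illustration, a bond like this is typically declared in the host network configuration. The sketch below assumes a stock Debian/Proxmox ifupdown2 layout; interface names, hash policy, and bridge details are placeholders, not our production values.

    # /etc/network/interfaces (excerpt; illustrative sketch, not production config)
    auto bond0
    iface bond0 inet manual
        bond-slaves enp65s0f0 enp65s0f1    # the two bonded ports (100 Gbps on MCC, 10 Gbps on GPC)
        bond-mode 802.3ad                  # LACP toward the MLAG pair of Arista ToRs
        bond-xmit-hash-policy layer3+4
        bond-miimon 100
        mtu 9000                           # jumbo frames end-to-end

    auto vmbr0
    iface vmbr0 inet manual
        bridge-ports bond0                 # tenant VM traffic bridges onto the bond
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes              # management, heartbeat ring 0, and Ceph Public ride as tagged VLANs
        mtu 9000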

The third and fourth NICs are unbonded. NIC 3 carries a second Management VLAN, Cluster Heartbeat Ring 1, and iSCSI Fabric A; NIC 4 carries Cluster Heartbeat Ring 2 and iSCSI Fabric B.

On MCC, Ceph is the production storage. The Ceph Public VLAN rides bond0; the Ceph Private VLAN, used for OSD↔OSD replication, rides NIC 4. If NIC 4 drops, replication and backfill on that node stop and its OSDs are marked degraded until the link returns; production I/O on bond0 keeps running, however, because Ceph Public traffic is unaffected. iSCSI Fabrics A and B are wired and available, but reserved as tier-2 (optional).
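
In Ceph configuration terms, that split is simply two networks. A minimal sketch follows, with the subnets invented purely for illustration:

    # /etc/pve/ceph.conf (excerpt; subnets are illustrative placeholders)
    [global]
        public_network  = 10.40.0.0/24    # Ceph Public VLAN: client-facing I/O, rides bond0
        cluster_network = 10.50.0.0/24    # Ceph Private VLAN: OSD-to-OSD replication and backfill, on NIC 4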

On GPC, iSCSI is the active and only storage path — multipathed across both fabrics to the HPE Alletra SAN. Ceph isn’t deployed; NIC 4 simply carries Cluster Heartbeat Ring 2 and iSCSI Fabric B.
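
On the initiator side, this typically means pinning one open-iscsi interface to each fabric-facing NIC and letting multipath-tools merge the resulting paths. The sketch below is illustrative only; NIC names and iface labels are placeholders, and array-specific multipath device settings are left to HPE’s published guidance.

    # Bind one iSCSI interface per fabric (NIC names are placeholders)
    iscsiadm -m iface -I fabric-a -o new
    iscsiadm -m iface -I fabric-a -o update -n iface.net_ifacename -v ens2f0   # NIC 3 -> Fabric A
    iscsiadm -m iface -I fabric-b -o new
    iscsiadm -m iface -I fabric-b -o update -n iface.net_ifacename -v ens2f1   # NIC 4 -> Fabric B

    # /etc/multipath.conf (excerpt; generic defaults only)
    defaults {
        user_friendly_names yes
        find_multipaths     yes
    }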

Conceptually, we have three traffic planes sharing NICs by design: a data plane on bond0, a control plane spread across all four NICs, and a storage plane on bond0 for Ceph and on NICs 3 and 4 for iSCSI. Our clusters are designed to tolerate the loss of any physical NIC (both ports), any optic, or even an entire top-of-rack switch — and continue to function as if nothing had happened.
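
The three heartbeat rings map onto corosync links in the cluster configuration. Here is a minimal sketch of a single node’s entry; node names and addresses are invented for illustration:

    # /etc/pve/corosync.conf (excerpt; names and addresses are placeholders)
    nodelist {
        node {
            name: pve-node01
            nodeid: 1
            ring0_addr: 10.10.0.11    # Ring 0 on bond0
            ring1_addr: 10.20.0.11    # Ring 1 on NIC 3
            ring2_addr: 10.30.0.11    # Ring 2 on NIC 4
        }
    }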

Bonded Tenant Data Path

  • bond0 — 2 NICs in LACP/MLAG across two Arista ToRs
  • Carries VM data, mgmt, cluster ring 0, Ceph Public VLAN
  • MTU 9000 jumbo frames end-to-end
  • Hitless single-NIC failure, switch-level fault tolerance

Triple-Redundant Cluster

  • Corosync rings 0/1/2, one per NIC class (bond0, NIC 3, NIC 4)
  • Each ring intentionally shares hardware with a different traffic class
  • No single physical fault can take down all three rings
  • Quorum survives any single-NIC outage

Ceph Storage Plane Design

  • Ceph Public rides bond0 — client-facing I/O on the redundant path
  • Ceph Private on NIC 4 — OSD↔OSD replication
  • Triple-replica writes; if NIC 4 drops, replication stops and OSDs go degraded — production I/O keeps running
  • Replication storms can’t crowd tenant traffic

iSCSI Storage Multipath

  • NIC 3 → iSCSI Fabric A; NIC 4 → iSCSI Fabric B
  • Multipath to HPE Alletra across both fabrics
  • 200 Gbps (8 × 25 Gbps) per controller, MTU 9000
  • Automatic Controller Failover (< 18 seconds I/O pause)
  • Survives loss of either fabric end-to-end

Compute Tiers

Two Tiers. Same Foundation. Different Economics

MCC and GPC are powered by the same hypervisor, have similar host architecture, and ride over the same network fabric. Silicon, memory, storage architecture, and host networking are the levers that differentiate them.

Hardware Comparison

MCC vs GPC, side by side

Specification · Gen 4 MCC (Mission Critical Compute) · Gen 3 GPC (General Purpose Compute)

  • Platform generation · MCC: 4th Generation Cloud Propeller Architecture, 6th-gen Intel® Xeon® Family · GPC: 3rd Generation Cloud Propeller Architecture, 2nd-gen Intel® Xeon® Scalable Family
  • Hypervisor · Both: Proxmox VE 9.1.x (KVM + LXC, native Linux)
  • CPU · MCC: Intel® Xeon® 6745P (Granite Rapids, “P”-variant high-clock, all performance cores — no efficiency or low-priority cores), two CPUs per host · GPC: Intel® Xeon® Gold 6246R (Cascade Lake Refresh, all performance cores — no efficiency or low-priority cores), two CPUs per host
  • CPU speed · MCC: 32 cores @ 3.6 GHz base (running in SST-PP compute-optimised mode, 4.1 GHz max turbo) · GPC: 16 cores @ 3.4 GHz base (4.1 GHz max turbo)
  • Memory · MCC: DDR5 ECC, 6400 MT/s, 2.3 TB per host (96 GB × 24 DIMMs) · GPC: DDR4 ECC, 2933 MT/s, 1 TB per host (64 GB × 16 DIMMs)
  • Storage architecture · MCC: all-flash NVMe Ceph, triple-mirror replication, 180 TB per host (12 × 15 TB SSDs) · GPC: HPE Alletra NVMe over iSCSI, triple+ parity RAID, dual controllers, 8 × 25 Gbps per controller
  • Storage performance · MCC: TBD — disk IOPS / throughput · GPC: TBD — disk IOPS / throughput
  • Host networking · MCC: 4 × 100 Gbps NICs (2 bonded into 200G LAG, 2 dedicated to storage + cluster paths) · GPC: 4 × 10 Gbps NICs (2 bonded into 20G LAG, 2 dedicated to storage + cluster paths)
  • Top-of-Rack · Both: Arista DCS-7060CX2-32S Switches (redundant)
  • Uplink to core · Both: 200G ToR-to-core uplink
  • L3 Core · Both: Extreme Networks MLXe-8 Routers (redundant)
  • Provisioning ceiling · Both: 50% (hard-cap, no over-provisioning)
  • Uptime SLA · MCC: 99.9999% (six nines) · GPC: 99.99% (four nines)
  • Billing models · Both: Pay-As-You-Go (5-min granularity) + Dedicated Capacity
  • Recommended for · MCC: mission-critical production, high-throughput, low-latency workloads · GPC: cost-sensitive, general-purpose enterprise workloads, dev/test, batch
  • Starting price · MCC: $180 /month · GPC: $90 /month

Storage Options

Triple-Mirrored on MCC. Enterprise SAN on GPC

Storage defines how a cloud platform behaves under pressure. We take two different paths between the tiers — both fast under load, both engineered to keep tenant I/O running through hardware failures.

Our Mission Critical Compute (MCC) platform runs all-flash NVMe Ceph with triple-mirror replication across the cluster. There is no central storage controller to fail and no RAID rebuild window to wait through — if a disk or even an entire host disappears, Ceph re-replicates the missing copies onto remaining capacity in the background while tenant I/O keeps running.

Capacity, throughput, and parallel client count all scale with cluster size — adding NVMe hosts grows all three together, with no central controller to bottleneck later. Each of our MCC hosts contains 12 × 15 TB Kioxia CM-7 enterprise NVMe SSDs and contributes 180 TB raw to the underlying cluster. Public I/O (tenant data) rides the 200 Gbps MLAG, while private I/O (back-end OSD↔OSD replication) runs on its own dedicated 100 Gbps NIC — isolating tenant traffic from replication storms so neither can crowd the other.
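
In Ceph terms, triple-mirror replication is just a replicated pool that keeps three copies of every object. A minimal sketch of the relevant pool settings follows; the pool name is a placeholder:

    # Illustrative only; pool name is a placeholder
    ceph osd pool set vm-pool size 3        # keep three replicas of every object
    ceph osd pool set vm-pool min_size 2    # keep serving I/O on two replicas while the third re-replicates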

Our General Purpose Compute (GPC) platform runs on HPE Alletra (formerly Nimble) NVMe SAN over iSCSI — mature enterprise storage with redundant controllers and triple+ parity protection. The recovery model is different (controller failover plus parity reconstruction) but the operational behavior is the same predictable, well-understood pattern enterprise storage teams already know and trust.

Alletra’s controller pair runs active/passive — one controller serves I/O at the full 200 Gbps front-end (8 × 25 Gbps) while the other stands by, ready to take over. Any active-controller fault triggers automated failover in under 18 seconds, with multipath drivers transparently rerouting I/O to the standby. HPE’s InfoSight platform watches the array’s behavior in production and surfaces the failure modes that matter before they impact tenant I/O.

Gen 4 — NVMe Ceph

  • All-flash NVMe, triple-mirror replication
  • 180 TB per host (12 × 15 TB SSDs)
  • No central storage controller — no SPOF
  • Auto-healing rebalance, no RAID rebuild window

Gen 3 — HPE SAN

  • HPE Alletra NVMe SAN over iSCSI
  • Redundant controllers with automatic failover
  • Triple+ parity RAID protection
  • Enterprise-class storage, well-understood operationally

Recovery Posture

  • Ceph self-heal (Gen 4) on disk or full-host loss
  • HPE controller failover (Gen 3) on controller fault
  • Both survive full-host failure with zero data loss
  • Optional Veeam backup & DR overlay

Our Design Principle

Why We Never Run Past 50% Utilization

Every cluster — MCC and GPC, Pay As You Go and Dedicated — is capped at 50% of its design capacity. Not as an aspirational target. As an actual line we do not cross.

Hyperscalers can get away with aggressive over-provisioning because, at their scale, spare capacity and noisy neighbors get averaged out statistically across enormous fleets. But that does not mean every individual workload gets a clean host, consistent neighbors, or predictable performance. Anyone who has rebooted an instance hoping to land on better underlying hardware understands the difference.

We do not operate at that scale, and we do not want to. Instead, we buy the headroom up front and build it into the design: a 50% ceiling means a host losing a peer does not tip anyone into contention, a maintenance window does not concentrate load, and an unexpected burst has room to breathe. To put numbers on it: in an eight-host cluster running at the 50% ceiling, for example, losing an entire host spreads its load across the surviving seven, lifting them to roughly 57% average utilization, still comfortably short of contention.

That headroom is what makes Cloud Propeller feel hyperscale to our own clients. A workload can double overnight — and in real cases, we have had clients grow many times beyond their original footprint within a span of just a few days — without our platform running into a capacity wall of any kind. The room is already built in.

It is more expensive to operate infrastructure with this much headroom, but it also means the platform behaves the way it was architected — under load, during maintenance, and in the worst five minutes of the worst day of the year.

Put the Platform to Work

Stand up a Pay As You Go VDC and evaluate Cloud Propeller under your own workloads — no long-term commitment, no setup fees.