Local GPU Compute for Ann Arbor - Net Carbon-Negative

GBL Servers
Local, Regenerative GPU Infrastructure

High-performance NVIDIA GPUs located in Michigan for fast, secure AI development - while strengthening local forests and removing more carbon than we produce.

Ann Arbor–based | A division of GBL Services LLC
NVIDIA · Ubuntu · Docker · Kubernetes · PyTorch · VMware
Why GBL Servers

Local compute, global-grade performance

Designed for teams that need real GPU power, local collaboration, and a tangible climate-positive impact - without hyperscaler complexity.

Local engineering & low latency
Deploy AI workloads with responsive, low-latency access from Michigan. Work directly with local engineers who understand your environment and constraints.
Ann Arbor presence · Latency-optimized routing
🧠
NVIDIA GPU power for AI/ML
Datacenter-grade NVIDIA RTX Pro 6000 Server / L40-class GPUs, designed for training, fine-tuning, and high-throughput inference across modern AI stacks.
PyTorch & TensorFlow · LLM & vision workloads
🌲
Regenerative & carbon-negative
Our model removes more carbon than we produce by funding Michigan forest restoration and protection projects - turning compute into a local climate asset.
Net carbon-negative · Michigan forest restoration
Simple monthly and annual contracts - no surprise line items, no multi-cloud sprawl.
Hardware & Availability

8-GPU nodes, delivered fast

We start with a single 8-GPU node and scale to larger clusters over time. Annual clients can request custom GPU configurations and dedicated infrastructure.

Production-ready GPU node

A standard deployment is an 8-GPU node with RTX Pro 6000 Server / L40-class performance, ready in under 45 days for qualified clients. Using NVIDIA MIG (Multi-Instance GPU) technology, each RTX Pro 6000 Server Edition GPU can be partitioned into four independent 24GB instances, so a single node can serve up to 32 smaller clients (8 GPUs × 4 instances) or one larger client with fine-grained resource flexibility.
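For teams landing on a MIG slice, a quick sanity check from PyTorch confirms what your reservation exposes. A minimal sketch (the ~24 GiB figure assumes a quarter-GPU MIG instance; a full GPU reports roughly 96 GiB):

```python
import torch

# Minimal sketch: confirm whether a full GPU or a MIG slice is visible.
# A quarter-GPU MIG instance should report roughly 24 GiB;
# a full RTX PRO 6000 Server Edition reports roughly 96 GiB.
assert torch.cuda.is_available(), "No CUDA device visible - check your driver/container setup"

props = torch.cuda.get_device_properties(0)
print(f"Device: {props.name}")
print(f"Memory: {props.total_memory / 1024**3:.1f} GiB")
```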

Provisioning timeline
Under 45 days
From signed agreement for a standard 8-GPU node.
Performance class
RTX Pro 6000 Server / L40
Optimized for modern training & inference workloads.
Scaling model
Cluster-ready
Scale from a single node to multi-node clusters as needed.
Custom builds
Annual clients
Custom GPU procurement and configurations available.
SLO-based access windows & maintenance
Secure connectivity & isolation options
Fit for your workloads

Ideal for:

  • LLM training, fine-tuning, and RAG systems.
  • Computer vision, robotics, and simulation.
  • High-throughput batch inference pipelines (see the sketch below).
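Where that last item matters most, the pattern is simple: stream batches through the model with autograd disabled. A minimal sketch, with a placeholder model and synthetic data standing in for your own pipeline:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

device = "cuda" if torch.cuda.is_available() else "cpu"

# Placeholder model - substitute your trained network.
model = torch.nn.Sequential(
    torch.nn.Linear(512, 256),
    torch.nn.ReLU(),
    torch.nn.Linear(256, 10),
).to(device).eval()

# Synthetic stand-in data - replace with your preprocessed dataset.
loader = DataLoader(
    TensorDataset(torch.randn(10_000, 512)),
    batch_size=256,
    num_workers=4,
    pin_memory=(device == "cuda"),
)

preds = []
with torch.inference_mode():  # no autograd bookkeeping -> higher throughput
    for (batch,) in loader:
        batch = batch.to(device, non_blocking=True)
        preds.append(model(batch).argmax(dim=1).cpu())

print(torch.cat(preds).shape)  # torch.Size([10000])
```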
Security & connectivity

Options for dedicated or shared infrastructure, private networking, and integration with your existing security controls.

Local operator · High-availability design
GPU Rentals

From a single GPU up to 24 GPUs

Start on one GPU and grow into multi-GPU and multi-node configurations as additional Blackwell-class nodes come online in Ann Arbor.

Single GPU or MIG slices
Tap into a single NVIDIA RTX PRO 6000 Blackwell Server Edition GPU or a MIG-partitioned slice for development, experimentation, and smaller production workloads. Ideal for teams validating models before scaling out.
96GB GDDR7 per GPU · MIG-partitioned options · PyTorch / TensorFlow ready
4–8 GPU training nodes
Reserve a dedicated 4–8 GPU slice of the node for end-to-end training, fine-tuning, and high-throughput inference. Designed for production-grade LLM and vision pipelines.
Up to 8 GPUs / node · High-bandwidth PCIe 5.0 · Local NVMe-backed storage
☁️
Clustered GPU pools (up to 24 GPUs)
As we bring additional nodes online, we’ll offer pooled clusters spanning up to 24 GPUs for large-scale training and parallel experimentation – all while preserving the same local, regenerative footprint.
Scale-out roadmap · 1–24 GPUs per client · Designed for multi-node LLM training
Tell us how many GPUs you need and for how long – we’ll map that to a clear reservation plan, from single-GPU development to multi-GPU cluster training.
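To make that growth path concrete, here is a minimal PyTorch DistributedDataParallel sketch - a hypothetical train_ddp.py, not our managed tooling - that runs unchanged on one GPU or all eight in a node, and extends to pooled clusters via torchrun's multi-node flags:

```python
# train_ddp.py - minimal multi-GPU training sketch.
# Launch with: torchrun --nproc_per_node=8 train_ddp.py  (one process per GPU)
import os

import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group(backend="nccl")      # NCCL for GPU collectives
    local_rank = int(os.environ["LOCAL_RANK"])   # set by torchrun
    torch.cuda.set_device(local_rank)

    # Placeholder model and data - substitute your LLM or vision pipeline.
    model = DDP(torch.nn.Linear(512, 512).cuda(local_rank), device_ids=[local_rank])
    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

    for step in range(10):
        x = torch.randn(32, 512, device=f"cuda:{local_rank}")
        loss = model(x).pow(2).mean()
        opt.zero_grad()
        loss.backward()                          # gradients all-reduced across GPUs
        opt.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

Dropping --nproc_per_node to 1 runs the same script on a single GPU or MIG slice; adding --nnodes and a rendezvous endpoint scales it across nodes.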
Node Architecture

Blackwell-class GPU node design

Each node is built around dual AMD EPYC Turin/Genoa processors and eight NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs, with dense DDR5 memory and NVMe storage.

Core compute & accelerators

Our standard node is engineered for sustained, high-power AI workloads, integrating dual EPYC CPUs, dense DDR5 ECC memory, and eight Blackwell GPUs with modern PCIe 5.0 I/O.

CPUs
2× AMD EPYC 9554
64 cores @ 3.1 GHz each, 256 MB L3 per CPU, Socket SP5 – 128 cores total per node.
Memory footprint
24× 48GB DDR5 ECC
24 DDR5 slots up to 3TB; current config: 1,152GB Registered ECC at 6400 MT/s.
GPU configuration
8× RTX PRO 6000 Blackwell
96GB GDDR7 per GPU, passive-cooled, 600W max TDP, dual-slot, server edition.
PCIe layout
PCIe 5.0 topology
8× PCIe 5.0 x16 double-wide, 1× x8 FH/HL single-width, 1× x8 OCP 3.0 slot.
Primary storage
U.2 / M.2 NVMe
2× 7.68TB U.2 NVMe (PCIe 4.0 x4) plus 1× 2TB M.2 NVMe (PCIe 4.0 x4) for OS / fast scratch.
Drive bays
4× NVMe 2.5" bays
4× NVMe 2.5" bays (no SATA 2.5"), plus 2× M.2 NVMe for flexible boot and cache layouts.
Datacenter-depth 30.9" chassis, rails included
Optimized for sustained 600W-per-GPU workloads
Power & resiliency

Designed to handle dense GPU power envelopes safely and reliably.

  • 4× 2.7kW PSUs @ 200–240V (1,000W @ 100–127V), 80 PLUS Platinum, (3+1) redundant.
  • Auto-switching, 50–60Hz, C14 inlet – datacenter-friendly and ready for future clustering.
  • No power cables included by default; rails provided for clean rack integration.
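For rough intuition (our arithmetic from the specs above): eight GPUs at 600W draw 4.8kW, while (3+1) redundancy leaves 3 × 2.7kW = 8.1kW available at 200–240V – comfortable headroom for CPUs, memory, and storage even with one supply failed.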
Networking & management

Built for secure, operator-friendly access in a managed Ann Arbor environment.

  • 2× 1GbE onboard interfaces for data plane traffic.
  • 1× dedicated 1GbE IPMI / management port for out-of-band control.
Local operator access · Secure management isolation
CPU Hosting & VMs

Web hosting & CPU VMs on EPYC

In addition to GPU workloads, we carve out dedicated CPU and memory capacity on our nodes for web applications, APIs, and general-purpose virtual machines.

Hosting on dual EPYC 9554

Each node is backed by 2× AMD EPYC 9554 processors and dense DDR5 ECC, giving you modern CPU performance for web stacks, APIs, and background jobs alongside GPU workloads.

CPU platform
128 cores / node
Dual EPYC 9554, ideal for high-concurrency web services and data processing.
Memory pool
DDR5 ECC
Registered ECC DDR5 memory, tuned for predictable performance and reliability.
Use cases
• Web apps and APIs
• Internal tools and dashboards
• Data pipelines and microservices
• Supporting services for GPU workloads (vector DBs, orchestration, etc.)
Kubernetes & Docker friendly · Linux-first hosting · Local operator support

Sample VM & web hosting tiers

Tiers are examples only – we’ll right-size vCPU, memory, and storage to your workloads when we scope your deployment.

Starter web / API
2 vCPU · 8GB RAM
Light web apps, staging environments, small APIs. Includes NVMe-backed storage suitable for typical web workloads.
Core application tier
4 vCPU · 16GB RAM
Primary web applications, internal tools, and moderate concurrency services.
High-concurrency tier
8 vCPU · 32GB RAM
Heavier application servers, orchestration layers, or support systems for GPU pipelines.
Custom CPU nodes
Tailored allocations
We can design custom CPU / RAM / storage profiles aligned with your stack, including dedicated multi-VM groupings.
NVMe-based storage on every tier · Transparent, monthly billing · Workload-aligned sizing
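As one concrete illustration of what the starter tier comfortably runs, here is a minimal API sketch. FastAPI is our assumption for the example; any Linux-friendly ASGI/WSGI stack fits the same footprint:

```python
# app.py - minimal API sketch sized for a starter-tier VM (2 vCPU / 8GB RAM).
# FastAPI is assumed here; swap in your preferred framework.
from fastapi import FastAPI

app = FastAPI()

@app.get("/health")
def health() -> dict:
    return {"status": "ok"}

@app.get("/items/{item_id}")
def read_item(item_id: int) -> dict:
    # Placeholder handler - a real service would query a database here.
    return {"item_id": item_id}
```

Run it with uvicorn (uvicorn app:app --host 0.0.0.0 --port 8000) behind your reverse proxy or Kubernetes ingress.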
Pricing Overview

Simple terms, tailored to your workloads

Final pricing is based on hardware configuration, reservation term, and workload profile. We start with a short call to match you to the right model.

Flexible commercial options

Choose how you engage with GBL Servers - from shorter pilot runs to fully reserved, dedicated capacity.

Monthly & quarterly
Project-based access
Ideal for evaluations, pilots, and bursty workloads where flexibility matters.
Annual reservations
Dedicated GPUs
Reserved, dedicated GPUs with the option for custom hardware procurement.
Annual = reserved/dedicated GPUs · Custom GPU procurement available · Transparent monthly invoicing
📅
15–30 minute intro call
In a short call we’ll:
  • Understand your workloads and timelines.
  • Review hardware options and reservation terms.
  • Outline security & connectivity requirements.
  • Provide a clear next-step proposal.

No obligation. We’ll only recommend a configuration we’d deploy for our own workloads.

About

Ann Arbor–built, regeneration-focused

GBL Servers is an Ann Arbor GPU compute provider delivering secure, local access to AI infrastructure - with a mission to improve Michigan’s natural systems.

We serve teams that need real GPU power, close collaboration, and infrastructure they can understand end-to-end.

Our regenerative model removes more carbon than we produce by improving Michigan forest ecosystems and protecting local natural resources. A portion of profits directly supports forest restoration, conservation, and long-term stewardship in the region.

We treat each deployment as a long-term partnership: tuning hardware to workloads, aligning security and connectivity requirements, and ensuring the compute you rely on supports a healthier local environment.

Regenerative compute in practice
For every GPU we deploy, we commit a portion of profits to Michigan-based forest projects - ensuring your AI workloads fund real, measurable ecological improvements close to home.
Connect

Book a call & request access

A quick 15–30 minute conversation to align on workloads, timelines, and availability. From there, we move straight into planning a deployment.

Book a call

Use the calendar link below to choose a time that works for you. If you prefer, you can also call directly.

Or call: 📞 +1-828-888-8826
What to have ready
A rough sense of your workload type (LLMs, vision, simulation, etc.), timeline, and whether you prefer a flexible or reserved model. We’ll handle the infrastructure details.

Request access

Tell us about your organization and what you’re planning to run. We’ll follow up with availability and next steps.

Helpful details: frameworks, typical batch sizes, expected GPU hours per month, data locality or security constraints.