
VGAC v4: Inference Analytics, NIXL, and Slurm Templates

February 15, 2026 · 4 min read

Andrew Espira

Founder & Lead Engineer


VGAC v4 is our biggest release yet. Here's everything new.

LLM Inference Dashboard

A dedicated dashboard for LLM serving workloads. Track prefill vs decode latency, phase imbalance ratios, per-model efficiency, and the cache health metrics that actually predict performance degradation.
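To make "phase imbalance" concrete, here is a minimal sketch of how a prefill/decode imbalance ratio can be computed from per-request timings. The function name, sample latencies, and interpretation thresholds are illustrative, not VGAC's actual metrics:

```python
# Sketch of a prefill/decode phase-imbalance calculation.
# The request timings below are made up for illustration.

def phase_imbalance(prefill_ms: list[float], decode_ms: list[float]) -> float:
    """Ratio of mean prefill latency to mean total (prefill + decode) latency.

    Values near 1.0 mean requests are prefill-dominated (long prompts,
    little generation); values near 0.0 mean decode-dominated.
    """
    avg_prefill = sum(prefill_ms) / len(prefill_ms)
    avg_decode = sum(decode_ms) / len(decode_ms)
    return avg_prefill / (avg_prefill + avg_decode)

# Example: long prompts, short completions -> prefill-heavy workload.
ratio = phase_imbalance(prefill_ms=[820, 790, 905], decode_ms=[210, 180, 240])
print(f"phase imbalance: {ratio:.2f}")  # ~0.80 -> prefill-dominated
```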

NVIDIA NIXL Integration

First-class support for NVIDIA's NIXL protocol. Monitor cross-node KV cache transfers, track per-agent bandwidth utilization, get scaling recommendations, and see backend selection analysis (RDMA vs TCP vs shared memory).
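As a rough illustration of per-agent bandwidth utilization (the agent, link capacities, and byte counts below are invented; NIXL exposes its own telemetry, which VGAC consumes):

```python
# Illustrative bandwidth-utilization calculation for a KV cache transfer.
# Line rates here are assumptions, not measured values.

LINK_CAPACITY_GBPS = {"rdma": 400.0, "tcp": 100.0}  # assumed line rates

def utilization(bytes_moved: int, seconds: float, backend: str) -> float:
    """Fraction of the link's line rate used during the transfer window."""
    gbps = (bytes_moved * 8) / seconds / 1e9
    return gbps / LINK_CAPACITY_GBPS[backend]

# Hypothetical agent: 120 GB of KV cache moved over RDMA in 4 seconds.
u = utilization(bytes_moved=120 * 10**9, seconds=4.0, backend="rdma")
print(f"RDMA utilization: {u:.0%}")  # 60% of a 400 Gb/s link
```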

HPC Policy Visibility

For Slurm-managed clusters: partition policies, fairshare status, reservation tracking, pending job explanation ("your job is waiting because partition X is at capacity and user Y has higher fairshare priority"), and topology-aware scheduling analysis.
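In the same spirit as the pending-job explanation feature, a toy mapping from Slurm pending-reason codes to plain English might look like this. The wording is illustrative, but the reason codes themselves (Priority, Resources, Dependency, ...) are real Slurm values, visible via `squeue -o %r`:

```python
# Toy "why is my job waiting" lookup. Explanation text is invented;
# the keys are genuine Slurm job-pending reason codes.

EXPLANATIONS = {
    "Priority": "Other pending jobs have higher (fairshare) priority.",
    "Resources": "The partition has no free nodes that fit the request.",
    "Dependency": "The job is waiting on a dependency (--dependency=...).",
    "ReqNodeNotAvail": "A requested node is down, drained, or reserved.",
}

def explain_pending(reason: str) -> str:
    return EXPLANATIONS.get(reason, f"Pending for Slurm reason '{reason}'.")

print(explain_pending("Priority"))
```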

Slurm Script Generator

Tell VGAC what you want to run, and it generates an optimized Slurm submission script based on current cluster state. Multi-GPU jobs get NCCL environment variables, torchrun setup, and log directories pre-configured. The script includes a lint score and warnings about common mistakes (over-requested resources, wrong partition, missing modules).
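A minimal sketch of what generating such a script involves, assuming nothing about VGAC's internals: the template mirrors the features described above (NCCL environment variables, torchrun setup, a log directory), but the partition name, paths, and defaults are placeholders:

```python
# Hypothetical generator for a multi-GPU Slurm submission script.
# Partition, script name, and env-var choices are illustrative only.

def make_sbatch(job: str, nodes: int, gpus_per_node: int, partition: str) -> str:
    return f"""#!/bin/bash
#SBATCH --job-name={job}
#SBATCH --partition={partition}
#SBATCH --nodes={nodes}
#SBATCH --gpus-per-node={gpus_per_node}
#SBATCH --output=logs/{job}_%j.out

mkdir -p logs
export NCCL_DEBUG=WARN  # surface NCCL errors without log spam

srun torchrun \\
    --nnodes={nodes} --nproc_per_node={gpus_per_node} \\
    --rdzv_backend=c10d --rdzv_endpoint=$(hostname):29500 \\
    train.py
"""

print(make_sbatch("llama-ft", nodes=2, gpus_per_node=8, partition="gpu"))
```

A real generator would additionally consult live cluster state (free nodes per partition, queue depth) before filling in the resource lines, which is where the lint score and over-request warnings come in.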

Pattern Detection v2

Upgraded AI pattern detection with four new pattern types: peak-hour contention, cascading delay chains, memory pressure precursors, and burst submission detection. Each pattern includes severity scoring and recommended actions.

Try v4 now

Clone the repo, run docker compose up, and explore every feature locally.

View on GitHub