
Introduction

CosmicAC provides managed compute for Machine Learning (ML) workloads.

Infrastructure setup delays execution and diverts attention from model development. CosmicAC abstracts that setup away, running and scaling jobs on demand with no manual server configuration.

Job Types

CosmicAC supports several job types for different ML workflows.

GPU Container

Access high-performance GPU containers for training, experimentation, and development.

GPU containers let you:

  • Run on-demand GPU compute without managing infrastructure

  • Access GPU hardware directly through secure device plugins

  • Work in environments with virtual machine (VM)-level isolation for secure, dedicated compute

  • Maintain full control over your environment: install packages, run scripts, and configure as needed

  • Manage jobs with a comprehensive set of CLI commands:

    Quick reference for common GPU container commands:
    npx cosmicac jobs init
    npx cosmicac jobs create
    npx cosmicac jobs list
    npx cosmicac jobs shell <jobId> <containerId>

    For the full reference, see CLI Commands.
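The quick-reference commands above chain into a typical workflow. The sketch below is a dry run that only prints each step rather than executing it; `job-1234` and `container-0` are placeholder IDs you would read from `npx cosmicac jobs list` output.

```shell
#!/bin/sh
# Dry-run sketch of a typical GPU container workflow.
# The commands are only printed here, not executed; job-1234 and
# container-0 are placeholders for IDs reported by `jobs list`.
workflow='npx cosmicac jobs init
npx cosmicac jobs create
npx cosmicac jobs list
npx cosmicac jobs shell job-1234 container-0'

# Print one command per line, in order.
printf '%s\n' "$workflow"
```

With a real CosmicAC account, you would run each command directly instead of printing it.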

Managed Inference

Run inference on open-source models like Qwen through a managed API.

Managed Inference lets you:

  • Access open-source models without deploying or managing serving infrastructure

  • Interact with your model directly from the dashboard or from the CLI

  • Manage inference with a comprehensive set of CLI commands:

    Quick reference for common Managed Inference commands:
    npx cosmicac inference init
    npx cosmicac inference list-models
    npx cosmicac inference chat --message "Explain quantum computing."

    For the full reference, see CLI Commands for Managed Inference.
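As with GPU containers, these commands compose into a quick start. The sketch below is a dry run: the `run` helper (our own convenience wrapper, not part of the CLI) prints each command instead of executing it, and the prompt string is just an example.

```shell
#!/bin/sh
# Dry-run sketch of a Managed Inference session.
# run() is a local helper that prints each command instead of
# executing it; call the commands directly to run them for real.
run() { printf '%s\n' "npx cosmicac $*"; }

run inference init                      # one-time setup
run inference list-models               # list available open-source models
run inference chat --message "Explain quantum computing."
```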

Continued Pretraining

Coming soon.

Extend base models on your own data for domain-specific tasks.

Continued Pretraining lets you:

  • Train on your own datasets
  • Save checkpoints at intervals during training

Why CosmicAC?

Minimal setup: Submit jobs via the CLI or web interface. CosmicAC provisions GPU resources and schedules your workload automatically, with no manual server requests or environment configuration.

Secure, isolated environments: Each workload runs inside a KubeVirt virtual machine, providing VM-level isolation while maintaining direct GPU access.

Fast provisioning: Start workloads in minutes, not days. CosmicAC replaces manual SLURM-based workflows with automated provisioning and scheduling.

Built-in inference serving: Deploy models instantly via the Managed Inference API. CosmicAC handles API key authentication, load balancing, and service discovery.

Real-time notifications: Receive email and push notifications when costs exceed thresholds or errors occur.

Who is CosmicAC for?

Role                 Use Case
ML engineers         Train models, run experiments
Data scientists      Deploy inference pipelines
Software engineers   Integrate the inference API into applications
DevOps teams         Manage GPU infrastructure at scale

Core Architecture

CosmicAC uses Kubernetes for orchestration and KubeVirt for secure workload isolation. Kubernetes schedules containers, allocates GPU resources, and manages job lifecycle. KubeVirt runs each workload in an isolated virtual machine without requiring privileged containers, applying standard Kubernetes security controls (RBAC, Security-enhanced Linux, and network policies) while exposing GPU devices through secure device plugins.
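CosmicAC manages these resources for you, but as a sketch of the mechanism: KubeVirt describes each VM workload with a VirtualMachineInstance manifest, and GPU passthrough is requested under `spec.domain.devices.gpus` using the resource name advertised by the cluster's device plugin. The manifest below is illustrative only; the object and device names are placeholders, not CosmicAC's actual internal definitions.

```yaml
# Illustrative KubeVirt manifest (not CosmicAC's internal definition):
# a VirtualMachineInstance that requests one passthrough GPU via the
# resource name advertised by the cluster's device plugin.
apiVersion: kubevirt.io/v1
kind: VirtualMachineInstance
metadata:
  name: example-gpu-workload            # placeholder name
spec:
  domain:
    devices:
      gpus:
        - name: gpu1
          deviceName: nvidia.com/TU104GL_Tesla_T4   # example device-plugin name
    resources:
      requests:
        memory: 8Gi
```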

Kubernetes Implementation

CosmicAC uses Kubernetes as its core orchestration layer, replacing manual SLURM-based workflows with automated provisioning and scheduling.

Before (SLURM)              After (Kubernetes)
Request servers manually    Submit jobs via CosmicAC
Configure SLURM             Provision infrastructure automatically
Set up the environment      Schedule containers automatically
Wait days for setup         Start workloads in minutes

See System Components for detailed documentation of the architecture.

Next steps

Get Started

Learn More

Create a GPU Container

Manage Inference

Consult the reference
