Installation Guide

This guide covers all installation methods and requirements for Flexium.AI.

System Requirements

Hardware

| Component | Minimum | Recommended |
| --- | --- | --- |
| GPU | NVIDIA GPU with CUDA support | NVIDIA A100/H100 or consumer RTX 30xx/40xx |
| RAM | 8 GB | 16+ GB |
| Storage | 1 GB for flexium | SSD recommended |

Software

| Requirement | Version | Notes |
| --- | --- | --- |
| Operating System | Linux x86_64 | Ubuntu 20.04+, RHEL 8+, Debian 10+ |
| Python | 3.8 - 3.12 | 3.10+ recommended |
| NVIDIA Driver | 580+ | Required for zero-residue migration |
| CUDA | 12.4+ | Required for driver 580+ |
| PyTorch | 2.0+ | With CUDA support |

Driver 580+ Required

Zero-residue migration requires NVIDIA driver version 580 or higher. Earlier drivers do not support the necessary migration features.

Verify Driver Version

nvidia-smi --query-gpu=driver_version --format=csv,noheader
# Expected output: 580.xx or higher
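
To gate on this check from a script instead of reading the output by hand, here is a minimal sketch. The helper driver_supports_migration is illustrative and not part of flexium; it assumes the major.minor[.patch] version format that nvidia-smi prints, and the 580 threshold comes from the requirement above.

```python
def driver_supports_migration(version: str, minimum_major: int = 580) -> bool:
    """Return True if an NVIDIA driver version string (e.g. '580.82.07')
    meets the minimum major version required for zero-residue migration."""
    major = int(version.strip().split(".")[0])
    return major >= minimum_major

# Feed in the output of:
#   nvidia-smi --query-gpu=driver_version --format=csv,noheader
print(driver_supports_migration("580.82.07"))   # meets the requirement
print(driver_supports_migration("535.129.03"))  # too old, update needed
```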

If your driver is older than 580, you'll need to update:

# Ubuntu/Debian
sudo apt update
sudo apt install nvidia-driver-580

# Or download from NVIDIA website:
# https://www.nvidia.com/Download/index.aspx

Installation Methods

Method 1: From PyPI

pip install flexium

Method 2: From Source

# Clone the repository
git clone https://github.com/flexiumai/flexium.git
cd flexium

# Install in development mode
pip install -e .

# Or install with all extras
pip install -e ".[all]"

Method 3: From GitHub Release

pip install https://github.com/flexiumai/flexium/releases/download/v0.1.1/flexium-0.1.1-py3-none-any.whl

PyTorch Installation

Flexium requires PyTorch with CUDA 12.4+ support. Install PyTorch before installing flexium.

For CUDA 12.4+

pip install torch torchvision --index-url https://download.pytorch.org/whl/cu124

For Latest PyTorch

Visit pytorch.org/get-started to get the install command for your system. Make sure to select CUDA 12.4 or higher.

Verify PyTorch CUDA

python -c "import torch; print(f'PyTorch: {torch.__version__}'); print(f'CUDA available: {torch.cuda.is_available()}'); print(f'CUDA version: {torch.version.cuda}')"

Expected output:

PyTorch: 2.x.x+cu124
CUDA available: True
CUDA version: 12.4


Dependencies

Core Dependencies (Auto-installed)

| Package | Version | Purpose |
| --- | --- | --- |
| python-socketio[client] | >=5.0.0 | WebSocket communication |
| pynvml | >=11.0.0 | GPU monitoring |
| flask | >=2.0.0 | Web dashboard |

Development Dependencies

pip install flexium[dev]

| Package | Purpose |
| --- | --- |
| pytest | Testing |
| pytest-cov | Coverage |
| mypy | Type checking |
| ruff | Linting |

Environment Setup

Option 1: Virtual Environment

# Create virtual environment
python -m venv flexium-env
source flexium-env/bin/activate

# Install PyTorch with CUDA 12.4
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu124

# Install flexium
pip install flexium

Option 2: Conda Environment

# Create conda environment
conda create -n flexium python=3.10
conda activate flexium

# Install PyTorch with CUDA 12.4
conda install pytorch torchvision pytorch-cuda=12.4 -c pytorch -c nvidia

# Install flexium
pip install flexium

Option 3: System-wide Installation

# Not recommended, but possible
sudo pip install flexium

Configuration

Create ~/.flexiumrc:

# Server address with workspace
server: app.flexium.ai/myworkspace

# Default device
device: cuda:0

# Heartbeat interval (seconds)
heartbeat_interval: 3.0

Environment Variables

| Variable | Description | Default |
| --- | --- | --- |
| FLEXIUM_SERVER | Server address with workspace (host:port/workspace) | None (local mode) |
| GPU_DEVICE | Default GPU device | cuda:0 |
| FLEXIUM_LOG_LEVEL | Log level | INFO |
| FLEXIUM_DEBUG | Enable debug mode | false |

Example:

# Format: host:port/workspace
export FLEXIUM_SERVER="app.flexium.ai/myworkspace"
export GPU_DEVICE=cuda:0
export FLEXIUM_LOG_LEVEL=DEBUG

URL Format

The FLEXIUM_SERVER variable uses a token-in-path format: host:port/workspace. This routes your training jobs to the correct workspace orchestrator.
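For illustration, the host:port/workspace format can be split into its parts like this. The parse_server helper and its field names are hypothetical, not a flexium API; the port is optional, as the examples above show.

```python
def parse_server(value):
    """Split a FLEXIUM_SERVER value of the form host[:port]/workspace
    into its components. Illustrative helper, not part of flexium."""
    hostport, _, workspace = value.partition("/")
    host, _, port = hostport.partition(":")
    return {
        "host": host,
        "port": int(port) if port else None,
        "workspace": workspace or None,
    }

print(parse_server("app.flexium.ai/myworkspace"))
print(parse_server("10.0.0.5:8080/research"))
```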

Project-Local Config

Create .flexiumrc in your project directory (takes precedence over ~/.flexiumrc):

server: app.flexium.ai/myworkspace
device: cuda:0
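
The precedence order (project-local .flexiumrc over ~/.flexiumrc over built-in defaults) can be sketched as a dict merge. The resolve_config helper is illustrative, not a flexium API, and the default values shown are assumptions based on the tables above:

```python
def resolve_config(project_cfg, user_cfg):
    """Merge config layers: project-local .flexiumrc wins over
    ~/.flexiumrc, which wins over defaults. Illustrative sketch;
    default values here are assumptions, not flexium's actual defaults."""
    defaults = {"device": "cuda:0", "heartbeat_interval": "3.0"}
    merged = dict(defaults)
    merged.update(user_cfg)     # ~/.flexiumrc overrides defaults
    merged.update(project_cfg)  # project .flexiumrc overrides both
    return merged

# Project file overrides the device set in the home config:
print(resolve_config({"device": "cuda:1"},
                     {"device": "cuda:0", "server": "app.flexium.ai/myworkspace"}))
```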

Verification

Step 1: Check Installation

# Verify flexium is installed
python -c "import flexium; print(f'Flexium version: {flexium.__version__}')"

# Verify module loads
python -c "import flexium.auto; print('OK')"

Step 2: Check GPU Access

python -c "
import torch
import pynvml
pynvml.nvmlInit()
device_count = pynvml.nvmlDeviceGetCount()
print(f'GPUs detected: {device_count}')
for i in range(device_count):
    handle = pynvml.nvmlDeviceGetHandleByIndex(i)
    name = pynvml.nvmlDeviceGetName(handle)
    print(f'  GPU {i}: {name}')
pynvml.nvmlShutdown()
"

Step 3: Test Server Connection

# Point the client at your server and workspace
export FLEXIUM_SERVER="app.flexium.ai/myworkspace"

The connection itself is exercised by the training run in Step 4.

Step 4: Test Training Integration

# Create test script
cat > test_flexium.py << 'EOF'
import flexium.auto
import torch

with flexium.auto.run():
    x = torch.zeros(100, 100).cuda()
    print(f"Tensor on: {x.device}")
    print("Flexium integration working!")
EOF

# Run test
FLEXIUM_SERVER="app.flexium.ai/myworkspace" python test_flexium.py

Troubleshooting Installation

"CUDA not available"

# Check NVIDIA driver
nvidia-smi

# Check PyTorch CUDA
python -c "import torch; print(torch.cuda.is_available())"

Solutions:

1. Install the NVIDIA driver: sudo apt install nvidia-driver-580
2. Reinstall PyTorch with CUDA 12.4: pip install torch --index-url https://download.pytorch.org/whl/cu124

"Module 'flexium' not found"

# Check installation
pip show flexium

# Reinstall
pip install --force-reinstall flexium

"python-socketio installation fails"

# Install build dependencies
sudo apt install build-essential python3-dev

# Then install flexium
pip install flexium

"pynvml fails to initialize"

This usually means the NVIDIA driver is not loaded:

# Check if driver is loaded
lsmod | grep nvidia

# Load driver if needed
sudo modprobe nvidia

"Permission denied" errors

# Add user to video group
sudo usermod -aG video $USER

# Log out and back in, or use newgrp
newgrp video

Next Steps