benchflow-ai / modal-gpu

@benchflow-ai/modal-gpu
230 stars · 165 forks · Updated 1/18/2026

Run Python code on cloud GPUs using the Modal serverless platform. Use it when you need A100/T4/A10G GPU access for training ML models. Covers Modal app setup, GPU selection, downloading data inside functions, and result handling.

Installation

skills install @benchflow-ai/modal-gpu

Supported assistants: Claude Code, Cursor, Copilot, Codex, Antigravity

Details

Path: tasks/mhc-layer-impl/environment/skills/modal-gpu/SKILL.md
Branch: main
Scoped Name: @benchflow-ai/modal-gpu

Usage

After installing, this skill will be available to your AI coding assistant.

Verify installation:

skills list

Skill Instructions


name: modal-gpu
description: Run Python code on cloud GPUs using the Modal serverless platform. Use when you need A100/T4/A10G GPU access for training ML models. Covers Modal app setup, GPU selection, data downloading inside functions, and result handling.

Modal GPU Training

Overview

Modal is a serverless platform for running Python code on cloud GPUs. It provides:

  • Serverless GPUs: On-demand access to T4, A10G, A100 GPUs
  • Container Images: Define dependencies declaratively with pip
  • Remote Execution: Run functions on cloud infrastructure
  • Result Handling: Return Python objects from remote functions

Two patterns:

  • Single Function: Simple script with @app.function decorator
  • Multi-Function: Complex workflows with multiple remote calls

Quick Reference

  • Basic Structure: Getting Started
  • GPU Options: GPU Selection
  • Data Handling: Data Download
  • Results & Outputs: Results
  • Troubleshooting: Common Issues

Installation

pip install modal
modal token set --token-id <id> --token-secret <secret>
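
If you prefer not to paste the token inline, Modal also supports browser-based authentication and environment variables; the values below are placeholders:

```shell
# Browser-based auth: opens a login page and stores the token locally
modal token new

# Or supply credentials via environment variables
export MODAL_TOKEN_ID=<id>
export MODAL_TOKEN_SECRET=<secret>
```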

Minimal Example

import modal

app = modal.App("my-training-app")

image = modal.Image.debian_slim(python_version="3.11").pip_install(
    "torch",
    "einops",
    "numpy",
)

@app.function(gpu="A100", image=image, timeout=3600)
def train():
    import torch
    device = torch.device("cuda")
    print(f"Using GPU: {torch.cuda.get_device_name(0)}")

    # Training code here
    return {"loss": 0.5}

@app.local_entrypoint()
def main():
    results = train.remote()
    print(results)

Common Imports

import modal
from modal import Image, App

# Inside remote function
import torch
import torch.nn as nn
from huggingface_hub import hf_hub_download

When to Use What

  • Quick GPU experiments: gpu="T4" (16 GB, cheapest)
  • Medium training jobs: gpu="A10G" (24 GB)
  • Large-scale training: gpu="A100" (40/80 GB, fastest)
  • Long-running jobs: set timeout=3600 or higher
  • Data from HuggingFace: download inside the function with hf_hub_download
  • Return metrics: return a dict from the function

Running

# Run script
modal run train_modal.py

# Run in background
modal run --detach train_modal.py
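
A detached run keeps executing after you close the terminal. A few CLI commands for checking on it (the app ID shown is illustrative):

```shell
# List running and recently finished apps
modal app list

# Stream logs for a detached app by its app ID
modal app logs ap-xxxxxxxx

# Stop a running app
modal app stop ap-xxxxxxxx
```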

External Resources