Run Python code on cloud GPUs using the Modal serverless platform. Use when you need A100/T4/A10G GPU access for training ML models. Covers Modal app setup, GPU selection, data downloading inside functions, and result handling.
## Installation
After installing, this skill will be available to your AI coding assistant.
Verify installation:
```bash
skills list
```

## Skill Instructions
name: modal-gpu
description: Run Python code on cloud GPUs using the Modal serverless platform. Use when you need A100/T4/A10G GPU access for training ML models. Covers Modal app setup, GPU selection, data downloading inside functions, and result handling.
# Modal GPU Training

## Overview
Modal is a serverless platform for running Python code on cloud GPUs. It provides:
- Serverless GPUs: On-demand access to T4, A10G, A100 GPUs
- Container Images: Define dependencies declaratively with pip
- Remote Execution: Run functions on cloud infrastructure
- Result Handling: Return Python objects from remote functions
Two patterns:

- Single Function: Simple script with the `@app.function` decorator
- Multi-Function: Complex workflows with multiple remote calls (see the sketch after this list)
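The single-function pattern is the Minimal Example below. For the multi-function pattern, here is a minimal sketch with hypothetical `preprocess` and `train_step` functions (not part of Modal itself) chained from the local entrypoint:

```python
import modal

app = modal.App("multi-step-pipeline")
image = modal.Image.debian_slim(python_version="3.11").pip_install("torch")

# Hypothetical CPU-only preprocessing step (no GPU requested).
@app.function(image=image)
def preprocess(n: int) -> list:
    return list(range(n))

# Hypothetical GPU step that consumes the preprocessed data.
@app.function(gpu="T4", image=image)
def train_step(data: list) -> float:
    import torch

    x = torch.tensor(data, dtype=torch.float32, device="cuda")
    return float(x.mean())

@app.local_entrypoint()
def main():
    data = preprocess.remote(8)     # first remote call
    loss = train_step.remote(data)  # second remote call, chained locally
    print(f"mean: {loss}")
```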
## Quick Reference
| Topic | Reference |
|---|---|
| Basic Structure | Getting Started |
| GPU Options | GPU Selection |
| Data Handling | Data Download |
| Results & Outputs | Results |
| Troubleshooting | Common Issues |
## Installation

```bash
pip install modal
modal token set --token-id <id> --token-secret <secret>
```
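If you don't have tokens yet, `modal setup` walks through browser-based authentication and stores the credentials for you; check Modal's docs if the CLI flow has changed.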
## Minimal Example

```python
import modal

app = modal.App("my-training-app")

image = modal.Image.debian_slim(python_version="3.11").pip_install(
    "torch",
    "einops",
    "numpy",
)

@app.function(gpu="A100", image=image, timeout=3600)
def train():
    import torch

    device = torch.device("cuda")
    print(f"Using GPU: {torch.cuda.get_device_name(0)}")
    # Training code here
    return {"loss": 0.5}

@app.local_entrypoint()
def main():
    results = train.remote()
    print(results)
```
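When invoked with `modal run`, `main()` executes on your machine while `train.remote()` runs in the cloud container; the returned dict is serialized back to the local process, so you can log or persist metrics locally.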
## Common Imports

```python
import modal
from modal import Image, App

# Inside the remote function
import torch
import torch.nn as nn
from huggingface_hub import hf_hub_download
```
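Downloading data inside the remote function keeps large files on the cloud worker instead of your laptop. A minimal sketch, assuming a hypothetical HuggingFace repo and filename (swap in your own):

```python
import modal

app = modal.App("hf-download-demo")
image = modal.Image.debian_slim(python_version="3.11").pip_install(
    "torch",
    "huggingface_hub",
)

@app.function(gpu="T4", image=image, timeout=1800)
def load_weights():
    # Runs on the Modal worker, so the download never touches
    # the local machine. repo_id and filename are placeholders.
    from huggingface_hub import hf_hub_download

    path = hf_hub_download(
        repo_id="someuser/some-model",
        filename="model.safetensors",
    )
    print(f"Downloaded to {path}")
    return path
```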
## When to Use What

| Scenario | Approach |
|---|---|
| Quick GPU experiments | `gpu="T4"` (16 GB, cheapest) |
| Medium training jobs | `gpu="A10G"` (24 GB) |
| Large-scale training | `gpu="A100"` (40/80 GB, fastest) |
| Long-running jobs | Set `timeout=3600` or higher |
| Data from HuggingFace | Download inside the function with `hf_hub_download` |
| Return metrics | Return a dict from the function |
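Because the GPU is just a string argument to the decorator, switching tiers is a one-line change. A sketch, assuming Modal still accepts memory-variant strings like `"A100-80GB"` (verify against the current GPU docs):

```python
import modal

app = modal.App("gpu-selection-demo")
image = modal.Image.debian_slim(python_version="3.11").pip_install("torch")

# Cheapest tier: quick experiments and smoke tests.
@app.function(gpu="T4", image=image)
def quick_experiment():
    import torch
    print(torch.cuda.get_device_name(0))

# Mid-tier: medium training jobs that fit in 24 GB.
@app.function(gpu="A10G", image=image)
def medium_job():
    import torch
    print(torch.cuda.get_device_name(0))

# Large-scale training; the "-80GB" suffix requests the bigger variant.
@app.function(gpu="A100-80GB", image=image, timeout=7200)
def big_training_run():
    import torch
    print(torch.cuda.get_device_name(0))
```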
## Running

```bash
# Run script
modal run train_modal.py

# Run in background
modal run --detach train_modal.py
```
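A detached app keeps running after your terminal disconnects; you can follow its output later with `modal app logs` (passing the app name or ID) or from the Modal dashboard.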
## External Resources
- Modal Documentation: https://modal.com/docs
- Modal Examples: https://github.com/modal-labs/modal-examples