Leverage pre-built functions to pull and process public data from Common Crawl, GitHub, or S3 storage (see the sketch after this list)
Store all datasets (raw, in-progress, final) in Tracto Cypress prior to model training
Leverage serverless GPUs and CPUs to run data preparation tasks at scale
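As a rough illustration of this data-preparation step, the sketch below pulls a Common Crawl path listing over HTTP and writes it to a Cypress table. It assumes the open-source YTsaurus Python client (`yt.wrapper`), since Cypress is YTsaurus's storage layer; the proxy address, Cypress path, and listing URL are placeholders, not real endpoints.

```python
import gzip
import io

import requests
import yt.wrapper as yt

# Placeholder proxy address for the Tracto/YTsaurus cluster.
client = yt.YtClient(proxy="<your-tracto-proxy>")

# Example Common Crawl path listing (one WET file path per line, gzipped).
CC_URL = "https://data.commoncrawl.org/crawl-data/CC-MAIN-2024-10/wet.paths.gz"

def iter_wet_paths(url: str):
    """Yield WET file paths from a gzipped Common Crawl path listing."""
    resp = requests.get(url, timeout=60)
    resp.raise_for_status()
    with gzip.open(io.BytesIO(resp.content), mode="rt") as fh:
        for line in fh:
            yield line.strip()

# Store the raw listing in Cypress so downstream jobs can fan out over it.
rows = [{"wet_path": path} for path in iter_wet_paths(CC_URL)]
client.write_table(yt.TablePath("//home/datasets/commoncrawl/wet_paths"), rows)
```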
Train your model
Use your favorite fine-tuning frameworks, such as PyTorch, JAX, and Hugging Face (see the sketch after this list)
Store checkpoints in Tracto Cypress
Scale training jobs to hundreds of GPUs
Monitor results on W&B
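A minimal fine-tuning sketch in this spirit, using Hugging Face Transformers with metric reporting to W&B; the base model, dataset, and checkpoint directory are illustrative placeholders, and the upload of the resulting checkpoints to Tracto Cypress is not shown.

```python
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "gpt2"  # placeholder base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Placeholder dataset; in practice this would be the data prepared above.
dataset = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

args = TrainingArguments(
    output_dir="checkpoints",       # local dir, later uploaded to Tracto Cypress
    per_device_train_batch_size=8,
    num_train_epochs=1,
    save_strategy="epoch",
    report_to="wandb",              # stream training metrics to W&B
)

Trainer(model=model, args=args, train_dataset=tokenized,
        data_collator=collator).train()
```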
Post training
Improve the pretrained model to handle specific tasks
Evaluate the trained model via offline inference (see the sketch after this list)
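One way to run such an offline evaluation is batch inference with vLLM, as sketched below; vLLM is an assumption here rather than a required engine, and the checkpoint path, prompts, and references are placeholders.

```python
from vllm import LLM, SamplingParams

# Placeholder checkpoint directory produced by the training step.
llm = LLM(model="checkpoints/checkpoint-final")
params = SamplingParams(temperature=0.0, max_tokens=64)

# Tiny illustrative eval set; in practice this would come from a Cypress table.
eval_prompts = [
    "Summarize: The quick brown fox jumps over the lazy dog.",
    "Translate to French: Good morning.",
]
references = ["A fox jumps over a dog.", "Bonjour."]

outputs = llm.generate(eval_prompts, params)
for out, ref in zip(outputs, references):
    prediction = out.outputs[0].text.strip()
    print(f"prompt={out.prompt!r}\nprediction={prediction!r}\nreference={ref!r}\n")
```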
Deploy your model
Run inference on your model at scale
Download model weights (see the sketch after this list)
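A sketch of pulling weights back out of storage before serving, again assuming the YTsaurus Python client (`yt.wrapper`) for Tracto Cypress; the Cypress path and local filename are hypothetical.

```python
import yt.wrapper as yt

# Placeholder proxy address, as in the data-preparation sketch.
client = yt.YtClient(proxy="<your-tracto-proxy>")

# Hypothetical Cypress path where the trained weights were uploaded.
cypress_path = "//home/models/my-model/pytorch_model.bin"

with open("pytorch_model.bin", "wb") as local_file:
    local_file.write(client.read_file(cypress_path).read())
```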
Infrastructure
Access GPU infra from Python code
Choose from pre-built Docker images or bring your own image
Run multi-GPU training jobs with a few lines of code (see the sketch below)
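The sketch below shows the kind of multi-GPU script such a job would run, using plain PyTorch DistributedDataParallel; it assumes the launcher (torchrun or the platform) sets the usual RANK, WORLD_SIZE, and LOCAL_RANK environment variables, and the model and training loop are toys rather than Tracto's own submission API.

```python
import os

import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # The launcher (e.g. torchrun) sets RANK, WORLD_SIZE, and LOCAL_RANK.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Toy model and optimizer standing in for a real network.
    model = DDP(torch.nn.Linear(1024, 1024).cuda(local_rank),
                device_ids=[local_rank])
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

    for _ in range(10):  # toy training loop over random data
        batch = torch.randn(32, 1024, device=f"cuda:{local_rank}")
        loss = model(batch).pow(2).mean()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

Launched locally this is `torchrun --nproc_per_node=<gpus> train.py`; the same script can run unchanged inside a pre-built or custom Docker image on the cluster.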