A distributed machine learning job orchestration platform built in Rust.
AtlasML provides a Redis-backed task queue, worker execution pipeline, and orchestration APIs for managing ML workloads. The project explores core distributed systems concepts including task scheduling, worker coordination, queue-based execution, and scalable infrastructure design.
- ⚡ High-performance Rust backend
- 🔄 Redis-backed distributed task queue
- 🧠 ML job orchestration API
- 👷 Independent worker execution system
- 📦 JSON-based job serialization
- 🔗 Decoupled producer-consumer architecture
- 🚀 Horizontal scaling through multiple workers
- 🛠 Built on the modern Rust ecosystem
               ┌─────────────────┐
               │     Client      │
               └────────┬────────┘
                        │
                        ▼
               ┌─────────────────┐
               │  Orchestrator   │
               │   (Axum API)    │
               └────────┬────────┘
                        │
                        ▼
               ┌─────────────────┐
               │      Redis      │
               │    Job Queue    │
               └────────┬────────┘
                        │
           ┌────────────┴────────────┐
           ▼                         ▼
    ┌─────────────┐           ┌─────────────┐
    │  Worker A   │           │  Worker B   │
    └─────────────┘           └─────────────┘
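The flow in the diagram can be approximated inside a single process. The sketch below is an illustration only, not AtlasML's implementation: a `std::sync::mpsc` channel stands in for the Redis job queue, threads stand in for separate worker processes, and the names `Job` and `run_pipeline` are hypothetical.

```rust
use std::sync::mpsc;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical job record; in AtlasML the payload is JSON stored in Redis.
#[derive(Debug, Clone, PartialEq)]
struct Job {
    id: u32,
    name: String,
}

/// Enqueues `jobs` on a shared queue and lets `worker_count` workers drain it.
/// Returns (worker index, job) pairs for every job that was processed.
fn run_pipeline(jobs: Vec<Job>, worker_count: usize) -> Vec<(usize, Job)> {
    // Channel standing in for the Redis queue (assumption for this sketch).
    let (queue_tx, queue_rx) = mpsc::channel::<Job>();
    // Workers share one receiver, so each job is delivered to exactly one worker.
    let queue_rx = Arc::new(Mutex::new(queue_rx));
    let (done_tx, done_rx) = mpsc::channel::<(usize, Job)>();

    let mut handles = Vec::new();
    for worker_id in 0..worker_count {
        let queue_rx = Arc::clone(&queue_rx);
        let done_tx = done_tx.clone();
        handles.push(thread::spawn(move || loop {
            // The lock is held only while waiting for and taking the next job.
            let next = queue_rx.lock().unwrap().recv();
            match next {
                Ok(job) => done_tx.send((worker_id, job)).unwrap(),
                Err(_) => break, // queue closed and fully drained
            }
        }));
    }
    // Drop the original sender so `done_rx` ends once all workers exit.
    drop(done_tx);

    // Producer side: the orchestrator enqueues jobs, then closes the queue.
    for job in jobs {
        queue_tx.send(job).unwrap();
    }
    drop(queue_tx);

    for handle in handles {
        handle.join().unwrap();
    }
    done_rx.iter().collect()
}
```

Calling `run_pipeline(jobs, 2)` mirrors the two-worker layout in the diagram. Which worker handles which job is not deterministic, which is also true of competing consumers on a shared queue.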
Tech stack:
- Rust
- Axum
- Tokio
- Redis
- Serde
- Serde JSON
- Distributed Worker Model
- Queue-Based Execution
- Producer-Consumer Architecture
Prerequisites:
- Rust
- Cargo
- Redis
brew services start redis

Verify:
redis-cli ping

Expected:
PONG

cd services/orchestrator
cargo run

cd services/worker
cargo run

curl -X POST http://127.0.0.1:8080/jobs \
  -H "Content-Type: application/json" \
  -d '{"name":"bert-training"}'

Worker:
🚀 worker started
⚙️ processing: bert-training
✅ completed: f7484c02-82ca-4e7c-8087-4fdcbfd68548
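
The worker output above suggests a simple lifecycle: a job is picked up by name and reported complete by ID. A minimal sketch of that lifecycle follows. The `Job` fields, the `JobStatus` enum, and the `process` function are assumptions made for illustration, and the ID is passed in rather than generated.

```rust
/// Hypothetical lifecycle states; the log only shows "processing" and "completed".
#[derive(Debug, Clone, Copy, PartialEq)]
enum JobStatus {
    Queued,
    Processing,
    Completed,
}

/// Hypothetical job record: an ID assigned on submission plus the submitted name.
#[derive(Debug)]
struct Job {
    id: String,
    name: String,
    status: JobStatus,
}

impl Job {
    /// Creates a job in the `Queued` state.
    fn new(id: &str, name: &str) -> Self {
        Job {
            id: id.to_string(),
            name: name.to_string(),
            status: JobStatus::Queued,
        }
    }
}

/// Moves a job through its states and returns the log lines a worker might print.
fn process(job: &mut Job) -> Vec<String> {
    let mut log = Vec::new();
    job.status = JobStatus::Processing;
    log.push(format!("processing: {}", job.name));
    // Real work (training, evaluation, and so on) would run here.
    job.status = JobStatus::Completed;
    log.push(format!("completed: {}", job.id));
    log
}
```

Keeping the status on the job record is one way to let an API report progress later; whether AtlasML stores status this way is not stated here.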
AtlasML is being developed as a learning-focused distributed systems project covering:
- Distributed task scheduling
- Queue-based execution systems
- Worker orchestration
- Infrastructure engineering
- Scalable backend architecture
- ML platform fundamentals
Roadmap:
- Redis-backed job queue
- Distributed worker execution
- Orchestration API
- Job serialization pipeline
- Worker registration
- Worker heartbeats
- Retry engine
- Dead-letter queue
- Job priorities
- Metrics endpoint
- Multi-worker coordination
- Docker Compose deployment
- Integration testing
- Scheduler service
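
The retry engine and dead-letter queue listed above are not described further in this README. As one possible shape, and purely as an assumption, a worker loop could requeue a failed job until it reaches an attempt limit and then park it in a dead-letter queue. `Job`, `MAX_ATTEMPTS`, and `drain` are hypothetical names, and an in-memory `VecDeque` stands in for a Redis list.

```rust
use std::collections::VecDeque;

/// Hypothetical attempt limit before a job is dead-lettered.
const MAX_ATTEMPTS: u32 = 3;

/// Hypothetical job record that tracks how many times it has been attempted.
#[derive(Debug, Clone, PartialEq)]
struct Job {
    name: String,
    attempts: u32,
}

/// Drains `queue`, calling `run` on each job. A failed job is requeued until it
/// has been attempted `MAX_ATTEMPTS` times, then moved to the dead-letter list.
/// Returns `(completed, dead_letter)`.
fn drain<F>(mut queue: VecDeque<Job>, mut run: F) -> (Vec<Job>, Vec<Job>)
where
    F: FnMut(&Job) -> Result<(), String>,
{
    let mut completed = Vec::new();
    let mut dead_letter = Vec::new();
    while let Some(mut job) = queue.pop_front() {
        job.attempts += 1;
        match run(&job) {
            Ok(()) => completed.push(job),
            // Still under the limit: put the job at the back of the queue.
            Err(_) if job.attempts < MAX_ATTEMPTS => queue.push_back(job),
            // Limit reached: stop retrying and keep the job for inspection.
            Err(_) => dead_letter.push(job),
        }
    }
    (completed, dead_letter)
}
```

Requeueing at the back lets other jobs make progress between retries. A production design would likely add a delay or backoff between attempts, which this sketch omits.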
AtlasML draws inspiration from modern distributed compute and orchestration systems such as:
- Ray
- Celery
- Apache Airflow
- Kubernetes Jobs