compiledbyutkarsh/AtlasML

🚀 AtlasML

A distributed machine learning job orchestration platform built in Rust.

AtlasML provides a Redis-backed task queue, worker execution pipeline, and orchestration APIs for managing ML workloads. The project explores core distributed systems concepts including task scheduling, worker coordination, queue-based execution, and scalable infrastructure design.


✨ Features

  • ⚡ High-performance Rust backend
  • 🔄 Redis-backed distributed task queue
  • 🧠 ML job orchestration API
  • 👷 Independent worker execution system
  • 📦 JSON-based job serialization
  • 🔗 Decoupled producer-consumer architecture
  • 🚀 Horizontal scaling through multiple workers
  • 🛠 Built on the modern Rust ecosystem (Axum, Tokio, Serde)

🏗 Architecture

                ┌─────────────────┐
                │     Client      │
                └────────┬────────┘
                         │
                         ▼
                ┌─────────────────┐
                │  Orchestrator   │
                │    (Axum API)   │
                └────────┬────────┘
                         │
                         ▼
                ┌─────────────────┐
                │      Redis      │
                │   Job Queue     │
                └────────┬────────┘
                         │
            ┌────────────┴────────────┐
            ▼                         ▼
     ┌─────────────┐          ┌─────────────┐
     │  Worker A   │          │  Worker B   │
     └─────────────┘          └─────────────┘
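The flow in the diagram can be sketched with the standard library alone. This is a minimal, hedged illustration and not the AtlasML implementation: an in-process channel stands in for the Redis job queue, threads stand in for worker processes, and the `Job` struct, its numeric `id`, and the `run_pipeline` function are hypothetical names introduced for the example.

```rust
use std::sync::mpsc;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical job record; the real AtlasML job type may carry other fields.
#[derive(Debug, Clone, PartialEq)]
pub struct Job {
    pub id: u64,
    pub name: String,
}

/// Enqueues one job per name, then lets `worker_count` workers compete for
/// them. Returns `(worker_index, job)` pairs in completion order.
pub fn run_pipeline(names: &[&str], worker_count: usize) -> Vec<(usize, Job)> {
    // This channel is an in-memory stand-in for the Redis job queue.
    let (queue_tx, queue_rx) = mpsc::channel::<Job>();
    let queue_rx = Arc::new(Mutex::new(queue_rx));
    let (done_tx, done_rx) = mpsc::channel::<(usize, Job)>();

    let mut handles = Vec::new();
    for worker in 0..worker_count {
        let queue_rx = Arc::clone(&queue_rx);
        let done_tx = done_tx.clone();
        handles.push(thread::spawn(move || loop {
            // The lock is released at the end of this statement, before the
            // job is handled, so each job goes to exactly one worker.
            let next = queue_rx.lock().unwrap().recv();
            match next {
                Ok(job) => done_tx.send((worker, job)).unwrap(),
                Err(_) => break, // queue closed and fully drained
            }
        }));
    }
    drop(done_tx);

    // Orchestrator role: accept submissions and push them onto the queue.
    for (i, name) in names.iter().enumerate() {
        let job = Job { id: i as u64 + 1, name: name.to_string() };
        queue_tx.send(job).unwrap();
    }
    drop(queue_tx); // signal that no more jobs will arrive

    for handle in handles {
        handle.join().unwrap();
    }
    done_rx.iter().collect()
}
```

Calling `run_pipeline(&["bert-training", "resnet-eval"], 2)` returns one entry per submitted job. Which worker handles which job is not deterministic, which mirrors how independent workers compete for items on a shared queue.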

🛠 Tech Stack

Backend

  • Rust
  • Axum
  • Tokio

Queue & Messaging

  • Redis

Serialization

  • Serde
  • Serde JSON

Infrastructure

  • Distributed Worker Model
  • Queue-Based Execution
  • Producer-Consumer Architecture

🚀 Getting Started

Prerequisites

  • Rust
  • Cargo
  • Redis

Start Redis (macOS, Homebrew)

brew services start redis

Verify:

redis-cli ping

Expected:

PONG

Run Orchestrator (from the repository root)

cd services/orchestrator
cargo run

Run Worker (in a second terminal, from the repository root)

cd services/worker
cargo run

🧪 Submit a Job

With the orchestrator running, send a POST request to /jobs:

curl -X POST http://127.0.0.1:8080/jobs \
-H "Content-Type: application/json" \
-d '{"name":"bert-training"}'
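As a rough sketch of what could happen to this request body before it reaches the queue, the example below turns a job name into a JSON payload. Several things here are assumptions rather than AtlasML's actual code: the `QueuedJob` struct, the `id` field and its numeric type (the example output in this README shows a UUID), and the function names. The encoder is hand-written only to keep the sketch dependency-free; the project itself lists Serde and Serde JSON for serialization.

```rust
/// Hypothetical queued-job record. The numeric `id` is an assumption made to
/// avoid a UUID dependency in this sketch.
#[derive(Debug, Clone, PartialEq)]
pub struct QueuedJob {
    pub id: u64,
    pub name: String,
}

/// Escapes a string so it can be placed inside a JSON string literal.
fn escape_json(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for c in input.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \u00XX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Serializes a job into the JSON text that would be pushed onto the queue.
pub fn to_queue_payload(job: &QueuedJob) -> String {
    format!("{{\"id\":{},\"name\":\"{}\"}}", job.id, escape_json(&job.name))
}
```

For a job with id `1` and name `bert-training`, `to_queue_payload` produces `{"id":1,"name":"bert-training"}`. In a Serde-based codebase the same result would normally come from a derived `Serialize` implementation instead of manual formatting.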

📋 Example Output

Worker:

🚀 worker started
⚙️ processing: bert-training
✅ completed: f7484c02-82ca-4e7c-8087-4fdcbfd68548

🎯 Project Goals

AtlasML is being developed as a learning-focused distributed systems project covering:

  • Distributed task scheduling
  • Queue-based execution systems
  • Worker orchestration
  • Infrastructure engineering
  • Scalable backend architecture
  • ML platform fundamentals

🗺 Roadmap

Current

  • Redis-backed job queue
  • Distributed worker execution
  • Orchestration API
  • Job serialization pipeline

Planned

  • Worker registration
  • Worker heartbeats
  • Retry engine
  • Dead-letter queue
  • Job priorities
  • Metrics endpoint
  • Multi-worker coordination
  • Docker Compose deployment
  • Integration testing
  • Scheduler service
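To make the planned retry engine and dead-letter queue more concrete, here is one possible shape for them. Nothing below exists in AtlasML yet: `RetryJob`, `drain_with_retries`, and the `max_attempts` parameter are hypothetical, and an in-memory `VecDeque` stands in for the Redis-backed queue.

```rust
use std::collections::VecDeque;

/// Hypothetical job record that tracks how many times it has been attempted.
#[derive(Debug, Clone, PartialEq)]
pub struct RetryJob {
    pub name: String,
    pub attempts: u32,
}

/// Drains `queue`, running each job with `run`. A failed job goes back to the
/// end of the queue until it has been attempted `max_attempts` times, after
/// which it moves to the dead-letter list.
/// Returns `(completed, dead_letter)`.
pub fn drain_with_retries<F>(
    mut queue: VecDeque<RetryJob>,
    max_attempts: u32,
    mut run: F,
) -> (Vec<RetryJob>, Vec<RetryJob>)
where
    F: FnMut(&RetryJob) -> Result<(), String>,
{
    let mut completed = Vec::new();
    let mut dead_letter = Vec::new();

    while let Some(mut job) = queue.pop_front() {
        job.attempts += 1;
        match run(&job) {
            Ok(()) => completed.push(job),
            // Attempts remain: put the job back at the end of the queue.
            Err(_) if job.attempts < max_attempts => queue.push_back(job),
            // Attempts exhausted: park the job for later inspection.
            Err(_) => dead_letter.push(job),
        }
    }
    (completed, dead_letter)
}
```

Requeueing at the back keeps one failing job from blocking the jobs behind it. A real retry engine would likely add a delay or backoff between attempts, which this sketch leaves out.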

📚 Inspiration

AtlasML draws inspiration from modern distributed compute and orchestration systems such as:

  • Ray
  • Celery
  • Apache Airflow
  • Kubernetes Jobs
