Skip to content
Open
Show file tree
Hide file tree
Changes from 5 commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
12 changes: 7 additions & 5 deletions CLAUDE.md
Original file line number Diff line number Diff line change
Expand Up @@ -80,21 +80,23 @@ granite-switch/

## Installation (local/dev)

This project uses [uv](https://docs.astral.sh/uv/getting-started/installation/).

```bash
# Core package only (config)
pip install -e .
uv sync

# With HuggingFace backend
pip install -e ".[hf]"
uv sync --extra hf

# With vLLM backend
pip install -e ".[vllm]"
uv sync --extra vllm

# With compose tools
pip install -e ".[compose]"
uv sync --extra compose

# Everything (development)
pip install -e ".[dev]"
uv sync --extra dev
```

## Import Paths
Expand Down
4 changes: 2 additions & 2 deletions CONTRIBUTING.md
Original file line number Diff line number Diff line change
Expand Up @@ -9,10 +9,10 @@ Thank you for your interest in contributing to Granite Switch!
```bash
git clone https://github.com/<your-username>/granite-switch.git
cd granite-switch
pip install -e ".[dev]"
uv sync --extra dev
```
3. Create a feature branch and make your changes
4. Run tests: `pytest tests/ -v`
4. Run tests: `uv run pytest tests/ -v`
5. Submit a pull request

## Contribution Guidelines
Expand Down
41 changes: 29 additions & 12 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
# Granite Switch — Build AI models like you build software

[![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](LICENSE)
[![Python 3.9+](https://img.shields.io/badge/python-3.9+-blue.svg)](https://www.python.org/downloads/)
[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)

| [**Browse Adapters**](https://huggingface.co/collections/ibm-granite/granite-libraries) | [Models on HF](https://huggingface.co/ibm-granite/granite-switch-4.1-8b-preview) | [Tutorials](tutorials/README.md) |

Expand All @@ -20,19 +20,28 @@ Browse available libraries in the [Granite Libraries collection](https://hugging

### Install

This project uses [uv](https://docs.astral.sh/uv/getting-started/installation/) for dependency management. Install uv first (one-time setup), then:

```bash
python -m venv venv && source venv/bin/activate

# Granite-Switch installation is based on your usecase:
pip install "granite-switch[compose]" # Compose modular models
pip install "granite-switch[hf]" # HuggingFace inference
pip install "granite-switch[vllm]" # vLLM production inference (0.19.x)
pip install "granite-switch[vllm20]" # vLLM 0.20+ (requires CUDA 13+)
pip install "granite-switch[dev]" # Everything (uses vLLM 0.19.x by default)
pip install "granite-switch[dev-vllm20]" # Dev environment with vLLM 0.20+
git clone https://github.com/generative-computing/granite-switch.git
Comment thread
aviv1ron1 marked this conversation as resolved.
Outdated
cd granite-switch
uv sync
```

Requires Python 3.9+ and PyTorch 2.0+.
Then add the extra for your use case:

| Extra | Command | Use case |
|-------|---------|----------|
| `compose` | `uv sync --extra compose` | Compose modular models |
| `hf` | `uv sync --extra hf` | HuggingFace inference |
| `vllm` | `uv sync --extra vllm` | vLLM inference (CUDA 12.x) |
| `vllm20` | `uv sync --extra vllm20` | vLLM 0.20+ (CUDA 13+) |
| `dev` | `uv sync --extra dev` | Full dev environment (CUDA 12.x) |
| `dev-vllm20` | `uv sync --extra dev-vllm20` | Full dev environment (CUDA 13+) |

Requires Python 3.10+ and PyTorch 2.0+.

> **Installing from PyPI instead?** Use `pip install "granite-switch[hf]"` or `uv pip install "granite-switch[hf]"` (swap `hf` for any extra above).

> **vLLM version note:** This project currently defaults to vLLM 0.19.1 due to vLLM 0.20's
> dependency on CUDA 13.0+ (via PyTorch 2.11), which is incompatible with many existing
Expand Down Expand Up @@ -62,10 +71,18 @@ For convenience, you can find already composed Granite Switch models for the Gra

### Run Inference

> **Tip: pre-download the model for faster startup.** The first run will download several GB from Hugging Face, which can be slow. To download in advance using the fast transfer backend:
> ```bash
> uv pip install huggingface_hub[hf_transfer]
> huggingface-cli login # one-time, if not already logged in
> HF_HUB_ENABLE_HF_TRANSFER=1 hf download ibm-granite/granite-switch-4.1-3b-preview
> ```
> Subsequent runs will use the local cache automatically.

**vLLM + Mellea (recommended):**

```bash
pip install mellea
uv pip install mellea
# Example with the 3B model
python -m vllm.entrypoints.openai.api_server --model ibm-granite/granite-switch-4.1-3b-preview --port 8000
```
Expand Down
2 changes: 1 addition & 1 deletion docs/GIT_WORKFLOW.md
Original file line number Diff line number Diff line change
Expand Up @@ -62,7 +62,7 @@ Fixes #123

Before committing:

1. **Run tests**: `pytest tests/ -v`
1. **Run tests**: `uv run pytest tests/ -v`
2. **Check comments match code** — stale comments are worse than no comments
3. **Update docs** if behavior changed

Expand Down
8 changes: 8 additions & 0 deletions pyproject.toml
Original file line number Diff line number Diff line change
Expand Up @@ -70,6 +70,14 @@ conflicts = [
{ extra = "dev-vllm20" },
{ extra = "vllm" },
],
[
{ extra = "tutorials" },
{ extra = "vllm20" },
],
[
{ extra = "tutorials" },
{ extra = "dev-vllm20" },
],
]

[tool.setuptools.packages.find]
Expand Down
16 changes: 10 additions & 6 deletions tutorials/PREREQUISITES.md
Original file line number Diff line number Diff line change
Expand Up @@ -19,30 +19,34 @@ Python 3.10+ is required.

### Base Installation

Install [uv](https://docs.astral.sh/uv/getting-started/installation/), then:

```bash
pip install granite-switch
git clone https://github.com/generative-computing/granite-switch.git
cd granite-switch
uv sync
```

### HuggingFace Backend

For direct model inference with HuggingFace Transformers:

```bash
pip install "granite-switch[hf,compose]"
uv sync --extra hf
```

This includes:
- `transformers` for model loading and generation
- `torch` with CUDA support
- `peft` for LoRA operations
- Compose tools for model building

### vLLM Backend

For production inference with vLLM:

```bash
pip install "granite-switch[vllm]"
uv sync --extra vllm # CUDA 12.x
uv sync --extra vllm20 # CUDA 13+ (requires PyTorch 2.11+)
```

This includes:
Expand All @@ -54,15 +58,15 @@ This includes:
Mellea provides high-level intrinsic functions for adapter invocation:

```bash
pip install mellea
uv pip install mellea
```

### Notebook Dependencies

For running Jupyter notebooks:

```bash
pip install jupyter chromadb tqdm httpx python-dotenv
uv pip install jupyter chromadb tqdm httpx python-dotenv
```

## Model Access
Expand Down
4 changes: 2 additions & 2 deletions tutorials/guides/compare_inference_throughput.md
Original file line number Diff line number Diff line change
Expand Up @@ -26,8 +26,8 @@ The notebook runs both servers sequentially on a single A100 GPU and produces
- Two GPUs (one per server) for simultaneous mode, or one GPU for sequential mode
- Install dependencies:
```bash
pip install -e ".[vllm]"
pip install mellea chromadb rich tqdm transformers httpx
uv sync --extra vllm
uv pip install mellea chromadb rich tqdm transformers httpx
```
- Build the ChromaDB index (once):
```bash
Expand Down
2 changes: 1 addition & 1 deletion tutorials/guides/mellea_with_granite_switch.md
Original file line number Diff line number Diff line change
Expand Up @@ -39,7 +39,7 @@ See [PREREQUISITES.md](../PREREQUISITES.md) for detailed setup instructions.

```bash
# Install Mellea from source
pip install "git+https://github.com/generative-computing/mellea.git@main"
uv pip install "git+https://github.com/generative-computing/mellea.git@main"
```

## Quick Example
Expand Down
Loading