Commit 5be7c80

docs: auto-sync documentation [skip ci]
1 parent 4ba6fd0 commit 5be7c80

10 files changed

Lines changed: 50 additions & 12 deletions

docs/packages/openadapt-capture.md

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@

 [![GitHub](https://img.shields.io/github/stars/OpenAdaptAI/openadapt-capture?style=social)](https://github.com/OpenAdaptAI/openadapt-capture)

-> *Auto-generated from [OpenAdaptAI/openadapt-capture](https://github.com/OpenAdaptAI/openadapt-capture). Last synced: 2026-03-29 15:50 UTC*
+> *Auto-generated from [OpenAdaptAI/openadapt-capture](https://github.com/OpenAdaptAI/openadapt-capture). Last synced: 2026-03-29 15:56 UTC*

 ---

docs/packages/openadapt-consilium.md

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@

 [![GitHub](https://img.shields.io/github/stars/OpenAdaptAI/openadapt-consilium?style=social)](https://github.com/OpenAdaptAI/openadapt-consilium)

-> *Auto-generated from [OpenAdaptAI/openadapt-consilium](https://github.com/OpenAdaptAI/openadapt-consilium). Last synced: 2026-03-29 15:50 UTC*
+> *Auto-generated from [OpenAdaptAI/openadapt-consilium](https://github.com/OpenAdaptAI/openadapt-consilium). Last synced: 2026-03-29 15:56 UTC*

 ---

docs/packages/openadapt-crier.md

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@

 [![GitHub](https://img.shields.io/github/stars/OpenAdaptAI/openadapt-crier?style=social)](https://github.com/OpenAdaptAI/openadapt-crier)

-> *Auto-generated from [OpenAdaptAI/openadapt-crier](https://github.com/OpenAdaptAI/openadapt-crier). Last synced: 2026-03-29 15:50 UTC*
+> *Auto-generated from [OpenAdaptAI/openadapt-crier](https://github.com/OpenAdaptAI/openadapt-crier). Last synced: 2026-03-29 15:56 UTC*

 ---

docs/packages/openadapt-desktop.md

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@

 [![GitHub](https://img.shields.io/github/stars/OpenAdaptAI/openadapt-desktop?style=social)](https://github.com/OpenAdaptAI/openadapt-desktop)

-> *Auto-generated from [OpenAdaptAI/openadapt-desktop](https://github.com/OpenAdaptAI/openadapt-desktop). Last synced: 2026-03-29 15:50 UTC*
+> *Auto-generated from [OpenAdaptAI/openadapt-desktop](https://github.com/OpenAdaptAI/openadapt-desktop). Last synced: 2026-03-29 15:56 UTC*

 ---

docs/packages/openadapt-evals.md

Lines changed: 39 additions & 1 deletion
@@ -2,7 +2,7 @@

 [![GitHub](https://img.shields.io/github/stars/OpenAdaptAI/openadapt-evals?style=social)](https://github.com/OpenAdaptAI/openadapt-evals)

-> *Auto-generated from [OpenAdaptAI/openadapt-evals](https://github.com/OpenAdaptAI/openadapt-evals). Last synced: 2026-03-29 15:50 UTC*
+> *Auto-generated from [OpenAdaptAI/openadapt-evals](https://github.com/OpenAdaptAI/openadapt-evals). Last synced: 2026-03-29 15:56 UTC*

 ---

@@ -204,6 +204,44 @@ python scripts/run_full_eval.py \

 The endpoint uses the UI-Venus native bounding-box prompt format (`[x1,y1,x2,y2]`) and is compatible with vLLM, Ollama, or any OpenAI-compatible server. Both `DemoExecutor` and `PlannerGrounderAgent` use the same prompt format for consistency.

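Editor's sketch (not part of the commit): to make the `[x1,y1,x2,y2]` bounding-box format in the context line above concrete, the following minimal example parses such a box out of a model response and derives a click point at its center. The names `BBOX_RE`, `parse_bbox`, and `bbox_center` are hypothetical and are not taken from openadapt-evals; the project's real parser may differ, for instance in whether coordinates are pixels or normalized values.

```python
import re
from typing import Optional, Tuple

BBox = Tuple[float, float, float, float]

# Hypothetical pattern: four comma-separated numbers inside square brackets.
_NUM = r"(-?\d+(?:\.\d+)?)"
BBOX_RE = re.compile(
    r"\[\s*" + _NUM + r"\s*,\s*" + _NUM + r"\s*,\s*" + _NUM + r"\s*,\s*" + _NUM + r"\s*\]"
)


def parse_bbox(text: str) -> Optional[BBox]:
    """Return the first [x1,y1,x2,y2] box found in text, or None if absent."""
    match = BBOX_RE.search(text)
    if match is None:
        return None
    x1, y1, x2, y2 = (float(group) for group in match.groups())
    # Reorder corners so (x1, y1) is top-left and (x2, y2) is bottom-right.
    return (min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2))


def bbox_center(bbox: BBox) -> Tuple[float, float]:
    """Return the center point of a box, e.g. as a click target."""
    x1, y1, x2, y2 = bbox
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)
```

Returning `None` for unparseable output, rather than raising, lets a caller decide whether to retry or to score the step as a failure.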
+### GRPO training with TRL (recommended)
+
+The recommended path for RL training of VLM desktop agents uses TRL's `GRPOTrainer` with dense milestone rewards from WAA environments. This replaces the standalone GRPO trainer with a battle-tested implementation that supports Unsloth, vLLM, constrained decoding, and automatic telemetry.
+
+```bash
+# Basic training against a live WAA VM
+python scripts/train_trl_grpo.py \
+  --task-dir ./example_tasks \
+  --server-url http://localhost:5001 \
+  --model Qwen/Qwen2.5-VL-7B-Instruct \
+  --output ./grpo_output
+
+# With Unsloth (2x VRAM efficiency) + constrained decoding
+python scripts/train_trl_grpo.py \
+  --task-dir ./example_tasks \
+  --server-url http://localhost:5001 \
+  --model Qwen/Qwen2.5-VL-7B-Instruct \
+  --use-unsloth \
+  --constrained-decoding \
+  --output ./grpo_output
+
+# Mock mode (validates full pipeline without VM or GPU)
+python scripts/train_trl_grpo.py \
+  --task-dir ./example_tasks \
+  --mock \
+  --output ./grpo_output_mock
+
+# With Weave tracing for experiment tracking
+python scripts/train_trl_grpo.py \
+  --task-dir ./example_tasks \
+  --server-url http://localhost:5001 \
+  --model Qwen/Qwen2.5-VL-7B-Instruct \
+  --weave-project openadapt-grpo \
+  --output ./grpo_output
+```
+
+Key flags: `--constrained-decoding` (Outlines regex, eliminates unparseable output), `--vision-loss-mode` (exclude/include/checkpoint), `--weave-project` (Weave tracing), `--use-vllm` (faster generation), `--loss-type` (grpo/dapo/dr_grpo).
+
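Editor's sketch (not part of the commit): for intuition about how dense milestone rewards feed GRPO, the example below scores each rollout by the fraction of milestones it passed and then normalizes rewards within a group of rollouts for the same task, which is the standard group-relative advantage used by GRPO. The function names and the fraction-of-milestones reward are assumptions for illustration; the reward logic in `scripts/train_trl_grpo.py` is not shown in this diff, and TRL's `GRPOTrainer` computes advantages internally.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def milestone_reward(milestones_passed: Sequence[bool]) -> float:
    """Dense reward (assumed form): fraction of task milestones a rollout reached."""
    if not milestones_passed:
        return 0.0
    return sum(milestones_passed) / len(milestones_passed)


def group_relative_advantages(rewards: Sequence[float], eps: float = 1e-6) -> List[float]:
    """Normalize rewards within one group of rollouts sampled for the same prompt.

    Each advantage is (reward - group mean) / (group std + eps), so rollouts are
    compared only against their siblings and no learned value function is needed.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four hypothetical rollouts of one task, each checked against four milestones.
group = [
    [True, True, False, False],
    [True, True, True, True],
    [False, False, False, False],
    [True, False, False, False],
]
rewards = [milestone_reward(m) for m in group]
advantages = group_relative_advantages(rewards)
```

When every rollout in a group earns the same reward, all advantages are zero and the group contributes no gradient, which is one reason dense milestone rewards can help over a sparse pass/fail signal.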
 ### Parallel evaluation

 ```bash

docs/packages/openadapt-herald.md

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@

 [![GitHub](https://img.shields.io/github/stars/OpenAdaptAI/openadapt-herald?style=social)](https://github.com/OpenAdaptAI/openadapt-herald)

-> *Auto-generated from [OpenAdaptAI/openadapt-herald](https://github.com/OpenAdaptAI/openadapt-herald). Last synced: 2026-03-29 15:50 UTC*
+> *Auto-generated from [OpenAdaptAI/openadapt-herald](https://github.com/OpenAdaptAI/openadapt-herald). Last synced: 2026-03-29 15:56 UTC*

 ---

docs/packages/openadapt-ml.md

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@

 [![GitHub](https://img.shields.io/github/stars/OpenAdaptAI/openadapt-ml?style=social)](https://github.com/OpenAdaptAI/openadapt-ml)

-> *Auto-generated from [OpenAdaptAI/openadapt-ml](https://github.com/OpenAdaptAI/openadapt-ml). Last synced: 2026-03-29 15:50 UTC*
+> *Auto-generated from [OpenAdaptAI/openadapt-ml](https://github.com/OpenAdaptAI/openadapt-ml). Last synced: 2026-03-29 15:56 UTC*

 ---

docs/packages/openadapt-wright.md

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@

 [![GitHub](https://img.shields.io/github/stars/OpenAdaptAI/openadapt-wright?style=social)](https://github.com/OpenAdaptAI/openadapt-wright)

-> *Auto-generated from [OpenAdaptAI/openadapt-wright](https://github.com/OpenAdaptAI/openadapt-wright). Last synced: 2026-03-29 15:50 UTC*
+> *Auto-generated from [OpenAdaptAI/openadapt-wright](https://github.com/OpenAdaptAI/openadapt-wright). Last synced: 2026-03-29 15:56 UTC*

 ---

docs/packages/openadapt.md

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@

 [![GitHub](https://img.shields.io/github/stars/OpenAdaptAI/OpenAdapt?style=social)](https://github.com/OpenAdaptAI/OpenAdapt)

-> *Auto-generated from [OpenAdaptAI/OpenAdapt](https://github.com/OpenAdaptAI/OpenAdapt). Last synced: 2026-03-29 15:50 UTC*
+> *Auto-generated from [OpenAdaptAI/OpenAdapt](https://github.com/OpenAdaptAI/OpenAdapt). Last synced: 2026-03-29 15:56 UTC*

 ---

docs/whats-new.md

Lines changed: 3 additions & 3 deletions
@@ -1,7 +1,7 @@
 # What's New

 > *Auto-generated digest of recent changes across the OpenAdapt ecosystem.*
-> *Last updated: 2026-03-29 15:50 UTC*
+> *Last updated: 2026-03-29 15:57 UTC*



@@ -21,6 +21,8 @@
 ## openadapt-evals


+- [feat: TRL GRPOTrainer migration with drop-in wrapper](https://github.com/OpenAdaptAI/openadapt-evals/pull/229) (#229) — merged
+
 - [feat: Weave integration for LLM/agent tracing](https://github.com/OpenAdaptAI/openadapt-evals/pull/228) (#228) — merged

 - [fix: loss diagnostic logging + training step test](https://github.com/OpenAdaptAI/openadapt-evals/pull/227) (#227) — merged
@@ -59,8 +61,6 @@

 - [fix: update enrichment tests for new instruction format](https://github.com/OpenAdaptAI/openadapt-evals/pull/210) (#210) — merged

-- [feat: document DemoExecutor + standalone trainer, add telemetry events](https://github.com/OpenAdaptAI/openadapt-evals/pull/209) (#209) — merged
-


