feat(rl): add off-policy IS correction hook (current policy vs rollout)#2084

Open
EazyReal wants to merge 1 commit into THUDM:main from EazyReal:off-policy-is

Commits