GitHub - modelscope/RM-Gallery: A One-Stop Reward Model Platform

A unified platform for building, evaluating, and applying reward models.

News

2025-10-20 - Auto-Rubric: Learning to Extract Generalizable Criteria for Reward Modeling - We released a new paper on learning generalizable reward criteria for robust modeling.
2025-10-17 - Taming the Judge: Deconflicting AI Feedback for Stable Reinforcement Learning - We introduced techniques to align judge feedback and improve RL stability.
2025-07-09 - Released RM-Gallery v0.1.0 on PyPI

Installation

RM-Gallery requires Python 3.10 or later, but below 3.13 (>= 3.10, < 3.13).
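Before installing, it can help to confirm that the interpreter falls inside the range stated above. The following is a minimal sketch using only the standard library; the helper name `is_supported_python` is hypothetical and not part of RM-Gallery.

```python
import sys

# Supported range taken from the stated requirement: >= 3.10 and < 3.13.
MIN_VERSION = (3, 10)
MAX_VERSION_EXCLUSIVE = (3, 13)


def is_supported_python(version=None):
    """Return True if the (major, minor) version lies in the supported range."""
    major, minor = (version or sys.version_info)[:2]
    return MIN_VERSION <= (major, minor) < MAX_VERSION_EXCLUSIVE


if __name__ == "__main__":
    status = "supported" if is_supported_python() else "not supported"
    print(f"Python {sys.version_info.major}.{sys.version_info.minor} is {status}")
```

Passing an explicit tuple such as `(3, 12, 5)` checks a version other than the running interpreter.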

pip install rm-gallery

Or install from source:

git clone https://github.com/modelscope/RM-Gallery.git
cd RM-Gallery
pip install .

Quick Start

from rm_gallery.core.reward.registry import RewardRegistry
from rm_gallery.core.data.schema import DataSample

# Choose from 35+ pre-built reward models
rm = RewardRegistry.get("safety_listwise_reward")

# Evaluate your data
sample = DataSample(...)
result = rm.evaluate(sample)

See the quickstart guide for a complete example, or try our interactive notebooks.

Features

Pre-built Reward Models

Access 35+ reward models for different domains:

rm = RewardRegistry.get("math_correctness_reward")
rm = RewardRegistry.get("code_quality_reward")
rm = RewardRegistry.get("helpfulness_listwise_reward")

View all reward models

Custom Reward Models

Build your own reward models with simple APIs:

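from rm_gallery.core.reward import BasePointWiseReward
# Import path for RewardResult is an assumption; confirm it against the installed package.
from rm_gallery.core.reward.schema import RewardResult

class CustomReward(BasePointWiseReward):
    def _evaluate(self, sample, **kwargs):
        # Your evaluation logic: score `sample` and wrap the outcome
        return RewardResult(...)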

Learn more about building custom RMs

Benchmarking

Evaluate models on standard benchmarks:

RewardBench2 - Latest reward model benchmark
RM-Bench - Comprehensive evaluation suite
Conflict Detector - Detect evaluation inconsistencies
JudgeBench - Judge capability assessment

Read the evaluation guide
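Reward model benchmarks are often scored by pairwise accuracy: the fraction of (chosen, rejected) response pairs in which the model gives the chosen response the higher score. The sketch below illustrates that metric in plain Python. It does not use RM-Gallery's evaluation API, and `pairwise_accuracy` and the toy `length_reward` scorer are hypothetical names introduced for illustration.

```python
from typing import Callable, Iterable, Tuple


def pairwise_accuracy(
    pairs: Iterable[Tuple[str, str]],
    score: Callable[[str], float],
) -> float:
    """Fraction of (chosen, rejected) pairs where chosen scores strictly higher."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("pairs must not be empty")
    correct = sum(1 for chosen, rejected in pairs if score(chosen) > score(rejected))
    return correct / len(pairs)


def length_reward(response: str) -> float:
    """Toy stand-in for a reward model: longer responses score higher."""
    return float(len(response))


pairs = [
    ("Paris is the capital of France.", "I don't know."),  # chosen is longer
    ("4", "The answer is probably 5."),                     # chosen is shorter
]
accuracy = pairwise_accuracy(pairs, length_reward)
```

The length-based scorer ranks the first pair correctly and the second incorrectly, which shows how a superficial reward signal can look acceptable on some pairs and fail on others.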

Applications

Best-of-N Selection - Choose optimal responses from candidates
Data Refinement - Improve dataset quality with reward signals
RLHF Integration - Use rewards in reinforcement learning pipelines
High-Performance Serving - Deploy models with fault-tolerant infrastructure
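As an illustration of the first item, Best-of-N selection scores every candidate response and keeps the highest-scoring one. The sketch below is plain Python under stated assumptions: `best_of_n` and the toy `keyword_reward` scorer are hypothetical, and in practice the scoring function would wrap a real reward model.

```python
from typing import Callable, Sequence


def best_of_n(candidates: Sequence[str], score: Callable[[str], float]) -> str:
    """Return the candidate with the highest reward score (first one wins ties)."""
    if not candidates:
        raise ValueError("candidates must not be empty")
    return max(candidates, key=score)


def keyword_reward(response: str) -> float:
    """Toy stand-in for a reward model: counts a few helpful keywords."""
    keywords = ("please", "thanks", "steps")
    text = response.lower()
    return float(sum(text.count(k) for k in keywords))


candidates = [
    "No.",
    "Thanks for asking. Here are the steps to follow.",
    "Please see the manual.",
]
best = best_of_n(candidates, keyword_reward)
```

Keeping the scorer as a plain callable means the selection logic stays the same when the toy function is swapped for a model-backed one.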

Documentation

Contributing

We welcome contributions! Please install pre-commit hooks before submitting pull requests:

pip install -e .
pre-commit install

See our contribution guide for details.

Citation

If you use RM-Gallery in your research, please cite:

@software{rm_gallery,
title = {RM-Gallery: A One-Stop Reward Model Platform},
author = {The RM-Gallery Team},
url = {https://github.com/modelscope/RM-Gallery},
month = {07},
year = {2025}
}
