Merged
18 changes: 18 additions & 0 deletions docs/sphinx_doc/source/api_reference.rst
@@ -0,0 +1,18 @@
.. _api-reference:

API Reference
=============

This page lists the main API modules of Trinity-RFT. Click a module name to see its detailed documentation.

.. toctree::
   :maxdepth: 1
   :glob:

   build_api/trinity.buffer
   build_api/trinity.explorer
   build_api/trinity.trainer
   build_api/trinity.algorithm
   build_api/trinity.manager
   build_api/trinity.common
   build_api/trinity.utils
13 changes: 4 additions & 9 deletions docs/sphinx_doc/source/index.rst
@@ -35,19 +35,14 @@ Welcome to Trinity-RFT's documentation!

.. toctree::
:maxdepth: 2
:hidden:
:caption: FAQ

tutorial/faq.md

.. toctree::
:maxdepth: 1
:glob:
:maxdepth: 2
:hidden:
:caption: API Reference

build_api/trinity.buffer
build_api/trinity.explorer
build_api/trinity.trainer
build_api/trinity.algorithm
build_api/trinity.manager
build_api/trinity.common
build_api/trinity.utils
api_reference
28 changes: 14 additions & 14 deletions docs/sphinx_doc/source/main.md
@@ -6,7 +6,6 @@
# Trinity-RFT: A General-Purpose and Unified Framework for Reinforcement Fine-Tuning of Large Language Models



## 🚀 News

* [2025-07] Trinity-RFT v0.2.0 is released.
@@ -82,6 +81,7 @@ It is designed to support diverse application scenarios and serve as a unified p
![Trinity-RFT-data-pipelines](../assets/trinity-data-pipelines.png)

</details>
<br>



@@ -90,12 +90,12 @@ It is designed to support diverse application scenarios and serve as a unified p

* **Adaptation to New Scenarios:**

Implement agent-environment interaction logic in a single `Workflow` or `MultiTurnWorkflow` class. ([Example](./docs/sphinx_doc/source/tutorial/example_multi_turn.md))
Implement agent-environment interaction logic in a single `Workflow` or `MultiTurnWorkflow` class. ([Example](/tutorial/example_multi_turn.md))


* **RL Algorithm Development:**

Develop custom RL algorithms (loss design, sampling, data processing) in compact, plug-and-play classes. ([Example](./docs/sphinx_doc/source/tutorial/example_mix_algo.md))
Develop custom RL algorithms (loss design, sampling, data processing) in compact, plug-and-play classes. ([Example](/tutorial/example_mix_algo.md))


* **Low-Code Usage:**
@@ -301,39 +301,39 @@ For studio users, click "Run" in the web interface.

Tutorials for running different RFT modes:

+ [Quick example: GRPO on GSM8k](./docs/sphinx_doc/source/tutorial/example_reasoning_basic.md)
+ [Off-policy RFT](./docs/sphinx_doc/source/tutorial/example_reasoning_advanced.md)
+ [Fully asynchronous RFT](./docs/sphinx_doc/source/tutorial/example_async_mode.md)
+ [Offline learning by DPO or SFT](./docs/sphinx_doc/source/tutorial/example_dpo.md)
+ [Quick example: GRPO on GSM8k](/tutorial/example_reasoning_basic.md)
+ [Off-policy RFT](/tutorial/example_reasoning_advanced.md)
+ [Fully asynchronous RFT](/tutorial/example_async_mode.md)
+ [Offline learning by DPO or SFT](/tutorial/example_dpo.md)


Tutorials for adapting Trinity-RFT to a new multi-turn agentic scenario:

+ [Multi-turn tasks](./docs/sphinx_doc/source/tutorial/example_multi_turn.md)
+ [Multi-turn tasks](/tutorial/example_multi_turn.md)


Tutorials for data-related functionalities:

+ [Advanced data processing & human-in-the-loop](./docs/sphinx_doc/source/tutorial/example_data_functionalities.md)
+ [Advanced data processing & human-in-the-loop](/tutorial/example_data_functionalities.md)


Tutorials for RL algorithm development/research with Trinity-RFT:

+ [RL algorithm development with Trinity-RFT](./docs/sphinx_doc/source/tutorial/example_mix_algo.md)
+ [RL algorithm development with Trinity-RFT](/tutorial/example_mix_algo.md)


Guidelines for full configurations: see [this document](./docs/sphinx_doc/source/tutorial/trinity_configs.md)
Guidelines for full configurations: see [this document](/tutorial/trinity_configs.md)


Guidelines for developers and researchers:

+ [Build new RL scenarios](./docs/sphinx_doc/source/tutorial/trinity_programming_guide.md#workflows-for-rl-environment-developers)
+ [Implement new RL algorithms](./docs/sphinx_doc/source/tutorial/trinity_programming_guide.md#algorithms-for-rl-algorithm-developers)
+ [Build new RL scenarios](/tutorial/trinity_programming_guide.md#workflows-for-rl-environment-developers)
+ [Implement new RL algorithms](/tutorial/trinity_programming_guide.md#algorithms-for-rl-algorithm-developers)




For some frequently asked questions, see [FAQ](./docs/sphinx_doc/source/tutorial/faq.md).
For some frequently asked questions, see [FAQ](/tutorial/faq.md).



Original file line number Diff line number Diff line change
@@ -8,7 +8,7 @@ In this example, you will learn how to apply the data processor of Trinity-RFT t
2. how to configure the data processor
3. what the data processor can do

Before getting started, you need to prepare the main environment of Trinity-RFT according to the [installation section of the README file](../main.md),
Before getting started, you need to prepare the main environment of Trinity-RFT according to the [installation section of Quickstart](example_reasoning_basic.md),
and store the base URL and API key in the environment variables `OPENAI_BASE_URL` and `OPENAI_API_KEY` if you use agentic or API-model features.
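
Before running the example, it can help to confirm that both variables are visible to Python. The following is a minimal sketch, not part of Trinity-RFT: the helper name `missing_openai_vars` is hypothetical, and only the variable names `OPENAI_BASE_URL` and `OPENAI_API_KEY` come from the text above.

```python
import os

# Variable names taken from the tutorial text.
REQUIRED_VARS = ("OPENAI_BASE_URL", "OPENAI_API_KEY")


def missing_openai_vars(env=None):
    """Return the required variable names that are unset or empty.

    `env` defaults to the real process environment; passing a plain dict
    makes the check easy to try out without touching os.environ.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_openai_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("Both variables are set.")
```

The check only reports whether the variables are present; it does not contact the endpoint or validate the key.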

### Data Preparation
32 changes: 31 additions & 1 deletion docs/sphinx_doc/source/tutorial/example_multi_turn.md
@@ -14,7 +14,37 @@ To run the ALFworld and WebShop env, you need to setup the corresponding environ
- ALFworld is a text-based interactive environment that simulates household scenarios. Agents need to understand natural language instructions and complete various domestic tasks like finding objects, moving items, and operating devices in a virtual home environment.
- WebShop is a simulated online shopping environment where AI agents learn to shop based on user requirements. The platform allows agents to browse products, compare options, and make purchase decisions, mimicking real-world e-commerce interactions.

You may refer to their original environment to complete the setup.
<br>
<details>
<summary>Guidelines for preparing ALFWorld environment</summary>

1. Install with pip: `pip install alfworld[full]`

2. Export the path: `export ALFWORLD_DATA=/path/to/alfworld/data`

3. Download the environment: `alfworld-download`

Now you can find the environment in `$ALFWORLD_DATA` and continue with the following steps.
</details>
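
After the ALFWorld steps above, a short Python check can confirm that `ALFWORLD_DATA` points to a populated directory. This is a hedged sketch: the function name `alfworld_data_status` is hypothetical, and the check only tests that the directory exists and is non-empty. It makes no assumption about which files `alfworld-download` creates.

```python
import os
from pathlib import Path


def alfworld_data_status(env=None):
    """Classify the ALFWORLD_DATA setting.

    Returns one of:
      "unset"   - the variable is not defined or is empty
      "missing" - the variable is set but the path is not a directory
      "empty"   - the directory exists but contains nothing
      "ready"   - the directory exists and has at least one entry
    """
    env = os.environ if env is None else env
    raw = env.get("ALFWORLD_DATA")
    if not raw:
        return "unset"
    path = Path(raw)
    if not path.is_dir():
        return "missing"
    if not any(path.iterdir()):
        return "empty"
    return "ready"


if __name__ == "__main__":
    print("ALFWORLD_DATA status:", alfworld_data_status())
```

A result of `"empty"` or `"missing"` suggests that step 2 or step 3 has not completed in the current shell.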

<details>
<summary>Guidelines for preparing WebShop environment</summary>

1. Install Python 3.8.13

2. Install Java

3. Download the source code: `git clone https://github.com/princeton-nlp/webshop.git webshop`

4. Create a virtual environment: `conda create -n webshop python=3.8.13` and `conda activate webshop`

5. Install requirements into the `webshop` virtual environment via the `setup.sh` script: `./setup.sh [-d small|all]`

Now you can continue with the following steps.
</details>
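
The WebShop prerequisites listed above (Python 3.8.13 and Java) can be checked from inside the activated `webshop` environment. The sketch below is illustrative only: `webshop_prereqs` is a hypothetical helper, and it checks just the interpreter version and whether a `java` executable is on `PATH`, not whether `setup.sh` succeeded.

```python
import shutil
import sys

# Interpreter version named in the WebShop guidelines.
EXPECTED_PYTHON = (3, 8, 13)


def webshop_prereqs(python_version=None, which=shutil.which):
    """Report whether the interpreter version and Java look as expected.

    `python_version` defaults to the running interpreter; `which` defaults
    to shutil.which. Both can be overridden to try the function out.
    """
    if python_version is None:
        version = tuple(sys.version_info[:3])
    else:
        version = tuple(python_version)
    return {
        "python_3_8_13": version == EXPECTED_PYTHON,
        "java_on_path": which("java") is not None,
    }


if __name__ == "__main__":
    for name, ok in webshop_prereqs().items():
        print(name, "OK" if ok else "NOT OK")
```

Run it with the `webshop` conda environment active, since the version check applies to that environment's interpreter.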
<br>

For more details, refer to the original repositories:
- For ALFWorld, refer to the [ALFWorld](https://github.com/alfworld/alfworld) repository.
- For WebShop, refer to the [WebShop](https://github.com/princeton-nlp/WebShop) repository.
