pan-x-c commented Jul 18, 2025

Description

This PR introduces a series of Step-wise Workflows to support step-wise reward calculations for tasks. Here are the main changes:

  • Add Step-wise Workflow Base Classes: Introduces StepWiseRewardWorkflow and RewardPropagationWorkflow as the base classes for all step-wise reward workflows; they define the basic workflow structure and the reward calculation methods. Task execution (the agent application) is completely decoupled from the framework, so users can write applications directly against the OpenAI API at low migration cost.
  • Enhance Experience Structure: The Experience structure now records the current execution step, which makes grouping during training easier.
  • WorkflowRunner Refactoring: The WorkflowRunner no longer writes the Experiences produced by a Workflow directly into the Buffer. Instead, it returns them to the Explorer, which aggregates and groups them before writing them in a single pass, enabling finer-grained management.
  • Support AddStrategy: The Explorer can pre-process the collected experiences before writing them into the experience buffer.
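
To make the ideas in this list concrete, here is a minimal Python sketch of the general pattern: per-step records, a final reward propagated back to every step, a pre-processing hook applied before storage, and grouping by step index. It is not the code in this PR. `StepExperience`, `propagate_final_reward`, `drop_unrewarded`, and `group_by_step` are hypothetical names invented for illustration, and the real Experience and AddStrategy interfaces may look quite different.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class StepExperience:
    """Hypothetical stand-in for one recorded step (not Trinity's Experience class)."""

    run_id: int                     # which rollout of the task produced this step
    step: int                       # index of the step within the rollout
    response: str                   # model output at this step
    reward: Optional[float] = None  # filled in once a reward is known


def propagate_final_reward(
    steps: List[StepExperience], final_reward: float
) -> List[StepExperience]:
    """Reward-propagation style: every step inherits the rollout's final reward."""
    for exp in steps:
        exp.reward = final_reward
    return steps


def drop_unrewarded(experiences: List[StepExperience]) -> List[StepExperience]:
    """Toy pre-processing hook: keep only experiences that carry a reward."""
    return [exp for exp in experiences if exp.reward is not None]


def group_by_step(
    experiences: List[StepExperience],
) -> Dict[int, List[StepExperience]]:
    """Group experiences from different rollouts by their step index."""
    groups: Dict[int, List[StepExperience]] = defaultdict(list)
    for exp in experiences:
        groups[exp.step].append(exp)
    return dict(groups)


# Two rollouts of the same task: one with two steps, one with a single step.
run_a = propagate_final_reward(
    [StepExperience(0, 0, "think"), StepExperience(0, 1, "answer")], 1.0
)
run_b = propagate_final_reward([StepExperience(1, 0, "guess")], 0.0)

# Pre-process first, then group, mirroring "aggregate, then write once".
grouped = group_by_step(drop_unrewarded(run_a + run_b))
```

In this sketch the step index is the only grouping key; a per-step reward variant would assign `reward` inside the loop that produces each step instead of after the rollout ends.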

Checklist

Please check the following items before code is ready to be reviewed.

  • Code has passed all tests
  • Docstrings have been added/updated in Google Style
  • Documentation has been updated
  • Code is ready for review

pan-x-c changed the title from "Add Step-wise Workflow" to "[WIP] Add Step-wise Workflow" on Jul 18, 2025

pan-x-c commented Jul 21, 2025

/run-unittest

@github-actions

Summary

| Tests 📝 | Passed ✅ | Failed ❌ | Skipped ⏭️ | Other ❓ | Flaky 🍂 | Duration ⏱️ |
| --- | --- | --- | --- | --- | --- | --- |
| 63 | 55 | 8 | 0 | 0 | 0 | 1.0s |

Failed Tests

| Failed Tests ❌ | Fail Message |
| --- | --- |
| ❌ tests/trainer/trainer_test.py::TestTrainerCountdown::test_trainer | The test failed in the call phase due to an assertion error |
| ❌ tests/trainer/trainer_test.py::TestStepAheadAsyncRL::test_trainer | The test failed in the call phase due to an assertion error |
| ❌ tests/trainer/trainer_test.py::TestTrainerGSM8K::test_trainer | The test failed in the call phase due to an assertion error |
| ❌ tests/trainer/trainer_test.py::TestTrainerSFTWarmupGSM8K::test_trainer | The test failed in the call phase due to an assertion error |
| ❌ tests/trainer/trainer_test.py::TestTrainerDPO::test_trainer | The test failed in the call phase due to an assertion error |
| ❌ tests/trainer/trainer_test.py::TestFullyAsyncMode::test_fully_async_mode_0_queue | The test failed in the call phase |
| ❌ tests/trainer/trainer_test.py::TestFullyAsyncMode::test_fully_async_mode_1_priority_queue | The test failed in the call phase |
| ❌ tests/utils/plugin_test.py::TestPluginLoader::test_load_plugins | The test failed in the call phase |

Tests

Test Name Status Flaky Duration
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_dpo_policy_loss 1ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_mix_policy_loss 1ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_opmd_policy_loss 1ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_ppo_policy_loss 1ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_sft_policy_loss 1ms
tests/buffer/file_test.py::TestFileBuffer::test_file_buffer 3ms
tests/buffer/file_test.py::TestFileBuffer::test_file_reader 1ms
tests/buffer/file_test.py::TestFileBuffer::test_file_writer 2ms
tests/buffer/queue_test.py::TestQueueBuffer::test_priority_queue_buffer_reuse 6ms
tests/buffer/queue_test.py::TestQueueBuffer::test_priority_queue_capacity 2ms
tests/buffer/queue_test.py::TestQueueBuffer::test_queue_buffer_0_queue 5ms
tests/buffer/queue_test.py::TestQueueBuffer::test_queue_buffer_1_priority_queue 4ms
tests/buffer/queue_test.py::TestQueueBuffer::test_queue_buffer_capacity 4ms
tests/buffer/sql_test.py::TestSQLBuffer::test_create_sql_buffer 5ms
tests/common/config_test.py::TestConfig::test_all_examples_are_valid 1ms
tests/common/config_test.py::TestConfig::test_continue_from_checkpoint_is_valid 1ms
tests/common/config_test.py::TestConfig::test_load_default_config 4ms
tests/common/experience_test.py::TestEID::test_eid_properties 1ms
tests/common/experience_test.py::TestExperience::test_action_mask_and_logprobs_type 1ms
tests/common/experience_test.py::TestExperience::test_assertions 1ms
tests/common/experience_test.py::TestExperience::test_dpo_experience 1ms
tests/common/experience_test.py::TestExperience::test_gather 1ms
tests/common/experience_test.py::TestExperience::test_multi_turn_experience 1ms
tests/common/experience_test.py::TestExperience::test_serialize_deserialize 1ms
tests/common/experience_test.py::TestExperience::test_single_turn_experience 1ms
tests/common/experience_test.py::TestExperience::test_to_dict 1ms
tests/common/experience_test.py::TestExperienceConversion::test_batch_conversion 1ms
tests/common/experience_test.py::TestExperienceConversion::test_dpo_experience_batch_conversion 1ms
tests/common/experience_test.py::TestExperienceConversion::test_experience_model_experience_conversion 1ms
tests/common/experience_test.py::TestExperienceConversion::test_multiturn_experience_batch_converstion 1ms
tests/common/vllm_test.py::ModelWrapperTest_0::test_generate 53ms
tests/common/vllm_test.py::ModelWrapperTest_1::test_generate 54ms
tests/common/vllm_test.py::ModelWrapperTest_2::test_generate 54ms
tests/common/vllm_test.py::ModelWrapperTest_3::test_generate 42ms
tests/common/vllm_test.py::ModelWrapperTest_4::test_generate 55ms
tests/common/vllm_test.py::TestAPIServer::test_api 33ms
tests/common/vllm_test.py::TestTokenizer::test_assistant_token_mask 1ms
tests/explorer/explorer_test.py::BaseExplorerCase::test_explorer 1ms
tests/explorer/explorer_test.py::TestExplorerCountdownEval::test_explorer 90ms
tests/explorer/explorer_test.py::TestExplorerCountdownNoEval::test_explorer 110ms
tests/explorer/scheduler_test.py::SchedulerTest::test_concurrent_operations 5ms
tests/explorer/scheduler_test.py::SchedulerTest::test_get_results 20ms
tests/explorer/scheduler_test.py::SchedulerTest::test_scheduler_all_methods 15ms
tests/explorer/scheduler_test.py::SchedulerTest::test_scheduler_restart_after_stop 9ms
tests/explorer/scheduler_test.py::SchedulerTest::test_split_tasks 8ms
tests/explorer/scheduler_test.py::SchedulerTest::test_wait_all 8ms
tests/explorer/scheduler_test.py::SchedulerTest::test_wait_all_timeout_with_multi_batch 14ms
tests/explorer/workflow_test.py::WorkflowTest::test_gsm8k_workflow 1ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_boxed_workflow 1ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_complex_workflow 1ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_fraction_workflow 1ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_workflow 1ms
tests/explorer/workflow_test.py::WorkflowTest::test_rm_gallery_workflow 1ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_resettable 1ms
tests/trainer/trainer_test.py::BaseTrainerCase::test_trainer 1ms
tests/trainer/trainer_test.py::TestTrainerCountdown::test_trainer 45ms
tests/trainer/trainer_test.py::TestStepAheadAsyncRL::test_trainer 43ms
tests/trainer/trainer_test.py::TestTrainerGSM8K::test_trainer 40ms
tests/trainer/trainer_test.py::TestTrainerSFTWarmupGSM8K::test_trainer 39ms
tests/trainer/trainer_test.py::TestTrainerDPO::test_trainer 26ms
tests/trainer/trainer_test.py::TestFullyAsyncMode::test_fully_async_mode_0_queue 97ms
tests/trainer/trainer_test.py::TestFullyAsyncMode::test_fully_async_mode_1_priority_queue 91ms
tests/utils/plugin_test.py::TestPluginLoader::test_load_plugins 1ms

Github Test Reporter by CTRF 💚


pan-x-c commented Jul 21, 2025

/run-unittest

pan-x-c changed the title from "[WIP] Add Step-wise Workflow" to "Add Step-wise Workflow" on Jul 21, 2025
@github-actions

Summary

| Tests 📝 | Passed ✅ | Failed ❌ | Skipped ⏭️ | Other ❓ | Flaky 🍂 | Duration ⏱️ |
| --- | --- | --- | --- | --- | --- | --- |
| 64 | 63 | 1 | 0 | 0 | 0 | 1.4s |

Failed Tests

| Failed Tests ❌ | Fail Message |
| --- | --- |
| ❌ tests/trainer/trainer_test.py::TestTrainerDPO::test_trainer | The test failed in the call phase due to an assertion error |

Tests

Test Name Status Flaky Duration
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_dpo_policy_loss 1ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_mix_policy_loss 1ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_opmd_policy_loss 1ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_ppo_policy_loss 1ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_sft_policy_loss 1ms
tests/buffer/file_test.py::TestFileBuffer::test_file_buffer 3ms
tests/buffer/file_test.py::TestFileBuffer::test_file_reader 1ms
tests/buffer/file_test.py::TestFileBuffer::test_file_writer 2ms
tests/buffer/queue_test.py::TestQueueBuffer::test_priority_queue_buffer_reuse 6ms
tests/buffer/queue_test.py::TestQueueBuffer::test_priority_queue_capacity 2ms
tests/buffer/queue_test.py::TestQueueBuffer::test_queue_buffer_0_queue 5ms
tests/buffer/queue_test.py::TestQueueBuffer::test_queue_buffer_1_priority_queue 4ms
tests/buffer/queue_test.py::TestQueueBuffer::test_queue_buffer_capacity 4ms
tests/buffer/sql_test.py::TestSQLBuffer::test_create_sql_buffer 4ms
tests/common/config_test.py::TestConfig::test_all_examples_are_valid 1ms
tests/common/config_test.py::TestConfig::test_continue_from_checkpoint_is_valid 1ms
tests/common/config_test.py::TestConfig::test_load_default_config 4ms
tests/common/experience_test.py::TestEID::test_eid_properties 1ms
tests/common/experience_test.py::TestExperience::test_action_mask_and_logprobs_type 1ms
tests/common/experience_test.py::TestExperience::test_assertions 1ms
tests/common/experience_test.py::TestExperience::test_dpo_experience 1ms
tests/common/experience_test.py::TestExperience::test_gather 1ms
tests/common/experience_test.py::TestExperience::test_multi_turn_experience 1ms
tests/common/experience_test.py::TestExperience::test_serialize_deserialize 1ms
tests/common/experience_test.py::TestExperience::test_single_turn_experience 1ms
tests/common/experience_test.py::TestExperience::test_to_dict 1ms
tests/common/experience_test.py::TestExperienceConversion::test_batch_conversion 1ms
tests/common/experience_test.py::TestExperienceConversion::test_dpo_experience_batch_conversion 1ms
tests/common/experience_test.py::TestExperienceConversion::test_experience_model_experience_conversion 1ms
tests/common/experience_test.py::TestExperienceConversion::test_multiturn_experience_batch_converstion 1ms
tests/common/vllm_test.py::ModelWrapperTest_0::test_generate 43ms
tests/common/vllm_test.py::ModelWrapperTest_1::test_generate 54ms
tests/common/vllm_test.py::ModelWrapperTest_2::test_generate 55ms
tests/common/vllm_test.py::ModelWrapperTest_3::test_generate 42ms
tests/common/vllm_test.py::ModelWrapperTest_4::test_generate 55ms
tests/common/vllm_test.py::TestAPIServer::test_api 32ms
tests/common/vllm_test.py::TestTokenizer::test_assistant_token_mask 1ms
tests/explorer/explorer_test.py::BaseExplorerCase::test_explorer 1ms
tests/explorer/explorer_test.py::TestExplorerCountdownEval::test_explorer 92ms
tests/explorer/explorer_test.py::TestExplorerCountdownNoEval::test_explorer 89ms
tests/explorer/explorer_test.py::TestExplorerWithAddStrategy::test_explorer 56ms
tests/explorer/scheduler_test.py::SchedulerTest::test_concurrent_operations 4ms
tests/explorer/scheduler_test.py::SchedulerTest::test_get_results 20ms
tests/explorer/scheduler_test.py::SchedulerTest::test_scheduler_all_methods 15ms
tests/explorer/scheduler_test.py::SchedulerTest::test_scheduler_restart_after_stop 9ms
tests/explorer/scheduler_test.py::SchedulerTest::test_split_tasks 8ms
tests/explorer/scheduler_test.py::SchedulerTest::test_wait_all 8ms
tests/explorer/scheduler_test.py::SchedulerTest::test_wait_all_timeout_with_multi_batch 14ms
tests/explorer/workflow_test.py::WorkflowTest::test_gsm8k_workflow 1ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_boxed_workflow 1ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_complex_workflow 1ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_fraction_workflow 1ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_workflow 1ms
tests/explorer/workflow_test.py::WorkflowTest::test_rm_gallery_workflow 1ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_resettable 1ms
tests/trainer/trainer_test.py::BaseTrainerCase::test_trainer 1ms
tests/trainer/trainer_test.py::TestTrainerCountdown::test_trainer 246ms
tests/trainer/trainer_test.py::TestStepAheadAsyncRL::test_trainer 97ms
tests/trainer/trainer_test.py::TestTrainerGSM8K::test_trainer 59ms
tests/trainer/trainer_test.py::TestTrainerSFTWarmupGSM8K::test_trainer 85ms
tests/trainer/trainer_test.py::TestTrainerDPO::test_trainer 32ms
tests/trainer/trainer_test.py::TestFullyAsyncMode::test_fully_async_mode_0_queue 103ms
tests/trainer/trainer_test.py::TestFullyAsyncMode::test_fully_async_mode_1_priority_queue 99ms
tests/utils/plugin_test.py::TestPluginLoader::test_load_plugins 5ms



pan-x-c commented Jul 22, 2025

/run-unittest

@github-actions

Summary

| Tests 📝 | Passed ✅ | Failed ❌ | Skipped ⏭️ | Other ❓ | Flaky 🍂 | Duration ⏱️ |
| --- | --- | --- | --- | --- | --- | --- |
| 64 | 64 | 0 | 0 | 0 | 0 | 1.4s |

Tests

Test Name Status Flaky Duration
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_dpo_policy_loss 1ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_mix_policy_loss 1ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_opmd_policy_loss 1ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_ppo_policy_loss 1ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_sft_policy_loss 1ms
tests/buffer/file_test.py::TestFileBuffer::test_file_buffer 3ms
tests/buffer/file_test.py::TestFileBuffer::test_file_reader 1ms
tests/buffer/file_test.py::TestFileBuffer::test_file_writer 1ms
tests/buffer/queue_test.py::TestQueueBuffer::test_priority_queue_buffer_reuse 6ms
tests/buffer/queue_test.py::TestQueueBuffer::test_priority_queue_capacity 2ms
tests/buffer/queue_test.py::TestQueueBuffer::test_queue_buffer_0_queue 4ms
tests/buffer/queue_test.py::TestQueueBuffer::test_queue_buffer_1_priority_queue 5ms
tests/buffer/queue_test.py::TestQueueBuffer::test_queue_buffer_capacity 4ms
tests/buffer/sql_test.py::TestSQLBuffer::test_create_sql_buffer 5ms
tests/common/config_test.py::TestConfig::test_all_examples_are_valid 1ms
tests/common/config_test.py::TestConfig::test_continue_from_checkpoint_is_valid 1ms
tests/common/config_test.py::TestConfig::test_load_default_config 4ms
tests/common/experience_test.py::TestEID::test_eid_properties 1ms
tests/common/experience_test.py::TestExperience::test_action_mask_and_logprobs_type 1ms
tests/common/experience_test.py::TestExperience::test_assertions 1ms
tests/common/experience_test.py::TestExperience::test_dpo_experience 1ms
tests/common/experience_test.py::TestExperience::test_gather 1ms
tests/common/experience_test.py::TestExperience::test_multi_turn_experience 1ms
tests/common/experience_test.py::TestExperience::test_serialize_deserialize 1ms
tests/common/experience_test.py::TestExperience::test_single_turn_experience 1ms
tests/common/experience_test.py::TestExperience::test_to_dict 1ms
tests/common/experience_test.py::TestExperienceConversion::test_batch_conversion 1ms
tests/common/experience_test.py::TestExperienceConversion::test_dpo_experience_batch_conversion 1ms
tests/common/experience_test.py::TestExperienceConversion::test_experience_model_experience_conversion 1ms
tests/common/experience_test.py::TestExperienceConversion::test_multiturn_experience_batch_converstion 1ms
tests/common/vllm_test.py::ModelWrapperTest_0::test_generate 44ms
tests/common/vllm_test.py::ModelWrapperTest_1::test_generate 53ms
tests/common/vllm_test.py::ModelWrapperTest_2::test_generate 55ms
tests/common/vllm_test.py::ModelWrapperTest_3::test_generate 42ms
tests/common/vllm_test.py::ModelWrapperTest_4::test_generate 54ms
tests/common/vllm_test.py::TestAPIServer::test_api 32ms
tests/common/vllm_test.py::TestTokenizer::test_assistant_token_mask 1ms
tests/explorer/explorer_test.py::BaseExplorerCase::test_explorer 1ms
tests/explorer/explorer_test.py::TestExplorerCountdownEval::test_explorer 99ms
tests/explorer/explorer_test.py::TestExplorerCountdownNoEval::test_explorer 105ms
tests/explorer/explorer_test.py::TestExplorerWithAddStrategy::test_explorer 56ms
tests/explorer/scheduler_test.py::SchedulerTest::test_concurrent_operations 5ms
tests/explorer/scheduler_test.py::SchedulerTest::test_get_results 20ms
tests/explorer/scheduler_test.py::SchedulerTest::test_scheduler_all_methods 15ms
tests/explorer/scheduler_test.py::SchedulerTest::test_scheduler_restart_after_stop 9ms
tests/explorer/scheduler_test.py::SchedulerTest::test_split_tasks 8ms
tests/explorer/scheduler_test.py::SchedulerTest::test_wait_all 8ms
tests/explorer/scheduler_test.py::SchedulerTest::test_wait_all_timeout_with_multi_batch 14ms
tests/explorer/workflow_test.py::WorkflowTest::test_gsm8k_workflow 1ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_boxed_workflow 1ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_complex_workflow 1ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_fraction_workflow 1ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_workflow 1ms
tests/explorer/workflow_test.py::WorkflowTest::test_rm_gallery_workflow 1ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_resettable 1ms
tests/trainer/trainer_test.py::BaseTrainerCase::test_trainer 1ms
tests/trainer/trainer_test.py::TestTrainerCountdown::test_trainer 249ms
tests/trainer/trainer_test.py::TestStepAheadAsyncRL::test_trainer 94ms
tests/trainer/trainer_test.py::TestTrainerGSM8K::test_trainer 60ms
tests/trainer/trainer_test.py::TestTrainerSFTWarmupGSM8K::test_trainer 101ms
tests/trainer/trainer_test.py::TestTrainerDPO::test_trainer 44ms
tests/trainer/trainer_test.py::TestFullyAsyncMode::test_fully_async_mode_0_queue 97ms
tests/trainer/trainer_test.py::TestFullyAsyncMode::test_fully_async_mode_1_priority_queue 93ms
tests/utils/plugin_test.py::TestPluginLoader::test_load_plugins 5ms


pan-x-c requested a review from Copilot on July 22, 2025, 03:21

Copilot AI left a comment

Pull Request Overview

This PR introduces a comprehensive step-wise workflow system for fine-grained experience management and reward calculation. The main changes enhance the framework's capability to handle step-by-step task execution with improved experience tracking and grouping functionality.

  • Introduces step-wise workflow base classes that decouple task execution from the framework and enable low-cost migration from OpenAI API usage
  • Restructures the experience tracking system with a new EID (Experience ID) mechanism for better grouping and identification
  • Refactors the workflow runner and scheduler to support optional experience collection and pre-processing through add strategies
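
As a rough illustration of how a structured identifier can support grouping, the sketch below models an ID with task, run, and step components and groups IDs by rollout. This is an assumption-laden example, not the EID class from trinity/common/experience.py: `ExperienceId`, its fields, `run_key`, and `group_by_run` are hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import Dict, List, Tuple


@dataclass(frozen=True, order=True)
class ExperienceId:
    """Hypothetical identifier; the fields of the real EID class may differ."""

    task: int  # index of the task within a batch
    run: int   # index of the rollout for that task
    step: int  # index of the step within that rollout

    @property
    def run_key(self) -> Tuple[int, int]:
        """Key shared by all steps that belong to the same rollout."""
        return (self.task, self.run)


def group_by_run(
    ids: List[ExperienceId],
) -> Dict[Tuple[int, int], List[ExperienceId]]:
    """Group IDs by rollout; steps inside each group come out in step order."""
    # order=True sorts by (task, run, step), so equal run keys are adjacent,
    # which is what itertools.groupby requires.
    ordered = sorted(ids)
    return {
        key: list(members)
        for key, members in groupby(ordered, key=lambda eid: eid.run_key)
    }


ids = [ExperienceId(0, 1, 0), ExperienceId(0, 0, 1), ExperienceId(0, 0, 0)]
runs = group_by_run(ids)
```

Making the dataclass frozen keeps the ID hashable, so it can also serve directly as a dictionary key when experiences are looked up by identifier.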

Reviewed Changes

Copilot reviewed 35 out of 35 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| trinity/common/experience.py | Major refactoring with new EID class and enhanced Experience structure |
| trinity/common/workflows/step_wise_workflow.py | New step-wise workflow base classes for step-by-step reward calculation |
| trinity/explorer/workflow_runner.py | Modified to return experiences and support configurable experience collection |
| trinity/explorer/scheduler.py | Updated to handle experience collection and return experiences alongside statuses |
| trinity/explorer/explorer.py | Enhanced with add strategy support and experience count tracking |
| trinity/algorithm/add_strategy/ | New add strategy system for pre-processing experiences before buffer storage |

Comments suppressed due to low confidence (1)

trinity/common/experience.py:2

  • The module docstring describes "Workflow Runner Module" but this is the experience.py module. The docstring should be corrected to describe the experience module.
"""Experience Class."""


pan-x-c commented Jul 22, 2025

/run-unittest

@github-actions

Summary

| Tests 📝 | Passed ✅ | Failed ❌ | Skipped ⏭️ | Other ❓ | Flaky 🍂 | Duration ⏱️ |
| --- | --- | --- | --- | --- | --- | --- |
| 64 | 64 | 0 | 0 | 0 | 0 | 1.4s |

Tests

Test Name Status Flaky Duration
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_dpo_policy_loss 1ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_mix_policy_loss 1ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_opmd_policy_loss 1ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_ppo_policy_loss 1ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_sft_policy_loss 1ms
tests/buffer/file_test.py::TestFileBuffer::test_file_buffer 3ms
tests/buffer/file_test.py::TestFileBuffer::test_file_reader 1ms
tests/buffer/file_test.py::TestFileBuffer::test_file_writer 1ms
tests/buffer/queue_test.py::TestQueueBuffer::test_priority_queue_buffer_reuse 6ms
tests/buffer/queue_test.py::TestQueueBuffer::test_priority_queue_capacity 2ms
tests/buffer/queue_test.py::TestQueueBuffer::test_queue_buffer_0_queue 4ms
tests/buffer/queue_test.py::TestQueueBuffer::test_queue_buffer_1_priority_queue 4ms
tests/buffer/queue_test.py::TestQueueBuffer::test_queue_buffer_capacity 4ms
tests/buffer/sql_test.py::TestSQLBuffer::test_create_sql_buffer 4ms
tests/common/config_test.py::TestConfig::test_all_examples_are_valid 1ms
tests/common/config_test.py::TestConfig::test_continue_from_checkpoint_is_valid 1ms
tests/common/config_test.py::TestConfig::test_load_default_config 5ms
tests/common/experience_test.py::TestEID::test_eid_properties 1ms
tests/common/experience_test.py::TestExperience::test_action_mask_and_logprobs_type 1ms
tests/common/experience_test.py::TestExperience::test_assertions 1ms
tests/common/experience_test.py::TestExperience::test_dpo_experience 1ms
tests/common/experience_test.py::TestExperience::test_gather 1ms
tests/common/experience_test.py::TestExperience::test_multi_turn_experience 1ms
tests/common/experience_test.py::TestExperience::test_serialize_deserialize 1ms
tests/common/experience_test.py::TestExperience::test_single_turn_experience 1ms
tests/common/experience_test.py::TestExperience::test_to_dict 1ms
tests/common/experience_test.py::TestExperienceConversion::test_batch_conversion 1ms
tests/common/experience_test.py::TestExperienceConversion::test_dpo_experience_batch_conversion 1ms
tests/common/experience_test.py::TestExperienceConversion::test_experience_model_experience_conversion 1ms
tests/common/experience_test.py::TestExperienceConversion::test_multiturn_experience_batch_converstion 1ms
tests/common/vllm_test.py::ModelWrapperTest_0::test_generate 45ms
tests/common/vllm_test.py::ModelWrapperTest_1::test_generate 54ms
tests/common/vllm_test.py::ModelWrapperTest_2::test_generate 55ms
tests/common/vllm_test.py::ModelWrapperTest_3::test_generate 41ms
tests/common/vllm_test.py::ModelWrapperTest_4::test_generate 54ms
tests/common/vllm_test.py::TestAPIServer::test_api 31ms
tests/common/vllm_test.py::TestTokenizer::test_assistant_token_mask 1ms
tests/explorer/explorer_test.py::BaseExplorerCase::test_explorer 1ms
tests/explorer/explorer_test.py::TestExplorerCountdownEval::test_explorer 88ms
tests/explorer/explorer_test.py::TestExplorerCountdownNoEval::test_explorer 98ms
tests/explorer/explorer_test.py::TestExplorerWithAddStrategy::test_explorer 63ms
tests/explorer/scheduler_test.py::SchedulerTest::test_concurrent_operations 4ms
tests/explorer/scheduler_test.py::SchedulerTest::test_get_results 20ms
tests/explorer/scheduler_test.py::SchedulerTest::test_scheduler_all_methods 15ms
tests/explorer/scheduler_test.py::SchedulerTest::test_scheduler_restart_after_stop 9ms
tests/explorer/scheduler_test.py::SchedulerTest::test_split_tasks 8ms
tests/explorer/scheduler_test.py::SchedulerTest::test_wait_all 8ms
tests/explorer/scheduler_test.py::SchedulerTest::test_wait_all_timeout_with_multi_batch 13ms
tests/explorer/workflow_test.py::WorkflowTest::test_gsm8k_workflow 1ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_boxed_workflow 1ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_complex_workflow 1ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_fraction_workflow 1ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_workflow 1ms
tests/explorer/workflow_test.py::WorkflowTest::test_rm_gallery_workflow 1ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_resettable 1ms
tests/trainer/trainer_test.py::BaseTrainerCase::test_trainer 1ms
tests/trainer/trainer_test.py::TestTrainerCountdown::test_trainer 230ms
tests/trainer/trainer_test.py::TestStepAheadAsyncRL::test_trainer 98ms
tests/trainer/trainer_test.py::TestTrainerGSM8K::test_trainer 63ms
tests/trainer/trainer_test.py::TestTrainerSFTWarmupGSM8K::test_trainer 129ms
tests/trainer/trainer_test.py::TestTrainerDPO::test_trainer 43ms
tests/trainer/trainer_test.py::TestFullyAsyncMode::test_fully_async_mode_0_queue 95ms
tests/trainer/trainer_test.py::TestFullyAsyncMode::test_fully_async_mode_1_priority_queue 101ms
tests/utils/plugin_test.py::TestPluginLoader::test_load_plugins 5ms



pan-x-c commented Jul 22, 2025

/unittest-module-common

@github-actions

Summary

| Tests 📝 | Passed ✅ | Failed ❌ | Skipped ⏭️ | Other ❓ | Flaky 🍂 | Duration ⏱️ |
| --- | --- | --- | --- | --- | --- | --- |
| 23 | 23 | 0 | 0 | 0 | 0 | 295ms |

Tests

Test Name Status Flaky Duration
tests/common/config_test.py::TestConfig::test_all_examples_are_valid 2ms
tests/common/config_test.py::TestConfig::test_continue_from_checkpoint_is_valid 1ms
tests/common/config_test.py::TestConfig::test_load_default_config 5ms
tests/common/experience_test.py::TestEID::test_eid_properties 1ms
tests/common/experience_test.py::TestExperience::test_action_mask_and_logprobs_type 1ms
tests/common/experience_test.py::TestExperience::test_assertions 1ms
tests/common/experience_test.py::TestExperience::test_dpo_experience 1ms
tests/common/experience_test.py::TestExperience::test_gather 1ms
tests/common/experience_test.py::TestExperience::test_multi_turn_experience 1ms
tests/common/experience_test.py::TestExperience::test_serialize_deserialize 1ms
tests/common/experience_test.py::TestExperience::test_single_turn_experience 1ms
tests/common/experience_test.py::TestExperience::test_to_dict 1ms
tests/common/experience_test.py::TestExperienceConversion::test_batch_conversion 1ms
tests/common/experience_test.py::TestExperienceConversion::test_dpo_experience_batch_conversion 1ms
tests/common/experience_test.py::TestExperienceConversion::test_experience_model_experience_conversion 1ms
tests/common/experience_test.py::TestExperienceConversion::test_multiturn_experience_batch_converstion 1ms
tests/common/vllm_test.py::ModelWrapperTest_0::test_generate 46ms
tests/common/vllm_test.py::ModelWrapperTest_1::test_generate 53ms
tests/common/vllm_test.py::ModelWrapperTest_2::test_generate 55ms
tests/common/vllm_test.py::ModelWrapperTest_3::test_generate 41ms
tests/common/vllm_test.py::ModelWrapperTest_4::test_generate 54ms
tests/common/vllm_test.py::TestAPIServer::test_api 33ms
tests/common/vllm_test.py::TestTokenizer::test_assistant_token_mask 1ms



pan-x-c commented Jul 22, 2025

/unittest-diff

hiyuchang merged commit 8a1d316 into modelscope:main on Jul 22, 2025
2 checks passed