Fix dynamic timeout #409

pan-x-c · 2025-11-25T10:05:28Z

Description

The current dynamic timeout mechanism includes evaluation tasks in the timeout calculation. This skews the average running time measured for normal rollout tasks and causes rollout tasks to time out.
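To illustrate the effect with made-up numbers, here is a minimal sketch. It assumes, hypothetically, that the timeout is a fixed multiple of the mean running time of completed tasks; the actual formula is not shown in this thread, and `timeout_from`, `TIMEOUT_RATIO`, and all durations are illustrative only.

```python
from statistics import mean

# Hypothetical multiplier applied to the mean running time (assumption).
TIMEOUT_RATIO = 2.0


def timeout_from(durations):
    """Derive a timeout as a fixed multiple of the mean duration."""
    return TIMEOUT_RATIO * mean(durations)


# Made-up running times in seconds.
rollout_durations = [30.0, 34.0, 28.0, 28.0]
# Assumed to differ a lot from rollout tasks; values are invented.
eval_durations = [2.0, 2.0, 2.0, 2.0]

rollout_only = timeout_from(rollout_durations)
mixed = timeout_from(rollout_durations + eval_durations)

print(f"timeout from rollout tasks only: {rollout_only:.1f}s")
print(f"timeout with eval tasks mixed in: {mixed:.1f}s")
print(f"slowest rollout task: {max(rollout_durations):.1f}s")
```

With these invented numbers the mixed timeout falls below the duration of the slowest rollout task, so that task would be cut off. The direction and size of the skew depend on how evaluation durations compare with rollout durations.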

Checklist

Please check the following items before code is ready to be reviewed.

Code has passed all tests
Docstrings have been added/updated in Google Style
Documentation has been updated
Code is ready for review

gemini-code-assist · 2025-11-25T10:05:47Z

Summary of Changes

Hello @pan-x-c, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical bug in the dynamic timeout mechanism by ensuring that evaluation tasks are no longer factored into the calculation of task running times and completion counts. This change prevents normal rollout tasks from timing out prematurely due to the inclusion of potentially longer-running evaluation tasks, thereby improving the reliability and accuracy of the dynamic timeout system.

Highlights

Dynamic Timeout Correction: The core issue where evaluation tasks were incorrectly included in the dynamic timeout calculation, leading to premature timeouts for normal rollout tasks, has been resolved.
Exclusion of Evaluation Tasks: The scheduler's total_running_time and total_completed_tasks metrics, which are crucial for dynamic timeout, now explicitly exclude evaluation tasks.
Enhanced Test Coverage: New and modified test cases have been added to scheduler_test.py and trainer_test.py to validate that evaluation tasks do not influence dynamic timeout metrics and that evaluation metrics are correctly processed.
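The exclusion described in the highlights could look roughly like the sketch below. Only the field names `total_running_time` and `total_completed_tasks` come from the summary above; the class `TimeoutStats`, its methods, and the `is_eval` flag are hypothetical stand-ins, not the scheduler's actual code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TimeoutStats:
    """Running-time statistics for a dynamic timeout (illustrative only)."""

    total_running_time: float = 0.0
    total_completed_tasks: int = 0

    def record(self, running_time: float, is_eval: bool) -> None:
        """Record a finished task, skipping evaluation tasks entirely."""
        if is_eval:
            # Evaluation tasks must not influence the rollout statistics.
            return
        self.total_running_time += running_time
        self.total_completed_tasks += 1

    def average_running_time(self) -> Optional[float]:
        """Return the mean rollout running time, or None if none recorded."""
        if self.total_completed_tasks == 0:
            return None
        return self.total_running_time / self.total_completed_tasks


stats = TimeoutStats()
stats.record(30.0, is_eval=False)
stats.record(34.0, is_eval=False)
stats.record(2.0, is_eval=True)  # ignored by the statistics
print(stats.total_completed_tasks, stats.average_running_time())
```

Returning `None` when no rollout task has completed leaves it to the caller to fall back to a default timeout, which is one possible design choice among several.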

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request effectively addresses the issue of evaluation tasks interfering with the dynamic timeout calculation. The fix is implemented cleanly in the scheduler by checking if a task is for evaluation before including it in the timeout statistics. The accompanying changes to the test suite are thorough, with a new targeted unit test confirming the fix and an integration test updated to cover the scenario with evaluation tasks. The code is well-structured and the changes are correct.

pan-x-c · 2025-11-25T10:17:49Z

/unittest-module-explorer

github-actions · 2025-11-25T10:33:55Z

Summary

Tests 📝	Passed ✅	Failed ❌	Skipped ⏭️	Other ❓	Flaky 🍂	Duration ⏱️
45	44	0	1	0	0	14m 23s

Skipped

Tests	Status
tests/explorer/workflow_test.py::TestAgentScopeWorkflowAdapter::test_adapter	skipped ⏭️

Tests

Test Name	Status	Duration
tests/explorer/explorer_test.py::TestExplorerCountdownEval::test_explorer	✅	1m 26s
tests/explorer/explorer_test.py::TestExplorerGSM8KRULERNoEval::test_explorer	✅	1m 40s
tests/explorer/explorer_test.py::TestExplorerGSM8k::test_explorer	✅	3m 38s
tests/explorer/explorer_test.py::ServeTest::test_serve	✅	1m 21s
tests/explorer/scheduler_test.py::SchedulerTest::test_async_workflow	✅	12.8s
tests/explorer/scheduler_test.py::SchedulerTest::test_concurrent_operations	✅	12.3s
tests/explorer/scheduler_test.py::SchedulerTest::test_dynamic_timeout	✅	20.4s
tests/explorer/scheduler_test.py::SchedulerTest::test_get_results	✅	27.7s
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_non_repeatable_workflow_0	✅	12.8s
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_non_repeatable_workflow_1	✅	12.5s
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_repeatable_workflow_0	✅	12.5s
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_repeatable_workflow_1	✅	12.3s
tests/explorer/scheduler_test.py::SchedulerTest::test_multi_step_execution	✅	12.7s
tests/explorer/scheduler_test.py::SchedulerTest::test_non_repeatable_workflow	✅	12.3s
tests/explorer/scheduler_test.py::SchedulerTest::test_over_rollout_min_wait	✅	16.4s
tests/explorer/scheduler_test.py::SchedulerTest::test_scheduler_all_methods	✅	22.4s
tests/explorer/scheduler_test.py::SchedulerTest::test_scheduler_restart_after_stop	✅	23.9s
tests/explorer/scheduler_test.py::SchedulerTest::test_split_tasks	✅	15.3s
tests/explorer/scheduler_test.py::SchedulerTest::test_stepwise_experience_eid	✅	32.5s
tests/explorer/scheduler_test.py::SchedulerTest::test_wait_all	✅	15.4s
tests/explorer/scheduler_test.py::SchedulerTest::test_wait_all_timeout_with_multi_batch	✅	21.2s
tests/explorer/scheduler_test.py::TestRunnerStateCollection::test_runner_state_collection	✅	17.4s
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_reward_propagation_workflow_0	✅	1ms
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_reward_propagation_workflow_1	✅	602ms
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_step_wise_reward_workflow_0	✅	1ms
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_step_wise_reward_workflow_1	✅	1.0s
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_workflows_raise_error	✅	1ms
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_workflows_stop_at_max_env_steps	✅	1.0s
tests/explorer/workflow_test.py::WorkflowTest::test_gsm8k_workflow	✅	34ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_boxed_workflow	✅	24ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_complex_workflow	✅	174ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_eval_workflow	✅	3ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_fraction_workflow	✅	13ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_workflow	✅	8ms
tests/explorer/workflow_test.py::WorkflowTest::test_rm_gallery_workflow	✅	133ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_repeatable_0	✅	1ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_repeatable_1	✅	101ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_resettable_0	✅	1ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_resettable_1	✅	201ms
tests/explorer/workflow_test.py::MultiTurnWorkflowTest_0::test_multi_turn_workflow	✅	14.7s
tests/explorer/workflow_test.py::MultiTurnWorkflowTest_1::test_multi_turn_workflow	✅	14.7s
tests/explorer/workflow_test.py::TestWorkflowStateRecording::test_workflow_state_recording	✅	4.0s
tests/explorer/workflow_test.py::TestAgentScopeWorkflowAdapter::test_adapter	⏭️	1ms
tests/explorer/workflow_test.py::TestWorkflowRunner::test_workflow_runner	✅	295ms
tests/explorer/workflow_test.py::TestWorkflowRunner::test_workflow_runner_get_state	✅	8.1s

Github Test Reporter by CTRF 💚

pan-x-c · 2025-11-25T10:34:13Z

/unittest-module-trainer

github-actions · 2025-11-25T11:14:21Z

Summary

Tests 📝	Passed ✅	Failed ❌	Skipped ⏭️	Other ❓	Flaky 🍂	Duration ⏱️
21	19	0	2	0	0	38m 9s

Skipped

Tests	Status
tests/trainer/trainer_test.py::TestMultiModalGRPO::test_trainer	skipped ⏭️
tests/trainer/trainer_test.py::TestMultiModalSFT::test_trainer	skipped ⏭️

Tests

Test Name	Status	Duration
tests/trainer/trainer_test.py::TestTrainerCountdown_0_fsdp::test_trainer	✅	2m 41s
tests/trainer/trainer_test.py::TestTrainerCountdown_1_megatron::test_trainer	✅	3m 52s
tests/trainer/trainer_test.py::TestStepAheadAsyncRL::test_trainer	✅	1m 26s
tests/trainer/trainer_test.py::TestTrainerGSM8K_0_fsdp::test_trainer	✅	1m 21s
tests/trainer/trainer_test.py::TestTrainerGSM8K_1_fsdp2::test_trainer	✅	1m 28s
tests/trainer/trainer_test.py::TestTrainerGSM8K_2_fsdp::test_trainer	✅	1m 23s
tests/trainer/trainer_test.py::TestTrainerGSM8K_3_fsdp2::test_trainer	✅	1m 37s
tests/trainer/trainer_test.py::TestTrainerSFTWarmupGSM8K::test_trainer	✅	2m 32s
tests/trainer/trainer_test.py::TestTrainerDPO::test_trainer	✅	1m 4s
tests/trainer/trainer_test.py::TestTrainerSFT::test_trainer	✅	1m 1s
tests/trainer/trainer_test.py::TestTrainerToolsSFT::test_trainer_tools	✅	1m 1s
tests/trainer/trainer_test.py::TestFullyAsyncMode_0_fsdp::test_fully_async_mode	✅	1m 54s
tests/trainer/trainer_test.py::TestFullyAsyncMode_1_fsdp::test_fully_async_mode	✅	1m 50s
tests/trainer/trainer_test.py::TestFullyAsyncMode_2_megatron::test_fully_async_mode	✅	2m 21s
tests/trainer/trainer_test.py::TestTrainerCheckpointSave_0_fsdp::test_trainer	✅	2m 15s
tests/trainer/trainer_test.py::TestTrainerCheckpointSave_1_megatron::test_trainer	✅	3m 58s
tests/trainer/trainer_test.py::TestTrainerMIX::test_trainer	✅	1m 24s
tests/trainer/trainer_test.py::TestMultiModalGRPO::test_trainer	⏭️	810ms
tests/trainer/trainer_test.py::TestMultiModalSFT::test_trainer	⏭️	809ms
tests/trainer/trainer_test.py::TestTrainerLoRA::test_trainer	✅	3m 7s
tests/trainer/trainer_test.py::TestOverRollout::test_trainer	✅	1m 22s

fix dynamic timeout

e978a80

gemini-code-assist bot reviewed Nov 25, 2025

fix pre-commit

ce2b0d3

add more comments

5bef192

hiyuchang approved these changes Nov 26, 2025

hiyuchang merged commit 75f7d2d into modelscope:main Nov 26, 2025
1 check passed
