Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

enable force cluster upgrade #231

Merged

Conversation

wilsonwang371
Copy link
Collaborator

Why are these changes needed?

This is a patch to bug #155. I added reconcile logic so that in case of following fields updated, we will rebuild the cluster.

  1. head template container amount
  2. head template container image
  3. head template container resources
  4. worker template container amount
  5. worker template container image
  6. worker template container resources

Related issue number

#155

Checks

  • I've made sure the tests are passing.
  • Testing Strategy
    • Unit tests
    • Manual tests
    • This PR is not tested :(

@wilsonwang371 wilsonwang371 force-pushed the wilson/force-cluster-upgrade branch 2 times, most recently from f788973 to d326690 Compare April 19, 2022 06:17
@wilsonwang371
Copy link
Collaborator Author

@Jeffwan you can take a look at this and I believe we still need to use filter functions in the near future. I am thinking comparing different revisions on RayParams is the best solution since we are parsing and giving them as arguments to pods which needs a lot of efforts to compare if we want to do it in the reconcile logic.

@Jeffwan
Copy link
Member

Jeffwan commented Apr 20, 2022

BTW, What does the container amount mean? you mean the total number of the containers in one pod template?

}
updatedWorkerPods := false
for _, item := range workerPods.Items {
if r.needRemoveOldPod(worker.Template, item) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we actually has the worker group template. In that case, the template would be same. Do we need to compare pod with template one by one?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, the group template length would be 1. So anyway we will not have extra overhead here, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. the workingGroupTemplate would be 1. I mean the pods created from the template should be same. Do we want to compare each of them with the template or just randomly pick one? I don't have strong opinion on this because technically, the pods could be different if there're some manual modification.

@wilsonwang371
Copy link
Collaborator Author

BTW, What does the container amount mean? you mean the total number of the containers in one pod template?

yes

@wilsonwang371 wilsonwang371 force-pushed the wilson/force-cluster-upgrade branch 6 times, most recently from 28d6e97 to 433d693 Compare April 20, 2022 22:07
@wilsonwang371 wilsonwang371 changed the title [WIP]enable force cluster upgrade enable force cluster upgrade Apr 20, 2022
@Jeffwan
Copy link
Member

Jeffwan commented Apr 20, 2022

Overall looks good to me. Since this is a tricky pattern, let's wait for more feedbacks.

@wilsonwang371 wilsonwang371 changed the title enable force cluster upgrade [WIP]enable force cluster upgrade May 11, 2022
@DmitriGekhtman
Copy link
Collaborator

I've been meaning to take a look at this PR but haven't gotten the chance yet.
I think enabling in-place updates (and documenting what is and isn't possible to update) is very important.

@wilsonwang371 @Jeffwan how's this PR going? Can we merge an initial version across the finish line?

@wilsonwang371
Copy link
Collaborator Author

wilsonwang371 commented Jun 8, 2022

i will clean it up and make it ready soon

@DmitriGekhtman
Copy link
Collaborator

Thanks @wilsonwang371 !!

@wilsonwang371 wilsonwang371 force-pushed the wilson/force-cluster-upgrade branch 2 times, most recently from 8607fba to 19d9043 Compare June 9, 2022 18:06
@wilsonwang371 wilsonwang371 changed the title [WIP]enable force cluster upgrade enable force cluster upgrade Jun 9, 2022
Copy link
Member

@Jeffwan Jeffwan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wilsonwang371 Please check comments on the feature flag.

@wilsonwang371 wilsonwang371 force-pushed the wilson/force-cluster-upgrade branch from 19d9043 to 626bb33 Compare June 9, 2022 23:59
@Jeffwan
Copy link
Member

Jeffwan commented Jun 10, 2022

It's safe to merge especially protected by the feature flag. Let's merge it and give it a try.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants