Fix parallel vae by gty111 · Pull Request #281 · xdit-project/xDiT

gty111 · 2024-09-21T09:45:31Z

related to #271

xfuser/model_executor/models/transformers/transformer_flux.py

xfuser/model_executor/pipelines/pipeline_flux.py

xfuser/model_executor/pipelines/pipeline_pixart_sigma.py

xfuser/model_executor/pipelines/pipeline_stable_diffusion_3.py

gty111 · 2024-09-24T03:12:54Z

Related PR xdit-project/DistVAE#3

To account for data parallelism (DP), this PR adds gather_broadcast_latents and is_dp_last_group to xFuserPipelineBaseWrapper.

gather_broadcast_latents:

1. Create the DP last group.
2. Gather the latents from the DP last group and concatenate them along the batch dimension.
3. Broadcast the concatenated latents.

The default process group is then used to run the parallel VAE.
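The three steps above can be sketched with plain torch.distributed calls. This is only an illustration, not the code merged in this PR: the function dp_last_group_ranks, the dp_degree parameter, and the rank layout (contiguous ranks per DP replica, with the last rank of each replica holding the final latents) are assumptions made for the example, and every rank is assumed to hold a latents tensor of the per-replica shape.

```python
from typing import List


def dp_last_group_ranks(world_size: int, dp_degree: int) -> List[int]:
    """Return one rank per DP replica: the rank assumed to hold final latents.

    ASSUMPTION (not from the PR): ranks are contiguous per DP replica and the
    last rank of each replica owns that replica's final latents.
    """
    if dp_degree <= 0 or world_size % dp_degree != 0:
        raise ValueError("world_size must be a positive multiple of dp_degree")
    ranks_per_replica = world_size // dp_degree
    return [(i + 1) * ranks_per_replica - 1 for i in range(dp_degree)]


def gather_broadcast_latents(latents, dp_degree: int):
    """Hypothetical stand-in for the method named in the PR description.

    Requires an initialized default process group. torch is imported lazily
    so the rank helper above can be used without it.
    """
    import torch
    import torch.distributed as dist

    world_size = dist.get_world_size()
    rank = dist.get_rank()
    last_ranks = dp_last_group_ranks(world_size, dp_degree)

    # Step 1: create the "DP last group". Every rank must call new_group,
    # including ranks that are not members of the new group.
    last_group = dist.new_group(ranks=last_ranks)

    # Step 2: gather inside that group and concatenate along the batch dim.
    if rank in last_ranks:
        parts = [torch.empty_like(latents) for _ in last_ranks]
        dist.all_gather(parts, latents, group=last_group)
        full = torch.cat(parts, dim=0)
    else:
        # Other ranks only allocate a receive buffer of the combined shape.
        shape = (latents.shape[0] * dp_degree, *latents.shape[1:])
        full = torch.empty(shape, dtype=latents.dtype, device=latents.device)

    # Step 3: broadcast the combined latents over the default group so that
    # every rank can take part in the parallel VAE decode.
    dist.broadcast(full, src=last_ranks[0])
    return full
```

With 8 ranks and a DP degree of 2, dp_last_group_ranks(8, 2) selects ranks 3 and 7 under the assumed layout. Every rank ends up holding the full concatenated batch, which is the extra communication raised in the review discussion.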

When parallel VAE is used, only one process needs to post-process the image after the VAE. is_dp_last_group therefore identifies the process that has to do the work that follows the VAE.
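A minimal sketch of that selection logic follows. The function name needs_post_vae_work, the vae_output_rank default of 0, and the behaviour of the non-parallel branch (last rank of each DP replica, with contiguous ranks per replica) are assumptions for illustration. The PR text only states that, with parallel VAE, a single process does the post-VAE work.

```python
def needs_post_vae_work(
    rank: int,
    world_size: int,
    dp_degree: int,
    use_parallel_vae: bool,
    vae_output_rank: int = 0,  # ASSUMPTION: which rank owns the decoded image
) -> bool:
    """Hypothetical illustration of the check is_dp_last_group performs."""
    if use_parallel_vae:
        # All latents were gathered and broadcast, so exactly one rank
        # post-processes the full decoded batch.
        return rank == vae_output_rank
    # ASSUMPTION: without parallel VAE, the last rank of each DP replica
    # post-processes its own replica's images.
    ranks_per_replica = world_size // dp_degree
    return rank % ranks_per_replica == ranks_per_replica - 1
```

Under these assumptions, with 8 ranks and a DP degree of 2, only rank 0 returns True when parallel VAE is on, while ranks 3 and 7 return True when it is off.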

@Eigensystem

xfuser/model_executor/pipelines/pipeline_hunyuandit.py

Eigensystem · 2024-09-24T10:23:32Z

I think it is not an elegant implementation, because you combine all the DP latents into one tensor instead of using different DP groups to process different latents. It will increase the communication load.

gty111 · 2024-09-26T03:54:46Z

> I think it is not an elegant implementation, because you combine all the DP latents into one tensor instead of using different DP groups to process different latents. It will increase the communication load.

It is indeed not the best implementation. Using different DP groups would require refactoring DistVAE. Since parallel VAE is optional, maybe we can first fix it in this PR and optimize it further in the future.

Eigensystem

LGTM.

gty111 added 4 commits September 21, 2024 17:40

Fix warning

d348523

Fix distvae in sd3

a51cb0f

Fix parallel vae in other models

43a0707

Suppress warning

989a0b7

feifeibear requested a review from Eigensystem September 23, 2024 01:43

gty111 changed the title ~~Fix parallel vae~~ Fix parallel vae and flux model Sep 23, 2024

Eigensystem requested changes Sep 23, 2024

gty111 force-pushed the distvae branch from 4b26edb to 23f80af Compare September 24, 2024 03:02

gty111 changed the title ~~Fix parallel vae and flux model~~ Fix parallel vae Sep 24, 2024

Eigensystem requested changes Sep 24, 2024

xfuser/model_executor/pipelines/pipeline_hunyuandit.py (outdated)

Fix DP parallel when using parallel vae

23f80af

Eigensystem approved these changes Sep 26, 2024

Eigensystem merged commit 48e3633 into xdit-project:main Sep 26, 2024

Fix return value

bbb2c44

Eigensystem merged 6 commits into xdit-project:main from gty111:distvae
