@lossyrob lossyrob commented May 17, 2022

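The in-place sort was being done on an ephemeral list, so the sorted result was discarded and the chunk was never actually sorted by partition. Because `itertools.groupby` only groups consecutive items that share a key, the unsorted input caused it to make many small groups as it streamed through the iterator, making ingests slow for large sets of items with a mix of partitions.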
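To illustrate the failure mode described above, here is a minimal sketch, not the loader's actual code. The item tuples, `partition_key`, `buggy_groups`, and `fixed_groups` are hypothetical names invented for this example; only `itertools.groupby`, `list.sort`, and `sorted` are real standard-library calls.

```python
from itertools import groupby

# Hypothetical items: (partition, item_id) pairs interleaved across partitions.
items = [("w1", "a"), ("w2", "b"), ("w1", "c"), ("w2", "d"), ("w1", "e")]


def partition_key(item):
    """Return the partition name used as the grouping key."""
    return item[0]


def group_sizes(iterable):
    """Group consecutive items by partition and report each group's size."""
    return [
        (key, len(list(group)))
        for key, group in groupby(iterable, key=partition_key)
    ]


def buggy_groups(chunk):
    # list(chunk) builds a temporary copy; .sort() sorts that copy in place
    # and returns None, so the sorted copy is thrown away immediately.
    list(chunk).sort(key=partition_key)
    # chunk itself is still unsorted, so groupby yields one group per run.
    return group_sizes(chunk)


def fixed_groups(chunk):
    # sorted() returns a new sorted list that is actually passed to groupby.
    return group_sizes(sorted(chunk, key=partition_key))


print(buggy_groups(items))
print(fixed_groups(items))
```

With interleaved input, the buggy variant produces one group per run of identical keys (five groups here), while the fixed variant produces one group per partition (two groups here). If each group triggers a separate load step, the number of steps grows with how interleaved the input is rather than with the number of partitions.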

Seeing a significant performance increase for large item ingests with many partitions: with weekly partitions on about 4 months of Sentinel 2, 50K item group ingests dropped from 1000s to 40s.

@lossyrob lossyrob marked this pull request as draft May 17, 2022 00:47
@lossyrob lossyrob changed the title from "Fix incorrect in-place sort" to "Loader optimization: Fix incorrect in-place sort of chunks by partition" May 17, 2022
@lossyrob lossyrob marked this pull request as ready for review May 17, 2022 02:59
@lossyrob lossyrob requested a review from bitner May 17, 2022 03:01
@bitner bitner merged commit 3baae42 into main May 20, 2022