fix(io): attempt to optimize loading/storing i/o #33
Merged
chrisdickinson merged 1 commit into main on Nov 17, 2023
Move the `if` check out of the main loop so we don't branch on each pass, and use bitwise operations to get the 8-byte chunk count and remainder. The resulting generated code is a little smaller (by roughly 2 KB) but runs at about the same speed. (This is unsurprising in retrospect: these are the sorts of changes an optimizing compiler might perform!)

I used hyperfine [1] and the extism CLI to run `count_vowels` 100 times against `/usr/share/dict/words`. This was useful primarily from a "getting my fingerprints on the problem space" perspective.

Additionally, fix clippy lints.

[1]: https://github.com/sharkdp/hyperfine
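For readers who want to see the shape of the change, here is a minimal sketch of the technique described above, not the code from this PR. The names `split_len` and `copy_chunked` are hypothetical. It assumes the I/O path moves data in 8-byte words and handles the leftover bytes once, after the loop, instead of checking for a short tail on every iteration.

```rust
/// Split a byte length into (whole 8-byte chunks, leftover bytes).
/// `len >> 3` is `len / 8` and `len & 7` is `len % 8` for unsigned values.
fn split_len(len: usize) -> (usize, usize) {
    (len >> 3, len & 7)
}

/// Copy `src` into `dst` one 8-byte word at a time, then copy the tail.
/// Hypothetical helper: the real PDK reads and writes through host calls.
fn copy_chunked(src: &[u8], dst: &mut [u8]) {
    assert_eq!(src.len(), dst.len());
    let (chunks, rem) = split_len(src.len());

    // Main loop: no "is this the last partial chunk?" branch in the body.
    for i in 0..chunks {
        let start = i << 3;
        let mut buf = [0u8; 8];
        buf.copy_from_slice(&src[start..start + 8]);
        let word = u64::from_le_bytes(buf);
        dst[start..start + 8].copy_from_slice(&word.to_le_bytes());
    }

    // Tail: at most 7 bytes, handled once outside the loop.
    let tail = chunks << 3;
    dst[tail..tail + rem].copy_from_slice(&src[tail..]);
}
```

For a 27-byte input, `split_len` yields three full chunks and a 3-byte remainder, so the loop runs three times and the tail copy moves the last three bytes.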
Force-pushed: 2b594a0 to 81582a9
zshipko (Contributor) approved these changes on Nov 17, 2023 and left a comment:
Thanks! I think regardless of the performance this is much nicer. I just started porting this over to the c-pdk and will probably do Go too.
Author:
Thanks! mtb0x1 also brought up a good point in Discord – we might consider adding
Contributor:
Good point - I will make an issue on extism/extism for that!
Move the `if` check out of the main loop so we don't branch on each pass and use bitwise operations to get the 8-byte chunk count & remainder. The resulting generated code is a little bit smaller (by 2k bytes or so) but runs about the same speed. (This is unsurprising in retrospect: these are the sorts of changes an optimizing compiler might perform!) I used hyperfine [1] and the extism CLI to test `count_vowels` 100 times against `/usr/share/dict/words`.

This was useful primarily from a "getting my fingerprints on the problem space" perspective, so we might not end up using these changes. I present them here because, well, they were fun to write 😄

Additionally, fix clippy lints.