Performance optimization for Align post processing block #2040
ev-mp
left a comment
Looking good
src/proc/align.cpp
Outdated
| byte* other_aligned_to_depth = const_cast<byte*>(aligned_frame.frame->get_frame_data()); | ||
| memset(other_aligned_to_depth, 0, depth_intrinsics.height * depth_intrinsics.width * aligned_bytes_per_pixel); | ||
| #ifdef __SSSE3__ | ||
| auto uid = other_frame->get_stream()->get_unique_id(); |
Is it used?
src/proc/align.cpp
Outdated
| auto data = (int16_t*)depth_frame->get_frame_data(); | ||
| #ifdef __SSSE3__ | ||
| auto uid = other_frame->get_stream()->get_unique_id(); |
Also here
| } | ||
| } | ||
| } No newline at end of file |
Indentation + add empty line at the end of file
src/proc/align.cpp
Outdated
| { | ||
| } | ||
| void image_transform::pre_compute_x_y_map(float offset) |
Please add comments describing what the pre-compute comprises
| bottom_right = _pixel_bottom_right_int; | ||
| } | ||
| switch (bpp) |
The switch case seems to be redundant
src/proc/align.cpp
Outdated
| template<> | ||
| inline void image_transform::distorte_x_y<RS2_DISTORTION_MODIFIED_BROWN_CONRADY>(const __m128& x, const __m128& y, __m128* distorted_x, __m128* distorted_y, const rs2_intrinsics& to) | ||
| { | ||
| __m128 c[5]; |
Please add comments for the hard-coded values, e.g. [5] == distortion coefficients, for maintainability
src/proc/align.cpp
Outdated
| const rs2_intrinsics& to, | ||
| const rs2_extrinsics& from_to_other) | ||
| { | ||
| //mask for shuffle |
Please add pseudo-code to describe the flow for maintainability
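As a rough illustration of the kind of flow description being asked for, here is a scalar sketch of aligning depth to another stream: deproject, transform, project, then keep the nearest depth. It is a simplification made under stated assumptions, not the PR's implementation: there is no distortion handling, each depth pixel is sampled once at its origin rather than at its corners, and the rotation matrix is assumed to be column-major. All type and function names are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the fields of rs2_intrinsics / rs2_extrinsics used here.
struct intrinsics_sketch { int width, height; float ppx, ppy, fx, fy; };
struct extrinsics_sketch { float rotation[9]; float translation[3]; }; // rotation assumed column-major

inline std::vector<std::uint16_t> align_depth_to_other_sketch(
    const std::vector<std::uint16_t>& depth, float depth_units,
    const intrinsics_sketch& from, const intrinsics_sketch& to,
    const extrinsics_sketch& e)
{
    std::vector<std::uint16_t> out(static_cast<std::size_t>(to.width) * to.height, 0);
    for (int h = 0; h < from.height; ++h)
    {
        for (int w = 0; w < from.width; ++w)
        {
            const std::uint16_t raw = depth[static_cast<std::size_t>(h) * from.width + w];
            if (raw == 0) continue; // no depth measured at this pixel
            const float z = raw * depth_units;

            // 1. Deproject the depth pixel to a 3D point in the depth camera frame.
            const float p[3] = { (w - from.ppx) / from.fx * z,
                                 (h - from.ppy) / from.fy * z,
                                 z };

            // 2. Transform the point into the other camera's frame.
            const float* r = e.rotation;
            const float* t = e.translation;
            const float q[3] = { r[0] * p[0] + r[3] * p[1] + r[6] * p[2] + t[0],
                                 r[1] * p[0] + r[4] * p[1] + r[7] * p[2] + t[1],
                                 r[2] * p[0] + r[5] * p[1] + r[8] * p[2] + t[2] };
            if (q[2] <= 0.f) continue; // behind the other camera

            // 3. Project onto the other camera's image plane (round to nearest pixel).
            const int u = static_cast<int>(std::floor(q[0] / q[2] * to.fx + to.ppx + 0.5f));
            const int v = static_cast<int>(std::floor(q[1] / q[2] * to.fy + to.ppy + 0.5f));
            if (u < 0 || u >= to.width || v < 0 || v >= to.height) continue;

            // 4. Keep the nearest depth when several pixels land on the same target.
            std::uint16_t& dst = out[static_cast<std::size_t>(v) * to.width + u];
            dst = (dst == 0 || raw < dst) ? raw : dst;
        }
    }
    return out;
}
```

With identity extrinsics and identical intrinsics the output equals the input, which makes a convenient sanity check for any optimized variant.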
This pull request improves the latency and CPU utilization of the align processing block. The solution combines SSE4 optimization for Intel architecture with changes to the algorithm to reduce overhead.
Following up on: #1189, #1105
OS Microsoft Windows 10 Enterprise
Processor Intel(R) Core(TM) i5-5300U CPU @ 2.30GHz, 2301 MHz, 2 Core(s), 4 Logical Processor(s)
Align depth to color
Align color to depth
Ubuntu 16.04
Intel® Core™ i7-6700 CPU @ 3.40GHz × 8
Align depth to color
Align color to depth