Performance optimization for Align post processing block #2040

aangerma · 2018-07-11T16:43:40Z

This pull-request improves latency and CPU utilization of align processing block.
The solution combines SSE4 optimization for Intel architecture with changes to the algorithm to reduce overhead.

Following-up on: #1189, #1105

OS Microsoft Windows 10 Enterprise

Processor Intel(R) Core(TM) i5-5300U CPU @ 2.30GHz, 2301 Mhz, 2 Core(s), 4 Logical Processor(s)

Align depth to color

		Depth 640x480	Depth 1280x720
Color 640x480	NAIVE (microseconds)	11217.4	26023.6
	SSE (microseconds)	4346.96	4360.56
Speedup		x 2.58	x 5.96
Color 1280x720	NAIVE (microseconds)	14868.2	44311.9
	SSE (microseconds)	10497.1	8327.42
Speedup		x 1.41	x 5.32

Align color to depth

		Color 640x480	Color 1280x720
Depth 640x480	NAIVE (microseconds)	14904.4	14642.8
	SSE (microseconds)	3454.34	3863.21
Speedup		x 4.31	x 3.79
Depth 1280x720	NAIVE (microseconds)	24630.5	22478.8
	SSE (microseconds)	3218.72	3292.07
Speedup		x 7.65	x 6.82

Ubuntu 16.04

Intel® Core™ i7-6700 CPU @ 3.40GHz × 8

Align depth to color

		Depth 640x480	Depth 1280x720
Color 640x480	NAIVE (microseconds)	13006.9	19729.7
	SSE (microseconds)	1941.03	1120.02
Speedup		x 6.70	x 17.61
Color 1280x720	NAIVE (microseconds)	8325.66	21869.3
	SSE (microseconds)	4256.76	3838.1
Speedup		x 1.95	x 5.69

Align color to depth

		Color 640x480	Color 1280x720
Depth 640x480	NAIVE (microseconds)	14049.8	10048.8
	SSE (microseconds)	1475.04	1518.32
Speedup		x 9.52	x 6.61
Depth 1280x720	NAIVE (microseconds)	22304.3	25391.8
	SSE (microseconds)	1430.76	1526.98
Speedup		x 15.58	x 16.62

ev-mp

Looking good

ev-mp · 2018-07-16T08:34:15Z

src/proc/align.cpp

                byte* other_aligned_to_depth = const_cast<byte*>(aligned_frame.frame->get_frame_data());
                memset(other_aligned_to_depth, 0, depth_intrinsics.height * depth_intrinsics.width * aligned_bytes_per_pixel);
+#ifdef __SSSE3__
+                auto uid = other_frame->get_stream()->get_unique_id();


Is it used ?

ev-mp · 2018-07-16T08:34:45Z

src/proc/align.cpp

+                auto data = (int16_t*)depth_frame->get_frame_data();
+
+#ifdef __SSSE3__
+                auto uid = other_frame->get_stream()->get_unique_id();


ev-mp · 2018-07-16T08:35:45Z

src/proc/align.cpp

    }
-}
+
+        }


Indentation + add empty line at the end of file

ev-mp · 2018-07-16T08:51:18Z

src/proc/align.cpp

+    {
+    }
+
+    void image_transform::pre_compute_x_y_map(float offset)


Please add comments what the pre-compute comprise of

ev-mp · 2018-07-16T09:00:54Z

src/proc/align.cpp

+            bottom_right = _pixel_bottom_right_int;
+        }
+
+        switch (bpp)


The switch case seem to be redundant

ev-mp · 2018-07-16T09:02:55Z

src/proc/align.cpp

+    template<>
+    inline void image_transform::distorte_x_y<RS2_DISTORTION_MODIFIED_BROWN_CONRADY>(const __m128& x, const __m128& y, __m128* distorted_x, __m128* distorted_y, const rs2_intrinsics& to)
+    {
+        __m128 c[5];


Please add comments for the hard-coded values - [5] == distortion coefficients,
for maintainability

ev-mp · 2018-07-16T09:03:52Z

src/proc/align.cpp

+        const rs2_intrinsics& to,
+        const rs2_extrinsics& from_to_other)
+    {
+        //mask for shuffle


Please add pseudo-code to describe the flow for maintainability

aangerma force-pushed the development branch 3 times, most recently from 85bd4a4 to 4fe7f05 Compare July 12, 2018 14:29

dorodnic closed this Jul 12, 2018

dorodnic reopened this Jul 12, 2018

dorodnic closed this Jul 15, 2018

dorodnic reopened this Jul 15, 2018

aangerma force-pushed the development branch from 4fe7f05 to a9ecc1b Compare July 16, 2018 06:42

aangerma added 2 commits July 16, 2018 10:19

Align improve performance.

029b0d9

Fixes compilation error on android.

59d6af9

ev-mp approved these changes Jul 16, 2018

View reviewed changes

aangerma force-pushed the development branch from a9ecc1b to 30a76ea Compare July 16, 2018 08:47

ev-mp reviewed Jul 16, 2018

View reviewed changes

src/proc/align.cpp

bottom_right = _pixel_bottom_right_int;

}

switch (bpp)

Copy link

Contributor

ev-mp Jul 16, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The switch case seem to be redundant

ev-mp reviewed Jul 16, 2018

View reviewed changes

Fixes following code review.

7abb777

aangerma force-pushed the development branch from 30a76ea to 7abb777 Compare July 16, 2018 09:50

dorodnic merged commit 05d95f8 into realsenseai:development Jul 16, 2018

jpapon mentioned this pull request Jul 19, 2018

High CPU Utilization #1105

Closed

dorodnic mentioned this pull request Aug 29, 2018

Low fps when aligning depth to colored frame #2321

Closed

dorodnic mentioned this pull request Dec 17, 2018

Grid artifact related to SSE instructions #2909

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Performance optimization for Align post processing block #2040

Performance optimization for Align post processing block #2040

Uh oh!

aangerma commented Jul 11, 2018 •

edited by dorodnic

Loading

Uh oh!

ev-mp left a comment

Uh oh!

ev-mp Jul 16, 2018

Uh oh!

ev-mp Jul 16, 2018

Uh oh!

ev-mp Jul 16, 2018

Uh oh!

ev-mp Jul 16, 2018

Uh oh!

ev-mp Jul 16, 2018

Uh oh!

ev-mp Jul 16, 2018

Uh oh!

ev-mp Jul 16, 2018

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

                   }
-              }
+                      }

                
                    No newline at end of file

Performance optimization for Align post processing block #2040

Performance optimization for Align post processing block #2040

Uh oh!

Conversation

aangerma commented Jul 11, 2018 • edited by dorodnic Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

OS Microsoft Windows 10 Enterprise

Align depth to color

Align color to depth

Ubuntu 16.04

Align depth to color

Align color to depth

Uh oh!

ev-mp left a comment

Choose a reason for hiding this comment

Uh oh!

ev-mp Jul 16, 2018

Choose a reason for hiding this comment

Uh oh!

ev-mp Jul 16, 2018

Choose a reason for hiding this comment

Uh oh!

ev-mp Jul 16, 2018

Choose a reason for hiding this comment

Uh oh!

ev-mp Jul 16, 2018

Choose a reason for hiding this comment

Uh oh!

ev-mp Jul 16, 2018

Choose a reason for hiding this comment

Uh oh!

ev-mp Jul 16, 2018

Choose a reason for hiding this comment

Uh oh!

ev-mp Jul 16, 2018

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

aangerma commented Jul 11, 2018 •

edited by dorodnic

Loading