in_tail: Check detaching inode when follow_inodes #4191

garyzjq · 2023-05-31T06:17:26Z

Which issue(s) this PR fixes:
Fixes #4190

What this PR does / why we need it:
~~Add validation to make sure detach_watcher is detaching expected watcher. This can avoid unexpectedly detach new watcher created for new log file and lead to log stuck transiently.~~

Add a log that checks whether the inode being detached matches the inode of the TailWatcher being detached when `follow_inodes` is enabled.

Note: If they do not match, canceling the detach (by adding return) may prevent an incorrect detach.
Since #4208 will prevent an incorrect detach, we will only add the warning log in this PR for now.
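
The check described above could be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: `SketchWatcher` and `inode_mismatch_on_detach?` are hypothetical names, and the log message wording is an assumption. It only warns on a mismatch and does not cancel the detach.

```ruby
require "logger"
require "stringio"

# Hypothetical stand-in for TailWatcher; it holds only the fields the check needs.
SketchWatcher = Struct.new(:path, :ino)

# Warns when the inode requested for detach differs from the watcher's inode.
# Returns true when a mismatch was detected. It never cancels the detach.
def inode_mismatch_on_detach?(watcher, ino, follow_inodes:, logger:)
  return false unless follow_inodes
  return false if watcher.ino == ino

  logger.warn("detaching a watcher with an unexpected inode: " \
              "path=#{watcher.path} watcher_ino=#{watcher.ino} expected_ino=#{ino}")
  true
end

out = StringIO.new
logger = Logger.new(out)
watcher = SketchWatcher.new("/var/log/app.log", 200)

# The caller asks to detach inode 100, but the watcher follows inode 200.
inode_mismatch_on_detach?(watcher, 100, follow_inodes: true, logger: logger)
puts out.string
```

Returning a boolean instead of returning early from the detach keeps the behavior unchanged, which matches the "log only" decision in the note.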

Docs Changes:
N/A

Release Note:
~~Fix transient log stuck in in_tail when log file rotated and follow_inodes is enabled~~
Same as the title.

lib/fluent/plugin/in_tail.rb

If `refresh_watchers` runs before `update_watcher`, the old implementation of `update_watcher` wrongly detaches the new TailWatcher that was added in `refresh_watchers`. This stops the log from being tailed and causes a handle leak. The test case `test_updateTW_after_refreshTW` reproduces this problem. This fix solves it.

There is another bug related to unwatching. I adjusted some expected values of the tests for this bug. When `refresh_watchers` finds the rotated old file AFTER it has been unwatched (`rotate_wait`), the logs are collected in duplicate. If `refresh_watchers` finds it BEFORE it has been unwatched (`rotate_wait`), this problem doesn't occur because the position entry is still alive. We need to fix this and fix the adjusted expected values.

This fix is based on the content and discussion of the following issues and PRs:

* fluent#3614
* fluent#4185
* fluent#4191

Signed-off-by: Daijiro Fukuda <[email protected]>
Co-authored-by: Katuya Kawakami <[email protected]>
Co-authored-by: Masaki Hatada <[email protected]>
Co-authored-by: Gary Zhu <[email protected]>
Co-authored-by: Takuro Ashie <[email protected]>
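
The ordering problem in the first paragraph of that commit message can be modeled with a toy example. This is a simplified sketch, assuming a watcher map keyed by path; `SketchWatcher`, `tails`, and the inode numbers are illustrative and are not the real in_tail data structures.

```ruby
# Hypothetical stand-in for TailWatcher, not the real class.
SketchWatcher = Struct.new(:path, :ino, :attached)

path = "/var/log/app.log"
tails = {} # simplified watcher map, keyed by path (an assumption)

# Before rotation: one watcher follows inode 100.
old_watcher = SketchWatcher.new(path, 100, true)
tails[path] = old_watcher

# The file is rotated. The refresh step runs first and registers a
# watcher for the new file (inode 200) under the same path.
new_watcher = SketchWatcher.new(path, 200, true)
tails[path] = new_watcher

# The update step for the rotation runs afterwards. It looks the watcher
# up by path and detaches whatever it finds, which is now the new watcher.
found = tails.delete(path)
found.attached = false

puts "old watcher attached: #{old_watcher.attached}"
puts "new watcher attached: #{new_watcher.attached}"
puts "watchers left in map: #{tails.size}"
```

In this model the old watcher stays attached although nothing tracks it any more (the leak), while the new watcher is detached (tailing stops).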

ashie · 2023-06-26T06:08:59Z

@garyzjq Sorry for my late response.
We are going to fix this issue with #4208, please check it.
I believe it resolves both the stuck issue and the leak issue.


ashie · 2023-06-27T00:51:54Z

Although the issue will be fixed by #4208, this fix might still be useful as a last guard.
In particular, it might be better to add the log even if we don't return here.
So we'll keep this open for a while to consider it.

garyzjq · 2023-06-28T05:36:34Z

Thanks @ashie. I also checked #4208 and it looks good. The watcher to detach is passed directly to the `detach_watcher` method instead of being looked up from the tail map, so it should be correct by design.
I think I can change my PR to add a log without returning (for safety), to see whether there is any other corner case that may lead to detaching an unexpected watcher.
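
The "correct by design" point can be illustrated with a small sketch. It is not the #4208 implementation: `SketchWatcher` and `detach_given_watcher` are hypothetical names, and the path-keyed `tails` map is an assumption made for the example.

```ruby
# Hypothetical stand-in for TailWatcher, not the real class.
SketchWatcher = Struct.new(:path, :ino, :attached)

# Detaches exactly the watcher it is given; no lookup in the map is involved.
def detach_given_watcher(watcher)
  watcher.attached = false
  watcher
end

path = "/var/log/app.log"
old_watcher = SketchWatcher.new(path, 100, true)
new_watcher = SketchWatcher.new(path, 200, true)

# The refresh step has already replaced the map entry with the new watcher.
tails = { path => new_watcher }

# The rotation handler still holds a reference to the old watcher and
# passes that object on, so the map contents cannot redirect the detach.
detach_given_watcher(old_watcher)

puts "old watcher attached: #{old_watcher.attached}"
puts "new watcher attached: #{new_watcher.attached}"
```

Because the caller hands over the object itself, a later change to the map cannot make the detach hit a different watcher; a warning log on inode mismatch then only serves as a last guard for unknown corner cases.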

ashie

> I think I can change my PR to add a log but won't return (for safety), to see whether there's any other corner case which may lead to detach unexpected watcher.

Thanks, it's worth merging to check other cases.

daipom

@garyzjq
LGTM! Thanks!

Could you please do the following?

* Add DCO to all the commits or rebase them into one commit.
* Fix the comment: in_tail: Check detaching inode when follow_inodes #4191 (comment)


garyzjq · 2023-06-29T04:06:26Z

> @garyzjq LGTM! Thanks!
>
> Could you please do the following?
>
> * Add DCO to all the commits or rebase them to one commit.
> * Fix the comment: Check inode expectation to detach correct watcher when follow_inodes #4191 (comment)

Sure, done for DCO.

daipom · 2023-06-30T02:15:25Z

@garyzjq Thanks! I fixed the title and some issue comments, and will merge this!
#4190 and this PR have given us a lot of insight into how to improve in_tail!
Thanks so much!

garyzjq · 2023-06-30T06:57:53Z

Thanks @daipom and @ashie, it was a really nice experience for me to contribute to fluentd :)

ashie self-requested a review May 31, 2023 06:20

garyzjq force-pushed the jiaqz/inodestuck branch 2 times, most recently from 6e8bbde to fbc3c2d May 31, 2023 06:30

ashie reviewed Jun 1, 2023

lib/fluent/plugin/in_tail.rb

This was referenced Jun 6, 2023

Fluentd in_tail "unreadable" file causes "following tail of <file>" to stop and no logs pushed #3614

Closed

in_tail: Use inode for key of TailWatcher when follow_inodes #4185

Closed

daipom mentioned this pull request Jun 20, 2023

in_tail: Ensure to detach correct watcher on rotation with follow_inodes #4208

Merged

ashie approved these changes Jun 28, 2023


ashie requested a review from daipom June 29, 2023 00:18

daipom approved these changes Jun 29, 2023


garyzjq added 2 commits June 29, 2023 12:05

Check inode expectation to detach correct watcher when follow_inodes

e9b36c1

Signed-off-by: Gary Zhu <[email protected]> Signed-off-by: garyzjq <[email protected]>

only add warning log only when detect detaching a wrong watcher

bcd9a1a

Signed-off-by: garyzjq <[email protected]>

garyzjq force-pushed the jiaqz/inodestuck branch from da55678 to bcd9a1a June 29, 2023 04:05

daipom changed the title ~~Check inode expectation to detach correct watcher when follow_inodes~~ in_tail: Check detaching inode when follow_inodes Jun 30, 2023

daipom merged commit f436211 into fluent:master Jun 30, 2023

garyzjq deleted the jiaqz/inodestuck branch July 2, 2023 16:20

daipom added this to the v1.16.2 milestone Oct 9, 2023
