Fix leaking sockets #454
I've started reviewing, but I'd rather look at a clean diff before adding any comments, as I may be missing things.
Thanks for updating. I'm not sure I fully understand the motivation behind the change here though. It seems like the problem is that, if an exception is thrown from a reused connection, we need to ensure that it is released. My question is: what makes timeout exceptions special? Shouldn't we simply call |
@swamp-agr Let me describe how it works right now. When a new connection is allocated, a weak finalizer is attached to it to deallocate it in case it leaks, but that doesn't guarantee prompt deallocation, see here. Indeed […] The issue in question occurs after 4-7 hours of correct operation, so I assume there were multiple major GCs during this time, and the finalizers had enough time to run. I see three possibilities here:
So my conclusion: it's likely that you are fixing a symptom instead of the core issue. There is nothing wrong with fixing symptoms, and actually I never liked the fact that […] And I agree with @snoyberg that the request timeout is not special in any way. Consider, for example, the connection timeout: with high probability it will trigger the same issue. Not to mention IO errors or async exceptions.
@Yuras, thank you for the detailed explanation. @snoyberg, I want to provide a broad fix for deallocation of the […] In this case the next steps for me would be:
Please expect a fix soon.
- `getConn` wrapped in `bracketOnError` because `timeout ms f` applied to it.
- All exceptions trigger connection release with `DontReuse`.
- Timeout settings reverted completely.

The changes described in the comment above are now implemented.
    | otherwise =
        -- Release connection in case of connection timeout:
        -- https://github.com/snoyberg/http-client/pull/454
        bracketOnError
This call is a no-op. `bracketOnError` will only trigger the cleanup if the in-between action throws an exception. If the allocation action throws an exception, the finalizer will never be run.
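The behaviour described in this comment can be checked with a small standalone sketch. It uses only `base`; `acquire` and `cleanupRuns` are hypothetical names invented for the illustration and are not part of http-client.

```haskell
import Control.Exception (ErrorCall (..), SomeException, bracketOnError, throwIO, try)
import Data.IORef (modifyIORef', newIORef, readIORef)

-- Hypothetical stand-in for acquiring a connection; it can be told to fail.
acquire :: Bool -> IO String
acquire failHere
  | failHere = throwIO (ErrorCall "acquire failed")
  | otherwise = pure "conn"

-- Run bracketOnError once and report how many times the cleanup ran.
cleanupRuns :: Bool -> Bool -> IO Int
cleanupRuns acquireFails bodyFails = do
  counter <- newIORef (0 :: Int)
  _ <- try (bracketOnError
              (acquire acquireFails)
              (\_ -> modifyIORef' counter (+ 1))
              (\_ -> if bodyFails then throwIO (ErrorCall "body failed") else pure ()))
         :: IO (Either SomeException ())
  readIORef counter

main :: IO ()
main = do
  cleanupRuns True False >>= print   -- acquire throws: prints 0
  cleanupRuns False True >>= print   -- body throws: prints 1
  cleanupRuns False False >>= print  -- nothing throws: prints 0
```

The cleanup runs only in the second case, so wrapping just the acquisition step in `bracketOnError` adds no protection.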
I thought about that. On the other hand, if an exception is thrown, the `Managed Connection` will not be created, so it will not be an issue. Is that correct, or am I missing something?
That's correct. I'm not sure what problem you're trying to prevent here.
- I was concerned about the `timeout timeout' (getConn req m)` call in `getConnectionWrapper`.
- I thought that in some tricky case a connection could be acquired but forgotten via `timeout`. In that case its future would be uncertain, and I tried to prevent that.
- If the `Managed Connection` is created before the timeout, it will definitely be returned. Otherwise, it will not be created at all.
- And if it is not created, then the `ConnectionTimeout` exception can be "safely" thrown from `getConnectionWrapper`.

Now I see there is nothing to worry about. I will completely restore `getConn`.
Thank you for your comment.
Fixed.
LGTM! My only remaining request is a changelog entry.
@Yuras did you have any additional comments before I merge this one in?
@snoyberg changelog updated accordingly (I assume that next version will be |
After a quick chat, @Yuras pointed out to me where the connection could still leak:
getConnectionWrapper mtimeout f =
    case mtimeout of
        Nothing -> fmap ((,) Nothing) f
        Just timeout' -> do
            before <- getCurrentTime
            mres <- timeout timeout' f
            case mres of
                Nothing -> throwHttp ConnectionTimeout
                Just res -> do
                    now <- getCurrentTime
                    let timeSpentMicro = diffUTCTime now before * 1000000
                        remainingTime = round $ fromIntegral timeout' - timeSpentMicro
                    if remainingTime <= 0
                        -- The connection has already been acquired here, so it could leak.
                        then throwHttp ConnectionTimeout
                        else return (Just remainingTime, res)

There is also the possibility of asynchronous exceptions.
    bracketOnError getConnWithTimeout releaseConnMaybe checkConnDuration
  where
    -- If applicable, wait for timeout period to acquire the connection from the pool.
    getConnWithTimeout = case mtimeout of
        Nothing -> fmap ((,) Nothing) f
        Just timeout' -> do
            before <- getCurrentTime
            mres <- timeout timeout' f
This is a problem. `timeout` is now running in a masked state, which we shouldn't be doing.
Should the `timeout` function be wrapped in `interruptible` explicitly?
@snoyberg Why is it a problem? `getConn` is interruptible, so `timeout` will be able to interrupt it. And in any case, `getConn` basically just calls `takeKeyedPool`, which is `mask_`'ed anyway.
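The point about interruptibility can be illustrated with a minimal sketch using only `base`. Here `threadDelay` stands in for a blocking, interruptible acquisition; the name `timeoutUnderMask` and the durations are assumptions made for the example, and real pool code may behave differently if it contains non-blocking or uninterruptible sections.

```haskell
import Control.Concurrent (threadDelay)
import Control.Exception (mask_)
import System.Timeout (timeout)

-- Run a long blocking wait with a short timeout while asynchronous
-- exceptions are masked. Blocking operations such as threadDelay are
-- interruptible, so the timeout can still cut the wait short.
timeoutUnderMask :: IO (Maybe ())
timeoutUnderMask =
  mask_ $ timeout 100000 $   -- 0.1 s budget
    threadDelay 5000000      -- 5 s blocking wait

main :: IO ()
main = timeoutUnderMask >>= print
```

The result should be `Nothing`, returned after roughly 0.1 s rather than 5 s. This only shows that masking does not stop `timeout` from interrupting a blocked thread; it does not settle whether relying on that is good style, which is the concern raised in the reply that follows.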
I don't want to rely on the current behavior of `bracketOnError` not masking in the acquire clause; that may change in the future. It's fragile to write code that way IMO. It seems like the primary thing we're trying to achieve is ensuring that `releaseConnMaybe` is called in the case of `remainingTime <= 0`. Why not add that in directly before the `throwHttp` call, or add an `onException`?
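A minimal sketch of the first suggestion, releasing directly before throwing, might look like the following. It is not the code from this PR: `DemoTimeout`, `acquireWithin`, and `demoReleases` are hypothetical names, the exception stands in for `ConnectionTimeout`, and the monotonic clock from `GHC.Clock` is used instead of `getCurrentTime` to keep the example inside `base`.

```haskell
import Control.Exception (Exception, throwIO, try)
import Data.IORef (modifyIORef', newIORef, readIORef)
import GHC.Clock (getMonotonicTimeNSec)
import System.Timeout (timeout)

-- Hypothetical exception standing in for http-client's ConnectionTimeout.
data DemoTimeout = DemoTimeout deriving Show
instance Exception DemoTimeout

-- Acquire a resource within a budget given in microseconds. If the resource
-- arrives but the budget is already used up, release it before throwing.
acquireWithin :: Int -> IO a -> (a -> IO ()) -> IO (Int, a)
acquireWithin budgetMicro acquire release = do
  before <- getMonotonicTimeNSec
  mres <- timeout budgetMicro acquire
  case mres of
    Nothing -> throwIO DemoTimeout  -- nothing was acquired, nothing to release
    Just res -> do
      now <- getMonotonicTimeNSec
      let spentMicro = fromIntegral ((now - before) `div` 1000) :: Int
          remaining = budgetMicro - spentMicro
      if remaining <= 0
        then release res >> throwIO DemoTimeout
        else pure (remaining, res)

-- A negative budget makes `timeout` wait indefinitely, so this call always
-- takes the exhausted-budget branch. Returns the number of releases seen.
demoReleases :: IO Int
demoReleases = do
  released <- newIORef 0
  r <- try (acquireWithin (-1) (pure "conn") (\_ -> modifyIORef' released (+ 1)))
  case r of
    Left DemoTimeout -> pure ()
    Right _ -> pure ()
  readIORef released

main :: IO ()
main = demoReleases >>= print  -- prints 1: the resource was released before the throw
```

This keeps `timeout` outside any mask and handles only the one branch where a resource exists but the budget is spent. It does not cover asynchronous exceptions arriving between acquisition and the check, which the thread leaves for later improvements.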
Restored previous version and handled this case directly. Fixed.
This looks good. Let’s start with this change, and we can continue with other improvements as necessary.
Hi @snoyberg!
This PR is intended to fix the socket leak that occurs when `managerResponseTimeout` is relatively small. With a constant 1K rps of outgoing requests, the application becomes stuck after 4-17 hours of uptime, so a forced reboot is required every 4 hours. The situation is very similar to #374, so this PR will close #374.

Details
withResponse req man f = bracket (responseOpen req man) responseClose f
- […] `bracket` (i.e. in `responseOpen`, when the connection is opened).
- […] `bracket`.
- […] `reaper` introduced in `Data.KeyedPool` did not manage such connections well.
- […] `KeyedPool` for further debugging purposes and produce an explicit test for such a case.
- […] `httpRaw'` some exceptions are indeed handled well:
- `mTimeoutException` added like `mRetryableException`, i.e. in a flexible way. A default implementation was also provided: `const False`, to fall back to current upstream behaviour.
- […] `Manager` contains an assumption about the next package version (`@since 0.7.5`). I could remove it if you ask.

I have `stylish-haskell` enabled by default, so there are also some formatting changes. If the formatting is not acceptable, I can revert it.

I am looking forward to bringing the fix upstream.
Best Regards,
@swamp-agr