Skip to content

Fix forever lock of main thread in runBlocking #2374

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 6 commits into from

Conversation

slavonnet
Copy link
Contributor

In havy concurect UI load where in views have many small runBlocking (and withContext) blocks some time we can get parkNanos == Long.MAX_VALUE.

@slavonnet
Copy link
Contributor Author

@slavonnet
Copy link
Contributor Author

@elizarov can you review please? :)

@slavonnet
Copy link
Contributor Author

maybe fix #1679

Copy link
Contributor

@elizarov elizarov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First and foremost, this PR needs a test that it supposedly fixes. Also, looking at the code I see (wihtout any test) that this code will be consuming 100% CPU time when there are no tasks for runBlocking { ... } to execute, since, in this case, nanos == Long.MAX_VALUE and this change makes parkNanos a noop that immediately returns instead of waiting (without consuming CPU time) as it should be.

@slavonnet
Copy link
Contributor Author

There is a difficulty ... I cannot write a test. I know the result of a long blocking, but how quickly to get an empty queue, but with a working coroutine there are only guesses ..

I repeat (rarely) when creating hundreds of custom views (form fields) where there is runBlocking in the init

- add logic (if MainThread) becouse need more chance process Frame
@slavonnet slavonnet requested a review from elizarov November 9, 2020 22:40
…dlock

# Conflicts:
#	kotlinx-coroutines-core/jvm/src/TimeSource.kt
@slavonnet
Copy link
Contributor Author

fixed

Copy link
Contributor Author

@slavonnet slavonnet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code more clean

Copy link
Contributor

@elizarov elizarov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All I see from this code is that it is attempting to shuffle some problems under the rug by replacing a proper "sleep until woken up" with "sleep at most 10ms", instead of figuring out what's going on and fixing the problem. Without a test, it is not even possible to very that it helps in any sense.

@slavonnet
Copy link
Contributor Author

slavonnet commented Nov 18, 2020

@elizarov Я попробую, но ты не прав. Ограничения на главном потоке более важны и критичны чем текущий баг. Это ограничение должно проверяться независимо от текущего вопросы. Главный поток должен просыпаться внутри фрейма чтобы на главном Лупере отрисовать фрейм.. Если по какой-то причине след событие далеко - не забываем запускать фрейм. ю

И что плохого если глобальное у и логичное условие не дает права на такую ошибку в будущем? Почему рельзчя глобально исправить кейс?

@slavonnet
Copy link
Contributor Author

I wrote a draft that shows deadlocks in different places. Results vary from system to system.
Please manual run file from
#2428

@slavonnet
Copy link
Contributor Author

Looks like queue overload somewhere

image
image

Copy link
Member

@qwwdfsad qwwdfsad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've studied tests from the #2428.

The very first hang test can be boiled down to a classic deadlock:

val executor = newFixedThreadPoolContext(1, "")
runBlocking(executor) {
    withContext(executor) {
}

It's not a bug in kotlinx.coroutines, but rather a classic deadlock when trying to submit two tasks into single-threaded executor that depend on each other.
The rest of the test timeouts because the executors from the previous tests were not reset and still hang on coroutines from the previous test runs.

Neither PR nor tests seem to be constructive or respectful, so I'm closing this PR.
You can create a separate issue with a stable reproducer that has the self-contained snippet of code, expected behaviour and actual behaviour description. Please note that just giving us 625 different test configurations that fail for some reason without further analysis is not helpful and just waste maintainers time. I also suggest getting familiar with contribution guide

The official language of the Kotlin org is English, any further comments using non-English language will be removed

@qwwdfsad qwwdfsad closed this Mar 24, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants