Loom test for deadlock observed in tokio's test suite #6876

Draft
wants to merge 1 commit into
base: master

@jofas jofas commented Sep 28, 2024

This PR adds a Loom test for the deadlock observed in #6847.

When I run this test locally with

LOOM_MAX_PREEMPTIONS=1 LOOM_MAX_BRANCHES=10000 RUSTFLAGS="--cfg loom -C debug_assertions" \
    cargo test --lib --release --features full pool_deadlock_on_blocked_task \
    -- --test-threads=1 --nocapture

I get the following error:

running 1 test
test runtime::tests::loom_multi_thread::group_d::pool_deadlock_on_blocked_task ... thread 'runtime::tests::loom_multi_thread::group_d::pool_deadlock_on_blocked_task' panicked at /home/masterusr/.cargo/registry/src/index.crates.io-6f17d22bba15001f/loom-0.7.2/src/rt/execution.rs:216:13:
deadlock; threads = [(Id(0), Blocked(Location(None))), (Id(1), Blocked(Location(None))), (Id(2), Blocked(Location(None)))]
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
thread 'runtime::tests::loom_multi_thread::group_d::pool_deadlock_on_blocked_task' panicked at /home/masterusr/.cargo/registry/src/index.crates.io-6f17d22bba15001f/loom-0.7.2/src/rt/thread.rs:276:39:
called `Option::unwrap()` on a `None` value
stack backtrace:
   0:     0x56425478aad5 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h1e1a1972118942ad
   1:     0x5642547b34db - core::fmt::write::hc090a2ffd6b28c4a
   2:     0x56425478834f - std::io::Write::write_fmt::h8898bac6ff039a23
   3:     0x56425478a8ae - std::sys_common::backtrace::print::ha96650907276675e
   4:     0x56425478c229 - std::panicking::default_hook::{{closure}}::h215c2a0a8346e0e0
   5:     0x56425478bf6d - std::panicking::default_hook::h207342be97478370
   6:     0x56425478c7f6 - std::panicking::rust_panic_with_hook::hac8bdceee1e4fe2c
   7:     0x56425478c56b - std::panicking::begin_panic_handler::{{closure}}::h00d785e82757ce3c
   8:     0x56425478af99 - std::sys_common::backtrace::__rust_end_short_backtrace::h1628d957bcd06996
   9:     0x56425478c2d7 - rust_begin_unwind
  10:     0x5642543ae4e3 - core::panicking::panic_fmt::hdc63834ffaaefae5
  11:     0x5642543ae58c - core::panicking::panic::h75b3c9209f97d725
  12:     0x5642543ae489 - core::option::unwrap_failed::h4b4353bf890a85df
  13:     0x5642545f8aff - loom::rt::object::Ref<T>::set_action::hd5b09cd3dece6232
  14:     0x56425461154c - scoped_tls::ScopedKey<T>::with::hd6ef3a1bee7ec98b
  15:     0x5642545e6bce - loom::rt::atomic::Atomic<T>::store::h5d2d323740f21a8e
  16:     0x56425459365c - tokio::runtime::scheduler::multi_thread::park::Parker::park::h4ae71e780fadb3a8
  17:     0x56425450599e - tokio::runtime::scheduler::multi_thread::worker::Context::park_timeout::hc7658d589be3126b
  18:     0x56425450476c - tokio::runtime::scheduler::multi_thread::worker::Context::run::h57894510918d1e2b
  19:     0x56425454aafd - tokio::runtime::context::scoped::Scoped<T>::set::h8d9c484a2b1a5a11
  20:     0x5642544bdab5 - loom::thread::LocalKey<T>::try_with::h3f0132ee1c91aba6
  21:     0x564254524aeb - tokio::runtime::context::runtime::enter_runtime::h3ce4255900a8aedd
  22:     0x564254503b15 - tokio::runtime::scheduler::multi_thread::worker::run::h1f5b7e23b8e40277
  23:     0x5642544a9567 - loom::cell::unsafe_cell::UnsafeCell<T>::with_mut::h476d68d1ebb373c5
  24:     0x5642544eac54 - tokio::runtime::task::core::Core<T,S>::poll::h7aabc50663325fdb
  25:     0x56425442454e - tokio::runtime::task::harness::Harness<T,S>::poll::h56ead0a7ce702948
  26:     0x5642545a69be - tokio::runtime::blocking::pool::Inner::run::hfe127e926858c8af
  27:     0x564254562e2e - core::ops::function::FnOnce::call_once{{vtable.shim}}::hfbc82a17d67087f1
  28:     0x564254601527 - generator::stack::StackBox<F>::call_once::heb5e6a7940558221
  29:     0x56425475ed0b - std::panicking::try::h92c7df7a6cc6dd01
  30:     0x56425475f0c8 - generator::detail::gen::gen_init_impl::habe2c082c5ebb920
  31:     0x56425475ef79 - generator::detail::asm::gen_init::h5730af05b288df0e
  32:                0x0 - <unknown>
thread 'runtime::tests::loom_multi_thread::group_d::pool_deadlock_on_blocked_task' panicked at library/core/src/panicking.rs:228:5:
panic in a destructor during cleanup
thread caused non-unwinding panic. aborting.
error: test failed, to rerun pass `--lib`

Caused by:
  process didn't exit successfully: `/home/masterusr/src/tokio/target/release/deps/tokio-367a8771b781d33e pool_deadlock_on_blocked_task --test-threads=1 --nocapture` (signal: 6, SIGABRT: process abort signal)

which I believe shows that the test reproduces the deadlock: Loom reports all three threads (Id(0), Id(1), Id(2)) as Blocked.

The flaky test where the deadlock was first observed synchronizes with barriers. I used oneshot channels instead, because Loom currently does not support barriers.
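To illustrate the general idea of replacing a barrier with a pair of single-use channels, here is a minimal sketch. It is not the test from this PR: it uses std::sync::mpsc and plain threads as a stand-in for tokio's oneshot channels and Loom's threads, and the function name `rendezvous` and the values sent are hypothetical.

```rust
use std::sync::mpsc;
use std::thread;

/// Two threads meet at a rendezvous point without `std::sync::Barrier`.
/// Each thread announces its arrival on one channel, then blocks on the
/// other channel until its peer has arrived as well.
/// Returns (value received by thread A, value received by thread B).
fn rendezvous() -> (u32, u32) {
    // One channel per direction, each used exactly once
    // (an assumption standing in for a pair of oneshot channels).
    let (a_arrived_tx, a_arrived_rx) = mpsc::channel::<u32>();
    let (b_arrived_tx, b_arrived_rx) = mpsc::channel::<u32>();

    let a = thread::spawn(move || {
        a_arrived_tx.send(1).unwrap(); // signal: A has arrived
        b_arrived_rx.recv().unwrap() // block until B has arrived
    });

    let b = thread::spawn(move || {
        b_arrived_tx.send(2).unwrap(); // signal: B has arrived
        a_arrived_rx.recv().unwrap() // block until A has arrived
    });

    (a.join().unwrap(), b.join().unwrap())
}

fn main() {
    let (seen_by_a, seen_by_b) = rendezvous();
    // Neither thread can get past its `recv` before the other has sent.
    assert_eq!((seen_by_a, seen_by_b), (2, 1));
    println!("A received {seen_by_a}, B received {seen_by_b}");
}
```

Each side sends before it receives, so neither thread can wait forever on the other, and neither can pass the rendezvous point until both have reached it. That is the property a two-party barrier provides.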

I'm opening this as a Draft PR to get early feedback on whether I'm on the right track or have misunderstood the assignment.

@Darksonn

Yep, that looks like it catches the bug.
