You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Recently we started to see rsync failures on the Testing Farm.
The error happens from time to time, so I believe there is some parallelism taking place.
This is happening with tmt version 1.38.0
I believe the following fragment helps to understand the order of the events:
execute task #1: clang-tests on default-0
how: tmt
00:00:00 pass /clang-tests/tests/clang-format-diff (on default-0) [1/15]
00:00:03 pass /clang-tests/tests/fedora-flags (on default-0) [2/15]
00:00:01 pass /clang-tests/tests/gcc-clang-compatibility (on default-0) [3/15]
00:00:00 pass /clang-tests/tests/libomp (on default-0) [4/15]
00:32:56 pass /clang-tests/tests/llvm-test-suite (on default-0) [5/15]
00:00:00 pass /clang-tests/tests/long-double (on default-0) [6/15]
00:00:00 pass /clang-tests/tests/openmp-rpm (on default-0) [7/15]
00:00:10 pass /clang-tests/tests/rhbz_1647130 (on default-0) [8/15]
00:00:01 pass /clang-tests/tests/rhbz_1657544 (on default-0) [9/15]
00:00:00 pass /clang-tests/tests/rhbz_2239619 (on default-0) [10/15]
00:00:00 pass /clang-tests/tests/rhbz_482491 (on default-0) [11/15]
00:00:00 pass /clang-tests/tests/rpmmacros (on default-0) [12/15]
00:00:08 pass /clang-tests/tests/toolchains (on default-0) [13/15]
00:00:01 pass /clang-tests/tests/ucrt64-toolchain (on default-0) [14/15]
00:00:00 pass /clang-tests/tests/use-correct-dwarf-default (on default-0) [15/15]
execute task #2: compiler-rt-tests on default-0
how: tmt
00:00:06 pass /compiler-rt-tests/tests/broken-symlinks (on default-0) [1/3]
00:00:03 pass /compiler-rt-tests/tests/fp16-abi (on default-0) [2/3]
00:00:00 pass /compiler-rt-tests/tests/sanity (on default-0) [3/3]
execute task #3: libomp-tests on default-0
how: tmt
00:00:00 pass /libomp-tests/tests/sanity (on default-0) [1/1]
execute task #4: llvm-tests on default-0
how: tmt
fail: Failed to push workdir to the guest. This usually means that login as 'root' to the guest does not work.
fail: Command 'rsync -s -R -r -z --links --safe-links --delete -e 'ssh -oForwardX11=no -oStrictHostKeyChecking=no -oUserKnownHostsFile=/dev/null -oConnectionAttempts=5 -oConnectTimeout=60 -oServerAliveInterval=5 -oServerAliveCountMax=60 -oIdentitiesOnly=yes -p22 -i /etc/citool.d/id_rsa_artemis -oPasswordAuthentication=no' /var/ARTIFACTS/work-snapshot-gatingx591m28o/tests/snapshot-gating [email protected]:/' returned 24.
summary: 19 tests executed
finish
summary: 0 tasks completed
plan failed
The exception was caused by 1 earlier exceptions
Cause number 1:
execute step failed
The exception was caused by 1 earlier exceptions
Cause number 1:
Command 'rsync -s -R -r -z --links --safe-links --delete -e 'ssh -oForwardX11=no -oStrictHostKeyChecking=no -oUserKnownHostsFile=/dev/null -oConnectionAttempts=5 -oConnectTimeout=60 -oServerAliveInterval=5 -oServerAliveCountMax=60 -oIdentitiesOnly=yes -p22 -i /etc/citool.d/id_rsa_artemis -oPasswordAuthentication=no' /var/ARTIFACTS/work-snapshot-gatingx591m28o/tests/snapshot-gating [email protected]:/' returned 24.
stderr (3 lines)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Warning: Permanently added '18.117.234.166' (ED25519) to the list of known hosts.
file has vanished: "/var/ARTIFACTS/work-snapshot-gatingx591m28o/tests/snapshot-gating/execute/data/guest/default-0/clang-tests/tests/rhbz_1647130-8/output.txt.copy"
rsync warning: some files vanished before they could be transferred (code 24) at main.c(1338) [sender=3.3.0]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
So, while copying the files for llvm-tests, the files from the other tests end up being copied too, while something else is still playing with output from clang-tests.
I don't think there's any *.copy created by tmt, I do suspect TF might be a culprit here.
@ukulekek since you were involved in the archiving part of TF recently, WDYT? Could it be that TF creates various *.copy files in tmt's workdir, removing them at some point? Because if that's the case, I think we see a race condition in action: tmt starting an rsync command, with output.txt.copy in the workdir, but the file then gets removed by TF while rsync is still running, which results in rsync reporting it as a vanishing file.
Yes, we caused the problem because we do create copies of files in the working directory and delete them after syncing to the artifacts storage server. I created TFT-2973 issue and will work on it, so it should land next release
Recently we started to see rsync failures on the Testing Farm.
The error happens from time to time, so I believe there is some parallelism taking place.
This is happening with tmt version 1.38.0
I believe the following fragment helps to understand the order of the events:
So, while copying the files for
llvm-tests
, the files from the other tests end up being copied too, while something else is still playing with output fromclang-tests
.The workdir for this execution is available at: https://artifacts.dev.testing-farm.io/26f8be4d-e57f-46af-8098-9900b7bb8d6d/work-snapshot-gatingx591m28o/
The text was updated successfully, but these errors were encountered: