Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch to sqlite db breaks some tests #2435

Closed
cgwalters opened this issue Jan 8, 2021 · 6 comments · Fixed by #2436
Closed

Switch to sqlite db breaks some tests #2435

cgwalters opened this issue Jan 8, 2021 · 6 comments · Fixed by #2436
Assignees

Comments

@cgwalters
Copy link
Member

e.g.

FAIL: reset
   reset: #       "id": "fedora-coreos-6ffa09d838396a82cad23de9c99bba8bada2110dee1e3d15633891fe9ed64dff.0",
   reset: #       "version": "33.20210108.dev.0",
   reset: #       "requested-packages": [],
   reset: #       "requested-base-removals": [],
   reset: #       "initramfs-etc": [],
   reset: #       "serial": 0,
   reset: #       "timestamp": 1610137000,
   reset: #       "base-local-replacements": [],
   reset: #       "packages": [],
   reset: #       "pending-base-timestamp": 1610137879,
   reset: #       "booted": false,
   reset: #       "pending-base-checksum": "cd6dbd364409fb2fd16b12c1311391eb8c5b0e14744f99a2d79e3c4dcd577b31"
   reset: #     }
   reset: #   ],
   reset: #   "transaction": null,
   reset: #   "cached-update": null
   reset: # }
   reset: + echo '.deployments[0]["base-local-replacements"]|length == 1 failed to match status.json'
   reset: .deployments[0]["base-local-replacements"]|length == 1 failed to match status.json
   reset: + exit 1

Haven't debugged why this is yet.

@cgwalters cgwalters self-assigned this Jan 8, 2021
@cgwalters cgwalters changed the title Some tests currently failing on master Switch to sqlite db breaks some tests Jan 8, 2021
@cgwalters
Copy link
Member Author

jlebon added a commit to jlebon/fedora-coreos-config that referenced this issue Jan 8, 2021
This reverts commit 51cd493.

It seems like it breaks some rpm-ostree pkglayering tests, which means
it might be subtly changing layering semantics. Let's revert for now
while we investigate to make sure it doesn't somehow get into a release.

See: coreos/rpm-ostree#2435
@cgwalters
Copy link
Member Author

OK I've narrowed this down a bit, here's some debugging I have:

diff --git a/src/daemon/rpmostree-sysroot-upgrader.cxx b/src/daemon/rpmostree-sysroot-upgrader.cxx
index 337cca4c..7eedd70f 100644
--- a/src/daemon/rpmostree-sysroot-upgrader.cxx
+++ b/src/daemon/rpmostree-sysroot-upgrader.cxx
@@ -783,6 +783,9 @@ finalize_replacement_overrides (RpmOstreeSysrootUpgrader *self,
       if (!rpmostree_sack_get_by_pkgname (self->rsack->sack, pkgname, &pkg, error))
         return FALSE;
 
+      if (!pkg)
+        g_printerr ("missing %s\n", pkgname);
+
       /* make inactive if it's missing or if that exact nevra is already present */
       if (pkg && !rpmostree_sack_has_subject (self->rsack->sack, nevra))
         {
diff --git a/src/libpriv/rpmostree-rpm-util.cxx b/src/libpriv/rpmostree-rpm-util.cxx
index a37d86e0..12c95188 100644
--- a/src/libpriv/rpmostree-rpm-util.cxx
+++ b/src/libpriv/rpmostree-rpm-util.cxx
@@ -891,6 +891,19 @@ rpmostree_get_base_refsack_for_root (int                dfd,
   if (!get_sack_for_root (tmpdir.fd, ".", &sack, error))
     return FALSE;
 
+  auto ostreepkg = "ostree";
+  hy_autoquery HyQuery query = hy_query_create (sack);
+  hy_query_filter (query, HY_PKG_NAME, HY_EQ, ostreepkg);
+  g_autoptr(GPtrArray) pkgs = hy_query_run (query);
+  if (pkgs->len == 0)
+    g_printerr ("WARNING: sack does not contain '%s'\n", ostreepkg);
+  
+  auto count = dnf_sack_count (sack);
+  if (count == 0)
+    g_printerr ("WARNING: Empty sack\n");
+  else
+    g_printerr ("sack: %u\n", count);
+
   *out_sack = rpmostree_refsack_new (sack, &tmpdir);
   return TRUE;
 }
diff --git a/tests/vmcheck/test-reset.sh b/tests/vmcheck/test-reset.sh
index 78c03ce2..c104c076 100755
--- a/tests/vmcheck/test-reset.sh
+++ b/tests/vmcheck/test-reset.sh
@@ -31,6 +31,7 @@ vm_cmd ostree refs $(vm_get_pending_csum) --create vmcheck_tmp/with_foo
 vm_rpmostree cleanup -p
 vm_ostree_commit_layered_as_base vmcheck_tmp/with_foo vmcheck
 vm_rpmostree upgrade
+echo "ok base prep"
 
 # now do some layering, overrides, and initramfs
 vm_build_rpm foo version 2.0
@@ -40,6 +41,7 @@ vm_rpmostree override replace --install bar \
   --install /var/tmp/vmcheck/yumrepo/packages/x86_64/baz-1.0-1.x86_64.rpm \
   /var/tmp/vmcheck/yumrepo/packages/x86_64/foo-2.0-1.x86_64.rpm
 vm_rpmostree initramfs --enable
+echo "ok prepared setup"
 
 vm_reboot
 vm_assert_status_jq \

The problem seems to be the /usr/lib/sysimage/rpm-ostree-base-db copy is wrong:

[root@qemu0 ~]# rpm -q foo
foo-1.0-1.x86_64
[root@qemu0 ~]# rpm --dbpath=/usr/lib/sysimage/rpm-ostree-base-db/ -q foo
package foo is not installed
[root@qemu0 ~]# 

Yet this test case used vm_ostree_commit_layered_as_base which should have updated this.

Ah and here's another interesting fact, it is visible in db list for the base commit:

[root@qemu0 ~]# rpm-ostree db list 4f524603c2489fda4ebf5d0b7cd5cfbbf2cb7a8b0fc6f403ce2addd0277446dc foo
ostree commit: 4f524603c2489fda4ebf5d0b7cd5cfbbf2cb7a8b0fc6f403ce2addd0277446dc
 foo-1.0-1.x86_64

@cgwalters
Copy link
Member Author

Worth noting that coreos/fedora-coreos-config#677 would have (potentially) caused this to fail in the FCOS PR.

@cgwalters
Copy link
Member Author

Narrowing it down more via

diff --git a/tests/vmcheck/test-reset.sh b/tests/vmcheck/test-reset.sh
index 78c03ce2..ccd7d3b8 100755
--- a/tests/vmcheck/test-reset.sh
+++ b/tests/vmcheck/test-reset.sh
@@ -31,6 +31,8 @@ vm_cmd ostree refs $(vm_get_pending_csum) --create vmcheck_tmp/with_foo
 vm_rpmostree cleanup -p
 vm_ostree_commit_layered_as_base vmcheck_tmp/with_foo vmcheck
 vm_rpmostree upgrade
+exit 1
+echo "ok base prep"

Then e.g. $ env VMCHECK_DEBUG=1 TESTS=reset ./tests/vmcheck.sh and use ssh to get a shell. Interestingly, the new deployment's db and basedb have exactly the same size, but different sha256 checksums. I think something is going wrong in that rsync...

@cgwalters
Copy link
Member Author

cgwalters commented Jan 8, 2021

Oh yeah...

Rsync finds files that need to be transferred using a "quick check" algorithm (by default) that looks for files that have changed in size or in last-modified time.

Testing a patch to use -I now.

(I mean ordinarily what rsync is doing is sane, it's really ostree's mtime canonicalization that's defeating it)

cgwalters added a commit to cgwalters/rpm-ostree that referenced this issue Jan 8, 2021
Apparently small rpmdb changes can cause the size to stay the
same due to preallocation, and rsync defaults to skipping files
based on (name, size, mtime).  It's really ostree's mtime canonicalization
that's unfortunate here.

Anyways, we obviously don't care about performance here so use
`-I` to disable that rsync check.

(Also remove the `mkdir -p` since it's not necessary since a long time)

Closes: coreos#2435
@cgwalters
Copy link
Member Author

PR in #2436

openshift-merge-robot pushed a commit that referenced this issue Jan 9, 2021
Apparently small rpmdb changes can cause the size to stay the
same due to preallocation, and rsync defaults to skipping files
based on (name, size, mtime).  It's really ostree's mtime canonicalization
that's unfortunate here.

Anyways, we obviously don't care about performance here so use
`-I` to disable that rsync check.

(Also remove the `mkdir -p` since it's not necessary since a long time)

Closes: #2435
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant