-
Notifications
You must be signed in to change notification settings - Fork 23
Conversation
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
Welcome to Codecov 🎉Once merged to your default branch, Codecov will compare your coverage reports and display the results in this comment. Thanks for integrating Codecov - We've got you covered ☂️ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great job @viktorpm and @alessandrofelder!
The validation functions appear to do what they claim and are very understandable.
I've sprinkled several pedantic comments throughout, but feel free to take them or leave them, as many of them are a matter of style.
Now, I am coming to my only substantive comment, which concerns the capacity of this design to accommodate multiple future validation functions.
In the validate_atlas()
function, the current two validations are performed sequentially, i.e. first assert validate_atlas_files()
and then assert validate_mesh_matches_image_extents()
.
In this case, it makes sense to have them run sequentially, since there is no point in looking at the image extents if the files are not there to begin with.
However, in future, we may add validators that are not dependent on each other, for example:
- A: validate_mesh_normals
- B: validate_annotation_image_orientation
In this case, if you assert A, then B, the first assertion will fail if there is a problem with the mesh normals, and the second assertion will never run. So you will not learn if there is also a problem with image orientation. At least that's my prediction of what will happen in that case, correct me if I have misread the code.
It would be nice to have a report of all the independent problems with an atlas, not just the first problem encountered.
I wonder if you guys have thought about such use cases - i.e. how to run multiple independent assertions. One solution could be catching each "AssertionError" and logging it into a list of errors instead of erroring out outright. But I'm sure there are also other ways. Since this problem concerns future validators, and not the currently existing ones, feel free to open an issue instead of fixing it here (if you prefer).
P.S: This PR contains updates to Python tooling, like the dev dependencies and the gh actions. Normally, I'd say it's best practice to implement those in an independent PR, but I understand the practical considerations in this case, so I won't complain about it.
bg_atlasgen/validate_atlases.py
Outdated
def validate_atlas_files(atlas_path: Path): | ||
"""Checks if basic files exist in the atlas folder""" | ||
|
||
assert atlas_path.exists(), f"Atlas path {atlas_path} not found" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The exists()
method will return True whether the path in question is a directory or a file.
Since we explicitly expect atlas_path
to be a directory, would it make sense to be stricter and check for atlas_path.is_dir()
?
bg_atlasgen/validate_atlases.py
Outdated
for expected_file_name in expected_files: | ||
expected_path = Path(atlas_path / expected_file_name) | ||
assert ( | ||
expected_path.exists() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly, here one could be stricter and test expected_path.is_file()
(except for "meshes", which I presume is a directory).
I don't foresee a real case scenario in which a file would be created instead of a folder, or vice versa, but I also don't see a downside to being stricter.
Up to you to decide.
bg_atlasgen/validate_atlases.py
Outdated
|
||
|
||
def _assert_close(mesh_coord, annotation_coord, pixel_size): | ||
"""Helper function to check if the mesh and the annotation coordinate are closer to each other than 10 times the pixel size""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Docstring exceeds usual line length limits, use multi-line strings?
For example:
"""Helper function to check if the mesh and the annotation coordinate are
closer to each other than 10 times the pixel size"""
bg_atlasgen/validate_atlases.py
Outdated
return True | ||
|
||
|
||
def _assert_close(mesh_coord, annotation_coord, pixel_size): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
type annotations are missing here. I presume mesh_coord
and annotation_coord
are numpy arrays?
The could be marked as mesh_coord: np.ndarray
Would also be helpful to have a full docstring with expected parameters and returns here (since there are several arguments).
In the description of each argument, the expected array shape could also be indicated.
bg_atlasgen/validate_atlases.py
Outdated
"""Helper function to check if the mesh and the annotation coordinate are closer to each other than 10 times the pixel size""" | ||
assert ( | ||
abs(mesh_coord - annotation_coord) <= 10 * pixel_size | ||
), f"Mesh coordinate {mesh_coord} and annotation coordinate {annotation_coord} differ by more than 10 times pixel size {pixel_size}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make this into a multi-line string to avoid violating line length limits
bg_atlasgen/validate_atlases.py
Outdated
z_min_scaled, z_max_scaled = z_min * resolution[0], z_max * resolution[0] | ||
y_min_scaled, y_max_scaled = y_min * resolution[1], y_max * resolution[1] | ||
x_min_scaled, x_max_scaled = x_min * resolution[2], x_max * resolution[2] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this block operates on the annotation image, I would move it further above, just after x_min, x_max = ...
and before mesh_points = ...
.
This is to visually split operations on the image vs operations on the mesh vs their comparisons. I would also add a one-line comment above each of these 3 blocks, sth like # get min and max image extents along each axis
etc., to improve readability at a glance, but this a more subjective "aesthetic" point, so feel absolutely free to ignore.
If you decide to follow my advice on refactoring this function, the above points may be rendered mute anyway.
bg_atlasgen/validate_atlases.py
Outdated
def validate_atlas(atlas_name, version): | ||
"""Validates the latest version of a given atlas""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the atlas get updated to latest during validation, why do you need to pass the version as an argument? Even if you pass an older version, it will be overriden by the update, and the function will actually validate the newer version. Or am I wrong?
bg_atlasgen/validate_atlases.py
Outdated
updated = get_atlases_lastversions()[atlas_name]["updated"] | ||
if not updated: | ||
update_atlas(atlas_name) | ||
atlas_path = Path(get_brainglobe_dir()) / f"{atlas_name}_v{version}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Won't the version change after the update?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you would have to get the version after the update
print("Summary") | ||
print("### Valid atlases ###") | ||
print(valid_atlases) | ||
print("### Invalid atlases ###") | ||
print(invalid_atlases) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be useful to also save the output in a .txt (or .md) file in addition to printing it?
bg_atlasgen/validate_atlases.py
Outdated
), f"Atlas file {atlas_path} validation failed" | ||
assert validate_mesh_matches_image_extents( | ||
atlas | ||
), "Atlas object validation failed" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Atlas object validation" is too abstract for this message. You didn't check the entire object, just the extents of the annotation image and the mesh, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless this message anticipates the addition of more checks in this group
…erance argument to _assert_close function
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've looked at the changes you've already implemented, and they are fine, with the exception of the failing test that I've highlighted.
Let me know if you need help with tackling some of the other trickier suggestions.
assert ( | ||
expected_path.is_file() | ||
), f"Expected file not found at {expected_path}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @viktorpm, the test failure is caused by this assertion. The problem is that you are checking if all elements in expected_files
are indeed existing files (as I suggested), but "meshes" is a folder not a file. I would check the meshes separately with is_dir()
, for example:
assert atlas_path.is_dir(), f"Atlas path {atlas_path} not found"
expected_files = [
"annotation.tiff",
"reference.tiff",
"metadata.json",
"structures.json",
]
for expected_file_name in expected_files:
expected_path = Path(atlas_path / expected_file_name)
assert (
expected_path.is_file()
), f"Expected file not found at {expected_path}"
meshes_path = atlas_path / "meshes"
assert meshes_path.is_dir(), f"Meshes path {meshes_path} not found"
return True
It's important that you check the meshes folder after you check individual files, otherwise the test_invalid_atlas_path()
will fail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! It's fixed now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the solution you guys came up with for running multiple validations and collecting all the results! Nice work!
I left only 4 tiny comments, 3 of which have to do with appeasing Solar Lint, so that the CI checks won't fail.
Approved 🎉
bg_atlasgen/validate_atlases.py
Outdated
updated = get_atlases_lastversions()[atlas_name]["updated"] | ||
if not updated: | ||
update_atlas(atlas_name) | ||
Path(get_brainglobe_dir()) / f"{atlas_name}_v{version}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this line can be removed, as far as I can see it's not being assigned to avariable to used in any other way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In fact, the removal of this line is probably necessary for the Sonar Lint tests to pass.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
bg_atlasgen/validate_atlases.py
Outdated
# validate_atlas(atlas_name, version) | ||
(atlas_name, version), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In total I count 5 validation functions being passed, what's this 6th set of parameters for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was left there by mistake. Removed now
bg_atlasgen/validate_atlases.py
Outdated
def open_for_visual_check(): | ||
pass | ||
|
||
|
||
def validate_checksum(): | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sonar lint complains because it requires you to add a comment to such "empty" functions, mentioning what you intend these for (like you have done for the following check_additional_references()
function).
If you add a comment inside each, the checks should pass.
Thank you @niksirbi for reviewing it! |
Description
What is this PR
Why is this PR needed?
We want to test if all the atlases fit the brainglobe framework
What does this PR do?
References
gets started with brainglobe/brainglobe-atlasapi#217
also fixes #94
fixes #100
fixes #97
How has this PR been tested?
It was run on the HPC. We got the expected output: two lists with passed and failed atlases
Is this a breaking change?
No
Does this PR require an update to the documentation?
Not for now.
Checklist: