Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding class BlobLoader to separate loading from JsonIndexDataset logic #1463

Open
wants to merge 37 commits into
base: main
Choose a base branch
from

Conversation

salaxieb
Copy link

added class BlobLoader which is responsible for loading blob for given FrameData
moved loading function to separate file load_blob.py

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Feb 28, 2023
R=torch.tensor(entry_viewpoint.R, dtype=torch.float)[None],
T=torch.tensor(entry_viewpoint.T, dtype=torch.float)[None],
)
return self.blob_loader.load(frame_data, entry, point_cloud)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’d suggest to pass self.seq_annots[entry.sequence_name] here, as we can later add other sequence-level blobs, not just point clouds.

Copy link
Contributor

@shapovalov shapovalov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for submitting it so quickly! Overall looks good.
@bottler – do you want to take a look at the overall idea? For the context, this change will be tested together with SqlDataset refactoring in pixar_replay, hence Ildar is developing here and is going to import to fbcode once everything works.


path_manager: Any = None

def __init__(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make it a dataclass or a Configurable to avoid boilerplate.

@@ -0,0 +1,545 @@
import functools
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let’s call the module blob_loader to match the class name.

self.box_crop_mask_thr: float = box_crop_mask_thr
self.box_crop_context: float = box_crop_context

def load(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this method returns FrameData, it will be unclear if it modifies the object or copies it. Do we need to return it? At least please make sure to document that frame_data is modified in-place.

sequence_file = os.path.join(dataset_root, category, "sequence_annotations.jgz")
self.image_size = 256

expand_args_fields(JsonIndexDataset)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be redundant with modern pytorch3d.

@salaxieb
Copy link
Author

salaxieb commented Mar 9, 2023

I'm not sure about last commit
Fact that it's nice to check all the frames, but same time it makes test longer
I can choose a sweet spot of checking every 200 frames for example. Give me your thoughts..

Copy link
Contributor

@shapovalov shapovalov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding the verbose test!
I think it should be enough to test on 1 frame only – it is unlikely we can uncover any problem by loading multiple images of the same type.

Also, in a spirit of a unit test, you can make it more lightweight. Currently, the setup includes creating a JsonIndexDataset, which is a lot of stuff to do. If creating a dataset crashes, we don’t even get to testing a loader.
As an option, you can store all relative paths for one frame and test loading functions on them, and for other tests, create FrameAnnotation / FrameData objects with relevant fields set.

def test_load_mask(self):
path = os.path.join(self.dataset_root, self.entry.mask.path)
def _load_mask_test(self, entry):
path = os.path.join(self.dataset_root, entry.mask.path)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use methods like self.assertEqual as they give better diagnostics.

Copy link
Contributor

@bottler bottler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good change: splitting the logic makes the dataset object much easier to understand, even before we start looking at new implementations.

@@ -65,7 +63,7 @@ class JsonIndexDataset(DatasetBase, ReplaceableBase):
A dataset with annotations in json files like the Common Objects in 3D
(CO3D) dataset.

Args:
Metadata-related args::
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This distinction between the two types of members doesn't matter to the user of this class; it's an implementation detail. We could leave as a big list in this comment, or we could split into groups like "Options affecting WHICH frames are loaded"/"Options affecting what data is loaded for each frame"/"Misc options".

pytorch3d/implicitron/dataset/blob_loader.py Outdated Show resolved Hide resolved
box_crop_context: float = 0.3
path_manager: Any = None

def load(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could call this function load_ to reinforce the inplaceness.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it hint to modifying BlobLoader state rather than FrameData state though?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Not sure.

facebook-github-bot pushed a commit that referenced this pull request Apr 4, 2023
…x to FrameData

Summary:
extracted blob loader
added documentation for blob_loader
did some refactoring on fields
for detailed steps and discussions see:
#1463
fairinternal/pixar_replay#160

Reviewed By: bottler

Differential Revision: D44061728

fbshipit-source-id: eefb21e9679003045d73729f96e6a93a1d4d2d51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants