Lifted structure loss #340
Conversation
* Add initial version of pn_loss
* Add basic tests for pn_loss

* Update squared L2.
* Version bump to 13.3
* Remove \r chars from notebooks. Was breaking the jupyterlabs editor.

Reviewed live

* Samplers refactoring
* efficientnet better args

* Update squared L2.
* Version bump to 13.3
* Remove \r chars from notebooks. Was breaking the jupyterlabs editor.
* Add lifted struct and multisim loss.
* Clean up doc strings in loss functions.
* Add more specific FloatTensor and IntTensor types to samplers.
* Remove old losses.py module.
* Update multisim loss to include the similarity-P filtering and work with distance instead of similarity.
* Change InnerProductDistance to InnerProductSimilarity to make it clear that this does not function like the other distances.
* Update Circle Loss to work with distance instead of similarity.
* Create custom LogSumExp to support masking and adding 1 to ensure positive loss
* Fix bug in valid_anchors where transpose won't work because the shape is [m]. Revert back to reshape.
* Fix mypy errors
* Remove lifted_struct_loss from losses init
* Add period in comment
Fix tensorflow to 2.5 to avoid breaking tests in 2.6
Thanks for the PR! Looking forward to adding this to the tfsim losses. I left some comments on the loss function.
@owenvallis implemented the changes requested, although I'm a little lost about the following snippet:

```python
# Get negative distances
negative_dists, _ = negative_distances(
    negative_mining_strategy, pairwise_distances, negative_mask
)
```

since neither of the results is used anywhere else.
Good point! Looks like we're missing the column-wise concat in the logsumexp. You had it in the previous commit, so pulling that back in, I think it should be:
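For reference, here is a rough standalone sketch of what a masked logsumexp over a column-wise concat of positive and negative distance blocks could look like. This is an illustration of the idea described in the commit messages ("custom LogSumExp to support masking and adding 1 to ensure positive loss"), not the actual tfsim implementation; `masked_logsumexp` and the toy tensors are made up for this example.

```python
import tensorflow as tf

def masked_logsumexp(data, mask):
    """Stable logsumexp over axis=1 that ignores masked-out entries and
    adds 1 inside the log so the result stays positive (as the commit
    messages in this PR describe)."""
    masked = tf.where(mask, data, tf.fill(tf.shape(data), -1e9))
    row_max = tf.reduce_max(masked, axis=1, keepdims=True)
    summed = tf.reduce_sum(
        tf.cast(mask, data.dtype) * tf.exp(masked - row_max),
        axis=1, keepdims=True,
    )
    # log(1 + sum_i exp(x_i)), computed around row_max for stability.
    return tf.squeeze(row_max + tf.math.log(tf.exp(-row_max) + summed), axis=1)

# Toy positive / negative distance blocks for two anchor rows.
pos_dists = tf.constant([[0.5], [0.7]])
neg_dists = tf.constant([[1.2, 2.0], [0.9, 1.5]])
pos_mask = tf.constant([[True], [True]])
neg_mask = tf.constant([[True, False], [True, True]])

# The column-wise concat of both blocks feeds a single logsumexp per row.
lse = masked_logsumexp(
    tf.concat([pos_dists, neg_dists], axis=1),
    tf.concat([pos_mask, neg_mask], axis=1),
)
```

The mask keeps padded or invalid pairs out of the sum, so rows with different numbers of valid negatives can share one batched op.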
@owenvallis implemented the changes discussed, and also fixed for
Thanks! Looks like the CLA is working for the current commits, but you will need to update the author on the older commit. Let me know once the tests are ready and I'll run the checks and merge the PR.
```python
    positive_mining_strategy, pairwise_distances, positive_mask
)

# Get negative distances
```
Looking at this again, I think we can remove lines 52-54 and use `pairwise_distances` in the `tf.gather` on line 57.

My concern is that the negative mining strategy is meant to return a single negative per row, but here we really want to keep all the negatives.

wdyt?
Yeah, it also makes more sense to do this with `tf.gather` so we keep consistency, something like `reordered_pairwise_distances = tf.gather(pairwise_distances, positive_indices, axis=1)`. As far as I understand from reading the paper, I do think we want to keep all the negatives.

Do you think it makes sense to remove the negative mining strategy altogether?
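As a minimal standalone sketch of that reordering (the matrix values and `positive_indices` here are made up for illustration; they are not the actual tfsim code):

```python
import tensorflow as tf

# Toy 3x3 pairwise distance matrix.
pairwise_distances = tf.constant(
    [[0.0, 1.0, 2.0],
     [1.0, 0.0, 3.0],
     [2.0, 3.0, 0.0]]
)

# Hypothetical column ordering, e.g. produced by a positive-mining step.
positive_indices = tf.constant([2, 0, 1])

# axis=1 reorders the columns while keeping every entry of every row,
# so all negatives survive -- unlike a mining strategy that reduces
# each row to a single negative.
reordered_pairwise_distances = tf.gather(
    pairwise_distances, positive_indices, axis=1
)
```

Because `tf.gather` only permutes, no distance information is dropped, which matches the goal of keeping all negatives per row.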
Also, I keep getting this error when writing tests:

```
FAILED tests/losses/test_lifted_structure_loss.py::TestLiftedStructLoss::test_lifted_struct_loss_test_mode_eager - TypeError: lifted_struct_loss() got multiple values for argument 'distance'
```

for a test like this (just an example one, not the correct one):

```python
import tensorflow as tf
from absl.testing import parameterized
from tensorflow.python.framework import combinations
from tensorflow.keras.losses import Reduction

from tensorflow_similarity import losses

from . import utils


@combinations.generate(combinations.combine(mode=["graph", "eager"]))
class TestLiftedStructLoss(tf.test.TestCase, parameterized.TestCase):
    def test_config(self):
        lifted_obj = losses.LiftedStructLoss(
            reduction=Reduction.SUM,
            name="lifted_loss",
            distance="cosine",
        )
        self.assertEqual(lifted_obj.distance.name, "cosine")
        self.assertEqual(lifted_obj.name, "lifted_loss")
        self.assertEqual(lifted_obj.reduction, Reduction.SUM)

    def test_lifted_struct_loss(self):
        """Tests the LiftedStructLoss with different parameters."""
        labels = tf.constant([1, 2, 2, 1], dtype=tf.int32)
        embeddings = tf.random.normal([4, 10])
        expected_loss = 1
        lifted_obj = losses.LiftedStructLoss(reduction=Reduction.SUM)
        loss = lifted_obj(labels, embeddings)
        self.assertAlmostEqual(loss, expected_loss)
```
Interesting, I'll have to look into that more. Maybe the kwargs also contain the distance and it's getting passed twice in the init call, or something? I'll try to take a look at it tomorrow if I get a chance.
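A minimal reproduction of that failure mode, assuming the wrapper forwards `**kwargs` into a call that also passes `distance` positionally. The function names here are illustrative stand-ins, not the actual tfsim code:

```python
def lifted_struct_loss(labels, embeddings, distance="cosine"):
    """Stand-in for the functional loss; only the signature matters here."""
    return distance


def call_wrapper(labels, embeddings, **kwargs):
    distance = kwargs.get("distance", "cosine")
    # Bug: `distance` is forwarded positionally while still sitting inside
    # **kwargs, so Python sees the argument twice.
    return lifted_struct_loss(labels, embeddings, distance, **kwargs)


err_msg = None
try:
    call_wrapper([1, 2], [[0.1], [0.2]], distance="cosine")
except TypeError as e:
    # "lifted_struct_loss() got multiple values for argument 'distance'"
    err_msg = str(e)


def call_wrapper_fixed(labels, embeddings, **kwargs):
    # Fix: pop the key out of kwargs before forwarding it positionally.
    distance = kwargs.pop("distance", "cosine")
    return lifted_struct_loss(labels, embeddings, distance, **kwargs)
```

If the loss `__init__` or `call` does something like this with its stored `distance`, popping (or never re-forwarding) the key would explain and fix the duplicate-argument error.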
Haven't thought about that, will try some things too.
Made some progress today. We have some problems in the implementation, and the distance handling was missing arguments that were coming from ops.

Closed by accident, sorry, didn't mean to lol
@owenvallis just implemented some sample tests; they're running fine now and the errors are fixed, although the loss values are really weird, so I'm not sure they're correct. I also haven't finished implementing the test cases.
Force-pushed the branch from 412e8d4 to 004c7f5.
New PR: #342
This PR refers to Issue #102 and implements the Lifted Structure Loss (LSL) described in https://arxiv.org/pdf/1511.06452.pdf.
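For context, the lifted structured loss from that paper (Song et al., CVPR 2016) has roughly the following form; this is my transcription, so it is worth double-checking against the article. Here `D_{i,j}` is the pairwise distance, `α` the margin, and `P`/`N` the sets of positive/negative pairs:

```latex
\tilde{J}_{i,j} = D_{i,j}
  + \log\!\Big(\sum_{(i,k)\in\mathcal{N}} e^{\alpha - D_{i,k}}
  + \sum_{(j,l)\in\mathcal{N}} e^{\alpha - D_{j,l}}\Big),
\qquad
J = \frac{1}{2|\mathcal{P}|}
    \sum_{(i,j)\in\mathcal{P}} \max\big(0, \tilde{J}_{i,j}\big)^{2}
```

The inner logsumexp runs over *all* negatives of both anchors of a positive pair, which is why the review discussion above favors keeping every negative rather than mining a single one per row.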