For example, to address the problem described in this issue, would it be feasible to change the _set_label_by_threshold(self) function so that negative labels are set to -1 instead of 0?
def _set_label_by_threshold(self):
    """Generate 0/1 labels according to value of features.

    According to ``config['threshold']``, those rows with value lower than threshold will
    be given negative label, while the other will be given positive label.

    See :doc:`../user_guide/data/data_args` for detail arg setting.

    Note:
        Key of ``config['threshold']`` is a field name.
        This field will be dropped after label generation.
    """
    threshold = self.config["threshold"]
    if threshold is None:
        return

    self.logger.debug(f"Set label by {threshold}.")

    if len(threshold) != 1:
        raise ValueError("Threshold length should be 1.")

    self.set_field_property(
        self.label_field, FeatureType.FLOAT, FeatureSource.INTERACTION, 1
    )
    for field, value in threshold.items():
        if field in self.inter_feat:
            self.inter_feat[self.label_field] = (
                self.inter_feat[field] >= value
            ).astype(int)
        else:
            raise ValueError(f"Field [{field}] not in inter_feat.")
        if field != self.label_field:
            self._del_col(self.inter_feat, field)
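As a rough illustration of the change being asked about, the sketch below applies the same thresholding rule to a plain pandas DataFrame but writes -1 for rows below the threshold. It is not RecBole code: the standalone function `label_by_threshold`, its `negative_label` parameter, and the toy column names are assumptions made for this example, and whether downstream RecBole components accept -1 labels is not verified here.

```python
import numpy as np
import pandas as pd


def label_by_threshold(inter_feat, field, value, label_field="label", negative_label=-1):
    """Hypothetical standalone variant of the thresholding step.

    Rows with inter_feat[field] >= value get label 1; all other rows get
    `negative_label` (-1 by default) instead of 0.
    """
    if field not in inter_feat:
        raise ValueError(f"Field [{field}] not in inter_feat.")
    out = inter_feat.copy()
    out[label_field] = np.where(out[field] >= value, 1, negative_label)
    # Mirror the original behaviour: drop the source field after labelling.
    if field != label_field:
        out = out.drop(columns=field)
    return out


# Toy interactions with ratings on a 1-5 scale.
ratings = pd.DataFrame(
    {"user_id": [1, 1, 2], "item_id": [10, 20, 10], "rating": [5, 2, 3]}
)
labeled = label_by_threshold(ratings, "rating", 3)
```

With a threshold of 3, the rating of 2 becomes an explicit -1, so it stays distinguishable from a pair that was never rated.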
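Describe the bug
Take the ML-1M dataset as an example, where ratings range from 1 to 5.
The generated sparse interaction matrix stores only the user-item pairs whose rating is at or above the threshold. User-item pairs rated below the threshold are set to 0, together with the unobserved user-item pairs.
This approach does not make effective use of explicit-feedback negative samples: explicit negatives and unobserved samples are both treated as negatives.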
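The effect can be reproduced outside RecBole with a small SciPy example. This is only a sketch on made-up toy data, not the library's actual matrix-building code: it shows that with 0/1 labels an explicit low rating ends up with the same value as an unobserved pair, while -1/+1 labels keep the two apart.

```python
import numpy as np
from scipy.sparse import coo_matrix

# Toy data: 2 users, 3 items, 3 observed ratings (user 1 never rated item 0).
users = np.array([0, 0, 1])
items = np.array([0, 1, 2])
ratings = np.array([5, 2, 4])
threshold = 3
shape = (2, 3)

# 0/1 labelling: the low rating (user 0, item 1) becomes 0.
labels_01 = (ratings >= threshold).astype(int)
m01 = coo_matrix((labels_01, (users, items)), shape=shape).toarray()

# -1/+1 labelling: the low rating is kept as an explicit -1.
labels_pm = np.where(ratings >= threshold, 1, -1)
mpm = coo_matrix((labels_pm, (users, items)), shape=shape).toarray()

# m01[0, 1] (rated 2) and m01[1, 0] (unobserved) are both 0,
# whereas mpm[0, 1] is -1 and mpm[1, 0] is still 0.
```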
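Problem and request
How to reproduce
Steps to reproduce the bug:
In the quick start, set a breakpoint at the following line and inspect the resulting data.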
train_data, valid_data, test_data = data_preparation(config, dataset)
Environment: