Give sample weight for each row in training data
I see that make_loss_fn has a weights_feature_name argument here: https://github.com/tensorflow/ranking/blob/master/tensorflow_ranking/python/losses.py#L50
But I’m not sure what a training data row should look like when my data is in libsvm format. This is what a row currently looks like in my dataset: 0 qid:236145 1:3.4222834 2:7.563366 3:-0.48238873 4:1.
Feature 4 is the feature that holds the sample weight (i.e. query frequency). I have created a ranking head like this:
ranking_head = tfr.head.create_ranking_head(
    loss_fn=tfr.losses.make_loss_fn(_LOSS, weights_feature_name='4'),
    eval_metric_fns=eval_metric_fns(),
    train_op_fn=_train_op_fn)
But I get the following error:
File "<ipython-input-32-0f108a655421>", line 1, in <module>
ranker.train(input_fn=lambda: input_data_fn(_TRAIN_DATA_PATH), steps=25000)
File "/Users/lib/python2.7/site-packages/tensorflow/python/estimator/estimator.py", line 356, in train
loss = self._train_model(input_fn, hooks, saving_listeners)
File "/Users/lib/python2.7/site-packages/tensorflow/python/estimator/estimator.py", line 1181, in _train_model
return self._train_model_default(input_fn, hooks, saving_listeners)
File "/Users/lib/python2.7/site-packages/tensorflow/python/estimator/estimator.py", line 1211, in _train_model_default
features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
File "/Users/lib/python2.7/site-packages/tensorflow/python/estimator/estimator.py", line 1169, in _call_model_fn
model_fn_results = self._model_fn(features=features, **kwargs)
File "/Users/lib/python2.7/site-packages/tensorflow_ranking/python/model.py", line 285, in _model_fn
features=features, mode=mode, logits=logits, labels=labels)
File "/Users/lib/python2.7/site-packages/tensorflow_ranking/python/head.py", line 196, in create_estimator_spec
features=features, mode=mode, logits=logits, labels=labels)
File "/Users/lib/python2.7/site-packages/tensorflow_ranking/python/head.py", line 146, in create_loss
training_loss = self._loss_fn(labels, logits, features)
File "/Users/lib/python2.7/site-packages/tensorflow_ranking/python/losses.py", line 156, in _loss_fn
loss_ops.append(loss_fn(**kwargs))
File "/Users/lib/python2.7/site-packages/tensorflow_ranking/python/losses.py", line 679, in _pairwise_logistic_loss
_loss, labels, logits, weights, lambda_weight, reduction=reduction)
File "/Users/lib/python2.7/site-packages/tensorflow_ranking/python/losses.py", line 586, in _pairwise_loss
labels, logits, weights)
File "/Users/lib/python2.7/site-packages/tensorflow_ranking/python/losses.py", line 481, in _sort_and_normalize
weights = array_ops.ones_like(labels) * weights
File "/Users/lib/python2.7/site-packages/tensorflow/python/ops/math_ops.py", line 862, in binary_op_wrapper
return func(x, y, name=name)
File "/Users/lib/python2.7/site-packages/tensorflow/python/ops/math_ops.py", line 1129, in _mul_dispatch
return gen_math_ops.mul(x, y, name=name)
File "/Users/lib/python2.7/site-packages/tensorflow/python/ops/gen_math_ops.py", line 5042, in mul
"Mul", x=x, y=y, name=name)
File "/Users/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/Users/lib/python2.7/site-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
return func(*args, **kwargs)
File "/Users/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 3272, in create_op
op_def=op_def)
File "/Users/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1768, in __init__
self._traceback = tf_stack.extract_stack()
InvalidArgumentError (see above for traceback): Incompatible shapes: [64,3] vs. [64,3,1]
[[{{node head/pairwise_logistic_loss/mul}} = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](head/pairwise_logistic_loss/ones_like, IteratorGetNext:3)]]
[64,3] in the error message corresponds to [batch_size, list_size].
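The shape mismatch in the traceback can be reproduced outside TensorFlow. The following is a minimal sketch that uses NumPy as a stand-in (an assumption for illustration; the real failure happens inside the TensorFlow `Mul` op), relying on the fact that TensorFlow follows the same broadcasting rules. Shapes are aligned from the trailing dimension, so [64,3] is compared against [64,3,1] as [1,64,3] and the middle dimensions 64 and 3 do not match.

```python
import numpy as np

batch_size, list_size = 64, 3

# Labels as the loss sees them: [batch_size, list_size].
labels = np.zeros((batch_size, list_size), dtype=np.float32)

# The weight feature as it arrives from the input pipeline in the issue:
# [batch_size, list_size, 1], i.e. with a trailing per-feature axis.
weights = np.ones((batch_size, list_size, 1), dtype=np.float32)

try:
    # Mirrors `array_ops.ones_like(labels) * weights` from the traceback.
    np.ones_like(labels) * weights
    broadcast_ok = True
except ValueError as err:
    broadcast_ok = False
    print("shapes do not broadcast:", err)
```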
What is the best way to give a query_id a specific weight in the training data?
Issue Analytics
- State:
- Created 4 years ago
- Comments:5 (3 by maintainers)

@PoorvaRane that should work for the losses that accept per-example weights (most of them do).
Thanks for the help, @eggie5. This was a known issue that was fixed in the latest release; the relevant code is here: https://github.com/tensorflow/ranking/blob/master/tensorflow_ranking/python/losses.py#L121.
You may want to upgrade your tf-ranking library or explicitly call tf.squeeze(…, axis=2) in your transform_fn.
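The squeeze workaround can be sketched as follows. This is an illustration under stated assumptions, not the library's own fix: it uses NumPy arrays in place of tensors, and `squeeze_weight_feature` is a hypothetical helper name. In a real transform_fn the same step would be a `tf.squeeze` call with `axis=2` applied to the weight feature before it reaches the loss.

```python
import numpy as np


def squeeze_weight_feature(features, weight_key="4"):
    """Hypothetical helper: drop the trailing size-1 axis of the weight feature.

    Returns a shallow copy so the caller's dict is left untouched.
    """
    fixed = dict(features)
    # [batch_size, list_size, 1] -> [batch_size, list_size]
    fixed[weight_key] = np.squeeze(fixed[weight_key], axis=2)
    return fixed


# Stand-in batch with the shapes from the issue: 64 queries, 3 documents each.
features = {
    "1": np.random.rand(64, 3, 1).astype(np.float32),
    "4": np.ones((64, 3, 1), dtype=np.float32),  # per-example weights
}
labels = np.zeros((64, 3), dtype=np.float32)

fixed = squeeze_weight_feature(features)

# The multiplication that failed in the traceback now has matching shapes.
weighted = np.ones_like(labels) * fixed["4"]
```

Passing `axis=2` explicitly, rather than squeezing every size-1 axis, keeps the batch and list dimensions intact even when batch_size or list_size happens to be 1.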