transformer_rankers.trainers.transformer_trainer.TransformerTrainer

class transformer_rankers.trainers.transformer_trainer.TransformerTrainer(model, train_loader, val_loader, test_loader, num_ns_eval, task_type, tokenizer, validate_every_epochs, num_validation_batches, num_epochs, lr, sacred_ex, validate_every_steps=-1, max_grad_norm=0.5, validation_metric='ndcg_cut_10', num_training_instances=-1)

Bases: object

Performs optimization of the neural models.

Logs the validation metric (nDCG@10 by default) every validate_every_epochs epochs, or every validate_every_steps steps when that parameter is set. Uses all visible GPUs.
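For readers unfamiliar with the default metric, the following is a minimal sketch of one common formulation of nDCG@k (linear gain, log2 rank discount). The helper names dcg_at_k and ndcg_at_k are hypothetical and are not part of transformer_rankers; the library's own evaluation code may differ in details such as tie handling.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k items of a ranked list."""
    # enumerate() is 0-based, so rank 0 gets a discount of log2(2) = 1.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(scores, labels, k=10):
    """nDCG@k for one query: rank candidates by score, compare with the ideal order."""
    # Order the relevance labels by descending model score.
    ranked = [label for _, label in
              sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)]
    # The ideal ranking places the most relevant candidates first.
    ideal_dcg = dcg_at_k(sorted(labels, reverse=True), k)
    return dcg_at_k(ranked, k) / ideal_dcg if ideal_dcg > 0 else 0.0
```

A query whose only relevant candidate is ranked first scores 1.0; the value decreases as the relevant candidate moves down the list.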

Parameters
  • model – a transformer model from the transformers library. Both SequenceClassification and ConditionalGeneration models are accepted.

  • train_loader – pytorch train DataLoader.

  • val_loader – pytorch val DataLoader.

  • test_loader – pytorch test DataLoader.

  • num_ns_eval – number of negative samples for evaluation. Used to accumulate the predictions into lists of the appropriate size.

  • task_type – str, either ‘classification’ (for SequenceClassification models) or ‘generation’ (for ConditionalGeneration models).

  • tokenizer – transformer tokenizer.

  • validate_every_epochs – int, the number of epochs between computations of the validation metric (validation_metric). Ignored if validate_every_steps is set.

  • num_validation_batches – number of validation batches used to compute the validation metric (-1 to use all of them).

  • num_epochs – int containing the number of epochs to train the model (one epoch = one pass on every instance).

  • lr – float containing the learning rate.

  • sacred_ex – sacred experiment object used to log training metrics. None if it is not to be used.

  • max_grad_norm – float, the maximum gradient norm; gradients are clipped to this value.

  • validate_every_steps – int, the number of steps (batches) between computations of the validation metric (validation_metric); -1 disables step-based validation.

  • validation_metric – which evaluation metric to use for validation error (e.g. ndcg_cut_10). See transformer_rankers/evaluation for the metrics.

  • num_training_instances – int, the number of training instances to see before stopping early (-1 to disable early stopping).
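The interaction between validate_every_steps and validate_every_epochs can be sketched as follows. This is an illustration of the precedence described in the parameter list, not the trainer's actual code: the function should_validate and the argument steps_per_epoch are hypothetical names, and the exact step at which the library triggers validation is an assumption.

```python
def should_validate(step, steps_per_epoch, validate_every_epochs, validate_every_steps=-1):
    """Return True if validation should run after `step` training batches (1-based).

    Hypothetical helper: illustrates that the step-based schedule, when set,
    replaces the epoch-based one.
    """
    if validate_every_steps > 0:
        # Step-based schedule takes precedence over the epoch-based one.
        return step % validate_every_steps == 0
    # Otherwise validate only at epoch boundaries...
    if step % steps_per_epoch != 0:
        return False
    # ...and only every `validate_every_epochs` epochs.
    epoch = step // steps_per_epoch
    return epoch % validate_every_epochs == 0
```

For example, with 50 batches per epoch and validate_every_epochs=2, this sketch validates after batches 100, 200, and so on; passing validate_every_steps=100 yields the same steps but ignores the epoch setting entirely.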

__init__(model, train_loader, val_loader, test_loader, num_ns_eval, task_type, tokenizer, validate_every_epochs, num_validation_batches, num_epochs, lr, sacred_ex, validate_every_steps=-1, max_grad_norm=0.5, validation_metric='ndcg_cut_10', num_training_instances=-1)

Initialize self. See help(type(self)) for accurate signature.

Methods

__init__(model, train_loader, val_loader, …)

Initialize self.

fit()

Trains the transformer-based neural ranker.

predict(loader)

Uses trained model to make predictions on the loader.

predict_with_uncertainty(loader, foward_passes)

Uses trained model to make predictions on the loader with uncertainty estimations.

test()

Uses trained model to make predictions on the test loader.

test_with_dropout(foward_passes)

Uses trained model to make predictions on the test loader, using MC dropout as a Bayesian approximation.

fit()

Trains the transformer-based neural ranker.

predict(loader)

Uses trained model to make predictions on the loader.

Parameters

loader – the DataLoader containing the set on which to run prediction and evaluation.

Returns

Matrices (logits, labels, softmax_logits)
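The num_ns_eval parameter is described as being used to accumulate predictions into lists of the appropriate size. A minimal sketch of that grouping is shown below, assuming each query contributes one relevant candidate followed by num_ns_eval negatives, so each list has num_ns_eval + 1 entries. The function group_by_query is a hypothetical name and the list-size assumption is not confirmed by this page.

```python
def group_by_query(flat_scores, num_ns_eval):
    """Split a flat list of scores into per-query lists of 1 + num_ns_eval candidates."""
    # Assumption: one relevant candidate plus num_ns_eval negative samples per query.
    list_size = num_ns_eval + 1
    if len(flat_scores) % list_size != 0:
        raise ValueError(
            f"expected a multiple of {list_size} scores, got {len(flat_scores)}"
        )
    # Slice consecutive chunks of list_size scores, one chunk per query.
    return [flat_scores[i:i + list_size] for i in range(0, len(flat_scores), list_size)]
```

With num_ns_eval=2, six flat scores become two lists of three candidates each, which is the shape a ranking metric expects per query.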

predict_with_uncertainty(loader, foward_passes)

Uses trained model to make predictions on the loader with uncertainty estimations.

This method uses MC dropout to obtain the predicted relevance (mean) and its uncertainty (variance) by keeping dropout enabled at test time and making K forward passes, where K is foward_passes.

See “Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning” https://arxiv.org/abs/1506.02142.
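The idea can be sketched with a toy model in pure Python. The linear scorer below is a stand-in for the transformer and is an assumption made only for illustration; stochastic_forward and mc_dropout_predict are hypothetical names, and the sketch spells the argument forward_passes rather than using the library's foward_passes.

```python
import random
import statistics


def stochastic_forward(features, weights, drop_prob, rng):
    """One forward pass of a toy linear scorer with dropout kept active."""
    keep_prob = 1.0 - drop_prob
    score = 0.0
    for x, w in zip(features, weights):
        if rng.random() < keep_prob:
            # Inverted dropout: rescale kept inputs so the expected score is unchanged.
            score += (x / keep_prob) * w
    return score


def mc_dropout_predict(features, weights, forward_passes, drop_prob=0.1, seed=0):
    """Return (mean, variance, per-pass scores) over several stochastic passes."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    passes = [stochastic_forward(features, weights, drop_prob, rng)
              for _ in range(forward_passes)]
    mean = statistics.mean(passes)           # predicted relevance
    variance = statistics.pvariance(passes)  # uncertainty estimate
    return mean, variance, passes
```

A larger variance across passes indicates that the prediction is sensitive to which units are dropped, which is read as higher model uncertainty for that instance.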

Parameters
  • loader – DataLoader containing the set on which to run prediction and evaluation.

  • foward_passes – int indicating the number of forward prediction passes for each instance.

Returns

The mean logits for every instance, the labels, the mean softmax_logits, all predictions obtained during the forward passes (foward_passes_logits), and the uncertainties (variance).

Return type

Matrices (logits, labels, softmax_logits, foward_passes_logits, uncertainties)

test()

Uses trained model to make predictions on the test loader.

Returns

Matrices (logits, labels, softmax_logits)

test_with_dropout(foward_passes)

Uses trained model to make predictions on the test loader, using MC dropout as a Bayesian approximation.

Returns

Matrices (logits, labels, softmax_logits, foward_passes_logits, uncertainties)