transformer_rankers.eval.results_analyses_tools.evaluate_and_aggregate

transformer_rankers.eval.results_analyses_tools.evaluate_and_aggregate(preds, labels, metrics)

Calculate evaluation metrics for a pair of preds and labels.

Aggregates the results only for the evaluation metrics listed in the metrics argument.

Parameters
  • preds – list of lists of floats with predictions for each query.

  • labels – list of lists of floats with relevance labels for each query.

  • metrics – list of str with the metrics names to aggregate.

Returns: dict with the metric results per model and query.
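
The input shapes described above can be illustrated with a minimal, self-contained sketch. It does not call transformer_rankers and is not the library's implementation: the function aggregate_sketch, the two hand-written metrics, and the metric names "recip_rank" and "precision_at_1" are hypothetical, chosen only to show one plausible way of scoring each query and averaging the requested metrics. The actual function may compute metrics and structure its return value differently.

```python
from statistics import mean


def _rank_by_score(scores, relevance):
    """Return relevance labels ordered by descending prediction score."""
    ranked = sorted(zip(scores, relevance), key=lambda pair: pair[0], reverse=True)
    return [rel for _, rel in ranked]


def reciprocal_rank(scores, relevance):
    """1 / rank of the first relevant document, or 0.0 if none is relevant."""
    for position, rel in enumerate(_rank_by_score(scores, relevance), start=1):
        if rel > 0:
            return 1.0 / position
    return 0.0


def precision_at_1(scores, relevance):
    """1.0 if the top-scored document is relevant, else 0.0."""
    ranked = _rank_by_score(scores, relevance)
    return 1.0 if ranked and ranked[0] > 0 else 0.0


# Hypothetical metric registry; these names are assumptions for this sketch,
# not the metric names accepted by the library.
METRIC_FUNCTIONS = {
    "recip_rank": reciprocal_rank,
    "precision_at_1": precision_at_1,
}


def aggregate_sketch(preds, labels, metrics):
    """Average each requested metric over all queries (illustrative only)."""
    return {
        name: mean(METRIC_FUNCTIONS[name](p, l) for p, l in zip(preds, labels))
        for name in metrics
    }


# One inner list per query: prediction scores and matching relevance labels.
preds = [[0.9, 0.2, 0.5], [0.1, 0.8, 0.3]]
labels = [[0.0, 0.0, 1.0], [0.0, 1.0, 0.0]]

print(aggregate_sketch(preds, labels, ["recip_rank", "precision_at_1"]))
```

In this sketch, only the metrics named in the third argument appear in the returned dict, mirroring the description of the metrics parameter above.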