transformer_rankers.datasets.dataset.QueryDocumentDataset

class transformer_rankers.datasets.dataset.QueryDocumentDataset(data, tokenizer, data_partition, negative_sampler, task_type, max_seq_len, sample_data, cache_path, cache_mode='memmap')

Bases: torch.utils.data.dataset.Dataset

Dataset for pointwise learning with <Query, Document> pairs. Generative transformers are not supported when cache_mode is 'memmap'.
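As a rough illustration of what "pointwise learning with <Query, Document> pairs" means, the sketch below expands each query and its relevant document into labeled examples: label 1 for the relevant document and label 0 for each sampled negative. It is not the library's implementation. The function build_pointwise_examples, its parameters, and the uniform random sampling (standing in for whatever the negative_sampler argument provides) are assumptions made for this example only.

```python
import random


def build_pointwise_examples(pairs, candidate_pool, num_negatives=1, seed=0):
    """Expand (query, relevant_doc) pairs into (query, doc, label) examples.

    Hypothetical helper: the relevant document gets label 1 and each
    randomly sampled negative document gets label 0.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    examples = []
    for query, relevant_doc in pairs:
        examples.append((query, relevant_doc, 1))
        # Candidate negatives are all pool documents except the relevant one.
        negatives_pool = [doc for doc in candidate_pool if doc != relevant_doc]
        k = min(num_negatives, len(negatives_pool))
        for negative_doc in rng.sample(negatives_pool, k):
            examples.append((query, negative_doc, 0))
    return examples


# Toy data, made up for illustration.
pairs = [
    ("what is bm25", "BM25 is a ranking function."),
    ("capital of france", "Paris is the capital of France."),
]
candidate_pool = [doc for _, doc in pairs] + ["Transformers use attention."]

examples = build_pointwise_examples(pairs, candidate_pool, num_negatives=1)
for query, doc, label in examples:
    print(label, "|", query, "|", doc)
```

In a pointwise setup each such example is scored independently, so a classifier only ever sees one query and one document at a time.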

__init__(data, tokenizer, data_partition, negative_sampler, task_type, max_seq_len, sample_data, cache_path, cache_mode='memmap')

Initialize self. See help(type(self)) for accurate signature.

Methods

__init__(data, tokenizer, data_partition, …)

Initialize self.