transformer_rankers.datasets.preprocess_crr.transform_dstc8_to_tsv
transformer_rankers.datasets.preprocess_crr.transform_dstc8_to_tsv(path)
Transforms the DSTC8 JSON format into a conversation response ranking TSV file.
See https://github.com/dstc8-track2/NOESIS-II/ for more details on the input format. Each output line is tab-separated, in the form label utterance_1 utterance_2 … candidate_response. Since we do the negative sampling ourselves, negative samples are not taken from the tsv files; only lines with label = 1 are read.
Parameters: path – str with the path to the JSON file.
Returns: list with the tsv lines.
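
The output format described above can be illustrated with a minimal sketch. This is not the library's implementation: the helper name `dstc8_examples_to_tsv_lines` is hypothetical, and the field names `messages-so-far`, `options-for-correct-answers`, and `utterance` are assumptions about the NOESIS-II JSON layout that should be checked against the repository linked above. Replacing tabs and newlines inside utterances is also an assumption, made here so that each example stays on one well-formed line.

```python
def dstc8_examples_to_tsv_lines(examples):
    """Hypothetical helper: build 'label<TAB>utterance_1<TAB>...<TAB>response' lines.

    `examples` is assumed to be the already-parsed JSON list. The field names
    used below are assumptions about the NOESIS-II format.
    """
    def clean(text):
        # Tabs or newlines inside an utterance would break the TSV layout.
        return text.replace("\t", " ").replace("\n", " ").strip()

    lines = []
    for example in examples:
        context = [clean(m["utterance"]) for m in example["messages-so-far"]]
        # Only correct responses are emitted (label = 1); negatives are
        # expected to be sampled later by the caller.
        for answer in example.get("options-for-correct-answers", []):
            lines.append("\t".join(["1"] + context + [clean(answer["utterance"])]))
    return lines


# Toy input with made-up content, shaped like the assumed format.
sample = [
    {
        "messages-so-far": [
            {"speaker": "participant_1", "utterance": "my wifi is down"},
            {"speaker": "participant_2", "utterance": "which ubuntu version?"},
        ],
        "options-for-correct-answers": [
            {"candidate-id": "c1", "utterance": "try restarting network-manager"}
        ],
    }
]

tsv_lines = dstc8_examples_to_tsv_lines(sample)
```

With the real function, the equivalent call would pass the path to the JSON file instead of parsed data, and the returned list could be written out one line per example.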