Establishing Strong Baselines for TripClick Health Retrieval (2201.00365v1)

Published 2 Jan 2022 in cs.IR and cs.CL

Abstract: We present strong Transformer-based re-ranking and dense retrieval baselines for the recently released TripClick health ad-hoc retrieval collection. We improve the training data, which was originally too noisy, with a simple negative sampling policy. We achieve large gains over BM25 in the re-ranking task of TripClick, which were not achieved with the original baselines. Furthermore, we study the impact of different domain-specific pre-trained models on TripClick. Finally, we show that dense retrieval outperforms BM25 by considerable margins, even with simple training procedures.

Citations (11)

GitHub

GitHub - sebastian-hofstaetter/tripclick: Establishing Strong Baselines for TripClick Health Retrieval; ECIR 2022 (5 stars)
