Entity matching (EM) is the problem of determining whether two data records refer to the same real-world entity. A particularly challenging scenario is cross-dataset entity matching, where the matcher must generalize to an unseen target dataset for which no labelled examples are available. Cross-dataset EM is crucial in scenarios that require a high degree of automation and in which asking a domain expert to manually label training data is impractical. Recently, approaches based on language models have become popular for EM and often promise impressive transfer capabilities. However, a comprehensive and systematic study of the cross-dataset EM capabilities of these recent approaches is missing. It is unclear which categories of language models are actually applicable in a cross-dataset EM setting, how well current EM approaches perform when evaluated systematically in a cross-dataset setting, and what the relationship is between the prediction quality and the deployment cost of various large language model-based EM approaches. We address these open questions with the first comprehensive and systematic study of cross-dataset entity matching, in which we evaluate eight matchers on eleven benchmark datasets, cover a wide variety of model sizes and transfer learning approaches, and also explore and quantify the relationship between the prediction quality and the deployment cost of the matching approaches. We find that fine-tuned small models can perform on par with prompted large models, that data-centric approaches outperform model-centric ones, and that approaches using well-performing small models can be deployed at a cost that is orders of magnitude lower than that of comparably performing approaches based on large commercial models.