Posts

  • Jul 7, 2025 ‐ Our external PhD student Barrie Kersbergen (co-supervised with Maarten de Rijke) has successfully defended his PhD at the University of Amsterdam! Barrie’s research on recommender systems has been deployed to millions of users at the European e-commerce platform bol.com.
  • Jun 10, 2025 ‐ Meet our lab at the SIGMOD conference in Berlin next week! We are part of the organizing committee of the conference and co-organise the DEEM workshop as well. Furthermore, we will present a workshop paper on Towards Automated Task-Aware Data Validation and run a tutorial on Navigating Data Errors in Machine Learning Pipelines on Friday.
  • May 2, 2025 ‐ Olga and Sebastian took part in the seminar on the Challenges and Opportunities of Table Representation Learning in Dagstuhl, which aims to connect the communities of data management, machine learning, and natural language processing to discuss the future of learning on tabular data.
  • Mar 25, 2025 ‐ Zeyu gave an invited talk about the efficient utilization of language models for table data preparation at the industry event on Next-Generation Data Management Systems at EDBT 2025 in Barcelona, and subsequently presented our paper on A Deep Dive Into Cross-Dataset Entity Matching with Large and Small Language Models.
  • Jan 11, 2025 ‐ Stefan will be co-organising the workshop on Data Management for End-to-End Machine Learning (DEEM) at SIGMOD 2025 in Berlin.
  • Dec 3, 2024 ‐ We have been co-organising a workshop on ‘The EU AI Act – Developing a technical perspective’ together with our colleagues from machine learning and law as well as industry practitioners.
  • Oct 10, 2024 ‐ Our research group has been covered in an interview on the #ai_berlin website.
  • Jun 1, 2024 ‐ We will be present at the upcoming VLDB conference in China with several contributions.
  • Jun 1, 2024 ‐ At the upcoming SIGMOD conference in Chile, Till will present our paper on SchemaPile: A Large Collection of Relational Database Schemas. SchemaPile is a corpus of more than 200 thousand database schemas, which we envision as a great resource for ML models dealing with structured data, e.g., in data integration tasks. Furthermore, Stefan will present his initial ideas on Interactively Improving ML Data Preparation Code via ‘Shadow Pipelines’ at the DEEM workshop.
  • May 12, 2024 ‐ Barrie and Zeyu will present two papers at the International Conference on Data Engineering (ICDE) in Utrecht. Barrie will discuss how to choose cost-efficient deployment options for neural recommendation models in e-commerce, while Zeyu will present initial ideas for zero-shot entity matching.