Rana Alotaibi, Yuanyuan Tian, Stefan Grafberger, Jesus Camacho-Rodriguez, Nicolas Bruno, Brian Kroth, Sergiy Matusevych, Ashvin Agrawal, Mahesh Behera, Ashit Gosalia, Cesar Galindo-Legaria, Milind Joshi, Milan Potocnik, Beysim Sezgin, Xiaoyu Li, Carlo Curino.
Towards Query Optimizer as a Service (QOaaS) in a Unified LakeHouse Platform: Can One QO Rule Them All?
Conference on Innovative Data Systems Research (CIDR),
2024.
Tim Januschowski, Yuyang Wang, Jan Gasthaus, Syama Rangapuram, Caner Turkmen, Jasper Zschiegner, Lorenzo Stella, Michael Bohlke-Schneider, Danielle Maddix, Konstantinos Benidis, Alexander Alexandrov, Christos Faloutsos, Sebastian Schelter.
A Flexible Forecasting Stack.
International Conference on Very Large Databases (VLDB),
2024.
Sergey Redyuk, Zoi Kaoudi, Sebastian Schelter, Volker Markl.
Assisted Design of Data Science Pipelines.
The VLDB Journal — The International Journal on Very Large Data Bases,
2024.
Songgaojun Deng, Olivier Sprangers, Ming Li, Sebastian Schelter, Maarten de Rijke.
Domain Generalization in Time Series Forecasting.
ACM Transactions on Knowledge Discovery from Data (TKDD),
2024.
Olivier Sprangers, Wander Wadman, Sebastian Schelter, Maarten de Rijke.
Hierarchical Forecasting at Scale.
International Journal of Forecasting,
2024.
Sebastian Schelter.
Letter from the Special Issue Editor.
Special issue on “Directions Towards GDPR-Compliant Data Systems and Applications” of the IEEE Data Engineering Bulletin (Vol 45, Issue 1),
2022.
Stefan Grafberger, Paul Groth, Julia Stoyanovich, Sebastian Schelter.
Data Distribution Debugging in Machine Learning Pipelines.
The VLDB Journal — The International Journal on Very Large Data Bases (Special Issue on Data Science for Responsible Data Management),
2021.
Till Doehmen, Mark Raasveldt, Hannes Mühleisen, Sebastian Schelter.
DuckDQ: Data Quality Assertions for Machine Learning Pipelines.
Workshop on Challenges in Deploying and Monitoring ML Systems at the International Conference on Machine Learning (ICML),
2021.
Sebastian Schelter.
Letter from the Special Issue Editor.
Special issue on “Data validation for machine learning models and applications” of the IEEE Data Engineering Bulletin (Vol 44, Issue 1),
2021.
Sebastian Schelter, Julia Stoyanovich.
Taming Technical Bias in Machine Learning Pipelines.
IEEE Data Engineering Bulletin (Special Issue on Interdisciplinary Perspectives on Fairness and Artificial Intelligence Systems),
2020.
Edo Liberty, Zohar Karnin, Bing Xiang, Laurence Rouesnel, Baris Coskun, Ramesh Nallapati, Julio Delgado, Amir Sadoughi, Yury Astashonok, Piali Das, Can Balioglu, Saswata Chakravarty, Madhav Jha, Philip Gaultier, Tim Januschowski, Valentin Flunkert, Bernie Wang, Jan Gasthaus, Syama Rangapuram, David Salinas, Sebastian Schelter, David Arpin, Alexander Smola.
Elastic Machine Learning Algorithms in Amazon SageMaker.
ACM SIGMOD,
2020.
Amir Aghasadeghi, Vera Z. Moffitt, Sebastian Schelter, Julia Stoyanovich.
Zooming Out on an Evolving Graph.
International Conference on Extending Database Technology (EDBT),
2020.
Felix Biessmann, Tammo Rukat, Philipp Schmidt, Prathik Naidu, Sebastian Schelter, Andrey Taptunov, Dustin Lange, David Salinas.
DataWig - Missing Value Imputation for Tables.
Journal of Machine Learning Research (JMLR), open source software track,
2019.
Tilmann Rabl, Christoph Brücke-Wendorff, Philipp Härtling, Stella Stars, Rodrigo Escobar Palacios, Hamesh Patel, Satyam Srivastava, Christoph Boden, Jens Meiners, Sebastian Schelter.
AdaBench - Towards an Industry Standard Benchmark for Advanced Analytics.
TPC Technology Conference on Performance Evaluation & Benchmarking (TPCTC),
2019.
Sebastian Schelter, Stefan Grafberger, Philipp Schmidt, Tammo Rukat, Mario Kiessling, Andrey Taptunov, Felix Biessmann, Dustin Lange.
Differential Data Quality Verification on Partitioned Data.
IEEE International Conference on Data Engineering (ICDE),
2019.
Sebastian Schelter, Felix Biessmann, Dustin Lange, Tammo Rukat, Philipp Schmidt, Stephan Seufert, Andrey Taptunov.
Unit Testing Data with Deequ.
ACM SIGMOD (demo),
2019.
Sebastian Schelter, Stefan Grafberger, Philipp Schmidt, Tammo Rukat, Mario Kiessling, Andrey Taptunov, Felix Biessmann, Dustin Lange.
Deequ - Data Quality Validation for Machine Learning Pipelines.
Machine Learning Systems workshop at the conference on Neural Information Processing Systems (NeurIPS),
2018.
Sebastian Schelter, Dustin Lange, Meltem Celikel, Philipp Schmidt, Felix Biessmann, Andreas Grafberger.
Automating Large-Scale Data Quality Verification.
International Conference on Very Large Databases (VLDB),
2018.
Joos-Hendrik Böse, Valentin Flunkert, Jan Gasthaus, Tim Januschowski, Dustin Lange, David Salinas, Sebastian Schelter, Matthias Seeger, Yuyang Wang.
Probabilistic Demand Forecasting at Scale.
International Conference on Very Large Databases (VLDB),
2017.
Felix Biessmann, Pola Lehmann, Daniel Kirsch, Sebastian Schelter.
Predicting Political Party Affiliation from Text.
International Conference on the Advances in Computational Analysis of Political Text (PolText),
2016.
Sebastian Schelter, Douglas Burdick, Berthold Reinwald, Alexandre Evfimievski, Juan Soto, Volker Markl.
Efficient Sample Generation for Scalable Meta Learning.
IEEE International Conference on Data Engineering (ICDE),
2015.
Alexander Alexandrov, Rico Bergmann, Stephan Ewen, Johann-Christoph Freytag, Fabian Hueske, Arvid Heise, Odej Kao, Marcus Leich, Ulf Leser, Volker Markl, Felix Naumann, Mathias Peters, Astrid Rheinländer, Matthias Sax, Sebastian Schelter, Mareike Höger, Kostas Tzoumas, Daniel Warneke.
The Stratosphere platform for big data analytics.
The VLDB Journal — The International Journal on Very Large Data Bases,
2014.