Luis Oala

Hi, I am Luis Oala. I am interested in composable systems for measuring, optimizing and exchanging data states across the entire data generating process in machine learning.

At regular intervals, I share my ideas through writing, code and presentations spanning topics such as data optimization [1, 2, 3, 4, 5], ML data formats [1, 2, 3] and measurement tools for ML systems [1, 2, 3, 4, 5].

I also enjoy promoting opportunities for community. I helped initiate machine learning venues such as Data-Centric Machine Learning Research (DMLR) and AI for Good and co-chaired conferences such as ICLR, the DMLR workshop series or ML4H.

I am Head of Machine Learning at the Swiss company Dotphoton and a PhD research scientist in Wojciech Samek's Department of Artificial Intelligence at Fraunhofer HHI in Berlin, Germany.

Writing

See Google Scholar

Talks and Presentations

2025.09.18 | A Fever Dream of Machine Learning Framework Composability: Croissant and Beyond | [program] | Invited Talk @ 2025 Berlin Summer School of Artificial Intelligence and Society | Berlin, Germany
2025.08.24 | Intro to Data-Centric Deep Learning | [program] | Invited Lecture @ 2025 TAIK AI Camp | Ethiopia, Cameroon, and Tanzania
2025.03.17 | Data Market for Healthcare AI | [slides] [notes] | Invited Talk @ National University of Singapore | Singapore
2024.12.13 | Croissant: A Metadata Format for ML-Ready Datasets | [abstract] [slides] [poster] | Poster^spotlight @ NeurIPS 2024 | Vancouver, Canada
2024.12.11 | Generative Fractional Diffusion Models | [abstract] [slides] [poster] | Poster @ NeurIPS 2024 | Vancouver, Canada
2024.12.04 | A Fever Dream of Machine Learning Framework Composability | [abstract] [video] | Invited Talk @ Microsoft Research | Nairobi, Kenya
2024.09.17 | Dotphoton: Your Image Data, Fit for AI | [slides] | Invited Talk @ Innosuisse | San Francisco, USA
2024.09.02 | Paradoxes in Data-Centric Machine Learning | [slides] [abstract] | Invited Talk @ Deep Learning Indaba 2024 | Dakar, Senegal
2024.06.09 | Croissant: A Metadata Format for ML-Ready Datasets | [paper] [slides] | Contributed Talk^{best paper award} @ SIGMOD/PODS Data Management for End-to-End Machine Learning Workshop | Santiago, Chile
2024.05.31 | From Diverse Datasets to United Nations Public Good Tasks | [program] [slides] | Opening Remarks @ ITU AI for Good Summit | Geneva, Switzerland
2024.01.12 | DTX (Data-Transform Exchange): A Protocol for Composable Data and Transform Transactions | [slides] | Session Chair @ Schloss Dagstuhl Open Machine Learning 2024 Winter Workshop | Dagstuhl, Germany
2023.12.12 | DiffInfinite: Large Mask-Image Synthesis via Parallel Random Patch Diffusion in Histopathology | [abstract] | Poster^spotlight @ NeurIPS 2023 | New Orleans, USA
2023.11.30 | Metrological Machine Learning (2ML) | [slides] [program] | Invited Talk @ IEEE BIP Tecnológico de Costa Rica | San Carlos, Costa Rica
2023.11.01 | Metrological Machine Learning (2ML) | [slides] [report] | Invited Talk @ King Abdulaziz City for Science and Technology (KACST) | Riyadh, Saudi Arabia
2023.10.02 | Inspiration Exchange - Data-Centric AI | [recording] | Panelist @ Mihaela van der Schaar Lab University of Cambridge | Cyberspace
2023.07.12 | Interview | [abstract] [raw video] | Interview @ IEEE TEMS/ACM with Stephen Ibaraki | Cyberspace
2023.03.24 | Data and AI Solution Assessment Methods | [slides] | Invited Talk @ Harvard University | Cambridge, USA
2023.01.10 | DMLR: Data-centric Machine Learning Research - Past, Present and Future | [notes] | Session Chair @ Asilomar Retreat on Future of Datasets | Monterey, USA
2022.11.28 | Q&A: Emmanuel Candes | [recording] [program] | Session Chair @ ML4H 2022 | New Orleans, USA
2022.11.28 | Q&A: Ben Recht | [recording] [program] | Session Chair @ ML4H 2022 | New Orleans, USA
2022.11.28 | Panel with Himabindu Lakkaraju, Zack Lipton & Mihaela van der Schaar | [recording] [program] | Session Chair @ ML4H 2022 | New Orleans, USA
2022.05.23 | The Audit of a Diabetic Retinopathy Classification Model | [poster] | Poster @ SAIL 2022 | Hamilton, Bermuda
2021.03.17 | Interval Neural Networks as Instability Detectors for Image Reconstructions | [recording] [program] [paper] | Contributed Talk^{best paper award} @ BVM 2021 | Regensburg, Germany
2020.12.11 | ML4H Auditing: From Paper to Practice | [recording] [poster] [paper] | Contributed Talk^spotlight @ ML4H 2020 | Cyberspace
2020.07.17 | Detecting Failure Modes in Image Reconstructions with Interval Neural Network Uncertainty | [recording] [paper] [program] | Contributed Talk^spotlight @ ICML UDL 2020 | Cyberspace
2020.01.22 | AI Test Metric Specification | [slides] | Invited Talk @ WHO PAHO | Brasilia, Brazil
2019.11.12 | Data and AI Solution Assessment Methods | [slides] | Invited Talk @ Department of Telecommunications of India | New Delhi, India
2019.09.04 | Data and AI Solution Assessment Methods | [slides] | Invited Talk @ Universal Communications Service Access Fund of Tanzania | Zanzibar, Tanzania