To receive news from us, please register for the mailing list. You will receive information about our upcoming seminars and the link to participate. Our seminars are on Mondays, 16.30-17.30 CET, and we typically send the link on Monday morning.
EURO – The Association of European Operational Research Societies has a new instrument: the EURO Online Seminar Series (EURO OSS). This is the page of the EURO OSS on Operational Research and Machine Learning; the website is https://euroorml.euro-online.org/ and you can register for the mailing list at https://forms.gle/YWLb6EPKRQnQert68.
The EURO OSS on Operational Research and Machine Learning is an online seminar series whose goal is to showcase the role of Operational Research in Artificial Intelligence. The format is a weekly session that takes place every Monday, 16.30-17.30 (CET). It is 100% online, and it features leading speakers from Operational Research, as well as neighbouring areas, who cover important topics such as explainability, fairness, fraud, privacy, etc. We also have the YOUNG EURO OSS on Operational Research and Machine Learning. In each YOUNG session, three junior academics present their latest results in this burgeoning area. All talks are recorded, and the videos are uploaded to the website.
The EURO OSS on Operational Research and Machine Learning is organized by Emilio Carrizosa (IMUS – Instituto de Matemáticas de la Universidad de Sevilla) and Dolores Romero Morales (CBS – Copenhagen Business School) with the collaboration of PhD students Nuria Gómez-Vargas (IMUS) and Thomas Halskov (CBS). The Online Seminar Series is free of charge thanks to the support of EURO, Universidad de Sevilla (US), and Copenhagen Business School (CBS), which is gratefully acknowledged.
For the academic year 2024/25, we have confirmed the participation of the following speakers (in alphabetical order):
YOUNG EURO OSS on Operational Research and Machine Learning speakers
The EURO OSS on Operational Research and Machine Learning is the sequel to a weekly online event that we ran from January 2021 to May 2024, https://congreso.us.es/mlneedsmo/. We had more than 100 speakers, mainly from Europe and North America, and more than 2,000 colleagues registered to receive updates. Some of our speakers drew more than 200 attendees, and quite a few colleagues watch the videos on our YouTube channel instead. This channel has more than 1,000 subscribers, and some of the talks in our Online Seminar Series have more than 1,000 views so far.
The Differentiable Feasibility Pump
To attend the seminar, please click here: https://cbs-dk.zoom.us/j/68108652405
Speaker: Prof Andrea Lodi
Andrew H. and Ann R. Tisch Professor
Jacobs Technion-Cornell Institute
Cornell University
USA
Abstract:
Although nearly 20 years have passed since its conception, the feasibility pump algorithm remains a widely used heuristic to find feasible primal solutions to mixed-integer linear problems. Many extensions of the initial algorithm have been proposed. Yet, its core remains centered around two key steps: solving the linear relaxation of the original problem to obtain a solution that respects the constraints, and rounding it to obtain an integer solution. This paper shows that the traditional feasibility pump and many of its follow-ups can be seen as gradient-descent algorithms with specific parameters. A central aspect of this reinterpretation is observing that the traditional algorithm differentiates the solution of the linear relaxation with respect to its cost. This reinterpretation opens many opportunities for improving the performance of the original algorithm. We study how to modify the gradient-update step as well as how to extend the loss function. We perform extensive experiments on MIPLIB instances and show that these modifications can substantially reduce the number of iterations needed to find a solution. (Joint work with M. Cacciola, A. Forel, and A. Frangioni.)
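To make the two-step loop concrete, here is a minimal sketch of the classical feasibility pump on a toy binary feasibility problem with randomly generated data; the talk's contribution is reinterpreting exactly this loop as gradient descent and improving its update and loss, which is not reproduced here.

```python
# Minimal sketch of the classical feasibility pump for a toy 0/1 problem.
import numpy as np
from scipy.optimize import linprog

# Toy binary feasibility problem: find x in {0,1}^n with A x <= b.
rng = np.random.default_rng(0)
n, m = 10, 6
A = rng.integers(-3, 4, size=(m, n)).astype(float)
x_feas = (rng.random(n) < 0.5).astype(float)
b = A @ x_feas + 1.0                      # guarantees a feasible 0/1 point exists

x_round = np.round(rng.random(n))         # initial rounded point
for it in range(50):
    # LP step: minimize the L1 distance to the rounded point, which for
    # binaries is linear: x_j where the rounding is 0, (1 - x_j) where it is 1.
    c = np.where(x_round > 0.5, -1.0, 1.0)
    x_lp = linprog(c, A_ub=A, b_ub=b, bounds=[(0, 1)] * n, method="highs").x
    x_new = np.round(x_lp)                # rounding step
    if np.all(A @ x_new <= b + 1e-9):     # integral and feasible: done
        print(f"feasible point found at iteration {it}: {x_new}")
        break
    if np.array_equal(x_new, x_round):    # cycle detected: perturb two entries
        flip = rng.choice(n, size=2, replace=False)
        x_new[flip] = 1.0 - x_new[flip]
    x_round = x_new
```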
Introducing the Cloud of Spheres Classification: An Alternative to Black-Box Models
To receive the link to attend the seminar, please register for the mailing list here.
Speaker: Prof Paula Alexandra Amaral
Associate Professor, Department of Mathematics & Nova Math
Nova School of Science & Technology
Universidade Nova de Lisboa
Portugal
Abstract:
Machine learning models have been widely applied across various domains, often without critically examining the underlying mechanisms. Black-box models, such as Deep Neural Networks, pose significant challenges in terms of counterfactual analysis, interpretability, and explainability. In situations where understanding the rationale behind a model’s predictions is essential, exploring more transparent machine learning techniques becomes highly advantageous.
In this presentation, we introduce a novel binary classification model called a Cloud of Spheres. The model is formulated as a Mixed-Integer Nonlinear Programming (MINLP) problem that seeks to minimize the number of spheres required to classify data points. This approach is particularly well-suited for scenarios where the structure of the target class is highly non-linear and non-convex, but it can also adapt to cases with linear separability.
Unlike neural networks, this classification model retains data in its original feature space, eliminating the need for kernel functions or extensive hyperparameter tuning. Only one parameter may be required if the objective is to maximize the separation margin, similar to Support Vector Machines (SVMs). By avoiding black-box complexity, the model enhances transparency and interpretability.
A significant challenge with this method lies in finding global optima for large datasets. To address this, we propose a heuristic approach that has demonstrated good performance on a benchmark set. When compared to state-of-the-art algorithms, the heuristic delivers competitive results, showcasing the model’s effectiveness and practical applicability.
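As a rough illustration of the idea (not the authors' MINLP formulation or their heuristic), the following hypothetical greedy procedure covers the target class with spheres whose radii stop just short of the nearest opposite-class point:

```python
# Hypothetical greedy sketch in the spirit of the Cloud of Spheres idea.
import numpy as np

def cloud_of_spheres(X_pos, X_neg, shrink=0.99):
    spheres = []
    uncovered = np.ones(len(X_pos), dtype=bool)
    while uncovered.any():
        center = X_pos[np.argmax(uncovered)]          # first uncovered point
        d_neg = np.linalg.norm(X_neg - center, axis=1)
        radius = shrink * d_neg.min()                 # stay inside the class
        d_pos = np.linalg.norm(X_pos - center, axis=1)
        uncovered &= d_pos > radius                   # mark covered points
        spheres.append((center, radius))
    return spheres

def predict(spheres, x):
    # Classify as the target class if x falls inside any sphere.
    return any(np.linalg.norm(x - c) <= r for c, r in spheres)

rng = np.random.default_rng(1)
X_pos = rng.normal(0.0, 1.0, size=(40, 2))
X_neg = rng.normal(4.0, 1.0, size=(40, 2))
spheres = cloud_of_spheres(X_pos, X_neg)
print(len(spheres), predict(spheres, np.zeros(2)))
```

Note that the number of spheres produced by such a greedy pass only upper-bounds the minimum that the MINLP seeks.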
II YOUNG Session
To receive the link to attend the seminar, please register for the mailing list here.
A unified approach to extract interpretable rules from tree ensembles via Integer Programming
Speaker 1: Lorenzo Bonasera
PhD student at the German Aerospace Center (DLR) – Institute for AI Safety and Security, Germany
Abstract:
Tree ensembles are a popular machine learning model, known for their effectiveness in supervised classification and regression tasks. Their performance derives from aggregating predictions of multiple decision trees, which are renowned for their interpretability properties. However, tree ensemble methods do not reliably exhibit interpretable output. Our work aims to extract an optimized list of rules from a trained tree ensemble, providing the user with a condensed, interpretable model that retains most of the predictive power of the full model. Our approach consists of solving a partitioning problem formulated through Integer Programming. The proposed method works with either tabular or time series data, for both classification and regression tasks, and it can accommodate any kind of nonlinear loss function. Through rigorous computational experiments, we offer statistically significant evidence that our method is competitive with other rule extraction methods, and that it excels in terms of fidelity to the opaque model.
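For a flavour of the pipeline, the sketch below harvests the candidate root-to-leaf rules from a trained scikit-learn forest; the talk's actual contribution, selecting an optimal rule subset via Integer Programming, is not reproduced here.

```python
# Harvest candidate root-to-leaf rules from a trained random forest.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True)
forest = RandomForestClassifier(n_estimators=20, max_depth=3,
                                random_state=0).fit(X, y)

rules = []
for est in forest.estimators_:
    t = est.tree_
    def walk(node, conds):
        if t.children_left[node] == -1:                # leaf node
            label = int(np.argmax(t.value[node]))
            rules.append((tuple(conds), label))
            return
        f, thr = t.feature[node], t.threshold[node]
        walk(t.children_left[node], conds + [(f, "<=", thr)])
        walk(t.children_right[node], conds + [(f, ">", thr)])
    walk(0, [])

print(f"{len(rules)} candidate rules; example:", rules[0])
```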
A decomposition algorithm for sparse and fair soft regression trees
Speaker 2: Dr Antonio Consolo
Postdoctoral researcher at Politecnico di Milano, Department of Electronics, Information and Bioengineering, Italy
Abstract:
In supervised Machine Learning (ML) models, achieving sparsity in input features is crucial not only for feature selection but also for enhancing model interpretability and potentially improving testing accuracy. Another crucial aspect is taking fairness measures into account, which make it possible to guarantee comparable performance for individuals belonging to sensitive groups. This talk focuses on sparsity and fairness in decision trees for regression tasks, a widely used interpretable ML model.
We recently proposed a new model variant for Soft Regression Trees (SRTs), which are decision trees with probabilistic decision rules at each branch node and a linear regression at each leaf node. Unlike previous SRT approaches, in our variant each leaf node provides a prediction with its associated probability, but, for any input vector, the output of the SRT is given by the regression associated with a single leaf node, namely, the one reached by following the branches with the highest probability. Our SRT variant exhibits the “conditional computation” property (a small portion of the tree architecture is activated when generating a prediction for a given data point), provides posterior probabilities, and is well-suited for decomposition. The convergent decomposition training algorithm turned out to yield accurate soft trees in short CPU time.
In this work, we extend the SRT model variant and the decomposition training algorithm to take into account model sparsity and fairness measures. Experiments on datasets from the UCI and KEEL repositories indicate that our approach effectively induces sparsity. Computational results on different datasets show that the decomposition algorithm, adapted to take fairness constraints into account, yields accurate SRTs that guarantee comparable performance between sensitive groups without significantly affecting the computational load.
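The prediction rule described above can be sketched in a few lines, assuming a complete binary soft tree with sigmoid routing and random placeholder parameters (i.e., not a trained model and not the authors' training algorithm):

```python
# Sketch of SRT prediction: route to the highest-probability leaf, then
# apply that leaf's linear regression ("conditional computation").
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def predict_srt(x, W, b, leaf_coef, leaf_int, depth):
    node = 0
    for _ in range(depth):
        # Only one root-to-leaf path of the tree is evaluated.
        p_left = sigmoid(W[node] @ x + b[node])
        node = 2 * node + 1 if p_left >= 0.5 else 2 * node + 2
    leaf = node - (2 ** depth - 1)        # leaf index within the last level
    return leaf_coef[leaf] @ x + leaf_int[leaf]

rng = np.random.default_rng(0)
depth, d = 3, 5
n_branch, n_leaf = 2 ** depth - 1, 2 ** depth
W, b = rng.normal(size=(n_branch, d)), rng.normal(size=n_branch)
leaf_coef, leaf_int = rng.normal(size=(n_leaf, d)), rng.normal(size=n_leaf)
print(predict_srt(rng.normal(size=d), W, b, leaf_coef, leaf_int, depth))
```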
Evaluating Large Language Models for Real Estate Valuation: Insights into Performance and Interpretability
Speaker 3: Margot Geerts
PhD student at the Research Centre for Information Systems Engineering (LIRIS), Faculty of Economics and Business, KU Leuven, Belgium
Abstract:
The real estate market is vital globally, yet stakeholders often struggle with information asymmetry in property valuation. This study explores how Large Language Models (LLMs) can improve transparency by accurately predicting house prices and providing insights into predictions. Evaluating GPT-4o-mini, Llama 3, and others across diverse datasets, we find that LLMs leverage hedonic variables effectively, achieving accuracy comparable to Gradient Boosted Trees and surpassing k-Nearest Neighbors. Tailored prompting strategies enhance their performance, with market-specific adjustments remaining crucial. While LLM explanations align with state-of-the-art models, their prediction intervals are often overconfident, revealing limitations in reliability and spatial reasoning. This research underscores LLMs’ potential to reduce knowledge gaps, offering accessible, data-driven insights for more transparent valuations.
Bridging smooth regression and mathematical optimization
To receive the link to attend the seminar, please register for the mailing list here.
Speaker: Prof Vanesa Guerrero
Associate Professor
Department of Statistics
Universidad Carlos III de Madrid
Spain
Abstract:
In today’s data-driven decision-making landscape, it is crucial to develop systems that integrate human knowledge and provide valuable support to the decision maker. This talk explores the intersection of statistical modeling and mathematical optimization to address the estimation of smooth functions that satisfy structural properties (e.g., on sign, monotonicity, or curvature). By leveraging these shape-constrained functions, we construct surrogate models for complex functions, enabling their use in mixed-integer nonlinear programming (MINLP). This approach exploits separability to enhance tractability, offering a practical framework for tackling challenging optimization problems in real-world applications.
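As a small, self-contained illustration of shape-constrained estimation, the sketch below fits a cubic polynomial under a monotonicity constraint enforced on a grid, using a max-error loss so the whole fit becomes a linear program; the talk's framework is considerably more general than this toy setup.

```python
# Monotone (nonnegative-derivative) cubic fit with Chebyshev loss, as an LP.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 1, 40))
y = x**2 + 0.05 * rng.normal(size=x.size)        # monotone ground truth

deg = 3
V = np.vander(x, deg + 1)                        # columns: x^3, x^2, x, 1
grid = np.linspace(0, 1, 50)
# Derivative design matrix: d/dx of sum_j c_j x^(deg-j), evaluated on the grid.
D = np.column_stack([(deg - j) * grid ** (deg - j - 1) if j < deg
                     else np.zeros_like(grid) for j in range(deg + 1)])

# Variables z = (c, t): minimize t subject to |Vc - y| <= t and Dc >= 0.
c_obj = np.zeros(deg + 2); c_obj[-1] = 1.0
A_ub = np.vstack([
    np.hstack([V, -np.ones((x.size, 1))]),       #  Vc - t <= y
    np.hstack([-V, -np.ones((x.size, 1))]),      # -Vc - t <= -y
    np.hstack([-D, np.zeros((grid.size, 1))]),   # -Dc     <= 0
])
b_ub = np.concatenate([y, -y, np.zeros(grid.size)])
res = linprog(c_obj, A_ub=A_ub, b_ub=b_ub,
              bounds=[(None, None)] * (deg + 1) + [(0, None)], method="highs")
coef = res.x[:deg + 1]
print("max fit error:", res.x[-1], "min slope on grid:", (D @ coef).min())
```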
Counterfactual Token Generation in Large Language Models
To receive the link to attend the seminar, please register for the mailing list here.
Abstract:
“Sure, I am happy to generate a story for you: Captain Lyra stood at the helm of her trusty ship, the Maelstrom’s Fury, gazing out at the endless sea. […] Lyra’s eyes welled up with tears as she realized the bitter truth – she had sacrificed everything for fleeting riches, and lost the love of her crew, her family, and herself.” Although this story, generated by a large language model, is captivating, one may wonder: how would the story have unfolded if the model had chosen “Captain Maeve” as the protagonist instead? We cannot know. State-of-the-art large language models are stateless: they maintain no internal memory or state. Given a prompt, they generate a sequence of tokens as an output using an autoregressive process. As a consequence, they cannot reason about counterfactual alternatives to tokens they have generated in the past. In this work, our goal is to enhance them with this functionality. To this end, we develop a causal model of token generation that builds upon the Gumbel-Max structural causal model. Our model allows any large language model to perform counterfactual token generation at almost no cost in comparison with vanilla token generation, it is embarrassingly simple to implement, and it does not require any fine-tuning or prompt engineering. We implement our model on Llama 3 8B-Instruct and Ministral-8B-Instruct and conduct a qualitative and a quantitative analysis of counterfactually generated text. We conclude with a demonstrative application of counterfactual token generation for bias detection, unveiling interesting insights about the model of the world constructed by large language models.
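The Gumbel-Max mechanism at the heart of the talk can be demonstrated at toy scale. The sketch below uses a hypothetical random "next-token" function in place of a real LLM: fixing the Gumbel noise and replaying generation under an intervention (a different protagonist) yields a coupled counterfactual continuation.

```python
# Toy demonstration of counterfactual token generation via the Gumbel-Max
# structural causal model; the "language model" here is a stand-in.
import numpy as np

rng = np.random.default_rng(42)
vocab = ["Lyra", "Maeve", "ship", "sea", "storm", "."]
V, T = len(vocab), 8

def logits(prefix):
    # Hypothetical next-token logits: a fixed random function of the last
    # token only (deterministic within one run, just to make the demo work).
    h = hash(prefix[-1]) % (2**32)
    return np.random.default_rng(h).normal(size=V)

def generate(first_token, noise):
    tokens = [first_token]
    for t in range(1, T):
        # Gumbel-Max sampling: argmax(logits + Gumbel noise) is an exact
        # categorical sample; reusing `noise` shares randomness across worlds.
        tokens.append(vocab[np.argmax(logits(tokens) + noise[t])])
    return tokens

noise = rng.gumbel(size=(T, V))            # one Gumbel vector per position
factual = generate("Lyra", noise)          # observed generation
counterfactual = generate("Maeve", noise)  # same noise, intervened start
print(" ".join(factual))
print(" ".join(counterfactual))
```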
III YOUNG Session
To receive the link to attend the seminar, please register for the mailing list here.
Accelerating Benders decomposition for the p-median problem through variable aggregation
Speaker 1: Rick Willemsen
PhD candidate at Erasmus University Rotterdam, The Netherlands
Abstract:
The p-median problem is a classical location problem where the goal is to select p facilities while minimizing the sum of distances from each location to its nearest facility. Recent advancements in solving the p-median and related problems have successfully leveraged Benders decomposition methods. The current bottleneck is the large number of variables and Benders cuts that are needed. We consider variable aggregation to reduce the size of these models. We propose to partially aggregate the variables in the model based on a starting solution; aggregation occurs only when the corresponding locations are assigned to the same facility in the initial solution. In addition, we propose a set of valid inequalities tailored to these aggregated variables. Our computational experiments indicate that our model, post-initialization, provides a stronger lower bound, thereby accelerating the resolution of the root node. Furthermore, this approach seems to positively impact the branching procedure, leading to an overall faster Benders decomposition method.
Fair and Accurate Regression: Strong Formulations and Algorithms
Speaker 2: Anna Deza
IEOR PhD candidate at UC Berkeley, California, USA
Abstract:
We introduce mixed-integer optimization methods to solve regression problems with fairness metrics. To tackle this computationally hard problem, we study the polynomially-solvable single-factor and single-observation subproblems as building blocks and derive their closed convex hull descriptions. Strong formulations obtained for the general fair regression problem in this manner are utilized to solve the problem with a branch-and-bound algorithm exactly or as a relaxation to produce fair and accurate models rapidly. To handle large-scale instances, we develop a coordinate descent algorithm motivated by the convex-hull representation of the single-factor fair regression problem to improve a given solution efficiently. Numerical experiments show competitive statistical performance with state-of-the-art methods while significantly reducing training times.
A Model-Agnostic Framework for Collective and Sparse “Black-Box” Explanations
Speaker 3: Thomas Halskov
PhD Fellow, Copenhagen Business School, Denmark
Abstract:
The widespread adoption of “black-box” models in critical applications highlights the need for transparent and consistent explanations. While local explainer methods like LIME offer insights for individual predictions, they do not consider global constraints, such as continuity and sparsity. We develop a global explanation framework that enforces these constraints on top of local explanations. We illustrate on real-world datasets that we can generate stable explanations across multiple observations, while maintaining desirable global properties.
Data-Driven Algorithm Design and Verification for Parametric Convex Optimization
To receive the link to attend the seminar, please register for the mailing list here.
Speaker: Prof Bartolomeo Stellato
Assistant Professor
Department of Operations Research and Financial Engineering
Princeton University
USA
Abstract:
We present computational tools to analyze and design first-order methods in parametric convex optimization. First-order methods are popular for their low per-iteration cost and warm-starting capabilities. However, precisely quantifying the number of iterations needed to compute high-quality solutions remains a key challenge, especially in fast applications where real-time execution is crucial. In the first part of the talk, we introduce a numerical framework for verifying the worst-case performance of first-order methods in parametric quadratic optimization. We formulate the verification problem as a mixed-integer linear program where the objective is to maximize the infinity norm of the fixed-point residual after a given number of iterations. Our approach captures a wide range of gradient, projection, and proximal iterations through affine or piecewise affine constraints. We derive tight polyhedral convex hull formulations of the constraints representing the algorithm iterations. To improve scalability, we develop a custom bound tightening technique combining interval propagation, operator theory, and optimization-based bound tightening. Numerical examples show that our method closely matches the true worst-case performance, yielding worst-case fixed-point residuals several orders of magnitude smaller than those from standard convergence analyses. In the second part of the talk, we present a data-driven approach to analyze the performance of first-order methods using statistical learning theory. We build generalization guarantees for classical optimizers, using sample convergence bounds, and for learned optimizers, using the Probably Approximately Correct (PAC)-Bayes framework. We then use this framework to learn accelerated first-order methods by minimizing the PAC-Bayes bound directly over the key parameters of the algorithms (e.g., gradient steps and warm-starts). Numerical experiments show that our approach provides strong generalization guarantees for both classical and learned optimizers, with statistical bounds that are very close to the true out-of-sample performance.
Pareto sensitivity, most-changing sub-fronts, and optimal knee solutions
To receive the link to attend the seminar, please register for the mailing list here.
Speaker: Prof Luis Nunes Vicente
Timothy J. Wilmott Endowed Chair Professor and Department Chair
Department of Industrial and Systems Engineering
Lehigh University
USA
Abstract:
When dealing with a multi-objective optimization problem, obtaining a comprehensive representation of the Pareto front can be computationally expensive. Furthermore, identifying the most representative Pareto solutions can be difficult and sometimes ambiguous. A popular choice is the so-called Pareto knee solutions, where a small improvement in any objective leads to a large deterioration in at least one other objective. In this talk, using Pareto sensitivity, we show how to compute Pareto knee solutions according to their verbal definition of least maximal change. We refer to the resulting approach as the sensitivity knee (snee) approach, and we apply it to unconstrained and constrained problems. Pareto sensitivity can also be used to compute the most-changing Pareto sub-fronts around a Pareto solution, where the points are distributed along directions of maximum change, which could be of interest in a decision-making process if one is willing to explore solutions around a current one. Our approach is still restricted to scalarization methods, in particular the weighted-sum and epsilon-constraint methods, and requires the computation or approximation of first- and second-order derivatives. We include numerical results from synthetic problems and multi-task learning instances that illustrate the benefits of our approach.
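For intuition, the sketch below traces the Pareto front of a toy bi-objective problem by a weighted-sum sweep and picks a knee with a common geometric proxy (largest distance to the line through the front's extremes); this is a stand-in illustration, not the sensitivity-based snee approach of the talk.

```python
# Weighted-sum sweep of a toy bi-objective problem plus a geometric knee proxy.
import numpy as np
from scipy.optimize import minimize_scalar

f1 = lambda x: x**2
f2 = lambda x: (x - 2.0)**2

front = []
for w in np.linspace(0.01, 0.99, 101):
    res = minimize_scalar(lambda x: w * f1(x) + (1 - w) * f2(x))
    front.append((f1(res.x), f2(res.x)))
F = np.array(front)

a, b = F[0], F[-1]                        # extreme points of the traced front
d = b - a
# Perpendicular distance of each front point to the line through the extremes.
dist = np.abs(d[0] * (F[:, 1] - a[1]) - d[1] * (F[:, 0] - a[0])) / np.linalg.norm(d)
knee = F[np.argmax(dist)]
print("knee point in objective space:", knee)
```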
Concepts from cooperative game theory for feature selection
To receive the link to attend the seminar, please register for the mailing list here.
Speaker: Prof Marleen Balvert
Associate Professor of Operations Research & Machine Learning
Zero Hunger Lab, Tilburg University
The Netherlands
Abstract:
In classification problems, one is often interested in finding the features that determine the class of a sample. One approach is to first train a classification model – such as a random forest or a support vector machine – and then compute feature importance scores using, e.g., SHAP or LIME. SHAP is a popular approach based on the concept of the Shapley value from cooperative game theory. Following the idea of using the Shapley value for feature importance, we look into the applicability of other concepts from cooperative game theory as feature importance scores.
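With a handful of features, the Shapley value can be computed exactly by enumeration. The sketch below does so for a hypothetical game in which the value of a feature coalition is the accuracy of a model trained on it alone, and marks where an alternative concept such as the Banzhaf value would differ; SHAP approximates this computation at scale.

```python
# Exact Shapley values by enumeration for a small feature-coalition game.
from itertools import combinations
from math import factorial

def shapley(features, value):
    n = len(features)
    phi = {}
    for i in features:
        rest = [f for f in features if f != i]
        total = 0.0
        for k in range(n):
            for S in combinations(rest, k):
                w = factorial(k) * factorial(n - k - 1) / factorial(n)
                # Weighted marginal contribution of i to coalition S; the
                # Banzhaf value would instead use the weight 1 / 2**(n - 1).
                total += w * (value(frozenset(S) | {i}) - value(frozenset(S)))
        phi[i] = total
    return phi

# Hypothetical coalition accuracies for three features a, b, c.
acc = {frozenset(): 0.50, frozenset("a"): 0.70, frozenset("b"): 0.65,
       frozenset("c"): 0.55, frozenset("ab"): 0.80, frozenset("ac"): 0.72,
       frozenset("bc"): 0.67, frozenset("abc"): 0.82}
print(shapley("abc", lambda S: acc[S]))
```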
Offline Reinforcement Learning for Combinatorial Optimization: A Data-Driven End-to-End Approach to Job Shop Scheduling
To receive the link to attend the seminar, please register for the mailing list here.
Speaker: Prof Yingqian Zhang
Associate Professor of AI for Decision Making
Department of Industrial Engineering and Innovation Sciences
Eindhoven University of Technology
The Netherlands
Abstract:
The application of (deep) reinforcement learning (RL) to solve combinatorial optimization problems (COPs) has become a very active research field in AI and OR over the past five years. Most existing approaches rely on online RL methods, where an agent learns by interacting with a surrogate environment, typically implemented as a simulator or simply a function. While online RL methods have demonstrated their effectiveness for many COPs, they suffer from two major limitations: sample inefficiency, requiring numerous interactions to learn effective policies, and an inability to leverage existing data, including (near-optimal) solutions generated by well-established algorithms. In contrast, offline (or data-driven) RL learns directly from pre-existing data, offering a promising alternative. In this talk, I will present the first fully end-to-end offline RL method for solving (flexible) job shop scheduling problems (JSSPs). Our approach trains on data generated by applying various algorithms to JSSP instances. We show that, with only 100 training instances, our offline RL approach achieves performance comparable to or better than its online RL counterparts and outperforms the algorithms used to generate the training data.
Linear and nonlinear learning of optimization problems
To receive the link to attend the seminar, please register for the mailing list here.
Speaker: Prof Coralia Cartis
Professor of Numerical Optimization
Mathematical Institute
University of Oxford
United Kingdom
Abstract:
We discuss various ways of improving scalability and performance of optimisation algorithms when special structure is present: from linear embeddings to nonlinear ones, from deterministic to stochastic approaches. Both local and global optimization methods will be addressed.
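As a minimal example of the linear-embedding idea, the sketch below applies a Gaussian random projection (sketching) to an overdetermined least-squares problem; the talk covers far more general settings, including nonlinear embeddings and global optimization.

```python
# Sketched least squares via a Gaussian random embedding (JL-style).
import numpy as np

rng = np.random.default_rng(0)
m, n, s = 10000, 50, 400                  # tall problem; sketch down to s rows
A = rng.normal(size=(m, n))
b = A @ rng.normal(size=n) + 0.01 * rng.normal(size=m)

S = rng.normal(size=(s, m)) / np.sqrt(s)  # random linear embedding
x_full, *_ = np.linalg.lstsq(A, b, rcond=None)      # full-size solve
x_sk, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)  # much smaller solve

print("relative distance to the full solution:",
      np.linalg.norm(x_sk - x_full) / np.linalg.norm(x_full))
```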
Feature selection in linear Support Vector Machines via a hard cardinality constraint: a scalable conic decomposition approach
To receive the link to attend the seminar, please register for the mailing list here.
Speaker: Prof Laura Palagi
Department of Computer, Control and Management Engineering A. Ruberti
Sapienza University of Rome
Italy
Abstract:
In this talk, we present the embedded feature selection problem in linear Support Vector Machines (SVMs), in which an explicit cardinality constraint is employed. The aim is to obtain an interpretable classification model.
The problem is NP-hard due to the presence of the cardinality constraint, even though the original linear SVM amounts to a problem solvable in polynomial time. To handle the hard problem, we introduce two mixed-integer formulations for which novel semidefinite relaxations are proposed. Exploiting the sparsity pattern of the relaxations, we decompose the problems and obtain equivalent relaxations in a much smaller cone, making the conic approaches scalable. We propose heuristics that use information from the optimal solutions of the relaxations. Moreover, an exact procedure is proposed by solving a sequence of mixed-integer decomposed semidefinite optimization problems. Numerical results on classical benchmarking datasets are reported, showing the efficiency and effectiveness of our approach.
Joint work with Immanuel Bomze, Federico D’Onofrio, and Bo Peng.
Deep generative models in stochastic optimization
To receive the link to attend the seminar, please register for the mailing list here.
Abstract:
Many important decisions are taken under uncertainty, since we do not know how various parameters will develop. In particular, the ongoing green transition requires large and urgent societal investments in new energy modes, infrastructure, and technology. These decisions span a very long time horizon, and there is large uncertainty about energy prices, energy demand, and production from renewable sources.
Stochastic optimization methods can help us make better investment decisions. However, the resulting solutions are only as good as the scenarios we use as input.
Recently, deep generative models have been developed that can generate a wealth of realistic scenarios, but current solvers frequently cannot handle more than a handful of scenarios before the model becomes too complex to solve.
In this talk we show how to integrate deep generative models into the solution process in stochastic optimization, so that we obtain both faster solution times and better-founded decisions.
Computational results will be presented for some two-stage optimization problems, showing the benefits of the integrated approach. The studied problems include two-stage facility location problems and two-stage transportation problems.
The research is part of the ERC Advanced Grant project “DECIDE”.
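A minimal sketch of the overall pipeline, assuming an off-the-shelf Gaussian mixture as a stand-in for a deep generative model and a two-stage newsvendor solved by sample average approximation over the generated scenarios; the talk integrates the generative model much more tightly into the solver than this.

```python
# Scenario generation with a fitted generative model, then a two-stage
# newsvendor solved by sample average approximation (SAA).
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Hypothetical historical demand observations (bimodal).
hist = np.concatenate([rng.normal(80, 10, 300), rng.normal(120, 15, 200)])

gen = GaussianMixture(n_components=2, random_state=0).fit(hist.reshape(-1, 1))
scenarios, _ = gen.sample(1000)            # generated demand scenarios
scenarios = scenarios.ravel()

cost, price, salvage = 4.0, 10.0, 1.0
def expected_profit(q):
    # Second stage: sales and salvage realized per scenario, averaged (SAA).
    sold = np.minimum(q, scenarios)
    return np.mean(price * sold + salvage * (q - sold) - cost * q)

qs = np.arange(60, 160)
q_star = qs[np.argmax([expected_profit(q) for q in qs])]
print("order quantity:", q_star)
```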
TBA
To receive the link to attend the seminar, please register for the mailing list here.
Abstract:
TBA
Gurobi Machine Learning
To receive the link to attend the seminar, please register for the mailing list here.
Abstract:
Gurobi Machine Learning is an open-source Python package for formulating trained regression models within a gurobipy model that can then be solved with the Gurobi solver. We will present this package with a focus on trained neural networks (NNs). Representing such networks as constraints in a MIP leads to large models. Optimization-based bound tightening (OBBT) can be used to improve performance when solving such models. We will discuss how Gurobi has been extended to perform OBBT tailored to NN models and present performance results.
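A short usage sketch of the package, assuming the gurobi-ml entry point add_predictor_constr, a scikit-learn neural network, and a working Gurobi installation; names and signatures may differ across versions, so treat this as illustrative rather than definitive.

```python
# Embed a trained scikit-learn NN in a Gurobi MIP via gurobi-ml (sketch).
import numpy as np
import gurobipy as gp
from gurobi_ml import add_predictor_constr
from sklearn.neural_network import MLPRegressor

# Train a small ReLU network offline on toy data.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))
y = X[:, 0] ** 2 + X[:, 1]
nn = MLPRegressor(hidden_layer_sizes=(10,), max_iter=2000).fit(X, y)

# Embed the trained NN and maximize its output over a box.
m = gp.Model()
x = m.addMVar(2, lb=-1.0, ub=1.0, name="x")
out = m.addMVar(1, lb=-gp.GRB.INFINITY, name="out")
add_predictor_constr(m, nn, x, out)   # formulates the NN as MIP constraints
m.setObjective(out.sum(), gp.GRB.MAXIMIZE)
m.optimize()
print(x.X, out.X)
```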
Learning Fair and Robust Support Vector Machine Models
To receive the link to attend the seminar, please register for the mailing list here.
Speaker: Prof Francesca Maggioni
Professor of Operations Research
Department of Management, Information and Production Engineering
University of Bergamo
Italy
Abstract:
In this talk, we present new optimization models for Support Vector Machine (SVM) and Twin Parametric Margin Support Vector Machine (TPMSVM), with the aim of separating data points into two or more classes. To address the problem of data perturbations and protect the model against uncertainty, we construct bounded-by-norm uncertainty sets around each training data point and apply robust and distributionally robust optimization techniques. We derive the robust counterpart extension of the deterministic approaches, providing computationally tractable reformulations. Closed-form expressions for the bounds of the uncertainty sets in the feature space have been formulated for typically used kernel functions.
To improve the fairness of traditional SVM approaches, we further propose a new optimization model which incorporates fairness constraints. These constraints aim to reduce bias in the model’s decision-making process, particularly with respect to sensitive attributes. The fairness-aware models are then evaluated against traditional SVM formulations and other fairness-enhanced approaches found in the literature.
Extensive numerical results on real-world datasets show the benefits of the robust and fair approaches over conventional SVM alternatives.
Sum of squares submodularity
To receive the link to attend the seminar, please register for the mailing list here.
Abstract:
Submodularity is a key property of set (or pseudo-Boolean) functions, appearing in many different areas such as game theory, machine learning, and optimization. It is known that any set function on n elements can be represented as a polynomial of degree at most n, and it is also known that testing whether a set function of degree greater than or equal to 4 is submodular is NP-hard. In this talk, we therefore introduce a sufficient condition for submodularity based on sum of squares polynomials, which we refer to as sum of squares (sos) submodularity. We show that this condition can be efficiently checked via semidefinite programming and investigate the gap between submodularity and sos submodularity. We also propose applications of this new concept to three areas: (i) learning a submodular function from data, (ii) difference of submodular function optimization, and (iii) approximately submodular functions.
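To see what is being certified, the brute-force check below tests submodularity directly from its definition; this enumeration is exactly what becomes intractable as n grows, and what the sos condition replaces with a polynomially sized semidefinite program.

```python
# Brute-force submodularity check from the definition:
# f(S + i) + f(S + j) >= f(S + i + j) + f(S) for all S and distinct i, j not in S.
from itertools import combinations

def is_submodular(n, f):
    ground = range(n)
    for k in range(n + 1):
        for S in map(set, combinations(ground, k)):
            for i in ground:
                for j in ground:
                    if i in S or j in S or i == j:
                        continue
                    if f(S | {i}) + f(S | {j}) < f(S | {i, j}) + f(S):
                        return False
    return True

# Coverage functions are submodular; |S|^2 is not.
sets = [{0, 1}, {1, 2}, {2, 3}]
cover = lambda S: len(set().union(*[sets[i] for i in S]))
print(is_submodular(3, cover))               # True
print(is_submodular(4, lambda S: len(S)**2)) # False
```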