Schedule

All stationary meetings take place in the building of the Faculty of Mathematics and Information Science, Warsaw University of Technology, Koszykowa 75 Street, 00-662 Warszawa, Poland. Please note that the meetings may take longer than planned. The actual duration depends on the length of the discussions.

The meetings already confirmed for this semester are as follows:

Date Time Room Speaker Subject Abstract
2025-11-06 (Thursday) 15:10-16:00 105 Przemysław Wrona Enabling Semi-Supervised Travel Mode Prediction Through Synthetic Unlabelled Trip Instances
Abstract:

Predicting travel mode choices is crucial, yet the development of classification models is constrained by two factors: limited volumes of real trip data and concept drift. Drift occurs as public transport services change (real drift) and traveler preferences evolve (virtual drift). To overcome the challenge of data scarcity and system evolution, we propose a novel method for generating synthetic unlabelled trip examples. These examples, which include estimated features for trip alternatives, can be injected into data streams at customizable volumes and periods. This allows us to study the optimal way to mix real and synthetic data for semi-supervised learning. Our evaluation across five real urban prediction tasks demonstrates that using synthetic instances significantly improves mode choice model accuracy.

2025-11-12 (Wednesday) 15:15-16:00 online Mieszko Mirgos The use of monotonicity constraints in decision trees for data streams
Abstract:

The growing importance of streaming data calls for an increase in use of streaming machine learning methods, such as Hoeffding trees, that can dynamically respond to the rapid changes that can occur in this environment. However, the use of machine learning methods in high stakes environments requires increased use of prior domain knowledge to maximize accuracy and predictability of decisions. In this context we combine streaming machine learning methods for binary classification tasks with monotonicity constraints. Our modification to Hoeffding trees allows controlled promotion of monotonicity constraints while maintaining the benefits of learning from streaming data. The experiments conducted show that monotonicity constrained Hoeffding trees performed on par with or better than their unconstrained versions.

2025-11-27 (Thursday) 15:10-16:00 105 Paweł Dąbrowski-Tumański Mathematical entanglement - theory, implementation, and applications to live sciences
Abstract:

Knot theory evolved from an attempt to describe natural objects, such as entanglements in the exact language of mathematics. However, it quickly ceased to be merely a curiosity and became a serious mathematical discipline, with deep connections to physics (particularly statistical and quantum mechanics) and explaining some biological processes. During the seminar, I will discuss the history of knot theory, its mathematical foundations, and the invariants of knot theory, which have been implemented as working algorithms. I will discuss both the current state of art and possible directions for the field's development from a computer science perspective. Finally, I will discuss the application of algorithms to solving chemical and biological problems, as well as the connections between knot theory and physics, or the extensions of knot theory to more complex spatial objects.

2025-12-11 (Thursday) 15:10-16:00 105 Łukasz Brzozowski The Price-Pareto Citation Model With Global Community Structure
Abstract:

Citation networks play a central role in bibliometrics and the science of science, yet existing models often neglect the influence of community structure – an essential feature in many real-world networks. While communities are known to shape the evolution of social and biological networks, their impact on citation dynamics remains underexplored. In our work, we generalize and extend an existing citation model – the 3DSI model – by introducing community-aware growth. Each node is assigned to a known community (e.g., a scientific discipline), and the model accounts for intra- and inter-community citation dynamics. The 3DSI model in its basic formulation assumes that the citations are governed not only by the preferential attachment, but also by an accidental one; the same assumptions in a new setting enable us to derive new analytical formulas for many network parameters, such as degree distributions within individual communities and estimates of the ratio between preferential and accidental citations. We validate our model on the DBLP citation network, showing that incorporating community structure significantly improves model fit. Moreover, it provides new insights into citation behaviors across disciplines by separately modeling community-specific degree sequences. Our framework enhances the interpretability of citation networks and offers a flexible foundation for studying their structure and growth.

2026-01-22 (Thursday) 15:10-16:00 105 Szymon Zyguła Viability of Parallelization of Saturation and Extraction Algorithms for E-Graphs
Abstract:

An equivalence graph (e-graph) is a data structure used in rewriting systems for representing expressions from some language, e.g., a programming language. Unlike structures traditionally utilized for representing expressions, e-graphs can represent multiple equivalent expressions at the same time. An algorithm called equality saturation expands an e-graph representing a single expression to an e-graph representing all expressions equivalent to the starting one in accordance with a set of rewrite rules. The rewrite rules are applied to the e-graph until no rule can be applied anymore. Then, the extraction algorithm finds the optimal expression in the e-graph using a given cost function. Although this approach to rewriting is quite flexible and easy to implement, its running time in many cases is prohibitively large. Despite this, no literature gives any attempts at parallelization of the saturation and extraction algorithms. This is a well-known problem in the e-graphs community. We evaluate the effectiveness of parallelizing equality saturation and extraction, demonstrating significant speedups in matching and extraction, while highlighting fundamental limitations in the application phase. Our tests are based on egg, one of the most popular general purpose e-graphs libraries. The library splits the saturation algorithm into three phases: matching, applying, and rebuilding. We develop and evaluate performant parallel algorithms for matching and extraction. We also show that a successful parallelization of the application stage is not possible with egg's current approach to implementing e-graphs. In addition, we develop a parallel Union-Find data structure for use in our algorithms, which allows for, among other things, parallel path compression. Its general design allows for usage in other algorithms, unrelated to e-graphs. The structure leverages atomic variables with relaxed memory ordering, which provide overhead-free synchronization on many platforms, including the popular x86 processor architecture.