Research funding

SolidTree (2026-2031)

NWO KIC (Nature-inclusive decision making in business) project Soil life integrated tree management (SolidTree). PI, WP co-lead (WP on data science and AI). Amount funded to the consortium: 1.82 mil. + cofunding. Coordinator: HAS Green Academy. 5 academic PIs (in soil science, biology, bioinformatics, data science, governance) and many industrial partners, all in NL.

DECIDE (2026-2031)

NWA ORC (Research along Routes by Consortia) project Democratizing AI, Empowering Citizens through Transparent Decision-making (DECIDE). Co-PI, WP co-lead (WP on AI for public mobility). Amount funded to the consortium: 6.83 mil. Coordinator: UTwente. Many co-PIs, citizen representatives, and companies. Diverse and inclusive project focused on AI ethics in NL.

RealCare (2025-2030)

EC HORIZON project Real-time biomarker detection systems for rapid medical decision-making in cancer and cardiac diseases (RealCare). AI Ethics Advisor for the consortium, WP lead (WP on AI Ethics). Funded up to 80 k based on WP workload. Coordinator: UTwente BIOS. Large biomedical project, with PIs in 7 EU countries.

SoilProS (2023-2028)

NWO Perspectief project Soil biodiversity analysis for sustainable production systems (SoilProS). PI, WP co-lead (WP on data science and AI). Amount funded to the consortium: 3.27 mil. + cofunding. Coordinator: Terrestrial Ecology at NIOO-KNAW. 7 academic PIs (in different sciences: biology, chemistry, bioinformatics, data science) and many industrial partners, all in NL. [News]

FAIR Data Fund (2022)

Dutch 4TU FAIR Data Fund to curate a dataset of constellation line figures from many world astronomies. PI. Amount funded: 3.5 k. [News]

Ecological networks: from co-occurrence to functional models

In the SoilProS NWO Perspectief project we learn functional (or: causal) ecological models among soil microbiota (species), abiotics (physico-chemical properties), management actions, and health measurements. This would allow us to steer the ecosystem, to best restore and stabilise the biology of the soil. The input is observational and interventional data, plus domain knowledge. The problem is challenging, because (1) the organisms are microscopic, so their interactions are unobservable, and (2) there are thousands of taxa, so we must first reduce dimensionality by learning functional groups of taxa.

In EleMi (CompleNet 2024, authors' accepted version), we present a method (still correlational) to infer spatial co-occurrence networks of microorganisms. EleMi does multi-regression with shared parameters. It is more robust and provides clearer community structure than existing methods. In gFlora (BIOKDD 2024 or IEEE Tr. Comp. Bio. Bioinf. 2025) we learn functional co-response groups: groups of taxa whose total co-response effect associates well with a soil function. The novelty is in using the spatial co-occurrence network of taxa, plus the abundance of the biota, such that sparse but important taxa are also considered. In Dragon (Ecological Informatics 2025) we obtain causal models by causal discovery purely from observational data, with statistical confidence on the causal links. (Image: a causal model for soil biota.) To validate how far from reality the co-occurrence networks of biota are (as they are obtained from soil samples), we are also developing a spatial soil simulator, which shows the patterning of microscopic species underground (in progress).

A causal model for soil biota

Social networks: influence and network location

Among the many methods for influence maximisation, we were among the first to propose metaheuristics based on evolutionary algorithms: single-objective (2016), then multi-objective (2017, best paper award). This was followed by more efficient fitness functions and genetic operators, and a method using the downscaling of communities (2022), which drastically scales up the case studies if the network is modular. These metaheuristics lack explainability of the results: it is not clear why a certain network node has good spreading ability, by itself or in a group. Some explanations could be found for single influencers by linking models of influence diffusion with network statistics: one's influence can be predicted well by combinations of node centrality metrics, first in small synthetic networks (2020), then also in large empirical networks (2020), and there is some common pattern of influential network positions across networks. I gave a keynote on this topic at Parallel Problem Solving from Nature (PPSN 2022, slides). (Image: network of Facebook pages with top influencers marked.)

Network of Facebook pages with top influencers marked
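To illustrate the link between network position and spreading ability, here is a minimal sketch; it is not the code from the papers above. It assumes the independent cascade model as the diffusion model and a hypothetical toy star graph. The function names, the activation probability `p`, and the number of runs are illustrative choices only.

```python
import random


def independent_cascade(neighbours, seeds, p, rng):
    """One run of the independent cascade model; returns the set of activated nodes."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        new_frontier = []
        for node in frontier:
            for nb in neighbours.get(node, ()):
                # A newly active node gets one chance to activate each inactive neighbour.
                if nb not in active and rng.random() < p:
                    active.add(nb)
                    new_frontier.append(nb)
        frontier = new_frontier
    return active


def expected_spread(neighbours, seeds, p, runs=200, seed=0):
    """Monte Carlo estimate of the mean number of nodes activated from `seeds`."""
    rng = random.Random(seed)
    total = sum(len(independent_cascade(neighbours, seeds, p, rng)) for _ in range(runs))
    return total / runs


# Hypothetical toy network: a star with hub 0 and leaves 1..5 (undirected).
star = {0: [1, 2, 3, 4, 5], 1: [0], 2: [0], 3: [0], 4: [0], 5: [0]}

hub_spread = expected_spread(star, seeds=[0], p=0.3)
leaf_spread = expected_spread(star, seeds=[1], p=0.3)
print("hub:", hub_spread, "leaf:", leaf_spread)
```

In this toy case the high-degree hub spreads further on average than a leaf, which is the kind of relation between centrality and influence described above. An estimate such as `expected_spread` could also serve as the fitness of a candidate seed set in an evolutionary search.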

Information networks: predictive models of human and machine behaviour

Books form networks by their readers' co-buying habits. This provides information about readers: they are expected to prefer authors of their own gender, but how large is the bias, and with what consequences? In Gender homophily in online book networks (Information Sciences, 2019), I find that author gender assortativity reaches 0.50: gender segregation is present but not uniform, and it is stronger in certain genres. Since female authors are a minority (33% of all authors), readers (likely female) with a positive bias to female authors end up reading equally from both genders; readers with a bias against female authors end up reading a median of only 10-11% female authors. I gave a keynote on such intangible information networks at Network Traffic Measurement and Analysis (TMA 2022, slides). (Image: A community of books on sale on Amazon.com.)
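For readers unfamiliar with assortativity, the following is a minimal sketch of how such a coefficient can be computed for a categorical node attribute. It assumes Newman's assortativity coefficient on an undirected graph. The toy co-buying edges and gender labels are hypothetical and are not data from the paper.

```python
from collections import Counter


def attribute_assortativity(edges, attribute):
    """Newman's assortativity coefficient for a categorical node attribute.

    `edges` is a list of undirected (u, v) pairs; `attribute` maps node -> category.
    Assumes at least two categories are present (otherwise the denominator is zero).
    """
    mixing = Counter()
    for u, v in edges:
        # Count each undirected edge in both directions so the mixing matrix is symmetric.
        mixing[(attribute[u], attribute[v])] += 1
        mixing[(attribute[v], attribute[u])] += 1
    total = sum(mixing.values())
    categories = {c for pair in mixing for c in pair}
    # Fraction of edge ends that connect two nodes of the same category.
    same = sum(mixing[(c, c)] for c in categories) / total
    # Fraction of edge ends attached to each category.
    ends = {c: sum(mixing[(c, d)] for d in categories) / total for c in categories}
    # Same-category fraction expected if edge ends were paired at random.
    expected = sum(share ** 2 for share in ends.values())
    return (same - expected) / (1 - expected)


# Hypothetical toy network of four books, labelled by author gender.
gender = {"book1": "F", "book2": "F", "book3": "M", "book4": "M"}
co_bought = [("book1", "book2"), ("book3", "book4"), ("book2", "book3")]
r = attribute_assortativity(co_bought, gender)
print(r)
```

A value of 1 means books are only co-bought with books by authors of the same gender, 0 means no gender preference beyond chance, and negative values mean cross-gender co-buying dominates.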

In Learning the mechanisms of network growth (Scientific Reports, 2024) we learn which models of network growth (combinations of preferential attachment, fitness, aging) fit real-world citation networks best. We find that the growth models themselves are easy to discriminate from observed dynamics, but the diagnosis of real-world citation networks is inconclusive, so citation networks are not accurately described by any of these typical models.
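As a rough illustration of what a growth model combining these three mechanisms can look like, here is a minimal sketch. It is not the model family or code from the paper. The functional form is an assumption: attachment weight = (in-degree + 1) × fitness × exponential decay with age. All names and parameters (`m`, `use_fitness`, `aging_tau`) are hypothetical.

```python
import math
import random


def grow_network(n_nodes, m=1, use_fitness=False, aging_tau=None, seed=0):
    """Grow a citation-like network: each new node cites up to `m` older nodes.

    Assumed attachment weight of an older node:
        (in_degree + 1) * fitness * exp(-age / aging_tau)
    where the fitness and aging factors are switched on only if requested.
    """
    rng = random.Random(seed)

    def draw_fitness():
        return rng.uniform(0.1, 1.0) if use_fitness else 1.0

    in_degree = [0]
    fitness = [draw_fitness()]
    edges = []
    for new in range(1, n_nodes):
        weights = []
        for old in range(new):
            w = (in_degree[old] + 1) * fitness[old]  # preferential attachment and fitness
            if aging_tau is not None:
                w *= math.exp(-(new - old) / aging_tau)  # older nodes attract fewer links
            weights.append(w)
        # Sample distinct targets among the older nodes, proportionally to their weights.
        targets = set()
        while len(targets) < min(m, new):
            targets.add(rng.choices(range(new), weights=weights)[0])
        for t in sorted(targets):
            edges.append((new, t))
            in_degree[t] += 1
        in_degree.append(0)
        fitness.append(draw_fitness())
    return edges, in_degree


edges_pa, deg_pa = grow_network(50)                                      # preferential attachment only
edges_all, deg_all = grow_network(50, use_fitness=True, aging_tau=10.0)  # all three mechanisms
print(max(deg_pa), max(deg_all))
```

Generating networks under different switches and then asking which switch setting best explains an observed network is the general shape of the model-selection question described above.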

In Understanding Sparse Neural Networks from their topology via multipartite graph representations (Tr. Machine Learning Research, 2024) we do a topological analysis of SNNs with both linear and convolutional layers, with (i) a new input-aware Multipartite Graph Encoding (MGE), and (ii) new end-to-end topological metrics over the MGE. We show that these topological metrics are much better predictors of the accuracy drop than metrics computed from current input-agnostic single-layer encodings, and that which topological metrics are important varies at different sparsity levels and for different architectures.

I started modelling human-played games with piece captures from an ecological point of view. Here's a summary of empirical chess food webs (poster, CCS'24).

Community of comic books, linked by co-buying relationships

Constellation line figures: network structure, semantics, and geometry

Star constellations may be represented as line figures (spatial graphs in spherical coordinates). I digitised a dataset of constellation line figures (GitHub, .json format in progress) from scholarly literature, extending the sky cultures of the astronomical software Stellarium to 1900+ constellations from 75 cultures (from tribes to empires). Part 1 of the analysis measures the association between the type of culture and the network structure of constellations: The network signature of constellation line figures (PLOS ONE 2022, or ArXiv). This shows that the constellations cluster globally by network typology, as do the cultures. There is great diversity among the topologies drawn around the same root star, with only a minority being universal (those characterised by linear star patterns). Part 2 looks at the association between the semantics (symbolism) of constellations and the star pattern: Parallels in the symbolism of star constellations (ArXiv). This shows where in the world semantic similarities occur in the same regions of the sky, or over the same types of star clusters, finding many more semantic parallels than previously documented; I hypothesise which are natural effects of the star pattern, and which are likely cultural effects. Part 3, on the association between star pattern and line geometry, is work in progress. I gave a talk at The Artificial Sky meeting on data in cultural astronomy (slides, 2023). (Image: the reconstructed Golden Feline of the Inca in South America.)

The constellation Golden Feline of the Inca in South America
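To make the idea of a line figure as a spatial graph in spherical coordinates concrete, here is a minimal sketch. It assumes stars are given as (right ascension, declination) pairs in degrees and lines as pairs of star names. The data layout, the function names, and the toy coordinates are hypothetical and do not reflect the format of the dataset.

```python
import math


def angular_separation_deg(star_a, star_b):
    """Great-circle separation in degrees between two (ra, dec) positions in degrees,
    computed with the haversine formula."""
    ra1, dec1 = (math.radians(x) for x in star_a)
    ra2, dec2 = (math.radians(x) for x in star_b)
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    # Clamp to 1 to guard against rounding slightly above the valid range of asin.
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def figure_summary(stars, lines):
    """Return the degree of each star and the total angular length of the figure."""
    degree = {name: 0 for name in stars}
    total_length = 0.0
    for a, b in lines:
        degree[a] += 1
        degree[b] += 1
        total_length += angular_separation_deg(stars[a], stars[b])
    return degree, total_length


# Hypothetical three-star figure (not a real constellation).
toy_stars = {"a": (0.0, 0.0), "b": (90.0, 0.0), "c": (0.0, 90.0)}
toy_lines = [("a", "b"), ("b", "c")]
toy_degree, toy_length = figure_summary(toy_stars, toy_lines)
print(toy_degree, toy_length)
```

Network statistics such as the degree sequence describe the topology of a figure, while angular lengths describe its geometry on the sky; the two can be studied separately.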