GAPinDNNs Seminar

We organize the GAPinDNNs seminar at the Department of Mathematical Sciences at Chalmers and the University of Gothenburg.

The topics of the seminar are broad and lie at the intersection of machine learning (in particular deep learning), pure mathematics, and theoretical physics. Speakers range from the more theoretical to the more applied. If you would like to receive invitations to upcoming talks, please let the seminar organizers know and we will add you to our email list.

Seminars are usually held in person only, on Thursdays at 10:30. After the seminar we go for lunch with the speaker, and there is a speaker dinner on the evening before or after the talk.

Current seminar organizers are Jan Gerken and Max Guillen.

Talks

Equivariant Manifold Neural ODEs and Differential Invariants
Emma Andersdotter Svensson (Umeå University)
30 May 2024 10:30
MVL-14

Neural ODEs are neural network models in which the network is not specified by a discrete sequence of hidden layers. Instead, it is defined by a vector field describing how the data evolves continuously over time, governed by an ordinary differential equation (ODE). These models can be generalized to data living on non-Euclidean manifolds, a concept known as manifold neural ODEs. In our paper, we develop a geometric framework for equivariant manifold neural ODEs. Our work includes a novel formulation of equivariant neural ODEs in terms of differential invariants, based on Lie theory for symmetries of differential equations. We also construct augmented manifold neural ODEs and show that they are universal approximators of equivariant diffeomorphisms on any path-connected manifold.
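
For readers unfamiliar with the basic construction, a minimal (Euclidean, non-equivariant) neural ODE can be sketched as follows; the vector field and the fixed-step Euler integrator below are illustrative choices, not the framework from the paper.

```python
import torch
import torch.nn as nn

class VectorField(nn.Module):
    """Learned vector field f_theta(t, x) defining dx/dt."""
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim + 1, hidden), nn.Tanh(), nn.Linear(hidden, dim)
        )

    def forward(self, t, x):
        # Append the time t to every state so the field can be time-dependent.
        t_col = torch.full((x.shape[0], 1), t)
        return self.net(torch.cat([x, t_col], dim=1))

def neural_ode_forward(field, x0, t0=0.0, t1=1.0, steps=100):
    """Integrate dx/dt = f_theta(t, x) from t0 to t1 with explicit Euler steps."""
    x, t = x0, t0
    dt = (t1 - t0) / steps
    for _ in range(steps):
        x = x + dt * field(t, x)
        t += dt
    return x

field = VectorField(dim=2)
x0 = torch.randn(8, 2)              # a batch of 8 two-dimensional inputs
x1 = neural_ode_forward(field, x0)  # the "output" is the state at time t1
```

In practice the Euler loop is replaced by an adaptive ODE solver and the gradient is computed with the adjoint method, but the structure of the model is the same.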

Combinatorics and Geometry of Complex Causal Networks
Liam Solus (KTH)
23 May 2024 10:30
MVL-14

The field of causality has recently emerged as a subject of interest in machine learning, largely due to major advances in data collection methods in the biological sciences and tech industries, where large-scale observational and experimental data sets can now be efficiently and ethically obtained. The modern approach to causality decomposes the inference process into two fundamental problems: the inference of causal relations between variables in a complex system, and the estimation of the causal effect of one variable on another given that such a relation exists. The subject of this talk will be the former of the two problems, commonly called causal discovery, where the aim is to learn a complex causal network from the available data. We will give a soft introduction to the basics of causal modeling and causal discovery, highlighting where combinatorics and geometry have already started to contribute. Going deeper, we will analyze how and when geometry and combinatorics help us identify causal structure without the use of experimental data.

Geometric Deep Learning Using Spherical Neurons
Mårten Wadenbäck (Linköping University)
16 May 2024 10:30
MVL-14

We start from geometric first principles to construct a machine learning framework for 3D point set analysis. We argue that spherical decision surfaces are a natural choice for this type of problem, and we represent them using a non-linear embedding of 3D Euclidean space into a Minkowski space, realized as a 5D Euclidean space. Via classification experiments on a 3D Tetris dataset, we show that we can get a geometric handle on the network weights, allowing us to apply transformations directly to the network. The model is further extended into a steerable filter bank, facilitating classification in arbitrary poses. Additionally, we study equivariance and invariance properties with respect to \(O(3)\) transformations.
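
To illustrate the underlying idea in a simplified form (the precise embedding and normalization used in the talk may differ): after a fixed non-linear embedding of 3D points into a 5D space, a spherical decision surface becomes a single scalar product, i.e., an ordinary linear neuron in 5D.

```python
import numpy as np

def embed_point(x):
    """Fixed non-linear embedding of a 3D point into a 5D space."""
    x = np.asarray(x, dtype=float)
    return np.array([*x, -1.0, -0.5 * np.dot(x, x)])

def embed_sphere(center, radius):
    """Represent the sphere |x - c| = r as a 5D weight vector (one 'neuron')."""
    c = np.asarray(center, dtype=float)
    return np.array([*c, 0.5 * (np.dot(c, c) - radius**2), 1.0])

# The scalar product of the two embeddings is the (scaled) spherical decision
# function:  <X, S> = -0.5 * (|x - c|^2 - r^2),
# positive inside the sphere and negative outside.
x = np.array([1.0, 2.0, 0.5])
c, r = np.array([0.0, 1.0, 0.0]), 2.0
lhs = embed_point(x) @ embed_sphere(c, r)
rhs = -0.5 * (np.sum((x - c) ** 2) - r**2)
assert np.isclose(lhs, rhs)
```

Because the decision surface is now encoded in a single weight vector, rotations of the input space act linearly on the weights, which is what gives the geometric handle on the network mentioned above.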

Equivariant Neural Networks for Biomedical Image Analysis
Karl Bengtsson Bernander
02 May 2024 10:30
MVL-14

In this talk I present an overview of my recently defended PhD thesis, conducted within the WASP program. While artificial intelligence and deep learning have revolutionized many fields in the last decade, one of the key drivers has been access to data. This is especially true in biomedical image analysis, where expert-annotated data is hard to come by. The combination of Convolutional Neural Networks (CNNs) with data augmentation has proven successful in increasing the amount of training data, although overfitting remains an issue. In our research, equivariant neural networks have been used to extend the equivariance properties of CNNs to transformations beyond translations. The networks have been trained and evaluated on biomedical image datasets, including bright-field microscopy images of cytological samples indicating oral cancer, and transmission electron microscopy images of virus samples. By designing the networks to be equivariant to, e.g., rotations, it is shown that the need for data augmentation is reduced, that less overfitting occurs, and that training converges faster. Furthermore, equivariant neural networks are more data-efficient than CNNs, as demonstrated by scaling laws. These benefits are not present in all problem settings, and which of them occur is somewhat unpredictable. We have identified that the results depend to some extent on architectures, hyperparameters, and datasets. Further research could broaden these studies and develop new theory to explain the observed results.
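
As a rough illustration of the kind of layer involved (not the specific architectures studied in the thesis), a "lifting" group convolution for 90-degree rotations can be sketched in plain PyTorch by correlating the input with all four rotated copies of each filter:

```python
import torch
import torch.nn.functional as F

def c4_lifting_conv(x, weight, bias=None):
    """Correlate the input with all four 90-degree rotations of each filter.

    x:      (batch, in_channels, H, W)
    weight: (out_channels, in_channels, k, k)
    Returns a tensor of shape (batch, out_channels, 4, H', W'),
    one feature map per rotation of the filter.
    """
    outs = [
        F.conv2d(x, torch.rot90(weight, k, dims=(-2, -1)), bias=bias)
        for k in range(4)
    ]
    return torch.stack(outs, dim=2)

x = torch.randn(1, 3, 32, 32)
w = torch.randn(8, 3, 3, 3)
y = c4_lifting_conv(x, w)
# Rotating the input by 90 degrees rotates the output feature maps and cyclically
# permutes the four orientation channels; this is the equivariance property that
# removes the need for rotation augmentation.
y_rot = c4_lifting_conv(torch.rot90(x, 1, dims=(-2, -1)), w)
```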

Understanding Linear Convolutional Neural Networks via Sparse Factorizations of Real Polynomials (and Decomposing Linear Group-Equivariant Networks)
Kathlén Kohn (KTH)
04 Apr 2024 10:30
MVL-14

This talk will explain that Convolutional Neural Networks without activation functions parametrize polynomials that admit a certain sparse factorization. For a fixed network architecture, these polynomials form a semialgebraic set. We will investigate how the geometry of this semialgebraic set (e.g., its singularities and relative boundary) changes with the network architecture. Moreover, we will explore how these geometric properties affect the optimization of a loss function for given training data. We prove that for architectures where all strides are larger than one and for generic data, the non-zero critical points of the squared-error loss are smooth interior points of the semialgebraic function space. This property is known to be false for dense linear networks and for linear convolutional networks with stride one. (For linear networks that are equivariant under the action of some group, we prove that no single network architecture can parametrize the whole space of linear equivariant functions, but that finitely many architectures together exhaust it.) This talk is based on joint work with Joan Bruna, Guido Montúfar, Anna-Laura Sattelberger, Vahid Shahverdi, and Matthew Trager.
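
A quick way to see the polynomial picture, for 1D filters with stride one and no activation, is that stacking linear convolutional layers multiplies the polynomials whose coefficients are the filter weights, so the end-to-end map is a polynomial that factors according to the architecture. A minimal numerical check (illustrative only, and for stride one rather than the larger strides discussed in the talk):

```python
import numpy as np

# Two 1D filters, read as coefficient vectors of polynomials
# p(t) = 1 + 2t - t^2 and q(t) = 3 - t.
p = np.array([1.0, 2.0, -1.0])
q = np.array([3.0, -1.0])

# Composing the two (stride-one, linear) convolutional layers is itself a
# convolution whose filter is the coefficient vector of the product p*q.
end_to_end = np.convolve(p, q)
print(end_to_end)  # [ 3.  5. -5.  1.]  ->  3 + 5t - 5t^2 + t^3

# Applying the two layers to a signal x agrees with the single composed layer.
x = np.random.randn(10)
two_layers = np.convolve(np.convolve(x, p), q)
one_layer = np.convolve(x, end_to_end)
assert np.allclose(two_layers, one_layer)
```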