GAPinDNNs Seminar

We organize the GAPinDNNs seminar at the Department of Mathematical Sciences at Chalmers and the University of Gothenburg.

The topics of the seminar are broad and lie at the intersection of machine learning (in particular deep learning), pure mathematics, and theoretical physics. Speakers range from the more theoretical to the more applied. If you would like to receive invitations to upcoming talks, please let the seminar organizers know and we will add you to our email list.

Seminars usually take place in person only, on Mondays at 13:15. After the seminar, we go for lunch with the speaker, and there is a speaker dinner in the evening before or after the talk.

Current seminar organizers are Jan Gerken and Max Guillen.

Talks

Using geometry and domain knowledge for improved interpretation of deep learning models
Aasa Feragen (Technical University of Denmark)
29 Oct 2024 10:30
MVL-22

Visualization and uncertainty quantification are often used to support our interpretation of deep learning models. In this talk, we show through examples how both visualization and uncertainty quantification can lead to misinterpretation if applied naïvely. Our examples will include equivariant neural networks for graphs and images, as well as uncertainty quantification with structured label variation.

Equivariant and Coordinate Independent Convolutional Networks
Maurice Weiler (University of Amsterdam)
28 Oct 2024 10:30
MVL-22

Equivariance imposes symmetry constraints on the connectivity of neural networks. This talk investigates the case of equivariant networks for fields of feature vectors on Euclidean spaces or other Riemannian manifolds. Equivariance is shown to lead to two requirements: 1) spatial (convolutional) weight sharing, and 2) symmetry constraints on the shared weights themselves. We investigate the symmetry constraints imposed on convolution kernels and discuss how they can be solved and implemented. A gauge theoretic formulation of equivariant CNNs shows that these models are not only equivariant under global transformations, but under more general local gauge transformations as well.
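
As a heavily simplified illustration of these two ingredients (a sketch of our own, not code from the talk), the snippet below checks numerically that a circular convolution with a shared kernel commutes with translations, and that additionally imposing a symmetry constraint on the kernel (here, invariance under 90-degree rotations) makes it commute with those rotations as well. The toy kernel and all names are illustrative choices.

```python
# Minimal equivariance check for a scalar feature map, using NumPy/SciPy.
# 1) convolutional weight sharing  -> translation equivariance
# 2) constraint np.rot90(K) == K   -> equivariance under 90-degree rotations
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(0)
x = rng.standard_normal((16, 16))          # toy scalar feature map
K = np.array([[0., 1., 0.],
              [1., 4., 1.],
              [0., 1., 0.]])               # kernel invariant under 90-degree rotation

def conv(f):
    # circular (periodic) convolution, so the group actions below are exact
    return ndimage.convolve(f, K, mode="wrap")

# translation equivariance: conv(shift(x)) == shift(conv(x))
shift = lambda f: np.roll(f, (3, -2), axis=(0, 1))
assert np.allclose(conv(shift(x)), shift(conv(x)))

# rotation equivariance: conv(rot90(x)) == rot90(conv(x)),
# which holds because the kernel satisfies np.rot90(K) == K
assert np.allclose(conv(np.rot90(x)), np.rot90(conv(x)))
print("translation- and rotation-equivariance checks passed")
```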

The Geometry of Neuromanifolds
Giovanni Luca Marchetti (KTH)
14 Oct 2024 13:15
MVL-22

Neural networks parametrize spaces of functions, sometimes referred to as "neuromanifolds". Their geometry is intimately related to fundamental machine learning aspects, such as expressivity, sample complexity, and training dynamics. For polynomial activation functions, neuromanifolds are (semi-)algebraic varieties, enabling the application of tools and ideas from algebraic geometry to deep learning. In this talk, we will first review the general theory of neuromanifolds, and then present our recent results for deep convolutional networks with monomial activations. In this case, we show that the parametrization is finite, birational, and regular, factoring through the Segre-Veronese embedding. Moreover, by appealing to the theory of the generic Euclidean distance degree, we compute the number of critical points of the (complexified) regression objective for a generic large dataset.
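
To make the parametrization concrete, here is a small symbolic sketch (our own illustration, not code from the talk): for a tiny 2-2-1 network with monomial activation t -> t^2, the output is a quadratic polynomial in the inputs, and the map from weights to its coefficients parametrizes the corresponding neuromanifold inside the space of quadratics. The network size and the use of SymPy are assumptions made for the example.

```python
# Symbolic view of a neuromanifold parametrization for a tiny network
# with monomial activation t -> t^2.
import sympy as sp

x1, x2 = sp.symbols("x1 x2")
w = sp.symbols("w1:7")                       # weights of a 2-2-1 network

h1 = (w[0] * x1 + w[1] * x2) ** 2            # hidden units, activation t -> t^2
h2 = (w[2] * x1 + w[3] * x2) ** 2
out = sp.expand(w[4] * h1 + w[5] * h2)       # network output: a quadratic in (x1, x2)

# coefficients of the output polynomial as functions of the weights:
# this is the parametrization map whose image is the neuromanifold
poly = sp.Poly(out, x1, x2)
for monom, coeff in zip(poly.monoms(), poly.coeffs()):
    print(monom, "->", sp.factor(coeff))
```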

Equivariant Manifold Neural ODEs and Differential Invariants
Emma Andersdotter Svensson (Umeå University)
30 May 2024 10:30
MVL-14

Neural ODEs are neural network models in which the network is not specified by a discrete sequence of hidden layers. Instead, the network is defined by a vector field that describes how the data evolves continuously in time, governed by an ordinary differential equation (ODE). These models can be generalized to data living on non-Euclidean manifolds, a concept known as manifold neural ODEs. In our paper, we develop a geometric framework for equivariant manifold neural ODEs. Our work includes a novel formulation of equivariant neural ODEs in terms of differential invariants, based on Lie theory for symmetries of differential equations. We also construct augmented manifold neural ODEs and show that they are universal approximators of equivariant diffeomorphisms on any path-connected manifold.
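
For readers unfamiliar with neural ODEs, the following minimal sketch (Euclidean case only, not code from the paper) shows the basic idea: a small MLP defines the vector field f(t, x), and the forward pass integrates dx/dt = f(t, x) with a fixed-step Runge-Kutta scheme. The architecture, integrator, and all names are illustrative assumptions; the paper's manifold-valued and equivariant constructions go well beyond this.

```python
# Minimal Euclidean neural ODE: the "network" is the vector field f(t, x; theta),
# and the forward pass is numerical integration of dx/dt = f(t, x) from t=0 to t=1.
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((16, 3)) * 0.1, np.zeros(16)
W2, b2 = rng.standard_normal((3, 16)) * 0.1, np.zeros(3)

def f(t, x):
    """Vector field on R^3 parametrized by a tiny MLP."""
    h = np.tanh(W1 @ x + b1)
    return W2 @ h + b2

def odeint_rk4(f, x0, t0=0.0, t1=1.0, steps=20):
    """Fixed-step Runge-Kutta 4 integration of dx/dt = f(t, x)."""
    x, t = np.array(x0, dtype=float), t0
    h = (t1 - t0) / steps
    for _ in range(steps):
        k1 = f(t, x)
        k2 = f(t + h / 2, x + h / 2 * k1)
        k3 = f(t + h / 2, x + h / 2 * k2)
        k4 = f(t + h, x + h * k3)
        x = x + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return x

x0 = np.array([1.0, -0.5, 0.2])        # input point
print("output of the neural ODE:", odeint_rk4(f, x0))
```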

Combinatorics and Geometry of Complex Causal Networks
Liam Solus (KTH)
23 May 2024 10:30
MVL-14

The field of causality has recently emerged as a subject of interest in machine learning, largely due to major advances in data collection methods in the biological sciences and tech industries where large-scale observational and experimental data sets can now be efficiently and ethically obtained. The modern approach to causality decomposes the inference process into two fundamental problems: the inference of causal relations between variables in a complex system and the estimation of the causal effect of one variable on another given that such a relation exists. The subject of this talk will be the former of the two problems, commonly called causal discovery, where the aim is to learn a complex causal network from the available data. We will give a soft introduction to the basics of causal modeling and causal discovery, highlighting where combinatorics and geometry have already started to contribute. Going deeper, we will analyze how and when geometry and combinatorics help us identify causal structure without the use of experimental data.

Geometric Deep Learning Using Spherical Neurons
Mårten Wadenbäck (Linköping University)
16 May 2024 10:30
MVL-14

We start from geometric first principles to construct a machine learning framework for 3D point set analysis. We argue that spherical decision surfaces are a natural choice for this type of problem, and we represent them using a non-linear embedding of 3D Euclidean space into a Minkowski space, realized as a 5D Euclidean space. Via classification experiments on a 3D Tetris dataset, we show that we can get a geometric handle on the network weights, allowing us to directly apply transformations to the network. The model is further extended into a steerable filter bank, facilitating classification in arbitrary poses. Additionally, we study equivariance and invariance properties with respect to \(O(3)\) transformations.
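
The small numerical sketch below illustrates the kind of embedding involved; the exact conventions may differ from those used in the talk, and all function names are our own. A 3D point is mapped non-linearly into 5D so that a sphere with center c and radius r becomes a linear decision surface, i.e. a single neuron with weight vector w(c, r): the 5D inner product recovers the signed quantity (r^2 - ||x - c||^2)/2, which is positive inside the sphere, zero on it, and negative outside.

```python
# Spheres as linear decision surfaces after a non-linear 3D -> 5D embedding
# (one standard conformal-style convention; the talk's convention may differ).
import numpy as np

def embed(x):
    # phi(x) = (x, 1, ||x||^2 / 2) in R^5
    return np.concatenate([x, [1.0, 0.5 * x @ x]])

def sphere_weights(c, r):
    # w(c, r) = (c, (r^2 - ||c||^2)/2, -1), so that
    # phi(x) . w = (r^2 - ||x - c||^2)/2
    return np.concatenate([c, [(r**2 - c @ c) / 2.0, -1.0]])

rng = np.random.default_rng(0)
c, r = np.array([1.0, 2.0, -1.0]), 1.5
w = sphere_weights(c, r)
for x in rng.standard_normal((5, 3)) + c:          # random points near the center
    score = embed(x) @ w
    assert np.isclose(score, (r**2 - np.sum((x - c) ** 2)) / 2.0)
    print(f"||x - c|| = {np.linalg.norm(x - c):.2f}  ->  score = {score:+.2f}")
```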

Equivariant Neural Networks for Biomedical Image Analysis
Karl Bengtsson Bernander
02 May 2024 10:30
MVL-14

In this talk I present an overview of my recently defended PhD thesis, conducted within the WASP program. While artificial intelligence and deep learning have revolutionized many fields in the last decade, one of the key drivers has been access to data. This is especially true in biomedical image analysis, where expert-annotated data is hard to come by. The combination of Convolutional Neural Networks (CNNs) with data augmentation has proven successful in increasing the amount of training data, at the cost of overfitting. In our research, equivariant neural networks have been used to extend the equivariance properties of CNNs to more transformations than translations. The networks have been trained and evaluated on biomedical image datasets, including bright-field microscopy images of cytological samples indicating oral cancer, and transmission electron microscopy images of virus samples. By designing the networks to be equivariant to e.g. rotations, it is shown that the need for data augmentation is reduced, that less overfitting occurs, and that convergence during training is faster. Furthermore, equivariant neural networks are more data efficient than CNNs, as demonstrated by scaling laws. These benefits are not present in all problem settings, and which of them occur is somewhat unpredictable. We have identified that the results depend to some extent on architectures, hyperparameters, and datasets. Further research may broaden these studies and develop new theory to explain how the results arise.

Understanding Linear Convolutional Neural Networks via Sparse Factorizations of Real Polynomials (and Decomposing Linear Group-Equivariant Networks)
Kathlén Kohn (KTH)
04 Apr 2024 10:30
MVL-14

This talk will explain that Convolutional Neural Networks without activation functions parametrize polynomials that admit a certain sparse factorization. For a fixed network architecture, these polynomials form a semialgebraic set. We will investigate how the geometry of this semialgebraic set (e.g., its singularities and relative boundary) changes with the network architecture. Moreover, we will explore how these geometric properties affect the optimization of a loss function for given training data. We prove that for architectures where all strides are larger than one and generic data, the non-zero critical points of the squared-error loss are smooth interior points of the semialgebraic function space. This property is known to be false for dense linear networks or linear convolutional networks with stride one. (For linear networks that are equivariant under the action of some group, we prove that no fixed network architecture can parametrize the whole space of functions, but that finitely many architectures can exhaust the whole space of linear equivariant functions.) This talk is based on joint work with Joan Bruna, Guido Montúfar, Anna-Laura Sattelberger, Vahid Shahverdi, and Matthew Trager.
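
As a small sanity check of the correspondence the talk starts from, restricted to the simplest setting (1D, stride one, no activation; the talk also treats larger strides, where the geometry differs), the sketch below verifies that stacking two linear convolutional layers is again a convolution whose filter is the convolution of the two filters, i.e. the product of the associated polynomials. The code is our own illustration, not from the talk.

```python
# Linear (activation-free) convolutional layers and sparse polynomial factorization.
import numpy as np

rng = np.random.default_rng(0)
w1 = rng.standard_normal(3)        # filter of layer 1  <->  polynomial p1 of degree 2
w2 = rng.standard_normal(4)        # filter of layer 2  <->  polynomial p2 of degree 3
x  = rng.standard_normal(32)       # input signal

def conv(signal, filt):
    # full (non-circular) 1D convolution, matching polynomial multiplication
    return np.convolve(signal, filt, mode="full")

# two stacked linear conv layers ...
two_layers = conv(conv(x, w1), w2)
# ... equal one layer whose filter is the product polynomial p1 * p2
end_to_end = conv(x, np.convolve(w1, w2))
assert np.allclose(two_layers, end_to_end)

# the polynomial picture: coefficients of p1 * p2 are exactly np.convolve(w1, w2)
p1, p2 = np.polynomial.Polynomial(w1), np.polynomial.Polynomial(w2)
assert np.allclose((p1 * p2).coef, np.convolve(w1, w2))
print("stacked linear conv layers = convolution with the product polynomial")
```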