Artificial Intelligence
Machine learning and artificial intelligence (AI) aim to create algorithms that solve difficult problems and simulate complex intelligent behavior. Many of these algorithms are based on findings and theory from the study of the brain and mind.
Recent rapid advances in these fields have seen the creation of algorithms and agents that can—finally—solve complex real-world problems across a wide range of domains. What are these advances, and how can we take them further? What remains beyond their capacity, and how can we overcome that? What might forever lie beyond their capabilities—or will anything?
In this session we will hear from some of the world's leading experts in academia and tech, from proponents of structure and proponents of scale, as well as some radical suggestions for reframing many fundamental problems of intelligence.
Keynote Talks
Dr Feryal Behbahani (Google DeepMind)
Professor Kevin Ellis (Cornell): Doing experiments and acquiring concepts using language and code
Session Chairs
Dr Ishita Dasgupta (Google DeepMind)
Dr Ilia Sucholutsky (Princeton University)
Invited Talks
Professor Najoung Kim (BU, Google): Comparing human and machine inductive biases for compositional linguistic generalization using semantic parsing: Results and methodological challenges
Professor Rafal Bogacz (Oxford): Modelling diverse learning tasks with predictive coding
Dr André Barreto (DeepMind): Generalised policy updates and neuroscience
Dr Wilka Carvalho (Harvard): Predictive representations: building blocks of intelligence
Spotlight Talks
Quentin Ferry (MIT): Emergence and Function of Abstract Representations in Self-Supervised Transformers
Michael Spratling (University of Luxembourg): A margin-based replacement for cross-entropy loss that improves the robustness of deep neural networks on image classification tasks
Luke Eilers (University of Bern): A generalized neural tangent kernel for surrogate gradient learning
Samuel Lippl (Columbia University): The impact of task structure, representational geometry, and learning mechanism on compositional generalization
Anita Keshmirian (Ludwig Maximilian University of Munich): Investigating Causal Judgments in Humans and Large Language Models
Sunayana Rane (Princeton): Can Generative Multimodal Models Count to Ten?
Michael Lepori (Brown): A Mechanistic Analysis of Same-Different Relations in ViTs
Paul Riechers (Beyond Institute for Theoretical Science, BITS): Computational mechanics predicts internal representations of transformers
Aly Lidayan (UC Berkeley): RL Algorithms Are BAMDP Policies: Understanding Exploration, Intrinsic Motivation, and Optimality
Nasir Ahmad (Donders Institute for Brain, Cognition and Behaviour): Correlations are ruining your gradient descent
Motahareh Pourrahimi (McGill; Mila): Human-like Behavior and Neural Representations Emerge in a Neural Network Trained to Search for Natural Objects from Pixels
Pablo Lanillos (Spanish National Research Council): Object-centric reasoning and control from pixels
Chiara Mastrogiuseppe (Universitat Pompeu Fabra): Controlled Maximal Variability Leads to Reliable Performance in Recurrent Neural Networks
Keynote Talks
Cornell University
Doing experiments and acquiring concepts using language and code
Human inductive learning is rapid: From relatively few examples, we can learn the rules of a new game or the norms of a new culture. Inductive learning is also broad: the space of learnable concepts is effectively unbounded, because simpler concepts can compose to build bigger ones. In this talk I propose an inductive learning model whose aim is to be more humanlike in that it is broader-coverage, while supporting learning that is rapid both in the number of examples and the amount of compute required. The model combines language models with Bayesian reasoning and neural code generation. I will also discuss an extension of the model which performs active learning, where the model proposes experiments that best triangulate the target concept. The resulting model approaches human performance on three concept-learning setups, and gives a reasonably close fit to human behavioral data. Together these results suggest an architecture for more human-like inductive learners, which can both learn from examples and also propose basic experiments and ask questions, and which is organized around the approach of equipping language models with explicitly Bayesian machinery.
Joint work with students Top Piriyakulkij and Cassidy Langenfeld.
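To make the proposed architecture concrete, here is a minimal Python sketch of the Bayesian core the abstract describes: a language model proposes candidate concepts as code, and Bayes' rule weighs them against the observed examples. The function names (propose_hypotheses, prior, likelihood) are hypothetical placeholders for illustration, not the speaker's implementation.

import math

def posterior_over_hypotheses(examples, propose_hypotheses, prior, likelihood):
    """Weigh LM-proposed candidate concepts (programs) by Bayes' rule.

    examples           -- observed (input, label) pairs
    propose_hypotheses -- hypothetical: asks a language model for candidate
                          concept programs conditioned on the examples
    prior              -- prior probability of a hypothesis (e.g. an LM score)
    likelihood         -- probability a hypothesis assigns to one example
    """
    hypotheses = propose_hypotheses(examples)
    log_scores = []
    for h in hypotheses:
        log_p = math.log(prior(h))
        for ex in examples:
            log_p += math.log(likelihood(h, ex))   # rapid learning: few examples suffice
        log_scores.append(log_p)
    # Normalize in log space to obtain a posterior over the proposed hypotheses.
    max_log = max(log_scores)
    weights = [math.exp(s - max_log) for s in log_scores]
    total = sum(weights)
    return {h: w / total for h, w in zip(hypotheses, weights)}

An active-learning extension of this sketch might score candidate experiments by how much they are expected to reduce uncertainty over this posterior, in the spirit of the experiment-proposal component described above.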
Invited Talks
Boston University / Google
Comparing human and machine inductive biases for compositional linguistic generalization using semantic parsing: Results and methodological challenges
Compositionality is considered a central property of human language. One key benefit of compositionality is the generalization it enables: the production and comprehension of novel expressions analyzed as new compositions of familiar parts. Whether artificial neural networks (ANNs) can generalize in such a way has been part of a longstanding debate. In this talk, I will discuss several semantic parsing tests we proposed to evaluate compositional linguistic generalization in ANNs, reviewing modeling results from the past few years and comparing them to human generalization patterns. In short, models can match human patterns in cases where only lexical substitution is required, but fail to do so when the generalization targets are structurally novel, unless the models are augmented with targeted structural scaffolding. In addition to this general picture, I will discuss the difficulty of testing generalization in the current modeling landscape without open access to the training data.
University of Oxford
Modelling diverse learning tasks with predictive coding
Predictive coding (Rao & Ballard, 1999) is an influential model describing information processing in hierarchically organized cortical circuits. It can learn effectively while relying only on local Hebbian plasticity. The predictive coding model was originally developed to describe unsupervised learning of representations in the visual cortex. This talk will give an overview of recent work extending predictive coding to diverse tasks, including probabilistic inference, temporal prediction, supervised learning, memory, and novelty detection. The versatility of predictive coding suggests that it is a promising candidate for a fundamental algorithm employed by cortical circuits.
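For readers new to the framework, one common formulation (following Rao & Ballard, 1999, and the tutorial treatment in Bogacz, 2017, rather than any specific model in the talk) maintains activities x_l and prediction errors eps_l at each level of the hierarchy, with all updates local to a level and its neighbours:

\varepsilon_l = x_l - \theta_l\, f(x_{l+1}) \qquad \text{(prediction error at level } l\text{)}
\dot{x}_l \propto -\varepsilon_l + f'(x_l) \odot \theta_{l-1}^{\top} \varepsilon_{l-1} \qquad \text{(inference dynamics)}
\Delta\theta_l \propto \varepsilon_l\, f(x_{l+1})^{\top} \qquad \text{(Hebbian-like weight update)}

Because every quantity in these updates is available at the corresponding level, learning reduces to the local Hebbian plasticity the abstract refers to.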
Google DeepMind
Generalised policy updates and neuroscience
Value-based reinforcement learning rests on two basic operations: policy evaluation and policy improvement. All algorithms in this area can be understood as alternating between variations of these two operations. Recently, we introduced generalised policy evaluation (GPE) and generalised policy improvement (GPI). In GPE a policy is evaluated under multiple reward functions simultaneously, while GPI produces an improved policy based on a set of policies rather than a single one. Together, GPE and GPI allow an agent to quickly adapt to a new reward function with little or no learning involved. In this talk, I will present GPE and GPI, explore their unique properties, discuss some previous applications, and speculate about potential connections with neuroscience.
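As background, the successor-features formulation of these operations (Barreto et al., 2017, 2020) can be stated compactly; the notation below follows those papers and is not necessarily the exact form used in the talk. If rewards decompose as r(s, a, s') = \phi(s, a, s')^{\top} w and successor features are defined as \psi^{\pi}(s, a) = \mathbb{E}^{\pi}\big[\sum_{t \ge 0} \gamma^{t} \phi_{t+1} \mid S_0 = s, A_0 = a\big], then:

\text{GPE:} \quad Q^{\pi}_{w}(s, a) = \psi^{\pi}(s, a)^{\top} w \quad \text{for any reward vector } w
\text{GPI:} \quad \pi'(s) \in \arg\max_{a} \max_{i} \psi^{\pi_i}(s, a)^{\top} w, \quad \text{with } Q^{\pi'}_{w}(s,a) \ge \max_{i} Q^{\pi_i}_{w}(s,a) \text{ when the } \psi\text{'s are exact}

Because the \psi^{\pi_i} are learned once and reused, adapting to a new reward only requires estimating the new w, which is why little or no further learning is involved.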
Harvard University
Predictive representations: building blocks of intelligence
Recent progress in AI has shown that intelligent agents can learn about the world via relatively simple prediction algorithms. But the predictions that modern AI systems learn typically concern the present or the near future. An alternative is to learn predictive representations, which link aspects of the present to the potentially distant futures they lead to: we represent roads by the destinations they afford, rooms by the objects we will use within them, and restaurants by the meals we will experience. In this talk, I'll discuss how the successor representation defines a promising family of predictive representations that enable building blocks of intelligence such as exploration and knowledge transfer.
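Concretely, the successor representation (Dayan, 1993) summarizes a state by its expected discounted future state occupancies; in the tabular case (a standard statement, not specific to this talk):

M^{\pi}(s, s') = \mathbb{E}^{\pi}\Big[\sum_{t=0}^{\infty} \gamma^{t}\, \mathbb{1}[S_t = s'] \,\Big|\, S_0 = s\Big], \qquad M^{\pi} = (I - \gamma P^{\pi})^{-1}
V^{\pi}(s) = \sum_{s'} M^{\pi}(s, s')\, r(s')

Values for a new reward function can therefore be read out by reweighting the same predictive map, which is part of what makes such representations natural building blocks for exploration and transfer.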
Princeton University
Learning from almost no data
Deep learning has resulted in substantial progress on a variety of problems, but learning from small amounts of data remains a challenge. The current paradigm of training increasingly large models on increasingly large datasets is not feasible for many settings with limited or expensive data (e.g., rare diseases, niche modalities, low-resource languages) or computational resources. By contrast, humans seem to have an incredible ability to generalize and draw accurate inferences from very little data. My research draws on insights from both cognitive science and computer science to study humans and machines in a mutually informative way. Training ML systems normally requires labeling thousands or even millions of examples of every object the system needs to recognize, yet people can recognize thousands or even millions of different objects without having seen a single labeled photo of many of them. I show that people leverage relational information, such as soft labels (which encode how each training example relates to multiple object types) and a graded sense of similarity between different objects, to efficiently learn about many new objects from few examples. Drawing on these insights, I challenge the assumption that deep learning requires big data and develop methods for humans to teach AI systems in ways that better support efficient generalization and personalization. I also show that these representationally aligned AI systems open new doors for studying human cognition in diverse domains by greatly reducing the amount of behavioral data cognitive scientists have to collect when probing people's mental representations.
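To illustrate the soft-label idea in code: each training example carries a distribution over classes rather than a single hard label, and the loss compares the model's predicted distribution to that target. This is a minimal, generic sketch under common assumptions, not the speaker's method.

import torch
import torch.nn.functional as F

def soft_label_loss(logits, soft_targets):
    """Cross-entropy against a distribution over classes (soft labels).

    logits       -- (batch, n_classes) raw model outputs
    soft_targets -- (batch, n_classes) rows summing to 1, encoding how strongly
                    each example relates to every class, not just its nearest one
    """
    log_probs = F.log_softmax(logits, dim=-1)
    return -(soft_targets * log_probs).sum(dim=-1).mean()

# Hypothetical usage: one image judged 70% "wolf"-like and 30% "dog"-like.
logits = torch.randn(1, 3, requires_grad=True)        # classes: wolf, dog, cat
soft_targets = torch.tensor([[0.7, 0.3, 0.0]])
soft_label_loss(logits, soft_targets).backward()       # gradients flow as usual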
Spotlight Talks
Massachusetts Institute of Technology
Emergence and Function of Abstract Representations in Self-Supervised Transformers
Our brain's ability to form and exploit abstract world models allows us to rapidly navigate new situations, a trait that deep learning systems have historically struggled to replicate. Motivated by recent results showing that foundation models are few-shot learners, this study asks whether self-supervised transformers learn abstract world models in silico. We test this hypothesis by studying small-scale transformers trained to reconstruct partially masked visual scenes generated from a latent blueprint. We show that the networks develop linearly separable low-dimensional manifolds that encode all semantic features of the blueprint, hence forming an abstract model of the dataset. Using precise manipulation experiments, we demonstrate that these abstractions are central to the network's decision-making process. Additionally, we find that transformers organize abstractions in part-whole hierarchies that capture the compositional nature of the dataset. Finally, we introduce a novel architecture that grants us access to the learned abstractions, allowing us to more readily steer the network's decision-making process.
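The linear-separability claim is the kind of statement one can check with a simple linear probe on hidden activations. The sketch below assumes you already have a matrix of transformer activations and the corresponding latent blueprint labels (both placeholders here); it is illustrative, not the authors' analysis code.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def probe_linear_separability(activations, feature_labels, seed=0):
    """Fit a linear classifier from hidden states to one latent scene feature.

    activations    -- (n_scenes, d_model) hidden activations from the transformer
    feature_labels -- (n_scenes,) value of one blueprint feature (e.g. object identity)
    Returns held-out accuracy; accuracy near 1.0 suggests the feature is encoded
    in a linearly separable way.
    """
    X_train, X_test, y_train, y_test = train_test_split(
        activations, feature_labels, test_size=0.2, random_state=seed)
    probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    return probe.score(X_test, y_test)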
University of Luxembourg
A margin-based replacement for cross-entropy loss that improves the robustness of deep neural networks on image classification tasks
Cross-entropy (CE) loss is standard for training DNNs to perform classification. However, there are many sub-tasks within the domain of classification where CE is sub-optimal. To address this issue, we propose the high-error margin (HEM) loss. Because it is a margin loss, and in contrast to CE loss, HEM stops modifying the weights once the activation of the target logit sufficiently exceeds the values of the other logits. This helps prevent the trained network from making over-confident predictions and reduces over-fitting. Experiments show that HEM is as effective as, or more effective than, CE across a wide range of tasks: unknown class rejection, adversarial robustness, learning with imbalanced data, continual learning, and semantic segmentation. The sub-optimal performance of CE has previously inspired several specialised losses. For example, LogitNorm (LN) trains networks that are better able to identify, and reject, samples from unknown classes. Logit adjusted (LA) loss improves performance when training data contains imbalanced numbers of samples in each category. For semantic segmentation, DICE is one specialised loss among many. Our results show that HEM often out-performs these specialised losses and, in contrast to them, is a general-purpose replacement for CE loss.
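The abstract does not spell out the HEM formula, but the behaviour it describes (the gradient vanishing once the target logit exceeds the others by a sufficient margin) is characteristic of multi-class margin losses of roughly the form below. The margin value and the sum over non-target logits are illustrative assumptions, not the published definition.

import torch
import torch.nn.functional as F

def margin_loss(logits, targets, margin=1.0):
    """Illustrative multi-class margin loss (not necessarily the exact HEM loss).

    Each non-target logit is penalised only while it comes within `margin` of the
    target logit; once the target logit exceeds all others by at least `margin`,
    the loss and its gradient are exactly zero, unlike cross-entropy.
    """
    target_logit = logits.gather(1, targets.unsqueeze(1))              # (batch, 1)
    violations = F.relu(margin - (target_logit - logits))              # (batch, n_classes)
    non_target = 1.0 - F.one_hot(targets, logits.size(1)).float()      # mask out the target column
    return (violations * non_target).sum(dim=1).mean()

# Hypothetical usage
logits = torch.randn(8, 10, requires_grad=True)
targets = torch.randint(0, 10, (8,))
margin_loss(logits, targets).backward()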
University of Bern, Department of Physiology
A generalized neural tangent kernel for surrogate gradient learning
State-of-the-art neural network training methods depend on the gradient of the network function. For networks with activation functions without well-defined derivatives, such as spiking neural networks or binary neural networks, these highly successful methods therefore cannot be applied directly. To overcome this problem, the activation function's derivative is substituted with a surrogate derivative, giving rise to surrogate gradient learning, also known as straight-through estimation. As this method works well in practice, a better theoretical understanding is desirable. In this work, we consider randomly initialized networks in the infinite-width limit using the neural tangent kernel (NTK). We study an extension of the NTK to activation functions with jumps and generalize the NTK to gradient descent with surrogate derivatives, i.e., surrogate gradient learning. This generalization carries over the key theorems on the NTK, providing a mathematical foundation for the analysis of surrogate gradient learning in the infinite-width limit. Finally, we link our theoretical analysis to numerical investigations in the literature and conduct numerical experiments to illustrate our results.
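As a concrete reminder of what surrogate gradient learning does in practice, the sketch below uses a hard threshold on the forward pass and a smooth surrogate derivative on the backward pass; the particular surrogate (a fast-sigmoid-style derivative with an assumed sharpness) is just one common choice, not tied to this paper's analysis.

import torch

class SurrogateSpike(torch.autograd.Function):
    """Heaviside step forward, smooth surrogate derivative backward."""

    @staticmethod
    def forward(ctx, membrane_potential):
        ctx.save_for_backward(membrane_potential)
        return (membrane_potential > 0).float()        # non-differentiable spike

    @staticmethod
    def backward(ctx, grad_output):
        (membrane_potential,) = ctx.saved_tensors
        beta = 10.0                                    # assumed surrogate sharpness
        surrogate = 1.0 / (1.0 + beta * membrane_potential.abs()) ** 2
        return grad_output * surrogate                 # stand-in for the true derivative

spike = SurrogateSpike.apply
u = torch.randn(5, requires_grad=True)
spike(u).sum().backward()                              # gradients flow through the surrogate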
Universitat Pompeu Fabra
Controlled Maximal Variability Leads to Reliable Performance in Recurrent Neural Networks
Natural behaviors, even stereotyped ones, exhibit variability. Variability at both the behavioral and neural levels can facilitate exploration and learning. We ask what kind of neural variability does not compromise behavioral performance. We investigate the possibility of generating maximal neural variability while preserving the network's functionality by building on the maximum occupancy principle (MOP) developed for behavior. We consider a random recurrent neural network (RNN) with fixed weights interacting with a random input current controller. Specifically, our MOP controller is designed to maximize cumulative future input entropy without compromising the network's performance in tasks. We show that large variability can be induced in the RNN's units while avoiding terminal states of high (saturating or diverging) activity, or while satisfying a maximum energy constraint. We also show that the input controller can drive the RNN to perform a context-dependent writing task. Our network switches between stochastic and deterministic modes as needed. These results contribute to a novel theory of neural variability based on future entropy production, reconciling stochastic and deterministic behaviors within a single framework.
Columbia University
The impact of task structure, representational geometry, and learning mechanism on compositional generalization
Compositional generalization (the ability to respond correctly to novel arrangements of familiar components) is thought to be a cornerstone of intelligent behavior. However, a theory of how and why models generalize compositionally across diverse tasks remains lacking. To make progress on this topic, we consider compositional generalization for kernel models with fixed, potentially nonlinear representations and a trained linear readout. We prove that they are limited to conjunction-wise additive compositional computations, and identify compositionality failure modes that arise from the data and model structure. For models in the representation learning (or rich) regime, we show that networks can generalize on an important non-additive task (transitive equivalence) and give a mechanistic explanation for why. Finally, we validate our theory empirically, showing that it captures the behavior of a convolutional neural network trained on a set of compositional tasks. In sum, our theory characterizes the principles giving rise to compositional generalization in kernel models, shows how representation learning can overcome their limitations, and provides a taxonomy of compositional tasks that may be useful beyond the models studied here.
Ludwig Maximilian University of Munich
Investigating Causal Judgments in Humans and Large Language Models
This study investigates biases in causal reasoning among humans and Large Language Models (LLMs) using Causal Bayesian Networks (CBNs), focusing on Canonical Chain (A→B→C) and Common Cause (A←B→C) structures. In these structures, once the intermediate variable (B) is known, the probability of the outcome (C) is normatively independent of the initial cause (A). However, studies have shown that humans often ignore this independence. We tested the mutually exclusive predictions of three theories that could account for this bias (N=300). We found that humans tend to perceive causes in Chain structures as significantly more powerful, providing support for only one of the hypotheses. We then examined whether LLMs trained on language data (GPT3.5-Turbo, GPT4, and Luminous Supreme Control) exhibit similar biases, varying a key 'temperature' hyperparameter. Comparing response distributions using the Earth mover's distance, we find that LLMs, especially at higher temperatures, display a comparable inclination towards Chain structures, suggesting that this bias partly arises from language use. These results have significant implications for understanding causal reasoning in humans and Large Language Models.
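The normative independence the study relies on follows directly from the Markov condition of the two graphs:

\text{Chain } A \rightarrow B \rightarrow C: \quad P(C \mid A, B) = P(C \mid B)
\text{Common cause } A \leftarrow B \rightarrow C: \quad P(A, C \mid B) = P(A \mid B)\, P(C \mid B), \ \text{hence again } P(C \mid A, B) = P(C \mid B)

In both cases, once B is observed, further information about A should not change beliefs about C; the reported bias is a deviation from exactly this prescription.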
Princeton University
Can Generative Multimodal Models Count to Ten?
We adapt a developmental psychology paradigm to characterize the counting ability of the foundation model Parti. The Give-N task is often used as the gold standard for measuring a child's understanding of number concepts. Generative vision and language models can now be probed with something similar to the Give-N task: prompted with text like "five lemons" and asked to generate an image from scratch. We adapt the Give-N task to show that three scales of the Parti model (350M, 3B, and 20B parameters) each have some counting ability, with a significant jump in performance between the 350M and 3B model scales. We also demonstrate that it is possible to interfere with these models' counting ability simply by incorporating unusual descriptive adjectives for the objects being counted into the text prompt (such as "one hairy orange" -- see Figure 1). We analyze our results in the context of the knower-level theory of child number learning, and illustrate the corresponding gaps in model learning. Notably, we find that the performance boost children gain once they understand the inductive step of counting (that each subsequent number can be counted by adding one to the previous number) is missing from all three model scales. Our results show that we can gain experimental intuition for how to probe model behavior by drawing from a rich literature of behavioral experiments on humans, and, perhaps most importantly, that by adapting human developmental benchmarking paradigms to AI models, we can characterize and understand their behavior with respect to our own.
UC Berkeley, Center for Human Compatible AI
RL Algorithms Are BAMDP Policies: Understanding Exploration, Intrinsic Motivation, and Optimality
We could better design and understand RL algorithms if we had analysis tools as powerful as those that we have for MDP policies. In this work, we cast RL algorithms as policies designed for Bayes-Adaptive MDPs (BAMDPs), which start with a prior over possible MDPs and model learning as the accumulation of information about the actual MDP through experience. Framing RL algorithms as BAMDP policies makes it straightforward to apply the tools for analyzing policies to RL algorithms themselves. One such tool is the potential-based shaping theorem (Ng et al., 1999), which we carefully extend to BAMDPs to show that when intrinsic motivation (IM) terms are BAMDP Potential-based shaping Functions (BAMPFs) they preserve optimal, or approximately optimal, behavior of RL algorithms; otherwise, they can harm the performance of even optimal algorithms. We explain how properly formed IM can help RL algorithms by adding back unanticipated BAMDP value, which we decompose into the value of collecting information and the value of the physical state. We also explain how to design or convert existing IM terms to BAMPFs by expressing these values as potential functions on BAMDP states.
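For reference, the classical shaping theorem the work extends states that adding a shaping reward of the following form to an MDP leaves (near-)optimal policies unchanged (Ng et al., 1999); the BAMPF construction described in the abstract applies the same template with the potential defined on belief-augmented BAMDP states rather than physical states:

F(s, a, s') = \gamma\, \Phi(s') - \Phi(s)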
Brown University
A Mechanistic Analysis of Same-Different Relations in ViTs
Vision transformers (ViTs) have achieved state-of-the-art performance on a variety of tasks, yet little is known about the algorithms they learn to solve these tasks. To investigate this, we adopt techniques from mechanistic interpretability to uncover how ViTs perform a simple abstract visual reasoning task: judging whether two objects are the same or different. Even when models achieve similar test accuracy, we reveal that pretrained ViTs adopt qualitatively different algorithms than ViTs trained from scratch on this task. Specifically, the pretrained models' strategy aligns with the algorithms implemented by computational models of human visual reasoning. Additionally, we reveal that the representations learned by pretrained ViTs are mathematically well-structured, separating distinct visual features (i.e., shape and texture) into separate linear subspaces. This enables precise control over model behavior with surgical interventions. Finally, we relate the linear structure of features in hidden representations to generalization behavior. Our work provides a case study for applying mechanistic interpretability techniques to ViTs while also providing insights into the algorithmic and representational benefits of pretraining.
Beyond Institute for Theoretical Science (BITS)
Computational mechanics predicts internal representations of transformers
Computational mechanics studies the limits of prediction: How much can you predict? What kind of structure is required for optimal prediction? These questions are relevant to both anticipating and interpreting advanced AI systems. Adapting the mathematical framework of computational mechanics, we have been able to predict both (i) internal representations and (ii) the precise decay of next-token entropy as a function of context position, for transformers pre-trained as usual to minimize next-token-prediction loss. We train small transformers across a variety of increasingly complex correlated stochastic processes and verify that (i) activations in the residual stream and (ii) in-context learning indeed both conform to our predictions.
Donders Institute for Brain, Cognition and Behaviour
Correlations are ruining your gradient descent
Biological nervous systems are noisy, and how robust learning occurs at synaptic connections despite this noise remains unclear. Furthermore, biologically plausible and local learning rules often struggle to scale to deep networks and to difficult tasks. As a result, a number of such algorithms have been set aside amid the ongoing successes of modern machine learning. Here we show that node perturbation (NP), a local learning algorithm which relies upon noise injection into a neural network, can be scaled to deep networks and difficult tasks. First, we relate noise-based learning to directional derivatives and show that even systems in which the noise source is inaccessible can be trained. Second, we add to our network architectures a mechanism that decorrelates neural activity. This decorrelation mechanism enables biologically plausible and local learning algorithms to scale beyond shallow networks. We show theoretically that this decorrelation provides a bridge from regular gradient descent to natural gradient descent, helping to overcome scaling issues in the parameter-loss relationship. Finally, we show that this bridge enables significantly faster training by backpropagation, as well as scaling up of alternative learning algorithms.
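To fix ideas, here is a textbook-style node perturbation step for a single linear layer (an illustrative sketch, without the decorrelation mechanism that is this work's contribution): only injected noise, locally available activity, and a global scalar loss difference are needed.

import numpy as np

rng = np.random.default_rng(0)

def node_perturbation_step(W, x, target, lr=0.01, sigma=0.1):
    """One node-perturbation update for a linear layer y = W @ x with squared loss.

    Noise is injected into the layer's output; the resulting change in loss is
    broadcast back as a single scalar, and the weights move in the direction of
    perturbations that lowered the loss.
    """
    y_clean = W @ x
    noise = sigma * rng.standard_normal(y_clean.shape)
    y_noisy = y_clean + noise

    loss_clean = 0.5 * np.sum((y_clean - target) ** 2)
    loss_noisy = 0.5 * np.sum((y_noisy - target) ** 2)

    # Reinforce weight changes in proportion to the improvement caused by the noise.
    scale = -(loss_noisy - loss_clean) / sigma ** 2
    return W + lr * scale * np.outer(noise, x)

W = rng.standard_normal((3, 5))
x = rng.standard_normal(5)
target = rng.standard_normal(3)
W = node_perturbation_step(W, x, target)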
Motahareh Pourrahimi
McGill University, Mila
Human-like Behavior and Neural Representations Emerge in a Neural Network Trained to Search for Natural Objects from Pixels
Visual search, locating a specific item among visually presented objects, is a key paradigm in visual attention studies. Here, we showed that a neural signature akin to the priority map representation in the primate fronto-parietal attentional control network emerged in the learned representations of a performance-optimized artificial neural network (ANN) model of visual search. We trained an ANN consisting of a model of the primate retina, a convolutional neural network mimicking the ventral visual pathway, and a recurrent neural network (RNN) model of the fronto-parietal network on visual search tasks. After training, RNN units exhibited cue-dependent response patterns similar to those observed in the primate fronto-parietal attention network during visual search; cue similarity (a key indicator of priority) was linearly decodable from the RNN units, indicating a distributed representation of the priority map; and decodability of cue similarity decreased exponentially with increasing spatial distance, suggesting that the priority map was continuously represented within the RNN latent space. Altogether, we presented a neurally plausible, image-computable model of visual search in which brain-like priority map representations emerged.
Spanish National Research Council
Object-centric reasoning and control from pixels
Autonomous intelligent agents must bridge computational challenges at disparate levels of abstraction, from the low-level spaces of sensory input and motor commands to the high-level domain of abstract reasoning and planning. A key question in designing such agents is how best to instantiate the representational space that will interface between these two levels, ideally without requiring supervision in the form of expensive data annotations. We hypothesize that these objectives can be efficiently achieved by representing the world in terms of objects (grounded in perception and action). In this work, we present a novel, brain-inspired, deep-learning architecture that learns from pixels to interpret, control, and reason about its environment, using object-centric representations. We demonstrate the utility of our approach through tasks in synthetic environments that require a combination of (high-level) logical reasoning and (low-level) continuous control. Results show that the agent can learn emergent conditional relationships, such as (A implies B) and (~A implies C), as well as logical composition (if (A implies B) and (A implies C), then A implies (B and C)) and XOR operations, and successfully controls its environment to satisfy objectives deduced from these logical rules. The agent can adapt online to unexpected changes in its environment and is robust to mild violations of its world model. While the present results are limited to synthetic settings (2D and 3D activated versions of dSprites), which fall short of real-world levels of complexity, the proposed architecture shows how grounded object representations, as a key inductive bias for unsupervised learning, enable behavioral reasoning.