Cognitive Science
Design by Amey Zhang
How should an intelligent agent behave in order to best realize its goals? What inferences or actions should it make in order to solve an important computational task? Cognitive science aims to answer these questions at an abstract computational level, using tools from probability theory, statistical inference, and elsewhere.
In this session we will discuss how such optimal behavior should change under different conditions of uncertainty, background knowledge, multiple agents, or constraints on resources. This can be used to understand human behavior in the real world or the lab, as well as to build artificial agents that learn robust and generalizable world models from small amounts of data.
Session Chairs
Dr Ruairidh Battleday (Harvard / MIT)
Dr Antonella Maselli (NRC Italy)
Keynote Talks
Professor Anne Collins (UC Berkeley): Pitfalls and advances in computational cognitive modeling
Dr Giovanni Pezzulo (National Research Council of Italy, Rome): Embodied decision-making and planning
Invited Talks
Professor Bill Thompson (University of California, Berkeley): Interactive Discovery of Program-like Social Norms
Professor Dagmar Sternad (Northeastern): Human Control of Dynamically Complex Objects: Predictability, Stability and Embodiment
Professor Samuel McDougle (Yale): Abstractions in Motor Memory and Planning
Dr Fred Callaway (NYU / Harvard): Cultural evolution of compositional problem solving
Dr Maria Eckstein (DeepMind): Understanding Human Learning and Abstraction Using Cognitive Models and Artificial Neural Networks
Spotlight Talks
Nora Harhen (UC Irvine): Developmental differences in exploration reveal differences in structure inference
Simone D'Ambrogio (Oxford): Discovery of Cognitive Strategies for Information Sampling with Deep Cognitive Modelling and Investigation of their Neural Basis
Gaia Molinaro (UC Berkeley): Latent learning progress guides hierarchical goal selection in humans
Lucy Lai (Harvard): Policy regularization in the brain enables robustness and flexibility
Roey Schurr (Harvard): Dynamic computational phenotyping of human cognition
Yulin Dong (Peking): Optimal mental representation of social networks explains biases in social learning and perception
Antonino Visalli (Padova): Extensions of the Hierarchical Gaussian Filter to Wiener diffusion processes
Frank Tong (Vanderbilt): Improved modeling of human vision by incorporating robustness to blur in convolutional neural networks
Lance Ying (Harvard): Grounding Language about Belief in a Bayesian Theory-of-Mind
Jorge Eduardo Ramírez-Ruiz (Universitat Pompeu Fabra): The maximum occupancy principle (MOP) as a generative model of realistic behavior
Rory John Bufacchi (Chinese Academy of Sciences): Egocentric value maps of the near-body environment
Matteo Alleman (Columbia): Modeling behavioral imprecision from neural representations
Colin Conwell (Johns Hopkins): Is visual cortex really “language-aligned”? Perspectives from Model-to-Brain Comparisons in Humans and Monkeys on the Natural Scenes Dataset
Ryan Low (UCL): A normative account of the psychometric function and how it changes with stimulus and reward distributions
Keynote Talks
University of California, Berkeley
Pitfalls and advances in computational cognitive modeling
The rise of computational cognitive modeling has largely rested on the promise that it should provide a useful bridge between behavior and neural processes, revealing the computations underlying cognitive phenomena through interpretable model parameters and variables. In this talk, I will show some examples of the promise of this approach but will also highlight some pitfalls and limitations. Finally, I will briefly discuss some advances and future directions that attempt to tackle these issues.
National Research Council of Italy, Rome
Embodied decision-making and planning
Traditional decision-making research predominantly revolves around scenarios akin to classical economic paradigms, where choices are predefined and action dynamics are disregarded. However, human cognition has evolved to confront a myriad of situations requiring what we term "embodied decisions" — scenarios where action dynamics are intrinsic to choice selection. Studying embodied decisions, exemplified by scenarios such as a lion selecting its prey or a person navigating stepping stones across a river, presents unique challenges and opportunities for empirical exploration.
In this presentation, I will give an overview of recent progress in my lab regarding the understanding of embodied decisions “in the lab” and “in the wild”. I will talk about our efforts to develop computational models that move beyond simple linear relationships between decision and action, incorporating feedback loops from action dynamics to decision-making and planning processes. Furthermore, I will elaborate on our work to create models that formalize the various dimensions of embodied choices, including present and future affordances, and align them with parameters commonly examined in classical economic decision frameworks, such as probabilities and utilities.
Ultimately, investigating embodied decisions promises to expand the horizons of decision-making research, shedding light on the fundamental mechanisms that drive human behavior in dynamic and ecologically valid contexts.
Invited Talks
University of California, Berkeley
Interactive Discovery of Program-like Social Norms
Everyday feats of human social intelligence such as waiting in line, taking turns, and sharing space reflect a capacity to engage in ad hoc joint activities that are systematically structured yet flexible and sensitive to context. I will present the results of a behavioral experiment designed to evaluate the predictions of a formal theory of the computations that underpin this capacity. Our model construes the discovery of structured but flexible social norms as joint planning via theory-of-mind social reasoning in multi-agent sequential decision-making problems. We evaluated the predictions of this model against participant interactions in a 2-player iterated decision-making task. Participants developed norms of interaction that were structured and procedural, but sensitive to context. Across 3 conditions, our model captured the way participants balanced joint reward, fairness, and complexity when forming norms in this game.
Northeastern University
Human Control of Dynamically Complex Objects: Predictability, Stability and Embodiment
Disciplinary traditions have created a deep divide between cognitive science and motor neuroscience. However, brain and body interact in a mutually supportive fashion: The morphology and dynamics of our body lay the foundation of what might appear to result from cognitive control. And yet, virtually all actions require direction from high-level cognitive decisions. To disentangle cognitive and physical determinants of our motor behavior, our research has developed a ‘task-dynamic approach’ that explores the dynamic constraints and affordances of a given motor task. With minimal assumptions about low-level ‘primitives’, we deduce what is needed from high-level control to achieve successful behavior. Specifically, our research examines how humans interact with two dynamically complex objects: a cup of coffee and a bullwhip. Using a simplified cup of coffee in a virtual environment, we investigated how humans transport this dynamically complex object. Its internal dynamics create nonlinear interaction forces that can be chaotic and unpredictable for humans. Our studies revealed that humans developed strategies that established predictable and stable interaction forces. Complementary studies on manipulating an even more complex object, an infinite-dimensional bullwhip, revealed how skilled actors simplified its complex dynamics to successfully control and essentially embody the complex tool.
Yale University
Abstractions in Motor Memory and Planning
The fields of motor neuroscience and cognitive psychology are too often siloed. But abstract cognitive processes affect motor behavior in a range of ways, influencing the selection, planning, and learning of movements. In turn, how we move affects what we perceive, closing the loop between cognitive and motor systems. In this talk, I will discuss some recent projects that highlight the intersection of cognition and motor behavior. I will primarily feature work on how higher-level cognitive stages of action planning shape lower-level implicit forms of motor learning. I will also speculate that observed interactions between motor planning and execution in human motor behavior may echo ideas from monkey sensorimotor neurophysiology. Overall, I will try to make the case that studying motor behavior in a vacuum risks missing key stops along the road from thought to action.
NYU / Harvard
Cultural evolution of compositional problem solving
A key feature distinguishing human intelligence from most modern artificial intelligence is its high degree of compositionality. We rarely solve complex problems from scratch, but instead cobble together new solutions from reusable pieces. Attempts to replicate this type of intelligence in AI have seen limited success. This creates a puzzle: How are humans able to effortlessly carve nature at its joints while powerful computers struggle? In this talk, I suggest one possible answer: they don't. I propose that the compositionality of human thought is primarily not the result of explicit decomposition performed by individuals, but is instead driven by a selective pressure for reusable information in cultural evolution. In a simple evolutionary model, I show that compositional systems often become dominant in populations of social learners even when no individual would benefit from developing one. I identify key features of the environments and social information structure that govern the emergence of compositionality, and present preliminary experimental results demonstrating the phenomenon in an iterated learning experiment. These results raise the intriguing possibility that explicit compositionality may not be a prerequisite for intelligence in general, but instead reflects a particular type of reasoning best suited for agents that rely heavily on social learning.
Google DeepMind
Understanding Human Learning and Abstraction Using Cognitive Models and Artificial Neural Networks
Computational cognitive modeling has been an indispensable tool for cognitive science. Spanning Reinforcement Learning (RL), Bayesian Inference, and many others, computational models have shed light on otherwise unobservable cognitive mechanisms and paved the path towards a deeper understanding of the underlying neural substrate. I will first discuss the many strengths of the computational modeling framework, focusing on how RL modeling can help us obtain a precise, mechanistic explanation of cognitive phenomena. Specifically, I will use the framework to explain how learning mechanisms change with age, particularly during the adolescent years, which bring tremendous changes to both the external environment and the neural substrate. I will then show research on some limitations of classic cognitive models, including a lack of generalizability over tasks and models. Lastly, I will discuss recent research at the intersection of cognitive psychology and artificial intelligence, which aims to resolve the shortcomings of classical models. Specifically, I will highlight a novel model of learning that delivers a comprehensive picture of human reward-based learning by augmenting classic RL models with artificial neural networks. The model estimates precise computational mechanisms directly from observed behavior, obviating the need to hand-specify equations as in the classic approach, and providing both interpretable models and precise predictions. The model reveals that learning and decision making are fundamentally context-dependent, and that memory plays a crucial role that goes far beyond the low-dimensional, Markovian values of classic RL. Overall, the new approach highlights a need to incorporate more complex mechanisms into cognitive models of learning and decision making, and I will highlight one of them: abstraction. Future work can address many of the outstanding questions by integrating novel methods, e.g., procedurally-generated task design.
Spotlight Talks
University of California, Irvine
Developmental differences in exploration reveal differences in structure inference
Humans are adept at uncovering the latent structure of environments. This ability is supported in part by prior experience, which biases the consideration of possible structures. Here, we asked how developmental differences in prior experience shape structure inference in novel environments. We had 245 8-to-25-year-olds complete a patch foraging task previously shown in adults to elicit individual differences in structure inference (Harhen & Bornstein, 2023, PNAS). Participants decided between harvesting a depleting patch of rewards or incurring a time delay to travel to a replenished patch. The environment consisted of three patch types differing in how quickly they depleted. These differences were not explicitly communicated to participants, requiring them to infer the environment’s reward generation process. We used an infinite mixture model to predict how task behavior would differ between an agent expecting a simpler environment a priori, inferring a single patch type, versus an agent expecting a more complex one, inferring multiple. We found that younger participants explored more than adults in the richest patches (𝛽 rich=0.43, p=.004) but not the others (𝛽 poor=0.18, p=.12, 𝛽 neutral=-0.091, p=.21), consistent with simpler structural priors. While their decisions suggested inference of a single patch type, their reaction times demonstrated an adult-like sensitivity to changes in patch quality (𝛽 change = 0.045; p=.049, 𝛽 change*age=0.008, p=.71), revealing a potential knowledge-behavior gap. Our results suggest that, across development, people come to represent novel environments with greater complexity, consequently shaping their exploratory decisions.
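The harvest-or-leave tradeoff the task builds on can be sketched with a deliberately simplified, deterministic foraging simulation (all parameter values below are illustrative, not the task's actual settings):

```python
def simulate_forager(threshold, start_reward=10.0, decay=0.8,
                     travel_time=6.0, horizon=500.0):
    """Forager that harvests a depleting patch and leaves for a fresh one
    when the next harvest would fall below `threshold`. Each harvest takes
    one time step and multiplies the patch's reward by `decay`; leaving
    costs `travel_time`. Returns average reward per unit time."""
    t, total, r = 0.0, 0.0, start_reward
    while t < horizon:
        if r < threshold:          # leave: pay the travel delay, reset patch
            t += travel_time
            r = start_reward
        else:                      # harvest, then the patch depletes
            total += r
            r *= decay
            t += 1.0
    return total / t

# Overstaying (low threshold) and understaying (high threshold) both
# reduce the long-run reward rate relative to an intermediate policy.
rates = {th: simulate_forager(th) for th in (0.5, 2.0, 4.0, 8.0)}
```

In the real task the patch statistics must be inferred, which is where the infinite mixture model comes in; this sketch only shows why the leaving threshold matters.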
University of Oxford
Discovery of Cognitive Strategies for Information Sampling with Deep Cognitive Modelling and Investigation of their Neural Basis
Successful interaction with an uncertain environment requires a careful arbitration between gathering information and committing to a choice. Moreover, uncertainty arising from external independent sources (background uncertainty) may influence such arbitration. We recruited 20 participants who performed a novel information sampling task inside a 7T MRI scanner. We identified distinct information-seeking preferences across individuals, with background uncertainty influencing the information sampling strategy. To characterize the processes that underlie this behaviour, we compared 4 standard cognitive models (SCMs) with a novel deep cognitive model (DCM). This approach enabled us to combine an SCM with an artificial neural network (ANN) to discover complex relationships that are challenging to identify with SCMs. We found that the DCM allows for more accurate predictions. The analysis of the ANN revealed two distinct interpretable strategies that participants used to perform the task. We are currently studying brain areas that support the computations predicted by the DCM. Overall, we propose new insights into information sampling and show the potential of integrating ANNs with SCMs to reveal complex strategies and their neural bases.
University of California, Berkeley
Latent learning progress guides hierarchical goal selection in humans
Humans are autotelic agents, learning through self-defined curricula. Previous accounts have shown that performance and learning progress jointly drive autonomous curriculum development but have not explored how hierarchical structures and covert measures of learning progress impact goal selection. Here, we introduce a novel experimental paradigm to test the hypothesis that individuals exploit the hierarchical structure of the environment to accomplish self-imposed goals. Furthermore, we hypothesize that individuals may perceive progress in their learning even without experiencing external changes in performance (i.e., covertly). We thus introduce the notion of covert learning progress and test whether discovering hierarchical structures speeds it up, thereby impacting goal selection. Initial findings (N = 176) confirm our predictions and highlight inter-individual differences in goal-setting – some of which rely on the ability to recognize hierarchies in the learning environment. Our findings contribute to elucidating the computational mechanisms of human goal selection in hierarchical settings, which may propel advances in teaching and productivity as well as the development of autotelic machines that pursue their own goals.
Harvard University
Policy regularization in the brain enables robustness and flexibility
Capacity-limited agents are driven to reduce the cost of their policies. However, there is limited evidence as to how the brain implements cost-sensitive action selection. Policy regularization is a method in reinforcement learning that uses a “default” policy to regularize a “control” policy to reduce the description length, or information cost, of an agent’s policy. In this study, we use motor sequence learning to understand how the brain learns and stores representations that enable robustness and flexibility in adaptive behavior. Using evidence from lesion studies, we propose a computational division of labor between the dorsolateral striatum (DLS) and dorsomedial striatum (DMS): DLS stores the “default” policy that governs automatic, history-dependent action selection, while DMS flexibly learns the reward-maximizing action given external state. DLS “regularizes” DMS to learn policies that are biased towards the default. We propose that cortical regions such as motor cortex (MC) and prefrontal cortex (PFC) store high-dimensional representations of action and state on which learning occurs. Our model makes novel, experimentally-testable predictions and provides a normative rationale for the functional organization of striatum.
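The regularization scheme described here has a standard closed form in KL-regularized reinforcement learning, which can be sketched as follows (hypothetical Q-values and default policy; a generic sketch, not the authors' striatal model):

```python
import numpy as np

def regularized_policy(q_values, default_policy, beta):
    """Control policy trading reward against divergence from a default policy.

    Solves max_pi E_pi[Q] - (1/beta) * KL(pi || pi0); the closed-form
    solution is pi(a) proportional to pi0(a) * exp(beta * Q(a)).
    """
    logits = np.log(default_policy) + beta * np.asarray(q_values, dtype=float)
    logits -= logits.max()              # numerical stability
    p = np.exp(logits)
    return p / p.sum()

# Hypothetical numbers: a habitual "default" policy (the DLS role in the
# abstract) biased toward action 0, regularizing a reward-driven controller
# (the DMS role) whose Q-values favour action 1.
q = [1.0, 2.0, 0.5]
pi0 = np.array([0.7, 0.2, 0.1])

low_beta = regularized_policy(q, pi0, beta=0.1)    # stays close to the default
high_beta = regularized_policy(q, pi0, beta=10.0)  # tracks the best action
```

The inverse temperature `beta` sets the information budget: small values keep behavior cheap and habitual, large values buy reward-maximizing flexibility.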
Harvard University and The Hebrew University of Jerusalem
Dynamic computational phenotyping of human cognition
Computational phenotyping has emerged as a powerful tool for characterizing individual variability across a variety of cognitive domains. An individual's computational phenotype is defined as a set of mechanistically interpretable parameters obtained from fitting computational models to behavioral data. However, the interpretation of these parameters hinges critically on their psychometric properties, which are rarely studied. To identify the sources governing the temporal variability of the computational phenotype, we carried out a 12-week longitudinal study using a battery of 7 tasks that measure aspects of human learning, memory, perception, and decision making. To examine the influence of state effects, we collected weekly reports tracking subjects' mood and daily activities. We developed a dynamic computational phenotyping framework to tease apart the time-varying effects of practice and mood states. Our results show that many phenotype dimensions covary with practice and affective factors, indicating that what appears to be unreliability may reflect previously unmeasured structure. These results support a fundamentally dynamic understanding of cognitive variability within an individual.
Peking University
Optimal mental representation of social networks explains biases in social learning and perception
Social networks modulate our beliefs and choices in a wide range of activities. Interestingly, when making decisions within complex social networks, it is often impossible to consider the topological structure of the entire network. So which social connections are considered and which are ignored, and how does the streamlined representation of the network affect our perception and navigation of the social world? Here we propose a computational account whereby individuals with limited cognitive resources construct simplified social network representations flexibly and optimally to facilitate social learning. Using data from 4 separate lab and field studies, we show that network representations derived from our model were similar to those revealed by subjects' choices and by a graph neural network (GNN) trained to mimic human learning. Our model offers a normative explanation for DeGroot learning, one of the most influential heuristics for learning on networks; unifies a variety of seemingly disparate biases previously reported in social relationship perception; provides a window into the cognitive root of some important societal conundrums; and points to potential connections in network representation learning between machine and human intelligence.
University of Padova
Extensions of the Hierarchical Gaussian Filter to Wiener diffusion processes
In perceptual decision making, the Drift Diffusion Model (DDM) is a common framework for modeling evidence accumulation until a decision threshold is reached. An open question is how prior expectations influence DDM parameters. Three possible mechanisms can be considered, with prior beliefs influencing: 1) The starting point of the evidence accumulation process; 2) The drift rate (the speed of evidence accumulation); 3) The decision threshold (the amount of evidence needed to make a decision). To tackle this issue, we developed three extensions of the Hierarchical Gaussian Filter (HGF) including observation models based on the Wiener first-passage time distribution (the core of DDMs). Each HGF extension estimates trial-wise trajectories of beliefs about the hidden causes of sensory inputs and their modulation of DDM parameters. After validation via parameter recovery analysis on simulated data, the three alternatives were applied to real behavioral data (N=40) from a volatile random dot tachistogram task. Preliminary findings show significant modulations of all DDM parameters by prior beliefs, with a smaller effect on the drift rate. Subsequent steps involve modeling parameter interplay and applying these models to brain data.
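As a hedged illustration of the decision core (a generic Euler-Maruyama DDM sketch, not the authors' HGF extensions), the first of the three mechanisms, a prior-driven starting-point shift, can be simulated directly:

```python
import numpy as np

def simulate_ddm(drift, threshold, start_frac=0.5, noise=1.0, dt=1e-3,
                 max_t=10.0, rng=None):
    """One Euler-Maruyama drift-diffusion trial between bounds at +/-threshold.

    start_frac in (0, 1) places the starting point between the bounds; a
    prior belief favouring the upper response corresponds to start_frac > 0.5.
    Returns (choice, rt) with choice 1 for the upper bound, 0 for the lower.
    """
    rng = rng if rng is not None else np.random.default_rng(0)
    x = (2 * start_frac - 1) * threshold
    t = 0.0
    while abs(x) < threshold and t < max_t:
        x += drift * dt + noise * np.sqrt(dt) * rng.standard_normal()
        t += dt
    return (1 if x >= threshold else 0), t

# A starting point shifted toward the upper bound (illustrative values)
# raises the fraction of upper-bound choices at a fixed drift.
rng = np.random.default_rng(1)
trials = [simulate_ddm(drift=1.0, threshold=1.0, start_frac=0.7, rng=rng)
          for _ in range(200)]
upper_rate = sum(c for c, _ in trials) / len(trials)
```

The other two mechanisms would instead let the prior scale `drift` or `threshold`; the HGF extensions estimate which modulation best explains trial-wise behavior.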
Vanderbilt University
Improved modeling of human vision by incorporating robustness to blur in convolutional neural networks
Whenever a visual scene is cast onto the retina, much of it will appear degraded due to poor resolution in the periphery; moreover, optical defocus can cause blur in central vision. However, the pervasiveness of blurry or degraded input is typically overlooked in the training of convolutional neural networks (CNNs). We hypothesized that the absence of blurry training inputs may cause CNNs to rely excessively on high spatial frequency information for object recognition, thereby causing systematic deviations from biological vision. We evaluated this hypothesis by comparing standard CNNs with CNNs trained on a combination of clear and blurry images. We show that blur-trained CNNs outperform standard CNNs at predicting neural responses to objects across a variety of viewing conditions in the visual cortex of both monkeys and humans. Moreover, blur-trained CNNs acquire increased sensitivity to shape information and greater robustness to multiple forms of visual noise, leading to improved correspondence with human perception. Our results provide novel neurocomputational evidence that blurry visual experiences may be critical for conferring robustness to biological visual systems.
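The training manipulation can be sketched as a simple augmentation step (a generic Gaussian-blur sketch in NumPy; the study's actual pipeline and blur levels are assumptions here):

```python
import numpy as np

def gaussian_kernel1d(sigma):
    """Normalized 1-D Gaussian kernel truncated at 3 standard deviations."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def blur_image(img, sigma):
    """Separable Gaussian blur of a 2-D grayscale image (edge-padded)."""
    k = gaussian_kernel1d(sigma)
    pad = len(k) // 2
    blur_1d = lambda v: np.convolve(np.pad(v, pad, mode='edge'), k, 'valid')
    out = np.apply_along_axis(blur_1d, 1, img)   # blur rows
    return np.apply_along_axis(blur_1d, 0, out)  # then columns

def maybe_blur(img, p=0.5, sigmas=(1.0, 2.0, 4.0), rng=None):
    """Training-time augmentation: with probability p, blur at a random strength,
    mimicking a mix of clear and degraded retinal input."""
    rng = rng if rng is not None else np.random.default_rng()
    if rng.random() < p:
        return blur_image(img, rng.choice(sigmas))
    return img
```

Blurring removes high-spatial-frequency content, so a network trained through `maybe_blur` cannot rely on fine detail alone, which is the hypothesized route to more human-like shape sensitivity.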
Harvard University
Grounding Language about Belief in a Bayesian Theory-of-Mind
Humans talk about beliefs on a regular basis, often using rich compositional language to describe what others think and know. What explains this capacity to interpret the epistemic content of other minds? In this paper, we take a step towards an answer by grounding the semantics of belief statements in a Bayesian theory-of-mind: By modeling how humans jointly infer goals and beliefs that explain an agent’s actions, then evaluating statements about the agent’s beliefs against these inferences via epistemic logic, our framework provides a conceptual role semantics for belief, explaining the gradedness and compositionality of human belief attributions. We evaluate this framework by studying how humans attribute goals and beliefs while watching an agent solve a gridworld puzzle. In contrast to pure logical deduction and non-mentalizing baselines, our model provides a much better fit to human goal and belief attributions, demonstrating the importance of theory-of-mind for a semantics of belief.
Universitat Pompeu Fabra
The maximum occupancy principle (MOP) as a generative model of realistic behavior
Classical theories of behavior assume that agents tend to maximize some form of utility function. However, very often animals move with curiosity, acting in a reward-free manner. In these cases, behavior is highly variable and complex, possibly to allow for the occupancy of large regions of space-time (survival). Here we propose that the goal of behavior is maximizing occupancy of future paths of actions and states in order to conquer space-time. Rewards play a secondary role, but they are still important to provide agents with the necessary energy and resources to occupy path space. We find that action-state path entropy is the only measure consistent with additivity and other intuitive properties of expected future action-state path occupancy. Goal-directedness results from a complex interplay between internal states and terminal world states while the agent maximizes cumulative future action-state entropy. We show that complex behaviors such as ‘dancing’ and hide-and-seek naturally result from the intrinsic motivation to occupy path space. In summary, we present a single-principle theory of behavior that generates both variability and goal-directedness in the absence of reward maximization.
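A minimal sketch of the occupancy idea (my simplification, not the authors' full action-state formulation): with deterministic transitions and discounting, maximizing future action entropy yields a reward-free soft value recursion in which states with more open futures are preferred.

```python
import numpy as np

def occupancy_values(next_states, gamma=0.9, iters=300):
    """Value iteration for an agent maximizing discounted future action entropy.

    With deterministic transitions the optimal values obey the soft recursion
    V(s) = log sum_a exp(gamma * V(next_states[s][a])): no reward term
    anywhere, yet states with more reachable futures become more valuable.
    """
    v = np.zeros(len(next_states))
    for _ in range(iters):
        v = np.array([np.log(np.sum(np.exp(gamma * v[ns])))
                      for ns in next_states])
    return v

# Hypothetical 3-state world: state 0 offers two actions, state 1 offers
# three, state 2 is absorbing with a single action (no future options).
next_states = [np.array([0, 1]), np.array([0, 1, 2]), np.array([2])]
v = occupancy_values(next_states)
```

The absorbing state gets value zero while states that keep options open rank higher, which is the sense in which variability-seeking behavior "occupies path space" without any utility function.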
International Center for Primate Brain Research, Institute of Neuroscience, Chinese Academy of Sciences
Egocentric value maps of the near-body environment
Bodypart-centred response fields are pervasive, being observed in single neurons, fMRI, EEG, and behaviour. Nonetheless, their potential to foster neuroscientific understanding remains underexploited because we lack a unifying formal explanation of their origins and role. Here, we used reinforcement learning and artificial neural networks to demonstrate that bodypart-centred fields naturally arise from a simple assumption: agents often experience reward after contacting environmental objects. This explanation reproduces multiple experimental findings foundational in the peripersonal space literature. Crucially, it also suggests that peripersonal fields provide building blocks that create a modular model of the world near the agent: an egocentric value map. This concept is supported by the emergent modularity we observed in our artificial networks, and robustly fits extensive empirical data. Egocentric value maps also provide testable predictions, and subsume existing explanations of bodypart-centred receptive fields.
Columbia University
Modeling behavioral imprecision from neural representations
In recall tasks that require working memory, humans and other animals have heavy-tailed error distributions. What is the source of these large, surprisingly frequent errors? A standard approach suggests that errors come from distinct processes: a narrow ‘correct response’ distribution, and a uniform guess distribution. This idea has been successful in modeling behavior, but it is unclear how it relates to underlying neural mechanisms. Recent work has instead proposed a noisy winner-take-all model, in which all errors come from the same competitive recall process. Our main contribution is to adapt the winner-take-all idea to the study of neural population geometry. We show theoretically how heterogeneity of neural receptive fields, a common experimental observation, can produce naturalistic error distributions. We then show that the same idea can accurately model the behavior of macaques in a working memory task, using one free parameter. In sum, our work supports previous behavioral models by grounding them in neural data, and provides a theoretical framework for connecting single-cell properties to neural population geometry, and to the behavioral errors made downstream.
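A toy version of the noisy winner-take-all account (illustrative tuning and noise values, not the paper's fitted model) shows how a single competitive process yields mostly small errors plus occasional large ones:

```python
import numpy as np

def wta_recall(target, n_units=64, noise=0.5, rng=None):
    """Noisy winner-take-all recall over a population of circular tuning curves.

    Each unit has a preferred angle; the recalled value is the preferred angle
    of the most active unit. Noise usually promotes a unit near the target,
    but occasionally a far unit wins, so one process produces both small and
    large errors without a separate 'guessing' component.
    """
    rng = rng if rng is not None else np.random.default_rng()
    preferred = np.linspace(-np.pi, np.pi, n_units, endpoint=False)
    tuning = np.exp(np.cos(preferred - target))       # von Mises-like tuning
    activity = tuning + noise * rng.standard_normal(n_units)
    return preferred[np.argmax(activity)]

rng = np.random.default_rng(0)
errors = np.array([wta_recall(0.0, rng=rng) for _ in range(2000)])
```

Histogramming `errors` gives a sharp central peak with wide tails, the heavy-tailed shape the abstract starts from.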
Johns Hopkins University
Is visual cortex really “language-aligned”? Perspectives from Model-to-Brain Comparisons in Humans and Monkeys on the Natural Scenes Dataset
Recent progress in multimodal AI and “language-aligned” visual representation learning has re-ignited debates about the role of language in shaping the human visual system. In particular, the emergent ability of “language-aligned” vision models (e.g., CLIP), and even pure language models (e.g., GPT), to predict image-evoked brain activity has led some to suggest that human visual cortex itself may be “language-aligned” in comparable ways. But what would we make of this claim if the same procedures worked in the modeling of visual activity in a species without language? Here, we deploy controlled comparisons of pure-vision, pure-language, and multimodal vision-language models in prediction of human (N=4) and rhesus macaque (N=6; 5: IT, 1: V1) ventral visual activity evoked in response to the same set of 1000 captioned natural images (the “NSD1000”). We find decidedly similar patterns of results in aggregate model predictivity of early and late ventral visual cortex across both species. Together, we take these results to suggest that language predictivity of the human visual system is not necessarily due to “language-alignment” per se, but rather to the statistical structure of the visual world as reflected in language.
Gatsby Computational Neuroscience Group, University College London
A normative account of the psychometric function and how it changes with stimulus and reward distributions
The behavioral and neural computations underlying decision making are commonly probed using tasks that require subjects to classify a stimulus into one of two categories. Behavior is typically quantified using the psychometric function: the conditional distribution of the choice given the stimulus. Here, we characterize how the psychometric function emerges from a normative model of decision making under uncertainty, and how it adapts to perceived changes in the stimulus/category distribution and reward structure of the task. In particular, we examine inference and decision making in a Bayesian agent that cannot observe the stimulus itself, but only a noisy neural representation of it. The agent seeks to maximize expected reward (with respect to an internal world model) in the face of sensory/decision noise and stochastic rewards. The optimal behavioral policy is a decision rule defined on neural response space. The resulting psychometric function summarizes the agent’s behavior from the perspective of an experimenter who only observes the stimulus and choice, and to whom behavior appears stochastic due to ignorance about the internal neural state. We analytically demonstrate how the behavioral policy and psychometric function depend on distributions of the sensory and decision noise, stimulus, category, and reward. In a sensory discrimination task with mouse, rat, and human subjects, the normative model can explain empirical changes in the psychometric function in response to varying stimulus distributions.
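For the two-category case with Gaussian sensory noise, the dependence of the normative psychometric function on priors and rewards can be sketched analytically (a textbook-style ideal observer, deliberately simpler than the talk's full model):

```python
import numpy as np
from math import erf, log, sqrt

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def psychometric(stimuli, mu, sigma, prior_right=0.5,
                 reward_right=1.0, reward_left=1.0):
    """Psychometric function of an ideal observer with Gaussian noise.

    Categories generate s = -mu ('left') or s = +mu ('right'); the observer
    sees r ~ N(s, sigma) and answers 'right' when expected reward is higher,
    i.e. when r exceeds a criterion set by the prior and reward asymmetry:
    criterion = sigma^2 * log((p_L * R_L) / (p_R * R_R)) / (2 * mu).
    """
    log_odds = log(((1 - prior_right) * reward_left) /
                   (prior_right * reward_right))
    criterion = sigma ** 2 * log_odds / (2 * mu)
    return np.array([phi((s - criterion) / sigma) for s in stimuli])

stimuli = np.linspace(-2, 2, 9)
unbiased = psychometric(stimuli, mu=1.0, sigma=1.0)
# A prior favouring 'right' lowers the criterion, shifting the curve leftward.
biased = psychometric(stimuli, mu=1.0, sigma=1.0, prior_right=0.8)
```

Changing the stimulus distribution or payoffs moves only the criterion in this sketch; the fuller normative account in the talk also reshapes the function through the noise and world-model assumptions.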