Publications

You can also check out Google Scholar for a more up-to-date publication list.

Research Articles and Reviews

2025

Research Articles

Dax, Maximilian, Green, Stephen R, Gair, Jonathan, Gupte, Nihar, Pürrer, Michael, Raymond, Vivien, Wildberger, Jonas, Macke, Jakob H, Buonanno, Alessandra, Schölkopf, Bernhard
Real-time gravitational-wave inference for binary neutron stars using machine learning
Nature, 2025
url | preprint | news and views | briefing

Haxel, Lisa, Ahola, Oskari, Belardinelli, Paolo, Ermolova, Maria, Humaidan, Dania, Macke, Jakob H, Ziemann, Ulf
Decoding Motor Excitability in TMS using EEG-Features: An Exploratory Machine Learning Approach
IEEE Transactions on Neural Systems and Rehabilitation Engineering, 2025
url

Schulz, Auguste, Vetter, Julius, Gao, Richard, Morales, Daniel, Lobato-Rios, Victor, Ramdya, Pavan, Gonçalves, Pedro J, Macke, Jakob H
Modeling conditional distributions of neural and behavioral data with masked variational autoencoders
Cell Reports, 2025
url

Gloeckler, Manuel, Toyota, Shoji, Fukumizu, Kenji, Macke, Jakob H
Compositional simulation-based inference for time series
ICLR, 2025
url

Moss, Guy, Višnjević, Vjeran, Eisen, Olaf, Orachewski, Falk M, Schröder, Cornelius, Macke, Jakob H, Drews, Reinhard
Simulation-Based Inference of Surface Accumulation and Basal Melt Rates of an Antarctic Ice Shelf from Isochronal Layers
Journal of Glaciology, 2025
url | preprint

Vetter, Julius, Gloeckler, Manuel, Gedon, Daniel, Macke, Jakob H
Effortless, Simulation-Efficient Bayesian Inference using Tabular Foundation Models
arXiv, 2025
url

2024

Research Articles

Pals, Matthijs, Macke, Jakob H, Barak, Omri
Trained recurrent neural networks develop phase-locked limit cycles in a working memory task
PLOS CB, 2024
url

Vetter, Julius, Macke, Jakob H, Gao, Richard
Generating realistic neurophysiological time series with denoising diffusion probabilistic models
Cell Patterns, 2024
url

Vetter, Julius, Moss, Guy, Schröder, Cornelius, Gao, Richard, Macke, Jakob H
Sourcerer: Sample-based Maximum Entropy Source Distribution Estimation
NeurIPS, 2024
url

Beck, Jonas, Bosch, Nathanael, Deistler, Michael, Kadhim, Kyra L, Macke, Jakob H, Hennig, Philipp, Berens, Philipp
Diffusion Tempering Improves Parameter Estimation with Probabilistic Integrators for Ordinary Differential Equations
arXiv, 2024
url

Haxel, Lisa, Belardinelli, Paolo, Ermolova, Maria, Humaidan, Dania, Macke, Jakob H, Ziemann, Ulf
Decoding Motor Excitability in TMS using EEG-Features: An Exploratory Machine Learning Approach
bioRxiv, 2024
url

Bischoff, Sebastian, Darcher, Alana, Deistler, Michael, Gao, Richard, Gerken, Franziska, Gloeckler, Manuel, Haxel, Lisa, Kapoor, Jaivardhan, Lappalainen, Janne K, Macke, Jakob H, Moss, Guy, Pals, Matthijs, Pei, Felix, Rapp, Rachel, Sağtekin, A Erdem, Schröder, Cornelius, Schulz, Auguste, Stefanidi, Zinovia, Toyota, Shoji, Ulmer, Linda, Vetter, Julius
A Practical Guide to Sample-based Statistical Distances for Evaluating Generative Models in Science
Transactions on Machine Learning Research, 2024
url

Gloeckler, Manuel, Deistler, Michael, Weilbach, Christian, Wood, Frank, Macke, Jakob H
All-in-one simulation-based inference
ICML, 2024
url

Kapoor, Jaivardhan, Schulz, Auguste, Vetter, Julius, Pei, Felix, Gao, Richard, Macke, Jakob H
Latent Diffusion for Neural Spiking Data
NeurIPS, 2024
url

Pals, Matthijs, Sağtekin, A Erdem, Pei, Felix, Gloeckler, Manuel, Macke, Jakob H
Inferring stochastic low-rank recurrent neural networks from neural data
NeurIPS, 2024
url

Gao, Richard, Deistler, Michael, Schulz, Auguste, Gonçalves, Pedro J, Macke, Jakob H
Deep inverse modeling reveals dynamic-dependent invariances in neural circuit mechanisms
bioRxiv, 2024
preprint

Zucca, Stefano, Schulz, Auguste, Gonçalves, Pedro J, Macke, Jakob H, Saleem, Aman B, Solomon, Sam G
Loom response in mouse superior colliculus depends on sensorimotor context
bioRxiv, 2024
url

Deistler, Michael, Kadhim, Kyra L, Pals, Matthijs, Beck, Jonas, Huang, Ziwei, Gloeckler, Manuel, Lappalainen, Janne K, Schröder, Cornelius, Berens, Philipp, Gonçalves, Pedro J, Macke, Jakob H
Differentiable simulation enables large-scale training of detailed biophysical models of neural dynamics
bioRxiv, 2024
url

Lappalainen, Janne K, Tschopp, Fabian D, Prakhya, Sridhama, McGill, Mason, Nern, Aljoscha, Shinomiya, Kazunori, Takemura, Shin-ya, Gruntman, Eyal, Macke, Jakob H, Turaga, Srinivas C
Connectome-constrained networks predict neural activity across the fly visual system
Nature, 2024
url | code | briefing | press release | blog

Krouglova, Anastasia N, Johnson, Hayden R, Confavreux, Basile, Deistler, Michael, Gonçalves, Pedro J
Multifidelity Simulation-based Inference for Computationally Expensive Simulators
arXiv, 2024
url

2023

Research Articles

Dax, Maximilian, Green, Stephen R, Gair, Jonathan, Pürrer, Michael, Wildberger, Jonas, Macke, Jakob H, Buonanno, Alessandra, Schölkopf, Bernhard
Neural Importance Sampling for Rapid and Reliable Gravitational-Wave Inference
PRL, 2023
url | preprint

Gloeckler, Manuel, Deistler, Michael, Macke, Jakob H
Adversarial robustness of amortized Bayesian inference
ICML, 2023
url

Gao, Richard, Deistler, Michael, Macke, Jakob H
Generalized Bayesian Inference for Scientific Simulators via Amortized Cost Estimation
NeurIPS, 2023
url

Boelts, Jan, Harth, Philipp, Gao, Richard, Udvary, Daniel, Yanez, Felipe, Baum, Daniel, Hege, Hans-Christian, Oberlaender, Marcel, Macke, Jakob H
Simulation-based inference for efficient identification of generative models in connectomics
PLOS CB, 2023
url

Dax, Maximilian, Green, Stephen R, Gair, Jonathan, Pürrer, Michael, Wildberger, Jonas, Macke, Jakob H, Hege, Hans-Christian, Buonanno, Alessandra, Schölkopf, Bernhard
Neural Importance Sampling for Rapid and Reliable Gravitational-Wave Inference
Physical Review Letters, 2023
url

Wildberger, Jonas, Dax, Maximilian, Green, Stephen R, Gair, Jonathan, Pürrer, Michael, Macke, Jakob H, Hege, Hans-Christian, Buonanno, Alessandra, Schölkopf, Bernhard
Adapting to noise distribution shifts in flow-based gravitational-wave inference
Physical Review D, 2023
url

Dax, Maximilian, Wildberger, Jonas, Buchholz, Simon, Green, Stephen R, Macke, Jakob H, Schölkopf, Bernhard
Flow Matching for Scalable Simulation-Based Inference
NeurIPS, 2023
url

Gorecki, Mila, Macke, Jakob H, Deistler, Michael
Amortized Bayesian Decision Making for simulation-based models
arXiv, 2023
url

Confavreux, Basile, Ramesh, Poornima, Gonçalves, Pedro J, Macke, Jakob H, Vogels, Tim P
Meta-learning families of plasticity rules in recurrent spiking networks using simulation-based inference
NeurIPS, 2023
url

2022

Research Articles

Deistler, Michael, Macke, Jakob H, Gonçalves, Pedro J
Energy efficient network activity from disparate circuit parameters
PNAS, 2022
url

Ramesh, Poornima, Lueckmann, Jan-Matthis, Boelts, Jan, Tejero-Cantero, Álvaro, Greenberg, David S, Gonçalves, Pedro J, Macke, Jakob H
GATSBI: Generative Adversarial Training for Simulation-Based Inference
ICLR, 2022
url

Dax, Maximilian, Green, Stephen R, Gair, Jonathan, Deistler, Michael, Schölkopf, Bernhard, Macke, Jakob H
Group equivariant neural posterior estimation
ICLR, 2022
url

Glöckler, Manuel, Deistler, Michael, Macke, Jakob H
Variational methods for simulation-based inference
ICLR, 2022
url

Udvary, Daniel, Harth, Philipp, Macke, Jakob H, Hege, Hans-Christian, de Kock, Christiaan PJ, Sakmann, Bert, Oberlaender, Marcel
The impact of neuronal structure on cortical network architecture
Cell Reports, 39 (2), 2022
url

Boelts, Jan, Lueckmann, Jan-Matthis, Gao, Richard, Macke, Jakob H
Flexible and efficient simulation-based inference for models of decision-making
eLife, 2022
url

Liebe, Stefanie, Niediek, Johannes, Pals, Matthijs, Reber, Thomas P, Faber, Jennifer, Bostroem, Jan, Elger, Christian E, Macke, Jakob H, Mormann, Florian
Phase of firing does not reflect temporal order in sequence memory of humans and recurrent neural networks
bioRxiv, 2022
url

Deistler, Michael, Gonçalves, Pedro J, Macke, Jakob H
Truncated proposals for scalable and hassle-free simulation-based inference
NeurIPS, 2022
url

Blum, Corinna, Baur, David, Achauer, Lars-Christian, Berens, Philipp, Biergans, Stephanie, Erb, Michael, Hömberg, Volker, Huang, Ziwei, Kohlbacher, Oliver, Liepert, Joachim, Lindig, Tobias, Lohmann, Gabriele, Macke, Jakob H, Römhild, Jörg, Rösinger-Hein, Christine, Zrenner, Brigitte, Ziemann, Ulf
Personalized neurorehabilitative precision medicine: from data to therapies (MWKNeuroReha) - a multi-centre prospective observational clinical trial to predict long-term outcome of patients with acute motor stroke
BMC Neurology, 22 (1), pp. 1-15, 2022
url

2021

Research Articles

Dax, Maximilian, Green, Stephen R, Gair, Jonathan, Macke, Jakob H, Buonanno, Alessandra, Schölkopf, Bernhard
Amortized Bayes inference of gravitational waves with normalizing flows
Fourth Workshop on Machine Learning and the Physical Sciences at NeurIPS, 2021
pdf

Pofahl, Martin, Nikbakht, Negar, Haubrich, André N, Nguyen, Theresa, Masala, Nicola, Distler, Fabian, Braganza, Oliver, Macke, Jakob H, Ewell, Laura A, Golcuk, Kurtulus, Beck, Heinz
Synchronous activity patterns in the dentate gyrus during immobility
eLife, 10, p. e65786, 2021
pdf

Lueckmann, Jan-Matthis, Boelts, Jan, Greenberg, David S, Gonçalves, Pedro J, Macke, Jakob H

Benchmarking Simulation-Based Inference

Recent advances in probabilistic modelling have led to a large number of simulation-based inference algorithms which do not require numerical evaluation of likelihoods. However, a public benchmark with appropriate performance metrics for such 'likelihood-free' algorithms has been lacking. This has made it difficult to compare algorithms and identify their strengths and weaknesses. We set out to fill this gap: We provide a benchmark with inference tasks and suitable performance metrics, with an initial selection of algorithms including recent approaches employing neural networks and classical Approximate Bayesian Computation methods. We found that the choice of performance metric is critical, that even state-of-the-art algorithms have substantial room for improvement, and that sequential estimation improves sample efficiency. Neural network-based approaches generally exhibit better performance, but there is no uniformly best algorithm. We provide practical advice and highlight the potential of the benchmark to diagnose problems and improve algorithms. The results can be explored interactively on a companion website. All code is open source, making it possible to contribute further benchmark tasks and inference algorithms.

AISTATS, 2021
url

Dehnen, Gert, Kehl, Marcel S, Darcher, Alana, Müller, Tamara T, Macke, Jakob H, Borger, Valeri, Surges, Rainer, Mormann, Florian
Duplicate Detection of Spike Events: A Relevant Problem in Human Single-Unit Recordings
Brain Sciences, 11 (6), p. 761, 2021
url

Corna, Andrea, Ramesh, Poornima, Jetter, Florian, Lee, Meng-Jung, Macke, Jakob H, Zeck, Günther
Discrimination of simple objects decoded from the output of retinal ganglion cells upon sinusoidal electrical stimulation
Journal of Neural Engineering, 18 (4), p. 046086, 2021
url

Speiser, Artur, Müller, Lucas-Raphael, Hoess, Philipp, Matti, Ulf, Obara, Christopher J, Legant, Wesley R, Kreshuk, Anna, Macke, Jakob H, Ries, Jonas, Turaga, Srinivas C
Deep learning enables fast and dense single-molecule localization with high accuracy
Nature Methods, 18, pp. 1082–1090, 2021
url

Lavin, Alexander, Zenil, Hector, Paige, Brooks, Krakauer, David, Gottschlich, Justin, Mattson, Tim, Anandkumar, Anima, Choudry, Sanjay, Rocki, Kamil, Baydin, Atilim Günes, Prunkl, Carina, Isayev, Olexandr, Peterson, Erik, McMahon, Peter L, Macke, Jakob H, Cranmer, Kyle, Zhang, Jiaxin, Wainwright, Haruko, Hanuka, Adi, Veloso, Manuela, Assefa, Samuel, Zheng, Stephan, Pfeffer, Avi
Simulation Intelligence: Towards a New Generation of Scientific Methods
arXiv, 2021
url

Dax, Maximilian, Green, Stephen R, Gair, Jonathan, Macke, Jakob H, Buonanno, Alessandra, Schölkopf, Bernhard
Real-time gravitational wave science with neural posterior estimation
Physical Review Letters, 127 (24), p. 241103, 2021
url

Glöckler, Manuel, Deistler, Michael, Macke, Jakob H
Variational methods for simulation-based inference
ICLR, 2021
url

Dax, Maximilian, Green, Stephen R, Gair, Jonathan, Deistler, Michael, Schölkopf, Bernhard, Macke, Jakob H
Group equivariant neural posterior estimation
ICLR, 2021
url

2020

Research Articles

Sekhar, Sudarshan, Ramesh, Poornima, Bassetto, Giacomo, Zrenner, Eberhart, Macke, Jakob H, Rathbun, Daniel L
Characterizing retinal ganglion cell responses to electrical stimulation using generalized linear models
Frontiers in Neuroscience, 14, p. 378, 2020
url

Rene, Alexandre, Longtin, Andre, Macke, Jakob H
Inference of a mesoscopic population model from population spike trains
Neural Computation, 32 (8), pp. 1448-1498, 2020

Tejero-Cantero, Alvaro, Boelts, Jan, Deistler, Michael, Lueckmann, Jan-Matthis, Durkan, Conor, Gonçalves, Pedro J, Greenberg, David S, Macke, Jakob H
sbi: A toolkit for simulation-based inference
Journal of Open Source Software, 5 (52), p. 2505, 2020
url

Gonçalves, Pedro J, Lueckmann, Jan-Matthis, Deistler, Michael, Nonnenmacher, Marcel, Öcal, Kaan, Bassetto, Giacomo, Chintaluri, C, Podlaski, WF, Haddad, SA, Vogels, TP, Greenberg, DS, Macke, Jakob H
Training deep neural density estimators to identify mechanistic models of neural dynamics
eLife, 2020
url

2019

Research Articles

Barrett DGT, Morcos Ari S, Macke JH

Analyzing biological and artificial neural networks: challenges with opportunities for synergy?

Deep neural networks (DNNs) transform stimuli across multiple processing stages to produce representations that can be used to solve complex tasks, such as object recognition in images. However, a full understanding of how they achieve this remains elusive. The complexity of biological neural networks substantially exceeds the complexity of DNNs, making it even more challenging to understand the representations they learn. Thus, both machine learning and computational neuroscience are faced with a shared challenge: how can we analyze their representations in order to understand how they solve complex tasks? We review how data-analysis concepts and techniques developed by computational neuroscientists can be useful for analyzing representations in DNNs, and in turn, how recently developed techniques for analysis of DNNs can be useful for understanding representations in biological neural networks. We explore opportunities for synergy between the two fields, such as the use of DNNs as in silico model systems for neuroscience, and how this synergy can lead to new hypotheses about the operating principles of biological neural networks.

Current Opinion in Neurobiology, 55, pp. 55-64, 2019
url | pdf

Greenberg, D.S., Nonnenmacher, M., Macke, J.H.
Automatic Posterior Transformation for Likelihood-Free Inference
Proceedings of the 36th International Conference on Machine Learning, 97, pp. 2404-2414, 2019
url

Ansuini, A, Laio, A, Macke, JH, Zoccolan, D
Intrinsic dimension of data representations in deep neural networks
Advances in Neural Information Processing Systems 32, 2019
pdf

Boelts, Jan, Harth, Philipp, Yanez, Felipe, Hege, Hans-Christian, Oberlaender, Marcel, Macke, Jakob H
Bayesian inference for synaptic connectivity rules in anatomically realistic cortical connectomes
Bernstein Conference 2019, 2019
pdf

Lueckmann J, Bassetto G, Karaletsos T, Macke JH

Likelihood-free inference with emulator networks

Approximate Bayesian Computation (ABC) provides methods for Bayesian inference in simulation-based stochastic models which do not permit tractable likelihoods. We present a new ABC method which uses probabilistic neural emulator networks to learn synthetic likelihoods on simulated data -- both local emulators which approximate the likelihood for specific observed data, as well as global ones which are applicable to a range of data. Simulations are chosen adaptively using an acquisition function which takes into account uncertainty about either the posterior distribution of interest, or the parameters of the emulator. Our approach does not rely on user-defined rejection thresholds or distance functions. We illustrate inference with emulator networks on synthetic examples and on a biophysical neuron model, and show that emulators allow accurate and efficient inference even on high-dimensional problems which are challenging for conventional ABC approaches.

Proceedings of Machine Learning Research, 96, pp. 32-53, 2019
URL | pdf

Speiser, Artur, Turaga, Srinivas C, Macke, Jakob H
Teaching deep neural networks to localize sources in super-resolution microscopy by combining simulation-based learning and unsupervised learning
arXiv, 2019

Ramesh, Poornima, Atayi, Mohamad, Macke, Jakob H
Adversarial training of neural encoding models on population spike trains
Workshop, 2019

Macke, Jakob H, Nienborg, Hendrikje
Choice (-history) correlations in sensory cortex: cause or consequence?
Current Opinion in Neurobiology, 58, pp. 148-154, 2019

2018

Research Articles

Berens P, Freeman J, Deneux T, Chenkov N, McColgan T, Speiser A, Macke JH, Turaga S, Mineault P, Rupprecht P, Gerhard S, Friedrich RW, Friedrich J, Paninski L, Pachitariu M, Harris KD, Bolte B, Machado TA, Ringach D, Reimer J, Froudarakis E, Euler T, Roman-Roson M, Theis L, Tolias AS, Bethge M

Community-based benchmarking improves spike inference from two-photon calcium imaging data

In recent years, two-photon calcium imaging has become a standard tool to probe the function of neural circuits and to study computations in neuronal populations. However, the acquired signal is only an indirect measurement of neural activity due to the comparatively slow dynamics of fluorescent calcium indicators. Different algorithms for estimating spike trains from noisy calcium measurements have been proposed in the past, but it is an open question how far performance can be improved. Here, we report the results of the spikefinder challenge, launched to catalyze the development of new spike inference algorithms through crowd-sourcing. We present ten of the submitted algorithms which show improved performance compared to previously evaluated methods. Interestingly, the top-performing algorithms are based on a wide range of principles from deep neural networks to generative models, yet provide highly correlated estimates of the neural activity. The competition shows that benchmark challenges can drive algorithmic developments in neuroscience.

PLoS Computational Biology, 14, 2018
bioRxiv | URL | pdf

Lueckmann J, Macke JH*, Nienborg H*

Can serial dependencies in choices and neural activity explain choice probabilities?

During perceptual decisions the activity of sensory neurons co-varies with choice, a co-variation often quantified as “choice-probability”. Moreover, choices are influenced by a subject’s previous choice (serial dependence) and neuronal activity often shows temporal correlations on long (seconds) timescales. Here, we test whether these findings are linked. Using generalized linear models we analyze simultaneous measurements of behavior and V2 neural activity in macaques performing a visual discrimination task. Both decisions and spiking activity show substantial temporal correlations and cross-correlations but seem to reflect two mostly separate processes. Indeed, removing history effects using semi-partial correlation analysis leaves choice probabilities largely unchanged. The serial dependencies in choices and neural activity therefore cannot explain the observed choice probability. Rather, serial dependencies in choices and spiking activity reflect two predominantly separate but parallel processes, which are coupled on each trial by co-variations between choices and activity. These findings provide important constraints for computational models of perceptual decision-making that include feedback signals.

Journal of Neuroscience, 38, pp. 3495-3506, 2018
url | pdf

Djurdjevic V, Ansuini A, Bertolini D, Macke JH, Zoccolan D

Accuracy of Rats in Discriminating Visual Objects Is Explained by the Complexity of Their Perceptual Strategy

Despite their growing popularity as models of visual functions, it remains unclear whether rodents are capable of deploying advanced shape-processing strategies when engaged in visual object recognition. In rats, for instance, pattern vision has been reported to range from mere detection of overall object luminance to view-invariant processing of discriminative shape features. Here we sought to clarify how refined object vision is in rodents, and how variable the complexity of their visual processing strategy is across individuals. To this aim, we measured how well rats could discriminate a reference object from 11 distractors, which spanned a spectrum of image-level similarity to the reference. We also presented the animals with random variations of the reference, and processed their responses to these stimuli to derive subject-specific models of rat perceptual choices. Our models successfully captured the highly variable discrimination performance observed across subjects and object conditions. In particular, they revealed that the animals that succeeded with the most challenging distractors were those that integrated the wider variety of discriminative features into their perceptual strategies. Critically, these strategies were largely preserved when the rats were required to discriminate outlined and scaled versions of the stimuli, thus showing that rat object vision can be characterized as a transformation-tolerant, feature-based filtering process. Overall, these findings indicate that rats are capable of advanced processing of shape information, and point to the rodents as powerful models for investigating the neuronal underpinnings of visual object recognition and other high-level visual functions.

Current Biology, 28, pp. 1005–1015, 2018
url | dispatch | pdf | press

Nonnenmacher M, Turaga SC, Macke JH

Extracting low-dimensional dynamics from multiple large-scale neural population recordings by learning to predict correlations

A powerful approach for understanding neural population dynamics is to extract low-dimensional trajectories from population recordings using dimensionality reduction methods. Current approaches for dimensionality reduction on neural data are limited to single population recordings, and can not identify dynamics embedded across multiple measurements. We propose an approach for extracting low-dimensional dynamics from multiple, sequential recordings. Our algorithm scales to data comprising millions of observed dimensions, making it possible to access dynamics distributed across large populations or multiple brain areas. Building on subspace-identification approaches for dynamical systems, we perform parameter estimation by minimizing a moment-matching objective using a scalable stochastic gradient descent algorithm: The model is optimized to predict temporal covariations across neurons and across time. We show how this approach naturally handles missing data and multiple partial recordings, and can identify dynamics and predict correlations even in the presence of severe subsampling and small overlap between recordings. We demonstrate the effectiveness of the approach both on simulated data and a whole-brain larval zebrafish imaging dataset.

Advances in Neural Information Processing Systems 30: 31st Conference on Neural Information Processing Systems (NeurIPS 2017), 2018
pdf | URL | arXiv | code

Lueckmann J*, Gonçalves P*, Bassetto G, Oecal K, Nonnenmacher M, Macke JH

Flexible statistical inference for mechanistic models of neural dynamics

Mechanistic models of single-neuron dynamics have been extensively studied in computational neuroscience. However, identifying which models can quantitatively reproduce empirically measured data has been challenging. We propose to overcome this limitation by using likelihood-free inference approaches (also known as Approximate Bayesian Computation, ABC) to perform full Bayesian inference on single-neuron models. Our approach builds on recent advances in ABC by learning a neural network which maps features of the observed data to the posterior distribution over parameters. We learn a Bayesian mixture-density network approximating the posterior over multiple rounds of adaptively chosen simulations. Furthermore, we propose an efficient approach for handling missing features and parameter settings for which the simulator fails -- both being prevalent issues in models of neural dynamics -- as well as a strategy for automatically learning relevant features using recurrent neural networks. On synthetic data, our approach efficiently estimates posterior distributions and recovers ground-truth parameters. On in-vitro recordings of membrane voltages, we recover multivariate posteriors over biophysical parameters, which yield model-predicted voltage traces that accurately match empirical data. Our approach will enable neuroscientists to perform Bayesian inference on complex neuron models without having to design model-specific algorithms, closing the gap between mechanistic and statistical approaches to single-neuron modelling.

Advances in Neural Information Processing Systems 30: 31st Conference on Neural Information Processing Systems (NeurIPS 2017), 2018
pdf | URL | arXiv | code

Speiser A, Yan J, Archer E, Buesing L, Turaga SC, Macke JH

Fast amortized inference of neural activity from calcium imaging data with variational autoencoders

Calcium imaging permits optical measurement of neural activity. Since intracellular calcium concentration is an indirect measurement of neural activity, computational tools are necessary to infer the true underlying spiking activity from fluorescence measurements. Bayesian model inversion can be used to solve this problem, but typically requires either computationally expensive MCMC sampling, or faster but approximate maximum-a-posteriori optimization. Here, we introduce a flexible algorithmic framework for fast, efficient and accurate extraction of neural spikes from imaging data. Using the framework of variational autoencoders, we propose to amortize inference by training a deep neural network to perform model inversion efficiently. The recognition network is trained to produce samples from the posterior distribution over spike trains. Once trained, performing inference amounts to a fast single forward pass through the network, without the need for iterative optimization or sampling. We show that amortization can be applied flexibly to a wide range of nonlinear generative models and significantly improves upon the state of the art in computation time, while achieving competitive accuracy. Our framework is also able to represent posterior distributions over spike-trains. We demonstrate the generality of our method by proposing the first probabilistic approach for separating backpropagating action potentials from putative synaptic inputs in calcium imaging of dendritic spines.

Advances in Neural Information Processing Systems 30: 31st Conference on Neural Information Processing Systems (NeurIPS 2017), 2018
pdf | URL | arXiv | code

Preprints and Technical Reports

David GT Barrett, Ari S Morcos, Jakob H Macke

Analyzing biological and artificial neural networks: challenges with opportunities for synergy?

Deep neural networks (DNNs) transform stimuli across multiple processing stages to produce representations that can be used to solve complex tasks, such as object recognition in images. However, a full understanding of how they achieve this remains elusive. The complexity of biological neural networks substantially exceeds the complexity of DNNs, making it even more challenging to understand the representations that they learn. Thus, both machine learning and computational neuroscience are faced with a shared challenge: how can we analyze their representations in order to understand how they solve complex tasks? We review how data-analysis concepts and techniques developed by computational neuroscientists can be useful for analyzing representations in DNNs, and in turn, how recently developed techniques for analysis of DNNs can be useful for understanding representations in biological neural networks. We explore opportunities for synergy between the two fields, such as the use of DNNs as in-silico model systems for neuroscience, and how this synergy can lead to new hypotheses about the operating principles of biological neural networks.

arXiv Preprint, 2018
URL | pdf

2017

Research Articles

Nonnenmacher M, Behrens C, Berens P, Bethge M, Macke JH

Signatures of criticality arise from random subsampling in simple population models

The rise of large-scale recordings of neuronal activity has fueled the hope to gain new insights into the collective activity of neural ensembles. How can one link the statistics of neural population activity to underlying principles and theories? One attempt to interpret such data builds upon analogies to the behaviour of collective systems in statistical physics. Divergence of the specific heat—a measure of population statistics derived from thermodynamics—has been used to suggest that neural populations are optimized to operate at a “critical point”. However, these findings have been challenged by theoretical studies which have shown that common inputs can lead to diverging specific heat. Here, we connect “signatures of criticality”, and in particular the divergence of specific heat, back to statistics of neural population activity commonly studied in neural coding: firing rates and pairwise correlations. We show that the specific heat diverges whenever the average correlation strength does not depend on population size. This is necessarily true when data with correlations is randomly subsampled during the analysis process, irrespective of the detailed structure or origin of correlations. We also show how the characteristic shape of specific heat capacity curves depends on firing rates and correlations, using both analytically tractable models and numerical simulations of a canonical feed-forward population model. To analyze these simulations, we develop efficient methods for characterizing large-scale neural population activity with maximum entropy models. We find that, consistent with experimental findings, increases in firing rates and correlation directly lead to more pronounced signatures. Thus, previous reports of thermodynamical criticality in neural populations based on the analysis of specific heat can be explained by average firing rates and correlations, and are not indicative of an optimized coding strategy. We conclude that a reliable interpretation of statistical tests for theories of neural coding is possible only in reference to relevant ground-truth models.

PLoS Comput Biol 13(10):e1005718, 2017
URL | pdf | supplement | arXiv | code

2016

Research Articles

Schuett HH, Harmeling S, Macke JH, Wichmann FA

Painfree and accurate Bayesian estimation of psychometric functions for (potentially) overdispersed data

The psychometric function describes how an experimental variable, such as stimulus strength, influences the behaviour of an observer. Estimation of psychometric functions from experimental data plays a central role in fields such as psychophysics, experimental psychology and in the behavioural neurosciences. Experimental data may exhibit substantial overdispersion, which may result from non-stationarity in the behaviour of observers. Here we extend the standard binomial model which is typically used for psychometric function estimation to a beta-binomial model. We show that the use of the beta-binomial model makes it possible to determine accurate credible intervals even in data which exhibit substantial overdispersion. This goes beyond classical measures for overdispersion—goodness-of-fit—which can detect overdispersion but provide no method to do correct inference for overdispersed data. We use Bayesian inference methods for estimating the posterior distribution of the parameters of the psychometric function. Unlike previous Bayesian psychometric inference methods our software implementation—psignifit 4—performs numerical integration of the posterior within automatically determined bounds. This avoids the use of Markov chain Monte Carlo (MCMC) methods typically requiring expert knowledge. Extensive numerical tests show the validity of the approach and we discuss implications of overdispersion for experimental design. A comprehensive MATLAB toolbox implementing the method is freely available (https://github.com/wichmann-lab/psignifit) and a python implementation will follow soon.

Vision Research, 122, pp. 105-123, 2016
code | code direct | pdf | URL | Wichmann lab

Park M, Bohner G, Macke JH

Unlocking neural population non-stationarities using hierarchical dynamics models

Neural population activity often exhibits rich variability. This variability can arise from single-neuron stochasticity, neural dynamics on short time-scales, as well as from modulations of neural firing properties on long time-scales, often referred to as neural non-stationarity. To better understand the nature of co-variability in neural circuits and their impact on cortical information processing, we introduce a hierarchical dynamics model that is able to capture both slow inter-trial modulations in firing rates as well as neural population dynamics. We derive a Bayesian Laplace propagation algorithm for joint inference of parameters and population states. On neural population recordings from primary visual cortex, we demonstrate that our model provides a better account of the structure of neural firing than stationary dynamics models.

Advances in Neural Information Processing Systems 28: 29th Conference on Neural Information Processing Systems (NeurIPS 2015), 2016
pdf | supplement | arXiv | code

2015

Research Articles

Archer EW, Koster U, Pillow JW, Macke JH

Low-dimensional models of neural population activity in sensory cortical circuits

Neural responses in visual cortex are influenced by visual stimuli and by ongoing spiking activity in local circuits. An important challenge in computational neuroscience is to develop models that can account for both of these features in large multi-neuron recordings and to reveal how stimulus representations interact with and depend on cortical dynamics. Here we introduce a statistical model of neural population activity that integrates a nonlinear receptive field model with a latent dynamical model of ongoing cortical activity. This model captures the temporal dynamics, effective network connectivity in large population recordings, and correlations due to shared stimulus drive as well as common noise. Moreover, because the nonlinear stimulus inputs are mixed by the ongoing dynamics, the model can account for a relatively large number of idiosyncratic receptive field shapes with a small number of nonlinear inputs to a low-dimensional latent dynamical model. We introduce a fast estimation method using online expectation maximization with Laplace approximations. Inference scales linearly in both population size and recording duration. We apply this model to multi-channel recordings from primary visual cortex and show that it accounts for a large number of individual neural receptive fields using a small number of nonlinear inputs and a low-dimensional dynamical model.

Advances in Neural Information Processing Systems 27: 28th Conference on Neural Information Processing Systems (NeurIPS 2014), pp. 343-351, 2015
URL | pdf

Putzky P, Franzen F, Bassetto G, Macke JH

A Bayesian model for identifying hierarchically organised states in neural population activity

Neural population activity in cortical circuits is not solely driven by external inputs, but is also modulated by endogenous states. These cortical states vary on multiple time-scales and also across areas and layers of the neocortex. To understand information processing in cortical circuits, we need to understand the statistical structure of internal states and their interaction with sensory inputs. Here, we present a statistical model for extracting hierarchically organized neural population states from multi-channel recordings of neural spiking activity. We model population states using a hidden Markov decision tree with state-dependent tuning parameters and a generalized linear observation model. Using variational Bayesian inference, we estimate the posterior distribution over parameters from population recordings of neural spike trains. On simulated data, we show that we can identify the underlying sequence of population states over time and reconstruct the ground truth parameters. Using extracellular population recordings from visual cortex, we find that a model with two levels of population states outperforms a generalized linear model which does not include state-dependence, as well as models which only include a binary state. Finally, modelling of state-dependence via our model also improves the accuracy with which sensory stimuli can be decoded from the population response.

Advances in Neural Information Processing Systems 27: 28th Annual Conference on Neural Information Processing Systems (NeurIPS 2014), pp. 3095-3103, 2015
URL | pdf | supplement | spotlight | code

Küffner R, Zach N, Norel R, Hawe J, Schoenfeld D, Wang L, Li G, Fang L, Mackey L, Hardiman O, Cudkowicz M, Sherman A, Ertaylan G, Grosse-Wentrup M, Hothorn T, van Ligtenberg J, Macke JH, Meyer T, Schölkopf B, Tran L, Vaughan R, Stolovitzky G, Leitner ML

Crowdsourced analysis of clinical trial data to predict amyotrophic lateral sclerosis progression

Amyotrophic lateral sclerosis (ALS) is a fatal neurodegenerative disease with substantial heterogeneity in its clinical presentation. This makes diagnosis and effective treatment difficult, so better tools for estimating disease progression are needed. Here, we report results from the DREAM-Phil Bowen ALS Prediction Prize4Life challenge. In this crowdsourcing competition, competitors developed algorithms for the prediction of disease progression of 1,822 ALS patients from standardized, anonymized phase 2/3 clinical trials. The two best algorithms outperformed a method designed by the challenge organizers as well as predictions by ALS clinicians. We estimate that using both winning algorithms in future trial designs could reduce the required number of patients by at least 20%. The DREAM-Phil Bowen ALS Prediction Prize4Life challenge also identified several potential nonstandard predictors of disease progression including uric acid, creatinine and surprisingly, blood pressure, shedding light on ALS pathobiology. This analysis reveals the potential of a crowdsourcing competition that uses clinical trial data for accelerating ALS research and development.

Nature Biotechnology, 33 (1), pp. 51-57, 2015
URL | DOI

Reviews and Book Chapters

Panzeri S, Macke JH, Gross J, Kayser C

Neural population coding: combining insights from microscopic and mass signals

Behavior relies on the distributed and coordinated activity of neural populations. Population activity can be measured using multi-neuron recordings and neuroimaging. Neural recordings reveal how the heterogeneity, sparseness, timing, and correlation of population activity shape information processing in local networks, whereas neuroimaging shows how long-range coupling and brain states impact on local activity and perception. To obtain an integrated perspective on neural information processing we need to combine knowledge from both levels of investigation. We review recent progress of how neural recordings, neuroimaging, and computational approaches begin to elucidate how interactions between local neural population activity and large-scale dynamics shape the structure and coding capacity of local information representations, make them state-dependent, and control distributed populations that collectively shape behavior.

Trends in Cognitive Sciences, 19 (3), pp. 162–172, 2015
URL | DOI | pdf

Macke JH, Buesing L, Sahani M
Estimating State and Parameters in State Space Models of Spike Trains
Advanced State Space Methods for Neural and Clinical Data, 2015
pdf | code

2014

Research Articles

Fründ I, Wichmann FA, Macke JH

Quantifying the effect of intertrial dependence on perceptual decisions

In the perceptual sciences, experimenters study the causal mechanisms of perceptual systems by probing observers with carefully constructed stimuli. It has long been known, however, that perceptual decisions are not only determined by the stimulus, but also by internal factors. Internal factors could lead to a statistical influence of previous stimuli and responses on the current trial, resulting in serial dependencies, which complicate the causal inference between stimulus and response. However, the majority of studies do not take serial dependencies into account, and it has been unclear how strongly they influence perceptual decisions. We hypothesize that one reason for this neglect is that there has been no reliable tool to quantify them and to correct for their effects. Here we develop a statistical method to detect, estimate, and correct for serial dependencies in behavioral data. We show that even trained psychophysical observers suffer from strong history dependence. A substantial fraction of the decision variance on difficult stimuli was independent of the stimulus but dependent on experimental history. We discuss the strong dependence of perceptual decisions on internal factors and its implications for correct data interpretation.

Turaga SC, Buesing L, Packer AM, Dalgleish H, Pettit N, Hausser M, Macke JH

Inferring neural population dynamics from multiple partial recordings of the same neural circuit

Advances in Neural Information Processing Systems 26: 27th Conference on Neural Information Processing Systems (NeurIPS 2013), pp. 539-547, 2014
URL | pdf

Reviews and Book Chapters

Macke JH

Electrophysiology Analysis, Bayesian

Bayesian analysis of electrophysiological data refers to the statistical processing of data obtained in electrophysiological experiments (i.e., recordings of action potentials or voltage measurements with electrodes or imaging devices) which utilize methods from Bayesian statistics. Bayesian statistics is a framework for describing and modelling empirical data using the mathematical language of probability to model uncertainty. Bayesian statistics provides a principled and flexible framework for combining empirical observations with prior knowledge and for quantifying uncertainty. These features are especially useful for analysis questions in which the dataset sizes are small in comparison to the complexity of the model, which is often the case in neurophysiological data analysis.

Encyclopedia of Computational Neuroscience, pp. 1-5, 2014
URL | DOI | preprint

2013

Research Articles

Watanabe M, Bartels A, Macke JH, Murayama Y, Logothetis NK

Temporal Jitter of the BOLD Signal Reveals a Reliable Initial Dip and Improved Spatial Resolution

fMRI, one of the most important noninvasive brain imaging methods, relies on the blood oxygen level-dependent (BOLD) signal, whose precise underpinnings are still not fully understood [1]. It is a widespread assumption that the components of the hemodynamic response function (HRF) are fixed relative to each other in time, leading most studies as well as analysis tools to focus on trial-averaged responses, thus using or estimating a condition- or location-specific “canonical HRF” [2, 3 and 4]. In the current study, we examined the nature of the variability of the BOLD response and asked in particular whether the positive BOLD peak is subject to trial-to-trial temporal jitter. Our results show that the positive peak of the stimulus-evoked BOLD signal exhibits a trial-to-trial temporal jitter on the order of seconds. Moreover, the trial-to-trial variability can be exploited to uncover the initial dip in the majority of voxels by pooling trial responses with large peak latencies. Initial dips exposed by this procedure possess higher spatial resolution compared to the positive BOLD signal in the human visual cortex. These findings allow for the reliable observation of fMRI signals that are physiologically closer to neural activity, leading to improvements in both temporal and spatial resolution.

Current Biology, 23 (21), pp. 2146–2150, 2013
URL | DOI

Macke JH, Murray I, Latham PE

Estimation bias in maximum entropy models

Maximum entropy models have become popular statistical models in neuroscience and other areas in biology and can be useful tools for obtaining estimates of mutual information in biological systems. However, maximum entropy models fit to small data sets can be subject to sampling bias; i.e., the true entropy of the data can be severely underestimated. Here, we study the sampling properties of estimates of the entropy obtained from maximum entropy models. We focus on pairwise binary models, which are used extensively to model neural population activity. We show that if the data is well described by a pairwise model, the bias is equal to the number of parameters divided by twice the number of observations. If, however, the higher order correlations in the data deviate from those predicted by the model, the bias can be larger. Using a phenomenological model of neural population recordings, we find that this additional bias is highest for small firing probabilities, strong correlations and large population sizes—for the parameters we tested, a factor of about four higher. We derive guidelines for how long a neurophysiological experiment needs to be in order to ensure that the bias is less than a specified criterion. Finally, we show how a modified plug-in estimate of the entropy can be used for bias correction.

Entropy, 15 (8), pp. 3109-3129, 2013
URL | DOI | pdf | code

Haefner RM, Gerwinn S, Macke JH, Bethge M

Inferring decoding strategies from choice probabilities in the presence of correlated variability

The activity of cortical neurons in sensory areas covaries with perceptual decisions, a relationship that is often quantified by choice probabilities. Although choice probabilities have been measured extensively, their interpretation has remained fraught with difficulty. We derive the mathematical relationship between choice probabilities, read-out weights and correlated variability in the standard neural decision-making model. Our solution allowed us to prove and generalize earlier observations on the basis of numerical simulations and to derive new predictions. Notably, our results indicate how the read-out weight profile, or decoding strategy, can be inferred from experimentally measurable quantities. Furthermore, we developed a test to decide whether the decoding weights of individual neurons are optimal for the task, even without knowing the underlying correlations. We confirmed the practicality of our approach using simulated data from a realistic population model. Thus, our findings provide a theoretical foundation for a growing body of experimental results on choice probabilities and correlations.

Nature Neuroscience, 16 (2), pp. 235–242, 2013
URL | DOI | news | code

Buesing L, Macke JH, Sahani M

Spectral learning of linear dynamics from generalised-linear observations with application to neural population data

Latent linear dynamical systems with generalised-linear observation models arise in a variety of applications, for example when modelling the spiking activity of populations of neurons. Here, we show how spectral learning methods for linear systems with Gaussian observations (usually called subspace identification in this context) can be extended to estimate the parameters of dynamical system models observed through non-Gaussian noise models. We use this approach to obtain estimates of parameters for a dynamical model of neural population data, where the observed spike-counts are Poisson-distributed with log-rates determined by the latent dynamical process, possibly driven by external inputs. We show that the extended system identification algorithm is consistent and accurately recovers the correct parameters on large simulated data sets with much smaller computational cost than approximate expectation-maximisation (EM) due to the non-iterative nature of subspace identification. Even on smaller data sets, it provides an effective initialization for EM, leading to more robust performance and faster convergence. These benefits are shown to extend to real neural data.

Advances in Neural Information Processing Systems 25: 26th Conference on Neural Information Processing Systems (NeurIPS 2012), pp. 1691-1699, 2013
URL | pdf

2012

Research Articles

Schwartz G, Macke JH, Amodei D, Tang H, Berry MJ

Low Error Discrimination using a Correlated Population Code

We explored the manner in which spatial information is encoded by retinal ganglion cell populations. We flashed a set of 36 shape stimuli onto the tiger salamander retina and used different decoding algorithms to read out information from a population of 162 ganglion cells. We compared the discrimination performance of linear decoders, which ignore correlation induced by common stimulation, against nonlinear decoders, which can accurately model these correlations. Similar to previous studies, decoders that ignored correlation suffered only a modest drop in discrimination performance for groups of up to ∼30 cells. However, for more realistic groups of 100+ cells, we found order-of-magnitude differences in the error rate. We also compared decoders that used only the presence of a single spike from each cell against more complex decoders that included information from multiple spike counts and multiple time bins. More complex decoders substantially outperformed simpler decoders, showing the importance of spike timing information. Particularly effective was the first spike latency representation, which allowed zero discrimination errors for the majority of shape stimuli. Furthermore, the performance of nonlinear decoders showed even greater enhancement compared to linear decoders for these complex representations. Finally, decoders that approximated the correlation structure in the population by matching all pairwise correlations with a maximum entropy model fit to all 162 neurons were quite successful, especially for the spike latency representation. Together, these results suggest a picture in which linear decoders allow a coarse categorization of shape stimuli, while nonlinear decoders, which take advantage of both correlation and spike timing, are needed to achieve high-fidelity discrimination.

Journal of Neurophysiology, 108 (4), pp. 1069-1088, 2012
URL | DOI | code

Buesing L, Macke JH, Sahani M

Learning stable, regularised latent models of neural population dynamics

Ongoing advances in experimental technique are making commonplace simultaneous recordings of the activity of tens to hundreds of cortical neurons at high temporal resolution. Latent population models, including Gaussian-process factor analysis and hidden linear dynamical system (LDS) models, have proven effective at capturing the statistical structure of such data sets. They can be estimated efficiently, yield useful visualisations of population activity, and are also integral building-blocks of decoding algorithms for brain-machine interfaces (BMI). One practical challenge, particularly to LDS models, is that when parameters are learned using realistic volumes of data the resulting models often fail to reflect the true temporal continuity of the dynamics; and indeed may describe a biologically-implausible unstable population dynamic; that is, it may predict neural activity that grows without bound. We propose a method for learning LDS models based on expectation maximisation that constrains parameters to yield stable systems and at the same time promotes capture of temporal structure by appropriate regularisation. We show that when only little training data is available our method yields LDS parameter estimates which provide a substantially better statistical description of the data than alternatives, whilst guaranteeing stable dynamics. We demonstrate our methods using both synthetic data and extracellular multi-electrode recordings from motor cortex.

Network, 23 (1-2), pp. 24-47, 2012
URL | DOI | pdf

Macke JH, Murray I, Latham P

How biased are maximum entropy models?

Maximum entropy models have become popular statistical models in neuroscience and other areas in biology, and can be useful tools for obtaining estimates of mutual information in biological systems. However, maximum entropy models fit to small data sets can be subject to sampling bias; i.e. the true entropy of the data can be severely underestimated. Here we study the sampling properties of estimates of the entropy obtained from maximum entropy models. We show that if the data is generated by a distribution that lies in the model class, the bias is equal to the number of parameters divided by twice the number of observations. However, in practice, the true distribution is usually outside the model class, and we show here that this misspecification can lead to much larger bias. We provide a perturbative approximation of the maximally expected bias when the true model is out of model class, and we illustrate our results using numerical simulations of an Ising model; i.e. the second-order maximum entropy distribution on binary data.

Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems (NeurIPS 2011), pp. 2034-2042, 2012
URL | pdf | code

Macke JH, Büsing L, Cunningham JP, Yu BM, Shenoy KV, Sahani M

Empirical models of spiking in neural populations

Neurons in the neocortex code and compute as part of a locally interconnected population. Large-scale multi-electrode recording makes it possible to access these population processes empirically by fitting statistical models to unaveraged data. What statistical structure best describes the concurrent spiking of cells within a local network? We argue that in the cortex, where firing exhibits extensive correlations in both time and space and where a typical sample of neurons still reflects only a very small fraction of the local population, the most appropriate model captures shared variability by a low-dimensional latent process evolving with smooth dynamics, rather than by putative direct coupling. We test this claim by comparing a latent dynamical model with realistic spiking observations to coupled generalised linear spike-response models (GLMs) using cortical recordings. We find that the latent dynamical approach outperforms the GLM in terms of goodness-of-fit, and reproduces the temporal correlations in the data more accurately. We also compare models whose observation models are either derived from a Gaussian or point-process models, finding that the non-Gaussian model provides slightly better goodness-of-fit and more realistic population spike counts.

Advances in Neural Information Processing Systems 24: 25th Conference on Neural Information Processing Systems (NeurIPS 2011), pp. 1350-1358, 2012
URL | pdf | code

2011

Research Articles

Macke JH, Gerwinn S, White LW, Kaschube M, Bethge M

Gaussian process methods for estimating cortical maps

A striking feature of cortical organization is that the encoding of many stimulus features, for example orientation or direction selectivity, is arranged into topographic maps. Functional imaging methods such as optical imaging of intrinsic signals, voltage sensitive dye imaging or functional magnetic resonance imaging are important tools for studying the structure of cortical maps. As functional imaging measurements are usually noisy, statistical processing of the data is necessary to extract maps from the imaging data. We here present a probabilistic model of functional imaging data based on Gaussian processes. In comparison to conventional approaches, our model yields superior estimates of cortical maps from smaller amounts of data. In addition, we obtain quantitative uncertainty estimates, i.e. error bars on properties of the estimated map. We use our probabilistic model to study the coding properties of the map and the role of noise-correlations by decoding the stimulus from single trials of an imaging experiment.

NeuroImage, 56 (2), pp. 570-581, 2011
URL | DOI | code

Macke JH, Opper M, Bethge M

Common Input Explains Higher-Order Correlations and Entropy in a Simple Model of Neural Population Activity

Simultaneously recorded neurons exhibit correlations whose underlying causes are not known. Here, we use a population of threshold neurons receiving correlated inputs to model neural population recordings. We show analytically that small changes in second-order correlations can lead to large changes in higher-order redundancies, and that the resulting interactions have a strong impact on the entropy, sparsity, and statistical heat capacity of the population. Our findings for this simple model may explain some surprising effects recently observed in neural population recordings.

Physical Review Letters, 106 (20), p. 208102, 2011
URL | DOI | code | supplement

Reviews and Book Chapters

Macke J, Berens P, Bethge M

Statistical analysis of multi-cell recordings: linking population coding models to experimental data

Modern recording techniques such as multi-electrode arrays and two-photon imaging methods are capable of simultaneously monitoring the activity of large neuronal ensembles at single cell resolution. These methods finally give us the means to address some of the most crucial questions in systems neuroscience: what are the dynamics of neural population activity? How do populations of neurons perform computations? What is the functional organization of neural ensembles? While the wealth of new experimental data generated by these techniques provides exciting opportunities to test ideas about how neural ensembles operate, it also provides major challenges: multi-cell recordings necessarily yield data which is high-dimensional in nature. Understanding this kind of data requires powerful statistical techniques for capturing the structure of the neural population responses, as well as their relationship with external stimuli or behavioral observations. Furthermore, linking recorded neural population activity to the predictions of theoretical models of population coding has turned out not to be straightforward. These challenges motivated us to organize a workshop at the 2009 Computational Neuroscience Meeting in Berlin to discuss these issues. In order to collect some of the recent progress in this field, and to foster discussion on the most important directions and most pressing questions, we issued a call for papers for this Research Topic. We asked authors to address the following four questions: 1. What classes of statistical methods are most useful for modeling population activity? 2. What are the main limitations of current approaches, and what can be done to overcome them? 3. How can statistical methods be used to empirically test existing models of (probabilistic) population coding? 4. What role can statistical methods play in formulating novel hypotheses about the principles of information processing in neural populations? 
A total of 15 papers addressing questions related to these themes are now collected in this Research Topic. Three of these articles have resulted in "Focused reviews" in Frontiers in Neuroscience (Crumiller et al., 2011; Rosenbaum et al., 2011; Tchumatchenko et al., 2011), illustrating the great interest in the topic. Many of the articles are devoted to a better understanding of how correlations arise in neural circuits, and how they can be detected, modeled, and interpreted. For example, by modeling how pairwise correlations are transformed by spiking non-linearities in simple neural circuits, Tchumatchenko et al. (2010) show that pairwise correlation coefficients have to be interpreted with care, since their magnitude can depend strongly on the temporal statistics of their input-correlations. In a similar spirit, Rosenbaum et al. (2010) study how correlations can arise and accumulate in feed-forward circuits as a result of pooling of correlated inputs. Lyamzin et al. (2010) and Krumin et al. (2010) present methods for simulating correlated population activity and extend previous work to more general settings. The method of Lyamzin et al. (2010) allows one to generate synthetic spike trains which match commonly reported statistical properties, such as time varying firing rates as well as signal and noise correlations. The Hawkes framework presented by Krumin et al. (2010) allows one to fit models of recurrent population activity to the correlation-structure of experimental data. Louis et al. (2010) present a novel method for generating surrogate spike trains which can be useful when trying to assess the significance and time-scale of correlations in neural spike trains. Finally, Pipa and Munk (2011) study spike synchronization in prefrontal cortex during working memory.
A number of studies are also devoted to advancing our methodological toolkit for analyzing various aspects of population activity (Gerwinn et al., 2010; Machens, 2010; Staude et al., 2010; Yu et al., 2010). For example, Gerwinn et al. (2010) explain how full probabilistic inference can be performed in the popular model class of generalized linear models (GLMs), and study the effect of using prior distributions on the parameters of the stimulus and coupling filters. Staude et al. (2010) extend a method for detecting higher-order correlations between neurons via population spike counts to non-stationary settings. Yu et al. (2010) describe a new technique for estimating the information rate of a population of neurons using frequency-domain methods. Machens (2010) introduces a novel extension of principal component analysis for separating the variability of a neural response into different sources. Focusing less on the spike responses of neural populations but on aggregate signals of population activity, Boatman-Reich et al. (2010) and Hoerzer et al. (2010) describe methods for a quantitative analysis of field potential recordings. While Boatman-Reich et al. (2010) discuss a number of existing techniques in a unified framework and highlight the potential pitfalls associated with such approaches, Hoerzer et al. (2010) demonstrate how multivariate autoregressive models and the concept of Granger causality can be used to infer local functional connectivity in area V4 of behaving macaques. A final group of studies is devoted to understanding experimental data in light of computational models (Galán et al., 2010; Pandarinath et al., 2010; Shteingart et al., 2010). Pandarinath et al. (2010) present a novel mechanism that may explain how neural networks in the retina switch from one state to another by a change in gap junction coupling, and conjecture that this mechanism might also be found in other neural circuits.
Galán et al. (2010) present a model of how hypoxia may change the network structure in the respiratory networks in the brainstem, and analyze neural correlations in multi-electrode recordings in light of this model. Finally, Shteingart et al. (2010) show that the spontaneous activation sequences they find in cultured networks cannot be explained by Zipf’s law, but rather require a wrestling model. The papers of this Research Topic thus span a wide range of topics in the statistical modeling of multi-cell recordings. Together with other recent advances, they provide us with a useful toolkit to tackle the challenges presented by the vast amount of data collected with modern recording techniques. The impact of novel statistical methods on the field and their potential to generate scientific progress, however, depends critically on how readily they can be adopted and applied by laboratories and researchers working with experimental data. An important step toward this goal is to also publish computer code along with the articles (Barnes, 2010) as a successful implementation of advanced methods also relies on many details which are hard to communicate in the article itself. In this way it becomes much more likely that other researchers can actually use the methods, and unnecessary re-implementations can be avoided. Some of the papers in this Research Topic already follow this goal (Gerwinn et al., 2010; Louis et al., 2010; Lyamzin et al., 2010). We hope that this practice becomes more and more common in the future and encourage authors and editors of Research Topics to make as much code available as possible, ideally in a format that can be easily integrated with existing software sharing initiatives (Herz et al., 2008; Goldberg et al., 2009).

Frontiers in Computational Neuroscience, 5 (35), pp. 1-2, 2011
URL | DOI | pdf-book(big!)

Gerwinn S, Macke JH, Bethge M

Reconstructing stimuli from the spike-times of leaky integrate and fire neurons

Reconstructing stimuli from the spike trains of neurons is an important approach for understanding the neural code. One of the difficulties associated with this task is that signals which are varying continuously in time are encoded into sequences of discrete events or spikes. An important problem is to determine how much information about the continuously varying stimulus can be extracted from the time-points at which spikes were observed, especially if these time-points are subject to some sort of randomness. For the special case of spike trains generated by leaky integrate and fire neurons, noise can be introduced by allowing variations in the threshold every time a spike is released. A simple decoding algorithm previously derived for the noiseless case can be extended to the stochastic case, but turns out to be biased. Here, we review a solution to this problem, by presenting a simple yet efficient algorithm which greatly reduces the bias, and therefore leads to better decoding performance in the stochastic case.

Frontiers in Neuroscience, 5 (1), pp. 1-16, 2011
URL | DOI

2010

Research Articles

Lyamzin DR, Macke JH, Lesica NA

Modeling population spike trains with specified time-varying spike rates, trial-to-trial variability, and pairwise signal and noise correlations

As multi-electrode and imaging technology begin to provide us with simultaneous recordings of large neuronal populations, new methods for modeling such data must also be developed. Here, we present a model for the type of data commonly recorded in early sensory pathways: responses to repeated trials of a sensory stimulus in which each neuron has its own time-varying spike rate (as described by its PSTH) and the dependencies between cells are characterized by both signal and noise correlations. This model is an extension of previous attempts to model population spike trains designed to control only the total correlation between cells. In our model, the response of each cell is represented as a binary vector given by the dichotomized sum of a deterministic "signal" that is repeated on each trial and a Gaussian random "noise" that is different on each trial. This model allows the simulation of population spike trains with PSTHs, trial-to-trial variability, and pairwise correlations that match those measured experimentally. Furthermore, the model also allows the noise correlations in the spike trains to be manipulated independently of the signal correlations and single-cell properties. To demonstrate the utility of the model, we use it to simulate and manipulate experimental responses from the mammalian auditory and visual systems. We also present a general form of the model in which both the signal and noise are Gaussian random processes, allowing the mean spike rate, trial-to-trial variability, and pairwise signal and noise correlations to be specified independently. Together, these methods for modeling spike trains comprise a potentially powerful set of tools for both theorists and experimentalists studying population responses in sensory systems.

Frontiers in Computational Neuroscience, 4 (144), pp. 1-11, 2010
URL | DOI | pdf

Macke JH, Wichmann FA

Estimating predictive stimulus features from psychophysical data: The decision image technique applied to human faces

One major challenge in the sensory sciences is to identify the stimulus features on which sensory systems base their computations, and which are predictive of a behavioral decision: they are a prerequisite for computational models of perception. We describe a technique (decision images) for extracting predictive stimulus features using logistic regression. A decision image not only defines a region of interest within a stimulus but is a quantitative template which defines a direction in stimulus space. Decision images thus enable the development of predictive models, as well as the generation of optimized stimuli for subsequent psychophysical investigations. Here we describe our method and apply it to data from a human face classification experiment. We show that decision images are able to predict human responses not only in terms of overall percent correct but also in terms of the probabilities with which individual faces are (mis-) classified by individual observers. We show that the most predictive dimension for gender categorization is neither aligned with the axis defined by the two class-means, nor with the first principal component of all faces (two hypotheses frequently entertained in the literature). Our method can be applied to a wide range of binary classification tasks in vision or other psychophysical contexts.

Journal of Vision, 10 (5), p. 22, 2010
URL | DOI | pdf
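
The abstract above describes estimating a decision image with logistic regression. A minimal sketch of that idea, assuming stimuli are flattened into short vectors and the observer is simulated rather than human: the fitted weight vector plays the role of the decision image. The names `fit_decision_image` and `simulate_observer`, the plain gradient-ascent fit, and the synthetic template are illustrative assumptions, not the procedure used in the paper.

```python
import math
import random


def sigmoid(a):
    """Numerically stable logistic function."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)


def fit_decision_image(stimuli, responses, lr=0.5, iters=200):
    """Fit logistic-regression weights and a bias by gradient ascent
    on the mean log-likelihood. The weights act as the decision image."""
    n, d = len(stimuli), len(stimuli[0])
    w, b = [0.0] * d, 0.0
    for _ in range(iters):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(stimuli, responses):
            err = y - sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))
            grad_b += err
            for j in range(d):
                grad_w[j] += err * x[j]
        b += lr * grad_b / n
        w = [wi + lr * g / n for wi, g in zip(w, grad_w)]
    return w, b


def simulate_observer(template, n, rng):
    """Hypothetical observer: answers 1 with probability
    sigmoid(template . stimulus) on Gaussian-noise stimuli."""
    stimuli, responses = [], []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in template]
        p = sigmoid(sum(t * xi for t, xi in zip(template, x)))
        stimuli.append(x)
        responses.append(1 if rng.random() < p else 0)
    return stimuli, responses


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / math.sqrt(sum(a * a for a in u) * sum(b * b for b in v))
```

For example, fitting to responses simulated from a known template and comparing `cosine(w, template)` gives a quick check that the recovered direction in stimulus space matches the one that generated the decisions.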

Gerwinn S, Macke J, Bethge M

Bayesian inference for generalized linear models for spiking neurons

Generalized Linear Models (GLMs) are commonly used statistical methods for modelling the relationship between neural population activity and presented stimuli. When the dimension of the parameter space is large, strong regularization has to be used in order to fit GLMs to datasets of realistic size without overfitting. By imposing properly chosen priors over parameters, Bayesian inference provides an effective and principled approach for achieving regularization. Here we show how the posterior distribution over model parameters of GLMs can be approximated by a Gaussian using the Expectation Propagation algorithm. In this way, we obtain an estimate of the posterior mean and posterior covariance, allowing us to calculate Bayesian confidence intervals that characterize the uncertainty about the optimal solution. From the posterior we also obtain a different point estimate, namely the posterior mean as opposed to the commonly used maximum a posteriori estimate. We systematically compare the different inference techniques on simulated data as well as on multi-electrode recordings of retinal ganglion cells, and explore the effects of the chosen prior and the performance measure used. We find that good performance can be achieved by choosing a Laplace prior together with the posterior mean estimate.

Frontiers in Computational Neuroscience, 4 (12), pp. 1-17, 2010
URL | DOI | pdf

Macke JH, Gerwinn S, Kaschube M, White LE, Bethge M

Bayesian estimation of orientation preference maps

Imaging techniques such as optical imaging of intrinsic signals, 2-photon calcium imaging and voltage sensitive dye imaging can be used to measure the functional organization of visual cortex across different spatial and temporal scales. Here, we present Bayesian methods based on Gaussian processes for extracting topographic maps from functional imaging data. In particular, we focus on the estimation of orientation preference maps (OPMs) from intrinsic signal imaging data. We model the underlying map as a bivariate Gaussian process, with a prior covariance function that reflects known properties of OPMs, and a noise covariance adjusted to the data. The posterior mean can be interpreted as an optimally smoothed estimate of the map, and can be used for model based interpolations of the map from sparse measurements. By sampling from the posterior distribution, we can get error bars on statistical properties such as preferred orientations, pinwheel locations or pinwheel counts. Finally, the use of an explicit probabilistic model facilitates interpretation of parameters and quantitative model comparisons. We demonstrate our model both on simulated data and on intrinsic signaling data from ferret visual cortex.

Advances in Neural Information Processing Systems 22: 23rd Conference on Neural Information Processing Systems (NeurIPS 2009), pp. 1195-1203, 2010
URL | pdf | code

2009

Research Articles

Gerwinn S, Macke JH, Bethge M

Bayesian population decoding of spiking neurons

The timing of action potentials in spiking neurons depends on the temporal dynamics of their inputs and contains information about temporal fluctuations in the stimulus. Leaky integrate-and-fire neurons constitute a popular class of encoding models, in which spike times depend directly on the temporal structure of the inputs. However, optimal decoding rules for these models have only been studied explicitly in the noiseless case. Here, we study decoding rules for probabilistic inference of a continuous stimulus from the spike times of a population of leaky integrate-and-fire neurons with threshold noise. We derive three algorithms for approximating the posterior distribution over stimuli as a function of the observed spike trains. In addition to a reconstruction of the stimulus we thus obtain an estimate of the uncertainty as well. Furthermore, we derive a 'spike-by-spike' online decoding scheme that recursively updates the posterior with the arrival of each new spike. We use these decoding rules to reconstruct time-varying stimuli represented by a Gaussian process from spike trains of single neurons as well as neural populations.

Frontiers in Computational Neuroscience, 3 (21), pp. 1-14, 2009
URL | DOI | pdf

Macke JH, Berens P, Ecker AS, Tolias AS, Bethge M

Generating Spike Trains with Specified Correlation Coefficients

Spike trains recorded from populations of neurons can exhibit substantial pairwise correlations between neurons and rich temporal structure. Thus, for the realistic simulation and analysis of neural systems, it is essential to have efficient methods for generating artificial spike trains with specified correlation structure. Here we show how correlated binary spike trains can be simulated by means of a latent multivariate Gaussian model. Sampling from the model is computationally very efficient and, in particular, feasible even for large populations of neurons. The entropy of the model is close to the theoretical maximum for a wide range of parameters. In addition, this framework naturally extends to correlations over time and offers an elegant way to model correlated neural spike counts with arbitrary marginal distributions.

Neural Computation, 21 (2), pp. 397-423, 2009
URL | DOI | pdf | code
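
To make the latent Gaussian idea in the abstract above concrete, here is a minimal two-neuron sketch: binary variables are obtained by thresholding a correlated bivariate Gaussian, and the latent correlation is chosen so that the binary variables reach a target correlation coefficient. This is an illustration under stated assumptions, not the code linked above: the function names (`binary_cov`, `latent_corr`, `sample_pair`), the midpoint-rule integration, and the bisection search are choices made for this sketch. It relies on the standard identity that the covariance of two thresholded unit Gaussians equals the integral of the bivariate normal density at the thresholds over the correlation from 0 to the latent value.

```python
import math
import random
from statistics import NormalDist


def binary_cov(g1, g2, lam, steps=400):
    """Covariance of X1 = [Z1 <= g1] and X2 = [Z2 <= g2] for unit Gaussians
    with correlation lam, via a midpoint rule on
    integral_0^lam phi2(g1, g2; r) dr."""
    dr = lam / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * dr
        s = 1.0 - r * r
        quad = g1 * g1 - 2.0 * r * g1 * g2 + g2 * g2
        total += math.exp(-quad / (2.0 * s)) / (2.0 * math.pi * math.sqrt(s))
    return total * dr


def latent_corr(p1, p2, rho, tol=1e-10):
    """Bisection for the latent correlation that yields binary correlation
    rho between neurons with spike probabilities p1 and p2."""
    nd = NormalDist()
    g1, g2 = nd.inv_cdf(p1), nd.inv_cdf(p2)
    target = rho * math.sqrt(p1 * (1 - p1) * p2 * (1 - p2))
    lo, hi = -0.999, 0.999
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # binary_cov increases monotonically with the latent correlation
        if binary_cov(g1, g2, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def sample_pair(p1, p2, lam, n, rng):
    """Draw n pairs of binary values by thresholding correlated Gaussians."""
    nd = NormalDist()
    g1, g2 = nd.inv_cdf(p1), nd.inv_cdf(p2)
    c = math.sqrt(1.0 - lam * lam)
    pairs = []
    for _ in range(n):
        a, b = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        z1, z2 = a, lam * a + c * b  # corr(z1, z2) = lam
        pairs.append((int(z1 <= g1), int(z2 <= g2)))
    return pairs
```

Calling `latent_corr(0.2, 0.3, 0.3)` and passing the result to `sample_pair` produces binary samples whose empirical means and correlation should be close to the requested values; extending the sketch to many neurons would additionally require the resulting latent correlation matrix to be positive definite.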

Preprints and Technical Reports

Macke JH, Opper M, Bethge M

The effect of pairwise neural correlations on global population statistics

Simultaneously recorded neurons often exhibit correlations in their spiking activity. These correlations shape the statistical structure of the population activity, and can lead to substantial redundancy across neurons. Here, we study the effect of pairwise correlations on the population spike count statistics and redundancy in populations of threshold-neurons in which response-correlations arise from correlated Gaussian inputs. We investigate the scaling of the redundancy as the population size is increased, and compare the asymptotic redundancy in our models to the corresponding maximum- and minimum entropy models.

MPG Technical Report, (183), 2009
PDF

2008

Research Articles

Ku S-P, Gretton A, Macke J, Logothetis NK

Comparison of Pattern Recognition Methods in Classifying High-resolution BOLD Signals Obtained at High Magnetic Field in Monkeys

Pattern recognition methods have shown that functional magnetic resonance imaging (fMRI) data can reveal significant information about brain activity. For example, in the debate of how object categories are represented in the brain, multivariate analysis has been used to provide evidence of a distributed encoding scheme [Science 293:5539 (2001) 2425-2430]. Many follow-up studies have employed different methods to analyze human fMRI data with varying degrees of success [Nature reviews 7:7 (2006) 523-534]. In this study, we compare four popular pattern recognition methods: correlation analysis, support-vector machines (SVM), linear discriminant analysis (LDA) and Gaussian naïve Bayes (GNB), using data collected at high field (7 Tesla) with higher resolution than usual fMRI studies. We investigate prediction performance on single trials and for averages across varying numbers of stimulus presentations. The performance of the various algorithms depends on the nature of the brain activity being categorized: for several tasks, many of the methods work well, whereas for others, no method performs above chance level. An important factor in overall classification performance is careful preprocessing of the data, including dimensionality reduction, voxel selection and outlier elimination.

Magnetic Resonance Imaging, 26 (7), pp. 1007-1014, 2008
URL | DOI | pdf

Macke JH, Zeck G, Bethge M

Receptive Fields without Spike-Triggering

Stimulus selectivity of sensory neurons is often characterized by estimating their receptive field properties such as orientation selectivity. Receptive fields are usually derived from the mean (or covariance) of the spike-triggered stimulus ensemble. This approach treats each spike as an independent message but does not take into account that information might be conveyed through patterns of neural activity that are distributed across space or time. Can we find a concise description for the processing of a whole population of neurons analogous to the receptive field for single neurons? Here, we present a generalization of the linear receptive field which is not bound to be triggered on individual spikes but can be meaningfully linked to distributed response patterns. More precisely, we seek to identify those stimulus features and the corresponding patterns of neural activity that are most reliably coupled. We use an extension of reverse-correlation methods based on canonical correlation analysis. The resulting population receptive fields span the subspace of stimuli that is most informative about the population response. We evaluate our approach using both neuronal models and multi-electrode recordings from rabbit retinal ganglion cells. We show how the model can be extended to capture nonlinear stimulus-response relationships using kernel canonical correlation analysis, which makes it possible to test different coding mechanisms. Our technique can also be used to calculate receptive fields from multi-dimensional neural measurements such as those obtained from dynamic imaging methods.

Advances in Neural Information Processing Systems 20: 21st Annual Conference on Neural Information Processing Systems (NeurIPS 2007), pp. 969-976, 2008
URL | pdf

Gerwinn S, Macke J, Seeger M, Bethge M

Bayesian Inference for Spiking Neuron Models with a Sparsity Prior

Generalized linear models are the most commonly used tools to describe the stimulus selectivity of sensory neurons. Here we present a Bayesian treatment of such models. Using the expectation propagation algorithm, we are able to approximate the full posterior distribution over all weights. In addition, we use a Laplacian prior to favor sparse solutions. Therefore, stimulus features that do not critically influence neural activity will be assigned zero weights and thus be effectively excluded by the model. This feature selection mechanism facilitates both the interpretation of the neuron model as well as its predictive abilities. The posterior distribution can be used to obtain confidence intervals which makes it possible to assess the statistical significance of the solution. In neural data analysis, the available amount of experimental measurements is often limited whereas the parameter space is large. In such a situation, both regularization by a sparsity prior and uncertainty estimates for the model parameters are essential. We apply our method to multi-electrode recordings of retinal ganglion cells and use our uncertainty estimate to test the statistical significance of functional couplings between neurons. Furthermore we used the sparsity of the Laplace prior to select those filters from a spike-triggered covariance analysis that are most informative about the neural response.

Advances in Neural Information Processing Systems 20: 21st Conference on Neural Information Processing Systems (NeurIPS 2007), pp. 529-536, 2008
URL | pdf

Macke JH, Maack N, Gupta R, Denk W, Schölkopf B, Borst A

Contour-propagation Algorithms for Semi-automated Reconstruction of Neural Processes

A new technique, Serial Block Face Scanning Electron Microscopy (SBFSEM), allows for automatic sectioning and imaging of biological tissue with a scanning electron microscope. Image stacks generated with this technology have a resolution sufficient to distinguish different cellular compartments, including synaptic structures, which should make it possible to obtain detailed anatomical knowledge of complete neuronal circuits. Such an image stack contains several thousands of images and is recorded with a minimal voxel size of 10-20 nm in the x- and y-directions and 30 nm in the z-direction. Consequently, a tissue block of 1 mm³ (the approximate volume of the Calliphora vicina brain) will produce several hundred terabytes of data. Therefore, highly automated 3D reconstruction algorithms are needed. As a first step in this direction we have developed semiautomated segmentation algorithms for a precise contour tracing of cell membranes. These algorithms were embedded into an easy-to-operate user interface, which allows direct 3D observation of the extracted objects during the segmentation of image stacks. Compared to purely manual tracing, processing time is greatly reduced.

Journal of Neuroscience Methods, 167 (2), pp. 349-357, 2008
URL | DOI | pdf

2007

Research Articles

Bethge M, Gerwinn S, Macke JH

Unsupervised learning of a steerable basis for invariant image representations

There are two aspects to unsupervised learning of invariant representations of images: First, we can reduce the dimensionality of the representation by finding an optimal trade-off between temporal stability and informativeness. We show that the answer to this optimization problem is generally not unique so that there is still considerable freedom in choosing a suitable basis. Which of the many optimal representations should be selected? Here, we focus on this second aspect, and seek to find representations that are invariant under geometrical transformations occurring in sequences of natural images. We utilize ideas of steerability and Lie groups, which have been developed in the context of filter design. In particular, we show how an anti-symmetric version of canonical correlation analysis can be used to learn a full-rank image basis which is steerable with respect to rotations. We provide a geometric interpretation of this algorithm by showing that it finds the two-dimensional eigensubspaces of the average bivector. For data which exhibits a variety of transformations, we develop a bivector clustering algorithm, which we use to learn a basis of generalized quadrature pairs (i.e. complex cells) from sequences of natural images.

Human Vision and Electronic Imaging XII: Proceedings of the SPIE Human Vision and Electronic Imaging Conference 2007, pp. 1-12, 2007
URL | DOI | pdf

Laub J, Macke JH, Müller K-R, Wichmann FA

Inducing Metric Violations in Human Similarity Judgements

Attempting to model human categorization and similarity judgements is both a very interesting and an exceedingly difficult challenge. Some of the difficulty arises because of conflicting evidence whether human categorization and similarity judgements should or should not be modelled as operating on a mental representation that is essentially metric. Intuitively, this has a strong appeal as it would allow (dis)similarity to be represented geometrically as distance in some internal space. Here we show how a single stimulus, carefully constructed in a psychophysical experiment, introduces l2 violations in what used to be an internal similarity space that could be adequately modelled as Euclidean. We term this one influential data point a conflictual judgement. We present an algorithm for analysing such data and identifying the crucial point. Thus there may not be a strict dichotomy between either a metric or a non-metric internal space but rather degrees to which potentially large subsets of stimuli are represented metrically with a small subset causing a global violation of metricity.

Advances in Neural Information Processing Systems 19: 20th Conference on Neural Information Processing Systems (NeurIPS 2006), pp. 777-784, 2007
URL | pdf

Conference Presentations

2019

Nonnenmacher M, Lueckmann J, Bassetto G, Goncalves PJ, Macke JH

Inferring the parameters of neural simulations from high-dimensional observations

Many models in neuroscience, such as networks of spiking neurons or complex biophysical models, are defined as numerical simulators. This means one can simulate data from the model, but calculating the likelihoods associated with specific observations is hard or intractable, which in turn makes statistical inference challenging. So-called Approximate Bayesian Computation (ABC) aims to make Bayesian inference possible for likelihood-free models. However, standard ABC algorithms do not scale to high-dimensional observations, e.g. inference of receptive fields from high-dimensional stimuli. Here, we develop an approach to likelihood-free inference for high-dimensional data, where we train a neural network to perform statistical inference given adaptively simulated data sets. The network is composed of layers performing non-linear feature extraction, and fully connected layers for non-linear density estimation. Feature extraction layers are either convolutional or recurrent in structure, depending on whether the data is high-dimensional in space or time, respectively. This approach makes it possible to scale ABC to problems with high-dimensional inputs.

Computational and Systems Neuroscience Meeting (COSYNE 2019), (II-39), 2019
URL
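
For orientation, the classical rejection form of ABC that the abstract above takes as its starting point can be sketched in a few lines. This is the standard baseline, not the neural-network approach of the abstract, and everything specific here is an assumption made for illustration: the toy Gaussian simulator, the uniform prior, the summary statistic (a sample mean), the tolerance `eps`, and the function names.

```python
import random


def simulate_summary(theta, n_obs, rng):
    """Toy simulator: summary statistic (sample mean) of n_obs draws
    from a Gaussian with mean theta and unit variance."""
    return sum(rng.gauss(theta, 1.0) for _ in range(n_obs)) / n_obs


def rejection_abc(s_obs, n_draws, eps, rng, n_obs=20, prior=(-5.0, 5.0)):
    """Rejection ABC: draw theta from a uniform prior, simulate, and keep
    theta whenever the simulated summary lies within eps of s_obs.
    No likelihood is ever evaluated."""
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(*prior)
        if abs(simulate_summary(theta, n_obs, rng) - s_obs) < eps:
            accepted.append(theta)
    return accepted


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)
```

The accepted parameters approximate posterior samples. With a one-dimensional summary this works well, but the acceptance rate collapses as the dimension of the observation grows, which is the scaling problem the abstract refers to.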

2018

Nonnenmacher M, Goncalves PJ, Bassetto G, Lueckmann J, Macke JH

Robust statistical inference for simulation-based models in neuroscience

Bayesian statistical inference provides a principled framework for linking models of neural activity with empirical measurements. However, for many neuroscientific models of interest, and in particular those relying on numerical simulations, statistical inference is difficult and requires bespoke and expensive inference algorithms. Recent developments in machine learning from the field of likelihood-free inference [1,2] allow 'black-box' statistical inference on a wide range of models. These methods adaptively simulate synthetic datasets from the model in question and use these datasets to train a probabilistic neural network to perform statistical inference. Statistical inference is very fast once the network is trained, which makes it possible to infer model parameters conditioned on several observed datasets without additional costs. We simplify the usage and increase robustness of the involved deep learning methods to make them more widely applicable in the neuroscience community. Using convolutional networks, we furthermore extend the range of applications to high-dimensional data statistics and image-valued datasets. We illustrate how this approach can be used to perform parameter inference on a range of neuroscience applications from individual ion channels to single cells and neural circuits. In particular we infer receptive field (RF) parameters from single-cell neural activity in early visual areas resulting from white-noise stimulation (reverse correlation). We leverage recent work [3] on statistical inference for canonical RF models (Gabor filters for simple cells in V1). These parameterized RF models have interpretable parameters and strongly reduce the amount of data needed to constrain the posterior over those parameters. The approach will enable neuroscientists to perform Bayesian inference on complex models without having to design model-specific algorithms, closing the gap between biophysical and statistical approaches to neural activity.

Bernstein Conference 2018, (T 87), 2018
URL

2017

Goncalves PJ, Lueckmann J, Bassetto G, Nonnenmacher M, Macke JH

Flexible Bayesian inference for complex models of single neurons

Characterizing the input-output transformations of single neurons is critical for understanding neural computation. Single-neuron models have been extensively studied, ranging from simple phenomenological models to complex multi-compartment neurons. However, linking mechanistic models of single-neurons to empirical observations of neural activity has been challenging. Statistical inference is only possible for a few neuron models (e.g. GLMs), and no generally applicable, effective statistical inference algorithms are available: As a consequence, comparisons between models and data are either qualitative or rely on manual parameter tweaking, parameter-fitting using heuristics or brute-force search (Druckmann et al. 2007). Furthermore, parameter-fitting approaches typically return a single best-fitting estimate, but do not characterize the entire space of models that would be consistent with data (the 'posterior distribution'). We overcome this limitation by presenting a general method to infer the posterior distribution over model parameters given observed data on complex single-neuron models. Our approach can be applied in a 'black box' manner to a wide range of single-neuron models without requiring model-specific modifications. In particular, it extends to models without explicit likelihoods (e.g. most single-neuron models). We achieve this goal by building on recent advances in likelihood-free Bayesian inference (Papamakarios and Murray 2016): the key idea is to simulate multiple data-sets from different parameters, and then to train a probabilistic neural network which approximates the mapping from data to posterior distribution. We illustrate this approach using single- and multi-compartment models of single neurons: On simulated data, estimated posterior distributions recover ground-truth parameters, and reveal the manifold of parameters for which the model exhibits the same behaviour. 
On in-vitro recordings of membrane voltages, we recover multivariate posteriors over biophysical parameters, and voltage traces accurately match empirical data. Our approach will enable neuroscientists to perform Bayesian inference on complex neuron models without having to design model-specific algorithms, closing the gap between biophysical and statistical approaches to single-neuron modelling.

26th Annual Computational Neuroscience Meeting (CNS*2017), (Oral presentation 4), 2017
URL

Goncalves PJ, Lueckmann J, Bassetto G, Nonnenmacher M, Macke JH

Flexible Bayesian inference for mechanistic models of neural dynamics

One of the central goals of computational neuroscience is to understand the dynamics of single neurons and neural ensembles. However, linking mechanistic models of neural dynamics to empirical observations of neural activity has been challenging. Statistical inference is only possible for a few models of neural dynamics (e.g. GLMs), and no generally applicable, effective statistical inference algorithms are available: As a consequence, comparisons between models and data are either qualitative or rely on manual parameter tweaking, parameter-fitting using heuristics or brute-force search. Furthermore, parameter-fitting approaches typically return a single best-fitting estimate, but do not characterize the entire space of models that would be consistent with data. We overcome this limitation by presenting a general method for Bayesian inference on mechanistic models of neural dynamics. Our approach can be applied in a 'black box' manner to a wide range of neural models without requiring model-specific modifications. In particular, it extends to models without explicit likelihoods (e.g. most spiking networks). We achieve this goal by building on recent advances in likelihood-free Bayesian inference (Papamakarios and Murray 2016, Moreno et al. 2016): the key idea is to simulate multiple data-sets from different parameters, and then to train a probabilistic neural network which approximates the mapping from data to posterior distribution. We illustrate this approach using Hodgkin-Huxley models of single neurons and models of spiking networks: On simulated data, estimated posterior distributions recover ground-truth parameters, and reveal the manifold of parameters for which the model exhibits the same behaviour. On in-vitro recordings of membrane voltages, we recover multivariate posteriors over biophysical parameters, and voltage traces accurately match empirical data. 
Our approach will enable neuroscientists to perform Bayesian inference on complex neural dynamics models without having to design model-specific algorithms, closing the gap between biophysical and statistical approaches to neural dynamics.

Computational and Systems Neuroscience Meeting (COSYNE 2017), (II-3), 2017
URL

Lueckmann J, Macke JH, Nienborg H

Can serial dependencies in choices and neural activity explain choice probabilities?

The activity of sensory neurons co-varies with choice during perceptual decisions, commonly quantified as “choice probability”. Moreover, choices are influenced by a subject’s previous choice (serial dependencies) and neuronal activity often shows temporal correlations on long (seconds) timescales. Here, we ask whether these findings are linked, specifically: How are choice probabilities in sensory neurons influenced by serial dependencies in choices and neuronal activity? Do serial dependencies in choices and neural activity reflect the same underlying process? Using generalized linear models (GLMs) we analyze simultaneous measurements of behavior and V2 neural activity in macaques performing a visual discrimination task. We observe that past decisions are substantially more predictive of the current choice than the current spike count. Moreover, spiking activity exhibits strong correlations from trial to trial. We dissect temporal correlations by systematically varying the order of predictors in the GLM, and find that these correlations reflect two largely separate processes: There is neither a direct effect of the previous-trial spike count on choice, nor a direct effect of preceding choices on the spike count. Additionally, variability in spike counts can largely be explained by slow fluctuations across multiple trials (using a Gaussian Process latent modulator within the GLM). Is choice-probability explained by history effects, i.e. how big is the residual choice probability after correcting for temporal correlations? We compute semi-partial correlations between choices and neural activity, which constitute a lower bound on the residual choice probability. We find that removing history effects by using semi-partial correlations does not systematically change the magnitude of choice probabilities. We therefore conclude that despite the substantial serial dependencies in choices and neural activity these do not explain the observed choice probability. 
Rather, the serial dependencies in choices and spiking activity reflect two parallel processes which are correlated by instantaneous co-variations between choices and activity.

Computational and Systems Neuroscience Meeting (COSYNE 2017), (II-77), 2017
URL

Speiser A, Archer E, Turaga S, Macke JH

Amortized inference for fast spike prediction from calcium imaging data

Calcium imaging allows neuronal activity measurements from large populations of spatially identified neurons in-vivo. However, spike inference algorithms are needed to infer spike times from fluorescence measurements of calcium concentration. Bayesian model inversion can be used to infer spikes, using carefully designed generative models that describe how spiking activity in a neuron influences measured fluorescence. Model inversion typically requires either computationally expensive MCMC sampling methods, or faster but approximate maximum-a-posteriori estimation. We present a method for efficiently inverting generative models for spike inference. Our method is several orders of magnitude faster than existing approaches, allowing for generative-model based spike inference in real-time for large-scale population neural imaging, and can be applied to a wide range of linear and nonlinear generative models. We use recent advances in black-box variational inference (BBVI, Ranganath 2014) and ‘amortize’ inference by learning a deep network based recognition-model for fast model inversion (Mnih 2016). At training time, we simultaneously optimize the parameters of the generative model as well as the weights of a deep neural network which predicts the posterior approximation. At test time, performing inference for a given trace amounts to a fast single forward pass through the network at constant computational cost, and without the need for iterative optimization or MCMC sampling. On simple synthetic datasets, we show that our method is just as accurate as existing methods. However, the BBVI approach works with a wide range of generative models in a black-box manner as long as they are differentiable. In particular, we show that using a nonlinear generative model is better suited to describe GCaMP6 data (Chen 2013), leading to improved performance on real data. 
The framework can also easily be extended to combine supervised and unsupervised objectives, enabling semi-supervised learning of spike inference.

Computational and Systems Neuroscience Meeting (COSYNE 2017), (III-59), 2017
URL

Bassetto G, Macke JH

Using Bayesian inference to estimate receptive fields from a small number of spikes

A crucial step towards understanding how the external world is represented by sensory neurons is the characterization of neural receptive fields. Advances in experimental methods give increasing opportunity to study sensory processing in behaving animals, but also necessitate the ability to estimate receptive fields from very small spike-counts. For visual neurons, the stimulus space can be very high dimensional, raising challenges for data-analysis: How can one accurately estimate neural receptive fields using only a few spikes, and obtain quantitative uncertainty-estimates about tuning properties (such as location and preferred orientation)? For many sensory areas, there are canonical parametric models of receptive field shapes (e.g., Gabor functions for primary visual cortex) which can be used to constrain receptive fields; we will use such parametric models for receptive field estimation in low-data regimes using full Bayesian inference. We will focus on modelling simple cells in primary visual cortex, but our approach will be applicable more generally. We model the spike generation process using a generalized linear model (GLM), with a receptive field parameterized as a time-modulated Gabor. Use of the parametric model dramatically reduces the number of parameters, and allows us to directly estimate the posterior distribution over interpretable model parameters. We develop an efficient Markov Chain Monte Carlo procedure which is adapted to receptive field estimation from movie-data, by exploiting spatio-temporal separability of receptive fields. We show that the method successfully detects the presence or absence of a receptive field in simulated data even when the total number of spikes is low, and can correctly recover ground-truth parameters. When applied to electrophysiological recordings, it returns estimates of model parameters which are consistent across different subsets of the data.
In comparison with non-parametric methods based on Gaussian Processes, we find that it leads to better spike-prediction performance.

Computational and Systems Neuroscience Meeting (COSYNE 2017), (I-27), 2017

2016

Bassetto G, Macke JH

Full Bayesian inference for model-based receptive field estimation, with application to primary visual cortex

A central question in sensory neuroscience is to understand how sensory information is represented in neural activity. A crucial step towards the solution of this problem is the characterization of the neuron’s receptive field (RF), which provides a quantitative description of those features of a rich sensory stimulus that modulate the firing rate of the neuron. For visual neurons, the stimulus space can be very high dimensional, and RFs have to be estimated from neurophysiological recordings of limited size. The scarcity of data makes it paramount to have statistical methods which incorporate prior knowledge into the estimation process (Park & Pillow 2011), as well as to provide quantitative estimates of uncertainty about the inferred RFs (Stevenson et al 2011). For many sensory areas, there are canonical parametric models of RF shapes – e.g., Gabor functions for RFs in primary visual cortex (V1) (Jones & Palmer 1987). Bayesian methods provide a quantitative way of evaluating these models on empirical data by estimating the uncertainty of the inferred model parameters. We present a technique for full Bayesian inference of the parameters of parametric RF models, focusing on Gabor-shapes for V1. We model the spike generation process by means of a generalized linear model (GLM, Paninski 2004), whose linear filter (i.e., RF) is parameterized as a time-modulated Gabor-function. Use of this model dramatically reduces the number of parameters required to describe the RF, and allows us to directly estimate the posterior distribution over interpretable model parameters (e.g. location, orientation, etc.). The resulting model is non-linear in the parameters. We present an efficient Markov Chain Monte Carlo procedure for inferring the full posterior distribution over model parameters. 
We show that the method successfully detects the presence or absence of a RF in simulated data – even when the total number of spikes is very low – and can correctly recover ground-truth parameters. When applied to electrophysiological recordings, it returns estimates of model parameters which are consistent across different subsets of the data. Our current implementation is focused on the response of simple cells in V1, but the approach can readily be extended to other sensory areas or non-linear models of complex cells.

Bernstein Conference 2016, (W 91), 2016
URL | DOI

Nonnenmacher M, Buesing L, Speiser A, Turaga SC, Macke JH

Stitching neural activity in space and time: theory and practice

Simultaneous recordings of the activity of large neural populations are extremely valuable as they can be used to infer the dynamics and interactions of neurons in a local circuit, shedding light on the computations performed. It is now possible to measure the activity of hundreds of neurons using in-vivo 2-photon calcium imaging. However, this experimental technique imposes a trade-off between the number of neurons which can be simultaneously recorded, and the temporal resolution at which the activity of those neurons can be sampled. Previous work (Turaga et al 2012, Bishop & Yu 2014) has shown that statistical models can be used to ameliorate this trade-off, by “stitching” neural activity from sub-populations of neurons which have been imaged sequentially with overlap, rather than simultaneously. This makes it possible to estimate correlations even between non-simultaneously recorded neurons. In this work, we make two contributions: First, we show how taking into account correlations in the dynamics of neural activity gives rise to more general conditions under which stitching can be achieved, extending the work of (Bishop & Yu 2014). Second, we extend this framework to stitch activity both in space and time, i.e. from multiple sub-populations which might be imaged at different temporal rates. We use low-dimensional linear latent dynamical systems (LDS) to model neural population activity, and present scalable algorithms to estimate the parameters of a globally accurate LDS model from incomplete measurements. Using simulated data, we show that this approach can provide more accurate estimates of neural correlations than conventional approaches, and gives insights into the underlying neural dynamics.

Computational and Systems Neuroscience Meeting (COSYNE 2016), (III-100), 2016
URL

2015

Nonnenmacher M, Behrens C, Berens P, Bethge M, Macke JH

Correlations and signatures of criticality in neural population models

Large-scale recording methods make it possible to measure the statistics of neural population activity, and thereby to gain insights into the principles that govern the collective activity of neural ensembles. One hypothesis that has emerged from this approach is that neural populations are poised at a ‘thermo-dynamic critical point’, and that this has important functional consequences (Tkacik et al 2014). Support for this hypothesis has come from studies that computed the specific heat, a measure of global population statistics, for groups of neurons subsampled from population recordings. These studies have found two effects which, in physical systems, indicate a critical point: First, specific heat diverges with population size N. Second, when manipulating population statistics by introducing a ‘temperature’ in analogy to statistical mechanics, the maximum heat moves towards unit-temperature for large populations. What mechanisms can explain these observations? We show that both effects arise in a simple simulation of retinal population activity. They robustly appear across a range of parameters including biologically implausible ones, and can be understood analytically in simple models. The specific heat grows with N whenever the (average) correlation is independent of N, which is always true when uniformly subsampling a large, correlated population. For weakly correlated populations, the rate of divergence of the specific heat is proportional to the correlation strength. Thus, if retinal population codes were optimized to maximize specific heat, then this would predict that they seek to increase correlations. This is incongruent with theories of efficient coding that make the opposite prediction. We find criticality in a simple and parsimonious model of retinal processing, and without the need for fine-tuning or adaptation. 
This suggests that signatures of criticality might not require an optimized coding strategy, but rather arise as a consequence of sub-sampling a stimulus-driven neural population (Aitchison et al 2014).

45th Annual Meeting of the Society for Neuroscience (Neuroscience 2015), (543.23), 2015
URL

Nonnenmacher M, Behrens C, Berens P, Bethge M, Macke JH

Correlations and signatures of criticality in neural population models

Bernstein Conference 2015, pp. 27-28, 2015
URL

Macke JH
Dissecting choice-probabilities in V2 neurons using serial dependence
COSYNE 2015 Workshops, 2015
URL

Nonnenmacher M, Behrens C, Berens P, Bethge M, Macke JH

Correlations and signatures of criticality in neural population models

Computational and Systems Neuroscience Meeting (COSYNE 2015), (III-79), 2015
URL

2014

Nienborg H, Macke JH

Using sequential dependencies in neural activity and behavior to dissect choice related activity in V2

During perceptual decisions the activity of sensory neurons co-varies with choice. Previous findings suggest that this partially reflects “bottom-up” and “top-down” effects. However, the quantitative contributions of these effects are unclear. To address this question, we take advantage of the observation that past choices influence current behavior (sequential dependencies). Here, we use data from two macaque monkeys performing a disparity discrimination task during simultaneous extracellular recordings of disparity selective V2 neurons. We quantify the sequential dependencies using generalized linear models to predict choices or spiking activity of the V2 neurons. We find that past choices predict current choices substantially better than the spike counts on the current trial, i.e. have a higher “choice probability”. In addition, we observe that past choices have a significant predictive effect on the activity of sensory neurons on the current trial. This effect results from sequential dependencies of choices and neural activity alone, but also reflects a direct influence of past choices on the spike count on the current trial. We then use these sequential dependencies to dissect the neuronal co-variation with choice: We decomposed the choice co-variation of neural spike counts into components, which can be explained by behavior or neural activity on previous trials. We find that about 30% of the observed co-variation is already explained by the animals’ previous choice, suggesting a “top-down” contribution of at least 30%. Additionally, our results exemplify how variability frequently regarded as noise reflects the systematic effect of ignored neural and behavioral co-variates, and that interpretation of co-variations between neural activity and observed behavior should take the temporal context within the experiment into account.

44th Annual Meeting of the Society for Neuroscience (Neuroscience 2014), 44, p. 435.08, 2014
URL

Archer E, Pillow J, Macke JH

Low Dimensional Dynamical Models of Neural Populations with Common Input

Modern experimental technologies enable simultaneous recording of large neural populations. These high-dimensional data present a challenge for analysis. Recent work has focused on extracting low-dimensional dynamical trajectories that may underlie such responses. Such methods enable visualization and may also provide insight into neural computations. Previous work focuses on modeling a population’s dynamics without conditioning on external stimuli. Our proposed technique integrates linear dimensionality reduction with a latent dynamical system model of neural activity. Under our model, population response is governed by a low-dimensional dynamical system with quadratic input. In this framework the number of parameters grows linearly with population size (given fixed latent dimensionality). Hence it is computationally fast for large populations, unlike fully-connected models. Our method captures both noise correlations and low-dimensional stimulus selectivity through the simultaneous modeling of dynamics and stimulus dependence. This approach is particularly well-suited for studying the population activity of sensory cortices, where neurons often have substantial receptive field overlap.

15th Conference of Junior Neuroscientists of Tübingen (NeNa 2014), 15, p. 22, 2014
URL

Nienborg H, Macke JH

Using sequential dependencies in neural activity and behavior to dissect choice related activity in V2

Bernstein Conference 2014, pp. 73-74, 2014
URL | DOI

Archer E, Pillow JW, Macke JH

Low-dimensional dynamical neural population models with shared stimulus drive

Modern experimental technologies enable simultaneous recording of large neural populations. These high-dimensional data present a challenge for analysis. Recent work has focused on extracting low-dimensional dynamical trajectories that may underlie such responses. These methods enable visualization and may also provide insight into neural computations. However, previous work focused on modeling a population’s dynamics without conditioning on external stimuli. We propose a new technique that integrates linear dimensionality reduction (analogous to the STA and STC) with a latent dynamical system model of neural activity. Under our model, the spike response of a neural population is governed by a low-dimensional dynamical system with quadratic input. In this framework, the number of parameters grows linearly with population size (given fixed latent dimensionality). Hence, it is computationally fast for large populations, unlike fully-connected models. Our method captures both noise correlations and low-dimensional stimulus selectivity through the simultaneous modeling of dynamics and stimulus dependence. This approach is particularly well-suited for studying the population activity of sensory cortices, where neurons often have substantial receptive field overlap.

Bernstein Conference 2014, pp. 72-73, 2014
URL | DOI

Schütt H, Harmeling S, Macke JH, Wichmann F

Pain-free Bayesian inference for psychometric functions

To estimate psychophysical performance, psychometric functions are usually modeled as sigmoidal functions, whose parameters are estimated by likelihood maximization. While this approach gives a point estimate, it ignores its reliability (its variance). This is in contrast to Bayesian methods, which in principle can determine the posterior of the parameters and thus the reliability of the estimates. However, using Bayesian methods in practice usually requires extensive expert knowledge, user interaction and computation time. Also, many methods, including Bayesian ones, are vulnerable to non-stationary observers (whose performance is not constant). Our work provides an efficient Bayesian analysis, which runs within seconds on a common office computer, requires little user-interaction and improves robustness against non-stationarity. A Matlab implementation of our method, called PSIGNFIT 4, is freely available online. We additionally provide methods to combine posteriors to test the difference between psychometric functions (such as between conditions), obtain posterior distributions for the average of a group, and other comparisons of practical interest. Our method uses numerical integration, allowing robust estimation of a beta-binomial model that is stable against non-stationarities. Comprehensive simulations to test the numerical and statistical correctness and robustness of our method are in progress, and initial results look very promising.

Perception, 43 (ECVP Abstract Supplement), p. 162, 2014
URL

Schütt H, Harmeling S, Macke J, Wichmann F

Pain-free Bayesian inference for psychometric functions

To estimate psychophysical performance, psychometric functions are usually modeled as sigmoidal functions, whose parameters are estimated by likelihood maximization. While this approach gives a point estimate, it ignores its reliability (its variance). This is in contrast to Bayesian methods, which in principle can determine the posterior of the parameters and thus the reliability of the estimates. However, using Bayesian methods in practice usually requires extensive expert knowledge, user interaction and computation time. Also, many methods, including Bayesian ones, are vulnerable to non-stationary observers (whose performance is not constant). Our work provides an efficient Bayesian analysis, which runs within seconds on a common office computer, requires little user-interaction and improves robustness against non-stationarity. A Matlab implementation of our method, called PSIGNFIT 4, is freely available online. We additionally provide methods to combine posteriors to test the difference between psychometric functions (such as between conditions), obtain posterior distributions for the average of a group, and other comparisons of practical interest. Our method uses numerical integration, allowing robust estimation of a beta-binomial model that is stable against non-stationarities. Comprehensive simulations to test the numerical and statistical correctness and robustness of our method are in progress, and initial results look very promising.

2014 European Mathematical Psychology Group Meeting (EMPG), pp. 38-39, 2014
URL

Turaga SC, Buesing L, Packer A, Dalgleish H, Pettit N, Hauser M, Macke JH

Predicting noise correlations for non-simultaneously measured neuron pairs

Computational and Systems Neuroscience Meeting (COSYNE 2014), 2014, p. 84, 2014
URL

Archer E, Pillow JW, Macke JH

Low-dimensional models of neural population recordings with complex stimulus selectivity

Modern experimental technologies such as multi-electrode arrays and 2-photon population calcium imaging make it possible to record the responses of large neural populations (up to 100s of neurons) simultaneously. These high-dimensional data pose a significant challenge for analysis. Recent work has focused on extracting low-dimensional dynamical trajectories that may underlie such responses. These methods enable visualization of high-dimensional neural activity, and may also provide insight into the function of underlying circuitry. Previous work, however, has primarily focused on models of a population’s intrinsic dynamics, without taking into account any external stimulus drive. We propose a new technique that integrates linear dimensionality reduction of stimulus-response functions (analogous to spike-triggered average and covariance analysis) with a latent dynamical system (LDS) model of neural activity. Under our model, the population response is governed by a low-dimensional dynamical system with nonlinear (quadratic) stimulus-dependent input. Parameters of the model can be learned by combining standard expectation maximization for linear dynamical system models with recently proposed algorithms for learning quadratic feature selectivity. Unlike models with all-to-all connectivity, this framework scales well to large populations since, given fixed latent dimensionality, the number of parameters grows linearly with population size. Simultaneous modeling of dynamics and stimulus dependence allows our method to model correlations in response variability while also uncovering low-dimensional stimulus selectivity that is shared across a population.
Because stimulus selectivity and noise correlations both arise from coupling to the underlying dynamical system, it is particularly well-suited for studying the neural population activity of sensory cortices, where stimulus inputs received by different neurons are likely to be mediated by local circuitry, giving rise to both shared dynamics and substantial receptive field overlap.

Computational and Systems Neuroscience Meeting (COSYNE 2014), 2014, p. 162, 2014
URL

2013

Turaga SC, Buesing L, Packer M, Hausser M, Macke JH

Inferring interactions between cell types from multiple calcium imaging snapshots of the same neural circuit

Understanding the functional connectivity between different cortical cell types and the resulting population dynamics is a challenging and important problem. Progress with in-vivo 2-photon population calcium imaging has made it possible to densely sample neural activity in superficial layers of a local patch of cortex. In principle, such data can be used to infer the functional (statistical) connectivity between different classes of cortical neurons by fitting models such as generalized linear models or latent dynamical systems (LDS). However, this approach faces 3 major challenges which we address: 1) only small populations of neurons (~200) can currently be simultaneously imaged at any given time; 2) the cell types of individual neurons are often unknown; and 3) it is unclear how to pool data across different animals to derive an average model. First, while it is not possible to simultaneously image all neurons in a cortical column, it is currently possible to image the activity of ~200 neurons at a time and to repeat this procedure at multiple cortical depths (down to layer 3). We present a computational method ("Stitching LDS") which allows us to "stitch" such non-simultaneously imaged populations of neurons into one large virtual population spanning different depths of cortex. Importantly - and surprisingly - this approach allows us to predict couplings and noise correlations even for pairs of neurons that were never imaged simultaneously. Second, we automatically cluster neurons based on similarities in their functional connectivity (“Clustering LDS”). Under the assumption that such functionally defined clusters can correspond to cell types, this enables us to infer both the cell types and their functional connectivity. Third, while connection profiles of individual cells in one class can be variable, we expect the ‘average’ influence of one cell class on another to be fairly consistent across animals. 
We show how our approach can be used to pool measurements across different animals in a principled manner (“Pooling LDS”). The result is a highly accurate average model of the interactions between different cell classes. We demonstrate the utility of our computational tools by applying them to model the superficial layers of barrel cortex based on in-vivo 2-photon imaging data in awake mice.

43rd Annual Meeting of the Society for Neuroscience (Neuroscience 2013), 43 (743.27), 2013
URL

Macke JH

B8: Statistical Modelling of Psychophysical Data

In this tutorial, we will discuss some statistical techniques that one can use in order to obtain a more accurate statistical model of the relationship between experimental variables and psychophysical performance. We will use models which include the effect of additional, non-stimulus determinants of behaviour, and which therefore give us additional flexibility in analysing psychophysical data. For example, these models will allow us to estimate the effect of experimental history on the responses of an observer, and to automatically correct for errors which can be attributed to such history-effects. By reanalysing a large data-set of low-level psychophysical data, we will show that the resulting models have vastly superior statistical goodness of fit, give more accurate estimates of psychophysical functions and allow us to detect and capture interesting temporal structure in psychophysical data. In summary, the approach presented in this tutorial not only yields more accurate models of the data, but also has the potential to reveal unexpected structure in the kind of data that every visual scientist has in abundance: classical psychophysical data with binary responses.

Perception, 42 (ECVP Abstract Supplement), p. 4, 2013
URL | DOI

Buesing L, Macke JH, Sahani M
Robust estimation for neural state-space models
Computational and Systems Neuroscience Meeting (COSYNE 2013), (II-89), 2013
URL

Macke JH, Murray I, Latham P
How biased are maximum entropy models of neural population activity?
Computational and Systems Neuroscience Meeting (COSYNE 2013), (III-89), 2013
URL

2012

Macke JH, Büsing L, Cunningham JP, Yu BM, Shenoy KV, Sahani M
Empirical models of spiking in neural populations
Janelia Farm Conference 2012: Machine Learning, Statistical Inference, and Neuroscience, 2012

2011

Haefner RM, Gerwinn S, Macke JH, Bethge M

Relationship between decoding strategy, choice probabilities and neural correlations in perceptual decision-making task

When monkeys make a perceptual decision about ambiguous visual stimuli, individual sensory neurons in MT and other areas have been shown to covary with the decision. This observation suggests that the response variability in those very neurons causes the animal to choose one option over the other. However, the fact that sensory neurons are correlated has greatly complicated attempts to link those covariances (and the associated choice probabilities) to a direct involvement of any particular neuron in a decision-making task. Here we report on an analytical treatment of choice probabilities in a population of correlated sensory neurons read out by a linear decoder. We present a closed-form solution that links choice probabilities, noise correlations and decoding weights for the case of fixed integration time. This allowed us to analytically prove and generalize a prior numerical finding about the choice probabilities being only due to the difference between the correlations within and between decision pools (Nienborg & Cumming 2010) and derive simplified expressions for a range of interesting cases. We investigated the implications for plausible correlation structures like pool-based and limited-range correlations. We found that the relationship between choice probabilities and decoding weights is in general non-monotonic and highly sensitive to the underlying correlation structure. In fact, given empirical measures of the interneuronal correlations and CPs, our formulas make it possible to infer the individual neuronal decoding weights. We confirmed the feasibility of this approach using synthetic data. We then applied our analytical results to a published dataset of empirical noise correlations and choice probabilities (Cohen & Newsome 2008 and 2009) recorded during a classic motion discrimination task (Britten et al 1992).
We found that the data are compatible with an optimal read-out scheme in which the responses of neurons with the correct direction preference are summed and those with perpendicular preference, but positively correlated noise, are subtracted. While the correlation data of Cohen & Newsome (being based on individual extracellular electrode recordings) do not give access to the full covariance structure of a neural population, our analytical formulas will make it possible to accurately infer individual read-out weights from simultaneous population recordings.

41st Annual Meeting of the Society for Neuroscience (Neuroscience 2011), 41 (17.09), 2011
URL

Macke JH, Opper M, Bethge M

The effect of common input on higher-order correlations and entropy in neural populations

Finding models for capturing the statistical structure of multi-neuron firing patterns is a major challenge in sensory neuroscience. Recently, Maximum Entropy (MaxEnt) models have become popular tools for studying neural population recordings [4, 3]. These studies have found that small populations in retinal, but not in local cortical circuits, are well described by models based on pairwise correlations. It has also been found that entropy in small populations grows sublinearly [4], that sparsity in the population code is related to correlations [3], and it has been conjectured that neural populations might be at a ‘critical point’. While there have been many empirical studies using MaxEnt models, there has arguably been a lack of analytical studies that might explain the diversity of their findings. In particular, theoretical models would be of great importance for investigating their implications for large populations. Here, we study these questions in a simple, tractable population model of neurons receiving Gaussian inputs [1, 2]. Although the Gaussian input has maximal entropy, the spiking-nonlinearities yield non-trivial higher-order correlations (‘hocs’). We find that the magnitude of hocs is strongly modulated by pairwise correlations, in a manner which is consistent with neural recordings. In addition, we show that the entropy in this model grows sublinearly for small, but linearly for large populations. We characterize how the magnitude of hocs grows with population size. Finally, we find that the hocs in this model lead to a diverging specific heat, and therefore, that any such model appears to be at a critical point. We conclude that common input might provide a mechanistic explanation for a wide range of recent empirical observations. [1] SI Amari, H Nakahara, S Wu, Y Sakai. Neural Comput, 2003. [2] JH Macke, M Opper, M Bethge. ArXiv, 2010. [3] IE Ohiorhenuan, et al. Nature, 2010. [4] E Schneidman, MJ Berry, R Segev, W Bialek. Nature, 2006.

Computational and Systems Neuroscience Meeting (COSYNE 2011), (III-68), 2011
URL

Macke JH, Büsing L, Cunningham JP, Yu BM, Shenoy KV, Sahani M

Modelling low-dimensional dynamics in recorded spiking populations

Neural population activity reflects not only variations in stimulus drive (captured by many neural encoding models) but also the rich computational dynamics of recurrent neural circuitry. Identifying this dynamical structure, and relating it to external stimuli and behavioural events, is a crucial step towards understanding neural computation. One data-driven approach is to fit hidden low-dimensional dynamical systems models to the high-dimensional spiking observations collected by microelectrode arrays (Yu et al, 2006, 2009). This approach yields low-dimensional representations of population-activity, allowing analysis and visualization of population dynamics with single trial resolution. Here, we compare two models using latent linear dynamics, with the dependence of spiking observations on the dynamical state being either linear with Gaussian observations (GaussLDS), or generalised linear with Poisson observations and an exponential nonlinearity (PoissonLDS) (Kulkarni & Paninski, 2007). Both models were fit by Expectation-Maximisation to multi-electrode recordings from pre-motor cortex in behaving monkeys during the delay-period of a delayed reach task. We evaluated the accuracy of different approximations for the E-step necessary for PoissonLDS using elliptical slice sampling. We quantified model-performance using a cross-prediction approach (Yu et al). Although only the Poisson noise model takes the discrete nature of spiking into account, we found no consistent improvement of the Poisson-model over GaussLDS: PoissonLDS was generally more accurate for low dimensions, but slightly under-performed GaussLDS in higher dimensions (cf. Lawhern et al. 2010). We also examined the ability of such models to capture conventional population metrics such as pairwise correlations and the distribution of synchronous spike counts.
We found that both models were able to reproduce these quantities with very low dynamical dimension, although the non-positivity of the Gaussian model introduced a bias. Thus, despite its verisimilitude, the Poisson observation model does not always yield more accurate predictions in real data.

Computational and Systems Neuroscience Meeting (COSYNE 2011), (I-34), 2011
URL

2010

Macke JH, Sebastian G, White LE, Kaschube M, Bethge M

Estimating cortical maps with Gaussian process models

A striking feature of cortical organization is that the encoding of many stimulus features, such as orientation preference, is arranged into topographic maps. The structure of these maps has been extensively studied using functional imaging methods, for example optical imaging of intrinsic signals, voltage sensitive dye imaging or functional magnetic resonance imaging. As functional imaging measurements are usually noisy, statistical processing of the data is necessary to extract maps from the imaging data. We here present a probabilistic model of functional imaging data based on Gaussian processes. In comparison to conventional approaches, our model yields superior estimates of cortical maps from smaller amounts of data. In addition, we obtain quantitative uncertainty estimates, i.e. error bars on properties of the estimated map. We use our probabilistic model to study the coding properties of the map and the role of noise correlations by decoding the stimulus from single trials of an imaging experiment. In addition, we show how our method can be used to reconstruct maps from sparse measurements, for example multi-electrode recordings. We demonstrate our model both on simulated data and on intrinsic signaling data from ferret visual cortex.

40th Annual Meeting of the Society for Neuroscience (Neuroscience 2010), 40 (483.18), 2010
URL

Gerwinn S, Macke JH, Bethge M

Toolbox for inference in generalized linear models of spiking neurons

Generalized linear models are increasingly used for analyzing neural data, and to characterize the stimulus dependence and functional connectivity of both single neurons and neural populations. One possibility to extend the computational complexity of these models is to expand the stimulus, and possibly the representation of the spiking history, into high-dimensional feature spaces. When the dimension of the parameter space is large, strong regularization has to be used in order to fit GLMs to datasets of realistic size without overfitting. By imposing properly chosen priors over parameters, Bayesian inference provides an effective and principled approach for achieving regularization. In this work, we present a MATLAB toolbox which provides efficient inference methods for parameter fitting. This includes standard maximum a posteriori estimation for Gaussian and Laplacian priors, which is also sometimes referred to as L2- and L1-regularization, respectively. Furthermore, it implements approximate inference techniques for both prior distributions based on the expectation propagation algorithm [1]. In order to model the refractory property and functional couplings between neurons, the spiking history within a population is often represented as responses to a set of predefined basis functions. Most of the basis function sets used so far are non-orthogonal. Commonly, priors are specified without taking the properties of the basis functions into account (uncorrelated Gauss, independent Laplace). However, if basis functions overlap, the coefficients are correlated. As an example application of this toolbox, we analyze the effect of independent prior distributions if the set of basis functions is non-orthogonal and compare the performance to the orthogonal setting.

Frontiers in Computational Neuroscience, 2010 (Conference Abstract: Bernstein Conference on Computational Neuroscience), 2010
URL | DOI

Haefner R, Gerwinn S, Macke J, Bethge M

Implications of correlated neuronal noise in decision making circuits for physiology and behavior

Understanding how the activity of sensory neurons contributes to perceptual decision making is one of the major questions in neuroscience. In the current standard model, the output of opposing pools of noisy, correlated sensory neurons is integrated by downstream neurons whose activity elicits a decision-dependent behavior [1][2]. The predictions of the standard model for empirical measurements like choice probability (CP), psychophysical kernel (PK) and reaction time distribution crucially depend on the spatial and temporal correlations within the pools of sensory neurons. This dependency has so far only been investigated numerically and for time-invariant correlations and variances. However, it has recently been shown that the noise variance undergoes significant changes over the course of the stimulus presentation [3]. The same is true for inter-neuronal correlations that have been shown to change with task and attentional state [4][5]. In the first part of our work we compute analytically the time course of CPs and PKs in the presence of arbitrary noise correlations and variances for the case of non-leaky integration and Gaussian noise. This allows general insights and is especially needed in the light of the experimental transition from single-cell to multi-cell recordings. Then we simulate the implications of realistic noise in several variants of the standard model (leaky and non-leaky integration, integration over the entire stimulus presentation or until a bound, with and without urgency signal) and compare them to physiological data. We find that in the case of non-leaky integration over the entire stimulus duration, the PK only depends on the overall level of noise variance, not its time course. That means that the PK remains constant regardless of the temporal changes in the noise.
This finding supports an earlier conclusion that an observed decreasing PK suggests that the brain is not integrating over the entire stimulus duration but only until it has accumulated sufficient evidence, even in the case of no urgency [6]. The time course of the CP, on the other hand, strongly depends on the time course of the noise variances and on the temporal and interneuronal correlations. If noise variance or interneuronal correlation increases, CPs increase as well. This dissociation of PK and CP allows an alternative solution to the puzzle recently posed by [7] in a bottom-up framework by combining integration to a bound with an increase in noise variance/correlation. In addition, we derive how the distribution of reaction times depends on noise variance and correlation, further constraining the model using empirical observations.

Frontiers in Neuroscience, Conference Abstract: Computational and Systems Neuroscience 2010, 2010
URL | DOI

2009

Häfner R, Gerwinn S, Macke JH, Bethge M

Neuronal decision-making with realistic spiking models

The neuronal processes underlying perceptual decision-making have been the focus of numerous studies over the past two decades. In the current standard model [1][2][3] the output of noisy sensory neurons is pooled and integrated by decision neurons. Once the activity of the decision neurons reaches a threshold, the corresponding choice is made. This bottom-up model was recently challenged based on the empirical finding that the time courses of psychophysical kernel (PK) and choice probability (CP) qualitatively differ from each other [4]. It was concluded that the decision-related activity in sensory neurons, at least in part, reflects the decision through a top-down signal, rather than contributing to the decision causally. However, the prediction of the standard bottom-up model about the relationship between the time courses of PKs and CPs crucially depends on the underlying noise model. Our study explores the impact of the time course and correlation structure of neuronal noise on PK and CP for several decision models. For the case of non-leaky integration over the entire stimulus duration, we derive analytical expressions for Gaussian additive noise with arbitrary correlation structure. For comparison, we also investigate biophysically generated responses with a Fano factor that increases with the counting window [5], and alternative decision models (leaky, integration to bound) using numerical simulations. In the case of non-leaky integration over the entire stimulus duration we find that the amplitude of the PK only depends on the overall level of noise, but not its temporal changes. Consequently the PK remains constant regardless of the temporal evolution or correlation structure in the noise. In conjunction with the observed decrease in the amplitude of the PK (e.g. [4]) this supports the conclusion that decreasing PKs are evidence for an integration-to-bound model [1][3].
However, we find that the temporal evolution of the CP depends strongly on both the time course of the noise variance and the temporal correlations within the pool of sensory neurons. For instance, a noise variance that increases over time also leads to an increasing CP. The bottom-up account that appears to agree best with the data in [4] combines an increasing variance of the correlated noise (the noise that cannot be eliminated by averaging over many neurons) with an integration-to-bound decision model. This leads to a decreasing PK, as well as a CP that first increases slowly before leveling off and persisting until the end. We do not find qualitatively different results when using biophysically generated or Poisson distributed responses instead of additive Gaussian noise. In summary, we advance the analytical framework for a quantitative comparison of choice probabilities and psychophysical kernels and find that recent data that was taken to be evidence of a top-down component in choice probabilities may alternatively be accounted for by a bottom-up model when allowing for time-varying correlated noise.

Frontiers in Computational Neuroscience, 2009 (Conference Abstract: Bernstein Conference on Computational Neuroscience), pp. 132-133, 2009
URL | DOI

Macke JH, Wichmann FA

Estimating Critical Stimulus Features from Psychophysical Data: The Decision-Image Technique Applied to Human Faces

One of the main challenges in the sensory sciences is to identify the stimulus features on which the sensory systems base their computations: they are a pre-requisite for computational models of perception. We describe a technique (decision-images) for extracting critical stimulus features based on logistic regression. Rather than embedding the stimuli in noise, as is done in classification image analysis, we want to infer the important features directly from physically heterogeneous stimuli. A decision-image not only defines the critical region-of-interest within a stimulus but is a quantitative template which defines a direction in stimulus space. Decision-images thus enable the development of predictive models, as well as the generation of optimized stimuli for subsequent psychophysical investigations. Here we describe our method and apply it to data from a human face discrimination experiment. We show that decision-images are able to predict human responses not only in terms of overall percent correct but are able to predict, for individual observers, the probabilities with which individual faces are (mis-)classified. We then test the predictions of the models using optimized stimuli. Finally, we discuss possible generalizations of the approach and its relationships with other models.

Journal of Vision, 9 (8), p. 31, 2009
URL | DOI

Berens P, Macke JH, Ecker AS, Cotton RJ, Bethge M, Tolias AS

Sensory input statistics and network mechanisms in primate primary visual cortex

Understanding the structure of multi-neuronal firing patterns in ensembles of cortical neurons is a major challenge for systems neuroscience. The dependence of network properties on the statistics of the sensory input can provide important insights into the computations performed by neural ensembles. Here, we study the functional properties of neural populations in the primary visual cortex of awake, behaving macaques by varying visual input statistics in a controlled way. Using arrays of chronically implanted tetrodes, we record simultaneously from up to thirty well-isolated neurons while presenting sets of images with three different correlation structures: spatially uncorrelated white noise (whn), images matching the second-order correlations of natural images (phs) and natural images including higher-order correlations (nat). We find that groups of six nearby cortical neurons show little redundancy in their firing patterns (represented as binary vectors, 10ms bins) but rather act almost independently (mean multi-information 0.85 bits/s, range 0.16 - 1.90 bits/s, mean fraction of marginal entropy 0.34 %, N=46). Although network correlations are weak, they are statistically significant. While relatively few groups showed significant redundancies under stimulation with white noise (67.4 ± 3.2%; mean fraction of groups ± S.E.M.), many more did so in the other two conditions (phs: 95.7 ± 0.6%; nat: 89.1 ± 1.4%). Additional higher-order correlations in natural images compared to phase-scrambled images did not increase but rather decreased the redundancy in the cortical representation: Network correlations are significantly higher in phs than in nat, as is the number of significantly correlated groups. Multi-information measures the reduction in entropy due to any form of correlation.
By using second-order maximum entropy modeling, we find that a large fraction of multi-information is accounted for by pairwise correlations (whn: 75.0 ± 3.3%; phs: 82.8 ± 2.1%; nat: 80.8 ± 2.4%; groups with significant redundancy). Importantly, stimulation with natural images containing higher-order correlations only led to a slight increase in the fraction of redundancy due to higher-order correlations in the cortical representation (mean difference 2.26 %, p=0.054, Sign test). While our results suggest that population activity in V1 may be modeled well using pairwise correlations only, they leave roughly 20-25 % of the multi-information unexplained. Therefore, choosing a particular form of higher-order interactions may improve model quality. Thus, in addition to the independent model, we evaluated the quality of three different models: (a) The second-order maximum entropy model, which minimizes higher-order correlations, (b) a model which assumes that correlations are a product of common inputs (Dichotomized Gaussian) and (c) a mixture model in which correlations are induced by a discrete number of latent states. We find that an independent model is sufficient for the white noise condition but for neither phs nor nat. In contrast, all of the correlation models (a-c) perform similarly well for the conditions with correlated stimuli. Our results suggest that under natural stimulation redundancies in cortical neurons are relatively weak. Higher-order correlations in natural images do not increase but rather decrease the redundancies in the cortical representation.

Frontiers in Systems Neuroscience, 2009 (Conference Abstracts: Computational and Systems Neuroscience), 2009
URL | DOI

Gerwinn S, Macke J, Bethge M
Bayesian Population Decoding of Spiking Neurons
Frontiers in Systems Neuroscience, 2009 (Conference Abstracts: Computational and Systems Neuroscience), 2009
URL | DOI

Macke J, Gerwinn S, White L, Kaschube M, Bethge M

Bayesian estimation of orientation preference maps

Neurons in the early visual cortex of mammals exhibit a striking organization with respect to their functional properties. A prominent example is the layout of orientation preferences in primary visual cortex, the orientation preference map (OPM). Functional imaging techniques, such as optical imaging of intrinsic signals, have been used extensively for the measurement of OPMs. As the signal-to-noise ratio in individual pixels is often low, the signals are usually spatially smoothed with a fixed linear filter to obtain an estimate of the functional map. Here, we consider the estimation of the map from noisy measurements as a Bayesian inference problem. By combining prior knowledge about the structure of OPMs with experimental measurements, we want to obtain better estimates of the OPM with smaller trial numbers. In addition, the use of an explicit, probabilistic model for the data provides a principled framework for setting parameters and smoothing. We model the underlying map as a bivariate Gaussian process (GP, a.k.a. Gaussian random field), with a prior covariance function that reflects known properties of OPMs. The posterior mean of the map can be interpreted as an optimally smoothed map. Hyper-parameters of the model can be chosen by optimization of the marginal likelihood. In addition, the GP also returns a predicted map for any location, and can therefore be used for extending the map to pixels at which no, or only unreliable, data were obtained. We also obtain a posterior distribution over maps, from which we can estimate the posterior uncertainty of statistical properties of the maps, such as the pinwheel density. Finally, our probabilistic model of both the signal and the noise can be used for decoding, and for estimating the informational content of the map.

Frontiers in Systems Neuroscience, 2009 (Conference Abstracts: Computational and Systems Neuroscience), 2009
URL | DOI

2008

Ku S-P, Gretton A, Macke J, Logothetis NK

Pattern recognition methods in classifying fMRI data

Pattern recognition methods have shown that fMRI data can reveal significant information about brain activity. For example, in the debate of how object-categories are represented in the brain, multivariate analysis has been used to provide evidence of a distributed encoding scheme. Many follow-up studies have employed different methods to analyze human fMRI data with varying degrees of success. In this presentation I would like to discuss and compare four popular pattern recognition methods: correlation analysis, support-vector machines (SVM), linear discriminant analysis and Gaussian naive Bayes (GNB), using data collected at high field (7T) with higher resolution than usual fMRI studies. We investigate prediction performance on single trials and for averages across varying numbers of stimulus presentations. The performance of the various algorithms depends on the nature of the brain activity being categorized: for several tasks, many of the methods work well, whereas for others, no methods perform above chance level. An important factor in overall classification performance is careful preprocessing of the data, including dimensionality reduction, voxel selection, and outlier elimination.

9th Conference of the Junior Neuroscientists of Tübingen (NeNa 2008), 9 (11), 2008
URL

Macke JH, Opper M, Bethge M
How pairwise correlations affect the redundancy in large populations of neurons
Frontiers in Computational Neuroscience, 2008 (Conference Abstract: Bernstein Symposium 2008), 2008
URL | DOI

Macke J, Opper M, Bethge M

How pairwise correlations affect the redundancy in large populations of neurons

Simultaneously recorded neurons often exhibit correlations in their spiking activity. These correlations shape the statistical structure of the population activity, and can lead to substantial redundancy across neurons. Knowing the amount of redundancy in neural responses is critical for our understanding of the neural code. Here, we study the effect of pairwise correlations on the statistical structure of population activity. We model correlated activity as arising from common Gaussian inputs into simple threshold neurons. In population models with exchangeable correlation structure, one can analytically calculate the distribution of synchronous events across the whole population, and the joint entropy (and thus the redundancy) of the neural responses. We investigate the scaling of the redundancy as the population size is increased, and characterize its phase transitions for increasing correlation strengths. We compare the asymptotic redundancy in our models to the corresponding maximum- and minimum entropy models. Although this model must exhibit more redundancy than the maximum entropy model, we find that its joint entropy increases linearly with population size.

Frontiers in Computational Neuroscience, 2008 (Conference Abstract: Bernstein Symposium 2008), 2008
URL | DOI

Macke JH, Berens P, Ecker AS, Opper M, Tolias AS, Bethge M
Modeling populations of spiking neurons with the Dichotomized Gaussian distribution
Annual Meeting 2008 of Sloan-Swartz Centers for Theoretical Neurobiology, 2008
URL

Macke JH, Schwartz G, Berry M
The role of stimulus correlations for population decoding in the retina
AREADNE 2008: Research in Encoding and Decoding of Neural Ensembles, p. 73, 2008
URL

Berens P, Ecker AS, Subramaniyan M, Macke JH, Hauck P, Bethge M, Tolias AS

Pairwise Correlations and Multineuronal Firing Patterns in the Primary Visual Cortex of the Awake, Behaving Macaque

Understanding the structure of multi-neuronal firing patterns has been a central quest and major challenge for systems neuroscience. In particular, how do pairwise interactions between neurons shape the firing patterns of neuronal ensembles in the cortex? To study this question, we recorded simultaneously from multiple single neurons in the primary visual cortex of an awake, behaving macaque using an array of chronically implanted tetrodes [1]. High contrast flashed and moving bars were used for stimulation, while the monkey was required to maintain fixation. In a similar vein to recent studies of in vitro preparations [2,3,5], we applied maximum entropy analysis for the first time to the binary spiking patterns of populations of cortical neurons recorded in vivo from the awake macaque. We employed the Dichotomized Gaussian distribution, which can be seen as a close approximation to the pairwise maximum-entropy model for binary data [4]. Surprisingly, we find that even pairs of neurons with nearby receptive fields (receptive field center distance < 0.15°) have only weak correlations between their binary responses computed in bins of 10 ms (median absolute correlation coefficient: 0.014, 0.010-0.019, 95% confidence intervals, N=95 pairs; positive correlations: 0.015, N=59; negative correlations: -0.013, N=36). Accordingly, the distribution of spiking patterns of groups of 10 neurons is described well with a model that assumes independence between individual neurons (Jensen-Shannon-Divergence: 1.06×10⁻² independent model, 0.96×10⁻² approximate second-order maximum-entropy model [4]; H/H1=0.992). These results suggest that the distribution of firing patterns of small cortical networks in the awake animal is predominantly determined by the mean activity of the participating cells, not by their interactions. Meaningful computations, however, are performed by neuronal populations much larger than 10 neurons.
Therefore, we investigated how weak pairwise correlations affect the firing patterns of artificial populations [4] of up to 1000 cells with the same correlation structure as experimentally measured. We find that in neuronal ensembles of this size firing patterns with many active or silent neurons occur considerably more often than expected from a fully independent population (e.g. 130 or more out of 1000 neurons are active simultaneously roughly every 300 ms in the correlated model and only once every 3-4 seconds in the independent model). These results suggest that the firing patterns of cortical networks comparable in size to several minicolumns exhibit a rich structure, even if most pairs appear relatively independent when studying small subgroups thereof.

AREADNE 2008: Research in Encoding and Decoding of Neural Ensembles, p. 46, 2008
URL

Bethge M, Macke JH, Berens P, Ecker AS, Tolias AS

Flexible Models for Population Spike Trains

In order to understand how neural systems perform computations and process sensory information, we need to understand the structure of firing patterns in large populations of neurons. Spike trains recorded from populations of neurons can exhibit substantial pairwise correlations between neurons and rich temporal structure. Thus, efficient methods for generating artificial spike trains with specified correlation structure are essential for the realistic simulation and analysis of neural systems. Here we show how correlated binary spike trains can be modeled by means of a latent multivariate Gaussian model. Sampling from our model is computationally very efficient, and in particular, feasible even for large populations of neurons. We show empirically that the spike trains generated with this method have entropy close to the theoretical maximum. They are therefore consistent with specified pairwise correlations without exhibiting systematic higher-order correlations. We compare our model to alternative approaches and discuss its limitations and advantages. In addition, we demonstrate its use for modeling temporal correlations in a neuron recorded in macaque primary visual cortex. Neural activity is often summarized by discarding the exact timing of spikes, and only counting the total number of spikes that a neuron (or population) fires in a given time window. In modeling studies, these spike counts have often been assumed to be Poisson distributed and neurons to be independent. However, correlations between spike counts have been reported in various visual areas. We show how both temporal and inter-neuron correlations shape the structure of spike counts, and how our model can be used to generate spike counts with arbitrary marginal distributions and correlation structure.
We demonstrate its capabilities by modeling a population of simultaneously recorded neurons from the primary visual cortex of a macaque, and we show how a model with correlations accounts for the data far better than a model that assumes independence.

AREADNE 2008: Research in Encoding and Decoding of Neural Ensembles, p. 48, 2008
URL

Ku S-P, Gretton A, Macke J, Tolias AT, Logothetis NK

Analysis of Pattern Recognition Methods in Classifying BOLD Signals in Monkeys at 7-Tesla

Pattern recognition methods have shown that fMRI data can reveal significant information about brain activity. For example, in the debate of how object-categories are represented in the brain, multivariate analysis has been used to provide evidence of distributed encoding schemes. Many follow-up studies have employed different methods to analyze human fMRI data with varying degrees of success. In this study we compare four popular pattern recognition methods: correlation analysis, support-vector machines (SVM), linear discriminant analysis and Gaussian naïve Bayes (GNB), using data collected at high field (7T) with higher resolution than usual fMRI studies. We investigate prediction performance on single trials and for averages across varying numbers of stimulus presentations. The performance of the various algorithms depends on the nature of the brain activity being categorized: for several tasks, many of the methods work well, whereas for others, no methods perform above chance level. An important factor in overall classification performance is careful preprocessing of the data, including dimensionality reduction, voxel selection, and outlier elimination.

AREADNE 2008: Research in Encoding and Decoding of Neural Ensembles, p. 67, 2008
URL

Schwartz G, Macke J, Berry M

The role of stimulus correlations for population decoding in the retina

a large number of retinal ganglion cells, one should be able to construct a decoding algorithm to discriminate different visual stimuli. Despite the inherent noise in the response of the ganglion cell population, everyday visual experience is highly deterministic. We have designed an experiment to study the nature of the population code of the retina in the "low error" regime. We presented 36 different black and white shapes, each with the same number of black pixels, to the retina of a tiger salamander while recording retinal ganglion cell responses using a multi-electrode array. Each shape was presented over 100 trials for 0.5 s each and trials were randomly interleaved. Spike trains were recorded from 162 ganglion cells in 13 experiments. We removed noise correlations by shuffling trials, as we wanted to focus on the role of correlations induced by the stimulus (signal correlations). We designed decoding algorithms for this population response in order to detect each target shape against the distracter set of the 35 other shapes. Binary response vectors were constructed using a 100 ms bin following the presentation of each shape. First, we used a simple decoder that assumes that all neurons are independent. This decoder is a linear classifier. A second decoder, which takes into account correlations between neurons, was constructed by fitting Ising models [1] to the population response using up to 162 neurons for each model. We also constructed the statistically optimal decoder based on a mixture model, which accounts for signal correlations. Using populations of many neurons, the optimal and Ising decoders performed considerably better than the "independent" decoder. For certain shapes, the optimal decoder had 100 times fewer false positives than the independent decoder at 99% hit rate, and, in the median across shapes, the performance enhancement was 8-fold.
While the decoder using an Ising model fit to the pairwise correlations did not achieve optimality, it was up to 50 times more accurate than the independent decoder, and 3 times more accurate in the median across shapes. Some shape discriminations were performed at zero error out of 3500 trials using the optimal and Ising decoders on only a subset of the recorded cells while none reached this "low error" level using the independent decoder even on all 162 cells (see figure). We find that discrimination with very low error using large populations requires a decoder that models signal correlations. Linear classifiers were unable to reach the "low error" regime. The Ising model of the population response is successfully applied to groups of up to 162 cells and offers a biologically feasible mechanism by which downstream neurons could account for correlations in their inputs.

Computational and Systems Neuroscience Meeting (COSYNE 2008), 5, p. 172, 2008
URL | PDF

2007

Macke JH, Zeck G, Bethge M

Estimating receptive fields without spike-triggering

The prevalent means of characterizing stimulus selectivity in sensory neurons is to estimate their receptive field properties such as orientation selectivity. Receptive fields are usually derived from the mean (or covariance) of the spike-triggered stimulus ensemble. This approach treats each spike as an independent message but ignores the possibility that information might be conveyed through patterns of neural activity that are distributed across space or time. In the retina for example, visual stimuli are analyzed by several parallel channels with different spatiotemporal filtering properties. How can we define the receptive field of a whole population of neurons, not just a single neuron? Imaging methods (such as voltage-sensitive dye imaging) yield measurements of neural activity that do not contain spiking events at all. How can receptive fields be derived from this kind of data? Even for single neurons, there is evidence that multiple features of the neural response, for example spike patterns or latencies, can carry information. How can these features be taken into account in the estimation process? Here, we address the question of how receptive fields can be calculated from such distributed representations. We seek to identify those stimulus features and the corresponding patterns of neural activity that are most reliably coupled, as measured by the mutual information between the two signals. As an efficient implementation of this strategy, we use an extension of reverse-correlation methods based on canonical correlation analysis [1]. We evaluate our approach using both simulated data and multi-electrode recordings from rabbit retinal ganglion cells [2]. In addition, we show how the model can be extended to capture nonlinear stimulus-response relationships and to test different coding mechanisms using kernel canonical correlation analysis [3].

37th Annual Meeting of the Society for Neuroscience (Neuroscience 2007), 37 (768.1), 2007
URL

Franz MO, Macke JH, Saleem A, Schultz SR

Implicit Wiener Series for Estimating Nonlinear Receptive Fields

The representation of the nonlinear response properties of a neuron by a Wiener series expansion has enjoyed a certain popularity in the past, but its application has been limited to rather low-dimensional and weakly nonlinear systems due to the exponential growth of the number of terms that have to be estimated. A recently developed estimation method [1] utilizes the kernel techniques widely used in the machine learning community to implicitly represent the Wiener series as an element of an abstract dot product space. In contrast to the classical estimation methods for the Wiener series, the estimation complexity of the implicit representation is linear in the input dimensionality and independent of the degree of nonlinearity. From the neural system identification point of view, the proposed estimation method has several advantages: 1. Due to the linear dependence of the estimation complexity on input dimensionality, system identification can also be performed for systems acting on high-dimensional inputs such as images or video sequences. 2. Compared to classical cross-correlation techniques (such as spike-triggered average or covariance estimates), similar accuracies can be achieved with a considerably smaller amount of data. 3. The new technique does not need white noise as input, but works for arbitrary classes of input signals such as natural image patches. 4. Regularisation concepts from machine learning can be used to identify systems with noise-contaminated output signals. We present an application of the implicit Wiener series to find the low-dimensional stimulus subspace which accounts for most of the neuron's activity. We approximate the second-order term of a full Wiener series model with a set of parallel cascades consisting of a linear receptive field and a static nonlinearity. This type of approximation is known as the reduced set technique in machine learning. We compare our results on simulated and physiological datasets to existing identification techniques in terms of prediction performance and accuracy of the obtained subspaces.

Neuroforum, 13 (Supplement), p. 1199, 2007
URL
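The idea of estimating a Wiener-type expansion implicitly through a kernel can be illustrated with a small sketch. It assumes that an inhomogeneous polynomial kernel of degree 2 combined with ridge regression is an acceptable stand-in for the estimator of [1]; the function names, the simulated system, and all parameter values are hypothetical and are not taken from the abstract.

```python
import numpy as np

def poly_kernel(A, B, degree=2):
    # Inhomogeneous polynomial kernel. One evaluation costs O(input dimension),
    # whatever the degree, because the monomial terms are never formed explicitly.
    return (1.0 + A @ B.T) ** degree

def fit_kernel_ridge(X, y, degree=2, lam=1e-3):
    # Regularised least squares in the implicit feature space (lam is an assumed value).
    K = poly_kernel(X, X, degree)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def predict(X_train, alpha, X_new, degree=2):
    return poly_kernel(X_new, X_train, degree) @ alpha

def simulate_system(n, dim, rng):
    # Hypothetical test system: a linear term plus a squared projection, plus output noise.
    w1 = rng.standard_normal(dim)
    w1 /= np.linalg.norm(w1)
    w2 = rng.standard_normal(dim)
    w2 /= np.linalg.norm(w2)
    X = rng.standard_normal((n, dim))
    y = X @ w1 + (X @ w2) ** 2 + 0.1 * rng.standard_normal(n)
    return X, y

def held_out_correlation(degree, seed=0):
    # Fit on 400 samples, report the correlation between prediction and output on 200 more.
    rng = np.random.default_rng(seed)
    X, y = simulate_system(600, 20, rng)
    alpha = fit_kernel_ridge(X[:400], y[:400], degree=degree)
    pred = predict(X[:400], alpha, X[400:], degree=degree)
    return np.corrcoef(pred, y[400:])[0, 1]
```

Calling `held_out_correlation(2)` and `held_out_correlation(1)` compares the second-order model with a purely linear one on the same simulated data; only the former can capture the squared term.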

Bethge M, Macke JH, Gerwinn S, Zeck G

Identifying temporal population codes in the retina using canonical correlation analysis

Right from the first synapse in the retina, the visual information gets distributed across several parallel channels with different temporal filtering properties (Wässle, 2004). Yet, the prevalent system identification tool for characterizing neural responses, the spike-triggered average, only allows one to investigate the individual neural responses independently of each other. Here, we present a novel data analysis tool for the identification of temporal population codes based on canonical correlation analysis (Hotelling, 1936). Canonical correlation analysis allows one to find "population receptive fields" (PRFs) which are maximally correlated with the temporal response of the entire neural population. The method is a convex optimization technique which essentially solves an eigenvalue problem and is not prone to local minima. We apply the method to simultaneous recordings from rabbit retinal ganglion cells in a whole mount preparation (Zeck et al, 2005). The cells respond to a 16 by 16 pixel m-sequence stimulus presented at a frame rate of 1/(20 msec). The response of 27 ganglion cells is correlated with each input frame in an interval between zero and 200 msec relative to the stimulus. The 200 msec response period is binned into 14 equal-sized bins. As shown in the figure, we obtain six predictive population receptive fields (left column), each of which gives rise to a different population response (right column). The x-axis of the color-coded images used to describe the population response kernels (right column) corresponds to the index of the 27 different neurons, while the y-axis indicates time relative to the stimulus from 0 (top) to 200 msec (bottom). The six population receptive fields not only provide a more concise description of the population response but can also be estimated much more reliably than the receptive fields of individual neurons. In conclusion, we suggest characterizing retinal ganglion cell responses in terms of population receptive fields, rather than discussing stimulus-neuron and neuron-neuron dependencies separately.

Neuroforum, 13 (Supplement), p. 359, 2007
URL | PDF
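As a rough illustration of how canonical correlation analysis reduces to an eigenvalue-type problem, the sketch below whitens both data sets and takes a singular value decomposition of the whitened cross-covariance. It is a generic textbook formulation, not the authors' code; the small ridge term `reg`, the function names, and the simulated data are assumptions. In the setting of the abstract, each stimulus frame would have 16 x 16 = 256 dimensions and each response vector 27 x 14 = 378 dimensions; the demo uses much smaller toy dimensions.

```python
import numpy as np

def cca(X, Y, reg=1e-6):
    """CCA via whitening and an SVD.
    X: (n_samples, p) stimulus frames, Y: (n_samples, q) binned population responses.
    Returns stimulus-side directions A, response-side directions B, and correlations s."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    n = X.shape[0]
    Cxx = Xc.T @ Xc / n + reg * np.eye(X.shape[1])  # assumed small ridge for stability
    Cyy = Yc.T @ Yc / n + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / n
    Lx = np.linalg.cholesky(Cxx)  # Cxx = Lx Lx^T
    Ly = np.linalg.cholesky(Cyy)
    # Whitened cross-covariance M = Lx^{-1} Cxy Ly^{-T}
    M = np.linalg.solve(Lx, Cxy)
    M = np.linalg.solve(Ly, M.T).T
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    A = np.linalg.solve(Lx.T, U)     # columns play the role of population receptive fields
    B = np.linalg.solve(Ly.T, Vt.T)  # columns are the matching response patterns
    return A, B, s

def simulate_population(n=5000, seed=0):
    # Toy data: one hidden stimulus feature w drives a fixed pattern across 6 response bins.
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(8)
    w /= np.linalg.norm(w)
    pattern = np.array([1.0, -1.0, 0.5, 0.0, 2.0, -0.5])
    X = rng.standard_normal((n, 8))
    Y = np.outer(X @ w, pattern) + 0.5 * rng.standard_normal((n, 6))
    return X, Y, w

X, Y, w = simulate_population()
A, B, s = cca(X, Y)
# Cosine between the first recovered stimulus direction and the hidden feature (sign is arbitrary).
alignment = abs(A[:, 0] @ w) / np.linalg.norm(A[:, 0])
```

Because the solution comes from a single matrix decomposition, there is no iterative search that could get stuck in a local optimum, which matches the property highlighted in the abstract.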

Maack N, Kapfer C, Macke JH, Schölkopf B, Denk W, Borst A

3D Reconstruction of Neural Circuits from Serial EM Images

The neural processing of visual motion is of essential importance for course control. A basic model suggesting a possible mechanism of how such a computation could be implemented in the fly visual system is the so-called "correlation-type motion detector" proposed by Reichardt and Hassenstein in the 1950s. The basic requirement to reconstruct the neural circuit underlying this computation is the availability of electron microscopic 3D data sets of whole ensembles of neurons constituting the fly visual ganglia. We apply a new technique, "Serial Block Face Scanning Electron Microscopy" (SBFSEM), that allows for an automatic sectioning and imaging of biological tissue with a scanning electron microscope [Denk, Horstman (2004) Serial block face scanning electron microscopy to reconstruct three-dimensional tissue nanostructure. PLOS Biology 2: 1900-1909]. Image stacks generated with this technology have a resolution sufficient to distinguish different cellular compartments, especially synaptic structures. Consequently, detailed anatomical knowledge of complete neuronal circuits can be obtained. Such an image stack contains several thousands of images and is recorded with a minimal voxel size of 25nm in x and y and 30nm in z direction. Consequently, a tissue block of 1mm³ (volume of the Calliphora vicina brain) produces several hundred terabytes of data. Therefore, new concepts for managing large data sets and for automated 3D reconstruction algorithms need to be developed. We developed an automated image segmentation and 3D reconstruction software, which allows a precise contour tracing of cell membranes and simultaneously displays the resulting 3D structure. In detail, the software contains two stand-alone packages, Neuron2D and Neuron3D, both of which offer an easy-to-operate graphical user interface. The Neuron2D software provides the following image processing functions:
• Image Viewer: Displays image stacks in single-image or movie mode and optionally calculates the intensity distribution of each image.
• Image Preprocessing: Filters image stacks. Implemented filters are a Gaussian 2D and a Non-Linear-Diffusion Filter. The filter step enhances the contrast between contour lines and image background, leading to an enhanced signal-to-noise ratio which further improves detection of membrane structures.
• Image Segmentation: The implemented algorithm extracts contour lines from the preceding image and automatically traces the contour lines in the following images (z-direction), taking into account the previous image segmentation. In addition, manual interaction is possible.
To visualize 3D structures of neuronal circuits, the additional software Neuron3D was developed. The reconstruction of neuronal surfaces from contour lines, obtained in Neuron2D, is implemented as a graph theory approach. The reconstructed anatomical data can further provide a subset for computational models of neuronal circuits in the fly visual system.

Neuroforum, 13 (Supplement), p. 1195, 2007
URL

Kienzle W, Macke JH, Wichmann FA, Schölkopf B, Franz MO

Nonlinear Receptive Field Analysis: Making Kernel Methods Interpretable

Identification of stimulus-response functions is a central problem in systems neuroscience and related areas. Prominent examples are the estimation of receptive fields and classification images [1]. In most cases, the relationship between a high-dimensional input and the system output is modeled by a linear (first-order) or quadratic (second-order) model. Models with third or higher order dependencies are seldom used, since both parameter estimation and model interpretation can become very difficult. Recently, Wu and Gallant [3] proposed the use of kernel methods, which have become a standard tool in machine learning during the past decade [2]. Kernel methods can capture relationships of any order, while solving the parameter estimation problem efficiently. In short, the stimuli are mapped into a high-dimensional feature space, where a standard linear method, such as linear regression or Fisher discriminant, is applied. The kernel function allows for doing this implicitly, with all computations carried out in stimulus space. As a consequence, the resulting model is nonlinear, but many desirable properties of linear methods are retained. For example, the estimation problem has no local minima, which is in contrast to other nonlinear approaches, such as neural networks [4]. Unfortunately, although kernel methods excel at modeling complex functions, the question of how to interpret the resulting models remains. In particular, it is not clear how receptive fields should be defined in this context, or how they can be visualized. To remedy this, we propose the following definition: noting that the model is linear in feature space, we define a nonlinear receptive field as a stimulus whose image in feature space maximizes the dot-product with the learned model. This can be seen as a generalization of the receptive field of a linear filter: if the feature map is the identity, the kernel method becomes linear, and our receptive field definition coincides with that of a linear filter. If it is nonlinear, we numerically invert the feature space mapping to recover the receptive field in stimulus space. Experimental results show that receptive fields of simulated visual neurons, using natural stimuli, are correctly identified. Moreover, we use this technique to compute nonlinear receptive fields of the human fixation mechanism during free-viewing of natural images.

Computational and Systems Neuroscience Meeting (COSYNE 2007), p. 16, 2007
URL | PDF

Macke JH, Zeck G, Bethge M

Estimating Population Receptive Fields in Space and Time

Right from the first synapse in the retina, visual information gets distributed across several parallel channels with different temporal filtering properties. Yet, commonly used system identification tools for characterizing neural responses, such as the spike-triggered average, only allow one to investigate the individual neural responses independently of each other. Conversely, many population coding models of neurons and correlations between neurons concentrate on the encoding of a single-variate stimulus. We seek to identify the features of the visual stimulus that are encoded in the temporal response of an ensemble of neurons, and the corresponding spike-patterns that indicate the presence of these features. We present a novel data analysis tool for the identification of such temporal population codes based on canonical correlation analysis (Hotelling, 1936). The "population receptive fields" (PRFs) are defined to be those dimensions of the stimulus-space that are maximally correlated with the temporal response of the entire neural population, irrespective of whether the stimulus features are encoded by the responses of single neurons or by patterns of spikes across neurons or time. These dimensions are identified by canonical correlation analysis, a convex optimization technique which essentially solves an eigenvalue problem and is not prone to local minima. Each receptive field can be represented by the weighted sum of a small number of functions that are separable in space-time. Therefore, non-separable receptive fields can be estimated more efficiently than with spike-triggered techniques, which makes our method advantageous even for the estimation of single-cell receptive fields. The method is demonstrated by applying it to data from multi-electrode recordings from rabbit retinal ganglion cells in a whole mount preparation (Zeck et al, 2005). The figure displays the first 6 PRFs of a population of 27 cells from one such experiment. The recovered stimulus-features look qualitatively different to the receptive fields of single retinal ganglion cells. In addition, we show how the model can be extended to capture nonlinear stimulus-response relationships and to test different coding-mechanisms by the use of kernel-canonical correlation analysis. In conclusion, we suggest characterizing responses of ensembles of neurons in terms of PRFs, rather than discussing stimulus-neuron and neuron-neuron dependencies separately.

Computational and Systems Neuroscience Meeting (COSYNE 2007), p. 44, 2007
URL | PDF

2006

Macke J

Decision-Images: A tool for identifying critical stimulus features

neurons- during a visual task is an important pre-requisite for computational models of visual cognition. We describe a technique for estimating high-dimensional decision-images, and apply the method to a psychophysical gender discrimination task. The use of regularization makes it possible to map out decision-images using a relatively small number of stimuli. Statistical analysis of the result shows a remarkable fit to the datasets collected. This is remarkable because gender discrimination is a rather high-level visual task, and thus believed to be complex, whereas our model is conceptually rather simple. We demonstrate that the decision-images are sensitive to subtle changes in lighting, texture, and pose, and to individual differences in gender discrimination exhibited by our subjects. We show how decision-images can be used to create new stimuli, and how the approach can be generalized to non-linear and multi-scale decision-images. In addition, connections to reverse correlation techniques for receptive field estimation are described.

7th Conference of the Junior Neuroscientists of Tuebingen (NeNa 2006), 7, p. 10, 2006
URL
