Ventral-dorsal streams
Neural populations in primary visual cortex (blue) compute simple features of the visual input. The output is sent along the dorsal (green) and ventral (purple) streams to compute the location and identity of objects.

Towards a Theory of Neural Computation

I aim to understand how populations of neurons implement elementary computations, and how these computations combine to support complex cognition. Towards this aim, I develop theories of how the brain infers the state of the world and refines its inferences through experience, in a manner consistent with the distributed, recurrent, and hierarchical nature of neural computation. To validate these theories, I develop statistical models that capture the underlying structure in noisy neural recordings, and software libraries capable of simulating and fitting such models efficiently and without error. In the following paragraphs I sketch the past, present, and future of my work in these areas.

Hierarchical Model of the World
The brain as a hierarchical graphical model: the bottom of the hierarchy (Z) consists of sensations over time (t). Higher-level properties and concepts (Y and X) are dynamically inferred by the brain from these sensations.

The Bayesian Brain

I take the view that the brain encodes a recurrent, hierarchical model of the world (Friston 2008; Pitkow and Angelaki 2017). In this view, sensory receptors (e.g. the retina) constitute the bottom of the hierarchy, and the brain infers encodings of higher-level properties and concepts given the dynamic activity of its senses. Bayesian inference is the mathematically rigorous formulation of this inference problem (Pouget et al. 2013), but solving Bayesian inference problems is rarely trivial.
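To fix ideas, Bayes' rule can be illustrated with a minimal, self-contained sketch over discrete states (this is a toy illustration, not the Goal API and not a model of any particular neural circuit; the states, names, and numbers are invented for exposition):

```haskell
-- A minimal, illustrative discrete Bayes update: the posterior over
-- world states z given an observation is proportional to the
-- likelihood of the observation times the prior over states.

type Dist z = [(z, Double)]

-- Normalize a list of weighted states into a probability distribution.
normalize :: Dist z -> Dist z
normalize wzs = [ (z, w / total) | (z, w) <- wzs ]
  where total = sum (map snd wzs)

-- Bayes' rule: combine a likelihood p(observation | z) with a prior p(z).
posterior :: (z -> Double) -> Dist z -> Dist z
posterior likelihood prior =
    normalize [ (z, likelihood z * p) | (z, p) <- prior ]

data Weather = Rain | Sun deriving (Eq, Show)

-- Hypothetical likelihood of observing a wet sidewalk under each state.
wetSidewalk :: Weather -> Double
wetSidewalk Rain = 0.9
wetSidewalk Sun  = 0.1

main :: IO ()
main = print (posterior wetSidewalk [(Rain, 0.3), (Sun, 0.7)])
```

The difficulty alluded to above is that in realistic settings the state space is continuous and high-dimensional, so this normalization step becomes an intractable integral.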

In Sokoloski (2017) I showed how this problem can be solved exactly for a simple form of theoretical, dynamic neural population when the parameters of the neural population satisfy certain constraints. The results in this paper were a subset of the work I did in my dissertation (Sokoloski 2019), wherein I analyzed the general conditions under which dynamic Bayesian inference is computationally tractable.

Ultimately, animals infer the state of the world in order to behave effectively, and in my master’s thesis I did preliminary work on formulating motor control as a tractable inference problem (Sokoloski 2013). As a postdoctoral researcher in the lab of Philipp Berens, I am combining my experience in control theory and neural modelling to understand how reinforcement learning drives cellular diversity and the functional architecture of the retina.

Neural and Machine Learning

Bayesian inference combines information from observations with prior knowledge. In the Bayesian framework, the question of how the brain learns reduces to how the brain refines its prior knowledge given its history of sensory experience (Berkes et al. 2011). This reduction thus unifies the problem of how the brain learns with the problem of how to fit models to neural data; both problems can be solved by the method of maximum likelihood.

Novel Training Algorithms
Stochastic expectation-maximization (SEM) and stochastic gradient ascent (SGA) can fit models of correlated neural activity. With a bit of math (the Hybrid algorithm) we can exceed both and achieve near-optimal performance (Sokoloski and Coen-Cagli 2019).

In the lab of Ruben Coen-Cagli I developed a novel model of correlated, context-dependent neural population activity, and an efficient algorithm for fitting it to data (Sokoloski, Aschner, and Coen-Cagli 2021). Neural correlations can have a profound effect on neural computation (Kohn et al. 2016), and one of my ongoing projects is to understand whether the correlations amongst neurons in primary visual cortex might facilitate tractable Bayesian inference.
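A far simpler sketch, in the spirit of the mixture-based approach but not the published model, shows how mixtures of Poisson distributions induce correlations: two neurons that are conditionally independent Poisson given a latent state become correlated once the state is marginalized out, and by the law of total covariance the covariance of their counts equals the covariance of their rates across mixture components (the weights and rates below are invented for illustration).

```haskell
-- Covariance of two Poisson neurons' spike counts under a finite
-- mixture. Each component carries a mixture weight w and a pair of
-- firing rates (r1, r2); given the component, the counts are
-- independent, so all correlation comes from the rates.
covariance :: [(Double, (Double, Double))] -> Double
covariance comps = e12 - e1 * e2
  where
    e1  = sum [ w * r1      | (w, (r1, _ )) <- comps ]
    e2  = sum [ w * r2      | (w, (_ , r2)) <- comps ]
    e12 = sum [ w * r1 * r2 | (w, (r1, r2)) <- comps ]

main :: IO ()
main =
    -- Two equally weighted components: a "low gain" and "high gain" state.
    print (covariance [(0.5, (2, 3)), (0.5, (6, 9))])
```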

In my dissertation (Sokoloski 2019) I did preliminary work on a novel algorithm for fitting hierarchical models, which remains an open problem in machine learning and computational neuroscience (LeCun, Bengio, and Hinton 2015). This algorithm relies on the aforementioned constraints for tractable Bayesian inference, and the work I have described so far constitutes steps towards implementing it. If I succeed in realizing this larger project, I hope to demonstrate how the organization of tuning and correlations in neural populations facilitates both tractable inference and learning in the brain.

   :: c # x
   -> c #* x
   -> Double
Dot product of two points in Goal. The first point is in c coordinates on the x manifold, and the second is in the dual space of the first.

Type-Safe Numerical Optimization

In attempting to implement, apply, and simulate complex models, one comes up against the limits of our human ability to write error-free code. Strongly-typed languages help us avoid such errors by granting the compiler the ability to verify our code before we run it. At the same time, not all code can be embedded at the type level, and libraries written in strongly-typed languages must trade off verifiability against practicality.

I have developed a set of libraries for the programming language Haskell which I call the Geometric Optimization Libraries (Goal). Goal provides types and functions based on simple ideas from differential and information geometry (Amari and Nagaoka 2007). Essentially, Goal treats vectors of numbers as points on a manifold in a given coordinate system. Many fundamental mathematical operations can be formulated as changes of coordinates, and Goal thereby provides a compact yet general interface to optimization and statistics. Moreover, by embedding these fundamental concepts at the type level, Goal allows the Haskell type-checker to serve as a simple proof assistant, and in this way Goal facilitates my mathematical work as well.
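The type-level idea can be sketched in plain Haskell with phantom types. This is a deliberate simplification, not Goal's actual definitions (Goal's dot product, shown above, pairs a point with one in the dual coordinate system; here both arguments share one coordinate tag, and all names below are invented for the sketch):

```haskell
-- A Point is a vector of numbers tagged at the type level with a
-- coordinate system c and a manifold x, so operations that mix
-- incompatible coordinates are rejected by the compiler.
newtype Point c x = Point [Double] deriving Show

-- Phantom tags for two coordinate systems and a manifold.
data Cartesian
data Polar
data Euclidean2

-- A change of coordinates is an explicit function, so a Polar point
-- can never be used where a Cartesian one is expected.
toCartesian :: Point Polar Euclidean2 -> Point Cartesian Euclidean2
toCartesian (Point [r, theta]) = Point [r * cos theta, r * sin theta]
toCartesian _ = error "Euclidean2 points have exactly two coordinates"

-- The dot product is only defined between points whose coordinate
-- systems match at the type level.
dotProduct :: Point c x -> Point c x -> Double
dotProduct (Point xs) (Point ys) = sum (zipWith (*) xs ys)

main :: IO ()
main = print (dotProduct p (toCartesian q))
  where p = Point [1, 0] :: Point Cartesian Euclidean2
        q = Point [2, 0] :: Point Polar Euclidean2
```

Writing `dotProduct p q` without the coordinate change is a type error, caught before the program ever runs; this is the sense in which the type-checker acts as a simple proof assistant.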

Caricature of Immanuel Kant (2007)
Immanuel Kant showed how we may arrive at truths about reality through reason alone.

Embodiment, Enaction, and Transcendental Arguments

I believe that data cannot speak for itself, and must be interpreted through the lens of well-motivated theory. Physicists have produced insights in many fields by applying their theories and physical intuition to reveal simple structures in data. Nevertheless, a purely physicalist view of the brain ignores what our faculty of introspection and our innate intuitions as cognitive beings reveal about the structure of cognition and neural activity. In particular, I believe that theories of the brain as a Bayesian inference machine are underconstrained (Bowers and Davis 2012; Colombo and Seriès 2012), and that we may at least partially address this with a priori analyses of the embodied and enactive nature of cognition (Thompson 2007; Stewart, Gapenne, and Di Paolo 2011).

Once upon a time I was a student of philosophy. Although I did not pursue a career in the subject, philosophy continues to inform my research. I dream of one day teaching introductory philosophy courses, and perhaps even contributing to the field in my winter years.


Amari, Shun-ichi, and Hiroshi Nagaoka. 2007. Methods of Information Geometry. Vol. 191. American Mathematical Soc.

Berkes, Pietro, Gergő Orbán, Máté Lengyel, and József Fiser. 2011. “Spontaneous Cortical Activity Reveals Hallmarks of an Optimal Internal Model of the Environment.” Science 331 (6013): 83–87.

Bowers, Jeffrey S., and Colin J. Davis. 2012. “Bayesian Just-so Stories in Psychology and Neuroscience.” Psychological Bulletin 138 (3): 389–414.

Colombo, Matteo, and Peggy Seriès. 2012. “Bayes in the Brain—on Bayesian Modelling in Neuroscience.” The British Journal for the Philosophy of Science 63 (3): 697–723.

Friston, Karl. 2008. “Hierarchical Models in the Brain.” Edited by Olaf Sporns. PLoS Computational Biology 4 (11): e1000211.

Kohn, Adam, Ruben Coen-Cagli, Ingmar Kanitscheider, and Alexandre Pouget. 2016. “Correlations and Neuronal Population Information.” Annual Review of Neuroscience 39 (1): 237–56.

LeCun, Yann, Yoshua Bengio, and Geoffrey Hinton. 2015. “Deep Learning.” Nature 521 (7553): 436–44.

Pitkow, Xaq, and Dora E. Angelaki. 2017. “Inference in the Brain: Statistics Flowing in Redundant Population Codes.” Neuron 94 (5): 943–53.

Pouget, Alexandre, Jeff Beck, Wei Ji Ma, and Peter Latham. 2013. “Probabilistic Brains: Knowns and Unknowns.” Nature Neuroscience 16 (9): 1170–78.

Sokoloski, Sacha. 2013. “Efficient Stochastic Control with Kullback Leibler Costs Using Kernel Methods.” Master’s thesis, Technical University of Berlin.

———. 2017. “Implementing a Bayes Filter in a Neural Circuit: The Case of Unknown Stimulus Dynamics.” Neural Computation 29 (9): 2450–90.

———. 2019. “Implementing Bayesian Inference with Neural Networks.” PhD thesis, University of Leipzig.

Sokoloski, Sacha, Amir Aschner, and Ruben Coen-Cagli. 2021. “Modelling the Neural Code in Large Populations of Correlated Neurons.” Edited by Jonathan W Pillow, Joshua I Gold, and Kenneth D Harris. eLife 10 (October): e64615.

Sokoloski, Sacha, and Ruben Coen-Cagli. 2019. “Conditional Finite Mixtures of Poisson Distributions for Context-Dependent Neural Correlations.” arXiv:1908.00637 [Cs, Stat], August.

Stewart, John Robert, Olivier Gapenne, and Ezequiel A. Di Paolo. 2011. Enaction: Toward a New Paradigm for Cognitive Science. MIT Press.

Thompson, Evan. 2007. Mind in Life: Biology, Phenomenology, and the Sciences of Mind. Harvard University Press.