The paradigmatic example of a complex system is the web of biochemical interactions that make up life. We still know very little about the organization of life as a dynamical, interacting network of genes, proteins, and biochemical reactions. How do biochemical networks—containing many regulatory, signaling, and metabolic processes—achieve reliability and robustness while evolving? Cells function reliably despite noisy, dynamic environments, which is all the more impressive given that the control strategies implemented by intra- and inter-cellular processes cannot rely on a centralized, global view of the relevant networks. Are the resulting complex dynamics made up of relatively autonomous modules? If so, what is their functional role and how can they be identified? How robust is the collective computation performed by intra-cellular networks to mutations, delays, and stochastic noise? To address these questions, we focus on developing both novel methodologies and informatics tools to study control and collective computation in the automata networks used to model gene regulation and biochemical signaling.
Network science has provided many insights into the organization of complex systems. The success of this approach lies in its ability to capture the organization of multivariate interactions as networks or graphs, without explicit dynamical rules for node variables. As the field matures, however, there is a need to move from understanding to controlling complex systems; for example, to revert a diseased cell to a healthy state, or a mature cell to a pluripotent state. This is particularly true in systems biology and medicine, where increasingly accurate models of biochemical regulation have been produced. We have contributed to this goal with two mathematical concepts that allow us to remove different forms of redundancy in networks: 1) distance closures, and 2) canalization via schema redescription. The latter concept is used to remove redundancy from the logical rules of biochemical regulation models in systems biology, revealing that most variables (e.g. chemical species) rely on a smaller subset of their inputs to be regulated (canalization). The removal of this redundancy simplifies, and indeed enables, the causal and actionable characterization of control in large biochemical and neural network models, which are otherwise too large to study analytically [Rocha, 2022].
While recent interest in linear control has led much research to infer control from the structure of network interactions, we have shown that structural controllability, minimum dominating sets, and even feedback vertex set theory fail to properly characterize controllability in systems biology models of biochemical regulation, and even in small network motifs. Indeed, structure-only methods both undershoot and overshoot the number of control variables, and misidentify which sets of variables actually control these models, highlighting the importance of system dynamics in determining control [Gates and Rocha, 2016] [Gates et al, 2021]. We have also shown that the logic of automata transition functions, namely how canalizing they are, plays a key role in dynamical regime and evolvability [Manicka, Marques-Pita and Rocha, 2022], as well as in controllability [Gates et al, 2021] (see more on canalization below).
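A toy illustration of this point (not one of the published motifs): two Boolean networks with identical interaction structure but different transition logic can have different attractor landscapes, so any structure-only method necessarily assigns them the same controllability even though their dynamics differ.

```python
from itertools import product

def step(state, fs):
    """Synchronously update every node of a Boolean network."""
    return tuple(int(f(state)) for f in fs)

def fixed_points(fs, n=2):
    """Enumerate all states that map to themselves."""
    return [s for s in product((0, 1), repeat=n) if step(s, fs) == s]

# Identical wiring (each node reads both nodes), different logic:
and_net = [lambda s: s[0] and s[1], lambda s: s[0] and s[1]]
xor_net = [lambda s: s[0] ^ s[1], lambda s: s[0] ^ s[1]]

print(fixed_points(and_net))  # -> [(0, 0), (1, 1)]
print(fixed_points(xor_net))  # -> [(0, 0)]
```

The AND version is bistable while the XOR version has a single fixed point, yet their interaction graphs are indistinguishable.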
Considering the possible dynamics that can unfold on a specific network structure is a focus of research in our group. In addition to control, we study prediction [Kolchinsky and Rocha, 2011], modularity [Kolchinsky, Gates and Rocha, 2015; Marques-Pita and Rocha, 2013], multi-scale integration in the dynamics of complex networks, such as brain networks [Kolchinsky et al, 2014], and other scalable methods to study the dynamics of networks [Rocha, 2022; Parmer, Rocha, & Radicchi, 2022]---for dynamics on networks, see our work on distance backbones.
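To give a flavor of the distance-backbone idea on weighted graphs, here is a minimal sketch (illustrative only, not our published implementation): the metric backbone keeps exactly those edges that are not broken by a shorter indirect path, which preserves all shortest-path distances while discarding redundant edges.

```python
from itertools import combinations

INF = float("inf")

def metric_backbone(dist):
    """Return the edges whose direct distance equals the all-pairs
    shortest-path distance; removing the other edges breaks no
    shortest path in the graph."""
    n = len(dist)
    sp = [row[:] for row in dist]
    for k in range(n):                      # Floyd-Warshall
        for i in range(n):
            for j in range(n):
                sp[i][j] = min(sp[i][j], sp[i][k] + sp[k][j])
    return sorted((i, j) for i, j in combinations(range(n), 2)
                  if dist[i][j] < INF and dist[i][j] == sp[i][j])

# Triangle where the direct 0-2 edge (length 3) is longer than the
# indirect path 0-1-2 (length 2), so that edge is redundant:
dist = [[0, 1, 3],
        [1, 0, 1],
        [3, 1, 0]]
print(metric_backbone(dist))  # -> [(0, 1), (1, 2)]
```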
Schema redescription with two symbols is a method to eliminate redundancy in the transition tables of Boolean automata. One symbol is used to capture redundancy of individual input variables, and another to capture permutability in sets of input variables, fully characterizing the canalization present in Boolean functions [Marques-Pita and Rocha, 2011] [Marques-Pita and Rocha, 2013] (see figure). In our formulation, canalization becomes synonymous with the redundancy present in the logic of automata. This results in straightforward measures to quantify canalization in an automaton (micro-level), which are in turn integrated into a highly scalable way to characterize the collective dynamics of large-scale automata networks (macro-level). This way, our approach provides a method to link micro- to macro-level dynamics---a crux of complexity. Our methodology is applicable to any complex network that can be modelled using automata, but we focus on biochemical regulation and signalling, towards a better understanding of the (decentralized) control that orchestrates cellular activity---with the ultimate goal of explaining how cells and tissues "compute". Our group also provides open-source code so that others can test our approach [Correia, Gates, Wang, and Rocha, 2018].
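A simplified sketch of the micro-level idea (wildcard redundancy only; the full two-symbol method also captures permutability of input groups): an input of a Boolean automaton is fully redundant when flipping it never changes the output. The helper below is illustrative, not our published implementation.

```python
from itertools import product

def effective_inputs(f, k):
    """Return the indices of inputs on which Boolean function f
    (of k inputs) actually depends; the remaining inputs are fully
    redundant, i.e. canalized away."""
    effective = set()
    for i in range(k):
        for state in product((0, 1), repeat=k):
            flipped = list(state)
            flipped[i] ^= 1                 # toggle input i
            if f(*state) != f(*flipped):
                effective.add(i)
                break
    return effective

# Example: this 3-input function ignores its second input entirely
f = lambda a, b, c: a or c
print(effective_inputs(f, 3))  # -> {0, 2}
```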
By removing redundancy (canalization) from discrete models of biochemical regulation, we can extract the effective structure that controls their dynamics, revealing their dynamical modularity (modules in the dynamics rather than in the structure of networks) and robustness [Marques-Pita and Rocha, 2013]. In particular, we can extract the minimal conditions (as schemata or motifs) and critical nodes that control convergence to attractors---associated with phenotypic behavior in these models, such as the effect of specific medications in cancer [Gates et al, 2021]. The approach is scalable because it only needs to compute the redundancy of the transition function of each node in the network, rather than the entire dynamical landscape of the multivariate dynamical system [Rocha, 2022]. This has led us, for instance, to a better understanding of a well-known 60-variable model of the intra- and inter-cellular genetic regulation of body segmentation in Drosophila melanogaster [Marques-Pita and Rocha, 2013]. We were able to measure more accurately the size of its wild-type attractor basin (larger than previously thought), to identify novel minimal conditions and critical nodes that control wild-type behaviour, and to estimate its resilience to stochastic interventions. Similarly, we are able to characterize analytically the causal pathways in large cancer models (without estimating them via Monte Carlo simulations) [Gates et al, 2021] and to better predict the dynamical regime of large ensembles of automata networks and of experimentally validated systems biology models [Manicka, Marques-Pita and Rocha, 2022].
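The contrast with brute-force analysis can be made concrete. For a small toy network (hypothetical rules, unrelated to the Drosophila or cancer models), the entire dynamical landscape can still be enumerated exhaustively; it is exactly this exhaustive sweep that becomes infeasible at 60+ variables and that canalization-based methods avoid.

```python
from itertools import product

# A toy 3-node Boolean network (hypothetical rules for illustration)
rules = {0: lambda s: s[1],
         1: lambda s: s[0],
         2: lambda s: s[0] and s[1]}

def step(state):
    """Synchronous update of all nodes."""
    return tuple(int(rules[i](state)) for i in sorted(rules))

def attractors(n):
    """Walk every state's trajectory to its attractor (exhaustive sweep:
    2**n trajectories, which is what does not scale)."""
    found = set()
    for state in product((0, 1), repeat=n):
        seen = []
        while state not in seen:
            seen.append(state)
            state = step(state)
        cycle = seen[seen.index(state):]
        k = cycle.index(min(cycle))          # canonical rotation
        found.add(tuple(cycle[k:] + cycle[:k]))
    return found

print(attractors(3))
# two fixed points (000 and 111) and one 2-cycle (010 <-> 100)
```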
We have been interested in the problem of how information, symbols, representations, and the like can arise from a purely dynamical system of many components. This is a topic of particular interest in Cognitive Science, where the notions of representation and symbol often divide the field into opposing camps. In the area of Embodied Cognition, the idea of self-organization in dynamical systems often leads researchers to reject representational or semiotic elements in their models of cognition. This attitude seems not only excessive but indeed absurd, as it ignores the informational processes so important for biological organisms. Therefore, we have been working both on a re-formulation of the concept of representation for embodied cognition, and on simulations of dynamical systems (using Cellular Automata) in which one can study the origin of representations.
The Evolving Cellular Automata experiments of Crutchfield, Mitchell et al. in the late 1990s were very exciting, as the ability of evolved cellular automata to solve non-trivial computation tasks seemed to provide clues about the origin of representations and information from dynamical systems [Mitchell, 1998] [Rocha, 1998b]. We conducted additional experiments which extended the density classification task with more difficult logical tasks [Rocha, 2000; Rocha, 2004]. Later, we proposed a re-formulation of the concept of representation in cognitive science and artificial life which is based on this work, but argues that the type of emergent computation observed in these experiments does not produce representations as rich as those observed in biology and cognition [Rocha and Hordijk, 2005]. These experiments allow us to think about how to evolve symbols from artificial matter in computational environments. The figure above depicts a space-time diagram and particle model of a CA rule evolved to solve the AND task. Additional figures and experiment details of CA rules for logical tasks in our experiments are also available.
We have also used our dynamical redundancy removal method to show that, despite having very different collective behavior, Cellular Automata (CA) rules can be very similar at the local interaction level [Marques-Pita and Rocha, 2011]—leading us to question the tendency in complexity research to pay much more attention to emergent patterns than to local interactions. Additionally, schema redescription allows us to obtain more amenable search spaces of CA rules for the Density Classification Task, yielding some of the best known rules for this task [Marques-Pita and Rocha, 2008; Marques-Pita, Mitchell, and Rocha, 2008].
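For readers unfamiliar with the Density Classification Task: a binary CA should converge to all-1s when the initial configuration holds a majority of 1s, and to all-0s otherwise. A minimal sketch of why this is hard, using the naive radius-1 local majority rule (not one of our evolved rules): the naive rule typically freezes into stable 0/1 domains instead of reaching global consensus, which is precisely what evolved rules must overcome.

```python
import random

def majority_step(cells):
    """One synchronous update of a radius-1 majority CA on a ring."""
    n = len(cells)
    return [int(cells[i - 1] + cells[i] + cells[(i + 1) % n] >= 2)
            for i in range(n)]

def run(cells, steps=100):
    """Iterate until a fixed point or the step budget is exhausted."""
    for _ in range(steps):
        nxt = majority_step(cells)
        if nxt == cells:
            break
        cells = nxt
    return cells

# A random majority-1 initial condition on 49 cells
random.seed(0)
ic = [int(random.random() < 0.7) for _ in range(49)]
print(run(ic))  # typically freezes into mixed 0/1 domains, not all-1s
```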
Additionally, by removing redundancy (canalization) from discrete automata networks, we can identify their dynamical modularity (modules in the dynamics rather than in the structure of networks). This allows us to observe that biochemical regulation relies on pathways comprised of specific states of biochemical variables, which are largely dynamically decoupled from one another and function as macroscale building blocks for collective behavior (or emergent computation) [Marques-Pita and Rocha, 2013] (see side image).
Luis Rocha (PI)
Rion Brattig Correia
Felipe Xavier Costa
Manuel Marques-Pita