Conceived and designed the experiments: KMK. Performed the experiments: KMK. Analyzed the data: TG. Wrote the paper: TG KMK JF.
The authors have declared that no competing interests exist.
Two main approaches to exploring causal relationships in biological systems using time-series data are the Dynamic Causal Model (DCM) and the Granger Causal Model (GCM). These have been extensively applied to brain imaging data and are also readily applicable to a wide range of temporal changes involving genes, proteins or metabolic pathways. However, these two approaches have always been considered to be radically different from each other and have therefore been used independently. Here we present a novel approach which extends the Granger Causal Model and also shares features of the bilinear approximation of the Dynamic Causal Model. We first tested the efficacy of the extended GCM by applying it extensively to toy models in both time and frequency domains and then applied it to local field potential recording data collected from
The right temporal cortex has previously been shown to play a greater role in the discrimination of faces in both sheep and humans. In the frequency domain, analysis of the relative causal contributions of low (theta, 4–8 Hz) and high (gamma, 30–70 Hz) frequency oscillations reveals that, prior to learning, theta activity is more predominant in right than in left hemisphere processing, and that learning reduces this predominance so that high frequency oscillations gain more control. We have been able to demonstrate that the frequency of connections increases in the right hemisphere and decreases between the left and right hemispheres after learning. These results were obtained by combining aspects of both the Granger and Dynamic Causal Models in an approach that can establish significant causal relations in both time and frequency domains. We applied it to local field potential recordings from multiple (64-channel) electrodes implanted in the inferotemporal cortex on both sides of the brain in sheep, in order to establish changes in causal connections within and between the two hemispheres as a result of learning to discriminate visually between pairs of faces. It is anticipated that this new approach to the measurement of causality will not only help reveal how the two brain hemispheres interact, but will also be applicable to many different types of biological data where variations in both frequency and temporal domains can be measured.
In order to exploit the full potential of high-throughput data in biology, we have to be able to convert them into the most appropriate framework for contributing to knowledge about how the biological system generating them functions. This process is best conceptualized as first building a nodal network from empirically derived knowledge of the biological structures and molecules involved (nodes) and then using computational steps to discover the nature, dynamics and directionality of the connections (directed edges) between them.
Causality analysis based upon experimental data has become one of the most powerful and valuable tools in discovering connections between different elements in complex biological systems
The key question we want to address in the current paper is whether we can develop an extended, biophysically constrained approach that shares the features of the various approaches mentioned above, in particular of the two causal models, DCM and GCM. The significance of such an approach is clear, and we would expect that its application could represent a powerful new tool in systems and computational biology, particularly in association with increasingly powerful genomic, proteomic and metabolic methodologies allowing time-series measurements of large numbers of putatively interacting molecules.
In this paper, we show that GCM can be extended to a more biophysically constrained model by incorporating some features of the bilinear approximation of DCM. By setting up a conventional VAR model with additional deterministic inputs and observation variables, we can create a more general model, the Extended Granger Causal Model (EGCM), which offers a new way to establish connectivity.
The EGCM is first tested in two toy models. With both state and observation variables, the interactions between nodes are successfully recovered using an extended Kalman filter approach and partial Granger to establish causality in both time (DCM) and frequency (GCM) domains respectively. The GCM approach itself is not tailored particularly well for biological experiments where we are often faced with the case of the data being recorded with and without a stimulus present. The time gap between two adjacent stimuli is very short and we would expect the network structure to remain unchanged during the whole experiment although the form and the intensity of the input may be unknown. This scenario is also the case for the gene network data considered in
To exemplify the direct application of EGCM to establishing causality in a specific biological system, we have applied it to local field potential (LFP) data recorded in the sheep inferotemporal cortex (IT) of both left and right hemispheres before and after they learn a visual face discrimination task
All animal experiments were performed in strict accordance with the UK 1986 Animals Scientific Procedures Act (including approval by the Babraham Institute Animal Welfare and Ethics Committee) and during them the animals were housed inside in individual pens and able to see and communicate with each other. Food and water were available ad libitum. Post-surgery all animals received both post-operative analgesia treatment to minimize discomfort and antibiotic treatment to prevent any possibility of infection.
The traditional and widely used Granger Causal Model takes the form
In spite of its successful application, GCM has limitations: it requires direct observation of the state variables and does not include designed experimental effects in the model. Here we extend GCM to a more reasonable and biophysically constrained model by incorporating additional deterministic inputs and observation variables, closely following equations which are features of the Dynamical Causal Model and its bilinear approximation form
Now, if we can recover the state variables
For the extended Granger Causal model Eq. (2), we now introduce an algorithm to estimate the state variables as well as all of its parameters, which gives a first indication of the connections between the state variables.
Let
Then, the VAR(
In order to apply the model to real data, we have to estimate both the states and the parameters of the model from input variables and noisy observations. A widely used method for this dual estimation is the extended Kalman filter (EKF)
Let
Given the estimated state
We use the new observation
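To make the predict/update cycle of dual estimation concrete, the following is a minimal sketch and not the EGCM algorithm itself. It assumes a deliberately simplified setting: a single scalar state following an AR(1) process with one unknown coefficient `a`, observed through additive noise. The function name `joint_ekf_ar1`, the noise levels and the initial guesses are all illustrative assumptions. The unknown coefficient is appended to the state vector, so that one extended Kalman filter tracks the state and the parameter together.

```python
import numpy as np

def joint_ekf_ar1(y, q=1.0, r=0.5, q_param=1e-6):
    """Joint EKF for x_t = a * x_{t-1} + w_t, y_t = x_t + v_t with unknown a.

    The augmented state is z = [x, a]; the parameter a is modeled as a
    very slow random walk (variance q_param), which is an assumption.
    Returns an array of shape (len(y), 2) with the filtered [x, a].
    """
    z = np.array([0.0, 0.0])           # initial guess for [x, a]
    P = np.eye(2)                      # initial covariance of the guess
    Q = np.diag([q, q_param])          # process noise for [x, a]
    H = np.array([[1.0, 0.0]])         # only x is observed
    history = np.empty((len(y), 2))
    for t, obs in enumerate(y):
        x, a = z
        # Prediction: propagate through f(z) = [a * x, a] using its Jacobian F
        F = np.array([[a, x], [0.0, 1.0]])
        z_pred = np.array([a * x, a])
        P_pred = F @ P @ F.T + Q
        # Update: correct the prediction with the new observation
        S = H @ P_pred @ H.T + r       # innovation variance (1 x 1)
        K = P_pred @ H.T / S           # Kalman gain (2 x 1)
        innovation = obs - z_pred[0]
        z = z_pred + (K * innovation).ravel()
        P = (np.eye(2) - K @ H) @ P_pred
        history[t] = z
    return history

# Simulated example with a known coefficient, observed through noise
rng = np.random.default_rng(0)
n, a_true = 2000, 0.8
x = np.zeros(n)
for t in range(1, n):
    x[t] = a_true * x[t - 1] + rng.normal()
y = x + rng.normal(scale=np.sqrt(0.5), size=n)

est = joint_ekf_ar1(y)
print("final estimate of a:", est[-1, 1])
```

Because the estimate is refreshed at every time step, the parameter trajectory `est[:, 1]` can be inspected as it evolves, which is the property that makes a filter of this kind usable online.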
After recovering the state variables using the EGCM algorithm above, we can define the causality with the idea proposed by Granger. The only difference is that, in our EGCM model, two deterministic inputs
For simplicity of notation, here we only formulate EGCM for two time series
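As an illustration of the time-domain definition, the sketch below computes the classical pairwise Granger causality as the log ratio of two residual variances: one from predicting the target using its own past only, the other from also using the past of the source. It assumes the state variables are already available (here they are simulated directly), uses ordinary least squares rather than the EGCM estimation procedure, and omits the deterministic inputs. The helper names `lagged`, `residual_variance` and `granger_time` are hypothetical.

```python
import numpy as np

def lagged(series, p):
    """Stack lags 1..p of a 1-D series as columns, aligned with series[p:]."""
    n = len(series)
    return np.column_stack([series[p - k : n - k] for k in range(1, p + 1)])

def residual_variance(target, regressors):
    """Residual variance of a least-squares fit of target on regressors plus an intercept."""
    X = np.column_stack([np.ones(len(target)), regressors])
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    return np.var(target - X @ coef)

def granger_time(source, target, p=2):
    """F_{source -> target} = ln(restricted residual variance / full residual variance)."""
    y = target[p:]
    own_past = lagged(target, p)
    both_pasts = np.column_stack([own_past, lagged(source, p)])
    return np.log(residual_variance(y, own_past) / residual_variance(y, both_pasts))

# Simulated pair in which x1 drives x2 but not the reverse
rng = np.random.default_rng(0)
n = 2000
x1 = np.zeros(n)
x2 = np.zeros(n)
for t in range(1, n):
    x1[t] = 0.5 * x1[t - 1] + rng.normal()
    x2[t] = 0.5 * x2[t - 1] + 0.6 * x1[t - 1] + rng.normal()

f12 = granger_time(x1, x2)
f21 = granger_time(x2, x1)
print("F(1 -> 2):", f12, " F(2 -> 1):", f21)
```

The measure is non-negative by construction, since the full regression nests the restricted one, and it is close to zero when the source adds no predictive information.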
Our EGCM also allows a frequency domain decomposition to detect the intrinsic causal influence which provides valuable information.
We define the lag operator
Rewriting equation (7) in terms of the lag operator, we have:
Since what we really care about is the causal relationship caused by the intrinsic connection of the state variables rather than the outside driving force, i.e. the input
After normalizing equation (9) using the transformation proposed by Geweke
Note that although here we just provide the definition of pairwise Granger causality for EGCM, it's obvious that similar methods can be easily applied to the definition of conditional, partial or complex Granger causality in both time and frequency domains
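The frequency-domain decomposition can be sketched for a plain bivariate VAR as follows. This is an illustrative sketch under simplifying assumptions: the state variables are simulated directly, the coefficients are fitted by least squares, no deterministic inputs are included, and the formula used is the standard Geweke spectral measure for two variables. The function names `fit_var` and `spectral_granger` are hypothetical.

```python
import numpy as np

def fit_var(data, p):
    """Least-squares VAR(p) fit for data of shape (n, d).

    Returns A of shape (p, d, d), where A[k] multiplies the lag-(k+1) values,
    and Sigma, the residual covariance matrix.
    """
    data = data - data.mean(axis=0)
    n, d = data.shape
    X = np.column_stack([data[p - k : n - k] for k in range(1, p + 1)])
    Y = data[p:]
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)       # shape (p * d, d)
    resid = Y - X @ B
    Sigma = resid.T @ resid / len(Y)
    A = np.array([B[k * d : (k + 1) * d].T for k in range(p)])
    return A, Sigma

def spectral_granger(A, Sigma, src, dst, omegas):
    """Geweke spectral causality src -> dst for a bivariate VAR.

    omegas are angular frequencies in radians per sample (0..pi).
    """
    p, d, _ = A.shape
    out = np.empty(len(omegas))
    # Source innovation variance left after removing its instantaneous
    # correlation with the target innovation (Geweke's normalization)
    partial = Sigma[src, src] - Sigma[src, dst] ** 2 / Sigma[dst, dst]
    for i, w in enumerate(omegas):
        Aw = np.eye(d, dtype=complex)
        for k in range(p):
            Aw -= A[k] * np.exp(-1j * w * (k + 1))
        H = np.linalg.inv(Aw)                        # transfer matrix
        S = H @ Sigma @ H.conj().T                   # spectral matrix
        total = S[dst, dst].real
        intrinsic = total - partial * abs(H[dst, src]) ** 2
        out[i] = np.log(total / intrinsic)
    return out

# Simulated pair in which variable 0 drives variable 1
rng = np.random.default_rng(0)
n = 4000
data = np.zeros((n, 2))
for t in range(1, n):
    data[t, 0] = 0.5 * data[t - 1, 0] + rng.normal()
    data[t, 1] = 0.5 * data[t - 1, 1] + 0.6 * data[t - 1, 0] + rng.normal()

A, Sigma = fit_var(data, p=2)
omegas = np.linspace(0.0, np.pi, 128)
f01 = spectral_granger(A, Sigma, src=0, dst=1, omegas=omegas)
f10 = spectral_granger(A, Sigma, src=1, dst=0, omegas=omegas)
print("mean causality 0 -> 1:", f01.mean(), " 1 -> 0:", f10.mean())
```

To express the result in Hz, the angular frequencies can be converted with `omegas * fs / (2 * np.pi)` for a sampling rate `fs`, which lets the spectrum be averaged over bands of interest.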
Three female sheep were used (Ovis aries, one Clun Forest and two Dorsets). All experiments were performed in strict accordance with the UK 1986 Animals Scientific Procedures Act. During the experiments the animals were housed inside in individual pens. They were trained initially over several months to perform operant-based face (sheep) or non-face (objects) discrimination tasks with the animals making a choice between two simultaneously presented pictures, one of which was associated with a food reward. During stimulus presentations, animals stood in a holding trolley and indicated their choice of picture by pressing one of two touch panels located in the front of the trolley with their nose. The food reward was delivered automatically to a hopper between the two panels. The life-sized pictures were back projected onto a screen
Following initial behavioral training sheep were implanted under general anesthesia (fluothane) and full aseptic conditions with unilateral (Sheep A-right IT) or bilateral (Sheep B and C) planar 64-electrode arrays (epoxylite coated, etched, tungsten wires with
For data analysis of the stored signals LFP data contaminated with noise such as from animal chewing food were excluded as were LFPs with unexpectedly high power. For LFPs, offline filtering was applied in the range of
In order to evaluate the performance of EGCM for the estimation of the state variables as well as the prediction of the parameters, we first applied the method to two toy models.
The first toy model we used comes from a traditional VAR model which has been extensively applied in tests of Granger causality
A. Traces of the time series. B. The causal relationships considered in Toy Model 1 between the three state variables. C. The estimated parameters
The observation variables were
Now, we can apply the method to this toy model, i.e., to estimate all the parameters
After the state variables were recovered, we computed the partial Granger causality in both time and frequency domains (see
When dealing with real data it is quite common that we need to detect the causal influence between time series from several variables affected by some stimulus. The stimulus may be very complicated, or hard to measure, and it may be impossible to formulate its form explicitly. However, if we ignore the influence of these inputs and use a traditional VAR model to detect the causality it is quite probable that we will get a misleading structure even if we use a high-order VAR model.
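The following sketch illustrates this point on simulated data; it is not Toy Model 2. It assumes two state variables that are not coupled to each other but are both driven, at different delays, by a common stimulus. The stimulus is a hypothetical random pulse train, and, unlike the harder case of an unknown input, its time course is assumed to be known so that it can be added as a regressor. A conventional autoregressive fit without the input reports a spurious influence from the first variable to the second, which largely disappears once the input is included.

```python
import numpy as np

def resid_var(y, X):
    """Residual variance of a least-squares fit of y on X plus an intercept."""
    X = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.var(y - X @ coef)

rng = np.random.default_rng(1)
n = 4000
u = (rng.random(n) < 0.2).astype(float)    # hypothetical pulse-train stimulus
x1 = np.zeros(n)
x2 = np.zeros(n)
for t in range(2, n):
    # No coupling between x1 and x2; both are driven by u at different delays
    x1[t] = 0.5 * x1[t - 1] + 2.0 * u[t - 1] + rng.normal()
    x2[t] = 0.5 * x2[t - 1] + 2.0 * u[t - 2] + rng.normal()

y = x2[2:]                                  # x2 at time t
own = x2[1:-1]                              # x2 at time t-1
other = x1[1:-1]                            # x1 at time t-1
inputs = np.column_stack([u[1:-1], u[:-2]]) # u at times t-1 and t-2

# Causality x1 -> x2 from a model that ignores the stimulus
f_without = np.log(resid_var(y, own) /
                   resid_var(y, np.column_stack([own, other])))
# Same measure when the stimulus enters both regressions
f_with = np.log(resid_var(y, np.column_stack([own, inputs])) /
                resid_var(y, np.column_stack([own, other, inputs])))

print("ignoring input:", f_without, " including input:", f_with)
```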
We used the following toy model which has exactly the same connection coefficients between the three state variables considered in Toy model 1 with an additional simple constant input function
Here, we assumed that
Network structures with and without stimulus. A. Confidence intervals of all links between units. The data is generated with Eq. (15), but we use
With our extended model we can, to some extent, solve the above issue and detect the causal influence correctly amongst those state variables affected by some unknown stimulus intermittently, although our model is originally set up for deterministic inputs.
We next generated a time series of 10000 time points which was composed of 10 segments of equal length, i.e.,
The five segments
Hence, the network structure of the three state variables is still the same as shown in
Local field potential data were obtained from 64-channel multielectrode arrays implanted in the right and left inferior temporal cortices of three sheep (one sheep only had electrodes in the right hemisphere) as previously described
With these experimental data, we can directly use our EGCM to detect the global network for all electrodes in both brain hemispheres. However, due to the size of the network, there are at least a few thousand free parameters to fit. To avoid this issue, we adopt another approach here. For each session we randomly select 3 time series from each region and apply our model to detect the network structure for the six electrodes. This procedure is repeated 100 times for each session (see
The network detected by EGCM (top-panel) and the corresponding frequency decomposition (bottom-panel) for six randomly selected electrodes. In the frequency decomposition, significant causal influences are marked by red.
A further problem here is that if we reconstruct the connections for each set of six electrodes (left and right) separately before and after the stimulus, we could end up with two different structures for the time series (not shown). This cannot reflect the true connectivity, since the duration of the stimuli is quite short (1–3 seconds) and the connections will not change in such a short time. To recover a reasonable structure of the connections in these brain areas, we therefore assume here that the connections in each trial do not change and that the time series before and after the stimulus are generated from a unified structure. With the application of our EGCM approach, we can include the intermittent stimulus and obtain a comparatively reliable structure.
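The random subsampling procedure described above (three electrodes drawn from each hemisphere, repeated 100 times per session) can be sketched as follows. This is only an outline of the bookkeeping: `estimate_network` is a hypothetical stand-in that thresholds lag-1 least-squares coefficients and is not the EGCM estimator, the threshold is arbitrary, and the input arrays are synthetic noise standing in for the 64-channel recordings of one session.

```python
import numpy as np

def estimate_network(segment, threshold=0.2):
    """Hypothetical stand-in for EGCM: thresholded lag-1 VAR coefficients.

    segment has shape (n_samples, n_channels). Returns a boolean matrix adj
    where adj[i, j] is True when channel j is taken to influence channel i.
    """
    X, Y = segment[:-1], segment[1:]
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)
    adj = np.abs(B.T) > threshold
    np.fill_diagonal(adj, False)
    return adj

def subsample_connections(left, right, n_pick=3, n_repeats=100, seed=0):
    """Mean number of detected links per category over random electrode subsets."""
    rng = np.random.default_rng(seed)
    counts = {"left_to_right": 0, "right_to_left": 0,
              "within_left": 0, "within_right": 0}
    for _ in range(n_repeats):
        li = rng.choice(left.shape[1], size=n_pick, replace=False)
        ri = rng.choice(right.shape[1], size=n_pick, replace=False)
        adj = estimate_network(np.column_stack([left[:, li], right[:, ri]]))
        # Rows are targets and columns are sources; the first n_pick are left
        counts["within_left"] += adj[:n_pick, :n_pick].sum()
        counts["within_right"] += adj[n_pick:, n_pick:].sum()
        counts["left_to_right"] += adj[n_pick:, :n_pick].sum()
        counts["right_to_left"] += adj[:n_pick, n_pick:].sum()
    return {k: v / n_repeats for k, v in counts.items()}

# Synthetic placeholder data: 500 samples from 64 channels per hemisphere
rng = np.random.default_rng(42)
left = rng.normal(size=(500, 64))
right = rng.normal(size=(500, 64))
summary = subsample_connections(left, right)
print(summary)
```

Fitting only six channels at a time keeps the number of free parameters small, at the cost of estimating each subnetwork without conditioning on the electrodes that were not drawn.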
In
A. A summary of the results in B; locations in the inferotemporal cortex are not precise and are for illustrative purposes only. B. The mean connections from left hemisphere to right hemisphere, from right hemisphere to left hemisphere, and within both regions, with the three bars corresponding to the results before learning (blue bar), after learning (green bar), and one month after learning (purple bar) in Sheep B. Significant changes after t-test are marked by arrows (from right to left, no pairs are significant, as indicated by “none”; within the right hemisphere, all pairs are significant, marked by “all”). For Sheep C, an additional bar (one week after learning) is added (the third bar). Only significant changes from left to right and within the right hemisphere are indicated by arrows. C. Statistical summaries of the results in B.
One of the advantages of the extended approach is that we have a frequency domain decomposition. Brain rhythms, not surprisingly, have also been intensively investigated in the literature
A. Mean and maximum ratio using all the three sheep before and after learning. B. Upper panel: Mean and maximum ratio of sheep B (see Experiment subsection in
In order to provide a deeper insight into this frequency story, we compute the ratios at different stages of learning.
EGCM has a strong connection with DCM as well as GCM. We consider the Dynamical Causal model:
Here we focus on the bilinear approximation of the Dynamical Causal model which is the most parsimonious but useful form
The bilinear approximation of DCM is represented in terms of nonlinear differential equations, whereas the GCM (see Eq. (1)) is formulated in discrete time, with the dependencies among state variables approximated by a linear mapping over time lags. At first sight the two seem quite different. However, the main difference is that the bilinear form includes deterministic inputs and observation variables and equations, which are not considered in GCM. The formulation (18) comes from the Volterra series and is certainly a more accurate and biophysically constrained representation of a biological system. On the other hand, the GCM with its autoregressive representation always takes past information into consideration, while the bilinear approximation of DCM has no time lags in the differential equations, although the general form of DCM may have
In contrast to all previous methods for estimating Granger causality in the literature, where essentially a regression method is employed, in EGCM we incorporate noisy observation variables and apply the extended Kalman filter to recover the state variables. Additional inputs are also included in EGCM on the basis of an autoregressive model. This has clear advantages over the previous methods. The EGCM is more reasonable when we are faced with experimental data affected by a particular stimulus, and it is applicable to cases where we cannot track the state variables directly but only a function of them, or where the observation noise is considerable. Compared to traditional VAR models, all the coefficients in EGCM correspond to intrinsic or latent dynamic coupling and to changes induced by each input, which makes the model interpretable. Furthermore, all the previous methods for estimating Granger causality rely on batch learning: they require collection of all the data before an estimate can be made. The extended Kalman filter, on the other hand, is an online learning method: we can now update Granger causality instantaneously. One may argue that this is a common feature of online versus batch learning. However, it is novel in the context of Granger causality. When we are faced with biological data, this feature becomes particularly significant. Adaptation, or learning in animals, is very important, but it makes the data difficult to analyze since adaptation introduces dynamic change into the system. The classical way of estimating Granger causality can cope with this difficulty by introducing sliding windows when analyzing the data. Of course, selecting an optimal window size is always an issue in such an approach. In Kalman filtering, however, we obtain the connections between the state variables directly from the connection matrix, and this issue is automatically resolved.
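For comparison, the classical sliding-window workaround mentioned above can be sketched as follows. The example is a simplified illustration on simulated data in which the coupling from one variable to another is switched on halfway through the recording. The lag-1 regression, the window length and the step size are all arbitrary assumptions, and the function names `granger_lag1` and `sliding_granger` are hypothetical.

```python
import numpy as np

def granger_lag1(source, target):
    """Lag-1 pairwise Granger causality source -> target by least squares."""
    y = target[1:]
    ones = np.ones(len(y))

    def resid_var(X):
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        return np.var(y - X @ coef)

    restricted = np.column_stack([ones, target[:-1]])
    full = np.column_stack([ones, target[:-1], source[:-1]])
    return np.log(resid_var(restricted) / resid_var(full))

def sliding_granger(source, target, window, step):
    """Re-estimate the causality measure in successive windows."""
    starts = range(0, len(source) - window + 1, step)
    return np.array([granger_lag1(source[s:s + window], target[s:s + window])
                     for s in starts])

# Coupling x1 -> x2 is absent in the first half and present in the second
rng = np.random.default_rng(3)
n = 2000
x1 = np.zeros(n)
x2 = np.zeros(n)
for t in range(1, n):
    coupling = 0.8 if t >= n // 2 else 0.0
    x1[t] = 0.5 * x1[t - 1] + rng.normal()
    x2[t] = 0.5 * x2[t - 1] + coupling * x1[t - 1] + rng.normal()

values = sliding_granger(x1, x2, window=400, step=200)
print(values)
```

Windows that straddle the change point mix the two regimes, so a shorter window localizes the change better but gives noisier estimates. This is the window-size trade-off that a recursive filter avoids.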
In comparison with the bilinear approximation of DCM, the advantages of EGCM are the following. First, it incorporates time delays in the model more naturally and makes them easier to deal with. Time delay is ubiquitous in biological systems, no matter whether we are considering gene, protein, metabolic or neuronal networks. Secondly, using Granger causality we are able to summarize the causal effect in a single number, which is more transparent and easier to understand, particularly in a system with a time delay. Thirdly, it allows a frequency domain decomposition. When we are dealing with a dynamic system, it is sometimes much more informative to view it in the frequency domain rather than in the time domain, as we have partly demonstrated here. Of course, since Eq. (19) is a continuous time version of Eq. (2), the results in the frequency domain obtained for Eq. (2) essentially hold for the DCM model as well. We summarize our comparisons in
Commonalities | DCM | GCM | EGCM |
Multivariate analysis of time-series data | Yes | Yes | Yes |
Models directed coupling | Yes | Yes | Yes |
Inference on models | Yes | Yes | Yes |
Frequency decomposition | Yes | Yes | Yes |
Differences | DCM | GCM | EGCM |
Causality based on temporal precedence | No | Yes | EGCM is more general |
Causality based on control theory | Yes | No | Yes |
Requires known inputs | Yes | No | In general yes, but see example 2 |
Requires orthogonal innovations | No | Yes | Not necessary |
Requires stationary processes | No | Yes | Could use sliding window |
Requires a specific biophysical model | Yes | No | Yes |
Models non-linear coupling | Yes | No | Yes |
Inference on model parameters | Yes | No | Yes |
In the current paper, we have only applied EGCM to LFP data although it is clearly applicable to many other types of biological data. For example, in gene microarray data, we can have a readout of transcriptional changes in several thousand genes at different times over a period of many hours
The results of the EGCM analysis of our IT LFP data provide the first evidence for connectivity changes between and within left and right ITs as a result of face recognition learning. It is clear that learning is a dynamic and complex process
In the current paper, we have not explicitly introduced the spatial correlation between the variables (electrodes). In other words, we have ignored the geometric relationship of the electrodes in the array. This is certainly an over-simplification of the real situation, for the following reasons. First of all, despite the long history of multi-electrode array recordings,
We thank Prof. Karl Friston for his helpful discussions and advice which helped us improve our paper considerably.