Conceived and designed the experiments: RS PYB. Performed the experiments: RS PYB. Analyzed the data: RS PYB. Contributed reagents/materials/analysis tools: RS PYB. Wrote the paper: RS PYB.
The authors have declared that no competing interests exist.
Realistic, individual-based models based on detailed census data are increasingly used to study disease transmission. Whether the rich structure of such models improves predictions is debated. This is studied here for the spread of varicella, a childhood disease, in a realistic population of children where infection occurs in the household, at school, or in the community at large. A methodology is first presented for simulating households with births and aging. Transmission probabilities were fitted for schools and community, which reproduced the overall cumulative incidence of varicella over the age range of 0–11 years old.
Moreover, the individual-based model structure allowed us to reproduce several observed features of VZV epidemiology which were not included as hypotheses in the model: the age at varicella in first-born children was older than in other children, in accordance with observation; the same was true for children residing in rural areas. Model-predicted incidence was comparable to observed incidence over time. These results show that models based on detailed small-scale census data provide valid small-scale predictions. By simulating several scenarios, we evaluate how varicella epidemiology is shaped by policies such as age at first school enrolment and school exclusion. This supports the use of such models for investigating outcomes of public health measures.
Individual-based models of disease transmission have increasingly included detailed demographic data to describe more accurately the places where populations mix and infection occurs. These models may help to understand heterogeneities in transmission in more detail and to improve public health decisions. Here, the spread of varicella, a childhood disease, is studied in such a model where spatial and population structures are explicitly modelled. The model focuses on children, organized in households, schools and municipalities, in agreement with census data. The detailed structure of the population used in the model makes it possible to reproduce several observed differences in the epidemiology of varicella, for example, variation in age at infection according to birth rank and place of residence. These results support using detailed models with the eventual aim of improving decisions in public health.
Varicella is endemic in most Western countries where vaccination has not been implemented
The explanation of such differences is likely to be found in factors shaping the possibilities of infection in children, such as household structure
Data from OECD and median age at varicella infection in European countries
Indeed, computational models of disease spread have increasingly tried to describe more accurately the places where populations mix and infection occurs, using detailed demographic data (see for example
Indeed, census data typically provide a cross-sectional view of the population, yielding the current distribution of household sizes and age of members. Using such data, it is possible to simulate populations in which household structure and age structure conform with the census (see
Here, the simulation of a realistic population of children over several years is described, using detailed census data. The spread of varicella is explored in this population, where infection can be transmitted in several locations (household, school and municipality). Model predictions are compared with data previously obtained from the population of Corsican children in 2008
A spatially explicit, stochastic, agent-based model was developed using detailed demographic data from the island of Corsica. The study population is limited to children under 12 years old (y.o.), as varicella is infrequent in older subjects
Creating households directly from census data, as described above and in
(A) Distribution of household final number of children. (B) Observed (white) and simulated (grey) distribution of time lag between successive siblings (in years). (C) Observed (white) and simulated (grey) number of children in Corsica according to age. (D) Observed (white) and simulated (grey) number of children aged less than 12 years old per household. (E) Commuting to school outside the municipality of residence.
The Corsican population (300,000 inhabitants, including 35,000 children under 12 y.o.) was distributed over municipalities (n = 360, mean area: 24 km²) according to the current number of households comprising at least one child under 12 years old. Schools (n = 268) were created at the corresponding locations using data from the French Ministry of National Education. The school capacity (number of children during the school year) and the type of school (for children aged 3 to 7 y.o., 8 to 11 y.o., or 3 to 11 y.o.) were also taken into account. Information on commuting and data from the Ministry of Education were used to allocate children to schools (
The unit of simulation time was the day.
The size of a household was sampled from the HFS distribution, and the age of the first-born child
At each time step, the age of household members was updated. Children whose age became positive entered the simulation. New households were created (as above) at a rate inversely proportional to the average sojourn time of households in the simulation, and were introduced in each municipality according to municipality size, so that the number of households was approximately constant over time. The age of the oldest child in each newly created household was set to 0.
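The daily update described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the names `Household`, `new_household`, `step` and `active_children`, the storage of ages in days (with a negative age standing for a child not yet born), and the 365-day year are all hypothetical choices made for the example.

```python
from dataclasses import dataclass, field
from typing import List

DAYS_PER_YEAR = 365                 # assumption: leap years are ignored
MAX_AGE_DAYS = 12 * DAYS_PER_YEAR   # study population: children under 12 y.o.


@dataclass
class Household:
    # Ages in days; a negative age means the child is not born yet.
    ages: List[int] = field(default_factory=list)

    def step(self) -> None:
        """Advance every born or unborn child by one day."""
        self.ages = [a + 1 for a in self.ages]

    def active_children(self) -> List[int]:
        """Ages of children currently simulated: age positive and under 12."""
        return [a for a in self.ages if 0 < a < MAX_AGE_DAYS]

    def finished(self) -> bool:
        """True once every child has aged out of the study population."""
        return all(a >= MAX_AGE_DAYS for a in self.ages)


def new_household(birth_lags_days: List[int]) -> Household:
    """Create a household whose oldest child has age 0.

    birth_lags_days[i] is the (hypothetical) lag, in days, between
    child i and child i + 1; younger siblings start with negative ages.
    """
    ages, age = [0], 0
    for lag in birth_lags_days:
        age -= lag
        ages.append(age)
    return Household(ages)
```

For instance, `new_household([730])` creates a household whose second child enters the simulation two years after the first; a household can be discarded once `finished()` returns `True`.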
Once a year during simulation time (in September), all children who had reached their third birthday were allocated to a school in a municipality according to the commuting probabilities and school capacity reported by the Ministry of Education. Older children changed schools as required according to age. The school calendar, defining days without school (weekends, short holidays and summer holidays), was reproduced from the year 2007.
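A possible form of this yearly allocation step is sketched below. The function name `allocate_school`, the dictionary layouts, and the rule of choosing among open schools in proportion to their remaining places are assumptions made for illustration; the text only states that commuting probabilities and school capacities were used.

```python
import random
from typing import Dict, List, Optional


def allocate_school(home: str,
                    commuting: Dict[str, Dict[str, float]],
                    schools: Dict[str, List[dict]],
                    rng: random.Random) -> Optional[dict]:
    """Pick a school for a child who has just reached 3 years of age.

    commuting[home][m]: probability that a child living in `home`
                        attends school in municipality m.
    schools[m]:         list of {"name", "capacity", "enrolled"} records.
    """
    # Draw the municipality of schooling from the commuting probabilities.
    destinations = list(commuting[home])
    weights = [commuting[home][m] for m in destinations]
    municipality = rng.choices(destinations, weights=weights, k=1)[0]

    # Keep only schools that still have free places.
    open_schools = [s for s in schools[municipality]
                    if s["enrolled"] < s["capacity"]]
    if not open_schools:
        return None  # assumption: the caller decides how to handle this case

    # Assumption: choose in proportion to the remaining capacity.
    free_places = [s["capacity"] - s["enrolled"] for s in open_schools]
    school = rng.choices(open_schools, weights=free_places, k=1)[0]
    school["enrolled"] += 1
    return school
```

Passing a seeded `random.Random` instance keeps the allocation reproducible from one run to the next.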
The natural history of varicella was described by an M/S/E/I/R compartmental model
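As an illustration of how such a compartmental progression can be encoded per child, the sketch below implements the M/S/E/I/R sequence as a small state machine. The stage durations are deliberately left as inputs, since they are not given here; the names `State` and `next_state` and the fixed-duration rule for leaving a stage are assumptions of the example.

```python
from enum import Enum
from typing import Dict


class State(Enum):
    M = "maternal immunity"
    S = "susceptible"
    E = "exposed"
    I = "infectious"
    R = "recovered"


def next_state(state: State,
               days_in_state: int,
               infected_today: bool,
               durations: Dict[State, int]) -> State:
    """Return the state of a child at the next daily time step.

    durations gives the number of days spent in M, E and I
    (hypothetical values supplied by the caller).
    """
    if state is State.M and days_in_state >= durations[State.M]:
        return State.S
    if state is State.S and infected_today:
        return State.E
    if state is State.E and days_in_state >= durations[State.E]:
        return State.I
    if state is State.I and days_in_state >= durations[State.I]:
        return State.R
    return state  # R is absorbing; otherwise the child stays where it is
```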
The daily probability of infection
From model simulations, we calculated the following quantities (averaged over 80 successive years):
Cumulated incidence of varicella according to age (
Weekly incidence of varicella.
Place where infection occurred. Each time a child was infected, the source of infection was calculated proportionally to the terms in
The parameters
The other terms were determined using maximum likelihood. More precisely, we computed the likelihood of the model based on the cumulative incidence according to age as
Exploration of the likelihood was done using Latin hypercube sampling
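For readers unfamiliar with the technique, a basic Latin hypercube sampler can be written in a few lines. The sketch below is a generic textbook version using only the Python standard library, not the code used in the study; the function name `latin_hypercube` and the unit bounds in the usage note are assumptions.

```python
import random
from typing import List, Sequence, Tuple


def latin_hypercube(n_samples: int,
                    bounds: Sequence[Tuple[float, float]],
                    rng: random.Random) -> List[List[float]]:
    """Draw n_samples points such that, in every dimension, each of the
    n_samples equal-width strata contains exactly one point."""
    columns = []
    for low, high in bounds:
        # One uniform draw inside each stratum of [0, 1) ...
        points = [(i + rng.random()) / n_samples for i in range(n_samples)]
        # ... then shuffle so that strata are paired at random across dimensions.
        rng.shuffle(points)
        columns.append([low + p * (high - low) for p in points])
    # Transpose: one row per sample, one column per parameter.
    return [list(row) for row in zip(*columns)]
```

For example, `latin_hypercube(1000, [(0.0, 1.0)] * 3, random.Random(1))` would give 1000 candidate triples of transmission probabilities, each of which can then be scored with the likelihood.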
A realistic age-structured model has been implemented as described in
The population structure created in the simulation was stable over time (number of households, number of children), and the age distribution of children and current household size matched that of the census data (
We explored four versions of the model, allowing different locations for mixing in each version. Each time, we selected the parameter set leading to the best fit in terms of likelihood (
Observed cumulated incidence of varicella (red), simulated by the RAS model (green), and simulated allowing Households only (dashed dotted dark), Households and Municipalities (dashed dark), Households and Schools (dotted dark), Households and Schools and municipality (plain dark). Hatches correspond to the 95% CI of the “Households and Schools and Municipality” model.
Mixing levels allowed in the model | Homogeneous age-structured mixing | Household; External | Household; Municipality; External | Household; School; External | Household; School; Municipality; External |
(Model name) | (RAS) | (HE) | (HME) | (HSE) | (HSME) |
Log-Likelihood | | | | | |
AIC | | | | | |
Source of infection | | | | | |
Household | | | | | |
School | | | | | |
Municipality | | | | | |
External | | | | | |
Using the RAS model, it was possible to simulate cumulative incidence data close to what was observed in Corsica. However, the fit of this model, as measured by the likelihood, was worse than that obtained with the best-fitting individual-based models (
In the individual-based model, when the probabilities of being infected with varicella in the municipality and in the school were set to 0 (model HE), it was not possible to obtain a good fit with the cumulated incidence profile with age. In the best fitting combination, even when large external transmission was allowed, the cumulated incidence (CI) at age 12 was 48%, short of the 90% observed (
We then included transmission in the municipality, but not in schools (model HME). The fit improved, with cumulative incidence of varicella reaching 90% by 12 years of age. The percentage of infections occurring outside households and the municipality was 8% (
The model with transmission in schools, but not in the municipality (model HSE), provided a better global fit as judged by the AIC. However, this time, the CI increased too slowly in children aged less than 3 years old, reaching 18% compared with 20% in the real data. Moreover, the CI at 12 years old was 92%, more than the observed 89%.
Finally, the model allowing transmission in both schools and municipalities (model HSME) provided the best fit (i.e. smallest AIC, see
Using the best fitting model (HSME), the cumulative incidence of varicella was predicted in two special cases: in first-born and in other children; and in rural and urban settings.
(A) Cumulated incidence in first-borns (observed: plain, simulated: grey zone) and others (observed: dashed, simulated: hatched zone). (B) Cumulated incidence in urban (observed: plain, simulated: grey zone) and rural municipalities (observed: dashed, simulated: hatched zone). (C) Observed weekly incidence (plain) and simulated (grey zone) starting from the first week of September to the end of August. Hatches correspond to school holidays. In all simulations, the maximum and minimum from over 100 years of simulation are reported.
When not all levels of mixing were allowed, the model failed to reproduce these quantities. For example, when the municipality was not included, although the simulated overall cumulated incidence was close to that observed in Corsican children, the cumulated incidence in first-born children was largely underestimated: 8% at 3 years old (not shown), compared with 22% in the observed data.
The average weekly incidence agreed with that reported by the Sentinelles network system
In France, infected children are excluded from school as soon as varicella symptoms are recognized. We explored how this public health measure impacted the spread of varicella. Simulations were run assuming that infected children remained present in the community and school until they were cured, with the same level of infectivity as during the asymptomatic stage. As a consequence, age at varicella decreased in the whole population, with median age at infection shifting from 4.7 to 2.6 years old (
(A) Varicella CI according to school exclusion (school exclusion - current: plain, no school exclusion: dashed). (B) Varicella CI according to age at first-school enrolment: at 3 y.o. (current policy) (bold), at 2 y.o. (dashed), at 4 y.o. (dotted), at 5 y.o. (dash dotted).
In this paper, we have shown that including detailed population structure in models of varicella transmission allowed us to reproduce the cumulated incidence of varicella according to age. A better fit was obtained than with a realistic age structured model, even using mixing matrices based on real contacts. Importantly, specific features observed in varicella epidemiology were reproduced owing to the detailed structure of the model. This included, for example, differences in age at varicella according to birth rank and place of residence, indicating that the rich structure built into epidemiologic models using census data leads to improved models regarding disease spread.
Modelling varicella, or other childhood diseases, requires simulating populations over several years. Indeed, seroprevalence studies show that most infections occur during the first 12 years of life
Varicella provided an excellent case study, since it is a common childhood disease in France (as universal vaccination is not recommended) and surveillance data are available; however, the model could easily be applied to other childhood infectious diseases. The varicella natural history description was standard
One issue in the present modelling was how to initialise the population regarding susceptibility to the disease. Indeed, the susceptibility of siblings or schoolmates is not independent, since the disease is transmissible. We used two approaches: (1) start with an entirely susceptible population or (2) randomly assign a susceptibility status according to the observed cumulative incidence with age
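The second approach can be sketched as an independent draw per child. The function name `assign_initial_status`, the one-year age bins and the two-letter status codes are assumptions of the example, and any cumulative incidence values passed in are supplied by the caller. By construction, this draw ignores the dependence between siblings or schoolmates noted above.

```python
import random
from typing import Dict, List


def assign_initial_status(ages_years: List[int],
                          cumulative_incidence: Dict[int, float],
                          rng: random.Random) -> List[str]:
    """Return "R" (already infected) or "S" (susceptible) for each child.

    cumulative_incidence[a] is the proportion of children of age a
    (in completed years) who have already had varicella.
    """
    statuses = []
    for age in ages_years:
        already_infected = rng.random() < cumulative_incidence[age]
        statuses.append("R" if already_infected else "S")
    return statuses
```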
The choice of Corsica to build the model was motivated by the availability of epidemiological data for comparison; nevertheless, changing the input census data would make it possible to use the model in other settings. The child contact network was described in terms of household, school and municipality. The detailed description of these places structured the possibilities of interaction according to age and space, in place of mixing matrices used in other models
Number of contacts | Log-likelihood of the best-fit model | % of cases occurring in households | % of cases occurring in school | % of cases occurring in municipalities | % of cases occurring outside |
3 contacts in the municipality | −5941.84 | 40 | 11 | 44 | 5 |
10 contacts in the municipality | −5941.12 | 40 | 11 | 45 | 4 |
20 contacts in the municipality | −5941.05 | 40 | 11 | 45 | 4 |
50 contacts in the municipality | −5941.55 | 40 | 11 | 45 | 4 |
100 contacts in the municipality | −5941.71 | 40 | 11 | 46 | 3 |
10 contacts at school | −5941.52 | 39 | 14 | 43 | 4 |
15 contacts at school | −5941.34 | 40 | 12 | 45 | 3 |
30 contacts at school | −5941.12 | 40 | 11 | 46 | 3 |
40 contacts at school | −5941.66 | 42 | 10 | 44 | 4 |
75 contacts at school | −5941.58 | 42 | 8 | 46 | 4 |
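The log-likelihoods in the table above are very close to one another. The short calculation below makes this concrete by converting them to AIC values with the standard formula AIC = 2k − 2 log L. It assumes, for illustration only, that every row corresponds to a model with the same number of estimated parameters (k = 3), in which case AIC differences reduce to twice the log-likelihood differences.

```python
# Best-fit log-likelihoods copied from the sensitivity table.
LOG_LIKELIHOODS = {
    "3 contacts in the municipality": -5941.84,
    "10 contacts in the municipality": -5941.12,
    "20 contacts in the municipality": -5941.05,
    "50 contacts in the municipality": -5941.55,
    "100 contacts in the municipality": -5941.71,
    "10 contacts at school": -5941.52,
    "15 contacts at school": -5941.34,
    "30 contacts at school": -5941.12,
    "40 contacts at school": -5941.66,
    "75 contacts at school": -5941.58,
}

N_PARAMETERS = 3  # assumption: same number of fitted parameters in every row


def aic(log_likelihood: float, n_parameters: int) -> float:
    """Akaike information criterion: 2k - 2 log L."""
    return 2 * n_parameters - 2 * log_likelihood


aics = {name: aic(ll, N_PARAMETERS) for name, ll in LOG_LIKELIHOODS.items()}
delta_aic = max(aics.values()) - min(aics.values())
print(f"Largest AIC difference across scenarios: {delta_aic:.2f}")
```

Under this assumption the scenarios differ by at most about 1.6 AIC units, which is consistent with the fit being largely insensitive to the assumed number of contacts.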
The basic reproduction number (R0), corresponding to the number of secondary cases caused by one case in a totally susceptible population, was approximately 4 (average over 500 simulations with 1 initial random case). This value was at the low end among European countries
The model required estimating only 3 parameters corresponding to daily transmission probabilities, and provided a very good fit to the data. In the best fitting case, the daily probability of pairwise transmission in the household was approximately 17%, and it was 13% in the school and 12% in the municipality. Interestingly, these probabilities are consistent with the rate of varicella transmission (0.00133/minute) derived from models based on time use data
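One way to see how a daily probability and a per-minute rate can be compared is to assume a constant hazard during contact, so that the probability of transmission after t minutes is 1 − exp(−rate × t). This exponential relationship is an assumption of the sketch below, not a statement about how the cited comparison was made, and the function names are hypothetical.

```python
import math

RATE_PER_MINUTE = 0.00133  # transmission rate quoted in the text


def daily_probability(minutes: float, rate: float = RATE_PER_MINUTE) -> float:
    """Probability of transmission after `minutes` of contact (constant hazard)."""
    return 1.0 - math.exp(-rate * minutes)


def implied_contact_minutes(probability: float,
                            rate: float = RATE_PER_MINUTE) -> float:
    """Daily contact time that would yield `probability` at the given rate."""
    return -math.log(1.0 - probability) / rate


for place, p in [("household", 0.17), ("school", 0.13), ("municipality", 0.12)]:
    print(f"{place}: about {implied_contact_minutes(p):.0f} minutes per day")
```

Under this assumption, the fitted probabilities correspond to roughly 140, 105 and 96 minutes of daily pairwise contact in the household, school and municipality, respectively.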
In conclusion, detailed simulation of realistic populations of children over several years may improve the study of childhood disease transmission. Further comparisons with compartmental models using realistic mixing matrices are necessary to identify the best approaches to support public health decisions.
The authors would like to thank the French Education Nationale for their technical support. We thank Anders Boyd, Anne Cori and Pascal Crépey for useful comments.