trajectory matching

Apply a deterministic model to large populations at lightning speed, powered by Distributed Compute Labs.


Data analysis typically involves identifying regions of parameter space within which a postulated model is statistically consistent with the data. Additionally, one frequently desires to assess the relative merits of alternative models as explanations of the data.

A partially observed Markov process (POMP) model consists of incomplete and noisy measurements of a latent, unobserved Markov process. It has been a challenge to provide a software environment that can effectively handle broad classes of POMP models and take advantage of the wide range of statistical methodologies that have been proposed for such models. The pomp software package (King et al. 2016) provides us with a wide range of functions to represent POMP models.

Deterministic models are often used when dealing with large populations. In deterministic compartmental models, the transition rates from one compartment to another are expressed mathematically as derivatives, so the model is formulated as a system of ordinary differential equations (ODEs; Kermack WO [1927]). Trajectory matching attempts to match trajectories of a deterministic model to data. It estimates the model parameters by fitting the trajectory to the data: it maximizes the likelihood of the data under the assumption that there is no process noise and that all stochasticity is measurement error. The R package pomp provides useful tools for trajectory matching.

Infectious diseases of humans have been well-documented in literature and historical records due to their sometimes calamitous effects on civilizations. The effect of a disease on a population varies, depending on the structure of the population (rural or urban, aging or young, easy or difficult access to health care), and the history that the community has had with the specific disease. Mathematical and statistical tools have been used to describe the dynamics of infectious diseases.

The SEIR model is an extension of the model first studied by Kermack and McKendrick [1932, 1933]. It is a standard compartmental framework in epidemiology in which each host occupies one of four compartments: susceptible, exposed, infected, or recovered. Writing \(N(t)\) for the population at time \(t\), the compartments satisfy $$S(t) + E(t) + I(t) + R(t) = N(t)$$
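The compartment structure can be made concrete with a small numerical sketch. The text does not write out the SEIR equations, so the block below assumes the common closed-population form with a constant transmission rate `beta`, a progression rate `sigma` (exposed to infected), and a recovery rate `gamma`; these names and all numeric values are illustrative assumptions, not taken from the source. It integrates the system with a classical fourth-order Runge-Kutta step and shows that the compartments keep summing to the total population.

```python
def seir_rhs(state, beta, sigma, gamma):
    """Right-hand side of a closed-population SEIR model (assumed standard form)."""
    s, e, i, r = state
    n = s + e + i + r
    new_infections = beta * s * i / n  # flow S -> E
    return (
        -new_infections,
        new_infections - sigma * e,    # flow E -> I at rate sigma
        sigma * e - gamma * i,         # flow I -> R at rate gamma
        gamma * i,
    )


def rk4_step(state, dt, beta, sigma, gamma):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, h):
        return tuple(b + h * k for b, k in zip(base, slope))

    k1 = seir_rhs(state, beta, sigma, gamma)
    k2 = seir_rhs(shifted(state, k1, dt / 2), beta, sigma, gamma)
    k3 = seir_rhs(shifted(state, k2, dt / 2), beta, sigma, gamma)
    k4 = seir_rhs(shifted(state, k3, dt), beta, sigma, gamma)
    return tuple(
        x + dt / 6 * (a + 2 * b + 2 * c + d)
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def seir_trajectory(state0, beta, sigma, gamma, dt=0.1, steps=1000):
    """Return the list of states visited, starting from state0."""
    path = [state0]
    for _ in range(steps):
        path.append(rk4_step(path[-1], dt, beta, sigma, gamma))
    return path


# Hypothetical initial state (S, E, I, R) and rates, chosen only for illustration.
path = seir_trajectory((990.0, 0.0, 10.0, 0.0), beta=0.5, sigma=0.2, gamma=0.1)
final_s, final_e, final_i, final_r = path[-1]
print(f"total population at end: {sum(path[-1]):.6f}")
```

Because the four derivatives sum to zero, the total stays at its initial value up to floating-point rounding, which is a convenient sanity check on any implementation of the model.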

The number of new infections between times \(t_1\) and \(t_2\) is defined by $$\text{cases}: C(t_1,t_2) = \int_{t_1}^{t_2} B(t) \, \frac{S(t) I(t)}{N(t)}\, dt$$ where \( B(t)\) is the forcing function that reflects the actual seasonally changing circumstances. Let \(\rho\) be the probability that a case is reported and \( \psi \) the reporting overdispersion. Reports over the interval \((t_1, t_2)\) are modeled as normally distributed, with mean and variance chosen to resemble an overdispersed binomial: $$\text{reports} \sim \text{Normal}(\rho \, C(t_1, t_2), \rho \, (1- \rho)C(t_1, t_2) + ( \psi \rho\, C(t_1, t_2))^2 )$$
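As a minimal sketch of this measurement model, the block below computes the mean and variance of the reports for a given number of true cases and evaluates the corresponding Normal log-density. The function names and the numbers in the example are hypothetical, chosen only for illustration.

```python
import math


def report_mean_var(cases, rho, psi):
    """Mean and variance of reports given the true case count C(t1, t2)."""
    mean = rho * cases
    var = rho * (1.0 - rho) * cases + (psi * rho * cases) ** 2
    return mean, var


def report_loglik(reports, cases, rho, psi):
    """Log-density of the Normal measurement model at the observed report count."""
    mean, var = report_mean_var(cases, rho, psi)
    return -0.5 * math.log(2.0 * math.pi * var) - (reports - mean) ** 2 / (2.0 * var)


# Hypothetical numbers: 200 true cases, 60% reporting, no overdispersion.
mean, var = report_mean_var(200.0, rho=0.6, psi=0.0)
print(mean, var)
```

With \(\psi = 0\) the variance reduces to \(\rho(1-\rho)C\), the variance of a binomial with \(C\) trials and success probability \(\rho\); a positive \(\psi\) adds a term that grows with the square of the expected reports.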

Partially observed Markov process models consist of an unobserved Markov state process, connected to the data via a model of the observation (measurement) process. The process model is determined by the density \(P(x_n|x_{n-1})\) and the measurement process is determined by the density \(P(y_n|x_n)\).
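For trajectory matching, where the process is assumed to carry no noise, these two densities combine in a particularly simple way. As a sketch under that assumption, write \(x_n(\theta)\) for the state of the deterministic trajectory at the \(n\)-th observation time given parameters \(\theta\), and \(K\) for the number of observations (both symbols are introduced here for illustration). The process density then places all its mass on \(x_n(\theta)\), and the likelihood is a product of measurement densities alone: $$\mathcal{L}(\theta) = \prod_{n=1}^{K} P(y_n|x_n(\theta)), \qquad \log \mathcal{L}(\theta) = \sum_{n=1}^{K} \log P(y_n|x_n(\theta))$$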

Given the process model, the measurement model, and an initial point, trajectory matching optimizes the model parameters by maximum likelihood estimation and so fits the model to the data. The result can be illustrated as a simulation.
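The whole procedure can be sketched end to end in a few lines. The block below is not the pomp implementation; it is a simplified illustration that assumes a closed SEIR model with a constant transmission rate `beta` in place of the seasonal forcing \(B(t)\), fixed and hypothetical values for every other parameter, simple Euler integration, and a plain grid search in place of a numerical optimizer. It generates noise-free synthetic reports from a known `beta`, then recovers `beta` by minimizing the negative log-likelihood of the Normal measurement model given earlier.

```python
import math


def interval_cases(beta, sigma=1.0, gamma=0.5, n=10000.0, i0=10.0,
                   intervals=20, substeps=50):
    """Euler-integrate a closed SEIR model; return new infections per unit interval."""
    s, e, i = n - i0, 0.0, i0
    dt = 1.0 / substeps
    cases = []
    for _ in range(intervals):
        c = 0.0
        for _ in range(substeps):
            infections = beta * s * i / n * dt   # S -> E during this substep
            progressions = sigma * e * dt        # E -> I during this substep
            recoveries = gamma * i * dt          # I -> R during this substep
            s -= infections
            e += infections - progressions
            i += progressions - recoveries
            c += infections                      # accumulate C(t1, t2)
        cases.append(c)
    return cases


def neg_loglik(beta, data, rho=0.6, psi=0.1):
    """Negative log-likelihood of the reports along the deterministic trajectory."""
    total = 0.0
    for y, c in zip(data, interval_cases(beta)):
        mean = rho * c
        var = rho * (1.0 - rho) * c + (psi * rho * c) ** 2
        total += 0.5 * math.log(2.0 * math.pi * var) + (y - mean) ** 2 / (2.0 * var)
    return total


# Synthetic, noise-free reports generated from a known (hypothetical) beta.
true_beta = 1.5
data = [0.6 * c for c in interval_cases(true_beta)]

# Grid search over candidate values of beta; a real fit would use an optimizer.
grid = [0.5 + 0.01 * k for k in range(251)]
beta_hat = min(grid, key=lambda b: neg_loglik(b, data))
print(f"estimated beta: {beta_hat:.2f}")
```

The estimate should land close to the generating value, though not necessarily exactly on it, because the variance of the measurement model also depends on `beta`. A grid search is used here only because it is easy to verify for a single parameter; with several parameters a general-purpose optimizer is the usual choice.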