Trends in Covid-19 Spread
The Johns Hopkins University (JHU) collects data on how the COVID-19 virus spreads around the world and makes it available on GitHub. In this analysis, we take the data and extract the numbers that describe how the growth in casualties develops. Looking at the trends in these numbers, one may deduce how fast or how slowly the virus is spreading and also compare how the situation evolves in different countries.
Feel free to use the Python implementation that produces the presented plots; it can be modified to show the results for any country in the JHU database.
This post is based on joint efforts and discussions of the CSC group at the MPI Magdeburg.
The numbers in the presented plots are updated automatically [1].
Why This Analysis
If one looks at the actual numbers generated by exponential growth, the overall growth is so strong that one barely sees changes in the dynamics. If one looks at the numbers in log scale, the data becomes easier to read. In log scale, exponential growth looks like a straight line, and the slope of the line indicates how fast the exponential growth is.
To see whether the exponential growth changes its dynamics, one may check whether the line in the logarithmic plot changes its slope (hopefully it gets less steep). That’s why we plot the slope of the logarithmic line that connects the numbers of two consecutive days. Since the data is wiggly, we also plot averages of the slopes over 2 or 5 days to better spot the trends.
This post was inspired by an illustration that was discussed on Twitter [2] in recent days: it shows how, seemingly, the exponential growth in casualties decreased some days after lockdowns were implemented. In this analysis, we try to automatically detect such changes in the growth for a number of countries and present them in an accessible form that allows easy comparisons.
We examine the (logarithmic) slope of the curves of the accumulated casualties per day. That is, for day d and the day d-1 before it, we plot log2(x[d]) - log2(x[d-1]), where
- x holds the number of accumulated deaths for every day,
- log2 is a function that computes the logarithm of a number with respect to the base 2 [3].
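The quantity defined above, together with the averaging over a few days mentioned earlier, can be sketched in a few lines of Python. This is a minimal illustration, not the implementation behind the plots: the function names `log2_slopes` and `trailing_average`, the handling of zero counts, and the choice of a trailing (rather than centered) average are assumptions made here.

```python
import math


def log2_slopes(x):
    """Return log2(x[d]) - log2(x[d-1]) for d = 1, ..., len(x) - 1.

    Days where either count is zero are reported as None because the
    logarithm is undefined there (a choice made for this sketch).
    """
    slopes = []
    for prev, curr in zip(x, x[1:]):
        if prev > 0 and curr > 0:
            slopes.append(math.log2(curr) - math.log2(prev))
        else:
            slopes.append(None)
    return slopes


def trailing_average(values, window):
    """Average each value with up to window - 1 preceding values, skipping None."""
    averaged = []
    for i in range(len(values)):
        chunk = [v for v in values[max(0, i - window + 1): i + 1] if v is not None]
        averaged.append(sum(chunk) / len(chunk) if chunk else None)
    return averaged


# Made-up accumulated counts: two doublings, then slower growth.
deaths = [10, 20, 40, 60, 90]
slopes = log2_slopes(deaths)
smoothed = trailing_average(slopes, 2)
```

With these made-up counts, the first two slopes are close to 1 (a doubling per day) and the later ones drop to log2(1.5), which the two-day average smooths out.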
To make sense of the numbers, we start with some example scenarios. In the plot below, we have plotted the values for some fictitious growth scenarios.
- Exponential growth: on every day d, the number of additional casualties is 1.1 to the power of d (think of a daily increase by 10%). This scenario leads to a constant value in the plots of around log2(1.1), which is roughly 0.14.
- Constant growth: every day another 10 casualties are added. This is like the number of daily deaths due to traffic accidents in Germany [4] in 2010.
- Growth that decreases exponentially: every day the number of additional casualties gets smaller exponentially.
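The three fictitious scenarios can be reproduced with a short sketch. The 60-day horizon and the decay factor 0.9 in the third scenario are assumptions chosen for illustration; the post does not state the values used for its plot.

```python
import math

days = range(1, 61)  # assumed horizon of 60 days


def accumulate(daily):
    """Turn daily additional casualties into accumulated casualties."""
    total, out = 0.0, []
    for value in daily:
        total += value
        out.append(total)
    return out


def slopes(x):
    """Difference of log2 values between consecutive days."""
    return [math.log2(b) - math.log2(a) for a, b in zip(x, x[1:])]


# Scenario 1: daily additions grow like 1.1 to the power of d.
exp_slopes = slopes(accumulate(1.1 ** d for d in days))
# Scenario 2: 10 additional casualties every day.
const_slopes = slopes(accumulate(10 for _ in days))
# Scenario 3: daily additions shrink by a factor 0.9 per day (assumed factor).
decay_slopes = slopes(accumulate(10 * 0.9 ** d for d in days))
```

In this sketch the first curve settles near log2(1.1), the second keeps decreasing slowly because a fixed daily addition matters less and less relative to the total, and the third drops towards zero.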
General Explanations of the Numbers
- A constant value means that the number of deaths grows exponentially.
- If this constant is 1, this means that the number of casualties doubles every day.
- A value of about 0.5 means a daily increase of about 40%.
- A decreasing curve indicates that the exponential growth of seriously infected people is stalled or reversed.
- If the value approaches 0, this indicates that the virus is contained.
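To translate a plotted value into more familiar terms, one can convert it into a daily percentage increase or into a doubling time. The helper names below are hypothetical; the formulas follow directly from the definition of the slope as a difference of base-2 logarithms.

```python
import math


def daily_growth_percent(slope):
    """Daily increase in percent that corresponds to a log2 slope."""
    return (2.0 ** slope - 1.0) * 100.0


def doubling_time_days(slope):
    """Days needed to double at a constant log2 slope (infinite if slope <= 0)."""
    return 1.0 / slope if slope > 0 else math.inf


growth_at_one = daily_growth_percent(1.0)    # doubling every day
growth_at_half = daily_growth_percent(0.5)   # roughly 40% per day
days_at_005 = doubling_time_days(0.05)       # about 20 days to double
```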
What It Would Look Like and What It Should Look Like
To have a comparison of what an uncontrolled spread and what a well controlled spread would look like, we ran two simulations in a COVID simulator [5].
- A scenario for Germany in which no interventions are taken.
- A scenario for which we defined interventions such that the number of hospitalized patients that require intensive care (ICU) was always below 45000.
In the uncontrolled case, exponential growth is detected with a slope of about 0.18. After some time, when most people have been infected, the growth decreases down to zero.
In the controlled scenario, the rate is brought down in the initial phase. Then, exponential growth happens at a lower rate (in the plot the value is about 0.07) though for a longer time before it fades out.
To relate the slopes to the actual cases, in a second plot, we display the corresponding numbers of people that will need intensive care in a hospital. In our model, we assumed that 1.15% of the infected people will have to be treated in an intensive care unit.
The Actual Numbers
As of the date indicated in the plots, the JHU data delivers the following numbers. We plot the logarithmic slopes for several countries for the last 80 days. See the PDF file for more countries and a better resolution of the plots.
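For readers who want to extract a country's series themselves, here is a minimal sketch using only the standard library. It assumes the layout of the JHU global deaths time series (four leading columns `Province/State`, `Country/Region`, `Lat`, `Long`, followed by one column per date) and sums all rows of a country, since some countries are split into several provinces. The function name `deaths_by_country` and the sample numbers are made up for illustration; this is not the code behind the plots.

```python
import csv
import io


def deaths_by_country(csv_text, country):
    """Sum the accumulated deaths of all rows that belong to one country."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    dates = header[4:]              # the first four columns are metadata
    totals = [0] * len(dates)
    for row in reader:
        if row[1] == country:
            for i, value in enumerate(row[4:]):
                totals[i] += int(float(value))
    return dates, totals


# Tiny made-up sample in the assumed layout (not real data).
sample = (
    "Province/State,Country/Region,Lat,Long,1/22/20,1/23/20,1/24/20\n"
    ",Germany,51.0,9.0,0,1,3\n"
    "Reunion,France,-21.1,55.5,0,0,1\n"
    ",France,46.2,2.2,1,2,4\n"
)
dates, france = deaths_by_country(sample, "France")
```

The resulting list of accumulated counts, trimmed to the last 80 days, is the kind of input from which the logarithmic slopes are computed.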
Some Interpretation of the JHU Data
As of today, one may say that:
- In all European countries, the spread of the virus has been slowed down to rates well below 0.05.
- The curves all look similar, yet the overall numbers differ.
- High rates in the phase after 300 deaths are an indicator of high casualty numbers; see Italy 🇮🇹, Spain 🇪🇸, France 🇫🇷, and the US.
- In Germany 🇩🇪, in this phase, the rates were lower than in the countries mentioned above.
Other Things that can be Seen
- The curves seem to have similar phases: a phase of exponential growth followed by a decrease.
- The curves differ in the length of the phases (France seems to have a long phase of exponential growth compared to Spain).
- The starting points are different, which hints at when the outbreak became visible in the different countries.
The last 30 days
Since we look at the relative growth, so-called saturation effects will make changes less visible. This happens in particular with a high number of casualties but a low number of active cases. In the long run, exponential growth will still be visible, but the short-term dynamics are also of interest, for example, to spot the start of a second wave.
That’s why the following plots consider the daily numbers for the last 40 days (leaving aside all cases that happened before) and show the daily increase of the number of casualties in percent for the last 30 days.
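One possible reading of this procedure is sketched below: the accumulated counts are re-based so that only casualties of the last 40 days count, and the day-to-day increase of the re-based totals is reported in percent for the last 30 days. The function name, its parameters, and the exact re-basing convention are assumptions of this sketch and may differ from the code behind the plots.

```python
def recent_daily_increase_percent(x, base_days=40, shown_days=30):
    """Percent increase per day, counting only casualties of the last base_days."""
    if shown_days >= base_days or len(x) <= base_days:
        raise ValueError("need shown_days < base_days < len(x)")
    baseline = x[-base_days - 1]                 # total just before the window
    rebased = [v - baseline for v in x[-base_days:]]
    recent = rebased[-shown_days - 1:]           # one extra day for the first difference
    increases = []
    for prev, curr in zip(recent, recent[1:]):
        # Undefined while the re-based total is still zero.
        increases.append((curr - prev) / prev * 100.0 if prev > 0 else None)
    return increases


# Made-up series: 5 additional casualties per day on top of 100 earlier ones.
series = [100 + 5 * d for d in range(50)]
increase = recent_daily_increase_percent(series)
```

For this made-up series, the 100 earlier casualties no longer dampen the percentages, which is the point of leaving aside everything before the 40-day window.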
Notes and Acknowledgements
On the data
Certainly, the casualties lag [6] behind the actual spreading of the virus by a number of days. However, the number of casualties is arguably a more reliable data point than the number of infected.
This analysis is purely based on empirical trends. No statistical data analysis tools have been applied to (pre-)process the data, like data denoising (except that some outliers might not be shown because of the plot margins).
The initial work, namely making the data easily available in Python as well as the code that produces the title picture, was done by Petar Mlinarić.
[1] Updates once a day in the morning.
[3] The base is not too important here. If one takes 2, then a value of 1 means a doubling. A different base would only scale the plots but not change the qualitative outcomes.