Since the COVID-19 pandemic began, nations and companies around the globe have been racing on several fronts: distributing information to prevent and treat the illness, containing its spread, minimizing mortality rates, developing vaccines, and distributing them. Much like the Space Race of the 1960s, the COVRace (the race against COVID-19) is an event that combines advanced science with geopolitical factors. However, the COVRace differs from the Space Race in one critical respect: there are lives at stake.
Another distinction between the COVRace and the Space Race is the amount of data and the analysis capabilities available to the scientific community. By applying common Data Science techniques, publicly available raw data can be transformed into relevant insights that could ultimately save lives. This blog post applies Data Science techniques to two COVID-19 datasets obtained from Kaggle.com (listed below). These datasets will be used to provide a high-level answer to the question: who is winning the race?
Dataset A provides time-stamped data on daily vaccinations around the world. It covers 90 countries and, in addition to providing information on new vaccinations, it lists which vaccines are being used in a given country.
Dataset B provides time-stamped records with information on how COVID-19 is spreading around the world, and how deadly it has been since its beginnings.
Please note that these datasets are constantly being updated. For this initiative, a snapshot from February 16, 2021 was used for dataset A, and a snapshot from February 15, 2021 for dataset B.
Part I: Which vaccine brand is winning the race?
At the time of this write-up, a total of 8 vaccine brands were circulating around the world. I was interested in knowing the total number of vaccines being produced and distributed by each brand, and in ranking the brands against each other. However, dataset A only lists ALL the vaccine brands being used in a given country. In other words, the number of administered vaccinations is not broken down by brand. To stack the vaccine brands against each other, I decided to look at the global reach of each brand instead. The global reach metric is defined as the number of countries using a given vaccine brand.
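The global reach metric can be sketched in a few lines of plain Python. This is a minimal illustration, not the code used for the analysis: it assumes each record pairs a country with a comma-separated string of brand names, and the function name `global_reach` and the sample rows are made up for the example.

```python
from collections import defaultdict

def global_reach(rows):
    """Count the distinct countries that report using each vaccine brand.

    `rows` is an iterable of (country, vaccines) pairs, where `vaccines`
    is a comma-separated string of brand names (an assumed format).
    """
    countries_by_brand = defaultdict(set)
    for country, vaccines in rows:
        for brand in vaccines.split(","):
            brand = brand.strip()
            if brand:
                # A set ensures repeated daily records count a country once.
                countries_by_brand[brand].add(country)
    # Rank brands by the number of countries reached, largest first.
    return sorted(
        ((brand, len(countries)) for brand, countries in countries_by_brand.items()),
        key=lambda item: (-item[1], item[0]),
    )

# Hypothetical rows for illustration only; not taken from the dataset.
sample_rows = [
    ("Country A", "Brand X, Brand Y"),
    ("Country A", "Brand X, Brand Y"),
    ("Country B", "Brand X"),
    ("Country C", "Brand Y, Brand Z"),
    ("Country D", "Brand X"),
]
reach = global_reach(sample_rows)
```

Collecting countries in a set per brand is what makes the metric a count of countries rather than a count of records.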
The analysis showed that Pfizer is the clear leader in the race to reach countries around the globe, with a total of 65 countries, or 72% of all countries reporting vaccinations in the dataset. The Oxford/AstraZeneca vaccine follows with a reach of 45 countries, or 50% of the countries in the dataset. These numbers could probably be explained by the fact that Pfizer came out with a vaccine first. Another factor in play may be that Pfizer (188.8B market cap) is roughly twice the size of AstraZeneca (91.9B market cap). In general, having an edge in resources for research and manufacturing appears to go hand in hand with leadership in the race to save lives through vaccine products.
Part II: Which country is winning the vaccination race?
A simple but insightful measure one can obtain from dataset A is the daily vaccination rate, or vaccination speed. To compute it, the number of new vaccinations in a given country is measured over a time frame. My analysis focuses on a subset of industrialized countries. This focus is a mitigation strategy for potentially incomplete or low-reliability data from non-industrialized countries. It also allowed me to rank countries with comparable technological resources on their ability to drive vaccinations.
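As a rough sketch of the vaccination speed calculation, the snippet below averages the daily counts per country over a chosen window. It assumes the rate is the mean of the reported daily counts and that days without a reported value are skipped; the function name `average_daily_rate` and the records are hypothetical.

```python
from collections import defaultdict
from datetime import date

def average_daily_rate(records, start, end):
    """Average the daily counts per country over [start, end], inclusive.

    `records` is an iterable of (country, day, daily_count) tuples.
    Days with a missing count (None) are skipped, not treated as zero.
    """
    totals = defaultdict(float)
    days = defaultdict(int)
    for country, day, count in records:
        if count is None or not (start <= day <= end):
            continue
        totals[country] += count
        days[country] += 1
    return {country: totals[country] / days[country] for country in totals}

# Hypothetical records for illustration only.
records = [
    ("Country A", date(2021, 2, 1), 100),
    ("Country A", date(2021, 2, 2), 200),
    ("Country A", date(2021, 2, 3), None),   # missing report, skipped
    ("Country A", date(2021, 2, 4), 300),
    ("Country B", date(2021, 1, 31), 999),   # outside the window, ignored
    ("Country B", date(2021, 2, 1), 50),
    ("Country B", date(2021, 2, 2), 70),
]
speed = average_daily_rate(records, date(2021, 2, 1), date(2021, 2, 4))
```

Skipping missing days instead of counting them as zero avoids penalizing countries that report irregularly, which is one possible choice among several.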
In terms of volume, the United States is clearly leading the race. According to the analysis, the United States is vaccinating 0.9 million people a day, which is more than 10 vaccinations per second. However, to provide a fair comparison, the vaccination rates are normalized by population. This metric gives a more holistic view of the impact of the vaccination rate on the wellbeing of a country.
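The two conversions mentioned here, vaccinations per second and vaccinations as a share of the population, amount to simple arithmetic. The sketch below uses the rounded figure of 0.9 million a day quoted above, which works out to about 10.4 per second; the country names and populations in the second half are invented to show how normalizing can change the ranking.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def per_second(daily_count):
    """Convert a daily count into an average count per second."""
    return daily_count / SECONDS_PER_DAY

def percent_of_population(daily_count, population):
    """Express a daily count as a percentage of the population."""
    return 100.0 * daily_count / population

# 0.9 million vaccinations a day, the rounded figure quoted in the text.
us_per_second = per_second(900_000)

# Hypothetical (daily vaccinations, population) pairs, for illustration only.
sample = {
    "Country A": (900_000, 300_000_000),
    "Country B": (100_000, 10_000_000),
}
per_capita = {
    country: percent_of_population(daily, population)
    for country, (daily, population) in sample.items()
}
# Rank by share of the population vaccinated each day, highest first.
ranking = sorted(per_capita, key=per_capita.get, reverse=True)
```

In this made-up example Country A leads on volume, but Country B leads once the rates are normalized by population.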
This graphic shows that Israel is capable of vaccinating 1.3E-4 percent of its population each day. This is over three times faster than the second leading industrialized country. It is a very impressive accomplishment that could probably be explained by the diligence of the nation’s leadership.
Part III: Which country is losing the Infection Rate race?
For this part, dataset B is used. By analyzing the number of new reported infections over a time frame, an average infection rate is obtained. The data derived for Part III will later be reused in the combined analysis across datasets A and B, which is discussed in Part IV. In view of this, the data for Part III is extracted over a time frame that fully overlaps with the records in dataset A.
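One way to pick a time frame that overlaps with dataset A is to intersect the date ranges of the two datasets and average the new cases inside that window. The sketch below illustrates the idea under that assumption; the helper names `overlap_window` and `average_new_cases`, and all dates and counts, are hypothetical.

```python
from datetime import date

def overlap_window(dates_a, dates_b):
    """Return the (start, end) date range covered by both datasets."""
    start = max(min(dates_a), min(dates_b))
    end = min(max(dates_a), max(dates_b))
    if start > end:
        raise ValueError("the two datasets do not overlap in time")
    return start, end

def average_new_cases(daily_cases, start, end):
    """Mean of the daily new-case counts that fall inside [start, end]."""
    in_window = [n for day, n in daily_cases.items() if start <= day <= end]
    if not in_window:
        raise ValueError("no records inside the requested window")
    return sum(in_window) / len(in_window)

# Hypothetical dates and counts for illustration only.
vaccination_dates = [date(2021, 1, 3), date(2021, 1, 4), date(2021, 1, 5)]
cases_country_a = {
    date(2021, 1, 1): 500,  # before the vaccination records start, ignored
    date(2021, 1, 3): 40,
    date(2021, 1, 4): 50,
    date(2021, 1, 5): 60,
    date(2021, 1, 6): 900,  # after the vaccination records end, ignored
}
start, end = overlap_window(vaccination_dates, list(cases_country_a))
infection_rate = average_new_cases(cases_country_a, start, end)
```

Restricting both rates to the same window keeps the later comparison between infections and vaccinations on an equal footing.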
According to my analysis, China leads the pack in battling the spread of COVID-19, with an average of 49 new cases per day. However, the infection data points should be interpreted with the following caveat: not all countries have the same COVID-19 testing requirements, nor the same access to COVID-19 tests.
To get a clearer picture of the impact of the infection rate on a given country, the values are normalized by the population of each country.
According to this analysis, Israel is losing the race against the spread of COVID-19, with 6.5E-2 percent of its population (roughly 6,000 people, for a population of about 9 million) being infected each day.
Part IV: Races Compared
To gauge the overall performance of a nation in the COVID-19 race, both infection rates and vaccination rates are compared. Perhaps a more relevant and impactful analysis would look into whether a higher vaccination rate correlates with a lower infection rate. Such a correlation would, in essence, provide evidence of the effectiveness of the vaccination campaign.
The graphic shows that despite Israel’s lead in vaccination rates, its infection rate is also exceptionally high compared with other countries. This does not support the hypothesis that a higher vaccination rate correlates with a lower infection rate. From this analysis we can further conclude that the most effective campaign against the pandemic could be attributed to Germany, which ranks within the top 5 on infection rates and within the top 4 on vaccination rates.
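The relationship between the two rates can also be summarized with a single number. The sketch below computes Pearson's correlation coefficient by hand; this is a swapped-in technique for illustration, not necessarily what was done for this post, and the per-capita rates are invented. A value near -1 would indicate that higher vaccination rates go with lower infection rates, while a value near 0 would indicate no linear relationship.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / sqrt(var_x * var_y)

# Hypothetical per-capita rates, one entry per country, for illustration only.
vaccination_rates = [0.1, 0.2, 0.3, 0.4]
infection_rates = [0.04, 0.03, 0.02, 0.01]
r = pearson_r(vaccination_rates, infection_rates)
```

With only a handful of countries, and vaccination effects that lag behind the doses given, any such coefficient should be read as descriptive rather than as proof of cause and effect.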
To shed more light on the overall race performance, the following scatter plot uses infection rates and vaccination rates as coordinates on an XY plane. As configured, the further a country is from the origin (on both X and Y), the higher its ranking. For this plot, the outliers China and India (with infection rates too small to show) had to be removed.
In the scatter plot, the x-axis is built by dividing all infection rates by the highest infection rate (Israel). The x-axis has also been inverted to illustrate that the smaller the infection rate, the better. In this configuration, the closer a given country is to the upper right of the plot, the better it is performing on both fronts of the battle against the pandemic. The y-axis is obtained by dividing all vaccination rates by the maximum vaccination rate per capita (also Israel), and is also scaled with a symlog function. This gives more importance to the vaccination rate and helps create more separation between countries with similar vaccination rates. These manipulations may throw away the numerical significance of certain data points. However, the goal of the plot is to rank the countries, not to measure how much better one is than another.
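The normalization behind both axes can be sketched as follows. The snippet only derives the coordinates, dividing each rate by the largest one so that the leading country sits at 1.0; the function name `normalize_by_max` and all rates are hypothetical.

```python
def normalize_by_max(rates):
    """Scale every rate into [0, 1] by dividing by the largest rate."""
    peak = max(rates.values())
    if peak <= 0:
        raise ValueError("the largest rate must be positive")
    return {country: rate / peak for country, rate in rates.items()}

# Hypothetical per-capita rates for illustration only.
infection = {"Country A": 0.06, "Country B": 0.03, "Country C": 0.015}
vaccination = {"Country A": 1.2, "Country B": 0.3, "Country C": 0.6}

x = normalize_by_max(infection)    # 1.0 marks the highest infection rate
y = normalize_by_max(vaccination)  # 1.0 marks the highest vaccination rate

# Keep only countries present in both datasets.
coords = {country: (x[country], y[country]) for country in x if country in y}
```

If matplotlib is used for the plot, the axis inversion and the symlog scaling described above would correspond to calling `invert_xaxis()` and `set_yscale("symlog")` on the axes object; the plotting step is left out here to keep the sketch dependency-free.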
In the graphic it is possible to observe that Germany is one of the countries furthest from the origin on both the X and the Y axes. This further confirms the earlier observation that Germany could be considered the top contender.
In this post, datasets on COVID-19 infections and vaccinations were analyzed with the end goal of ranking companies and countries on their response to the pandemic, and also of determining whether there was any correlation between vaccination and infection rates over the same period of time.
· We looked into the reach of the vaccine brands at a global scale and identified Pfizer as the clear winner in this category.
· A vaccination rate was computed and the analysis showed that Israel is leading the pack (with respect to population) by a great margin.
· An infection rate was computed and the analysis showed that China was leading the pack in preventing the spread of the virus. The analysis also showed Israel as the country with the highest infection rate per capita.
Finally, combined analysis of both datasets showed that there wasn’t any apparent correlation between a high vaccination rate and a decrease in infection rates. Additionally, it became evident that Germany ranks relatively high on both vaccination (5th) and infection prevention (4th). When it comes to preventing the loss of lives, Germany is leading the pack.