Coronavirus Data Analysis Using Mathematica
Data on the current coronavirus pandemic are publicly available from Johns Hopkins University (Link) via GitHub. In the following sections I describe a way to use and analyze these data with Wolfram Mathematica. The corresponding notebook and more code are available in the Mathematica section of my website (Link).
First of all, the data need to be read in; this is done with the Mathematica function Import.
The data set has the following general structure: the first column holds the province or state (e.g. Hubei), and the second column holds the country or region (e.g. China). The next two columns are the geo-coordinates (latitude and longitude) of the location. From column 5 on, each column holds the data for one day. In the next step we form a dataset from the data.
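As a minimal sketch, the import step could look like the following. The base URL and the file name are assumptions about the layout of the Johns Hopkins GitHub repository, and readData is a hypothetical helper name, not necessarily what the notebook uses.

```mathematica
(* Assumption: folder of the time series files in the Johns Hopkins GitHub
   repository; the path may differ from the one used in the notebook *)
baseURL = "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/";

(* Hypothetical helper: read one CSV file (URL or local path) as a list of rows *)
readData[source_String] := Import[source, "CSV"]

(* Example call, needs network access; the file name is an assumption as well *)
(* raw = readData[baseURL <> "time_series_covid19_confirmed_global.csv"]; *)
```

Import with the "CSV" format returns a list of rows, where each row is itself a list of fields, so the header row is simply the first element.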
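To make the column layout concrete, here is a small stand-in for the imported data. The numbers are made up for illustration and the header names are an assumption about the CSV file; only the column positions follow the description above.

```mathematica
(* Illustrative rows only: the numbers are made up, not taken from the real data *)
raw = {
   {"Province/State", "Country/Region", "Lat", "Long", "1/22/20", "1/23/20"},
   {"Hubei", "China", 31.0, 112.3, 10, 20},
   {"", "Italy", 43.0, 12.0, 0, 5}};

header = First[raw];   (* column names *)
rows = Rest[raw];      (* one row per province or country *)

rows[[All, 1 ;; 2]]    (* province/state and country/region *)
rows[[All, 3 ;; 4]]    (* geo-coordinates *)
header[[5 ;;]]         (* date labels, starting at column 5 *)
rows[[All, 5 ;;]]      (* counts per day *)
```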
Making a Dataset (Version 1)
Mathematica provides powerful data management via "Dataset". Here we form a dataset from the imported data. The (column) headers are the first element of the imported data, so we extract them and convert the dates (from column 5 on) with a helper function into a format that is easier to work with later.
Now we build the headers:
Finally, we form the dataset, combining columns one and two into a single column named "location". Taken together, these parts give a function that builds a dataset directly.
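A sketch of the date helper and the header step might look like this. It assumes that the date labels come as month/day/two-digit year (e.g. "1/22/20") and converts them to ISO style strings; toISODate is a hypothetical name and the target format is my choice, not necessarily the one used in the notebook.

```mathematica
(* Hypothetical helper: "1/22/20" (month/day/two-digit year) -> "2020-01-22" *)
toISODate[s_String] := Module[{m, d, y},
  {m, d, y} = FromDigits /@ StringSplit[s, "/"];
  StringRiffle[{IntegerString[2000 + y], IntegerString[m, 10, 2],
    IntegerString[d, 10, 2]}, "-"]]

(* Header row as it might come out of Import (shortened) *)
header = {"Province/State", "Country/Region", "Lat", "Long", "1/22/20", "3/1/20"};

(* Keep the first four names, convert the date labels from column 5 on *)
newHeader = Join[header[[1 ;; 4]], toISODate /@ header[[5 ;;]]]
```

Zero-padded ISO dates sort correctly as plain strings, which is one reason to prefer them as column keys.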
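Putting the pieces together, a dataset constructor could be sketched as follows. The names toISODate and makeDataset are hypothetical, the date format (month/day/two-digit year) is an assumption, and the way the two columns are merged into "location" (country first, then province if present) is my assumption too. The sample rows use made-up numbers.

```mathematica
(* Hypothetical date helper: "1/22/20" -> "2020-01-22" *)
toISODate[s_String] := Module[{m, d, y},
  {m, d, y} = FromDigits /@ StringSplit[s, "/"];
  StringRiffle[{IntegerString[2000 + y], IntegerString[m, 10, 2],
    IntegerString[d, 10, 2]}, "-"]]

(* Hypothetical constructor: raw is the imported list of rows, header row first *)
makeDataset[raw_List] := Module[{dates, toRow},
  dates = toISODate /@ raw[[1, 5 ;;]];
  (* Assumption: "location" is the country, followed by the province if there is one *)
  toRow[r_List] := Join[
    <|"location" -> If[r[[1]] === "", r[[2]], r[[2]] <> ", " <> r[[1]]],
      "lat" -> r[[3]], "long" -> r[[4]]|>,
    AssociationThread[dates, r[[5 ;;]]]];
  Dataset[toRow /@ Rest[raw]]]

(* Made-up sample rows for illustration *)
raw = {
   {"Province/State", "Country/Region", "Lat", "Long", "1/22/20", "1/23/20"},
   {"Hubei", "China", 31.0, 112.3, 10, 20},
   {"", "Italy", 43.0, 12.0, 0, 5}};

ds = makeDataset[raw]
```

Each row becomes an association, so columns can later be addressed by name, for example ds[All, "location"].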
Building the datasets for the confirmed, recovered and death cases is now easy: all three are constructed with a single line of code.
The last step is to write a function that plots a comparison of the spread of the virus in two countries.
Applying it is now easy:
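One way to get all three datasets from a single expression is to map the constructor over the file names. In this sketch the base URL and the three file names are assumptions about the Johns Hopkins repository, and loadAll is a hypothetical wrapper that takes the dataset constructor as an argument.

```mathematica
(* Assumptions: base URL and file names of the three time series; adjust them
   to the files actually used *)
baseURL = "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/";
files = <|
   "confirmed" -> "time_series_covid19_confirmed_global.csv",
   "recovered" -> "time_series_covid19_recovered_global.csv",
   "deaths" -> "time_series_covid19_deaths_global.csv"|>;

(* Hypothetical wrapper: import every file and apply the constructor make to it;
   the result is an association with the same keys as sources *)
loadAll[make_, sources_Association] :=
 Map[make[Import[baseURL <> #, "CSV"]] &, sources]

(* Example call, needs network access and a constructor such as makeDataset: *)
(* {confirmed, recovered, deaths} = Values[loadAll[makeDataset, files]]; *)
```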
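A comparison plot along these lines could be sketched as follows. The function names seriesFor and compareLocations are hypothetical, the dataset is a stand-in with made-up numbers, and I assume date columns named like "2020-01-22" next to "location", "lat" and "long".

```mathematica
(* Stand-in dataset with made-up numbers *)
ds = Dataset[{
    <|"location" -> "A", "lat" -> 0., "long" -> 0.,
      "2020-01-22" -> 1, "2020-01-23" -> 3, "2020-01-24" -> 9|>,
    <|"location" -> "B", "lat" -> 1., "long" -> 1.,
      "2020-01-22" -> 2, "2020-01-23" -> 4, "2020-01-24" -> 5|>}];

(* Hypothetical helper: {{{y, m, d}, count}, ...} for one location,
   or {} if the location is not in the dataset *)
seriesFor[d_Dataset, loc_String] := Module[{row},
  row = SelectFirst[Normal[d], #["location"] === loc &, <||>];
  KeyValueMap[{FromDigits /@ StringSplit[#1, "-"], #2} &,
   KeyDrop[row, {"location", "lat", "long"}]]]

(* Hypothetical plot function: both time series in one date plot *)
compareLocations[d_Dataset, loc1_String, loc2_String] :=
 DateListPlot[{seriesFor[d, loc1], seriesFor[d, loc2]},
  Joined -> True, PlotMarkers -> Automatic,
  PlotLegends -> {loc1, loc2}, FrameLabel -> {"date", "cases"}]

(* Example call *)
compareLocations[ds, "A", "B"]
```

Note that this sketch looks up a single row per location. Countries that are split into several provinces would need their rows summed first.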
With just a few lines of code one can build a user interface to compare the spread of the disease in two countries.
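Such an interface can be built with Manipulate. The following is a sketch under assumptions: compareUI, valuesFor and simplePlot are hypothetical names, the dataset is a stand-in with made-up numbers, and any function of the form plotFn[dataset, loc1, loc2] can be plugged in instead of simplePlot.

```mathematica
(* Stand-in dataset with made-up numbers *)
ds = Dataset[{
    <|"location" -> "A", "2020-01-22" -> 1, "2020-01-23" -> 3|>,
    <|"location" -> "B", "2020-01-22" -> 2, "2020-01-23" -> 4|>}];

(* Hypothetical helpers: daily values of one location and a simple line plot *)
valuesFor[d_Dataset, loc_String] :=
 Values[KeyDrop[SelectFirst[Normal[d], #["location"] === loc &, <||>], "location"]]
simplePlot[d_Dataset, a_String, b_String] :=
 ListLinePlot[{valuesFor[d, a], valuesFor[d, b]}, PlotLegends -> {a, b}]

(* Hypothetical UI builder: two popup menus choose the locations to compare *)
compareUI[d_Dataset, plotFn_] := With[{locs = Normal[d[All, "location"]]},
  Manipulate[
   plotFn[d, first, second],
   {{first, First[locs], "first location"}, locs, ControlType -> PopupMenu},
   {{second, Last[locs], "second location"}, locs, ControlType -> PopupMenu}]]

compareUI[ds, simplePlot]
```

With is used to insert the list of locations into the Manipulate expression before it is displayed, so the menus do not depend on a global variable.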
With the data loaded, one can also quite easily determine the mortality rate for each country and visualize it. Shown here are the 15 countries with the highest mortality among those with more than 100 confirmed infections (based on data from Johns Hopkins University, 2020-03-22):
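The ranking behind such a chart can be sketched as follows. I assume here that the mortality rate is taken as deaths divided by confirmed cases (in percent) on the latest day; mortalityTop is a hypothetical name and the counts are made up for illustration, not taken from the Johns Hopkins data.

```mathematica
(* Latest counts per country; made-up numbers for illustration *)
confirmed = <|"A" -> 1000, "B" -> 200, "C" -> 50, "D" -> 400|>;
deaths = <|"A" -> 50, "B" -> 2, "C" -> 5, "D" -> 40|>;

(* Hypothetical helper: percentage of deaths among confirmed cases, restricted
   to countries with more than minCases confirmed, largest n values first *)
mortalityTop[conf_Association, dead_Association, minCases_ : 100, n_ : 15] :=
 Module[{eligible, rates},
  eligible = Select[conf, # > minCases &];
  rates = Association[
    KeyValueMap[#1 -> 100 Lookup[dead, #1, 0]/#2 &, eligible]];
  TakeLargest[rates, UpTo[n]]]

top = mortalityTop[confirmed, deaths];

(* Horizontal bar chart with the country names as labels *)
BarChart[N[Values[top]], ChartLabels -> Keys[top], BarOrigin -> Left,
 AxesLabel -> {None, "deaths / confirmed (%)"}]
```

Country "C" is dropped because it has fewer than 100 confirmed cases, which keeps very small samples from dominating the ranking.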