Downloading COVID-19 Data

The Johns Hopkins CSSE (Center for Systems Science and Engineering) COVID-19 GitHub repository opens on a master folder from which subfolders can be selected. In this example, I chose the csse_covid_19_data folder, then the csse_covid_19_daily_reports_us folder, then the file 08-24-2020.csv. This opens GitHub's rendered view of the file; select Raw in the file header to obtain the URL of the comma-delimited file itself. Now we save the URL and the destination file name as the variables url and destfile, respectively, and use the download.file function in R to save the comma-delimited file to the current working directory.

url="https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_daily_reports_us/08-24-2020.csv"
destfile="Aug_24_2020_Daily.csv"
download.file(url,destfile,quiet=TRUE)

Setting quiet=TRUE suppresses the progress messages about the request, which I did not find particularly enlightening. You should open the file in your working directory in Excel or a text editor to make sure it downloaded correctly.
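The same check can also be done from R itself. The following is a minimal sketch, not part of the original workflow: check_download is a hypothetical helper name, and it only confirms that the file exists, is non-empty, and parses as a CSV.

```r
# Hypothetical helper (an assumption, not from the original text):
# a quick sanity check on a downloaded CSV file.
check_download <- function(path) {
  # Stop early if the file is missing or empty
  stopifnot(file.exists(path), file.size(path) > 0)
  # Read the file and report its dimensions
  dat <- read.csv(path)
  cat("Rows:", nrow(dat), " Columns:", ncol(dat), "\n")
  # Return the data invisibly so it can be assigned if wanted
  invisible(dat)
}

# Only runs if the file from the step above is in the working directory
if (file.exists("Aug_24_2020_Daily.csv")) {
  aug24 <- check_download("Aug_24_2020_Daily.csv")
  print(head(aug24))
}
```

If the row count or column names look wrong, opening the file in a text editor is still the quickest way to see what was actually saved.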

Create a function

Suppose we wish to download data every day. It is not difficult to edit the values of url and destfile by hand, but we can automate this somewhat using a function with a single argument and the paste function (paste0 is used as well, just to show an alternative that joins character strings with no delimiter between them).

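COVIDDaily=function(datestring){
  # datestring: report date as "MM-DD-YYYY", matching the file names in the repository
  url=paste("https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_daily_reports_us/",datestring,".csv",sep="")
  # Save to the working directory as, e.g., "08-26-2020-Daily.csv"
  destfile=paste0(datestring,"-Daily.csv")
  download.file(url,destfile,quiet=TRUE)
}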
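As a small illustration of the two pasting styles used in the function, the sketch below builds the same file name both ways. The variable names a and b are only for illustration.

```r
datestring <- "08-26-2020"

# paste with an empty separator and paste0 give the same result
a <- paste(datestring, "-Daily.csv", sep = "")
b <- paste0(datestring, "-Daily.csv")
print(identical(a, b))   # TRUE

# Without sep = "", paste inserts a space between the pieces
print(paste(datestring, "-Daily.csv"))   # "08-26-2020 -Daily.csv"
```

Forgetting sep="" in paste would put a space into the URL or file name, which is why paste0 is often the more convenient choice here.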

We can then call our function to retrieve another day’s data and read it into R:

COVIDDaily("08-26-2020")
COVID_Aug_26=read.csv("08-26-2020-Daily.csv")
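To fetch several days at once, the date strings can be generated rather than typed. The following is a rough sketch under stated assumptions: download_range is a hypothetical helper, the downloader argument is expected to be a one-argument function such as COVIDDaily above, and the daily files are assumed to keep the MM-DD-YYYY naming seen in 08-24-2020.csv.

```r
# Hypothetical helper (an assumption, not from the original text):
# call a one-argument downloader for every day in a date range.
download_range <- function(from, to, downloader) {
  # Sequence of Date objects, one per day, inclusive of both ends
  dates <- seq(as.Date(from), as.Date(to), by = "day")
  # Convert to the MM-DD-YYYY strings used in the file names
  datestrings <- format(dates, "%m-%d-%Y")
  for (d in datestrings) {
    # A failed download for one date should not stop the rest
    tryCatch(
      downloader(d),
      error = function(e) message("Skipping ", d, ": ", conditionMessage(e))
    )
  }
  invisible(datestrings)
}

# Example call, left commented out because it needs network access
# and the COVIDDaily function defined above:
# download_range("2020-08-24", "2020-08-26", COVIDDaily)
```

Passing the downloader in as an argument keeps the date handling separate from the download itself, so the loop can be tried out with a stand-in function before touching the network.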