NYCDOH Covid Dataset - Omicron Update

NYC Dept of Health’s Covid Dataset - Omicron Update

Introduction and Timeline

It has been over a year since I wrote my original post exploring the NYC Covid dataset. I have periodically rerun my notebook to see the updated data, and there have been some changes since last year. I wanted to explore the dataset again because Omicron is all over the news and because I personally tested positive, despite being vaccinated, wearing a mask in public, and being as cautious as possible. Maybe I didn't exercise enough caution, but here I am under quarantine, looking at the data again. This is a continuation of my previous post on Covid; you can check out that page here for background information on the data itself.

Data Source

The data sources are the same as before. I will explore the following data sets:

  • data-by-day, which looks at daily cases, hospitalizations and deaths across the five boroughs
  • tests, which looks at the testing that has been done across the city as well as positive test results (cases)
  • data-by-modzcta, which looks at all the data by zip code
import pandas as pd
import requests
import json

# Base URL of the NYC Health coronavirus-data repository
url = "https://github.com/nychealth/coronavirus-data/raw/master"

# URLs of the CSV files used in this post
daily_data = f'{url}/trends/data-by-day.csv'
testing_data = f'{url}/trends/tests.csv'
zipcode_data = f'{url}/totals/data-by-modzcta.csv'
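The variables above only hold URLs, so each file still has to be read into a DataFrame. Here is a minimal sketch of one way to do that with pandas. The helper name `load_trend` and the column names in the sample are my assumptions for illustration, not necessarily what the notebook uses, and the sample values are made up so the snippet runs without a network connection.

```python
import io

import pandas as pd


def load_trend(source, date_column="date_of_interest"):
    """Read one trend CSV and index it by date.

    `source` may be a URL, a local path, or a file-like object.
    `date_column` is an assumed column name; adjust it to the file.
    """
    frame = pd.read_csv(source, parse_dates=[date_column])
    return frame.set_index(date_column).sort_index()


# Tiny inline sample standing in for data-by-day.csv (values are made up).
sample_csv = io.StringIO(
    "date_of_interest,CASE_COUNT,HOSPITALIZED_COUNT,DEATH_COUNT\n"
    "03/01/2020,1,1,0\n"
    "02/29/2020,1,1,0\n"
)
sample = load_trend(sample_csv)
print(sample.index.min(), "to", sample.index.max())
```

With the real data, the same helper could be called as `load_trend(daily_data)`, provided the date column name matches the file.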

Cases, Hospitalizations, Deaths

Again, we will look at total cases, hospitalizations and deaths. This time around I did not do a breakdown by borough. I created lists from the data in order to plot it, and from those lists we can see that the date range is now 02/29/2020 to 12/14/2021 (the previous post covered 02/29/2020 to 11/19/2020). Again, I checked the date range and the peak number for each figure I was interested in (cases, hospitalizations, deaths).

# Date range
print("The date range is",dates[0],"to",dates[-1] )
# Peak number of cases
print("For cases...")
peakdates(cases,dates)
# Peak # of hospitalizations
print("For hospitalizations...")
peakdates(hospitalizations,dates)
# Peak # of deaths
print("For deaths...")
peakdates(deaths,dates)
The date range is 2020-02-29 00:00:00 to 2021-12-14 00:00:00

For cases...
The peak occurred on 2021-01-04 and had a count of 6593

For hospitalizations...
The peak occurred on 2020-03-30 and had a count of 1848

For deaths...
The peak occurred on 2020-04-07 and had a count of 599
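The `peakdates` helper is defined in the full notebook and is not shown in this post. A minimal sketch that would produce output in the same shape might look like the following. The name `peakdates_sketch`, its return value, and the sample numbers are my own assumptions for illustration.

```python
from datetime import datetime


def peakdates_sketch(values, dates):
    """Print and return the date and size of the largest value."""
    # Index of the first occurrence of the maximum value.
    peak_index = max(range(len(values)), key=lambda i: values[i])
    peak_date = dates[peak_index]
    # Show only the calendar date when given a datetime-like object.
    day = peak_date.date() if hasattr(peak_date, "date") else peak_date
    print(f"The peak occurred on {day} and had a count of {values[peak_index]}")
    return day, values[peak_index]


# Made-up sample values, not the real NYC figures.
sample_dates = [datetime(2021, 1, 3), datetime(2021, 1, 4), datetime(2021, 1, 5)]
sample_cases = [10, 25, 17]
peak_day, peak_count = peakdates_sketch(sample_cases, sample_dates)
```

Returning the pair as well as printing it makes it easier to reuse the peak in a plot annotation.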

Covid Cases

This is where we see our first major difference from a year ago. When I ran this in Nov 2020, the peak number of cases had occurred on April 6th 2020 (6353), right around the time the first Covid wave hit NYC and we were a month into lockdown.

Looking at the updated data, the new peak occurred on January 4th 2021 (6593), during the second wave. Case numbers dropped after that, but in July 2021 they began to rise again as the Delta variant started to spread in NYC. That turned out to be a small uptick, and case numbers were dropping going into November 2021. The decline was short-lived: the numbers rose again, including a surge this past week. As of December 14th, cases reached 6072, nearing the previous peak. It will be interesting to see where the numbers go next week.

We also see that when looking at the last 30 days, case numbers are trending upwards.
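The 30-day view can also be checked numerically instead of by eye. The sketch below takes the last 30 rows of a daily series and fits a straight line through them. This is a technique I am swapping in for illustration rather than something taken from the notebook, and `recent_trend` and the sample counts are hypothetical.

```python
import numpy as np
import pandas as pd


def recent_trend(series, days=30):
    """Return the last `days` rows and the fitted daily change."""
    recent = series.tail(days)
    # Slope of a least-squares line through the recent values;
    # a positive slope means the counts are rising on average.
    x = np.arange(len(recent))
    slope = np.polyfit(x, recent.to_numpy(dtype=float), 1)[0]
    return recent, slope


# Made-up daily counts that rise by 5 per day.
index = pd.date_range("2021-11-01", periods=45, freq="D")
counts = pd.Series(np.arange(45) * 5 + 100, index=index)
recent, slope = recent_trend(counts)
print(f"{len(recent)} days, average change per day: {slope:.1f}")
```

A straight-line fit is a crude summary, but its sign is enough to say whether the recent window is heading up or down.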

Hospitalizations

When looking at hospitalizations, the data shows us some good news. While there have been some small increases in the number of Covid-related hospitalizations, they have never returned to the level we saw on March 30th 2020 (1848), at the beginning of the pandemic. I believe the main reason is that we know more about the virus now than we did back then. At the start of the pandemic, if you were suspected of having Covid, the only place to really get tested or treated was the hospital. Going to the hospital also meant you were immediately placed under a 14-day quarantine, as not much was known about the virus.

In addition, I believe that, unfortunately, those who were most vulnerable were hit hard at the start of the pandemic. NYC public schools closed on March 16th 2020, bars and restaurants closed on March 17th 2020, and the Statewide Pause Program (all non-essential workers must stay home) began on March 22nd 2020. Masks only became mandatory on April 15th 2020. The individuals who were hospitalized during the peak most likely came in contact with the virus before preventative measures such as wearing masks were in place.

The peak occurred on 2020-03-30 and had a count of 1835

Over the past 30 days there was an increase in hospitalizations, with a significant drop-off in the last few days. Thankfully, although the numbers were increasing, they were nowhere near the March 2020 numbers.

Covid Deaths

As with hospitalizations, I'm happy to report that the number of deaths never returned to its previous peak.

The peak occurred on 2020-04-07 and had a count of 598

There has been a slight increase when looking at the 30 day trend.

7 Day Averages

Here is a look at the 7 day averages for Cases, Hospitalizations, and Deaths together on one graph. While we do see that cases rose during the second wave as well as during the onset of both Delta and Omicron, hospitalizations and deaths have thankfully not had the same resurgences.
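For readers who want to reproduce a 7 day average, here is a minimal sketch using a trailing rolling mean in pandas. The helper name, column names, and sample values are made up for illustration, and the notebook may compute its averages differently.

```python
import pandas as pd


def seven_day_average(frame, columns):
    """Return trailing 7-day means for the given columns."""
    # min_periods=7 leaves the first six days empty instead of
    # averaging over a partial week.
    return frame[columns].rolling(window=7, min_periods=7).mean()


# Made-up daily counts for ten consecutive days.
index = pd.date_range("2021-12-01", periods=10, freq="D")
sample = pd.DataFrame(
    {"cases": list(range(10, 110, 10)), "deaths": [1] * 10},
    index=index,
)
averages = seven_day_average(sample, ["cases", "deaths"])
print(averages.tail(3))
```

Averaging over a full week smooths out the weekday and weekend reporting pattern, which is why the averaged lines are easier to compare on a single graph than the raw daily counts.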

Testing

Testing has continued to increase. Previously, the peak number of tests conducted occurred on Nov 16th 2020 (71,626). Testing then began to decline in January 2021, as vaccines became more available and the focus shifted from testing to vaccination. That decline reversed in July 2021, right around the time the Delta variant made its appearance, and even more so now with the emergence of Omicron. The current peak occurred on Dec 6th 2021 (102,709).

The peak occurred on 2020-11-16 and had a count of 71626

The peak occurred on 2021-12-06 and had a count of 102709

Positive Test Results

I also looked at positive test results. Testing has increased significantly since the start of the pandemic, so we should look not just at the number of tests administered but also at how many yielded positive results. Initially, the highest number of positive results occurred at the onset of the pandemic, on Apr 6th 2020 (6,780 positive cases). We now see that the new peak occurred during the second wave, on Jan 4th 2021 (8,079 positive cases).

The peak occurred on 2020-04-06 and had a count of 6780

The peak occurred on 2021-01-04 and had a count of 8079

We must also look at the percentage of positive results, since it is not fair to look at just the number of positives when testing has increased so much. The peak percentage occurred at the start of the pandemic, specifically on March 28th 2020, when 71.17% of tests yielded positive results. This high percentage reflects a bias in early testing, which primarily took place in the emergency rooms of NYC hospitals that were only taking in individuals suspected of having Covid. I'm happy to report that the percentage of positive results has NOT reached those early 2020 levels. We see a few slight upticks during the second wave and the emergence of Delta, as well as what looks like an uptick now for Omicron. However, these upticks have yet to reach those 2020 levels.
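The percentage of positive results is just positive tests divided by total tests for each day. The sketch below shows that calculation on made-up numbers. The helper and column names are hypothetical, and the notebook may read a precomputed column instead.

```python
import pandas as pd


def percent_positive(positive, total):
    """Share of tests that came back positive, as a fraction."""
    # Days with zero tests give NaN rather than a division error.
    return positive.div(total.where(total > 0))


# Made-up sample rows, not the real NYC figures.
sample = pd.DataFrame(
    {"positive": [50, 735, 0], "total": [100, 10000, 0]},
    index=pd.to_datetime(["2020-03-28", "2021-12-12", "2021-12-13"]),
)
sample["share"] = percent_positive(sample["positive"], sample["total"])
peak_day = sample["share"].idxmax()
print(f"Peak share: {sample['share'].max():.2%} on {peak_day.date()}")
```

Formatting with `:.2%` also avoids long floating-point tails when the fraction is printed as a percentage.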

The peak occurred on 2020-03-28 and had a count of 0.7117

The peak occurred on 2020-03-28 and had a count of 0.7117

Also we do see that when looking at the last 30 days, positive results are trending upwards.

The peak occurred on 2021-12-12 and had a count of 0.0735
The max positive result was 7.35 %
The max positive 7day avg result was 5.029999999999999 %

Conclusion & Code

I will probably run this code again at a later date. If numbers increase in NYC, I will run the analysis sooner rather than later. You can check out the full Python code using the following methods:

  1. Github Page: Francisco’s Repository
  2. Google Colab: Open In Colab