Measles Reporting and The Dangers of Semi-Open Data
If you read local news, you might see headlines like: “Private schools putting NYC at risk for measles outbreak” in The New York Post, “Immune to Logic: Some New York City Private Schools Have Dismal Vaccination Rates” at New York Magazine and “Religious Schools Lead the Vaccine Opt-Out List” at WNYC. The articles are using a data set from the New York State Open Data Portal, providing vaccination rates for most schools in New York State.
When reporters use public data like this (or alternatively release copies of their data), it has the added advantage that readers can build off a reporter’s analysis with their own additional insights. That is one of the great advantages of Open Data. But when I dug in here, I noticed something strange: this data set had vaccination rates for most public and private schools in New York State, but it contained only private schools in NYC itself. For some reason, our City has not released school-level vaccination rates for its public schools on the open data portal. Cases like this, where we only get partial information, are a class of issues I like to call Semi-Open Data. Other examples of Semi-Open Data include the use of PDFs for data releases (e.g. NYPD, CCRB, NYC Budget), providing maps without underlying data (e.g. NYPD, DSNY, Comptroller) or providing only tiny slices of stale data (e.g. OEM).
Is Semi-Open Data a bad thing? Well, I decided to explore what it does to news reporting on the issue after catching the headlines above. To do that, I started digging into the underlying data set in detail. I created a map of private school measles vaccination rates as reported for the 2013-2014 school year, where the circles represent the percentage of students in each school that are not vaccinated. Larger circles represent lower vaccination rates. (Shout out to New York State Open Data for geocoding the data for easy mapping, something New York City has a poor record on!)
The majority of schools are marked as tiny orange circles, representing those with vaccination rates of 97% or higher. But there are areas of the city where schools with lower vaccination rates are more prevalent (as reflected by bigger circles), such as Williamsburg, Crown Heights, and Borough Park. That being said, even in those neighborhoods, the majority of schools have 97% or higher vaccination rates.
Taking a closer look, we can see that about 8.1% of schools, or 66 out of the 814 in the sample, have measles vaccination rates lower than 90%, which is considered the minimum for proper “herd immunity”:
I also pulled out the 25 private schools with the lowest measles vaccination rates in 2013-2014.
Note that enrollment numbers vary across private schools, though specific enrollment numbers are not accessible via the NYC Open Data Portal. A quick Google search of the school with the lowest vaccination rate (40%) in 2013-2014 reporting, World Harvest Deliverance Center Christian Academy, revealed that it has only 40 students.
So the data gives us a pretty good view on private schools, but what about public? How can all three articles point to private schools as the larger problem in measles vaccination without access to comparable public school data?
Well, although we can’t see data on any individual public schools, New York City does report aggregate statistics on students in each borough, and all five boroughs come in at an average measles vaccination rate between 98.5% and 99.2% according to their data for the 2013-2014 school year. The average private school reports a 97% vaccination rate, which at first glance seems lower than the public school rate. But remember, the private school rate is based on school-level statistics, so the calculation treats all schools as equal, regardless of size. This over-weights smaller schools, which might be overrepresented among the private schools with the lowest vaccination rates. Thus the comparable private school student vaccination rate could theoretically be higher than the aggregate student vaccination rate in NYC public schools.
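To make the weighting problem concrete, here is a minimal Python sketch. The enrollment figures and rates below are made up purely for illustration (the data set does not include enrollment), and the function names are my own. It only shows how an average of school-level rates can diverge from the rate measured across all students.

```python
# Hypothetical (enrollment, vaccination rate) pairs. NOT real data:
# the published data set has rates per school but no enrollment counts.
schools = [
    (40, 0.40),    # a tiny school with a very low rate
    (60, 0.85),
    (500, 0.99),
    (900, 0.995),
]


def school_level_average(schools):
    """Average of per-school rates: every school counts equally."""
    return sum(rate for _, rate in schools) / len(schools)


def student_level_rate(schools):
    """Share of all students who are vaccinated: every student counts equally."""
    total_students = sum(enrollment for enrollment, _ in schools)
    vaccinated = sum(enrollment * rate for enrollment, rate in schools)
    return vaccinated / total_students


print(f"School-level average: {school_level_average(schools):.1%}")
print(f"Student-level rate:   {student_level_rate(schools):.1%}")
```

With these invented numbers, the two small schools drag the school-level average far below the student-level rate. The real gap could be larger, smaller, or even reversed; without enrollment data there is no way to tell.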
So with Semi-Open Data, we are unable to make a fair comparison. But that did not stop the media from trying. It seems that New York Magazine just went ahead and made the comparison anyway:
For comparison, among the more than 800 private schools in the city, the overall immunization rate last year was 97 percent. In public schools, the current immunization rate is above 98 percent, according to the New York City Department of Education.
The New York Post made a different comparison, setting the private school vaccination rates against the percentage of public school students who have “exemptions”, rather than the percentage who are not vaccinated, which is about ten times higher:
Only 0.19 percent of public school students received the religious exemption this year, while 0.01 percent were granted a medical exemption.
And WNYC’s headline “Religious Schools Lead the Vaccine Opt-Out List” is definitely true “in this area” as they say in the article, but the area is the New York Metro area and the list referred to in the headline does not have NYC public schools in it.
Let’s say the private school rate really was 97%, even when measured at the student level. Well, a recent USA Today article claimed that 1 in 7 public and private schools in the country have vaccination rates below 90%. Here in New York, our private schools, the source of all these headlines, are more like 1 in 12, making them much better than the national average of all schools.
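The “1 in 12” figure comes straight from the counts quoted earlier (66 of 814 schools below 90%). As a quick check, here is the arithmetic in Python. The function name is just for illustration, and the 1-in-7 national figure is taken as reported by USA Today rather than recomputed.

```python
def share_below_threshold(schools_below, schools_total):
    """Fraction of schools whose vaccination rate falls under the threshold."""
    return schools_below / schools_total


# Counts from the NYC private school sample discussed above.
private_share = share_below_threshold(66, 814)

# National figure as cited in the USA Today article (1 in 7 schools).
national_share = 1 / 7

print(f"NYC private schools below 90%: {private_share:.1%} "
      f"(about 1 in {1 / private_share:.0f})")
print(f"National figure, all schools:  {national_share:.1%} (1 in 7)")
```

That works out to roughly 8.1% of NYC private schools against roughly 14.3% of schools nationally.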
So should we all panic and run for the hills away from our private schools? No, we are doing relatively well despite what the headlines tell us. Of course, there are clearly individual schools with room for improvement.
I don’t blame the news reporters for struggling to make a fair comparison here. And if you think that the missing open data might be inadvertent, you might be surprised to learn that the Department of Education did not fulfill multiple requests for the data from at least one reporter I spoke with, all while Trenton and Albany came through. Semi-Open Data forces citizens and reporters alike to try to draw parallels between incomparable metrics. What came out was a patchwork of different and confusing comparisons on a serious health issue in NYC. Semi-Open Data is a real problem for New York City’s citizens, and in this case the Department of Education has the power to fix it.
There is some good news in all of this. A vast majority of New Yorkers seem to trust the overwhelming scientific data supporting vaccination use. And if you read this blog, you know that’s something I can get behind.