Cause of Death: Melanin | Evaluating Death by Police Data

(This article was first published on Econometrics by Simulation, and kindly contributed to R-bloggers)

Widespread attention to the deaths of black men at the hands of police has sparked protests and public outrage in many cities. Recently, the Federal Government launched an investigation into the conduct of the entire Chicago police department. However, judging by the protests and videos popping up all over the country, the apparent problems of police abuse do not appear to be confined to one city.

As part of this movement, Black Lives Matter (BLM) protesters have drawn much attention from the media over the last couple of years as they have held large protests in cities such as Toronto, Chicago, San Francisco, and Ferguson. These protests have also drawn international attention, especially because of how they led to riots in Baltimore, Maryland.

The reception of BLM by the general American public has been mixed. Unfortunately, the violence of some protests has distracted from the message that the BLM movement has been attempting to convey. That is, that black people are more likely to be abused or killed by police than their white counterparts and that this abuse is unjustified.

But is this true? The answer is not immediately obvious, as data about people killed by police is not systematically collected by any federal agency; police departments report deaths to the FBI only voluntarily. What partially comprehensive data exists is instead collected through the efforts of activists such as those at FatalEncounters (see discussion of data issues below).

The folks at FatalEncounters say that their data should not be used to analyze racial differences because so many reports are incomplete regarding race. However, by making two simple assumptions we are able to start drawing inferences from the data (see discussion of assumption issues below):
A1. Data is missing at random without regard to race.
A2. Data is missing at random without regard to state.

Before getting any further into this, I would like to point out that the likelihood of an individual officer being involved in a killing is relatively low (9% over the last 16 years, or 0.56% per year [1]). Furthermore, without stretching the truth we can probably assume that the vast majority of such killings are justifiable to varying degrees.
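For readers who want to check the arithmetic behind footnote [1], here is a minimal sketch. The inputs are the rough figures from the footnote (including the guess of four officers per killing), and the variable names are my own. Note that it treats every involvement as a different officer, so if anything it overstates the share.

```python
# Back-of-the-envelope estimate from footnote [1]; all inputs are rough guesses.
total_killings = 20_000        # approximate killings over the whole period
years = 16                     # length of the period covered by the data
officers_per_killing = 4       # the "let's say 4 police" guess from the footnote
officers_nationally = 890_000  # approximate number of police nationally

# Share of officers involved in a killing over the period, and per year.
share_over_period = total_killings * officers_per_killing / officers_nationally
share_per_year = share_over_period / years

print(f"Over {years} years: {share_over_period:.1%}")
print(f"Per year: {share_per_year:.2%}")
```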

Thus the average person probably need not fear for their life when encountering police officers. That said, what is the case for the average person might not be the case for all segments of the population. The BLM movement would argue that young black males are justified in being much more concerned about being killed by police than other segments of the population. So let’s get into the data and see what we can find.

Q1. Are black males more likely to be killed by police than their white counterparts?

Table 1 shows that, for a 16-30 year old, the likelihood of being killed relative to one's share of the national population is roughly 7 times greater for a black person than for a white person (2.1/0.3, using the rounded table values) under assumption A1. Loosening the assumption and assigning all of the unknown-race killings to white people (an extreme assumption) gives the columns Kill2 and Kill2/Pop%. Even under this extreme assumption, the likelihood of being killed by a police officer is 2.3 times greater for a black person than for a white person (2.1/0.9).

     Race  Kills  Kill%  Pop%  Kill%/Pop%  | Kill2  Kill2/Pop%
1   black    937    27%   13%         2.1  |   27%         2.1
2 unknown   1202    35%    0%           -  |    0%           -
3   white    712    21%   63%         0.3  |   56%         0.9
4   other     79     2%    8%         0.2  |    2%         0.2
5  latino    527    15%   17%         0.9  |   15%         0.9
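The two comparisons in Table 1 can be reproduced with a few lines. This is only a sketch using the counts and population shares printed in the table; the function and variable names are my own. Working from the unrounded shares, the black-to-white ratio under A1 comes out nearer 6.4 than 7, and the ratio under the extreme assumption is about 2.4.

```python
# Counts and population shares copied from Table 1.
kills = {"black": 937, "unknown": 1202, "white": 712, "other": 79, "latino": 527}
pop_share = {"black": 0.13, "white": 0.63, "other": 0.08, "latino": 0.17}


def per_capita_index(counts):
    """Share of killings divided by share of population, for each known race."""
    total = sum(counts.values())
    return {race: (counts[race] / total) / pop_share[race] for race in pop_share}


# Assumption A1: unknown-race records stay in the total but are assigned to no
# race. The black/white ratio does not depend on that total.
index_a1 = per_capita_index(kills)

# Extreme assumption: every unknown-race record is counted as white.
reassigned = {race: n for race, n in kills.items() if race != "unknown"}
reassigned["white"] += kills["unknown"]
index_extreme = per_capita_index(reassigned)

ratio_a1 = index_a1["black"] / index_a1["white"]
ratio_extreme = index_extreme["black"] / index_extreme["white"]

print(f"A1: black {index_a1['black']:.1f}, white {index_a1['white']:.1f}, ratio {ratio_a1:.1f}")
print(f"Extreme: white {index_extreme['white']:.1f}, ratio {ratio_extreme:.1f}")
```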

Q2. How does the risk of being killed by police for black people relative to that of white people vary by state?

Figure 1: The relative likelihood of a black person being killed in states to that of a white person. Grayed out states are those in which no black people were killed by police.

From Figure 1 we can see that the relative risk of being killed by police varies significantly by state. Surprisingly, states south of the Mason-Dixon line seem to have more equal rates of being killed by police than some states north of the line, such as Illinois, Ohio, and Massachusetts, in which the likelihood of being killed by police is over 7 times higher for a black person than for a white person.

Q3. Is poverty the driving force causing black people to be at higher risk than their white counterparts?

Figure 2: The risk of being poor for black people relative to white people, plotted against the risk of being killed by police for black people relative to white people.

From Figure 2 we can see that there does appear to be a relationship between the relative risk of being poor for black people and their relative risk of being killed by a police officer (with the exception of OH and IL, which have much higher risk than would otherwise be expected). If you regress the relative risk of being killed on the relative risk of being poor and the relative risk of failing to complete high school, each additional 1 in the relative risk of being poor is associated with a relative risk of being killed by police that is 2.8 higher.

Figure 3: The relative risk of being poor mapped against the relative rates of high school completion.

From Figure 3 we can see that poverty and failure to complete high school are correlated. Some states, such as Minnesota and Wisconsin, have completion rates among blacks of less than 25% relative to their white peers, while simultaneously having the highest rates of poverty among blacks relative to whites in all of the data.

Q4. So is it wise for black people to move to southern states if they would like to avoid police violence?

Figure 4: The overall risk of anybody getting killed by police.

From Figure 4 we can see the story is not quite so simple. Louisiana, Mississippi, Alabama, and Nevada, four states which had the lowest relative risk for black people to be killed by police, have some of the highest rates of any state in terms of the likelihood of any resident being killed by police. In contrast, Illinois, Indiana, and Vermont now have some of the lower rates in the country in terms of the likelihood of being killed by a police officer.

Figure 5: Likelihood of a black person being killed by police.

From Figure 5 we can see that black people in the Northwest of the United States have the lowest possible chance of being killed by the police, since in the last 16 years there is no record of any such deaths. However, this might be comforting to few, since the population of black people in this part of the country is very small. Strangely, Vermont and Nevada both have very high rates of deaths for black people. The high rate in Vermont could be explained by the state having the second smallest black population, while the even higher rate in Nevada cannot easily be explained, as it has the 19th largest population of black people in the country.

Q5: What about white people?

Figure 6: Likelihood of a white person being killed by police.

From Figure 6 we can see that Nevada is also the highest-risk state for white people being killed by police. Interestingly, states above the Mason-Dixon line now seem to be the safest places for white people when it comes to risk from the police. Overall, we should not compare Figure 6 with Figure 5 and say “oh well, white people are killed by the police also”.

The reason is that we need to keep the scale of the legend in mind. In Figure 5 the scale goes all the way up to nearly 20 deaths per 100 thousand people while in Figure 6 the scale goes only to around 8 deaths per 100 thousand people. This means that consistent with the national level data, black people are significantly more likely to be killed by police than white people.

Q6: So what? Black people are at higher risk of committing crime as well! Aren’t they just getting their fair share of the risk inherent in criminal activity?

The blog FactCheck discusses this point, especially with regard to whether the higher rate of black males killed by police officers relative to white males could be due to black males being much more likely than white males to engage in violent crime. I will not discuss at this time theories as to why black people might be at a higher risk of committing crimes, except to say that I think poverty is a better explanation for these differences.

That said, I think the Fatal Encounters data can get us further into examining the nature of police violence towards black people than the FBI data used by FactCheck. In particular, the Fatal Encounters data set has a brief description of the circumstances involved in each death. The text of these descriptions can be quantified in order to help naive data explorers better understand what is going on.
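To give a feel for what quantifying the text might look like, here is a rough sketch that flags a key word in each description and computes the share of flagged records by race. The four records below are hypothetical, invented purely for illustration; they are not taken from the Fatal Encounters data, and the function name is my own.

```python
import re
from collections import Counter

# Hypothetical records, invented for illustration only (not real data).
records = [
    {"race": "black", "description": "Unarmed man shot after a traffic stop."},
    {"race": "white", "description": "Armed suspect shot after a standoff."},
    {"race": "black", "description": "Suspect died after a taser was used."},
    {"race": "white", "description": "Unarmed man died in custody."},
]


def flag_rate(records, pattern):
    """Share of records, per race, whose description matches the regex."""
    regex = re.compile(pattern, flags=re.IGNORECASE)
    totals, hits = Counter(), Counter()
    for rec in records:
        totals[rec["race"]] += 1
        if regex.search(rec["description"]):
            hits[rec["race"]] += 1
    return {race: hits[race] / totals[race] for race in totals}


# The word boundaries keep "armed" from being counted as "unarmed".
rates = flag_rate(records, r"\bunarmed\b")
print(rates)
```

Dividing the rate for one race by the rate for another gives the kind of relative frequency used in the rest of this post.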

If no racial discrimination exists in how police force is used against individuals, then the proportion of unarmed people among blacks killed should equal the proportion of unarmed people among whites killed. If, however, black people who are killed are more likely to be killed unjustifiably, then we should see that evidenced in a higher proportion of them being unarmed when killed.

In the data there are 301 cases in which “unarmed” appears in connection with a death resulting from interaction with a police officer. 104 of the 1934 black people killed were unarmed, while 60 of the 2578 white people who died at the hands of police were unarmed. This means that the proportion of unarmed people among blacks killed by police (104/1934=5.4%) is 2.3 times greater than that among white people killed by police (60/2578=2.3%). In probability notation:

“Unarmed”: 315 Cases
P(Unarmed|Black & Killed)/P(Unarmed|White & Killed)=2.3

In other words, among people killed by police, a black person is 130% more likely than a white person to have been unarmed.

Now you might say that this is just the case for the term “unarmed”. However, it is really hard to find any term associated with police brutality which does not appear disproportionately more often for blacks than it does for whites.
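The ratio is easy to verify from the counts quoted above. A minimal sketch, with variable names of my own choosing:

```python
# Counts quoted in the text: unarmed deaths and total deaths by race.
unarmed = {"black": 104, "white": 60}
killed = {"black": 1934, "white": 2578}

# P(Unarmed | Race & Killed) for each race.
p_unarmed = {race: unarmed[race] / killed[race] for race in unarmed}

# Relative frequency and the equivalent "percent more likely" figure.
ratio = p_unarmed["black"] / p_unarmed["white"]
percent_more = (ratio - 1) * 100

print(f"Black: {p_unarmed['black']:.1%}, white: {p_unarmed['white']:.1%}")
print(f"Ratio: {ratio:.1f}, or about {percent_more:.0f}% more likely")
```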

Table 2: The number of cases and the relative frequency with which key words were reported for black people’s deaths compared with white people’s deaths.

Descriptor            Cases   P(Word|Black)/P(Word|White)
Unarmed                 315   2.3
Naked                    67   1.6
Toy                      41   3.6
Cooperative              23   1.7

Excessive Force Words
Age 9 or under            -   1.4
Age 10-15               131   3.1
Shot 10+ times          168   1.7
Wrongful Death           83   1.4
Indictment              106   1.5

Cause of Death
Trauma/beating/etc.     105   1.7
Taser                   372   1.7
Asphyxiation            115   1.8
Medical Compli.         173   2.0
Gunshot                7450   1.0
Vehicle                1225   0.9

Justified Homicide keywords
“Reaching…gun”          109   1.6
“Robb(ed/ing)”         1006   1.4

Violent Crime Act
“Hostage”               160   0.8
“Standoff”              360   0.6

From Table 2 we can see that black people who are killed by police are 130% more likely to have been unarmed, 60% more likely to have been naked, 260% more likely to have been killed in association with a toy (probably a toy handgun), and 70% more likely to have been killed even when the word “cooperative” was used to describe the situation.

Black people who are killed are also 40% more likely to be reported as aged less than 10 and 210% more likely to be aged between 10 and 15 than their white counterparts. When killed, black people are 70% more likely to be shot 10 or more times. Furthermore, the deaths of black people are 40% more likely to give rise to a wrongful death suit and 50% more likely to result in some kind of indictment against the officers involved.

When it comes to cause of death, black people are 70% more likely to die from trauma such as beating, stabbing, falling, etc. while also being 70% more likely to have being tased listed as a contributing cause of death. Likewise, black people are 80% more likely to die of asphyxiation and 100% more likely to die of medical complications than their white counterparts.

Overall, blacks are equally likely as whites to die by gunshot while being 10% less likely than whites to die by vehicle. The killing of black people is often justified as resulting from them either being in the act of robbing or stealing (40% more likely than for whites) or “reaching for a gun” (60% more likely than for whites).

However, blacks are 20% less likely than whites to be killed while involved in serious crimes that result in hostages, and 40% less likely to be in a situation with police that results in a standoff ending in death.
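The percentages quoted above follow from the table's ratios by subtracting one and multiplying by 100. A small sketch of that conversion, using a handful of ratios from the table; the helper name is my own:

```python
# A few ratios P(Word|Black)/P(Word|White) copied from the key-word table.
ratios = {
    "Unarmed": 2.3,
    "Naked": 1.6,
    "Toy": 3.6,
    "Gunshot": 1.0,
    "Hostage": 0.8,
    "Standoff": 0.6,
}


def describe(ratio):
    """Turn a relative frequency into a 'percent more/less likely' phrase."""
    percent = round((ratio - 1) * 100)
    if percent > 0:
        return f"{percent}% more likely"
    if percent < 0:
        return f"{-percent}% less likely"
    return "equally likely"


for word, ratio in ratios.items():
    print(f"{word}: {describe(ratio)}")
```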

Conclusion: So, what are we to take from this?

Black people are clearly at much higher risk of being killed by police officers than white people. This statistic does not immediately lead to the conclusion that there is discrimination by police. However, it does raise some red flags. Looking more closely at state level data, we can see that in states above the Mason-Dixon line black people are often at much higher relative risk than their white peers of being killed by police, while certain states below the Mason-Dixon line have generally much higher rates of citizens being killed by police.

Critics of any kind of simple analysis of death by police for black people argue that black people are more likely to commit violent crimes and therefore should be more likely to suffer violent deaths. Those who live by the sword, die by the sword and all that.

If that were the case, then we should expect a higher likelihood of death by police for black people than for white people (which is what we see), but also equivalent or lower rates of black people getting killed while vulnerable compared with their white counterparts. These vulnerable states are difficult to measure, but text flags for “unarmed”, “naked”, age less than 10, etc. attempt to get at this.

In the data we see exactly the opposite, with black people much more likely to be killed while in a vulnerable state. We also see that the rates of unusual or brutal deaths (the result of beatings, asphyxiation, trauma, excessive or ineffective taser usage, etc.) resulting from police contact tend to be much higher among blacks.

We also see that black people seem to be at higher risk of dying while committing robbery or while “reaching for a gun”, while whites are more likely to die while participating in more serious sounding situations such as taking a hostage or after a standoff with police.

Overall, the picture that the data paints is that of a population which is more likely to commit crimes but also much more likely to suffer excessive force resulting from police prejudice. If there were evidence that excessive force leads to neighborhood reforms, lower crime, and rehabilitation, then these kinds of actions might be seen as justifiable. However, there is no evidence that excessive force does anything but reinforce racial stereotypes, resulting in more cyclical crime and poverty.

You can find my code to do your own analysis here. I apologize that it is messy. I have been working on this “quick one-day-post” for two weeks now and need to get it out so that I can focus on my dissertation work.

Footnote: [1] Total number of killings * Average number of police involved / Number of police nationally = 20,000 killings over 16 years * let's say 4 police per killing / 890,000 police = 9% over 16 years = 0.56% per year.

* Fatal Encounters Data Issues

This data has some obvious issues, primarily that it is based on what can be gathered from spotty and inconsistent newspaper reports. Newspaper reports are problematic because they often fail to provide important information such as the ethnicity of victims, the cause of death, or how the case was resolved, such as whether any disciplinary action was taken with regard to the officers involved.

Even more problematic, newspaper data cannot be assumed to be complete. That is, some portion of deaths are not reported, or if they are reported the article is only kept for a limited time. FatalEncounters has attempted to remedy this problem by going through government records. However, as they say, many government records are only required to be kept for a few years. Since the records are so sparse in certain indicators (race, disposition, and mental state), they suggest not using these records for analysis.

This, however, is selling ourselves short, as all we need is to make some basic assumptions in order to let the data work for us. The biggest issue with the data is that though it has nearly ten thousand cases, the data set maintainer D. Brian Burghart believes that it is about 46 percent complete. This is based on assuming a constant rate of deaths for all 16 years of data collection. For the two years with the most complete information, 2013 and 2014, there are 1,257 and 1,292 deaths respectively.
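Here is my reading of how a completeness figure of about 46 percent could be arrived at; the maintainer's actual method may differ. As an assumption, I take the number of recorded cases to be the sum of the race breakdown reported further down (9,471, which matches "nearly ten thousand"), and apply the average of the two most complete years to all 16 years.

```python
# Recorded cases: sum of the race breakdown given later in the post
# (black, latino, other, unknown, white). Using this sum is my assumption.
recorded_cases = 1934 + 1122 + 213 + 3624 + 2578

# Deaths in the two most complete years, 2013 and 2014.
best_years = [1257, 1292]
years = 16

# Constant-rate assumption: every year looks like the best-covered years.
expected_total = sum(best_years) / len(best_years) * years
completeness = recorded_cases / expected_total

print(f"Recorded: {recorded_cases}, expected: {expected_total:.0f}")
print(f"Estimated completeness: {completeness:.0%}")
```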

* Simplifying assumption discussions

The first assumption is debatable, since some might argue that race is more likely to be reported for black people or for white people. However, since our data shows an abundance of black deaths relative to white deaths, the only type of under-reporting of race that might make our analysis invalid is under-reporting of white race. Fortunately, we can test our analysis in that case by looking at what would happen if we assumed all unknown races were white.
Race breakdown:
  black  latino  other  unknown  white
   1934    1122    213     3624   2578
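A minimal sketch of that bounding exercise, using the race breakdown above. The population shares of 13% black and 63% white are borrowed from Table 1 and applied here to all ages, which is my own simplification; the variable names are also mine.

```python
# Race breakdown of all recorded deaths.
counts = {"black": 1934, "latino": 1122, "other": 213, "unknown": 3624, "white": 2578}

# Population shares taken from Table 1 (applying them here is an assumption).
pop_share = {"black": 0.13, "white": 0.63}

# Extreme case: every unknown-race death is counted as white.
white_extreme = counts["white"] + counts["unknown"]
inflation = white_extreme / counts["white"]


def black_to_white_per_capita(white_deaths):
    """Black deaths per unit of population relative to white deaths."""
    black_rate = counts["black"] / pop_share["black"]
    white_rate = white_deaths / pop_share["white"]
    return black_rate / white_rate


ratio_observed = black_to_white_per_capita(counts["white"])
ratio_extreme = black_to_white_per_capita(white_extreme)

print(f"White deaths grow by a factor of {inflation:.1f}")
print(f"Per-capita ratio: {ratio_observed:.1f} observed, {ratio_extreme:.1f} in the extreme case")
```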

We can see that the number of unknowns is large but not that large. If we assume all of them are white people then this more than doubles the number of white people who died from police. However, even doubling the number of white people
