We Will Never Know the Actual Rate of Sexual Assault

The U.S. Department of Justice has released a report entitled “Rape and Sexual Assault Victimization Among College-Age Females, 1995-2013.”  In light of the Rolling Stone and Lena Dunham controversies, the DOJ report is getting some media attention.

But the report simply does not answer the question at hand, nor can it.  Nor can any report.  The public and policymakers will eventually need to understand that we will never, ever know the real rate of sexual assaults in any demographic group, never mind among the populace as a whole.

The DOJ report is both unscientific and irresponsible in its reporting.  The report's language does not appropriately address the uncertainty in the underlying data.  Witness the first paragraph of the report:

For the period 1995-2013, females ages 18 to 24 had the highest rate of rape and sexual assault victimizations compared to females in all other age groups. Within the 18 to 24 age group, victims could be identified as students enrolled in a college, university, trade school or vocational school or as nonstudents. Among student victims, 20% of rape and sexual assault victimizations were reported to police, compared to 32% reported among nonstudent victims ages 18 to 24.

Words such as “alleged” and “claimed” should be embedded throughout this report.  Unfortunately, they are not.  The paragraph above is written as a statement of fact.  It should not be.  The data within this report are – at best – very approximate estimates whose errors may be massive.

The second paragraph in the report describes the methodology:

This report describes and compares the characteristics of student and nonstudent female victims of rape and sexual assault, the attributes of the victimization, and the characteristics of the offender. The findings are from the Bureau of Justice Statistics' (BJS) National Crime Victimization Survey (NCVS), which collects information on nonfatal crimes reported and not reported to police against persons age 12 or older. Rape and sexual assault are defined by the NCVS to include completed and attempted rape, completed and attempted sexual assault, and threats of rape or sexual assault.

The NCVS is a voluntary survey whose responses are not verified, nor are they generally verifiable.  In this dataset, there will be false positives (i.e., individuals who claim they were raped or sexually assaulted but who, in fact, were not) and false negatives (i.e., individuals who claim they were not raped or sexually assaulted but who, in fact, were).  It is impossible to determine the magnitude of each of these errors, or the net direction of the error.  There is also no reason to assume that the errors cancel each other out and that the reported rates are therefore accurate.
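To make the point about offsetting errors concrete, here is a minimal sketch of how false positives and false negatives combine into a reported rate.  It is not the NCVS methodology, and every number in it is hypothetical: the function name `observed_rate`, the assumed true rate, and both error rates are chosen only to illustrate the mechanics.

```python
def observed_rate(true_rate, false_negative_rate, false_positive_rate):
    """Share of respondents who report a victimization in a survey.

    true_rate: hypothetical true share of respondents who were victimized
    false_negative_rate: share of true victims who report no victimization
    false_positive_rate: share of non-victims who report a victimization
    """
    reported_by_victims = true_rate * (1.0 - false_negative_rate)
    reported_by_non_victims = (1.0 - true_rate) * false_positive_rate
    return reported_by_victims + reported_by_non_victims


# All inputs below are hypothetical; none come from the DOJ report.
TRUE_RATE = 0.010

# Case A: many victims stay silent and very few non-victims make a claim.
low = observed_rate(TRUE_RATE, false_negative_rate=0.50, false_positive_rate=0.001)

# Case B: few victims stay silent and more non-victims make a claim.
high = observed_rate(TRUE_RATE, false_negative_rate=0.10, false_positive_rate=0.010)

print(f"assumed true rate: {TRUE_RATE:.4f}")
print(f"case A observed:   {low:.4f}")
print(f"case B observed:   {high:.4f}")
```

With the same assumed true rate, case A understates it and case B overstates it.  Because neither error rate can be measured from the survey itself, the reported figure alone cannot reveal which case applies.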

It has become unfashionable to even discuss such concerns over accuracy in the context of sexual assault crimes, but rigorous science does not seek to be politically correct – or to concern itself with “feelings.” Rather, rigorous science – both the social and natural sciences – concerns itself only with reality.  In short, accuracy is the only goal.  Nothing else is relevant.

Sexual assault is defined very broadly in the survey:

Sexual assault is defined across a wide range of victimizations separate from rape or attempted rape. These crimes include attacks or attempted attacks usually involving unwanted sexual contact between a victim and offender. Sexual assault may or may not involve force and includes grabbing or fondling.

Given that sexual assault includes perceived “attempted attacks” that “may not involve force” and “includes grabbing or fondling,” the alleged statistics for this offense should be used with extreme caution because of possible positive bias.

Rape is defined as follows in the survey methodology:

Rape is the unlawful penetration of a person against the will of the victim, with use or threatened use of force, or attempting such an act. Rape includes psychological coercion and physical force, and forced sexual intercourse means vaginal, anal, or oral penetration by the offender. Rape also includes incidents where penetration is from a foreign object (e.g., a bottle), victimizations against males and females, and both heterosexual and homosexual rape. Attempted rape includes verbal threats of rape.

Critical readers will note the extreme subjectivity in these definitions, which highlights a fundamental flaw in the methodology and reporting.  According to this definition, the verbal threat of rape is classified as a rape, as are attempts.  (More precisely, a verbal threat of rape is considered an attempted rape, which in turn is considered an actual rape.)  In other words, this survey definition of “rape” is far too broad.  It is simply inaccurate, rendering the survey results themselves inaccurate.

Even with these extremely subjective, entirely unproven, and overly broad definitions of rape and sexual assault, the report concludes that “the rate of rape and sexual assault was 1.2 times higher for nonstudents (7.6 per 1,000) than for students (6.1 per 1,000).”  These translate into rates of 0.76 percent and 0.61 percent for nonstudents and students, respectively, far lower than the rates of 20 to 33 percent being thrown around all too casually in the mainstream media.
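The conversion from a rate per 1,000 to a percentage, and the ratio between the two groups, can be checked with a few lines of arithmetic.  The sketch below uses only the two figures quoted from the report; the function and variable names are illustrative.

```python
def per_1000_to_percent(rate_per_1000):
    """Convert a rate per 1,000 persons into a percentage."""
    return rate_per_1000 / 1000 * 100


# Rates quoted from the report, per 1,000 females ages 18 to 24.
nonstudent_per_1000 = 7.6
student_per_1000 = 6.1

nonstudent_pct = per_1000_to_percent(nonstudent_per_1000)
student_pct = per_1000_to_percent(student_per_1000)
ratio = nonstudent_per_1000 / student_per_1000

print(f"nonstudents: {nonstudent_pct:.2f} percent")
print(f"students:    {student_pct:.2f} percent")
print(f"nonstudent-to-student ratio: {ratio:.1f}")
```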

That said, the rates presented in this DOJ report may be significant underestimates or overestimates.  Here is the methodological description of the NCVS questioning:

The NCVS is administered to persons age 12 or older from a nationally representative sample of households in the United States ... All first interviews are conducted in person with subsequent interviews conducted either in person or by phone ...

The NCVS used a two-phased approach to identifying incidents of rape and sexual assault. Initially, a screener was administered, with cues designed to trigger the respondent's recollection of event and ascertain whether the respondent experienced victimization during the reference period. The screener questions directly focused on rape and sexual assault were --

- (Other than any incidents already mentioned), has anyone attacked or threatened you in any of these ways: ... (e) any rape, attempted rape, or other type of sexual attack;

- Incidents involving forced or unwanted sexual acts are often difficult to talk about. (Other than any incidents already mentioned), have you been forced or coerced to engage in unwanted sexual activity by (a) someone you didn't know before, (b) a casual acquaintance? OR (c) someone you know well?

Even if the respondent did not respond affirmatively to these specific screeners on rape and unwanted sexual contact, the respondent could still be classified as a rape or sexual assault victim if a rape or unwanted sexual contact was reported during the stage-two incident report.

Words like “cues” and “trigger” – and concepts such as “coerced” and “unwanted” – should raise red flags over potential false positives.  Similarly, the fact that “all first interviews are conducted in person” should raise red flags over possible false negatives.  One could reasonably foresee many respondents being unwilling to describe such attacks to a surveyor, either in person or over the phone.  The last sentence in the quote above also suggests that the interviewer has some discretion in classifying responses, which raises a further major red flag regarding data integrity.

For comparison, the U.S. Department of Justice – Federal Bureau of Investigation has recently issued a user manual and technical specification on “Reporting Rape in 2013” under the Criminal Justice Information Services (CJIS) Division Uniform Crime Reporting (UCR) Program.  According to this DOJ-FBI document released in April 2014, “in December 2011, FBI Director Robert S. Mueller, III, approved revisions to the UCR Program's definition of rape: 'Penetration, no matter how slight, of the vagina or anus with any body part or object, or oral penetration by a sex organ of another person, without the consent of the victim,'” and “the new definition of Rape went into effect on January 1, 2013.”  This DOJ-FBI definition is very different from the DOJ's definition of rape in the Office of Justice Programs (Bureau of Justice Statistics) report under discussion.  Why?

Under this revised definition of rape, the FBI-UCR rate of rape in the United States during 2013 was 39.8 per 100,000 population, or 0.040 percent.  Using the “legacy definition” of rape, the FBI-UCR rate of rape for 2013 was 23.1 per 100,000 population, or 0.023 percent.
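The same arithmetic applies to the FBI-UCR figures, which are stated per 100,000 population rather than per 1,000.  The short sketch below uses only the two rates quoted above; the names are illustrative, and the ratio at the end is simply the quotient of the two quoted rates, not a figure published by the FBI.

```python
def per_100k_to_percent(rate_per_100k):
    """Convert a rate per 100,000 population into a percentage."""
    return rate_per_100k / 100_000 * 100


# 2013 FBI-UCR rates quoted above, per 100,000 population.
revised_per_100k = 39.8
legacy_per_100k = 23.1

revised_pct = per_100k_to_percent(revised_per_100k)
legacy_pct = per_100k_to_percent(legacy_per_100k)

# How much larger the rate is under the revised definition than the legacy one.
definition_ratio = revised_per_100k / legacy_per_100k

print(f"revised definition: {revised_pct:.3f} percent")
print(f"legacy definition:  {legacy_pct:.3f} percent")
print(f"revised-to-legacy ratio: {definition_ratio:.2f}")
```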

The best data we have to date on the rate of rape and sexual assaults come from the criminal justice system – a system where claims are appropriately tested.  Yet even these statistics are unreliable.  Some individuals do not report crimes.  Some individuals report crimes that never took place.  Some individuals are found not guilty of crimes they committed.  And some individuals are found guilty of crimes they never committed.

The phrase “trust but verify” was used in the right-of-center media over the past few days with respect to sexual assault allegations.  This, too, is not the correct approach, as it indicates an initial presumption of guilt against the accused coupled with an investigatory approach by authorities that seeks to prove guilt.  Instead, we must always presume innocence and design our investigatory mechanisms so that they objectively seek to find the truth, whatever that truth may be.

Western civilization was founded on the concepts of due process and the presumption of innocence within the overarching fabric of the rule of law.  We must never forsake these principles, especially in light of very dubious statistics.
