Pat Leahy: Canary in a Data Mine

The new chairman of the Senate Judiciary Committee is demanding he be kept apprised of covert technologies our intelligence agencies use to thwart terrorism. The legislation he's cosponsoring would compel the White House to provide regular reports on all current and future intelligence data-mining operations.  Such a plan to trust Congress not to expose mechanisms which inherently demand obscurity would certainly be ill-advised regardless of its source.  But this scheme was hatched by the Senator once voted least likely to keep a top secret -- Patrick Leahy. 

As you may recall, Leahy was stripped of his Senate Intelligence Committee vice-chairmanship during the mid-1980s for making good on threats to sabotage classified strategies he didn't personally care for.  During Ronald Reagan's own war on terror, the Vermont Democrat was aptly nicknamed "Leaky Leahy" for proving time and again that he would do absolutely anything to discredit the Republican President -- including revealing the most vital of national security secrets.

In 1985, he was charged with disclosing a top-secret communications intercept which had led to the capture of the murderous Achille Lauro hijacking terrorists.  That leak likely cost an Egyptian counterterrorist agent his life shortly thereafter.  Then, in 1986, Leahy threatened to leak secret information about a covert operation to topple Libyan dictator Moammar Gadhafi.  When the details of the operation later appeared in the Washington Post, the mission was immediately aborted.

The loose-lipped liberal was finally forced to resign his post a year later when he was caught singing like a canary to an NBC reporter about classified information on the Senate Iran-Contra hearings.  On his third strike he was out, but, unfortunately, the game was not over.

Data Mining for Democrats

Now somehow holding his own committee gavel 20 years later, Leahy has promised to uncover abuses related to his latest pet peeve -- the balance of privacy and security in government use of evolving investigative technologies.  So it came as no surprise when his first target was the liberal-dreaded predictive analysis technique known as Data Mining (DM).  On January 10th, hearings began to investigate both the efficacy and legality of applying the process to counterterrorism.

In his opening remarks, Leahy complained that:
"Although billed as counterterrorism tools, the overwhelming majority of these data mining programs use, collect, and analyze personal information about ordinary American citizens.  Despite their prevalence, these government data mining programs often lack adequate safeguards to protect privacy and civil liberties."
These words betray a man investigating a process he doesn't understand.  Even those dimly illuminated understand that in order to isolate the extraordinary, you first define the ordinary.  Here's why.

Data Mining is the normalization, correlation, and analysis of multidimensional datasets to train standardized models through sophisticated Artificial Intelligence algorithms for future predictive analysis.  Okay -- that's quite a mouthful of techno-babble.  In plain English it's the establishment of standard models based upon trends found in vast stores of data.

Once proven through rigorous testing, these models can then be applied to other datasets to predict patterns and to flag common tendencies, variations and precursors worthy of investigation.  It's actually much less complicated than it sounds. A simple and familiar example of behavioral pattern prediction is Amazon's ability to accurately suggest books, CDs and movies that might spark your interest. A less familiar example is basic variation flagging, which gives investigators shorter, better-vetted watch and suspect lists. By expanding these concepts, DM is used successfully in the fields of fraud detection, risk profiling, financial and resource planning, claims analysis, demand forecasting and countless other applications.
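
For the technically curious, variation flagging can be sketched in a few lines of Python. This is only a toy illustration, not a description of any deployed system: the function name, the sample numbers, and the three-standard-deviation cutoff are all assumptions made for the example. It builds a picture of "ordinary" from baseline records and then flags only the new records that stray far from it.

```python
from statistics import mean, stdev

def flag_outliers(baseline, candidates, threshold=3.0):
    """Return candidates lying more than `threshold` standard deviations
    from the mean of the baseline ("ordinary") records."""
    mu = mean(baseline)      # what ordinary looks like
    sigma = stdev(baseline)  # how much ordinary records vary
    return [x for x in candidates if abs(x - mu) / sigma > threshold]

# Hypothetical measurements: eight ordinary records, then three new ones.
ordinary = [10, 12, 11, 13, 12, 11, 10, 12]
print(flag_outliers(ordinary, [11, 12, 40]))  # prints [40]
```

Notice that the ordinary records are never themselves the object of scrutiny; they serve only to define the yardstick against which the unusual one stands out.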

Not surprisingly, one key point which consistently eludes the ACLU crowd is that DM is not the study of individuals but rather of trends. In other words, it analyzes the forest, not the trees, contrary to what Leahy's misleading words would imply.  And, last time I checked, neither trends nor forests have any rights or expectations of privacy.  Moreover, it is not, as Leahy and others of his ilk have often referred to it, electronic eavesdropping, wiretapping or anything even remotely related.

However, as with all statistical analysis, the margin of error shrinks as the population sampled grows.  Therefore, Leaky, the more "ordinary" American citizens you catalog, the greater the accuracy of your extraordinary predictions.
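
A rough sketch of that relationship, using the textbook margin-of-error formula for a proportion estimated from a simple random sample. The formula choice, the 95% z-value of 1.96, and the sample sizes are assumptions made for illustration; none of them comes from the hearing. Under this formula the margin falls with the square root of the sample size, so quadrupling the records halves the error.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion p
    estimated from n randomly sampled records."""
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical sample sizes: each step up shrinks the margin.
for n in (100, 400, 10_000):
    print(n, round(margin_of_error(n), 4))
```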

This Canary Sounds like a Cuckoo

Having proven his ignorance of DM mechanics, Leahy then proceeded to showcase his equal lack of knowledge about the currently deployed programs he distrusts:
"Just recently, we learned through the media that the Bush Administration has used data mining technology secretly to compile files on the travel habits of millions of law-abiding Americans.  Incredibly, under the Department of Homeland Security's (DHS) Automated Targeting System program ("ATS"), our government has been collecting and sharing this sensitive personal information with foreign governments and even private employers...."
Actually, ATS has been evaluating millions of travelers since 2002, when DHS began requiring air and cruise lines to provide advance data on all passengers and crew entering and leaving the country.  And yes, the data may be shared with state, local and foreign governments (as suggested by the venerated 9/11 Commission) for use in hiring decisions and in granting licenses, security clearances, and contracts.  It all sounds quite sensible when you actually take the time to give it some thought.  In fact, ATS is just one of many sources the DHS Customs and Border Protection (CBP) agency relies on 24/7.  They also reference terrorist watch lists and mine numerous other federal data warehouses when isolating potentially dangerous people and cargo entering or leaving our more than 300 ports.

The contemptible chairman continued with another attempt to demonize the process:

"Following years of denial, the Transportation Security Administration ("TSA") has finally admitted that its controversial "Secure Flight" data mining program - which collects and analyzes airline passenger data obtained from commercial data brokers - violated federal privacy laws by failing to give notice to U.S. air travelers that their personal data was being collected for government use."
This was blather of the lowest grade. True -- the TSA did admit to Secure Flight's minor non-compliance with the Privacy Act during system pressure testing.  But Leahy failed to mention that they also characterized the problems as "largely unintentional," attributing them to a failure to revise public announcements after the test parameters required modification.  Leahy also conveniently neglected to reveal that one of the primary goals of Secure Flight is to facilitate safer air travel and faster boarding for non-threats while providing enhanced screening for potential threats.  But then, who wants faster and safer airport screening?

And Deceives Like a Mockingbird

You'd best be seated for this next one.  Leahy continued:
"And last month, The Washington Post reported that the Department of Justice will expand its ONE-DOJ program - a massive data base that will allow state and local law enforcement officials to review and search millions of sensitive criminal files belonging to the FBI, DEA and other federal law enforcement agencies.  This will make sensitive investigative information about thousands of individuals - including those who have never been charged with a crime - available to local and state law agencies."
Wait a minute!  Wasn't interagency communication failure among the chief complaints of the 9/11 Commission - the so-called "Wall of Separation" erected by the Clinton Administration? And wasn't one of their primary recommendations the sharing of information between federal agencies, as well as with state, local, and selected foreign officials?  In fact, OneDOJ is in complete compliance with the Law Enforcement Information Sharing Program (LEISP) in that it allows searches to be conducted across disparate federal and local law-enforcement computer systems.  Yet, he shamelessly holds this forth as a fault?

But here's the real measure of this man.  Last August, WaPo was delighted to report the failure of the FBI's Virtual Case File project. The VCF was another networked system for tracking criminal cases, designed to replace the bureau's antiquated paper files.  But more than $600 million later, myriad system flaws and milestone setbacks caused the DOJ to scrap the project.  And what do you suppose Leahy's response was to the collapse of a system quite similar to OneDOJ, which he denounced? 

"We had information that could have stopped 9/11. It was sitting there and was not acted upon. . . . I haven't seen them correct the problems. . . . We might be in the 22nd century before we get the 21st-century technology."
Please -- read that quotation more than once.

Data Mining as Profiling

Once Leahy's diatribe ended, the floor was opened to the invited panel of experts. These ranged from Jim Harper, director of information policy studies at the Cato Institute, to the astute James Carafano, a senior research fellow at the Heritage Foundation.  Each had his own take on both the privacy and the technical issues.

Harper, an advisor to the DHS privacy office, essentially reiterated the theme of his December article which questioned the value of DM against terrorism given its privacy costs.  He concluded that,
"terrorist acts and their precursors are too rare in our society for there to be patterns to find. There simply is no nugget of information to mine."
While his assessment fairly addressed the lack of suitable population in our society for classic DM, it unfairly disregarded the data available worldwide and the constant evolution of DM heuristics. Carafano wisely agreed that "behavior science modeling is a rapidly developing field" and defended the effectiveness and legality of the programs by comparing them to what police officers routinely do in the field:
"When a cop goes on the street, he's collecting information every second. He's looking for behavior that's out of place. He pulls a car over and everything else. And that leads to a whole thing. So there, he's not starting with a suspect, yet he's continually gathering freely accessible information." 
Carafano may have just nailed the left's irrational hatred for this technology.  Many liberals would reflexively brand the behavioral profiling he described as potentially racially, ethnically, or religiously motivated.  One wonders -- if long-term predictors of terrorism should include MidEast Muslims, can outcries of Racial Data Mining from the left and CAIR be far behind?

Connecting the Dots

So, while Leahy focused on downplaying DM's potential upside and up-playing its downside, the experts disagreed on both.  But while the arguments continue, this much is clear -- as an adjunct to intelligence and surveillance, DM has delivered very promising results in the fields of law enforcement and counterterrorism.

For Example:
  • In Business, Empire Blue Cross/Blue Shield, New York State's largest health insurer, has realized hundreds of millions of dollars in fraud-and-abuse savings using IBM's Fraud and Abuse Management System (FAMS).
  • In Banking, Citibank used DM techniques to locate and capture Vladimir Leonidovich Levin, who funneled $10 million into accounts around the world.  Citibank refuses to explain their DM system as they are understandably wary of publishing all the details.
  • And yes, in counterterrorism, DM helped flag a Jordanian national trying to enter the country at Chicago's O'Hare Airport in 2003. CBP officers said Raed Mansour al-Banna was calm and polite and that his papers were in perfect order. The DM flag alone prevented his entry. Two years later, he detonated a mammoth car bomb in Hilla, Iraq, killing 130 people and wounding 146.
Many have depicted our vulnerability on 9/11 as a failure to "connect the dots," and the term has come to describe the ability to take seemingly unrelated circumstances and find the common thread.  But these allegorical dots must be seen before they can be connected, and not limited to those immediately apparent.  In an asymmetrical war where attacks must be prevented rather than defended against, Data Mining and other forms of behavioral pattern prediction are our best and perhaps only hope in accomplishing both.

We Know Why the Caged Bird Sings

It's indeed noteworthy that in his opening address, Leahy used the term "checks and balances" 3 times.  He also used the word "safeguard" 4 times and we find "protect(ing)" 5 times, but each in regard to privacy and civil rights and never to the country and its citizens or from our enemies. In fact, the words "safe" and "safety" only appear in a positive note once in the entire speech, and even then "not just from enemies abroad, but also from abuses at home."  And yet, he only briefly mentioned expert opinion of the technology's place in counterterrorism and then emphasized the negative.  Is there any doubt where he stands personally on the subject, notwithstanding his complete lack of workable knowledge about it?

Obviously, Data Mining represents neither wiretapping nor eavesdropping nor any of the other totally inappropriate aliases the totally misinformed have used to besmirch it.  The Democrats and the MSM have shamelessly exaggerated its "invasion of privacy" facets while minimizing its potential as a defensive weapon against our worst nightmares.

As such, there would certainly seem to be no harm in discussing what Leahy has called "adequate checks and balances and oversight."  But let's keep in mind that two decades ago, this man exposed just how vulnerable the words "top secret" can be when he is not personally satisfied with policies held as such. 

Accordingly, the real danger lies in the extent of the oversight and the level of report detail he is allowed.  Citibank was shrewd in guarding the specifics of its DM system, knowing that transparency would undermine its power. We would be wise to follow its lead. After all, that man hitting blocks with a gavel in Washington should likely be hitting rocks with a sledge-hammer in Kansas.

Here's a closing thought. Early coal miners knew they were in trouble when the canary stopped singing.  Since its silence was a likely predictor of the presence of noxious gases, they'd evacuate for fear of losing many lives to potential asphyxiation or explosion.  Conversely, modern data miners will know they're in trouble when the canary starts singing.

And in that case, were he to know all the words, it would be a likely predictor that the lives lost from asphyxiation and explosions may well number in the millions.

Marc Sheppard is a technology consultant, software engineer, writer, and political and systems analyst. He is a regular contributor to American Thinker. He welcomes your feedback.