Collecting Data Without Garbage Filters

By John McCormick | Posted 2005-06-14

Among his alleged crimes: child molestation and rape.

Calderon tried to protest his innocence.

He told his captors that his Social Security number and birth certificate had been stolen nine years before. He told them that they could look up his file.

He had reported the theft to the Los Angeles County Sheriff's Station in Norwalk, Calif., 15 miles up Interstate 5.

He asked the Anaheim Police Department to check his fingerprints. Instead, his hands were put in cuffs.

He was taken to a back room. The file that got checked was ... his wallet. Sure, his driver's license listed him as three inches shorter and 15 pounds lighter than the man described on the arrest warrant. But he was Hispanic, and he had the right name and birth date.

Calderon spent the next week in jail—for crimes he didn't commit.

How did he end up in this mess?

A Fry's manager, Tyra Fizel, had requested a background check when Calderon was being hired. The criminal warrants came in a report provided by The Screening Network, a service of ChoicePoint, the $1 billion-a-year data broker based in Alpharetta, Ga. When she saw the felony charges, she called the police.

Eighty percent of medical bills have errors. Sixty percent of retail invoices include wrong products, wrong quantities or wrong weights. Ten percent of all direct mail is undeliverable because of bad addresses.

It's only going to get worse. The information stored by U.S. companies is doubling every three years. No company's information is always accurate. Companies such as ChoicePoint, which sell information on individuals for a profit, say it's not their problem. But if you're the wrong Steven Calderon, it becomes your problem. Or you wind up in jail.

But no one—not Fizel, not Fry's, not the police—stopped to ask if the data ChoicePoint supplied was accurate. If they had, they might have found out that he was, indeed, an innocent man. Calderon's identity theft report, which he made in Norwalk, Calif., in 1993, wasn't connected with the criminal files that were created in his name.

ChoicePoint, since its Feb. 15 admission that it was fooled into selling personal information on 35,000 Californians to fake businesses set up by Nigerian criminals—and its admission two days later that it really sold information on 145,000 people—has become the poster child for problems in keeping corporate data secure.

Several class-action lawsuits have been filed in the wake of the February security snafu, both by ChoicePoint shareholders and by people whose information ChoicePoint may have sold.

Government bodies—from Congress, to the Federal Trade Commission, to a group of state attorneys general—are in the midst of investigating ChoicePoint for violation of laws regarding the security of information held about consumers by for-profit companies.

Story Guide:

Blur: The Importance of Accuracy

  • Not Just Security, But Accuracy
  • 'Serious' Errors Are Common
  • Data Customers Pay the Costs
  • Collecting Data Without Garbage Filters
  • Records 'Full of Inaccuracies'
  • Crap In, Crap Out
  • Fix It Yourself
  • No Way to Check
  • ChoicePoint Data at a Glance

Not Just Security — Accuracy

But, as the Calderon case shows, data security isn't necessarily ChoicePoint's only challenge.

The other shoe that has yet to drop is the accuracy of the information it supplies. The accuracy of the files held by ChoicePoint and America's other data brokers is critical to companies that increasingly rely on automated rules, acting on electronic flows of unseen information, to make high-volume decisions in a fraction of a second.

ChoicePoint's list of clients totals 50,000 businesses and has included such well-known names as General Electric, Home Depot, IBM, Boston Market, the Archdiocese of New Orleans and the Federal Bureau of Investigation.

"The private sector and, increasingly, government rely on the data provided by ChoicePoint to determine whether Americans get home loans, are hired for jobs, obtain insurance, pass background checks and qualify for government contracts," said Marc Rotenberg, president of the Electronic Privacy Information Center (EPIC), a public interest group, in prepared testimony on ChoicePoint before a March 15 meeting of the House Subcommittee on Commerce, Trade and Consumer Protection.

About three-quarters of ChoicePoint's business involves distributing information under federal and state regulations, including the Fair Credit Reporting Act (FCRA), which states: "Whenever a consumer reporting agency prepares a consumer report, it shall follow reasonable procedures to assure maximum possible accuracy of the information concerning the individual about whom the report relates."

Yet ChoicePoint has faced at least a half-dozen suits over the accuracy of its reports, filed by persons as disparate as an assembly line worker retired from General Electric and a marketing professional applying for a job at IBM. In at least one case, a court ordered ChoicePoint to pay more than a quarter-million dollars to a Kentucky woman because the company erroneously reported that she had a number of claims against her insurance carrier, leaving her unable to afford insurance.

But that's a small number, from ChoicePoint's view. "Obviously, from time to time, we are subject to litigation," says Doug Curling, ChoicePoint's president and chief operating officer. "However, we work very hard. And our track record would show we are very successful at making sure the information product that we deliver to our customer is accurate."

ChoicePoint has said that incorrect information is produced less than once in every 1,000 of the 7.3 million background checks it performs each year. Even at less than one-tenth of 1%, however, that leaves close to 6,000 errors.
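The arithmetic behind those figures can be checked directly. The short sketch below uses only the numbers quoted above (7.3 million checks a year, a ceiling of one error per 1,000, and roughly 6,000 errors); the variable names are illustrative.

```python
# Figures quoted in the article; variable names are illustrative.
checks_per_year = 7_300_000

# Ceiling if errors occurred in exactly one of every 1,000 checks.
max_errors = checks_per_year // 1000

# Error rate implied by the "close to 6,000" figure.
implied_rate = 6_000 / checks_per_year

print(f"ceiling: {max_errors:,} errors per year")          # ceiling: 7,300 errors per year
print(f"6,000 errors implies a rate of {implied_rate:.3%}")  # 6,000 errors implies a rate of 0.082%
```

In other words, the 6,000 figure corresponds to a rate a little under the one-in-1,000 ceiling, which is consistent with "less than once in every 1,000."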

And ChoicePoint doesn't really know how accurate the records in its systems are. When asked how ChoicePoint gauges accuracy, Curling said the company only counts errors "associated with a dispute." That means individuals must find or obtain a copy of a ChoicePoint report, be their own fact-checkers on the reports, try to bring the errors to the attention of the company and then follow up to make sure the errors are actually corrected, permanently.

When Curling was asked what percentage of the data stored in its systems was accurate, he called the line of questioning "hostile." The premise, he maintained, is "so broad you know it can't be accurately answered."

But by ChoicePoint's own admission, it neither checks nor considers itself responsible for the accuracy of the estimated 17 billion files it has collected and stored. It assumes only that the facts it acquires are accurate when they arrive. "We do not verify the factual basis of a record, but instead rely on the assertions of our data sources that created the record," wrote James Lee, the company's chief marketing officer, in an e-mail to Baseline.

"Serious" Errors are Common

Not only is information going unchecked by ChoicePoint, but the way the company aggregates data and then distributes it can introduce errors, such as mismatching profiles of different individuals with their insurance or criminal histories. As might have happened with the two Calderons.

 

There appear to be no generally accepted statistics on data accuracy rates among data brokers such as ChoicePoint, which deals in insurance claims history, court documents and other public records. However, there are statistics on the closely related field of credit reporting, in which companies such as Equifax, Experian and TransUnion operate. ChoicePoint, in fact, was spun off from Equifax in 1997.

A 2004 study by the U.S. Public Interest Research Group found that 54% of all credit agency reports contain errors. Twenty-five percent of these errors were considered "serious," meaning the reports include erroneous listings of delinquencies, accounts in collection, bankruptcies and other information that could result in a denial of credit.

Similarly, a 2003 Government Accountability Office report, citing statistics from the Consumer Federation of America, found that 78% of credit-agency files omitted account information, 82% had inaccuracies regarding revolving accounts or collections, and 96% had bad credit-limit information.

Yet every company wrestles with data accuracy. The fact is, America's vast panoply of databases is riddled with errors, according to data quality experts such as Ted Friedman, a vice president at research company Gartner, and Larry English, president of Information Impact, a consulting company that specializes in data quality.

Indeed, data quality problems extend to just about every corner of corporate America:

  • 80% of medical bills have multiple errors, from typos to erroneous charges, according to the Medical Billing Advocates of America.
  • 60% of all retail invoices contain data errors, such as incorrect products, quantities and weights, according to management consultancy A.T. Kearney.
  • 20% of U.S. mail and commercial package deliveries were returned because of incorrect names or addresses, according to business and information-technology executives polled by Forrester Research in 2004.
  • 10% to 15% of all direct mail is undeliverable because the addresses are bad, according to the Direct Marketing Association.

In fact, a Gartner report last year said that more than 25% of the critical data in Fortune 1,000 databases is inaccurate or incomplete. This includes faulty inventory descriptions, bad product codes and product descriptions, erroneous financial data, inaccurate supplier information and incorrect employee data.

"If someone says their data is always accurate, they're lying," says Dana Rafiee, the U.S. director of Destiny Corp., an international business and technology consulting company. The accuracy of information, he points out, "is a major issue in every organization."

The cost of all this bad data? The Data Warehousing Institute, a business intelligence and data analysis industry consortium, estimates errors are costing U.S. businesses about $600 billion a year. The fallout? Everything from redoing a job that went astray because of bad data, to misdirected shipments due to faulty addresses, to the cost of correcting errors in a database.

Poor data collection and analysis can even spoil a presidential election, a situation that continues to shadow ChoicePoint nearly five years later.

To get ready for the 2000 election, the state of Florida contracted with data broker Database Technologies (DBT) to clean its voter registration list of convicted felons, people registered to vote in more than one county and the names of people who had died. But because of the way Florida defined the search criteria, DBT warned the review would be overly inclusive. As a result, according to various reports, DBT's housecleaning didn't just net felons, for instance; it picked up people whose names were similar to those of felons. DBT officials said voting officials were supposed to check the list. It's still unclear how many voters were prevented from casting ballots in Florida, a state George W. Bush carried by a little more than 500 votes.

ChoicePoint acquired Database Technologies in May 2000, just a few days after DBT turned in its "cleaned" file to Florida officials.

Data Customers Pay the Costs

These errors may be just a start, as the amount of data kept on individuals rises exponentially and new forms of complex data, like medical scans, get added to the record. Some 11 exabytes of new information are expected to be generated this year, most of it stored by magnetic means. This is the equivalent of 80,000 libraries holding as many books as the Library of Congress. And the amount of information being produced—and that needs to be kept accurate—is doubling every three years, according to a study by the University of California at Berkeley.

 

Data accuracy "is going to be a bigger problem than data security," says Larissa Moss, a data management consultant with the Cutter Consortium and a former information quality assurance manager for Security Pacific National Bank, now part of Bank of America.

The problem is fundamental: Few organizations really know how much of their data is accurate, and few have the necessary tools for validating the information they have or for cleaning it up. And even fewer have the staff in place to find, isolate and correct inaccurate information, English says.

Companies without proper information management and control, according to English, are already spending 10% or more of their operating revenue on fixing problems that stem from bad data: the incomplete specification that messes up a manufacturing run, a bad address that sends a package to the wrong place, an inaccurate bill or invoice that requires the intervention of a clerk. The work has to be done again, and the wrong data has to be found and fixed. Then there are the fines or penalties from a late shipment or missed deadline. And costs grow if the problem isn't easily identified and corrected, and many, of course, aren't.

English simply calls it the "cost of failure as a result of poor-quality information."

Those problems, however, pale in comparison to what he sees as an even bigger consequence of bad data—a loss of customer confidence or trust.

"The real cost," English points out, "is in alienation of the customer."

While companies realize that an incorrect bank statement or extra charges on a bill may send a customer into the arms of a competitor, English says they also need to understand that customers care about the little details, too. Barbra Streisand once pulled an account from a bank that misspelled her first name, he contends.

ChoicePoint has not reported any alienation from its corporate customers because of its data accuracy problems. Fry's, for instance, is still a ChoicePoint customer. In an April assessment of the company, however, Deutsche Bank said ChoicePoint might lose customers over the security breach.

ChoicePoint began life as a public company in 1997 when Equifax spun off its insurance services unit and related businesses. ChoicePoint would collect information on individuals for the insurance industry and check past claims, credit reports and other information to analyze individuals' insurance risk. Since then, it has acquired more than 60 companies that specialize in information ranging from drug testing to employee screening to DNA analysis. Along the way, it has collected, according to EPIC, Social Security numbers, property and vehicle information, addresses, and other bits of sensitive information. The company has data on more than 220 million U.S. citizens—about four of five Americans.

ChoicePoint information has been used to help families find missing children, law enforcement track down criminals, and insurance companies offer quick policy approvals.

But all companies find it problematic to keep data accurate and secure. The task at ChoicePoint and other data brokers is magnified by the billions of records they handle and the effect on individuals' lives that incorrect information can have, according to Randy Bean, managing partner of NewVantage Partners, an information-technology consultancy that works with large companies including Fidelity and Liberty Mutual Insurance.

Bean, who was also the general manager of business-to-business database marketing at Harte-Hanks Data Technologies and held information-technology planning positions at Bank of Boston, says data brokers and credit agencies, compared to other companies, have a greater responsibility to ensure the information they distribute is accurate.

"There needs to be the greatest sensitivity involved because people very willingly share their information with these organizations, be it when they make retail purchases or open bank accounts or medical records—whatever the case may be," he says. "And the organizations that capture that data ... ultimately must have responsibility for how that data is used and where it goes."

Collecting Data Without Garbage Filters

ChoicePoint strenuously guards details of its information gathering and processing procedures. Baseline's understanding of how the company works is based on court filings, company statements, and interviews with industry experts and company executives.

ChoicePoint feeds its huge databases with streams of data tapes and compact discs from insurance firms, marketing companies and other commercial sources, as well as public records such as court documents and licenses. Insurance companies, for example, then use ChoicePoint's data stores to check the claims history of policy applicants.

But ChoicePoint verifies little of the data in its vast repositories. It mostly aggregates files and generates reports from them without checking the information for duplication, omissions or inaccuracies. "Very little, if anything" is checked, according to an industry expert with 20 years of experience in the kind of data brokering and processing that ChoicePoint does, who asked not to be named.

Files, this industry expert says, can contain misspelled names, out-of-date addresses, faulty insurance claims or any number of other inaccuracies.

Most of their systems are "fire and forget," he says, meaning that ChoicePoint simply loads data from outside sources into its system and moves on to the next task.

ChoicePoint will check these files for anomalies. For example, if an insurance company that regularly sends ChoicePoint updated files with a small percentage of changes—say, less than half a percent—suddenly sends ChoicePoint a tape with more than 1% of the data changed, ChoicePoint will spot the jump. The company will then ask why the new file doesn't match historical patterns.
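The kind of volume check described above can be sketched as a comparison between the share of records that changed in a new file and a threshold based on the supplier's historical norm. This is only an illustration, not ChoicePoint's actual process: the function names, the dictionary-based file layout and the use of 1% as the alert threshold are assumptions drawn from the example figures in the text.

```python
from typing import Dict

Record = Dict[str, str]


def changed_fraction(previous: Dict[str, Record], incoming: Dict[str, Record]) -> float:
    """Share of incoming records that are new or differ from the previous load."""
    if not incoming:
        return 0.0
    changed = sum(1 for key, rec in incoming.items() if previous.get(key) != rec)
    return changed / len(incoming)


def looks_anomalous(previous, incoming, alert_threshold: float = 0.01) -> bool:
    """Flag a file whose change rate exceeds the (assumed) alert threshold."""
    return changed_fraction(previous, incoming) > alert_threshold


# Hypothetical supplier file: 1,000 policies keyed by a made-up policy number.
last_month = {f"P{i:04d}": {"claims": "0"} for i in range(1000)}

# A routine update touching 3 records (0.3%) passes unnoticed.
routine = {k: dict(v) for k, v in last_month.items()}
for key in ("P0001", "P0002", "P0003"):
    routine[key]["claims"] = "1"

# An update touching 20 records (2%) is flagged for a follow-up question.
unusual = {k: dict(v) for k, v in last_month.items()}
for i in range(20):
    unusual[f"P{i:04d}"]["claims"] = "1"

print(looks_anomalous(last_month, routine))   # False
print(looks_anomalous(last_month, unusual))   # True
```

A check like this looks only at how much changed, not at whether any individual record is right.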

But the company does not check whether any of the individual records on these tapes and CDs are accurate.

"Almost all of the information we hold is information we acquired from a source," Curling says. For instance, he explains, "We buy [data] directly from the state. Is that accurate or not? Well, in our case, accuracy there would be dominated by the fact that we got it directly from the source. And we applied that in an electronic form to a file."

In effect, the company says it's responsible for making sure the data gets loaded into its systems correctly. And it counts on the data suppliers, such as the insurers or the states, to provide accurate data.

"ChoicePoint has exhibited the attitude, 'Oh, we just need to pass on this information. We're not responsible for whether it's accurate,'" says Evan Hendricks, editor and publisher of the Washington-based newsletter Privacy Times.

In addition to receiving data, ChoicePoint also goes out and gets data that it needs to build reports on individuals.

The company says it has electronic gateways into some databases, such as some state motor vehicle department files. With these gateways, the company can electronically collect driver names, license numbers and car registrations. ChoicePoint often pays agencies for their data—about $500 million a year to state motor vehicle departments alone.

Don McGuffey, the company's senior vice president for data acquisition and strategy, told the California State Senate Banking Committee on March 30: "We get updates from the various states' agencies regularly and rely upon the state agency to give us all the information and the complete information."

ChoicePoint also says it employs an army of researchers to verify information on, say, an employment form. They travel around the country, pen or PDA in hand, and stop by courthouses to write down data from bankruptcies, judgments, licensing sanctions and other proceedings. According to published statements by CEO Derek Smith, ChoicePoint collects up to 40,000 records manually a day through this network of employees and contractors.

The workers fax or e-mail the documents to ChoicePoint. Later, ChoicePoint checks their work.

"We will audit them periodically throughout the year, not only by randomly collecting individual records that they would have developed and returned to us; we would also require them to physically send in copies of documents periodically so a separate team of people could review the actual hard copies of documents in the courthouse versus what was actually entered and delivered back to us," McGuffey says.

And ChoicePoint does say it puts the data it collects through what's known as a data cleansing process. The company uses Firstlogic's Information Quality Suite software, which can drill through files looking for inconsistencies and, in some cases, can fix problems automatically. The software, for instance, will check a name and address against a U.S. Postal Service file. The software can also determine that data on "Smith, Sam" in one file also applies to "Samuel Smith" in another if there is other matching information, such as an address, so that the two files can be consolidated.
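The consolidation step described above can be illustrated with a small matching routine. This is a sketch of the general technique, not Firstlogic's software or its API: the nickname table, the field names, the sample records and the rule that an identical address must back up a name match are all assumptions made for illustration.

```python
# Tiny illustrative nickname table; a real product would use a far larger one.
NICKNAMES = {"sam": "samuel", "bob": "robert", "bill": "william"}


def normalize_name(raw: str) -> tuple:
    """Turn 'Smith, Sam' or 'Samuel Smith' into ('samuel', 'smith')."""
    raw = raw.strip().lower()
    if "," in raw:
        last, first = [part.strip() for part in raw.split(",", 1)]
    else:
        parts = raw.split()
        if not parts:
            return ("", "")
        first, last = parts[0], parts[-1]
    return (NICKNAMES.get(first, first), last)


def normalize_address(raw: str) -> str:
    """Lowercase, drop periods and commas, and collapse whitespace."""
    cleaned = raw.lower().replace(".", " ").replace(",", " ")
    return " ".join(cleaned.split())


def same_person(a: dict, b: dict) -> bool:
    """Treat two records as one person only if name AND address agree."""
    return (
        normalize_name(a["name"]) == normalize_name(b["name"])
        and normalize_address(a["address"]) == normalize_address(b["address"])
    )


# Made-up records for illustration.
file_a = {"name": "Smith, Sam", "address": "12 Oak St., Springfield"}
file_b = {"name": "Samuel Smith", "address": "12 Oak St Springfield"}
file_c = {"name": "Samuel Smith", "address": "98 Elm Ave, Riverton"}

print(same_person(file_a, file_b))  # True
print(same_person(file_a, file_c))  # False
```

Requiring a second matching field is what keeps two different Sam Smiths apart; a match on name alone would merge them.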

But no product will catch every inconsistency. Or typos. Or other input errors, according to English.

In addition, much of what the researchers gather comes from court documents and other public records and can contain wrong names and addresses, duplications and omissions.

Records "Full of Inaccuracies"

"We know that the public record is full of inaccuracies," says Marty Abrams, a former vice president of information policy and privacy at Experian who now heads the information policy group at Hunton & Williams, a business law firm that represents companies such as Bank of America and GE.

Many errors happen for the simplest of reasons. Say, for example, a government worker transposed two digits of Sam Smith's Social Security number. Four digits written as 1, 2, 3 and 4 on a property deed may get entered on an electronic form as 1, 2, 4, 3. From that moment on, whenever the data broker collected information on Sam Smith, it could pull data using both Social Security numbers. And if there were a person with the 1243 number who had declared bankruptcy, that information could be pulled into a report the data broker compiled on Sam Smith.
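A transposition like the one in this example is mechanical enough to test for. The sketch below flags two identifiers that differ only by one swap of adjacent digits; the function name and sample numbers are made up for illustration, and nothing in the reporting indicates that any data broker runs such a check.

```python
def is_adjacent_transposition(a: str, b: str) -> bool:
    """True if b equals a with exactly one pair of adjacent characters swapped."""
    if len(a) != len(b) or a == b:
        return False
    # Positions where the two strings disagree.
    diffs = [i for i in range(len(a)) if a[i] != b[i]]
    if len(diffs) != 2 or diffs[1] != diffs[0] + 1:
        return False
    i, j = diffs
    return a[i] == b[j] and a[j] == b[i]


print(is_adjacent_transposition("1234", "1243"))  # True
print(is_adjacent_transposition("1234", "1235"))  # False
print(is_adjacent_transposition("1234", "4231"))  # False
```

A hit does not prove that two records belong to the same person; it only marks a pair worth a second look before reports are merged.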

ChoicePoint's privacy policy notes that it will "follow reasonable procedures to assure maximum possible accuracy" of the information in Screening Network reports. However, according to the Subscriber Agreement the company requires people to sign before they use the service, ChoicePoint demands to be released from any liability for accuracy if the information is provided by third parties. " ... ChoicePoint cannot be either an insurer or a guarantor of the accuracy of the information reported ... " the agreement reads.

Inaccurate data is, if nothing else, persistent. Abel Obabueki found this out the hard way.

In 1999, IBM offered Obabueki an $85,000 marketing job, contingent on a pre-employment screening. Obabueki had pled no contest to a misdemeanor a few years earlier. After a successful probation period, the charges were dismissed and his record cleared.

IBM had Obabueki fill out a pre-screening questionnaire, with the instructions to reveal criminal convictions but omit arrests without convictions, and convictions or incarcerations "for which a record has been sealed or expunged." Obabueki correctly answered that he had no convictions.

IBM hired ChoicePoint to do the background check on Obabueki and instructed ChoicePoint to report only current and pending criminal charges. ChoicePoint, however, told IBM Obabueki had a conviction.

IBM rescinded its offer to Obabueki.

Obabueki explained what happened to IBM and sent the computer company a copy of the court order dismissing the charges against him. ChoicePoint later acknowledged its error, according to Obabueki's lawyers, and issued a revised report that listed Obabueki's criminal record as "clear."

IBM, however, did not re-offer the job to Obabueki, who sued ChoicePoint and IBM for Fair Credit Reporting Act violations. A U.S. federal court in New York heard the case.

IBM was eventually dismissed from the suit, but Obabueki won a jury verdict against ChoicePoint; the verdict found that the data merchant had not maintained strict procedures to ensure the information it reported was complete, up to date and accurate. ChoicePoint was told to pay Obabueki $450,000 for lost wages and mental distress.

The court, however, took it upon itself to dismiss the case, saying, among other things, that Obabueki had "cured" the data problem by sending IBM the court order that had dismissed the charges. His $450,000 award was erased.

Obabueki and his lawyers appealed the judge's decision and, when the appeal was denied, petitioned the U.S. Supreme Court. The high court, however, refused to hear the case.

"There is no dispute in this case that ChoicePoint failed to follow adequate procedures to verify the information it obtained and reported, that it failed to determine that the conviction had been vacated, and that, had it followed its legal obligations and its agreement with IBM, it would have accurately reported that petitioner had no convictions," stated the petition filed to the Supreme Court by Obabueki's lawyers, Gregory Antollino and Erik Jaffe.

Antollino says he thinks ChoicePoint hired a third party to gather data, and that the firm had an old file on Obabueki that it never updated before sending the file to ChoicePoint.

ChoicePoint's only comment on the case came in an e-mail from the company's chief marketing officer, James Lee: "ChoicePoint won this case at the Appellate Court, which overturned a trial verdict against ChoicePoint."

In addition to human error, the way ChoicePoint's computer systems handle data may cause inaccuracies.

ChoicePoint has hundreds of databases, many of them Oracle systems running on Hewlett-Packard Unix servers. These include two databases (one for property insurance and the other for auto) that make up its Comprehensive Loss Underwriting Exchange (CLUE), which stores 200 million insurance records.

If there is an error in a file, ChoicePoint can work to fix it, provided it is aware of the mistake. However, data suppliers often send in new tapes and CDs with updates. The data is loaded when it comes in and the old data is purged. If an error in a file isn't corrected at the source, the erroneous data will be reloaded into ChoicePoint's systems. "They just simply replace data each month," says the 20-year data expert. "The data loading techniques are not sophisticated."
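The cycle described here, in which a corrected record is overwritten by the next full reload, can be shown in a few lines. This is a toy model, not ChoicePoint's loading code: the record layout, the `corrections` overlay and the function names are assumptions made for illustration.

```python
def full_reload(supplier_file: dict) -> dict:
    """'Replace each month': the new load simply becomes the database."""
    return dict(supplier_file)


def reload_with_corrections(supplier_file: dict, corrections: dict) -> dict:
    """Alternative sketch: reapply known, already-fixed values after loading."""
    database = dict(supplier_file)
    database.update(corrections)
    return database


# Hypothetical supplier file that miscodes a water-damage claim as fire damage.
supplier_file = {"claim-1": "fire"}

# The broker fixes its own copy after a consumer dispute...
database = full_reload(supplier_file)
database["claim-1"] = "water"
corrections = {"claim-1": "water"}

# ...but the source was never fixed, so the next tape brings the error back.
database = full_reload(supplier_file)
print(database["claim-1"])   # fire

# Keeping a separate list of corrections lets the fix survive the reload.
database = reload_with_corrections(supplier_file, corrections)
print(database["claim-1"])   # water
```

The overlay is only a stopgap in this sketch; the durable fix is still a correction at the source that supplies the file.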

Data experts are aware of how these systems work. "It's a weird cycle," says Bruce Schneier, the founder of Counterpane Internet Security, a security software and consulting firm.

That cycling seems to be the culprit in the case of Mary Boris, a Kentucky woman who discovered in February 2000 that she had lost her homeowner's insurance. Boris had filed four insurance claims for water damage, but ChoicePoint's CLUE database was reporting them to her insurer as four claims for fire damage plus an "extended loss" claim.

Boris called the insurance company, ChoicePoint and the Kentucky Department of Insurance and got her report corrected. Months later, however, the bad data reappeared. ChoicePoint now reported she had made nine claims—four for water damage, four for fire and one for extended loss.

Boris filed suit, and ChoicePoint disclaimed all responsibility, saying it was the insurer's job to ensure that the coding on the claim reports was accurate. For 11 months after the suit was filed, the bad data remained on Boris' claims report, according to court records, until it was kicked out of ChoicePoint's computer because it had expired.

U.S. District Court Judge John G. Heyburn II, in awarding $350,000 in damages to Boris, noted that ChoicePoint showed "a complete lack of sympathy" for her problems. The judge also said the company never explained the "computer glitches" that apparently caused her problem. "To this day, the Court is still unclear what procedures, if any, ChoicePoint uses to ensure the accuracy of its mass circulated reports," he wrote. The case was later settled out of court.

ChoicePoint's only comment on the case came in the e-mail from chief marketing officer Lee: "The parties to this case entered [in]to a confidential settlement agreement."

California Insurance Commissioner John Garamendi has been trying to regulate the quality of data used by insurers since 2003, when his office saw consumer complaints against insurers rise to more than 100 a month. His latest effort is a bill that would force insurers to rely on "complete information" (not just historical databases like CLUE) when making underwriting decisions that go against consumers.

The insurance industry opposes Garamendi. "Only three changes are made per 10,000 files," says Dan Dunmoyer, president of the Personal Insurance Federation in Sacramento, Calif. "We get six complaints and three are justified. To us, three out of 10,000 is not a big number."

Crap In, Crap Out

Another issue with CLUE is the way the service pulls information from its data stores.

According to ChoicePoint's Curling, a typical request from an insurance company might be to find out about a driver, to pull the driving records and claims histories of any other people living at the driver's address, and to verify that the number of vehicles in the driver's household realistically matches the number listed by the driver on his or her policy application.

A query from an insurance company might start with a name and state—say, Sam Smith and California.

ChoicePoint will pull data from the state's motor vehicles department, which will provide information on all drivers at an address. It will also pull car registrations. It will then dig into CLUE to get the claims histories.

But you have to be careful when you cull any database, say English, Navesink Consulting president Tom Redman and other data experts. There may be a lot of Sam Smiths. Many might be living in the same area and be about the same age. And the query might return files under "Sam Smith" and "Samuel Smith."

"So when you marry this data up, you have to interpret who this person is," the 20-year data expert explains.

Not only that, but if one of the Sam Smiths' driver's license numbers or Social Security numbers was put incorrectly into the system and wound up matching someone else's number—even if that person's name isn't Sam Smith—that information could be melded into the report, too.
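The risk described above, where a shared identifier pulls a stranger's history into a report, can be sketched as the difference between joining on one field and requiring a second field to agree. The records, field names, identifier values and the choice of last name as the confirming field are all hypothetical; this is not a description of how CLUE queries are actually written.

```python
# Made-up claim records. The second one was keyed in with the wrong number.
claims = [
    {"ssn": "000-00-0001", "name": "Sam Smith", "claim": "fender bender, 2003"},
    {"ssn": "000-00-0001", "name": "Pat Jones", "claim": "total loss, 2004"},
]


def last_name(full_name: str) -> str:
    """Last whitespace-separated token of a name, lowercased."""
    return full_name.split()[-1].lower()


def report_by_id_only(ssn: str) -> list:
    """Pull every claim filed under the identifier, whoever's name is on it."""
    return [c["claim"] for c in claims if c["ssn"] == ssn]


def report_with_cross_check(ssn: str, name: str) -> tuple:
    """Accept claims whose last name also matches; set the rest aside for review."""
    accepted, held = [], []
    for c in claims:
        if c["ssn"] != ssn:
            continue
        target = accepted if last_name(c["name"]) == last_name(name) else held
        target.append(c["claim"])
    return accepted, held


print(report_by_id_only("000-00-0001"))
print(report_with_cross_check("000-00-0001", "Samuel Smith"))
```

Holding mismatches for review rather than silently dropping them matters, because the name itself may be the field that was mistyped.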

This may be what happened to Robert Burkhead, a retired GE assembly line worker.

In 1998, Burkhead was buying a car in Kentucky and tried to get car insurance. But, he says, his insurance history, which was supplied by ChoicePoint, showed that another person's Social Security number, that of a "K. Caye," had been added to his file, along with Caye's auto accidents. Not surprisingly, this put Burkhead into a higher risk category, with premiums that were $500 to $600 higher than the standard annual rate, according to his attorney, Bernard Leachman.

After Burkhead got a look at his report and figured out what was going on, he tracked down Caye's agent at Nationwide Insurance, which apparently had been using Burkhead's Social Security number with Caye's account. Burkhead said he sent a letter to ChoicePoint and also placed a couple of calls to customer service reps at ChoicePoint, who said they would take care of the mistakes. But nothing happened. This went on for five years.

Burkhead's headache got worse. In 2003, Burkhead claims, ChoicePoint substituted his driver's license for Caye's and added multiple claims by Caye to Burkhead's report.

In April 2004, six years after the mixup began, Caye and his accidents were finally deleted from Burkhead's report. But even that didn't end Burkhead's problems.

Burkhead says his son, Robert Burkhead III, somehow was added last year to his report and listed as living with his parents—even though the son lives on his own and was filing for both bankruptcy and divorce.

Burkhead had had enough, and last year he sued ChoicePoint. The complaint filed in the case cites ChoicePoint's "gross, wanton, willful and malicious faulty management of data and reports." Leachman says ChoicePoint does not scan claim reports for accuracy when they enter or leave its system. "There is a cross-check of birth date and Social Security number and driver's license number and similar names that the computer automatically does, but there is no reason I know of why Mr. Caye kept showing up on these reports," he says.

ChoicePoint refused to comment on an ongoing case. However, in its answer to the complaint, ChoicePoint admitted that it had received a letter and phone calls from Burkhead, and that Nationwide had submitted information to CLUE that tied Burkhead's driver's license with Caye's claims.

However, ChoicePoint said "it lacked sufficient knowledge and information" to know whether the other complaints lodged against the company were accurate. And the company denied any wrongdoing on its part. "Plaintiff's damages, if any were incurred, are as a result of the actions or omissions of other parties or individuals for whom ChoicePoint has no responsibility," reads the answer to the complaint.

Indeed, ChoicePoint's service agreement with insurance carriers stipulates that, "Neither CPS [ChoicePoint] nor third parties shall be liable ... for any loss or injury arising out of or caused in whole or in part by CPS's or third parties' negligent acts or omissions in procuring, compiling, collecting, interpreting, reporting, communicating or delivering services or in otherwise performing this agreement."

That gets right to the crux of the problem. A broker such as ChoicePoint does not vouch for the accuracy of the data it collects and then resends to insurance companies, law enforcement agencies and employers.

As the 20-year data expert puts it: "Crap in, crap out."

Despite the disclaimers, data experts, including English and Redman, say data brokers should be responsible for the accuracy of the information they distribute. "They don't verify. They just capture," English points out. "That's got to change."
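
To see how one bad identifier can pull a stranger's history into a report, consider a minimal Python sketch. Everything in it is hypothetical: the record layout, the two matching functions and the sample values are illustrations only, not ChoicePoint's actual schema or matching logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """One row from a hypothetical source file (all values are made up)."""
    name: str
    birth_date: str
    ssn: str
    license_no: str
    claims: int = 0


def match_on_ssn(applicant: Record, rows: list[Record]) -> list[Record]:
    """Naive linkage: any row sharing the applicant's SSN is treated as his."""
    return [r for r in rows if r.ssn == applicant.ssn]


def match_on_agreement(applicant: Record, rows: list[Record],
                       min_fields: int = 3) -> list[Record]:
    """Stricter linkage: keep a row only if enough identifying fields agree."""
    def agreeing(r: Record) -> int:
        # Count how many of the four identifying fields match the applicant.
        return sum([
            r.name.lower() == applicant.name.lower(),
            r.birth_date == applicant.birth_date,
            r.ssn == applicant.ssn,
            r.license_no == applicant.license_no,
        ])
    return [r for r in rows if agreeing(r) >= min_fields]


applicant = Record("Sam Smith", "1950-01-01", "000-00-0001", "K100")
rows = [
    Record("Sam Smith", "1950-01-01", "000-00-0001", "K100", claims=0),
    # Another driver whose SSN was keyed in wrong and now collides.
    Record("Pat Jones", "1972-06-30", "000-00-0001", "K999", claims=4),
]

naive = match_on_ssn(applicant, rows)
strict = match_on_agreement(applicant, rows)
print(sum(r.claims for r in naive))   # claims attributed by the naive match: 4
print(sum(r.claims for r in strict))  # claims attributed by the strict match: 0
```

In this toy data, matching on the Social Security number alone charges the applicant with four claims that belong to someone else, while requiring three of four fields to agree leaves him with none. Stricter rules carry their own cost: an exact name comparison like this one would treat "Sam Smith" and "Samuel Smith" as different people, so a real system would still have to interpret who the person is.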

Fix It Yourself

ChoicePoint says the building blocks of its business are data, analysis and distribution—and the company's most publicized failing of late has been in data distribution.

California law requires companies to notify residents if their personal information is compromised. ChoicePoint in February sent letters to the 35,000 Californians affected by last year's breach to tell them that someone unauthorized "may" have accessed their data.

One recipient of the letter, a Los Angeles nurse named Elizabeth Rosen, contacted a law professor, a reporter at MSNBC and an attorney with EPIC.

"I had never heard of ChoicePoint," Rosen says. "I didn't know if I should be concerned."

The thieves did not breach ChoicePoint's databases—they signed up for a ChoicePoint service called AutoTrackXP using fake identification and business licenses.

AutoTrackXP is a search service for law officers, collection agents and others looking to track people down. According to the company's Web site, with nothing more than a name or Social Security number, a customer can search for identity information, the names of relatives and associates, property records and deed transfers. AutoTrackXP searches billions of records in national and state databases for summary assets, licenses and criminal records, and compiles the information into reports.

ChoicePoint says it has tightened its procedures for selling products containing sensitive personal data like Social Security and driver's license numbers—for example, the credentials of all small-business customers are being reviewed. As a result, the company expects to lose $15 million to $20 million in revenue this year.

Rosen is not comforted. As of mid-May, she said she still didn't know exactly what, if any, of her data ChoicePoint sold to the Nigerians.

But, to Rosen, perhaps just as shocking as the possible data theft was the number of inaccuracies on her ChoicePoint report. Five of the report's six pages contained repeated errors.

For example, she was incorrectly listed as an officer of a Texas company that went bankrupt and as a deli owner. Her nursing licenses in California and New York were omitted.

"[They told me] to call each source of information, get them to change it and submit the corrected information to ChoicePoint," she says. "I said, 'You've got to be kidding.'"

ChoicePoint itself points out all the errors that can come out of AutoTrack. A sample report posted on its Web site indicates that partial birth dates, typos, bad Social Security numbers and a half-dozen other mistakes are all possible.

ChoicePoint, however, argues that this is a search service—a tool for investigators looking to find out who lives at an address or to locate a witness. The service does not fall under FCRA guidelines. And customers who use the service are aware of its limitations, Curling says.

In May, the California Senate passed a bill that would allow residents to see files kept on them by ChoicePoint and other data brokers, to correct any inaccuracies and to know who requests information about them. One obstacle for people like Rosen who want to see and correct their data is that ChoicePoint doesn't keep static reports on people; it conducts fresh public records searches as they're requested.

But ChoicePoint's McGuffey adds that the company is thinking about how to accommodate requests like Rosen's. Meanwhile, Congress is also considering several bills, including ones similar to California's notification law.

ChoicePoint's carelessness with Californians' data hasn't stopped the company and the state from doing business, however. On April 22, ChoicePoint announced an $845,500 contract with the California Department of Justice to build a "distributed network" so the state's criminal intelligence analysts can assess information without relying on a central database. The state attorney general's office, which is simultaneously investigating ChoicePoint for the security breaches, says the network will tie together various public and private databases, but ChoicePoint won't be handling the data.

No Way To Check

Unfortunately for Steven Calderon, he could not check or dispute whatever data ChoicePoint had on him before the Anaheim police arrested him at Fry's in January 2002.

Calderon spent a week being transferred from jail to jail in Southern California. He continued to protest his innocence, but the police were unmoved.

Only after an officer in Ventura noticed that Calderon's driver's license number, height and weight did not match those on file for the suspect being sought did police check his prints and realize they had the wrong man. So they let him go. He returned to work at Fry's. A year later, he filed suit against Fry's, ChoicePoint, the city of Anaheim and the California counties of Orange and Ventura.

Everyone who spoke about the case—Fry's and Anaheim officials—expressed regret that it happened. But all said their actions were correct, or at least understandable, given the situation. Calderon's lawsuits against the city and counties were dismissed. Mark Facer, the attorney for Anaheim, says the police acted on an arrest warrant that looked valid.

ChoicePoint's only comment on the case is that the plaintiff dropped the complaint. Calderon could not be reached for comment, but Stephen Gargaro, Calderon's attorney, says he requested a dismissal—which the court granted—because he felt "the law was on the edge as to whether they [ChoicePoint] had liability."

Fry's attorney, Alex Curotto, "disputes vehemently that [Fry's] did anything wrong." Calderon settled with Fry's, but the company's community relations manager, Manuel Valerio, says Calderon should have told Fry's when he was hired about the ID theft. "The crime of identity theft is compounded by not making the employer aware," he says.

Such finger-pointing is common in cases where bad data is involved. Bad data sets off a chain of errors, in which each misstep is linked to the last. Calderon told the court he'd forgotten about the identity theft, although he certainly would have told Fry's if he had known he'd spend a week in jail.

But it never occurred to Calderon to refuse a background check. As he told the court: "My record was clean."

ChoicePoint Data at a Glance

Headquarters: 1000 Alderman Drive, Alpharetta, GA 30005

Phone: (770) 752-6000

Business: Collects, stores and distributes personal information on consumers to help organizations reduce fraud and mitigate risks.

Chief Executive Officer: Derek Smith

Financials in 2004: $148 million in net income; $918.7 million in revenue; net profit margin of 16%.

Challenges: Repair reputation and raise revenue, despite data-accuracy and security obstacles.

Baseline Goals:

  • Increase revenue at least 14% to $1.1 billion, from $918.7 million.
  • Maintain free cash flow at between $180 million and $190 million in 2005, compared to $182 million in 2004.
  • Sustain operating income margin in 2005 at 26% of sales.
  • Limit 2005 revenue loss from decision to reduce access to public records to $20 million.