Preface for Data Cartels
PREFACE
I GOT INTO STUDYING DATA ANALYTICS companies by accident. I’m a law professor, and for over a decade, I was also a law librarian who worked at law schools and law firms. That means I’ve spent a lot of time using Lexis (RELX’s legal information platform) and Thomson Reuters’s Westlaw, the “gold standard” research products for the legal profession. In my role as a librarian, I spent so much time training students and colleagues to use these products that some days I felt like little more than a glorified product rep for their parent companies, RELX and Thomson Reuters.
In 2017, someone sent me a news article reporting that LexisNexis, another division within RELX, and Thomson Reuters were vying to help U.S. Immigration and Customs Enforcement (ICE) build its “extreme vetting” surveillance program.1 By 2017, ICE had solidified its reputation as the cruel immigration police ruthlessly separating families and deporting people who’d lived in the United States for their whole lives. This news story about LexisNexis and Thomson Reuters came out amidst other reports of ICE agents raiding workplaces and holding children in cages. So I was stunned to see the signatures of LexisNexis and Thomson Reuters employees who attended ICE events meant for companies trying to do surveillance work for the nation’s cruelest cops. Suddenly, the stacks of Lexis and Thomson Reuters books, records, and the Westlaw and Lexis printouts cluttering my office seemed unsavory.
Not only was the possible Westlaw-Lexis-ICE connection unsavory, but using legal research products linked to ICE surveillance could also run counter to ethical convictions held by both librarians and lawyers. Both professions prize confidentiality, and they protect people from unwarranted government intrusion. If Lexis and Thomson Reuters were working for ICE, it would raise uncomfortable questions: Was I training immigration lawyers to use products that help ICE arrest their clients? Were people who used my library giving their data to ICE?2
My concerns intensified when I tried to discuss the issue with the companies. I wondered how, as a profession, we’d failed to notice that Westlaw and Lexis’s companies were also building surveillance products for police and ICE agents, so I asked my school’s Lexis and Westlaw representatives about what their companies were doing with ICE. The companies’ product reps were usually helpful and eager to assist, so I expected prompt, reassuring responses. I was accustomed to the companies lavishing us with gifts and attention. When I was in law school, I got points for using Westlaw, which I eventually turned in for a designer handbag and silver watch. The companies even hired attorneys who were available 24/7 to answer our research questions by phone or online chat. I knew them as benevolent, generous entities that wanted to please their customers.
“Surely this is a misunderstanding,” I thought, hoping the people who worked at Westlaw and Lexis could clear up the issue with an easy explanation. I wanted to find peace of mind with companies that I’d trusted, and that I relied on for so much of my job. But when I brought my concerns up directly with vendors, they didn’t do anything to assure me that our data was safe from being sold to ICE or law enforcement. They weren’t even willing to issue a promise to my law school that our personal data wouldn’t end up in ICE’s data systems.
RELX, especially, exerted its power. A Lexis representative started camping out at my law school, calling my work phone, and my boss, demanding that I speak to her manager. She also started monitoring me through my students, asking them to report back to her if I talked about LexisNexis’s ICE contracts.3 A blog post that a librarian and I wrote about the issue for the American Association of Law Libraries was erased moments after it was posted, and the organization replaced it with a single sentence: “This post has been removed on the advice of General Counsel.”4 The takedown was an aberration: librarians are not the type to censor. They’re usually the ones fighting censorship. Spurred by this baffling takedown, I started digging for answers to my questions about the companies’ data businesses in earnest.
Ever since, I’ve been trying to connect the hidden informational world of data analytics companies like RELX and Thomson Reuters. I wasn’t trying to find proof that these companies were harming the public—I was actually trying to do the opposite. Learning that Westlaw and Lexis’s corporate overlords were helping ICE track people and feeding problematic predictive policing systems with flows of personal data shattered my career. I couldn’t, in good conscience, train hundreds of new lawyers to rely on companies that were selling their future clients’ data to police and ICE officers. Law enforcement data brokering conflicted with the confidentiality obligations and zealous advocacy responsibilities inherent in the immigration law and criminal defense work that most of my students go on to do. I kept hoping that I’d find something that would prove the ICE story wrong.
But the more desperately I searched for information that would prove that I was mistaken, the more my research surpassed my worst suspicions. I uncovered a web of information and data analytics products that hurt people in multiple information markets. These companies were building policing products. They were also making “risk” products that help our landlords, bosses, insurers, and healthcare systems decide whether to give us care and services that we need. They were selling “academic metrics,” data analytics products that determine whether scholars will get tenure or grant funding using outdated, elitist, and discriminatory ratings systems. They were paywalling critical information, including law and science, so people who aren’t wealthy, or affiliated with wealthy institutions, can’t get access.
With every information market I investigated, I found a different group of people uniquely oppressed by data analytics companies. Immigrants’ rights groups are fighting to get data analytics companies’ invasive personal data dossiers removed from ICE’s digital surveillance programs. People of color are disproportionately made targets by predictive policing algorithms. Academic scholars’ research is trapped behind the companies’ paywalls, and their funding and tenure depend on the companies’ metrics. Pro se litigants, including prisoners, can’t see the legal information that’s relevant to their cases because it’s only available on the companies’ platforms. The best financial data is only accessible to people who can afford pricey data analytics products. Data analytics companies have even paywalled news resources from the public, making it so that people can’t get critical updates on local and national emergencies.
The companies hadn’t always been quite as ethically fraught as they’ve become in recent years. When I was introduced to these companies as a new librarian in the early 2000s, they called themselves academic, legal, news, and financial “publishers.” Their various publishing brands seemed like separate, standalone platforms: Elsevier, LexisNexis, Thomson Financial, Reuters news, and Westlaw had different purposes and unique customers. But over the past decade, the companies switched their business models from publishing to data analytics, winding down their publishing businesses while spending billions of dollars to hire thousands of technologists and build data analytics development labs.
Today, the companies are no longer publishers that sell journals or casebooks; they’re data analytics firms that sell “risk solutions” and “business insights.” They sell informational content, a slurry of structured information and unstructured data points. They sell published content, and they also sell predictions and prescriptions they make with their algorithms, machine learning, “AI,” and other data-crunching technologies. They make “academic insights,” “legal insights,” “financial insights,” “law enforcement insights,” and any other “insights” their technologists can formulate.
As data analytics companies, these gigantic corporations dominate multiple information markets with some of the biggest troves of personal data and academic, legal, financial, news, and other information on the planet. They can use these raw informational ingredients to cook up even more information products in their technology labs. They can sell raw data, structured information, and analytics-driven “answers” to a broad range of consumers across major industries and institutions.
It might be fine to have information markets run by a few data analytics companies if the companies were operating with good ethics and public interest goals in mind, working towards providing better access to essential information and protecting intellectual freedom.5 But my research has led me to believe that the dominant data analytics firms are not operating with the public interest in mind. Instead, they’re impeding information access and privacy, privatizing and paywalling information that should be public, and exploiting people’s personal data. Just before this book went to press, two different public interest organizations filed separate reports to the Federal Trade Commission (FTC) complaining about how the companies’ strongholds in the legal information market and the academic information market were harming consumers.6 One complained that Westlaw and Lexis were blocking consumers’ access to “the laws of the land” even though access to the law is guaranteed by law. The other warned that Elsevier’s stronghold on the academic journal market puts government-funded scientific research on platforms that subject their visitors to personal data collection without an opt-out.
Perhaps the most shocking aspect of the FTC reports is that most people, and possibly even people working at the FTC, don’t know that Lexis and Elsevier are parts of the same company. The companies have hidden the full spectrum of their business models. They keep their products in the various information markets relatively siloed so consumers can’t see the full span of their data analytics business. Few people know how much information these companies control. They don’t know that Lexis is also ICE’s biggest data broker, or that Elsevier’s academic journals and BePress’s journal preprint sharing system are under the same corporate umbrella, giving RELX primary control of both published and unpublished scholarship systems.
Data technologies are simultaneously miraculous tools for sharing knowledge and dangerous tools for controlling information flows. Like most people, I was in awe of the first data technologies that companies, including data analytics firms, provided. Databases were life-changing inventions. It was thrilling to be able to pull up news articles by the dozens instead of rummaging around in dusty old stacks or squinting through the viewfinder of a microfilm machine. When Westlaw’s KeyCite replaced paper Shepard’s legal citator volumes, it felt fantastical that a tiny icon could indicate whether the law was still good, like some magic informational thermometer popping out of a judicial turkey.
But as information technologies and platforms have proliferated, I’ve grown more wary of the seemingly miraculous computers and data systems, and the private companies that are building our new information infrastructure. A lot of the “bells and whistles” on our digital products take power away from the humans making choices, replacing our wishes with automated selections that benefit the data companies by forcing certain content in front of readers and making other content prohibitively expensive or harder to find. Our informational services also seem to be collecting more and more personal information about us, forcing us to trade our privacy in order to access information. In addition, they’re taking over markets by squashing competitors with lawsuits and hostile takeovers, pushing out startups that have more public interest-oriented offerings. A few companies are cutthroat, dominating information markets by crushing innovative new information enterprises.
The chapters of this book share what I’ve discovered as I’ve peeled back the lid on this can of worms I opened by accident back in 2017. Along the way, I’ve met activists and workers in every informational market I discuss. Their stories show that data analytics companies don’t just hurt libraries or financial markets; they have an impact on real people who rely on information, which is all of us.
Notes
1. Sam Biddle and Spencer Woodman, “These Are the Technology Firms Lining Up to Build Trump’s ‘Extreme Vetting’ Program,” The Intercept, August 7, 2017, https://theintercept.com/2017/08/07/these-are-the-technology-firms-lini….
2. Joe Hodnicki, “Does WEXIS Use Legal Search User Data in Their Surveillance Search Platforms?” Law Librarian Blog, July 16, 2018, https://perma.cc/MQ2V-HDXG.
3. The representative even sent an email to students accusing me of purposely and repeatedly “spreading incorrect information.” Email on file with author.
4. https://perma.cc/9UQR-YVXE. See also Joe Hodnicki, “AALL’s ‘Extreme Vetting’ Removes Post on Professional Ethics for Suggesting Collective Action by AALL Readers,” December 13, 2017, https://perma.cc/26PZ-7LZB.
5. Guaranteeing access to information is so central to our modern existence that we’ve dedicated a whole field of study, information science, to setting standards for collecting, storing, retrieving, and using information. We’ve also coined the concept “intellectual freedom” as a foundational ethical principle that governs our information flows. Intellectual freedom doesn’t have a strict definition, but it’s the idea that people should have the freedom to research, and access, information free from censorship and surveillance. Office for Intellectual Freedom, Intellectual Freedom Manual, 8th ed. (Chicago: ALA Editions, 2010), 12.
6. The two groups that filed FTC reports are Public.Resource.Org and SPARC (the Scholarly Publishing and Academic Resources Coalition). Lisl Dunlop, John O’Toole, and Sam Sherman, “Submission to Federal Trade Commission on Behalf of Public.Resource.Org,” Axinn, October 29, 2021, https://law.resource.org/pub/us/case/ftc/Public%20Resource%20FTC%20Submission.pdf; “Opposing the Merger Between Clarivate PLC and ProQuest LLC,” SPARC, October 22, 2021, https://sparcopen.org/wp-content/uploads/2021/10/SPARC-FTC-Letter-in-Opposition-to-the-Clarivate-ProQuest-Merger.pdf.