In early 2012, the New York Times Magazine put out a cover story about Andrew Pole, a statistician working for Target who was tasked with inventing a way to identify potentially pregnant shoppers, even if those shoppers didn’t want the company to know. The rationale, Pole said, was that moms-to-be are a multi-million dollar market, and Target wanted a way to pepper these moneymakers with promos and coupons before its competitors did the same.
Pole obliged. After crawling through a mountain of sales data from statewide shoppers on Target’s public baby registry, he came up with a “pregnancy prediction” score that the company would internally assign to each of its regular customers. If you believe the rumors (not everyone does!), Target’s algos were so accurate that the company sent coupons for cribs to a teenage girl before her own father knew she was due.
A decade later, the story reads less like a quirk of capitalism and more like an ominous sign. Now it’s not just Target; every company is hounding you for data. And thanks to the Supreme Court’s decision to overturn Roe v. Wade, a good chunk of the nation’s police and private citizens can go after people seeking abortions and the doctors who would serve them if there’s enough evidence.
And in 2022, there is plenty of data to go around and plenty of players willing to pawn it off if the price is right. A Gizmodo investigation into some of the nation’s biggest data brokers found more than two dozen promoting access to datasets containing digital information on millions of pregnant and potentially pregnant people across the country. At least one of those companies also offered a large catalogue of people who were using the same sorts of birth control that are being targeted by more restrictive states right now.
In total, Gizmodo identified 32 different brokers across the U.S. selling access to the unique mobile IDs from some 2.9 billion profiles of people pegged as “actively pregnant” or “shopping for maternity products.” Also on the market: data on 478 million customer profiles labeled “interested in pregnancy” or “intending to become pregnant.” You can see the full list of companies for yourself here.
In all cases, these datasets were sold on what’s known as a “CPM” or “cost per mille” basis, which essentially means that whoever buys them only pays for the number of end-users who are reached with a given ad. Depending on who was offering up a dataset, the price ranged from 49 cents per user reached to a whopping $2.25.
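To make that pricing concrete, here is a rough Python sketch of what a buyer's bill could look like at the low and high prices quoted above. The campaign size of 10,000 users and the function name are made up for illustration. One caveat: "mille" is Latin for thousand, and CPM rates are conventionally quoted per thousand impressions, so the sketch also shows that reading. Which reading applies to these particular listings is an assumption, not something the listings confirm.

```python
def campaign_cost_cents(users_reached: int, rate_cents: int, per: int = 1) -> int:
    """Cost in whole cents of reaching `users_reached` people
    at `rate_cents` per `per` users (fractions of a cent are dropped)."""
    return users_reached * rate_cents // per


USERS_REACHED = 10_000  # hypothetical campaign size, for illustration only
LOW_RATE_CENTS = 49     # cheapest price cited in the article
HIGH_RATE_CENTS = 225   # highest price cited in the article

# Compare the article's per-user framing with the conventional
# per-thousand meaning of CPM (the latter is an assumption here).
for label, per in (("per user", 1), ("per thousand users", 1000)):
    low = campaign_cost_cents(USERS_REACHED, LOW_RATE_CENTS, per)
    high = campaign_cost_cents(USERS_REACHED, HIGH_RATE_CENTS, per)
    print(f"{label}: ${low / 100:,.2f} to ${high / 100:,.2f}")
```

Either way, the bill scales with the number of people actually reached, not with the size of the dataset on offer.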
The datasets offer information on some 3.4 billion people in total, though how many unique individuals those data cover is unclear, as the datasets obviously overlap. Multiple brokers are likely hawking the same information, as half the world does not live in the United States, and half the world is not pregnant. Their sources do differ, however. Some brokers were gleaning this information directly from pregnant people who had agreed to have their data shared through these channels when they signed up for coupon sites or downloaded a given app. In other cases, these companies were doing exactly what Target had done all those years before: rather than collecting data from end-users who were explicitly saying they’re pregnant, the brokers modeled a core base of potentially pregnant users with internal data analysis.
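The gap between a headline total and the number of real people behind it can be sketched in a few lines of Python. The broker names and device IDs below are invented for illustration; the actual listings only advertise profile counts, so the true degree of overlap is unknown.

```python
# Hypothetical device IDs held by three made-up brokers.
broker_a = {"id-001", "id-002", "id-003", "id-004"}
broker_b = {"id-003", "id-004", "id-005"}
broker_c = {"id-001", "id-005", "id-006"}

listings = [broker_a, broker_b, broker_c]


def summed_profiles(listings):
    """Add up each listing's advertised size, counting repeats every time."""
    return sum(len(ids) for ids in listings)


def unique_profiles(listings):
    """Count each device ID once, no matter how many brokers sell it."""
    return len(set().union(*listings))


print("profiles on offer:", summed_profiles(listings))
print("unique devices:", unique_profiles(listings))
```

In this toy case the three listings add up to 10 profiles but cover only 6 distinct devices. Without access to the underlying IDs, an outside observer can compute only the first number.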
Gizmodo was able to find likely data sources for 19 of the data brokers by scouring announcements about past partnerships and integrations. For the rest of these players, the mind-boggling complexity of the data-sharing ecosystem meant it was impossible to suss out where, exactly, they were deriving their data. Eerie.
In one case, for example, a company called AlikeAudience was selling access to an estimated 61 million iOS users who were at a “Pregnancy & Maternity Life Stage,” but the listing didn’t go into detail about the source of that data. It simply notes that “AlikeAudience collects data from various sources such as users’ mobile app downloads & usage, geolocations, public records such as POI and self-declared information.”
One possibility is that AlikeAudience leveraged its relationship with Mastercard to see who was buying items in the “Maternity Care” category. While the company’s listing didn’t go into specifics about what counts as a “maternity care” product, you can kind of fill in the blanks yourself: maternity clothes, prenatal vitamins, etc.
After the publication of this story, AlikeAudience said in a statement, “The health audience segments at AlikeAudience, particularly pregnancy-related segments, are not collected from transaction or credit card data,” adding that the company “does not reveal individual information, only predictive user groups based on aggregated data.”
Another data broker called Quotient was more explicit, offering marketers access to the iOS and Android devices of 9.6 million “pregnancy test kit” and 960,000 “female contraceptive” buyers.
Quotient didn’t make it clear in either of those cases where it was getting that purchasing data from, but Gizmodo’s investigation revealed that the company also owns the popular couponing site, coupons.com. The site has offered coupons for products like Plan B in the past, though it does not currently. Gizmodo also found that Quotient had access to purchasing data from shoppers at Giant Eagle—a chain of small pharmacies in the Northeast and Midwest—via a proprietary ad network the data broker operates.
Quotient has yet to respond to Gizmodo’s requests for comment.
In an email statement, a spokesperson for Mastercard said the company only uses “anonymized transaction data” to gather data at the postal code level. As shown in the image above, though, AlikeAudience claims it can create links between such anonymized IDs and users who “voluntarily” give up their data. Mastercard further said it limits how insights from data may be used, but did not clarify in which ways partners were limited.
“When we hear about data that impacts the privacy of people seeking reproductive care, oftentimes it’s easy just to think about period tracking apps or the name of a person who visits an abortion clinic,” said Justin Sherman, a cybersecurity fellow at the Atlantic Council who focuses specifically on data privacy. “But there are whole categories of data around ‘maternal products,’ for example, that also threaten those people’s privacy. It’s really startling to see a lot of that data here.”
As Sherman pointed out, any person seeking reproductive care in the U.S. right now is leaving behind a “massive digital footprint” that they might not always be considering. In the past, for example, we’ve seen at least one case of a woman’s Google Search queries being used to prosecute her in her stillborn baby’s death. Even if someone deletes one of those pesky period tracking apps, Sherman went on, there are still websites potential parents might visit, or posts they’ll make on social media, that might give them away anyway.
Bennett Cyphers, a staff technologist for the Electronic Frontier Foundation, said these commercial data brokers are “a big risk” for abortion seekers since those companies “label people and put people into lists that makes it easier for someone who is coming at it like a fishing expedition to narrow down who they want to target and subject them to more scrutiny or and surveillance.”
Gizmodo was able to find each of these datasets up for sale through Liveramp, a company that, in part, functions as a clearinghouse and distribution hub for countless data brokers’ wares. Liveramp did not put any restrictions on buying two-thirds of the databases Gizmodo found. The minority did come with purchasing conditions: one dataset containing a collective 2,030,000 iOS and Android users who were “interested in pregnancy,” for example, required authorization from Liveramp before purchasing. The same went for another dataset of 5,400,000 iOS users who were labeled “expectant” mothers, and another dataset of 17,000,000 iOS users whom one broker had labeled as “likely to have a baby in the next year.”
Ultimately though, these minor hurdles can be bypassed by just cutting Liveramp out of the equation entirely and going directly to the smaller broker selling that data instead. This approach is “a zillion times easier,” said a product manager working for one popular data broker, who spoke on the condition of anonymity.
Pregnancy data is poised to be a huge boon for law enforcement in the post-Roe era. If you’re a cop, the product manager said, it’s as easy as “filling out [a broker’s] ‘contact us’ form and ask how much it costs. Maybe they say ‘ACAB, pound sand!’ But more likely, they’ll say ‘Put another zero after it, and see if we say yes.’”
“This is purely speculative, but there’s clearly precedent in this industry for selling to law enforcement,” he went on. “And if you don’t do it, someone else probably will.”
Federal law enforcement has been all over data brokers’ and apps’ troves for years. Just recently, a watchdog group revealed that Coinbase had been selling data on crypto users to U.S. Immigration and Customs Enforcement. In a study produced by the nonprofit Center for Democracy and Technology in 2021, researchers showed that agencies were exploiting loopholes in the Electronic Communications Privacy Act by purchasing data from brokers. Just this month, in fact, documents obtained by the ACLU revealed that border patrol officers were collecting location data from phone-owners spread across the southern border every minute. Most of that data came courtesy of a contract with Venntel, a location data broker that itself is a subsidiary of Gravy Analytics, an adtech firm also specializing in location data.
Gravy Analytics was also a name that showed up in Gizmodo’s search for companies brokering maternity data. The company boasted access to about four million iOS and Android devices from people who had recently shopped for maternity clothing, based on “100% deterministic location data, collected via [software development kits] embedded in mobile applications.” Meanwhile, Gizmodo also found another location data broker, Cuebiq, offering access to the devices of 11 million Android owners who recently visited maternity “destinations.”
In an email statement, Cuebiq claimed that the maternity destination tag was for stores selling kids apparel or toys. The company further said the dataset doesn’t include “sensitive data” related to healthcare, and that it has a policy against having a commercial relationship with federal or local law enforcement, or “anti-abortion activists.”
“After the overturn of Roe v. Wade last month, we also formalized a policy to legally challenge any warrant or subpoena related to reproductive healthcare cases in states that outlawed abortion,” the company said.
A spokesperson for Gravy Analytics said their data is based on foot traffic in maternity stores, further claiming they don’t share data with law enforcement while also pointing to a recent blog post from their chief privacy officer about their company’s efforts to “protect” user health data post-Roe.
In both cases, it’s almost impossible to know which apps each company is sourcing this location data from. Instead of maintaining a direct relationship with people’s apps, most of these outfits source their data from other brokers, which source their data from other brokers, which source their data from... You get the idea.
Some companies have claimed that the data included in these datasets comes from aggregated sources and doesn’t include any personally identifiable information, or “PII” in industry-speak. Still, it’s relatively trivial for anybody with the right knowhow to tie that information back to individual online users.
The Electronic Frontier Foundation said it best in a recent post about companies like Venntel: “The developers of the apps fueling this industry likely have no idea where their users’ data ends up. Users, in turn, have little hope of understanding whether and how their data arrives in these data brokers’ hands.”
But thanks to heaps of newfound regulatory scrutiny—not to mention recent privacy updates that Apple and Google are unleashing on their app stores—that data is getting harder to collect, according to the product manager Gizmodo spoke with.
“We routinely get updates about how we can expect scale to go down in the raw data. Most apps don’t get to access location all the time anymore,” he said—and that means data brokers, in turn, can’t get a full picture of your location, either.
This doesn’t mean authorities aren’t buying people’s location data anymore—in fact, ICE signed another contract with Venntel this past November, which isn’t expiring until June of next year. But even that contract “is a bit like throwing the dice,” he said.
“If the police buy data broker A, but not data broker B, they’ll end up with a partial view of who was at location X,” he explained. “But then you need to ask—does the hypothetical use of data as a way to indict people hinge on what free game app they play? Like, ‘Oops sorry, Temple Run sold you out, should have played Alphabears.’”
Kade Crockford, the Technology for Liberty program director at the ACLU of Massachusetts, said in a phone interview, “One of the very bleak realities that we’re facing right now is that the business model of the internet and the existence of these data brokers have created a really dangerous situation for people, where our most sensitive and private health related information is up for sale to the highest bidder.”
Still, it’s unclear how much use police will get out of these data brokers’ datasets. Crockford and several other digital privacy experts Gizmodo talked to for this story have not come across any cases where local and state law enforcement have tapped data brokers for information on suspects’ pregnancies related to an alleged crime. That’s not to say this data isn’t useful to police. According to Crockford, police have mostly used commercial data that includes location info to find associations of property ownership, including houses, boats or cars.
The Washington Post recently reported there have been 60 cases of prosecutions against pregnant women since the start of the 21st century, based on research from nonprofit advocacy group If/When/How. A significant number of these cases have relied on women’s online activity, though that information is often handed over to police willingly or is taken off of digital devices with a warrant.
Commercial data could be the next step in law enforcement’s playbook. In an interview, Jumana Musa, the Fourth Amendment Center director for the National Association of Criminal Defense Lawyers, called this kind of commercial data on pregnancies “a really valuable treasure trove… [overzealous prosecutors] could buy this information from data brokers and start to follow through and decide what behavior looks suspect enough to prosecute.”
For Musa, broker data represents a way for cops and prosecutors to effectively get around the need for a warrant, since there are few laws governing who’s allowed to purchase commercial datasets. Still, in cases where the info could be out of reach for prosecutors due to expense or a broker’s refusal to sell, Musa said prosecutors have the option to subpoena the data brokers to get their hands on info for specific cases.
“Can they decide that they want to track everybody who’s gone to a particular clinic or a place they suspect to be providing abortions? Absolutely,” Musa said. “And they can already do that, it’s very easy.”
The bar for evidence in abortion cases could be pretty low, depending on each state’s laws or even the moral makeup of the jury, according to NACDL Executive Director Lisa Wayne. Many of these anti-abortion prosecutions could hinge on the legal concept of “mens rea,” which is the supposed intention or knowledge of wrongdoing in the case of an alleged crime.
It’s a legal bar used in many alleged infanticide cases, which have relied on tangential evidence pointing toward intent. One oft-cited example is the case of Mississippi mother Latice Fisher, a Black woman who reportedly had a miscarriage in her home at about 35 weeks into her pregnancy back in 2017. Critics pointed out investigators did not have any direct evidence for Fisher’s intent or decision-making, or even whether she bought and took any pills. Instead, prosecutors relied upon web history that included alleged searches for miscarriage and “apparent” pill purchases to charge her with second degree murder. Several advocacy groups helped get the charges against Fisher dropped in 2020.
The EFF’s Cyphers said the commercial data could be useful if law enforcement is trying to do “dragnet-like surveillance” of people who might be interested in abortions. But more likely, he said, would be anti-abortion groups getting their hands on this data.
Though Texas has its abortion “bounty hunter” law, there have so far been few cases from the state that indicate how far abortion opponents will go in conducting their lawsuits. Dr. Alan Braid was sued back in 2021, but in his case, he was literally asking for it by posting an op-ed in The Washington Post. The group Texas Right to Life still has a website going to attract tips on abortion seekers or providers, even after it was booted from multiple web hosts.
There have already been cases of anti-abortion activists taking data to push their agenda on pregnant women. Cyphers pointed to the case of an advertising agency in Massachusetts that had used geofencing technology to target women in and around Planned Parenthood clinics with anti-abortion ads back in 2016. What’s to stop anti-abortion groups from advertising to users found in commercial datasets?
It’s also worth remembering that the 32 brokers that Gizmodo’s investigation turned up are unlikely to be the last players trafficking in data related to people’s pregnancies or birth control options. After all, all available estimates show that the market for pregnancy care products is only going to keep spiking—and the same can be said about the market for contraceptives. When any of those products need to be marketed, there’s going to be a lot of money involved for whoever coughs up data on that target market.
“Data brokers talk a good game about ‘consumer interest’ and ‘legitimate business reasons,’” Sherman said. “But at the end of the day, they’re transacting in highly sensitive information about people who usually don’t even know they’re being surveilled. And it’s all so they can make a profit.”
Update 08/18/22 at 10 a.m. ET: This story was updated to include a comment from AlikeAudience.