Rebecca Porter and I were strangers, as far as I knew. Facebook, however, thought we might be connected. Her name popped up this summer on my list of “People You May Know,” the social network’s roster of potential new online friends for me.
The People You May Know feature is notorious for its uncanny ability to recognize who you associate with in real life. It has mystified and disconcerted Facebook users by showing them an old boss, a one-night stand, or someone they just ran into on the street.
These friend suggestions go far beyond mundane linking of schoolmates or colleagues. Over the years, I’d been told many weird stories about them, such as when a psychiatrist told me that her patients were being recommended to one another, indirectly outing their medical issues.
What makes the results so unsettling is the range of data sources (location information, activity on other apps, facial recognition on photographs) that Facebook has at its disposal to cross-check its users against one another, in the hopes of keeping them more deeply attached to the site. People generally are aware that Facebook is keeping tabs on who they are and how they use the network, but the depth and persistence of that monitoring are hard to grasp. And People You May Know, or “PYMK” in the company’s internal shorthand, is a black box.
To try to get a look into that black box—and the unknown and apparently aggressive data collection that feeds it—I began downloading and saving the list of people Facebook recommended to me, to see who came up, and what patterns might emerge.
On any given day, it tended to recommend about 160 people, some of them over and over again; over the course of the summer, it suggested more than 1,400 different people to me. About 200, or 15 percent of them, were, in fact, people I knew, but the rest appeared to be strangers.
And then there was Rebecca Porter. She showed up on the list after about a month: an older woman, living in Ohio, with whom I had no Facebook friends in common. I did not recognize her, but her last name was familiar. My biological grandfather is a man I’ve never met, with the last name Porter, who abandoned my father when he was a baby. My father was adopted by a man whose last name was Hill, and he didn’t find out about his biological father until adulthood.
The Porter family lived in Ohio. Growing up half a country away, in Florida, I’d known these blood relatives were out there, but there was no reason to think I would ever meet them.
A few years ago, my father finally did meet his biological father, along with two uncles and an aunt, when they sought him out during his trip back to Ohio for his mother’s funeral. None of them use Facebook. I asked my dad if he recognized Rebecca Porter. He looked at her profile and said he didn’t think so.
I sent the woman a Facebook message explaining the situation and asking if she was related to my biological grandfather.
“Yes,” she wrote back.
Rebecca Porter, we discovered, is my great aunt, by marriage. She is married to my biological grandfather’s brother; she met him 35 years ago, the year after I was born. Facebook knew my family tree better than I did.
“I didn’t know about you,” she told me, when we talked by phone. “I don’t understand how Facebook made the connection.”
It was an enjoyable conversation. After we finished the phone call, I sat still for 15 minutes. I was grateful that Facebook had given me the chance to talk to an unknown relation, but awed and disconcerted by its apparent omniscience.
How Facebook had linked us remained hard to fathom. My father had met her husband in person that one time, after my grandmother’s funeral. They exchanged emails, and my father had his number in his phone. But neither of them uses Facebook. Nor do the other people between me and Rebecca Porter on the family tree.
Facebook is known to buy information from data brokers, and a person who previously worked for the company and who is familiar with how the tool works suggested the familial connection may have been discerned that way. But when asked about that scenario, a Facebook spokesperson said, “Facebook does not use information from data brokers for People You May Know.”
What information had Facebook used, then? The company would not tell me what triggered this recommendation, citing privacy reasons. A Facebook spokesperson said that if the company helped me figure out how it made the connection between me and my great aunt, then every other user who got an unexpected friend suggestion would come around asking for an explanation, too.
It was not a very convincing excuse. Facebook gets people to hand over information about themselves all the time; by what principle would it be unreasonable to sometimes hand some of that information back?
The bigger reason the social network may be shy about revealing how the recommendations work is that many of Facebook’s competitors, such as LinkedIn and Twitter, offer similar features to their users. In a 2010 presentation about PYMK, Facebook’s vice-president of engineering explained its value: “People with more friends use the site more.” There’s a competitive advantage to be gained by being the best at this, meaning Facebook is reluctant to reveal what goes into its algorithm.
The caginess is longstanding. Back in 2009, users getting creepily accurate friend suggestions suspected that Facebook was basing the recommendations on their contact information—which they had volunteered when they first signed up, not realizing Facebook would keep it and use it.
Though Facebook is upfront about its use of contact information now, when asked about it in 2009, the company’s then-chief privacy officer, Chris Kelly, wouldn’t confirm what was going on.
“We are constantly iterating on the algorithm that we use to determine the Suggestions section of the home page,” Kelly told Adweek in 2009. “We do not share details about the algorithm itself.”
Not being told exactly how this tool works is frustrating for users, who want to understand the extent of Facebook’s knowledge about them and how deeply the social network peers into their lives. The spokesperson did say that more than 100 signals go into making the friend recommendations and that no one signal alone would trigger a friend suggestion.
One hundred signals! I told the spokesperson that it might be in the social network’s interest to be more transparent about how this feature works so that users are less creeped out by it. She said Facebook had “in the name of transparency” recently added more information to its help page explaining how People You May Know works, an update noted by USA Today.
That help page offers a brief bulleted list:
People You May Know suggestions come from things like:
• Having friends in common, or mutual friends. This is the most common reason for suggestions
• Being in the same Facebook group or being tagged in the same photo
• Your networks (example: your school, university or work)
• Contacts you’ve uploaded
Depending on how you count them, the listed possibilities are roughly 95 signals shy of adding up to 100 signals. What are all the others?
“We’ve chosen to list the most common reasons someone might be suggested as part of People You May Know,” a Facebook spokesperson wrote in an email when asked about the brevity of the list.
Rather than explaining how Facebook connected me to my great aunt, a spokesperson told me via email to delete the suggestion if I don’t like it.
“People don’t always like some of their PYMK suggestions, so one action people can take to control People You May Know is to ‘X’ out suggestions that they are uninterested in,” the spokesperson wrote via email. “This is the best way to tell us that they’re not interested in connecting with someone online and that feedback helps improve our suggestions over time.”
Now, when I look at my friend recommendations, I’m unnerved not just by seeing the names of the people I know offline, but by all the seeming strangers on the list. How many of them are truly strangers, I wonder—and how many are connected to me in ways I’m unaware of. They are not people I know, but are they people I should know?
If you’ve had a similar experience with a recommendation, or if you’ve worked on PYMK technology, I could use your help.
This story was produced by Gizmodo Media Group’s Special Projects Desk.