The next time you visit a natural history museum, don’t believe everything you see. At least that’s according to Oxford University researchers, whose new study suggests that half of the specimens held in their collections may have the wrong name.


Often, when a sample arrives at a natural history museum (or its exclusively botanical counterpart, the herbarium), it doesn’t have a name. Instead, it’s simply a specimen, plucked from the wild, preserved, and sent along to the institution for safekeeping. There, it will sit in storage until a resident naturalist has time to identify it. But even the most accomplished of biologists can sometimes struggle to tell one insect from the next, or pick out a rare plant sample from a host of others.

“Finding the right name from existing records can sometimes prove difficult,” explains David Harris, an author of the new paper and curator of 3 million herbarium specimens at Royal Botanic Garden Edinburgh. As a result, the specimens put on display in museums don’t necessarily have the correct name—merely the one that’s chosen for them by the collection’s staff.


While that may not sound particularly problematic in isolation, in aggregate it can cause some serious headaches for biologists. “The whole of biology, from evolution to conservation, is underpinned by accurate naming,” explains Dr Robert Scotland from Oxford University’s Department of Plant Sciences. “Without accurate names on specimens, what’s out there in the real world doesn’t correspond to the name it’s given in a herbarium or museum. Many of the records held in collections around the world simply don’t make sense.”

Sadly, it’s becoming more of a problem. Modern technology has given rise to large, aggregated online databases of natural history specimens held in collections around the world. A prime example is the Global Biodiversity Information Facility database, which, at the time of writing, claims to hold details of 577,786,135 specimens describing 1,611,321 species from 767 separate collections. The problem is that those records are taken straight from the museums and herbaria without being checked for accuracy. Many of them are plain wrong.

But how many, exactly? “There is very little data on how many specimens in collections are misidentified,” admits Rudolf Meier, a Professor in the Department of Biological Sciences at the National University of Singapore.


Concerned by the potential problems that such misnaming might create, a team of researchers from Oxford University and Royal Botanic Garden Edinburgh, led by Scotland, embarked on a research project to put some numbers to the problem. The team took three different approaches to understanding how many such mistakes might be found in flowering plants from around the world, which they describe in a new paper published in Current Biology.

First, they thought about how the name of a single specimen can end up changing over time. Over the years, the specimens that are held in the collections of museums and herbaria gradually have their names refined. That’s a natural result of scientific progress, as researchers learn more about the family, or new specimens make it clearer which species a specimen belongs to. “The specimen will have a series of labels added to it over time, with people writing down the date at which they decided a certain name belonged to it,” explains Zoë Goodwin, the lead author of the paper.


So the team studied 4,500 specimens of Aframomum, a genus in the ginger family found mainly across tropical Africa, to understand how the names given to each specimen had changed over the years.

A full monograph of the Aframomum genus was completed in 2014. The monograph is the gold standard in botanical taxonomy: it demarcates each and every species with long descriptions, genetic analyses and detailed illustrations. That provides an accurate description of all the specimens ever placed into collections. But the team found that just prior to the completion of that monograph, 58 percent of specimens were misidentified, given a name that was outdated or redundant, or identified only to the genus or family level. Given that few species are monographed, because it’s such a time-consuming process, many others likely suffer a similar problem.

Next, Scotland’s team considered how multiple specimens from the exact same sample may be given different names by different museums. It’s common for a sample found in the wild to be used to create a series of different specimens, which are then sent to museums and herbaria around the world. “When you make a herbarium specimen, you basically take a piece of the plant, dry it, and stick it down, but you might make several duplicates to send to different collections,” explains Goodwin. “It’s like separating twins at birth, and it means if you live in Oxford you don’t have to fly to, say, Singapore to go and look at a specimen.” Only once they arrive at their destination, though, are they given a name by an in-house biologist. From there on, they all have completely different naming histories.


An analysis of Dipterocarpaceae, a family of lowland rainforest trees found mainly in Asia, showed that 9,222 such samples had been turned into at least two duplicate specimens in this way, making a total of 21,075 specimens. Of those, 29 percent had different names in different herbaria. “And one of them has to have got it wrong,” points out John Wood, one of the authors of the paper.

Finally, the team turned their attention to the online databases themselves. Using the Global Biodiversity Information Facility database, the team searched through all the records from the Americas describing Ipomoea, a large and diverse genus that includes morning glory, sweet potato and bindweed, looking for obvious errors.

“If I saw a sweet potato recorded as being from Greenland, say, I’d mark that as wrong,” explains Wood. “If it said it came from Brazil, it would be considered correct even though I’d not seen the specimen. So we’re definitely under- rather than over-estimating.” Studying 49,500 specimens, they found that 40 percent of them used synonyms rather than the accepted name and 16 percent had names that were unrecognisable as real plant names at all.



Taken together, these approaches seem to suggest that somewhere in the region of half of the specimens in collections might be incorrectly named. That is, the team reckons, a conservative estimate. So why are things quite so bad?

The team suggests there are a variety of factors at play. First, the number of specimens in existence around the world has exploded in recent decades: of all the specimens held in collections as of 2000, over half had been collected since 1969. Second, those specimens are held in more locations than ever before. It used to be that a handful of collections in the Western world held the majority of samples; now they’re scattered around the world, from San Francisco to Singapore and everywhere in between.

Finally, the team points out that, especially given the first two problems, there simply isn’t enough research time applied to the problem. To weed out all the mistakes contained in specimen names around the world would require something akin to full monographs being carried out for each and every genus of flowering plant. Given the amount of work involved in carrying out a single monograph, it seems unlikely that experts will be able to revise all those names accordingly.

But there’s worse news. The researchers believe their snapshot of incorrect naming in plants points to a more worrying problem. Scotland points out that in a 2004 paper published in Conservation Biology, Rudolf Meier and Torsten Dikow demonstrated a similar problem among insect specimens. “Their figures for mis-identifications are actually worse than ours,” Scotland explained to me in an email.


Scotland’s reckoning from there goes something like this. Of 1.8 million different described species on Earth, 0.35 million are flowering plants and a further 0.95 million are insects. If half of all the world’s specimens of flowering plants are incorrectly named, and the situation’s worse for insects, it’s easy enough to do a little maths and end up rather alarmed. “We think a conservative estimate is that up to half of the world’s natural history specimens could be incorrectly named,” explains Goodwin.
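For anyone who wants to do that little bit of maths, here is a rough sketch in Python. It uses only the species counts quoted above; the names are invented for illustration, and this is not the researchers’ own calculation. Note that it counts described species, not specimens, so it only shows how much of known life the two groups cover.

```python
# Figures quoted in the article, in thousands of described species.
TOTAL_DESCRIBED = 1800
FLOWERING_PLANTS = 350
INSECTS = 950


def combined_share(groups, total):
    """Return the fraction of all described species covered by the groups."""
    return sum(groups) / total


# Illustrative only: share of described species that are plants or insects.
share = combined_share([FLOWERING_PLANTS, INSECTS], TOTAL_DESCRIBED)
print(f"Flowering plants and insects: {share:.0%} of described species")
```

On those figures, the two groups account for roughly seven in ten described species, which is why an error rate of about a half in both is hard to shrug off.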

While it’s impossible to say for sure whether that claim’s true or not (at least without, well, solving the problem in its entirety), Meier agrees that “one would expect that specimens that have not been re-identified in decades to have a fairly high chance of being incorrectly identified.” He also shares the concerns that motivated Scotland’s study in the first place. “This is a serious problem for those museum digitization programs that are not careful enough about only digitizing data for specimens that have a high chance of being correctly identified,” he adds.


All of which raises the obvious question: what the hell can we do about it?

Scotland points out that it’s not a new question. “[Meier and Dikow argued] that the best value for money with regards to organising, using, filtering and most importantly quality controlling biodiversity data is by doing taxonomic revisions rather than merely digitizing poorly curated images and label information,” he says. “Ten years on there is little evidence that their suggestion was followed.”

The reason for that is, primarily, related to cold, hard cash. “The only way to solve the problem is to have a larger number of specialists working on the neglected taxa,” points out Meier — and they all need to be paid.


That’s not to say the task is implausibly expensive. “Lionel Messi, the footballer, is worth about £100 million,” explains Scotland. “We reckon you can describe a species of plant for about £500. There are 350,000 species of flowering plants, of which about 200,000 are tropical. You could monograph the whole world of tropical [flora] for the price of Lionel Messi.”
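Scotland’s sums are easy to check. The sketch below is a minimal one in Python that uses only the figures he quotes; the variable and function names are made up for illustration, and the flat £500 per species is his rough estimate rather than a costed budget.

```python
# Figures as quoted by Scotland; treat them as rough estimates.
COST_PER_SPECIES_GBP = 500
TROPICAL_SPECIES = 200_000
ALL_FLOWERING_SPECIES = 350_000
MESSI_VALUE_GBP = 100_000_000


def description_cost(n_species, cost_per_species=COST_PER_SPECIES_GBP):
    """Total cost in pounds, assuming a flat cost per species described."""
    return n_species * cost_per_species


tropical_cost = description_cost(TROPICAL_SPECIES)
all_cost = description_cost(ALL_FLOWERING_SPECIES)

print(f"Tropical species: £{tropical_cost:,}")
print(f"All flowering plants: £{all_cost:,}")
print(f"Tropical cost in Messis: {tropical_cost / MESSI_VALUE_GBP:.2f}")
```

At those rates the 200,000 tropical species come to exactly £100 million, or one Messi, while all 350,000 flowering plants would cost £175 million.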

But it’s unlikely that funding on that scale will be found for taxonomy any time soon because, sadly, it’s not really considered to be a particularly sexy field of science these days. Instead, Scotland suggests that biologists should try to do more with the new kinds of tools they have at their disposal, including rich digital images and, where possible, full genetic analyses with each sample that’s added to an online database.

Along those lines, he and his team have been developing something that they refer to as the foundation monograph, a kind of streamlined version of the research exercise currently considered the gold standard, which abandons some of the time-consuming parts, borrows heavily from other published works and generally aims for answers that are accurate rather than exhaustive. They’ve already been trying the technique out, applying it to genera of flowering plants such as Ipomoea and Convolvulus. It seems to work: their reports suggest that they can clean up records for an entire genus in a period of time measured in months rather than years.



As for whether it will actually solve the naming issues at a global scale, Scotland is philosophical. “Realistically, it will be very hard to solve the problem entirely,” he admits in a press release. “But by using a clever sampling technique like the foundation monograph, we might at least be able to make a difference.”

Images by Zoë Goodwin et al/Oxford University, Leeds Museums and Galleries, Olga Filonenko, Thomas Quine, matt northam and BLUR LIFE 1975/Shutterstock.