The U.S. National Institutes of Health deleted gene sequences taken from early COVID-19 carriers at the request of Chinese researchers, raising concerns about Beijing’s efforts to conceal information crucial to the virus origin investigation.
A Chinese scientist asked the NIH to eliminate the sequences after submitting them three months prior, the NIH told the Wall Street Journal.
“Submitting investigators hold the rights to their data and can request withdrawal of the data,” the NIH said in a statement.
According to the NIH statement, the researcher asked that the sequences be removed from the NIH database because they had been updated and were to be rerouted to another database, the name of which remains unknown. The paper mentions the use of an advanced sequencing technology to detect SARS-CoV-2, the virus that causes COVID.
The deleted data includes sequences from early virus samples taken from hospitalized patients in Wuhan who were believed to have contracted COVID in January and February of 2020, according to a non-peer-reviewed paper authored by Jesse Bloom, a virologist at the Fred Hutchinson Cancer Research Center in Seattle.
Bloom told the Journal that the deletion of the sequences from the NIH created “a somewhat skewed picture of viruses circulating in Wuhan early on.” He added that “it suggests possibly one reason why we haven’t seen more of these sequences is perhaps there hasn’t been a wholehearted effort to get them out there.”
Bloom said that although the data was deleted from the NIH’s Sequence Read Archive, he recovered the deleted files from Google Cloud, which allowed him to reconstruct partial sequences of 13 early epidemic viruses.
While the data likely won’t significantly contribute to the origin investigation, its erasure at the request of Chinese researchers is the latest in a string of developments that suggest Beijing is working against U.S. and international efforts to discover the pandemic’s origin.
“It makes us wonder if there are other sequences like these that have been purged,” Vaughn S. Cooper, a University of Pittsburgh evolutionary biologist who did not study or research the topic of the paper, told the Journal.
To discover the virus’s roots, scientists must find the progenitor virus, the original version from which all other strains descended. Bloom writes in the paper that this is made more difficult by the limited data on early COVID cases provided by Beijing. Most known virus sequences stem from a dozen patients who had contact with the Huanan seafood market in December 2019, as well as a handful of sequences gathered before late January 2020.
The new data Bloom uncovered suggests that the pathogen was circulating in Wuhan before it appeared at the seafood market, Bloom said. The existence of previously unreleased data confirms that the WHO investigative team that dismissed the lab-leak theory as “unlikely” did not have access to the raw data on the earliest COVID cases, according to Bloom.
“This fact suggests that the market sequences, which are the primary focus of the genomic epidemiology in the joint WHO-China report … are not representative of the viruses that were circulating in Wuhan in late December of 2019 and early January of 2020,” he wrote.