Biden Official Said We May Never Know Origins Of COVID. Now Scientist Says He Discovered ‘Deleted’ Evidence Of Earliest Known COVID Patients In Wuhan.

Hank Berrien, Jun 23, 2021

On Wednesday, an American professor and virologist posted what he says is evidence that critical data from the earliest confirmed COVID-19 patients in Wuhan, China, was “deleted,” thus potentially eliminating key information on how the virus originated and how long it had been spreading in its early stages.

In a preprint study published Friday titled “Recovery of deleted deep sequencing data sheds more light on the early Wuhan SARS-CoV-2 epidemic,” Professor Jesse Bloom, a virologist from the Fred Hutchinson Cancer Research Center in Seattle, explained that he discovered a data set of 45 samples from Wuhan outpatients with suspected COVID-19 early in the epidemic that was “deleted from the NIH’s Sequence Read Archive.”

“I recover the deleted files from the Google Cloud, and reconstruct partial sequences of 13 early epidemic viruses,” Bloom wrote in the abstract of the study. “Phylogenetic analysis of these sequences in the context of carefully annotated existing data suggests that the Huanan Seafood Market sequences that are the focus of the joint WHO-China report are not fully representative of the viruses in Wuhan early in the epidemic. Instead, the progenitor of known SARS-CoV-2 sequences likely contained three mutations relative to the market viruses that made it more similar to SARS-CoV-2’s bat coronavirus relatives.”

Bloom also issued a Twitter thread on his discovery in which he wrote:

In a new study, I identify and recover a deleted set of #SARSCoV2 sequences that provide additional information about viruses from the early Wuhan outbreak. Specifically, NIH maintains the Sequence Read Archive, where scientists around world deposit deep sequencing data for others to analyze. I noted peerj.com/articles/9255 lists all #SARSCoV2 data in archive as of March 31, 2020. Most from a project by Wuhan University.
But when I went to Sequence Read Archive, I found entire project was gone! (Note that as detailed below, this does *not* imply malfeasance by NIH. Sequence Read Archive policy allows submitters to delete by e-mail request.) I was able to determine deleted data corresponded to a study that partially sequenced “45 nasopharyngeal samples from [Wuhan] outpatients with suspected COVID-19 early in the epidemic.”
I discovered that even though the files were deleted from archive itself, they could be recovered from the Google Cloud … Using this approach, I recovered files for the 34 early samples that were virus positive. I was able to use the data in the files to reconstruct partial viral sequences (from start of spike to end of ORF10) for 13 of these samples.

Bloom explained that whether the emergence of COVID was caused by zoonosis or a lab accident, “everyone agrees deep ancestors are coronavirus from bats.” He pointed out, “Therefore, we’d expect the first #SARSCoV2 sequences would be more similar to bat coronaviruses, and as #SARSCoV2 continued to evolve, it would become more divergent from these ancestors. But that is not the case!”

Instead, he pointed out, “Early Huanan Seafood Market #SARSCoV2 viruses are more different from bat coronaviruses than #SARSCoV2 viruses collected later in China and even other countries. … The conundrum is easily seen by plotting the relative differences from the bat coronavirus RaTG13 outgroup versus collection date for early #SARSCoV2. If we include those sequences, and note 4 sequences from Guangdong are from two groups of people infected in Wuhan in late Dec / early Jan, we get plausible scenarios that resolve above problems. These two scenarios are plotted below. Each has a different ‘progenitor,’ which is the sequence that gave rise to all currently known #SARSCoV2 sequences …”

He concluded, “Both progenitors suggest #SARSCoV2 was circulating in Wuhan before December outbreak at Huanan Seafood Market, which is corroborated by lots of other evidence, including news articles from China in early 2020. … There are also broader implications. First, fact this dataset was deleted should make us skeptical that all other relevant early Wuhan sequences have been shared.”

He added, “Sequence sharing could be further limited by fact that scientists in China are under an order from the State Council requiring central approval of all publications.”

Source: Bloom Lab (@jbloom_lab) on Twitter, June 22, 2021

In response to the study, Professor Lawrence Young, a molecular biologist at the University of Warwick, told The Daily Mail:

The study further cements the fact the virus was circulating in Wuhan before the outbreak in December. It highlights and reinforces the inadequacy of the original WHO investigation, which was compromised by not being able to access data. This just shows how important it is to understand the early spread of the virus and any future investigation really does need to look seriously at the whole issue.

Young said the study “certainly suggests the COVID virus or a very close precursor virus [a less mutated version] was circulating before it started to take off. It could’ve jumped to humans much earlier than the consensus thinking, perhaps even months before.”

On Monday, President Joe Biden’s top intelligence official, Avril Haines, the director of national intelligence, told Yahoo News that it was “absolutely” possible the administration may never have a high degree of confidence regarding the origins of the pandemic that has killed millions of people and devastated economies worldwide, The Daily Wire reported.
