Important notes on interpreting the virus identification results

The virus identification table lists the reference sequences from the GenBank database to which contigs of the sample show significant similarity. There are several aspects to keep in mind when interpreting these data:

  1. The GenBank database may contain errors and/or outdated nomenclature; results should therefore be verified by checking the reference sequences.

  2. The table provides the minimum, maximum and average sequence identity to the reference sequence, and these should be taken into account when interpreting the data. Significant similarity can be achieved with less than 70% sequence identity, in which case the hit may represent a different virus than the one listed. Sequence identity criteria for virus species differ between genera (more information: http://www.ictvonline.org/index.asp), but generally sequence identity >90% can be considered the same species. The alignment should be further scrutinized if any of the values is below 90%. For more information, see "About evaluating alignments".

  3. Results will change and become more accurate over time: as sequence data of newly identified viruses are added to the GenBank database, we will re-run the analysis on the same data. The version of the GenBank virus sequence database used for the current results is v206.0 (http://www.ncbi.nlm.nih.gov/news/02-17-2015-genbank-206/).
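
The identity thresholds in point 2 can be sketched as a simple triage rule. This is a minimal illustration, not the code used by the site: the function name `triage_identity` and the returned labels are hypothetical, and the 90% and 70% cut-offs are only the rules of thumb quoted above, which differ between genera.

```python
def triage_identity(min_id, avg_id, max_id):
    """Return a tentative interpretation label for one reference hit.

    Arguments are percent identities (0-100) as reported in the table.
    The 90 and 70 cut-offs are rules of thumb from the notes above,
    not fixed species demarcation criteria.
    """
    values = (min_id, avg_id, max_id)
    if not all(0 <= v <= 100 for v in values):
        raise ValueError("identities must be percentages between 0 and 100")
    if not (min_id <= avg_id <= max_id):
        raise ValueError("expected min <= average <= max")

    # Every value above 90%: generally considered the same species.
    if min_id > 90:
        return "likely same species"
    # Even the best-matching contig is below 70%: the hit may be a
    # different virus than the one named in the reference.
    if max_id < 70:
        return "possibly a different virus; verify the reference"
    # At least one value is not above 90%: look at the alignment itself.
    return "inspect alignment"
```

A label of "inspect alignment" is only a prompt to open the alignment view and compare it with the patterns described under "About evaluating alignments".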

About evaluating alignments

  1. Alignments showing overlapping contigs with differing levels of sequence identity indicate the presence of several variants of the same virus.

    Example: sample GH-S036a (http://bioinfo.bti.cornell.edu/virome/sinfo?sid=GH-S036a)

    Contigs with high sequence identity (>95%, shown in red) and low sequence identity (70-80%, shown in white to pink) align to the same regions of the reference sequence (blue).



  2. Alignments showing a low level of sequence identity to several different related reference viruses probably indicate a single new virus in the sample.

    Example: sample AO-S003 (http://bioinfo.bti.cornell.edu/virome/sinfo?sid=AO-S003)

    Four different potyviruses with significant sequence similarity are identified (indicated in the red box in the figure below), and all of them show very low sequence identity of around 70% to the reference virus sequences. However, one potyvirus identified (shown in the green circle in the figure below) has a much higher level of nucleotide identity than the others (>90%), but has the same name (sweet potato feathery mottle virus) as some of the viruses with much lower sequence identity. The GenBank reference in green (AF016366) is only a partial sequence and has been misidentified as sweet potato feathery mottle virus. Based on its low sequence identity to sweet potato feathery mottle virus, AF016366 actually represents another virus species, which is probably the same as the one found in this sample (because of its high sequence identity of >90% to this sample).



  3. Alignments showing a high level of sequence identity to some parts of the reference, but low identity to other parts, probably indicate a recombinant isolate in the sample.

    Example: sample MW-S082 (http://bioinfo.bti.cornell.edu/virome/sinfo?sid=MW-S082)

    About 4.5 kb at the 5′ and 3′ ends of the virus (circled in green) show >90% identity to the reference sequence, but around 2 kb in the middle (circled in red) show only around 80% identity. This indicates that the central region originates from another variant of the virus and that the isolate probably represents a recombinant.
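
The three patterns above can be sketched as a rough decision rule over the contigs aligned to one reference. This is an illustrative sketch only, not how the site evaluates alignments: the function names, the `(start, end, identity)` tuple layout and the `low=85.0` cut-off are assumptions made for the example, while `high=90.0` follows the >90% rule of thumb in the notes.

```python
def overlaps(a, b):
    """True if two (start, end, identity) contigs share reference positions."""
    return a[0] <= b[1] and b[0] <= a[1]


def classify_alignment(contigs, high=90.0, low=85.0):
    """Suggest which pattern the contigs on ONE reference resemble.

    contigs: list of (start, end, percent_identity) tuples, with
    1-based inclusive coordinates on the reference (assumed layout).
    """
    if not contigs:
        raise ValueError("no contigs given")
    hi = [c for c in contigs if c[2] > high]
    lo = [c for c in contigs if c[2] < low]

    if hi and lo:
        # Pattern 1: high- and low-identity contigs cover the same region.
        if any(overlaps(h, l) for h in hi for l in lo):
            return "possible mixed variants"
        # Pattern 3: high and low identity in separate regions.
        return "possible recombinant"
    if lo:
        # Pattern 2 additionally requires the same picture across several
        # related references, which this single-reference check cannot see.
        return "low identity throughout"
    return "no discordant pattern"


# Toy coordinates invented for illustration, not taken from the samples above.
print(classify_alignment([(1, 4500, 93.0), (4501, 6500, 80.0), (6501, 11000, 94.0)]))
print(classify_alignment([(1, 3000, 97.0), (500, 2500, 75.0)]))
```

The returned labels are prompts for manual inspection of the alignment, not a diagnosis.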