
You can group the hits on the different columns by clicking on the header. The first value you should check is the E(xpect)-value, which is the number of similar hits expected to happen purely by chance (similar = similar length in a similar database). So 5.5E-13 means that such a hit is expected by chance 5.5*10^-13 times per search, i.e. about once in 1.8*10^12 searches with a similar but random query and a similar but random database. If the databank sequence is longer, the E() values are better (lower) and the scores will be higher. This is because the chance to obtain a similar score for an unrelated sequence of the same length is reduced. The columns called 'Length', 'identities' and 'positives' should explain themselves: the length of the hit found, the % of identical residues, and the % of similar residues. When we are dealing with nucleotides (as now), similar is set equal to identical.

Click on the header of the DB:ID column to order by ID: 'EM_PH:AF069529' should appear on top. The first two hits are actually recorded from the same sequence, AF069529 (out of the EM_PH database). What is their E-value? Which is most similar? The first hit has 1.3E-12, the second 0.17.
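
To make the E-value reading concrete, here is a minimal Python sketch. It assumes only the definition given above (the E-value is the number of equally good hits expected by chance in one search), so its reciprocal is roughly the number of searches needed to see one such chance hit. The function name and the hand-typed hit list are illustrative and are not part of the FASTA output format.

```python
def searches_per_chance_hit(e_value: float) -> float:
    """Approximate number of searches in which one such hit is expected by chance."""
    if e_value <= 0:
        raise ValueError("E-value must be positive")
    return 1.0 / e_value


# The two hits discussed in the text, typed in by hand as (DB:ID, E-value).
# Both come from the same EMBL entry.
hits = [
    ("EM_PH:AF069529", 0.17),
    ("EM_PH:AF069529", 1.3e-12),
]

# A lower E-value means the hit is less likely to be a chance match,
# so sorting in ascending order puts the most significant hit first.
ranked = sorted(hits, key=lambda hit: hit[1])

for db_id, e_value in ranked:
    per_hit = searches_per_chance_hit(e_value)
    print(f"{db_id}  E={e_value:.2g}  about 1 chance hit per {per_hit:.2g} searches")
```

Read this way, the hit with E = 0.17 is weak evidence on its own, while the hit with E = 1.3E-12 is very unlikely to be a chance match.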
#How to find sequence for bioedit download#
We will search with a fragment of the bacteriophage lambda genome containing the third right operator (OR3) and the right promoter (PR) to find similar sequences in the bacteriophage section of the EMBL databank. (Note: running against the complete EMBL databank rather than the bacteriophage section would take too much time for this exercise session; remember that fasta is rather slow.) Get the sequence with accession number M25165 in fasta format, download it and save it as M25165.fasta. Try yourself first, click 'Show' for tips. Since we have a universal accession number we can use all major portals: search for the accession number and download the sequence. By default the FASTA search page is set up for protein similarity search.
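
Before submitting the file, it can help to check that what you saved really is a nucleotide sequence in FASTA format. The following Python sketch is one way to do that. The helper names `read_fasta` and `looks_like_nucleotide` are made up for this example, and the allowed alphabet (A, C, G, T, U, N) is a simplifying assumption that ignores the other IUPAC ambiguity codes.

```python
from pathlib import Path


def read_fasta(path):
    """Return (header, sequence) for the first record in a FASTA file."""
    header = None
    chunks = []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                if header is not None:
                    break  # a second record starts here; stop reading
                header = line[1:]
            elif header is None:
                raise ValueError("file does not start with a '>' header line")
            else:
                chunks.append(line)
    if header is None:
        raise ValueError("no FASTA record found")
    return header, "".join(chunks)


def looks_like_nucleotide(sequence, alphabet="ACGTUN"):
    """True if the sequence is non-empty and uses only the assumed nucleotide letters."""
    return bool(sequence) and all(base in alphabet for base in sequence.upper())


# Only runs if the file from the exercise is present in the current directory.
if Path("M25165.fasta").exists():
    header, sequence = read_fasta("M25165.fasta")
    print(header)
    print(len(sequence), "residues; nucleotide:", looks_like_nucleotide(sequence))
```

If the check reports that the sequence is not nucleotide, the download was probably saved in a different format (for example the full EMBL flat file or an HTML page) rather than as FASTA.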

We will use the FastA server of the EMBL-EBI, which has a richer collection of databanks. Go to the EMBL-EBI website, then select the menu item "Tools"/"Similarity & Homology"/"FASTA".

Pearson's FastA finds similar sequences in a database given a query sequence. FastA is older than the popular BLAST and slower, but it sometimes finds things that BLAST does not, especially for non-coding DNA.
