Web document 3.6. Example of global and local alignment.
Contents of this document:
1. Sequences in the fasta format
2. Global alignment using the needle program at EBI
3. Local alignment from a blastp search using the E. coli protein as a query
1. Sequences in the fasta format
>gi|16130477|ref|NP_417047.1| fused nitric oxide dioxygenase/dihydropteridine reductase 2 [Escherichia coli K12]
MLDAQTIATVKATIPLLVETGPKLTAHFYDRMFTHNPELKEIFNMSNQRNGDQREALFNAIAAYASNIEN
LPALLPAVEKIAQKHTSFQIKPEQYNIVGEHLLATLDEMFSPGQEVLDAWGKAYGVLANVFINREAEIYN
ENASKAGGWEGTRDFRIVAKTPRSALITSFELEPVDGGAVAEYRPGQYLGVWLKPEGFPHQEIRQYSLTR
KPDGKGYRIAVKREEGGQVSNWLHNHANVGDVVKLVAPAGDFFMAVADDTPVTLISAGVGQTPMLAMLDT
LAKAGHTAQVNWFHAAENGDVHAFADEVKELGQSLPRFTAHTWYRQPSEADRAKGQFDSEGLMDLSKLEG
AFSDPTMQFYLCGPVGFMQFTAKQLVDLGVKQENIHYECFGPHKVL
>gi|6321673|ref|NP_011750.1| Nitric oxide oxidoreductase, flavohemoglobin involved in nitric oxide detoxification; plays a role in the oxidative and nitrosative stress responses [Saccharomyces cerevisiae]
MLAEKTRSIIKATVPVLEQQGTVITRTFYKNMLTEHTELLNIFNRTNQKVGAQPNALATTVLAAAKNIDD
LSVLMDHVKQIGHKHRALQIKPEHYPIVGEYLLKAIKEVLGDAATPEIINAWGEAYQAIADIFITVEKKM
YEEALWPGWKPFDITAKEYVASDIVEFTVKPKFGSGIELESLPITPGQYITVNTHPIRQENQYDALRHYS
LCSASTKNGLRFAVKMEAARENFPAGLVSEYLHKDAKVGDEIKLSAPAGDFAINKELIHQNEVPLVLLSS
GVGVTPLLAMLEEQVKCNPNRPIYWIQSSYDEKTQAFKKHVDELLAECANVDKIIVHTDTEPLINAAFLK
EKSPAHADVYTCGSLAFMQAMIGHLKELEHRDDMIHYEPFGPKMSTVQV
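The two records above follow the usual FASTA layout: a definition line starting with ">" followed by one or more lines of sequence. A minimal Python sketch of reading such records is given here. The function names read_fasta and accession are hypothetical helpers introduced for illustration, and the accession parsing assumes the pipe-delimited "gi|...|ref|ACCESSION|" header layout seen in this document.

```python
from typing import Dict, Iterable


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a {header: sequence} mapping.

    The header is the text after '>' on a definition line. Sequence lines
    that follow are concatenated until the next '>' line.
    """
    records: Dict[str, str] = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line
    return records


def accession(header: str) -> str:
    """Return the accession from a 'gi|...|ref|ACC|' style header.

    Assumption: the pipe-delimited layout used in this document. For any
    other header, fall back to the first whitespace-delimited token.
    """
    fields = header.split("|")
    if len(fields) >= 4:
        return fields[3]
    return header.split()[0]


if __name__ == "__main__":
    demo = [
        ">gi|16130477|ref|NP_417047.1| fused nitric oxide dioxygenase/"
        "dihydropteridine reductase 2 [Escherichia coli K12]",
        "MLDAQTIATVKATIPLLVETGPKLTAHFYDRMFTHNPELKEIFNMSNQRN",
        "GDQREALFNAIAAYASNIEN",
    ]
    for head, seq in read_fasta(demo).items():
        print(accession(head), len(seq))
```

The accession returned this way (for example NP_417047.1) is the same identifier that needle and blastp use to label the rows of the alignments in sections 2 and 3.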
2. Global alignment using the needle program at EBI
########################################
# Program: needle
# Rundate: Tue Mar 13 10:48:31 2007
# Align_format: srspair
# Report_file: /ebi/extserv/old-work/needle-20070313-10483068446418.output
########################################

#=======================================
#
# Aligned_sequences: 2
# 1: NP_417047.1
# 2: NP_011750.1
# Matrix: EBLOSUM62
# Gap_penalty: 10.0
# Extend_penalty: 0.5
#
# Length: 423
# Identity: 145/423 (34.3%)
# Similarity: 210/423 (49.6%)
# Gaps: 51/423 (12.1%)
# Score: 579.0
#
#
#=======================================

NP_417047.1        1 MLDAQTIATVKATIPLLVETGPKLTAHFYDRMFTHNPELKEIFNMSNQRN     50
                     ||..:|.:.:|||:|:|.:.|..:|..||..|.|.:.||..|||.:||:.
NP_011750.1        1 MLAEKTRSIIKATVPVLEQQGTVITRTFYKNMLTEHTELLNIFNRTNQKV     50

NP_417047.1       51 GDQREALFNAIAAYASNIENLPALLPAVEKIAQKHTSFQIKPEQYNIVGE    100
                     |.|..||...:.|.|.||::|..|:..|::|..||.:.|||||.|.||||
NP_011750.1       51 GAQPNALATTVLAAAKNIDDLSVLMDHVKQIGHKHRALQIKPEHYPIVGE    100

NP_417047.1      101 HLLATLDEMFSPG--QEVLDAWGKAYGVLANVFINREAEIYNENASKAGG    148
                     :||..:.|:....  .|:::|||:||..:|::||..|.::|.|..
NP_011750.1      101 YLLKAIKEVLGDAATPEIINAWGEAYQAIADIFITVEKKMYEEAL-----    145

NP_417047.1      149 WEGTRDFRIVAKTPRSALITSFELEPVDGGAV----AEYRPGQYLGVWLK    194
                     |.|.:.|.|.||...::.|..|.::|..|..:    ....||||:.|...
NP_011750.1      146 WPGWKPFDITAKEYVASDIVEFTVKPKFGSGIELESLPITPGQYITVNTH    195

NP_417047.1      195 P--EGFPHQEIRQYSLTRKPDGKGYRIAVKRE------EGGQVSNWLHNH    236
                     |  :...:..:|.|||.......|.|.|||.|      ..|.||.:||..
NP_011750.1      196 PIRQENQYDALRHYSLCSASTKNGLRFAVKMEAARENFPAGLVSEYLHKD    245

NP_417047.1      237 ANVGDVVKLVAPAGDFF----MAVADDTPVTLISAGVGQTPMLAMLDTLA    282
                     |.|||.:||.||||||.    :...::.|:.|:|:|||.||:||||:...
NP_011750.1      246 AKVGDEIKLSAPAGDFAINKELIHQNEVPLVLLSSGVGVTPLLAMLEEQV    295

NP_417047.1      283 KAGHTAQVNWFHAAENGDVHAFADEVKEL---GQSLPRFTAHTWYRQPSE    329
                     |......:.|..::.:....||...|.||   ..::.:...||
NP_011750.1      296 KCNPNRPIYWIQSSYDEKTQAFKKHVDELLAECANVDKIIVHT-------    338

NP_417047.1      330 ADRAKGQFDSEGLMD---LSKLEGAFSDPTMQFYLCGPVGFMQFTAKQLV    376
                             |:|.|::   |.:...|.:|    .|.||.:.|||.....|.
NP_011750.1      339 --------DTEPLINAAFLKEKSPAHAD----VYTCGSLAFMQAMIGHLK    376

NP_417047.1      377 DLGVKQENIHYECFGPHKVL       396
                     :|..:.:.||||.|||....
NP_011750.1      377 ELEHRDDMIHYEPFGPKMSTVQV    399

#---------------------------------------
#---------------------------------------
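The Identity and Gaps lines in the needle header are counts over alignment columns divided by the alignment length (423). The following Python sketch shows one way to recompute such counts from a pair of aligned rows, and checks the header percentages by plain arithmetic. The function names pair_stats and percent are hypothetical, and the sketch assumes "-" is the gap character. It does not compute Similarity, which would need the EBLOSUM62 matrix.

```python
from typing import Tuple


def pair_stats(row_a: str, row_b: str, gap: str = "-") -> Tuple[int, int, int]:
    """Return (identical columns, gap columns, total columns) for two aligned rows."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    # A column is identical when both rows carry the same residue (not a gap).
    identical = sum(1 for a, b in zip(row_a, row_b) if a == b and a != gap)
    # A column is a gap column when either row carries the gap character.
    gaps = sum(1 for a, b in zip(row_a, row_b) if a == gap or b == gap)
    return identical, gaps, len(row_a)


def percent(count: int, length: int) -> float:
    """Percentage rounded to one decimal place, as printed in the needle header."""
    return round(100.0 * count / length, 1)


if __name__ == "__main__":
    # Alignment columns 101-150 of the needle report above.
    row_a = "HLLATLDEMFSPG--QEVLDAWGKAYGVLANVFINREAEIYNENASKAGG"
    row_b = "YLLKAIKEVLGDAATPEIINAWGEAYQAIADIFITVEKKMYEEAL-----"
    print(pair_stats(row_a, row_b))

    # Totals taken from the needle header: Identity, Similarity, Gaps.
    print(percent(145, 423), percent(210, 423), percent(51, 423))
```

Summing pair_stats over all nine blocks of the report should reproduce the header totals, provided the trailing end gap in the last block is counted as well.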
3. Local alignment from a blastp search using the E. coli protein as a query
>gi|6321673|ref|NP_011750.1| Nitric oxide oxidoreductase, flavohemoglobin involved in nitric oxide detoxification; plays a role in the oxidative and nitrosative stress responses; Yhb1p [Saccharomyces cerevisiae]
Length=399

Score = 210 bits (535), Expect = 6e-54, Method: Composition-based stats.
Identities = 143/410 (34%), Positives = 208/410 (50%), Gaps = 36/410 (8%)

Query  1    MLDAQTIATVKATIPLLVETGPKLTAHFYDRMFTHNPELKEIFNMSNQRNGDQREALFNA  60
            ML  +T + +KAT+P+L + G  +T  FY  M T + EL  IFN +NQ+ G Q  AL
Sbjct  1    MLAEKTRSIIKATVPVLEQQGTVITRTFYKNMLTEHTELLNIFNRTNQKVGAQPNALATT  60

Query  61   IAAYASNIENLPALLPAVEKIAQKHTSFQIKPEQYNIVGEHLLATLDEMFSPGQ--EVLD  118
            + A A NI++L  L+  V++I  KH + QIKPE Y IVGE+LL  + E+       E+++
Sbjct  61   VLAAAKNIDDLSVLMDHVKQIGHKHRALQIKPEHYPIVGEYLLKAIKEVLGDAATPEIIN  120

Query  119  AWGKAYGVLANVFINREAEIYNENASKAGGWEGTRDFRIVAKTPRSALITSFELEPVDGG  178
            AWG+AY  +A++FI  E ++Y E       W G + F I AK   ++ I  F ++P  G
Sbjct  121  AWGEAYQAIADIFITVEKKMYEEAL-----WPGWKPFDITAKEYVASDIVEFTVKPKFGS  175

Query  179  AVA----EYRPGQYLGVWLKP--EGFPHQEIRQYSLTRKPDGKGYRIAVKREE------G  226
             +        PGQY+ V   P  +   +  +R YSL       G R AVK E
Sbjct  176  GIELESLPITPGQYITVNTHPIRQENQYDALRHYSLCSASTKNGLRFAVKMEAARENFPA  235

Query  227  GQVSNWLHNHANVGDVVKLVAPAGDFF----MAVADDTPVTLISAGVGQTPMLAMLDTLA  282
            G VS +LH  A VGD +KL APAGDF     +   ++ P+ L+S+GVG TP+LAML+
Sbjct  236  GLVSEYLHKDAKVGDEIKLSAPAGDFAINKELIHQNEVPLVLLSSGVGVTPLLAMLEEQV  295

Query  283  KAGHTAQVNWFHAAENGDVHAFADEVKELGQSLPRFTAHTWYRQPSEADRAKGQFDSEGL  342
            K      + W  ++ +    AF   V EL              + +  D+     D+E L
Sbjct  296  KCNPNRPIYWIQSSYDEKTQAFKKHVDEL------------LAECANVDKIIVHTDTEPL  343

Query  343  MDLSKLEGAFSDPTMQFYLCGPVGFMQFTAKQLVDLGVKQENIHYECFGP  392
            ++ + L+   S      Y CG + FMQ     L +L  + + IHYE FGP
Sbjct  344  INAAFLKEK-SPAHADVYTCGSLAFMQAMIGHLKELEHRDDMIHYEPFGP  392
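The whole-number percentages in the blastp report can be compared with the underlying counts. The Python sketch below computes each percentage two ways, truncated and rounded. For this report the truncated values (34, 50, 8) agree with what is printed, while rounding would give 35, 51 and 9. This is an inference from the numbers shown here, not a statement about how BLAST is implemented, and the helper name truncated_percent is hypothetical.

```python
def truncated_percent(count: int, length: int) -> int:
    """Whole-number percentage with the fractional part dropped."""
    return (100 * count) // length


# (count, alignment length, percentage printed in the blastp report above)
reported = {
    "Identities": (143, 410, 34),
    "Positives": (208, 410, 50),
    "Gaps": (36, 410, 8),
}

for name, (count, length, shown) in reported.items():
    truncated = truncated_percent(count, length)
    rounded = round(100 * count / length)
    print(f"{name}: truncated={truncated} rounded={rounded} reported={shown}")
```

By contrast, the needle header in section 2 prints one decimal place, so 145/423 appears there as 34.3% and the two reports are not directly comparable digit for digit.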