Web document 5.3. PHI-BLAST.
We can perform a blastp search against the RefSeq database, restricted to bacterial sequences, using human RBP4 as the query.
The query is:
>gi|55743122|ref|NP_006735.2| retinol-binding protein 4, plasma precursor [Homo sapiens]
MKWVWALLLLAALGSGRAERDCRVSSFRVKENFDKARFSGTWYAMAKKDPEGLFLQDNIVAEFSVDETGQ
MSATAKGRVRLLNNWDVCADMVGTFTDTEDPAKFKMKYWGVASFLQKGNDDHWIVDTDYDTYAVQYSCRL
LNLDGTCADSYSFVFSRDPNGLPPEAQKIVRQRQEELCLARQYRLIVHNGYCDGRSERNLL
The two bacterial matches having low E values are:
>gi|119470685|ref|ZP_01613353.1| outer membrane lipoprotein (lipocalin) [Alteromonadales bacterium TW-7]
MKAITTILLITGLFLLTACTSAPEGITPVKNFDLEQYKGKWYEIARLDHSFEEGMEQVTATYTVNDDGTV
KVLNKGFITKEQKWDEAEGLAKFVEGTDTGHFKVSFFGPFYGAYVIFELDQDDYQYAFITSYNRDFLWFL
SRTPTVSDKLKQHFIAKANKLGFATEQIIWVKQ
>gi|84519543|ref|ZP_01006814.1| lipoprotein Blc [Prochlorococcus marinus str. MIT 9211]
MYLLLENGALAMMAVLRRWFLIVGLMGLASCTSLPEGIEPVSGFDSDRYLGTWYEIARLDHSFERGLTNV
RAEYSRNDDGSIKVINRGYNAEEEQWEEADGRAVFVEDENTGHLKVSFFGPFYASYVVFELDKDEYSYAY
VTGYDRDYLWFLSRTPEVS
A multiple sequence alignment, produced with ClustalW (Chapter 6), is shown here. The highly conserved GXW motif appears as GKW in ZP_01613353 and as GTW in the other two sequences:
CLUSTAL W (1.83) multiple sequence alignment

ZP_01613353     ------------MKAITTILLITGL-FLLTACTSAPEGITPVKNFDLEQYKGKWYEIARL 47
ZP_01006814     MYLLLENGALAMMAVLRRWFLIVGL-MGLASCTSLPEGIEPVSGFDSDRYLGTWYEIARL 59
human_NP_006735 ------------MKWVWALLLLAALGSGRAERDCRVSSFRVKENFDKARFSGTWYAMAKK 48
                            *  :   :*:..*    :   .  ..:   ..**  :: *.** :*:

ZP_01613353     DHSFEEGMEQVTATYTVNDDGTVKVLNKGFITKEQKWDEAEGLA-KFVEGTDTGHFKVSF 106
ZP_01006814     DHSFERGLTNVRAEYSRNDDGSIKVINRGYNAEEEQWEEADGRA-VFVEDENTGHLKVSF 118
human_NP_006735 DPEGLFLQDNIVAEFSVDETGQMSATAKGRVRLLNNWDVCADMVGTFTDTEDPAKFKMKY 108
                * .      :: * :: :: * :..  :*     ::*: . . .  *.:  :..::*:.:

ZP_01613353     FG--PFYG----AYVIFELDQDDYQYAFIT-------SYNRDFLWFLSRTP-TVSDKLKQ 152
ZP_01006814     FG--PFYA----SYVVFELDKDEYSYAYVT-------GYDRDYLWFLSRTP-EVS----- 159
human_NP_006735 WGVASFLQKGNDDHWIVDTDYDTYAVQYSCRLLNLDGTCADSYSFVFSRDPNGLPPEAQK 168
                :*  .*       : :.: * * *   :             .: :.:** *  :.

ZP_01613353     HFIAKANKLGFATEQIIWVKQ------------ 173
ZP_01006814     ---------------------------------
human_NP_006735 IVRQRQEELCLARQYRLIVHNGYCDGRSERNLL 201
One can select many possible patterns for PHI-BLAST searches. Here is one:
GXW[YF]X[VILMAFY]A[RKH]XD
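
Before submitting a pattern, it can help to confirm that it actually occurs in the query and in the sequences it was derived from. The following is a minimal sketch, not part of PHI-BLAST itself: it translates the pattern above into a Python regular expression and searches the three sequences listed earlier. The function name phi_pattern_to_regex and the dictionary names are hypothetical, and the translator assumes only single residues, X (any residue), bracketed residue sets, and optional hyphens; it does not handle repeat counts such as x(2,4).

```python
import re

def phi_pattern_to_regex(pattern: str) -> str:
    """Translate a simple PHI-BLAST-style pattern into a Python regex.

    Assumption: the pattern contains only single residues, X/x (any
    residue), [..] residue sets, and optional '-' separators.
    """
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "[":
            # A bracketed set is already valid regex syntax; copy it whole.
            end = pattern.index("]", i)
            parts.append(pattern[i:end + 1].upper())
            i = end + 1
        elif ch in "Xx":
            parts.append(".")  # X matches any single residue
            i += 1
        elif ch == "-":
            i += 1             # separators carry no meaning here
        else:
            parts.append(ch.upper())
            i += 1
    return "".join(parts)

PATTERN = "GXW[YF]X[VILMAFY]A[RKH]XD"

# Sequences copied from the FASTA records above.
SEQUENCES = {
    "human_NP_006735": (
        "MKWVWALLLLAALGSGRAERDCRVSSFRVKENFDKARFSGTWYAMAKKDPEGLFLQDNIVAEFSVDETGQ"
        "MSATAKGRVRLLNNWDVCADMVGTFTDTEDPAKFKMKYWGVASFLQKGNDDHWIVDTDYDTYAVQYSCRL"
        "LNLDGTCADSYSFVFSRDPNGLPPEAQKIVRQRQEELCLARQYRLIVHNGYCDGRSERNLL"
    ),
    "ZP_01613353": (
        "MKAITTILLITGLFLLTACTSAPEGITPVKNFDLEQYKGKWYEIARLDHSFEEGMEQVTATYTVNDDGTV"
        "KVLNKGFITKEQKWDEAEGLAKFVEGTDTGHFKVSFFGPFYGAYVIFELDQDDYQYAFITSYNRDFLWFL"
        "SRTPTVSDKLKQHFIAKANKLGFATEQIIWVKQ"
    ),
    "ZP_01006814": (
        "MYLLLENGALAMMAVLRRWFLIVGLMGLASCTSLPEGIEPVSGFDSDRYLGTWYEIARLDHSFERGLTNV"
        "RAEYSRNDDGSIKVINRGYNAEEEQWEEADGRAVFVEDENTGHLKVSFFGPFYASYVVFELDKDEYSYAY"
        "VTGYDRDYLWFLSRTPEVS"
    ),
}

regex = re.compile(phi_pattern_to_regex(PATTERN))

# Record the 1-based start position and matched text of the first hit.
hits = {}
for name, seq in SEQUENCES.items():
    match = regex.search(seq)
    hits[name] = (match.start() + 1, match.group()) if match else None
    print(name, hits[name])
```

A sequence reported as None would mean the pattern is too strict for that sequence; because PHI-BLAST requires the pattern to occur in the query, the human RBP4 entry is the one that matters most.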