Identifier
OSIALOPTASE
Accession
PR00789
No. of Motifs
6
Creation Date
11-JUN-1997  (UPDATE 07-JUN-1999)
Title
O-sialoglycoprotein endopeptidase (M22) metallo-protease family signature
Database References

PROSITE; PS01016 GLYCOPROTEASE
PFAM; PF00814 Glycoprotease
INTERPRO; IPR000905
Literature References
1. RAWLINGS, N.D. AND BARRETT, A.J.
Evolutionary families of metallopeptidases.
METHODS ENZYMOL. 248 183-228 (1995).
 
2. MELLORS, A. AND LO, R.Y.C.
O-Sialoglycoprotease from Pasteurella haemolytica.
METHODS ENZYMOL. 248 728-740 (1995).

Documentation
Metalloproteases are the most diverse of the four main types of protease,
with more than 30 families identified to date [1]. Of these, around
half contain the HEXXH motif, which has been shown in crystallographic
studies to form part of the metal-binding site [1]. The HEXXH motif is 
relatively common, but can be more stringently defined for metallo-
proteases as abXHEbbHbc, where a is most often valine or threonine and 
forms part of the S1' subsite in thermolysin and neprilysin, b is an
uncharged residue, and c a hydrophobic residue. Proline is never found
in this site, possibly because it would break the helical structure 
adopted by this motif in metalloproteases [1].
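
The stricter abXHEbbHbc definition can be expressed as a regular expression. The following is a minimal sketch, not part of the original entry: the residue sets chosen for "uncharged" and "hydrophobic" are assumptions, as is the reading that proline is excluded from the b and X positions. The test peptides are illustrative, except for the motif 3 element, which is taken from this entry.

```python
import re

# Assumed residue classes; the text does not enumerate them.
# b: uncharged residues (standard 20 minus D, E, K, R), with proline removed
#    as one reading of "proline is never found in this site".
B = "ACFGHILMNQSTVWY"
# c: an assumed hydrophobic set.
C = "AFILMVWY"

# a b X H E b b H b c
# 'a' is left unrestricted because Val/Thr are only "most often" found there;
# X is any residue except proline (assumption).
STRICT_HEXXH = re.compile(rf"[A-Z][{B}][^P]HE[{B}][{B}]H[{B}][{C}]")
LOOSE_HEXXH = re.compile(r"HE[A-Z]{2}H")


def classify(seq: str) -> str:
    """Return 'strict', 'loose' or 'none' for the best HEXXH-type match."""
    if STRICT_HEXXH.search(seq):
        return "strict"
    if LOOSE_HEXXH.search(seq):
        return "loose"
    return "none"


if __name__ == "__main__":
    for peptide in ("VVAHELTHAV", "KDPHEDDHKK", "SLAYAWNVPAIGVHHMEGHL"):
        print(peptide, classify(peptide))
```

The last peptide is the GCP_HAEIN element of motif 3: its HMEGH segment has the spacing H-X-E-X-H rather than H-E-X-X-H, so neither pattern matches it.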
 
Metalloproteases can be split into five groups on the basis of their metal-
binding residues: the first three contain the HEXXH motif, the other two
do not [1]. In the first group, a glutamic acid completes the active site;
these are termed HEXXH+E. All families in this group show some sequence
relationship and have been assigned to clan MA [1]. Families in the second
group, which have a third histidine as the extra metal-binding residue, are
termed HEXXH+H and are grouped into clan MB on the basis of their
inter-relationship [1]. In the third group, the additional metal-binding
residues are unidentified. The fourth group is diverse: the metal-binding
residues are known but do not form the HEXXH motif. The fifth group
comprises the remaining families, in which the metal-binding residues are
as yet unknown [1].
 
O-Sialoglycoprotein endopeptidase is secreted by the bacterium Pasteurella
haemolytica and digests only proteins that are heavily sialylated, in
particular those with sialylated serine and threonine residues [2].
Substrate proteins include glycophorin A and leukocyte surface antigens
CD34, CD43, CD44 and CD45 [1,2]. Removal of sialic acid residues by
treatment with neuraminidase abolishes susceptibility to
O-sialoglycoprotein endopeptidase digestion [1,2].
 
Sequence similarity searches have revealed other members of the M22 family,
from yeast, Mycobacterium, Haemophilus influenzae and the cyanobacterium
Synechocystis [1]. The zinc-binding and catalytic residues of this family 
have not been determined, although the motif HMEGH may be a zinc-binding
region [1].
 
OSIALOPTASE is a 6-element fingerprint that provides a signature for the
O-sialoglycoprotein endopeptidase (M22) family of metalloproteases. The
fingerprint was derived from an initial alignment of 6 sequences: the
motifs were drawn from conserved regions spanning the N-terminal half of
the alignment. Motif 3 includes the region encoded by PROSITE pattern
GLYCOPROTEASE (PS01016), which describes the region surrounding the HMEGH
motif, and motif 4 contains well-conserved histidines that may be involved
in zinc binding. Three iterations on OWL29.3 were required to reach
convergence, at which point a true set comprising 13 sequences was
identified. A single partial match was also found, YHSH_HALMA, a
hypothetical protein fragment from the archaebacterium Haloarcula
marismortui that makes only a weak match with motif 4 and lacks the
portion of sequence bearing motif 6.
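
The degree of conservation in a motif can be checked column by column. The sketch below is an illustration added here, not the method used to build the fingerprint. It tallies the most common residue at each position of motif 4, using the six elements listed under Initial Motifs in this entry. The function name `column_consensus` is hypothetical.

```python
from collections import Counter

# Motif 4 elements copied from the Initial Motifs table of this entry.
MOTIF4_INITIAL = {
    "GCP_HAEIN": "VALLVSGGHTQLV",
    "GCP_PASHA": "VALLISGGHTQLV",
    "Y246_MYCLE": "VALLVSGGHTHLL",
    "YGJD_ECOLI": "VALLVCGGHTQLI",
    "GCP_MYCGE": "LGLVISGGHTAIY",
    "YK18_YEAST": "VVLYVSGGNTQVI",
}


def column_consensus(elements):
    """Return a (residue, count) pair for the commonest residue per column."""
    seqs = list(elements)
    width = len(seqs[0])
    if any(len(s) != width for s in seqs):
        raise ValueError("all motif elements must have the same width")
    # zip(*seqs) walks the alignment one column at a time.
    return [Counter(column).most_common(1)[0] for column in zip(*seqs)]


consensus = column_consensus(MOTIF4_INITIAL.values())

if __name__ == "__main__":
    total = len(MOTIF4_INITIAL)
    for position, (residue, count) in enumerate(consensus, start=1):
        print(f"{position:2d}  {residue}  {count}/{total}")
```

In this initial set the histidine at position 9 occurs in five of the six elements (YK18_YEAST has asparagine there), while the two glycines preceding it are invariant.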
 
An update on SPTR37_9f identified a true set of 23 sequences, and 1
partial match.
Summary Information
23 codes involving 6 elements
1 codes involving 5 elements
0 codes involving 4 elements
0 codes involving 3 elements
0 codes involving 2 elements
Composite Feature Index
    6|  23   23   23   23   23   23
    5|   1    1    1    1    1    0
    4|   0    0    0    0    0    0
    3|   0    0    0    0    0    0
    2|   0    0    0    0    0    0
   --+------------------------------
     |   1    2    3    4    5    6
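
The composite feature index tabulates, for each number of matched elements (rows), how many sequences matched each motif (columns). The following minimal sketch shows how such a table can be computed from per-sequence match sets. The function name and the placeholder identifiers `TP01` to `TP23` are hypothetical. The assumption that the single 5-element partial match lacks motif 6 is read from the index itself.

```python
def composite_feature_index(matches, n_motifs):
    """Build the index from a mapping of sequence id -> set of motif numbers.

    Returns a dict keyed by the number of matched elements (2..n_motifs);
    each value is a list giving, per motif, how many sequences matched it.
    """
    rows = {k: [0] * n_motifs for k in range(2, n_motifs + 1)}
    for motifs in matches.values():
        k = len(motifs)
        if k in rows:
            for m in motifs:
                rows[k][m - 1] += 1
    return rows


# 23 true positives matching all six motifs (placeholder identifiers),
# plus one partial match assumed to match motifs 1-5 only.
matches = {f"TP{i:02d}": {1, 2, 3, 4, 5, 6} for i in range(1, 24)}
matches["GCP_HELPY"] = {1, 2, 3, 4, 5}

index = composite_feature_index(matches, n_motifs=6)

if __name__ == "__main__":
    for k in sorted(index, reverse=True):
        print(f"{k:5d}|" + "".join(f"{n:5d}" for n in index[k]))
```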
True Positives
GCP_BORBU GCP_HAEIN GCP_MYCGE GCP_MYCPN
GCP_PASHA O22145 O29153 O49653
O57716 O66986 O83686 O84200
O86793 Q93170 QRI7_YEAST Y09A_MYCTU
Y246_MYCLE Y807_SYNY3 YB30_METJA YDIE_BACSU
YE25_METTH YGJD_ECOLI YK18_YEAST
True Positive Partials
Codes involving 5 elements
GCP_HELPY
Sequence Titles
GCP_BORBU PUTATIVE O-SIALOGLYCOPROTEIN ENDOPEPTIDASE (EC 3.4.24.57) (GLYCOPROTEASE) - BORRELIA BURGDORFERI (LYME DISEASE SPIROCHETE).
GCP_HAEIN PROBABLE O-SIALOGLYCOPROTEIN ENDOPEPTIDASE (EC 3.4.24.57) (GLYCOPROTEASE) - HAEMOPHILUS INFLUENZAE.
GCP_MYCGE PUTATIVE O-SIALOGLYCOPROTEIN ENDOPEPTIDASE (EC 3.4.24.57) (GLYCOPROTEASE) - MYCOPLASMA GENITALIUM.
GCP_MYCPN PUTATIVE O-SIALOGLYCOPROTEIN ENDOPEPTIDASE (EC 3.4.24.57) (GLYCOPROTEASE) - MYCOPLASMA PNEUMONIAE.
GCP_PASHA O-SIALOGLYCOPROTEIN ENDOPEPTIDASE (EC 3.4.24.57) (GLYCOPROTEASE) - PASTEURELLA HAEMOLYTICA.
O22145 PUTATIVE SIALOGLYCOPROTEASE - ARABIDOPSIS THALIANA (MOUSE-EAR CRESS).
O29153 O-SIALOGLYCOPROTEIN ENDOPEPTIDASE (GCP) - ARCHAEOGLOBUS FULGIDUS.
O49653 GLYCOPROTEIN ENDOPEPTIDASE - LIKE PROTEIN - ARABIDOPSIS THALIANA (MOUSE-EAR CRESS).
O57716 324AA LONG HYPOTHETICAL O-SIALOGLYCOPROTEIN ENDOPEPTIDASE - PYROCOCCUS HORIKOSHII.
O66986 SIALOGLYCOPROTEASE - AQUIFEX AEOLICUS.
O83686 O-SIALOGLYCOPROTEIN ENDOPEPTIDASE (GCP) - TREPONEMA PALLIDUM.
O84200 O-SIALOGLYCOPROTEIN ENDOPEPTIDASE - CHLAMYDIA TRACHOMATIS.
O86793 PUTATIVE O-SIALOGLYCOPROTEIN ENDOPEPTIDASE - STREPTOMYCES COELICOLOR.
Q93170 C01G10.10 PROTEIN - CAENORHABDITIS ELEGANS.
QRI7_YEAST PUTATIVE PROTEASE QRI7 (EC 3.4.24.-) - SACCHAROMYCES CEREVISIAE (BAKER'S YEAST).
Y09A_MYCTU HYPOTHETICAL 35.1 KD PROTEIN CY78.10 - MYCOBACTERIUM TUBERCULOSIS.
Y246_MYCLE HYPOTHETICAL 35.4 KD PROTEIN IN GROES-ALR INTERGENIC REGION - MYCOBACTERIUM LEPRAE.
Y807_SYNY3 HYPOTHETICAL 37.1 KD PROTEIN SLR0807 - SYNECHOCYSTIS SP. (STRAIN PCC 6803).
YB30_METJA HYPOTHETICAL PROTEIN MJ1130 - METHANOCOCCUS JANNASCHII.
YDIE_BACSU HYPOTHETICAL 36.8 KD PROTEIN IN PHOB-GROES INTERGENIC REGION - BACILLUS SUBTILIS.
YE25_METTH O-SIALOGLYCOPROTEIN ENDOPEPTIDASE - METHANOBACTERIUM THERMOAUTOTROPHICUM.
YGJD_ECOLI HYPOTHETICAL 36.0 KD PROTEIN IN TTDB-RPSU INTERGENIC REGION (ORF-X) - ESCHERICHIA COLI.
YK18_YEAST HYPOTHETICAL 46.6 KD PROTEIN IN DAL80-GAP1 INTERGENIC REGION - SACCHAROMYCES CEREVISIAE (BAKER'S YEAST).

GCP_HELPY PUTATIVE O-SIALOGLYCOPROTEIN ENDOPEPTIDASE (EC 3.4.24.57) (GLYCOPROTEASE) - HELICOBACTER PYLORI (CAMPYLOBACTER PYLORI).
Scan History
OWL29_3 3 250 NSINGLE
SPTR37_9f 2 29 NSINGLE
Initial Motifs
Motif 1 width=14
Element Seqn Id St Int Rpt
LGIETSCDETGVAI GCP_HAEIN 4 4 -
LGIETSCDETGVAI GCP_PASHA 4 4 -
LAIETSCDETGVGI Y246_MYCLE 12 12 -
LGIETSCDETGIAI YGJD_ECOLI 4 4 -
LGIETTCDDTGLSI GCP_MYCGE 8 8 -
VTIFTSCNDAYIYL YK18_YEAST 6 6 -

Motif 2 width=21
Element Seqn Id St Int Rpt
IAYTSGPGLVGALLVGATIAR GCP_HAEIN 76 58 -
IAYTAGPGLVGALLVGSTIAR GCP_PASHA 76 58 -
VAATIGPGLAGALLVGVAAAK Y246_MYCLE 89 63 -
VAYTAGPGLVGALLVGATVGR YGJD_ECOLI 76 58 -
IAYACNPGLAGCLHVGATFAR GCP_MYCGE 75 53 -
ICFTKGPGMGAPLHSVVIAAR YK18_YEAST 141 121 -

Motif 3 width=20
Element Seqn Id St Int Rpt
SLAYAWNVPAIGVHHMEGHL GCP_HAEIN 97 0 -
SLAYAWNVPALGVHHMEGHL GCP_PASHA 97 0 -
AYSAAWGVPFYAVNHLGGHL Y246_MYCLE 110 0 -
SLAFAWDVPAIPVHHMEGHL YGJD_ECOLI 97 0 -
SLSFLLDKPLLPINHLYAHI GCP_MYCGE 96 0 -
TCSLLWDVPLVGVNHCIGHI YK18_YEAST 162 0 -

Motif 4 width=13
Element Seqn Id St Int Rpt
VALLVSGGHTQLV GCP_HAEIN 131 14 -
VALLISGGHTQLV GCP_PASHA 131 14 -
VALLVSGGHTHLL Y246_MYCLE 143 13 -
VALLVCGGHTQLI YGJD_ECOLI 131 14 -
LGLVISGGHTAIY GCP_MYCGE 132 16 -
VVLYVSGGNTQVI YK18_YEAST 194 12 -

Motif 5 width=22
Element Seqn Id St Int Rpt
IGESIDDAAGEAFDKTAKLLGL GCP_HAEIN 154 10 -
LGESIDDAAGEAFDKTGKLLGL GCP_PASHA 154 10 -
LGSTVDDAAGEAYDKVARLLGL Y246_MYCLE 167 11 -
LGESIDDAAGEAFDKTAKLLGL YGJD_ECOLI 154 10 -
IAETSDDAIGEVYDKIGRAMGF GCP_MYCGE 155 10 -
FGETLDIAIGNCLDRFARTLKI YK18_YEAST 216 9 -

Motif 6 width=10
Element Seqn Id St Int Rpt
LVIAGGVSAN GCP_HAEIN 268 92 -
LVMAGGVSAN GCP_PASHA 268 92 -
LLIVGGVAAN Y246_MYCLE 276 87 -
LVMAGGVSAN YGJD_ECOLI 263 87 -
LLVGGGVSAN GCP_MYCGE 264 87 -
VLIVGGVGCN YK18_YEAST 341 103 -
Final Motifs
Motif 1 width=14
Element Seqn Id St Int Rpt
LGIETSCDETGVAI GCP_HAEIN 4 4 -
LGIETSCDETGVGI Y09A_MYCTU 5 5 -
LGIETSCDETGVAI GCP_PASHA 4 4 -
LAIETSCDETGVGI Y246_MYCLE 12 12 -
LGIETSCDETGVGV O86793 11 11 -
LGIETSCDETAAAI YDIE_BACSU 10 10 -
LGIETSCDETGIAI YGJD_ECOLI 4 4 -
LAIETSCDETAVAI Y807_SYNY3 5 5 -
LAVETSCDETALAI O66986 4 4 -
LGIETSCDETAVAI O83686 4 4 -
LGIEGTAHTLGIGI O57716 4 4 -
LGIETTCDDTSIGV GCP_MYCPN 8 8 -
LGIEGTAEKTGVGI YE25_METTH 4 4 -
LGIETTCDDTGLSI GCP_MYCGE 8 8 -
LGIETSCDDTAAAV O22145 87 87 -
LGLEGTAEKTGVGI YB30_METJA 4 4 -
LGIETSCDDCCVAV GCP_BORBU 4 4 -
LAIETSCDDTCVSV QRI7_YEAST 36 36 -
LGIEGTAWSLSIGV O29153 4 4 -
LGLESSCDETSCSL O84200 4 4 -
VTIFTSCNDAYIYL YK18_YEAST 6 6 -
LGIETSCDDTAVAI Q93170 26 26 -
IGFEGSANKIGVGI O49653 8 8 -

Motif 2 width=21
Element Seqn Id St Int Rpt
IAYTSGPGLVGALLVGATIAR GCP_HAEIN 76 58 -
VAATIGPGLAGALLVGVAAAK Y09A_MYCTU 79 60 -
IAYTAGPGLVGALLVGSTIAR GCP_PASHA 76 58 -
VAATIGPGLAGALLVGVAAAK Y246_MYCLE 89 63 -
IAVTAGPGLAGALLVGVSAAK O86793 82 57 -
IAVTEGPGLVGALLIGVNAAK YDIE_BACSU 82 58 -
VAYTAGPGLVGALLVGATVGR YGJD_ECOLI 76 58 -
IAVTVAPGLAGALMVGVTAAK Y807_SYNY3 76 57 -
ISFTLTPGLILSLVVGVAFAK O66986 76 58 -
IAVTHAPGLTGSLLVGLTFAK O83686 76 58 -
IAFSQGPGLGPALRVVATAAR O57716 72 54 -
IAYAANPGLPGCLHVGATFAR GCP_MYCPN 75 53 -
ISFSRGPGLGPALRTVATAAR YE25_METTH 73 55 -
IAYACNPGLAGCLHVGATFAR GCP_MYCGE 75 53 -
VAVTIGPGLSLCLRVGVRKAR O22145 156 55 -
IAFSQGPGLGPSLRVTATVAR YB30_METJA 71 53 -
IAVTSRPGLIGSLIVGLNFAK GCP_BORBU 75 57 -
ICVTRGPGMPGSLSGGLDFAK QRI7_YEAST 110 60 -
VAFSQGPGMGPCLRVVATAAR O29153 70 52 -
ISVANTPGLIGALSIGVNFAK O84200 74 56 -
ICFTKGPGMGAPLHSVVIAAR YK18_YEAST 141 121 -
VAVTVTPGLVIALKEGISAAI Q93170 98 58 -
ICYTKGPGMGAPLQVSAIVVR O49653 78 56 -

Motif 3 width=20
Element Seqn Id St Int Rpt
SLAYAWNVPAIGVHHMEGHL GCP_HAEIN 97 0 -
AYSAAWGVPFYAVNHLGGHL Y09A_MYCTU 100 0 -
SLAYAWNVPALGVHHMEGHL GCP_PASHA 97 0 -
AYSAAWGVPFYAVNHLGGHL Y246_MYCLE 110 0 -
AYAYALGKPLYGVNHLASHI O86793 103 0 -
ALSFAYNIPLVGVHHIAGHI YDIE_BACSU 103 0 -
SLAFAWDVPAIPVHHMEGHL YGJD_ECOLI 97 0 -
TLAMVHQKPFLGVHHLEGHI Y807_SYNY3 97 0 -
ALAYEYRKPLVPVHHLEGHI O66986 97 0 -
TLAWSMHLPFIAVNHLHAHF O83686 97 0 -
ALAIRYNKPIVGVNHCIAHV O57716 93 0 -
SLSFLLDKPLLPINHLYAHI GCP_MYCPN 96 0 -
TLALSLDVPIVGVNHCIGHI YE25_METTH 94 0 -
SLSFLLDKPLLPINHLYAHI GCP_MYCGE 96 0 -
RVAGNFSLPIVGVHHMEAHA O22145 177 0 -
TLSLTLKKPIIGVNHCIAHI YB30_METJA 92 0 -
GLAISLKKPIICIDHILGHL GCP_BORBU 96 0 -
GLAVAWNKPLIGVHHMLGHL QRI7_YEAST 131 0 -
LLAIKLEKPLVGVNHCLAHV O29153 91 0 -
GLASGLKRPLIGVNHVEAHL O84200 95 0 -
TCSLLWDVPLVGVNHCIGHI YK18_YEAST 162 0 -
GFAKKHRLPLIPVHHMRAHA Q93170 119 0 -
VLSQLWKKPIVAVNHCVAHI O49653 99 0 -

Motif 4 width=13
Element Seqn Id St Int Rpt
VALLVSGGHTQLV GCP_HAEIN 131 14 -
VALLVSGGHTHLL Y09A_MYCTU 133 13 -
VALLISGGHTQLV GCP_PASHA 131 14 -
VALLVSGGHTHLL Y246_MYCLE 143 13 -
MALLVSGGHSSLL O86793 137 14 -
LALVVSGGHTELV YDIE_BACSU 136 13 -
VALLVCGGHTQLI YGJD_ECOLI 131 14 -
LCLLVSGGHTSLI Y807_SYNY3 131 14 -
LALIISGGHTDLY O66986 130 13 -
VGLLASGGHALVC O83686 130 13 -
VGLYVSGGNTQVL O57716 124 11 -
LGLVVSGGHTAIY GCP_MYCPN 132 16 -
VSLYVSGGNTQVI YE25_METTH 126 12 -
LGLVISGGHTAIY GCP_MYCGE 132 16 -
MALLISGGHNLLV O22145 211 14 -
LTLYVSGGNTQVI YB30_METJA 124 12 -
ISLLLSGGHTLIA GCP_BORBU 129 13 -
VSLLVSGGHTTFV QRI7_YEAST 167 16 -
VSLYVSGGNSQVI O29153 123 12 -
LGLAISGAHTSLF O84200 129 14 -
VVLYVSGGNTQVI YK18_YEAST 194 12 -
SAVLLSGGHALIS Q93170 153 14 -
VVLYVSGGNTQVI O49653 131 12 -

Motif 5 width=22
Element Seqn Id St Int Rpt
IGESIDDAAGEAFDKTAKLLGL GCP_HAEIN 154 10 -
LGSTVDDAAGEAYDKVARLLGL Y09A_MYCTU 157 11 -
LGESIDDAAGEAFDKTGKLLGL GCP_PASHA 154 10 -
LGSTVDDAAGEAYDKVARLLGL Y246_MYCLE 167 11 -
LGATIDDAAGEAFDKIARVLNL O86793 161 11 -
IGETLDDAAGEAYDKVARTMGL YDIE_BACSU 159 10 -
LGESIDDAAGEAFDKTAKLLGL YGJD_ECOLI 154 10 -
LGTTRDDAAGEAFDKVARLLDL Y807_SYNY3 154 10 -
LGGTLDDAVGEAYDKVAKMLGL O66986 153 10 -
LGATIDDAPGEAFDKVAAFYGF O83686 153 10 -
FGETLDIGIGNAIDVFARELGL O57716 146 9 -
IAETSDDAIGEVYDKVGRAMGF GCP_MYCPN 155 10 -
FGETLDIAVGNMLDQFARESGL YE25_METTH 148 9 -
IAETSDDAIGEVYDKIGRAMGF GCP_MYCGE 155 10 -
LGTTVDDAIGEAFDKTAKWLGL O22145 234 10 -
FGETLDIAVGNCLDQFARYVNL YB30_METJA 146 9 -
LGRTLDDACGEAFDKVAKHYDM GCP_BORBU 152 10 -
LCDTIDIAVGDSLDKCGRELGF QRI7_YEAST 190 10 -
FGETLDIGIGNALDKLARHMGL O29153 145 9 -
IGKTRDDAIGETFDKVARFLGL O84200 152 10 -
FGETLDIAIGNCLDRFARTLKI YK18_YEAST 216 9 -
YGQSVSGSPGECIDKVARQLGD Q93170 176 10 -
FGETIDIAVGNCLDRFARVLKL O49653 153 9 -

Motif 6 width=10
Element Seqn Id St Int Rpt
LVIAGGVSAN GCP_HAEIN 268 92 -
LLIAGGVAAN Y09A_MYCTU 269 90 -
LVMAGGVSAN GCP_PASHA 268 92 -
LLIVGGVAAN Y246_MYCLE 276 87 -
LMIGGGVAAN O86793 274 91 -
VLLAGGVAAN YDIE_BACSU 269 88 -
LVMAGGVSAN YGJD_ECOLI 263 87 -
ITVGGGVAAN Y807_SYNY3 270 94 -
LVVVGGVSAN O66986 259 84 -
AVVCGGVAAN O83686 266 91 -
VVLVGGVAAN O57716 248 80 -
LLLGGGVSAN GCP_MYCPN 268 91 -
VLLCGGVAVN YE25_METTH 249 79 -
LLVGGGVSAN GCP_MYCGE 264 87 -
MVISGGVASN O22145 348 92 -
VMLVGGVAAN YB30_METJA 247 79 -
LVIAGGVASN GCP_BORBU 265 91 -
FVCSGGVSSN QRI7_YEAST 320 108 -
VLLVGGVAAN O29153 246 79 -
LIVGGGVANN O84200 268 94 -
VLIVGGVGCN YK18_YEAST 341 103 -
LVIGGGVAAN Q93170 300 102 -
VLIVGGVGCN O49653 261 86 -
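
In the motif tables, the St and Int columns appear to be related as follows: for the first motif Int equals St, and for each later motif Int is the gap between the end of the previous motif and the start of the current one, that is St minus (previous St plus previous width). This reading is an inference from the numbers, not a definition quoted from the entry. The sketch below checks it against two sequences from the tables; the function name `expected_intervals` is hypothetical.

```python
# Motif widths as given in the "Motif N width=W" headers of this entry.
WIDTHS = [14, 21, 20, 13, 22, 10]

# (St, Int) pairs for motifs 1-6, copied from the motif tables.
ROWS = {
    "GCP_HAEIN": [(4, 4), (76, 58), (97, 0), (131, 14), (154, 10), (268, 92)],
    "YK18_YEAST": [(6, 6), (141, 121), (162, 0), (194, 12), (216, 9), (341, 103)],
}


def expected_intervals(starts, widths):
    """Derive intervals from start positions under the assumed relationship."""
    intervals = [starts[0]]  # first motif: interval equals its start
    for i in range(1, len(starts)):
        previous_end = starts[i - 1] + widths[i - 1]
        intervals.append(starts[i] - previous_end)
    return intervals


def check(seq_id):
    """Return True if the listed Int values match the derived ones."""
    starts = [st for st, _ in ROWS[seq_id]]
    listed = [gap for _, gap in ROWS[seq_id]]
    return expected_intervals(starts, WIDTHS) == listed


if __name__ == "__main__":
    for seq_id in ROWS:
        print(seq_id, "consistent" if check(seq_id) else "inconsistent")
```

An interval of 0 for motif 3 means that it directly abuts motif 2 in every sequence listed.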