Given a point-based motif, the LabelHash algorithm computes all matches in either the full PDB or a non-redundant version of the PDB (with less than 95% sequence identity). Both data sets are snapshots from March 25, 2009. The points of a motif are the C-alpha coordinates of a small number of residues, and each motif point can be given alternate residue labels, that is, a set of residue types allowed at that position. Computing matches in the full PDB is about 5 times slower than computing matches in the non-redundant PDB.
On the main page, fill out the form to specify a motif. When you click submit, your job is added to a queue. You will receive an email message once your job starts, and another once it finishes. The second message contains a link to a web page with the matching results. This web page shows information for the top 20 matches and includes a link to an XML file of all matches. You can visualize the matches in this file with our ViewMatch plugin for Chimera. You can also view the file as indented plain text with the command:
xmllint --format mymatches.xml | less
The “xmllint” program is part of libxml2. It is already installed on OS X, while on Linux it is part of a package called libxml2-utils, which you can install through your package manager. If you would like to run LabelHash on your own machines, download the command line version of LabelHash.
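If xmllint is not available, the XML file of matches can also be indented with the Python standard library. The following is a minimal sketch, not part of LabelHash itself: the function name `pretty_xml` is hypothetical, and the code assumes nothing about the structure of the match file beyond it being well-formed XML.

```python
import sys
import xml.dom.minidom


def pretty_xml(data, indent="  "):
    """Return an indented copy of a well-formed XML document.

    `data` may be bytes (as read from a file) or a str.
    """
    dom = xml.dom.minidom.parseString(data)
    pretty = dom.toprettyxml(indent=indent)
    # toprettyxml() emits whitespace-only lines when the input was already
    # partly indented, so drop those lines from the result.
    return "\n".join(line for line in pretty.splitlines() if line.strip())


if __name__ == "__main__" and len(sys.argv) > 1:
    # Read in binary mode so the parser can honor the file's own
    # encoding declaration.
    with open(sys.argv[1], "rb") as handle:
        print(pretty_xml(handle.read()))
```

Saved under a name of your choice, for example `prettyxml.py`, the script can be run as `python prettyxml.py mymatches.xml | less`.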
We have set up a mailing list for LabelHash users. You can send a message to the list by emailing labelhash@mailman.rice.edu. You can also subscribe to the list or browse the archived messages sent to the list.
Below are some example motifs that were used to test the LabelHash algorithm. For each motif we have pre-computed the best matches in the entire PDB from March 25, 2009. For each motif we list the EC class it represents, the PDB ID (with chain identifier) of the protein that was used to get the 3D coordinates of the motif, and the motif residues. The residues are specified by residue ID in the corresponding PDB file. The allowed residue types for each residue follow its residue ID as one-letter residue abbreviations; for example, 194AC means that residue 194 may be matched by A or C. If only one residue type is listed, no substitutions are allowed. The last column has a link to the corresponding match results page.
EC class | PDB ID | Motif | Matches |
---|---|---|---|
1.14.13.39 | 1dwwA | 194AC, 346V, 366W, 367FY, 371E, 376DN | results page |
2.1.1.73 | 7mhtA | 80P, 81C, 85ST, 119EL, 163R, 165R | results page |
2.1.1.77 | 1jg1A | 97DENQ, 99G, 101AGL, 160DNS, 179ILV, 183EGN | results page |
2.1.1.79 | 1kpgA | 17D, 72G, 74G, 75W, 76G, 200F | results page |
2.7.4.6 | 1ucnA | 12K, 13P, 92G, 105R, 115N, 118G | results page |
3.1.3.1 | 1aniA | 51AD, 101DE, 102S, 166CRS, 331GH, 412HNQ | results page |
3.2.1.15 | 1czfA | 178N, 180D, 201D, 256HR, 258K, 291Y | results page |
3.4.24.27 | 8tlnE | 120LMW, 143AE, 144ILV, 157LSY, 231HL | results page |
4.1.1.48 | 1lbfA | 51E, 56S, 57P, 89F, 91G, 112F, 159E, 180N, 211S, 233G | results page |
4.1.1.49 | 1aylA | 249L, 250S, 251G, 253G, 254K, 255T | results page |
4.2.1.84 | 2ahjA | 53P, 120L, 127Y, 190V, 193D, 196I | results page |
5.1.3.13 | 1ep0A | 53AST, 61AR, 64H, 73K, 90R, 172D | results page |
5.3.1.5 | 1didA | 25F, 53H, 56D, 93F, 136W, 182K | results page |
6.1.1.14 | 1ggmA | 188ET, 311LR, 239ET, 241E, 359ES, 361AS | results page |
6.1.1.20 | 1b7yA | 149AGW, 178HQ, 180ST, 206DER, 218Q, 258FNY, 260FY | results page |
6.1.1.21 | 1adyA | 81DE, 83T, 112RS, 130DE, 264LY, 311KNQR | results page |
6.3.4.5 | 1kp3A | 106R, 139F, 202ES, 286K, 288R, 331Y | results page |
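
The residue notation used in the Motif column can be split into residue IDs and allowed residue types with a few lines of Python. This is an illustrative sketch only: `MotifPoint` and `parse_motif` are hypothetical names, not part of LabelHash, and the pattern assumes plain positive residue numbers without insertion codes, as in the table above.

```python
import re
from typing import List, NamedTuple


class MotifPoint(NamedTuple):
    residue_id: int      # residue number in the PDB file
    allowed: frozenset   # one-letter residue types allowed at this point


# A residue number followed by one or more one-letter residue codes.
_POINT = re.compile(r"^(\d+)([A-Z]+)$")


def parse_motif(text: str) -> List[MotifPoint]:
    """Parse a comma-separated motif such as '194AC, 346V'."""
    points = []
    for token in text.split(","):
        match = _POINT.match(token.strip())
        if match is None:
            raise ValueError(f"unrecognized motif residue: {token!r}")
        points.append(MotifPoint(int(match.group(1)), frozenset(match.group(2))))
    return points


# First row of the table (EC 1.14.13.39, PDB ID 1dwwA).
motif = parse_motif("194AC, 346V, 366W, 367FY, 371E, 376DN")
for point in motif:
    print(point.residue_id, "".join(sorted(point.allowed)))
```

A point whose `allowed` set has a single element permits no substitutions.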