The LabelHash Server


Given a point-based motif, the LabelHash algorithm computes all matches in either the full PDB or a non-redundant version of the PDB (with less than 95% sequence identity), both as of March 25, 2009. The motif points are the C-alpha coordinates of a set of residues. Each motif point can also be given alternate residue labels, so that substitutions are allowed at that position. Searching the full PDB takes about five times as long as searching the non-redundant PDB.
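To make "C-alpha coordinates of a set of residues" concrete, here is a minimal Python sketch that pulls those coordinates out of PDB-format ATOM records using the standard fixed-column layout. The function name `ca_coordinates` and the sample records are hypothetical (the coordinates are made up, not taken from any PDB entry), and this is not the input format of the server form, which is not described here.

```python
def ca_coordinates(pdb_lines, chain, residue_ids):
    """Return {residue number: (x, y, z)} for the C-alpha atoms requested."""
    wanted = set(residue_ids)
    coords = {}
    for line in pdb_lines:
        # Only ATOM records long enough to hold coordinates are considered.
        if not line.startswith("ATOM") or len(line) < 54:
            continue
        atom_name = line[12:16].strip()   # columns 13-16: atom name
        chain_id = line[21]               # column 22: chain identifier
        res_seq = int(line[22:26])        # columns 23-26: residue number
        if atom_name != "CA" or chain_id != chain or res_seq not in wanted:
            continue
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        # Keep the first C-alpha seen, so alternate locations are ignored.
        coords.setdefault(res_seq, xyz)
    return coords


# Made-up records for illustration only (assumption: not real PDB data).
SAMPLE_LINES = [
    "ATOM      1  N   ALA A  10      10.000  12.000   1.500",
    "ATOM      2  CA  ALA A  10      11.104  13.207   2.100",
    "ATOM      9  CA  VAL A  12      20.250  -4.125   7.000",
    "ATOM     17  CA  GLY B  10       0.000   0.000   0.000",
]

motif_points = ca_coordinates(SAMPLE_LINES, "A", [10, 12])
for res_id, xyz in sorted(motif_points.items()):
    print(res_id, xyz)
```

The sketch ignores insertion codes and HETATM records, so it only suits motifs built from standard residues with plain integer residue numbers.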

On the main page, fill out the form to specify a motif. When you click submit, your job is placed in a queue. You will receive an email message when your job starts and another when it finishes; the second message contains a link to a web page with the matching results. That page shows information for the top 20 matches and includes a link to an XML file of all matches. You can visualize the matches in this file with our ViewMatch plugin for Chimera. You can also view them as plain text using the command:

 xmllint -format mymatches.xml | less

The “xmllint” program is part of libxml2. It is already installed on OS X. On Linux it is typically available through your package manager, for example in the libxml2-utils package on Debian and Ubuntu. If you would like to run LabelHash on your own machines, download the command line version of LabelHash.
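If xmllint is not available, the Python standard library can produce a similar indented view. The sketch below is only an alternative to the command above: the helper name `pretty_xml` is hypothetical, and the tiny document used in the example is a placeholder, not the actual structure of the LabelHash match file.

```python
from xml.dom import minidom


def pretty_xml(xml_text, indent="  "):
    """Return an indented copy of an XML document given as a string."""
    return minidom.parseString(xml_text).toprettyxml(indent=indent)


# Placeholder document (assumption: NOT the real LabelHash match schema).
example = '<matches><match id="1"/><match id="2"/></matches>'
print(pretty_xml(example))
```

For a downloaded file, read its contents first (for example with `open("mymatches.xml").read()`) and pass the text to `pretty_xml`. Because minidom loads the whole document into memory, xmllint may be the better choice for very large match files.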

LabelHash mailing list

We have set up a mailing list for LabelHash users. You can send a message to the list by email. You can also subscribe to the list or browse the archived messages sent to the list.

Sample output

Below are some example motifs that were used to test the LabelHash algorithm. For each motif we have pre-computed the best matches in the entire PDB from March 25, 2009. For each motif we list the EC class it represents, the PDB ID (with chain) of the protein from which the motif's 3D coordinates were taken, and the motif residues. The residues are specified by residue ID in the corresponding PDB file. Each residue ID is followed directly by the one-letter abbreviations of the allowed residue types; for example, 194AC means that residue 194 may be matched by A or C. If only one residue type is listed, no substitutions are allowed. The last column has a link to the corresponding match results page.

| EC class | PDB ID | Motif | Matches |
| --- | --- | --- | --- |
| | 1dwwA | 194AC, 346V, 366W, 367FY, 371E, 376DN | results page |
| | 7mhtA | 80P, 81C, 85ST, 119EL, 163R, 165R | results page |
| | 1jg1A | 97DENQ, 99G, 101AGL, 160DNS, 179ILV, 183EGN | results page |
| | 1kpgA | 17D, 72G, 74G, 75W, 76G, 200F | results page |
| | 1ucnA | 12K, 13P, 92G, 105R, 115N, 118G | results page |
| | 1aniA | 51AD, 101DE, 102S, 166CRS, 331GH, 412HNQ | results page |
| | 1czfA | 178N, 180D, 201D, 256HR, 258K, 291Y | results page |
| | 8tlnE | 120LMW, 143AE, 144ILV, 157LSY, 231HL | results page |
| | 1lbfA | 51E, 56S, 57P, 89F, 91G, 112F, 159E, 180N, 211S, 233G | results page |
| | 1aylA | 249L, 250S, 251G, 253G, 254K, 255T | results page |
| | 2ahjA | 53P, 120L, 127Y, 190V, 193D, 196I | results page |
| | 1ep0A | 53AST, 61AR, 64H, 73K, 90R, 172D | results page |
| | 1didA | 25F, 53H, 56D, 93F, 136W, 182K | results page |
| | 1ggmA | 188ET, 311LR, 239ET, 241E, 359ES, 361AS | results page |
| | 1b7yA | 149AGW, 178HQ, 180ST, 206DER, 218Q, 258FNY, 260FY | results page |
| | 1adyA | 81DE, 83T, 112RS, 130DE, 264LY, 311KNQR | results page |
| | 1kp3A | 106R, 139F, 202ES, 286K, 288R, 331Y | results page |
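The motif notation in the list above (a residue ID followed by the allowed one-letter residue types) is easy to read programmatically. The following Python sketch shows one way to do it. The names `parse_motif` and `allows` are hypothetical, and the sketch assumes residue IDs are plain positive integers without insertion codes, as in every row above.

```python
import re

# A residue ID (digits) followed by one or more one-letter residue types.
_RESIDUE = re.compile(r"^(\d+)([A-Z]+)$")


def parse_motif(text):
    """Parse '194AC, 346V' into [(194, ['A', 'C']), (346, ['V'])]."""
    motif = []
    for token in text.split(","):
        token = token.strip()
        match = _RESIDUE.match(token)
        if match is None:
            raise ValueError(f"unrecognized motif residue: {token!r}")
        residue_id = int(match.group(1))
        allowed_types = list(match.group(2))
        motif.append((residue_id, allowed_types))
    return motif


def allows(motif_point, residue_type):
    """True if the given one-letter residue type is allowed at this point."""
    _, allowed_types = motif_point
    return residue_type in allowed_types


motif = parse_motif("194AC, 346V, 366W, 367FY, 371E, 376DN")
print(motif[0])               # (194, ['A', 'C'])
print(allows(motif[0], "C"))  # True: C is an allowed substitution at 194
print(allows(motif[1], "A"))  # False: only V is allowed at 346
```

A point with a single listed type, such as 346V, permits no substitutions, which is why `allows` returns False for any other residue type there.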