Helical Symmetry Detection Problem #306

Closed
lafita opened this issue Jul 29, 2015 · 6 comments
Comments

@lafita
Member

@lafita lafita commented Jul 29, 2015

The quaternary symmetry detection algorithm reports false-positive helical symmetry for some PDB entries. Two structures have been found so far: 3OQ9 and 3MOP.

image

http://www.rcsb.org/pdb/explore/jmol.do?structureId=3MOP&bionumber=1

image

http://www.rcsb.org/pdb/explore/jmol.do?structureId=3OQ9&bionumber=1

To compare with a True Positive Helical Symmetry example (TMV):

image

http://www.rcsb.org/pdb/explore/jmol.do?structureId=4UDV&bionumber=1&opt=3&jmolMode=HTML5

It seems that the helical symmetry is defined locally for some groups of subunits rather than globally, with jumps between these groups. A possible explanation is that the threshold on the allowed RMSD is too high. For the 3OQ9 helical symmetry the RMSD is around 1.1 Å, whereas the RMSD for the TMV helical symmetry is very low (near 0).
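To make the threshold argument concrete, here is a minimal sketch (not the BioJava implementation) that scores how well a single screw operation maps each subunit centroid onto the next one. The function names, the use of centroids only, the z-aligned axis and all numeric values are assumptions made for illustration.

```python
import math

def screw(point, angle_deg, rise):
    """Rotate a point about the z axis by angle_deg and shift it by rise along z."""
    a = math.radians(angle_deg)
    x, y, z = point
    return (x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a),
            z + rise)

def helical_rmsd(centroids, angle_deg, rise):
    """RMSD between each screw-transformed centroid and the next centroid."""
    if len(centroids) < 2:
        raise ValueError("need at least two centroids")
    total = 0.0
    for current, following in zip(centroids, centroids[1:]):
        moved = screw(current, angle_deg, rise)
        total += sum((m - f) ** 2 for m, f in zip(moved, following))
    return math.sqrt(total / (len(centroids) - 1))

# Hypothetical ideal helix: 5 centroids, 100 degrees and 5 Å rise per subunit.
ideal = [(10 * math.cos(math.radians(100 * i)),
          10 * math.sin(math.radians(100 * i)),
          5.0 * i) for i in range(5)]

# Same helix with the middle centroid displaced by 1 Å along the axis.
distorted = list(ideal)
x, y, z = distorted[2]
distorted[2] = (x, y, z + 1.0)

print(helical_rmsd(ideal, 100, 5.0))      # close to 0
print(helical_rmsd(distorted, 100, 5.0))  # about 0.71
```

Any cutoff between these two values accepts the first set and rejects the second, which is why the choice of threshold decides whether structures such as 3OQ9 (RMSD around 1.1 Å) are reported as helical.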

The structures 3OQ9 and 3MOP are annotated by the authors as having helical symmetry, but it seems more reasonable to treat them as a heterodimer and as asymmetric, respectively. Were they used to determine the parameters of the helical symmetry detection algorithm?

@lafita
Member Author

@lafita lafita commented Jul 29, 2015

Another example: 4WYV.

screenshot

http://www.rcsb.org/pdb/explore/jmol.do?structureId=4WYV&bionumber=1

In this case, the point group would be D4, but the structure has an open side that breaks the symmetry. That is why the Helical detector is called instead of the Rotational detector.

Two requirements could be added to make the helical symmetry assignment more specific:

  1. In a multiple-start helical symmetry, the nth helix must start before the previous one ends.
  2. The helix rise has to allow construction of an infinite helix: if it is less than the structure height, the first subunit of the next turn will clash with the first subunit of the current turn.
@josemduarte
Contributor

@josemduarte josemduarte commented Jul 29, 2015

A note about the last case (4wyv): there is another structure of the same protein (1j1j) which does have a proper D4 without the open side. Apparently in 4wyv the difference is that they purify the protein in complex with RNA, which somehow produces an opening of the D4 (see paper). The RNA however is not seen in the density at all.

andreasprlic added a commit that referenced this issue Jul 31, 2015
…f" and switches to mmCif parsing accordingly.
@sbliven
Member

@sbliven sbliven commented Aug 10, 2015

With loose thresholds, these are actually OK helical examples. First, I'll show why these are decent examples, then I'll describe bugs and shortcomings as I see them in the existing implementation and visualization.

3MOP is a tricky case because the subunits have three conformations. If you believe this is helical, this can be argued as coming about from the dramatically different crystal contacts at the "head" and "tail" of the short helix fragment in the asymmetric unit. Below, the "tail" (chains A-F) is green, "body" (G-J) is purple, and the "head" (K-N) is pink.

3MOP chains - free superposition

Superimposing chains A and B from the tail, we see that all subunits align fairly well to the next one up, so it does have a pseudo-helical structure with ~100 degrees between subunits.

3MOP helices - superposition of A and B

3OQ9 contains 10 subunits, with ~140 degree rotations between sequential subunits. The chain order is CADBEJHKIL. Here are two asymmetric units stacked, which makes the axis clearer:

screen shot 2015-08-10 at 15 16 02

Problems

  1. Visualization. Using straight lines between segments is difficult to follow, especially for rotation angles near 180°. This could be solved by fitting a true helix to the centroids. Potential links: Eberly (http://www.geometrictools.com/Documentation/HelixFitting.pdf), Enkhbayar 2008 (http://www.ncbi.nlm.nih.gov/pubmed/18467178)
  2. Entity clustering. It's not entirely clear which entities are being considered equivalent here. Both examples have significant structural similarity and so could be considered equivalent, but I'm not sure if the symmetry code is doing so. The PDB website gives uneven stoichiometries for both examples (A6B4C4 for 3MOP, corresponding to head/body/tail; A5B5 for 3OQ9, corresponding to head/tail). I would have thought that global helical symmetry would require that all subunits be included, but the breaks in the images suggest that maybe helices are being fit separately to each entity cluster. That would be a bug.
  3. Multiple starts. This is a tough problem. For a set of points distributed evenly along a helix, you can fit an arbitrary number of other helices perfectly to the points, in the same way that you can index a grid of points using an infinite number of basis vectors. Thus, a double helix with equal major and minor grooves is equivalent to a single helix with a different pitch. However, it is not always possible to fit a single helix to points taken from a true double helix, so we do need to be able to detect this case. The problem is that we want to be quite tolerant of errors in the position of the centroid, so there will be a trade-off between overfitting many helices to the points and fitting a single helix with a lot of error. We may also want to take into account the interfaces between sequential subunits. For instance, it may be preferable to fit 3OQ9 using 5 helices (red) or 3 helices (yellow), since sequential subunits along each helix have a big interface, while sequential subunits along the single helix (green) don't touch at all:

3OQ9 potential helices
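The multiple-start ambiguity can be checked numerically. The sketch below is illustrative only: it assumes an ideal helix around the z axis, uses the ~140° rotation quoted for 3OQ9, and picks an arbitrary rise of 1.0 per subunit; the function names are hypothetical. Connecting every k-th subunit gives k strands, each described by its own screw operation.

```python
import math

def helix_points(n, angle_deg, rise, radius=1.0):
    """Centroids of n subunits placed evenly along a helix around the z axis."""
    return [(radius * math.cos(math.radians(angle_deg * i)),
             radius * math.sin(math.radians(angle_deg * i)),
             rise * i) for i in range(n)]

def strand_parameters(angle_deg, rise, starts):
    """Rotation (wrapped to (-180, 180]) and rise between consecutive subunits
    of one strand when every `starts`-th subunit is connected."""
    angle = (angle_deg * starts) % 360.0
    if angle > 180.0:
        angle -= 360.0
    return angle, rise * starts

def apply_screw(point, angle_deg, rise):
    """Rotate a point about the z axis and translate it along z."""
    a = math.radians(angle_deg)
    x, y, z = point
    return (x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a),
            z + rise)

def strands_fit(points, angle_deg, rise, starts, tol=1e-9):
    """True if the strand screw operation maps point i exactly onto point i + starts."""
    s_angle, s_rise = strand_parameters(angle_deg, rise, starts)
    for i in range(len(points) - starts):
        moved = apply_screw(points[i], s_angle, s_rise)
        if any(abs(m - p) > tol for m, p in zip(moved, points[i + starts])):
            return False
    return True

points = helix_points(10, 140.0, 1.0)
for k in (1, 3, 5):
    print(k, strand_parameters(140.0, 1.0, k), strands_fit(points, 140.0, 1.0, k))
# 1 (140.0, 1.0) True
# 3 (60.0, 3.0) True
# 5 (-20.0, 5.0) True
```

All three descriptions fit the same ten points perfectly, so the geometry alone cannot choose between a 1-start, 3-start or 5-start helix; an additional criterion such as the interface between sequential subunits is needed.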

@pwrose
Member

@pwrose pwrose commented Aug 10, 2015

Great analysis Aleix and Spencer.

Regarding:

  1. Drawing a helix rather than the straight line segments would really help
    with visualization. You just need to keep in mind the primitives you can
    draw in Jmol. You either could use a polyline with short segments or try to
    make the curve out of arcs.
  2. This looks like a bug. The stoichiometry should be checked. To see which
    subunits are considered equivalent, color by "sequence" on the Jmol page.
    For global symmetry, it uses a 95% seq. identity cutoff to distinguish
    subunits. For lower seq. identity cutoffs (to determine pseudosymmetry), it
    does a structural alignment between non-identical sequences.
  3. For multi-start helices, I've implemented a number of algorithms to find
    the "best" layerlines (see methods in HelixLayers). The one being used is
    based on the maximum number of contacts between sequential subunits.
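As a rough illustration of the polyline option in point 1, the sketch below samples short segments along an ideal helix so that the drawn curve passes through every subunit centroid. It assumes the helix axis is the z axis and that radius, rotation and rise are already known; the function name and parameter values are hypothetical, and no Jmol call is shown.

```python
import math

def helix_polyline(radius, angle_deg, rise, n_subunits, segments_per_step=8):
    """Sample points on a helix around the z axis that passes through the
    centroids of n_subunits evenly spaced subunits."""
    if n_subunits < 2 or segments_per_step < 1:
        raise ValueError("need at least two subunits and one segment per step")
    total = (n_subunits - 1) * segments_per_step
    points = []
    for s in range(total + 1):
        t = s / segments_per_step  # fractional subunit index
        a = math.radians(angle_deg * t)
        points.append((radius * math.cos(a), radius * math.sin(a), rise * t))
    return points

# Hypothetical values: 10 subunits, 140 degrees per subunit, radius 20, rise 5.
curve = helix_polyline(20.0, 140.0, 5.0, 10)
print(len(curve))  # 73
```

Every sampled point stays at the helix radius from the axis. The midpoint of a straight segment between two centroids 140° apart lies at only cos(70°), about 0.34, of that radius, which is why straight connections cut close to the axis and are hard to follow.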


@lafita
Member Author

@lafita lafita commented Aug 11, 2015

I agree with @sbliven: after looking at the examples in detail, they do have helical symmetry. The stoichiometry and the line segments of the axes were confusing at first.

I still do not understand why the line segments are not all connected (there are discontinuities). For example, in 3MOP subunits 6 and 7 (counting along the helix from the bottom) are not connected, nor are 10 and 11. This gives the impression that the helix is broken, or that the symmetry is not conserved between these subunits, which is not true. The same happens in 3OQ9, where there are 3 discontinuities. Fixing that may be easier than drawing a helix instead of straight line segments, and it would improve the visualization.

About the last example (4WYV): since it is a broken D4 symmetry, it has a translation component in addition to the rotation, which produces a helix rise, but the rise is not sufficient to continue the helix beyond 4 subunits because of clashes. I think we could check that the helix rise is sufficient (larger than the subunit diameter in the direction of the helix axis) to discard these cases.
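The proposed check could look roughly like the sketch below. It rests on one interpretation that the thread does not spell out: "rise" is taken as the rise accumulated over one full turn (the pitch), because that is the distance separating a subunit from the one stacked above it. The function names and the numeric values are hypothetical.

```python
import math

def axial_extent(coords, axis):
    """Extent of a subunit along the helix axis (axis need not be normalised)."""
    norm = math.sqrt(sum(c * c for c in axis))
    unit = [c / norm for c in axis]
    projections = [sum(p[i] * unit[i] for i in range(3)) for p in coords]
    return max(projections) - min(projections)

def pitch(rise_per_subunit, angle_deg):
    """Rise accumulated over one full turn; infinite for a pure translation."""
    if angle_deg == 0:
        return math.inf
    return abs(rise_per_subunit) * 360.0 / abs(angle_deg)

def can_extend(coords, axis, rise_per_subunit, angle_deg):
    """True if the next turn clears the current one along the axis."""
    return pitch(rise_per_subunit, angle_deg) >= axial_extent(coords, axis)

# Hypothetical subunit spanning 30 Å along the z axis.
subunit = [(0.0, 0.0, 0.0), (5.0, 0.0, 30.0), (1.0, 2.0, 12.0)]
axis = (0.0, 0.0, 2.0)

print(can_extend(subunit, axis, 2.0, 90.0))   # False: pitch of 8 Å, clash after one turn
print(can_extend(subunit, axis, 10.0, 90.0))  # True: pitch of 40 Å
```

A broken D4 arrangement like the one described for 4WYV would fall in the first case: four subunits per turn with a small rise, so a fifth subunit would collide with the first.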

@lafita
Member Author

@lafita lafita commented Aug 11, 2015

If I run CeSymm on the structure 3MOP, I get the correct connections between the axis line segments, corresponding to Spencer's red lines (4-start helix). The only thing that differs between running CeSymm and the QuatSymmetryDetector is the alignment part, because the drawing part is the same. This might give a clue to what is going on:

3mop_axes

3mop_axes_only

With this visualization the helical symmetry is clearer.

@lafita lafita added the enhancement label Feb 25, 2016
@lafita lafita added this to the BioJava 5.0.0 milestone Feb 25, 2016
@lafita lafita self-assigned this Feb 25, 2016
@andreasprlic andreasprlic modified the milestones: BioJava 5.0.1, BioJava 5.0.0 Sep 14, 2016
@lafita lafita added wontfix minor and removed enhancement labels Aug 8, 2017
@lafita lafita closed this Aug 8, 2017