- Jul 2018
-
europepmc.org
-
On 2013 Jul 11, Joshua L Cherry commented:
This article presents an analysis of glycosylation patterns and sequence variations among human H1N1 influenza virus hemagglutinins. The authors conclude that glycosylation decreases sequence variation at nearby positions in the linear sequence by modifying selective pressures, perhaps by shielding these residues from antibodies. Careful examination of the results, especially with a phylogenetic tree in hand, shows that this conclusion is unfounded. The "extra" sequence variation observed among those viruses lacking glycosylation in a region did not arise among viruses lacking glycosylation in that region. Rather, this variation is the result of the mutations that eliminated the glycosylation site.
A clear example is the peak of variation at position 129 in Fig. 3 for the {91, 162} class. Sequences glycosylated at position 129 necessarily have an asparagine at this position, and hence no variation. For this reason alone, the high variability compared to sequences with glycosylation in region 129 is a trivial result. Furthermore, mapping of the sequence changes onto a phylogenetic tree shows that the sequences in class {91, 162} (which contains 2% of the sequences) are derived from class {91, 129, 162} (the largest class, with 68% of the sequences) by loss of a glycosylation site. All of the variation at position 129 appears to be the result of mutation of the asparagine in a 129-glycosylated virus that eliminates the glycosylation site, moving the sequence into class {91, 162}.
The nearby peak at 131 has a similar explanation. This site must be S or T for glycosylation at 129, so its mutation leads to loss of that glycosylation site. Also, some of the glycosylation sites in the 129 region are at position 131, and mutation of that asparagine is another way for a sequence to move into the {91, 162} category. All of the variation at position 131 appears to have arisen in one or the other of these ways.
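The logic of the two preceding paragraphs can be illustrated with a small sketch. It assumes the standard N-linked glycosylation sequon (asparagine, then any residue except proline, then serine or threonine); the three-residue fragments are hypothetical toy data, not sequences from the article, with index 0 standing in for position 129 and index 2 for position 131.

```python
import re

# Assumed sequon for N-linked glycosylation: N, any residue except P, then S or T.
SEQUON = re.compile(r"N[^P][ST]")

def has_sequon_at(seq: str, i: int) -> bool:
    """Return True if a sequon starts at 0-based index i of seq."""
    return SEQUON.fullmatch(seq[i:i + 3]) is not None

def distinct_residues(seqs, i):
    """Set of residues observed at 0-based index i across seqs."""
    return {s[i] for s in seqs}

# Hypothetical toy fragments (illustrative only).
seqs = ["NHT", "NVT", "NHS", "DHT", "KHT", "NHK", "NHI"]

# Classify by presence of the glycosylation site, as the article's classes do.
glyco = [s for s in seqs if has_sequon_at(s, 0)]
non_glyco = [s for s in seqs if not has_sequon_at(s, 0)]

# Within the glycosylated class, index 0 can only be N and index 2 only S or T.
print(distinct_residues(glyco, 0), distinct_residues(glyco, 2))
# Every member of the other class differs at index 0 or 2 by construction.
print(distinct_residues(non_glyco, 0), distinct_residues(non_glyco, 2))
```

In this toy set, the unglycosylated class is defined by changes at exactly the positions where its "extra" variation appears, so comparing per-position variability between the two classes says nothing about selection acting after the site was lost.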
The peaks at 164 and 166 for the {91, 129} class have an analogous explanation. The remaining claimed examples of the phenomenon are attributable to the same effect combined with problems with residue numbering due to the treatment of partial sequences, alignment gaps, and ambiguity characters.
Thus, the observed peaks represent losses of glycosylation rather than changes to unglycosylated sequences. These mutations may have been selected by growth in the laboratory; loss of glycosylation sites is a common laboratory adaptation. Even if the variation is natural, it is not evidence of the claimed phenomenon.
Some comments are also in order about the comparisons in Fig. 4. First, the positions of the most prominent variation outside of glycosylation regions, around position 190, are known sites of laboratory adaptation. Thus, the peaks are likely artifactual. Second, some of the comparisons involve categories that have different representations of regions of the sequence, and hence are not comparable. Third, the statistical tests appear to take no account of the lack of phylogenetic independence.
It is reasonable to think that glycosylation affects selection pressures at nearby residues, and to search for evidence of this in the protein sequences. Such a search should probably begin with mapping the mutations to a phylogenetic tree.
This comment, imported by Hypothesis from PubMed Commons, is licensed under CC BY.
-