Archive for April, 2008

A new hydrogen bond potential

Saturday, April 26th, 2008

A hydrogen bond forms when a hydrogen atom is shared between a donor atom and an acceptor atom. In proteins, nitrogen and oxygen are the common donors and acceptors. A hydrogen bond (hbond) is relatively weak and short-lived compared to covalent bonds. For example, in water, breaking an hbond takes about 4.5 kcal/mol, while breaking an H–O bond takes about 110 kcal/mol. Also, the half-life of an hbond is only about 10^-9 s. However, in biology, as it so often turns out, the weaker one is the winner. Hbonds are very important for stabilizing the overall structure, as well as for function. Despite decades of studying hydrogen bonds at the level of fundamental physics, and several versions of formulas and parameters for calculating their free/interaction energy, it is still inconclusive whether any one of them clearly outperforms the others.

One new competitor entered the field of hbond potentials recently. It comes from David Baker’s group. Instead of focusing on the basics at the physics and chemistry level, they chose geometric measurements and built the energy scoring function purely on statistics. Four geometric parameters were introduced: the hbond distance, the angle at the H atom, the angle at the acceptor, and the dihedral angle describing the rotation about the bond between the acceptor and acceptor base atoms. The rest is the same as in any other statistical potential. The more often you see a geometry, the more stable it is, and therefore the more negative its energy. In other words, -log(frequency(parameter)/frequency(background)) is the energy to be fitted. In this paper, they set frequency(background) to a unit value. I’m not convinced that is right, but there may be other concerns with more complicated and relevant backgrounds. The hbond energy is then the sum of all the parameter terms. I should point out that in this potential, all geometry terms contribute equally to the final hbond potential.
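To make the statistical-potential idea concrete, here is a minimal sketch, not the code from the paper. It assumes each geometric parameter has already been binned into a histogram of observed counts. The function names, the pseudocount, and the uniform background are illustrative choices of mine; a uniform background differs from a unit background only by a constant offset per term.

```python
import math

def statistical_energy(counts, background=None, pseudocount=1.0):
    """Per-bin energies E_i = -ln(f_obs_i / f_bg_i) for one geometric parameter.

    counts      : observed counts per bin (hypothetical input format)
    background  : reference frequency per bin; uniform if not given (assumption)
    pseudocount : added to every bin so empty bins do not give log(0)
    """
    n_bins = len(counts)
    total = sum(counts) + pseudocount * n_bins
    if background is None:
        background = [1.0 / n_bins] * n_bins
    energies = []
    for count, f_bg in zip(counts, background):
        f_obs = (count + pseudocount) / total
        # Frequently observed bins get negative (favorable) energies.
        energies.append(-math.log(f_obs / f_bg))
    return energies

def hbond_energy(bin_indices, tables):
    """Sum the per-parameter terms with equal weights, as the post describes."""
    return sum(tables[name][index] for name, index in bin_indices.items())

# Toy histograms for two of the four parameters (made-up numbers).
tables = {
    "distance": statistical_energy([5, 120, 40, 3]),
    "angle_at_H": statistical_energy([2, 10, 60, 150]),
}
score = hbond_energy({"distance": 1, "angle_at_H": 3}, tables)
print(score)
```

Giving every term the same weight mirrors the equal contribution noted above. A weighted sum would be the obvious variation if some geometric terms turned out to be more informative than others.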

Energy distribution and fold stability

Monday, April 21st, 2008

There is a paper by Giulia Morra and Giorgio Colombo on energy distribution and the stability of protein folds. The paper was motivated by the observation that sequence mutations are not equally important for the stability of a protein fold. Stability tolerates the majority of mutations, which are called neutral mutations. Some mutations, however subtle, severely change the overall fold; those positions are called hot spots. If one could exploit the difference between neutral mutations and hot spots, it would help in understanding the mechanisms that stabilize protein folds and how folds evolve. Obviously, that is too much to ask from a paper published in Proteins, no offense to Proteins though. The authors targeted two smaller goals: to find a way to measure the fitness of a sequence to a given structure, and to find the key residues that are important for stabilizing the protein fold. How?

They proposed the principal eigenvector of the interaction matrix to represent the contributions of individual residues. The interaction matrix was obtained by averaging all pairwise interactions at the residue level over all the conformations generated by MD. The interaction consists of van der Waals and electrostatic energies; in other words, only non-bonded energies are considered here. The matrix is then diagonalized and re-expressed in terms of eigenvalues and eigenvectors. The most negative eigenvalue and its associated eigenvector are singled out and chosen to represent the energy distribution of the fold. The eigenvector is called the sequence eigenvector because it captures the sequence profile. This choice is justified by looking at the percentage of the energy recovered using only the sequence eigenvector, which is about 90% for the majority of the 5 representative proteins. They also tested the correlation between the calculated non-bonded energy and the experimentally measured free energy, and found that the correlation is strong, though negative. I couldn’t understand why it is anti-correlated; I would have expected a positive correlation.
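A small sketch of the eigenvector step, under my own assumptions. The paper diagonalizes the full matrix; to stay dependency-free this sketch instead finds only the most negative eigenvalue by power iteration on a shifted matrix. The function names and the 3x3 toy matrix are made up. The "fraction captured" uses the identity sum_ij M_ij = sum_k lambda_k (sum_i v_i^k)^2 and keeps only the first term, which is my reading of how the percentage might be computed.

```python
import math

def lowest_eigenpair(matrix, max_iter=10000, tol=1e-12):
    """Most negative eigenvalue and eigenvector of a symmetric matrix.

    Power iteration on (shift*I - M): with shift >= every |eigenvalue| of M
    (Gershgorin bound), the dominant eigenvector of the shifted matrix is the
    eigenvector of M with the lowest eigenvalue.
    """
    n = len(matrix)
    shift = max(sum(abs(x) for x in row) for row in matrix)
    # Slightly non-uniform start; it must not be orthogonal to the target vector.
    v = [1.0 + i / n for i in range(n)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(max_iter):
        w = [shift * v[i] - sum(matrix[i][j] * v[j] for j in range(n))
             for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        w = [x / norm for x in w]
        converged = max(abs(a - b) for a, b in zip(w, v)) < tol
        v = w
        if converged:
            break
    # Rayleigh quotient gives the eigenvalue of the original matrix.
    eigenvalue = sum(v[i] * matrix[i][j] * v[j]
                     for i in range(n) for j in range(n))
    return eigenvalue, v

def captured_fraction(matrix, eigenvalue, vector):
    """Share of the total pair energy reproduced by one eigenpair (assumes total != 0)."""
    total = sum(sum(row) for row in matrix)
    return eigenvalue * sum(vector) ** 2 / total

# Toy MD-averaged residue-residue interaction matrix (invented values).
M = [[-2.0, -1.0, 0.0],
     [-1.0, -2.0, 0.0],
     [0.0, 0.0, -0.5]]
lam, vec = lowest_eigenpair(M)
print(lam, vec, captured_fraction(M, lam, vec))
```

For real residue-level matrices, a library eigensolver for symmetric matrices would be the natural choice; the power iteration here is only to keep the example self-contained.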

The hot spots are defined as those residues whose components in the sequence eigenvector are greater than 1/sqrt(N), where N is the total number of residues in the protein. They found that mutations change the hot spots very little.
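The threshold rule is easy to sketch. Note that 1/sqrt(N) is the value every component would have if a normalized eigenvector were spread evenly over all N residues, so hot spots are the residues that carry more than an even share. The function name is mine, and taking the absolute value is an assumption I add because the overall sign of an eigenvector is arbitrary.

```python
import math

def hot_spots(sequence_eigenvector):
    """Return 0-based residue indices whose component exceeds 1/sqrt(N)."""
    n = len(sequence_eigenvector)
    threshold = 1.0 / math.sqrt(n)
    return [i for i, component in enumerate(sequence_eigenvector)
            if abs(component) > threshold]

# Toy normalized eigenvector for a 4-residue "protein" (invented values).
example = [0.8, 0.1, 0.59, 0.05]
print(hot_spots(example))
```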

I have some reservations about a few points in the paper, as Ron and other group members pointed out. First, the contribution of the solvation effect: it is a large part of the “forces” driving folding. They calculated the interaction energy using the sequence eigenvector and compared the results against the free energy of folding, which is, in a sense, comparing apples to oranges. The electrostatic calculation is also unclear because, as Ron said, it is not appropriate to attribute the electrostatic energy to individual residues after using Ewald. They didn’t compare the energy calculated from the sequence eigenvector and the total energy calculated without approximation against the measured free energy. Also, why not explicitly list all the hot spots?

Anyway, interesting paper overall.

Call R in Perl

Monday, April 7th, 2008

This runs a whole R script from Perl, not a specific R function.
system("R --vanilla --no-save < yourRscript.R > output.txt") == 0
    or die "R script failed: $?";

Google and genome

Friday, April 4th, 2008

You see, Google was everywhere except the genome. Not anymore, or sort of. The cloud computing giant has already set foot in the genome field: G-language was introduced. What a fast-moving world.

Camera simulator

Thursday, April 3rd, 2008

It was a surprise to find a website where you can learn how to use a camera so effectively. LINK.