## Archive for the ‘Reading’ Category

### Reading “A CV of failures”

Sunday, November 21st, 2010

This essay is quite interesting. Here is an excerpt:

So here is my suggestion. Compile an ‘alternative’ CV of failures. Log every unsuccessful application, refused grant proposal and rejected paper. Don’t dwell on it for hours, just keep a running, up-to-date tally. If you dare — and can afford to — make it public. It will be six times as long as your normal CV. It will probably be utterly depressing at first sight. But it will remind you of the missing truths, some of the essential parts of what it means to be a scientist — and it might inspire a colleague to shake off a rejection and start again.

In a sense, I would interpret the author’s suggestion slightly differently: let’s publish our negative results, both negative application results and negative research results, so that other people might benefit psychologically, scientifically, or both. But I have to say that this is a rather personal thing: seeing so many failures, would I be excited about my future or rather depressed?

### Noise reduction on funnel-shaped energy landscapes

Monday, September 27th, 2010

This post was prompted by an excellent paper by Andrew Stumpff-Kane and Michael Feig, published in Proteins: Structure, Function, and Bioinformatics 63:155-164 (2006). In the protein structure prediction field, almost everyone has his or her own scoring function to rank models for a given target sequence; presumably, the model closest to the native structure should receive the best (highest or lowest) score. Closeness to the native structure is usually measured by RMSD (or by GDT_TS, as in CASP, or by the TM-score introduced in Zhang and Skolnick’s paper), so if the scoring function were perfect, RMSD and score would be strongly correlated across all models. More often than not, however, the distribution of RMSD versus score for CASP models is very messy; in other words, there is little or no such correlation.

The authors therefore propose a statistical solution: a correlation-based scoring function that reduces the noise in the original scoring function. The observed score is written as W + Z, where Z is the noise in the original scoring function, and Z is assumed to be uncorrelated with the distance between a model and the native structure. The correlation coefficient is

$\rho_{r}(d(PP_{r}), W+Z)=\frac{\mathrm{Cov}(d(PP_{r}), W+Z)}{\sqrt{\mathrm{Var}(d(PP_{r}))\,(\mathrm{Var}(W)+\mathrm{Var}(Z))}}$

They found that the correlation between $\rho_{r}$ and $d(PP_{0})$ no longer depends on Z, that is

$\rho(d(PP_{0}), \rho_{r}(d(PP_{r}), W+Z))=\frac{\mathrm{Cov}(d(PP_{0}), \rho_{r}(d(PP_{r}), W+Z))}{\sqrt{\mathrm{Var}(d(PP_{0}))\,\mathrm{Var}(\rho_{r}(d(PP_{r}), W+Z))}}$

So the proposed score of each model is calculated as:

$r_{i}=\frac{N\sum_{j \neq i}^{N}s_{j}d_{ij}-\sum_{j\neq i}^{N}s_{j}\sum_{j\neq i}^{N}d_{ij}}{\sqrt{N^{2}\,\mathrm{Var}(s)\,\mathrm{Var}(d)}}$ where

$d_{ij}$ is the distance between model i and model j, $s_{i}$ is the original score of model i, N is the total number of models.
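As a rough illustration of the idea behind $r_{i}$, here is a minimal Python sketch. It is not the authors' code: the function names `pearson` and `correlation_scores` are made up for this post, it uses the plain Pearson correlation over all $j \neq i$ (so its normalization may differ from the formula above by a scale factor), and it assumes an energy-like original score where lower is better.

```python
import math


def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined here; treat as "no signal"
    return cov / math.sqrt(var_x * var_y)


def correlation_scores(scores, dist):
    """For each model i, correlate the original scores s_j of all other
    models with their distances d_ij to model i."""
    n = len(scores)
    result = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        result.append(
            pearson([scores[j] for j in others], [dist[i][j] for j in others])
        )
    return result


# Toy one-dimensional "funnel": models sit at positions 0..4, the native
# structure is at 0, and the energy-like score grows with the distance from it.
positions = [0.0, 1.0, 2.0, 3.0, 4.0]
scores = list(positions)
dist = [[abs(p - q) for q in positions] for p in positions]
r = correlation_scores(scores, dist)
best = max(range(len(r)), key=lambda i: r[i])
```

In this toy case the model sitting at the bottom of the funnel gets the largest correlation score, because the scores of all other models rise with their distance from it.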

It works well on the 5 data sets they chose. One reason it works is the assumption that all the models lie near or at the native structure and that their distribution is funnel-like, that is, there is a global minimum. The correlation score therefore assigns the largest value to the model closest to the global minimum. In practice, they found that it is better to combine the original score with this correlation-based score: use the correlation-based score to select a limited number of models (for example 10), and then use the original scoring function to rank the preselected models. This hybrid turns out to be better than either score alone.
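The hybrid selection described above can be sketched in a few lines of Python. This is only an illustration under assumptions of mine: the name `hybrid_rank` is hypothetical, a higher correlation-based score is taken to be better, and the original score is taken to be energy-like (lower is better).

```python
def hybrid_rank(orig_scores, corr_scores, k=10):
    """Preselect the k models with the highest correlation-based score,
    then rank only those by the original score (lower is better here)."""
    n = len(orig_scores)
    preselected = sorted(range(n), key=lambda i: corr_scores[i], reverse=True)[:k]
    return sorted(preselected, key=lambda i: orig_scores[i])


# Made-up numbers: model 3 has the best original score but a poor
# correlation-based score, so it never reaches the final ranking.
orig = [5.0, 1.0, 3.0, 0.5, 4.0]
corr = [0.9, 0.8, 0.7, -0.5, 0.1]
ranking = hybrid_rank(orig, corr, k=3)
```

The preselection step filters out models that score well only because of noise, and the original scoring function then decides among the survivors.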

Again, the assumption is that all models form a funnel-like distribution on the energy landscape.

### The book about Warren Buffett

Thursday, September 9th, 2010

I heard a while ago about Alice Schroeder’s biography of Warren Buffett and finally got a chance to read (or rather listen to) it. This is the first book I have read about the much-admired Buffett. To me the whole book is little more than gossip on top of what I already knew from watching the Charlie Rose show. I mean, after reading this book, Buffett is still a rather flat figure to me. The book’s title mentions his business life, but the author didn’t dig deep enough to unveil his business activities from a business perspective. The Salomon episode is good but still lacks something; well, I guess I’ll never be satisfied unless I read some documentation about that collapse, which I could put on my wish list.

Anyway it is worth reading if you are interested in the greatest investor’s personal life.

### Reading: evolution of protein modularity

Monday, March 15th, 2010

This is a review paper by Edward Trifonov and Zakharia Frenkel. The review starts from two basic points: evolution is a fairly gradual process, and (biological) systems start as a simpler version and then evolve into a more complex one. Based on knowledge and experience, Brenner proposed essentially three steps of evolution:

1. Peptides (~10 aa)
2. Domains (~100 aa)
3. Multidomain proteins

The authors themselves suggested a slightly modified version:

1. Peptides (~7 aa)
2. (a) Closed loops (25~30 aa); (b) folds (100~150 aa)
3. Multifold proteins