Stochastic grammars and grammatical evolution

I’ve been wondering how to use grammatical evolution to generate signaling networks. So first we have to think up some sort of grammar for signaling networks. What would be appropriate start symbols? Productions? Terminals?

Start: Gene

Transcription: Gene > Gene + RNA (constitutive expression) | Gene*TF | Gene*Inhibitor

Transcription: Gene*TF > Gene + RNA | Gene*TF[*Cofactor]^n | Gene*TF*Inhibitor

Transcription: Gene*TF*Cofactor > Gene + RNA

Signaling: Receptor > Receptor*SIgnal | Receptor*Blocker

Degradation: Any > Nothing

and so on

People have done this sort of thing before, obviously, but I’m wondering about how applying genetic mutation operators to a string of such productions will lead to the same sort of changes to gene networks that are actually observed. Not obvious to me …

What happens if you use a stochastic grammar? What’s the difference between a stochastic grammar applied many times to a fixed genome vs a deterministic grammar applied to a population of genomes? In biology, the binding of TFs may actually be stochastic, so perhaps we should encode the probability of a symbol in the genome going to a particular production in the genome itself.

Robustness vs adaptability

So much of science depends on how well you can communicate. I’m now on the least favorite part of any research project from my perspective: The communication part.

Here’s the question that motivated this particular bit: Natural selection acts on the phenotype, i.e. on the adaptability and competence of a particular individual. The search for improvement works by mutating the genotype, and unfortunately(?), even if there is a beneficial mutation in the germline, the individual whose progeny will carry that mutation does not benefit at all in its lifetime. Now: If mutations alter the phenotype to be inherited, then the genotype that leads to the selected parent phenotype will not really play much of a role since mutations will provide the progeny with a phenotype that may not be as good. So there has to be some sort of interplay between mutation resistance and adaptability, but in principle a dynamical system that is adaptable may not be stable to parameter perturbations (due to mutations) and vice versa.

So I suggested a little thought experiment to my postdoc (Zeina Shreif): Imagine two parts of a network, one which receives some environmental input and another that gets some signal from this part and produces some output. If the system is adaptable, we expect that it should be able to maintain the same output for a change in input. However, how exactly can the output part of the network distinguish between a change in the input or a change in some parameter in the input part of the network? So I conjectured that in fact these two properties must be correlated, that statistically homeostatic networks will be robust with respect to parameter fluctuations.

It turns out that this is true. Heroic work by Zeina has demonstrated this all the way up to 50 node random networks! So this is an exciting (to me) result, and so it is worth it to try to write this in a way that people will understand it. The problem is that there are so many exciting ramifications that I am very reluctant to do the careful writing. Must be done 😦

Genotype to phenotype – grammatical evolution?

We’re working on a new modeling framework where we can take evolution into account in developing the models.

  • We want to make models that are `robust’ in several senses (parameter insensitivity, data uncertainties and homeostatic adaptability are some of the reasons).
  • We want to be able to take data from different organisms and use all the data to constrain models, but the data come from distinct models with only evolution connecting them.
  • We want to restrict the model search space by considering only models that could have come from a genotype to phenotype mapping.

There’s loads of work that people have done on such maps, and today I’ve been learning about grammatical evolution, which is a new approach to genetic programming. The idea is that there is a fixed grammar and the genome encodes the production of the start symbol that leads to the actual code, which ends up being compilable if this is done right. Standard genetic programming works directly on the parse trees and, in some variants, doesn’t always lead to working end programs.

My postdoc, Junghyo Jo, and I have been thinking of a genotype – phenotype mapping as well, but wanting to encode a whole dynamical system in the  genotype, parameters and all. That we can set up in a way that is pretty close to `nature’ but I’m still trying to get my head around why grammatical evolution is the correct genotype-phenotype map. Obviously, the GE algorithm generates correct code if the grammar is consistent, but is my genome sequentially encoding the code that is then compiled into the executable that is me? Probably not the best way to phrase my confusion but in all honesty I do not see why GE is biologically inspired. Yes, genes encode for proteins but transcribing a gene into an executable protein as a grammatical production is not quite what happens. The mRNA doesn’t get to the ribosome and start getting translated with amino-acids being added at one point caring about the amino-acids that have previously been added. (There are control mechanisms such as secondary structure of the mRNA etc., but let’s keep it simple.) I think what people have in mind is that the executable is the working folded protein analog rather than a string of residues that needs to be folded etc. In that case it would make some sort of sense as set up – linear structure being mapped to complicated active executable, with the compiler as some sort of ribosome, but I still feel that each succeeding base should not depend on what the preceding base did to the derivation (thus far) of the start symbol.

So what do we expect? I’m thinking this genotype-phenotype mapping is not a one-time thing. There should be many different go-to type entry points in the genotype, and the compiled code should execute something that activates some of these go-to points. Thus, there should be several start symbols, and several go-to points. The compiled code should execute and produce a new set of start symbols that then activate their associated go-to points. That’s a more amusing picture but I’m pretty sure that isn’t enough.

Inference and prediction: Dimon 0 Volcker 1

Jamie Dimon finally gets his. Chase posted a 2 billion $ loss because of a trade that they didn’t do enough risk management on. Too funny.

From DealBook:

>> On Wall Street, few have been more outspoken about the pitfalls of the Volcker Rule than JPMorgan’s chief executive, Jamie Dimon. Mr. Dimon not only attacked the rule, he personally criticized Paul Volcker, the former Federal Reserve chairman and the regulation’s namesake.

“Paul Volcker by his own admission has said he doesn’t understand capital markets,” Mr. Dimon told Fox Business earlier this year. “He has proven that to me.” ….

Even Mr. Dimon had to admit Thursday’s disclosure was a setback for JPMorgan and other banks that want more flexibility when the final version of the Volcker Rule is issued. “It plays into the hands of a bunch of pundits but you have to deal with that and that’s life,” Mr. Dimon said Thursday on a conference call with analysts.  …

“Just because we’re stupid doesn’t mean everybody else was,” he said. “There were huge moves in the marketplace but we made these positions more complex and they were badly monitored.”

“This may not have violated the Volcker Rule, but it violates the Dimon Principle.” <<

Apparently, Mr. Dimon doesn’t understand capital markets either.

On a constructive note, I think some of these problems arise due to an inadequate understanding of how to model probability distributions from a finite amount of data and how to automatically learn changing distributions. These issues are exactly the same as the conflict in evolution: You want stability to propagate your genome, but you need variation to allow for the emergence of new traits that can handle changes in the environment. It isn’t easy to come up with dynamical systems that meet both of these desiderata.

Regeneration vs. reproduction

An intern in my group a couple of summers ago became interested in aging when he got to med school. I usually have a fun time with my interns so sometimes they keep in touch. He called me and wanted to know what was known about aging. Knowing nothing, but always willing to be distracted, I looked into it and discovered all sorts of mathematical models of aging and death. These are usually at a pretty phenomenological level, but it is good to know about Gompertz and the Penna model and so on. In fact, these models don’t take evolution into account. Rephrasing: Why aren’t we all immortal with lots of regenerative capabilities? That’s something that Kurzweil might want to think about, no? So, given that evolution selects on the phenotype and evolves the genotype, I don’t see offhand how the appearance of programmed death in all multicellular species I know about is something that emerges without taking environmental variation and the whole group of individuals into account. In other words, why do we have children instead of continual self-renewal? Why doesn’t our homeostasis include regenerative homeostasis. Avoid all the diapers …
This is something I wonder about a lot, especially after my father’s death a couple of weeks ago, is why is it so hard to accept death for most people? My father died a good death, no prolonged suffering, at a ripe old age, surrounded by his family. So why does my mother keep re-iterating that she did not expect it to be like this? I couldn’t help asking her: How did she think it would happen? I’m more or less convinced that a large part of religion is simply to stop people from going nuts at the prospect of ceasing to exist by offering them an afterlife, with the bonus that if they conform in this life, the afterlife will feature many comforts (conform to being a jihadi and you get umpteen virgins and so on … which brings up: What happens if you’re a woman jihadi? Inquiring minds want to know …) Getting back to this lamentation of death, why is this horror of death a cultural/social survivor? Recall the easy way to induce opposition to healthcare reform: Death panels!! But actually, this isn’t at all unrelated to the concept of regeneration vs. reproduction. We have a certain tradeoff to make: We keep the present genes intact and cared for, or we produce new sets that will carry on while entropy wins over the present set. People should really be asked: This is a finite planet. Do you want aggressive treatment if you fall sick or do you want your descendants to have the equivalent resources? Of course, as a society we’ve decided that we care more about our well being rather than the economic well being of our descendants, so I suppose we’re evolving into the Kurzweil model of keeping our present genes functioning, thank you very much.