Genotype to phenotype – grammatical evolution?

We’re working on a new modeling framework where we can take evolution into account in developing the models.

  • We want to make models that are `robust’ in several senses: insensitive to parameter values, tolerant of uncertainties in the data, and homeostatically adaptable.
  • We want to be able to take data from different organisms and use all the data to constrain models, but the data come from distinct models with only evolution connecting them.
  • We want to restrict the model search space by considering only models that could have come from a genotype to phenotype mapping.

There’s loads of work that people have done on such maps, and today I’ve been learning about grammatical evolution (GE), which is a new approach to genetic programming. The idea is that there is a fixed grammar, and the genome encodes which production rule to apply at each step of the derivation from the start symbol to the actual code, which ends up being compilable if this is done right. Standard genetic programming works directly on the parse trees and, in some variants, doesn’t always lead to working end programs.
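To make the mapping concrete for myself, here is a minimal sketch of the usual GE decoding step, in which each codon, taken modulo the number of available productions, picks the rule for the leftmost non-terminal. The toy grammar, the function name `ge_map` and the `max_wraps` limit are all made up for illustration, and for simplicity a codon is consumed at every non-terminal; none of this is taken from a particular GE library.

```python
# Hypothetical toy grammar: each non-terminal maps to its list of productions.
GRAMMAR = {
    "<expr>": [["<expr>", "<op>", "<expr>"], ["<var>"]],
    "<op>": [["+"], ["*"]],
    "<var>": [["x"], ["1"]],
}


def ge_map(genome, grammar=GRAMMAR, start="<expr>", max_wraps=2):
    """Decode a list of integer codons into a string of terminals.

    Returns None if the derivation is not finished after the genome has
    been read (max_wraps + 1) times in total.
    """
    symbols = [start]   # pending symbols, leftmost first
    output = []         # terminals emitted so far
    used = 0            # number of codons consumed
    limit = len(genome) * (max_wraps + 1)
    while symbols:
        sym = symbols.pop(0)
        if sym not in grammar:
            output.append(sym)          # terminal: emit as-is
            continue
        if used >= limit:
            return None                 # ran out of codons: invalid individual
        choices = grammar[sym]
        codon = genome[used % len(genome)]   # wrap around the genome
        used += 1
        # The codon picks one production for the leftmost non-terminal.
        symbols = list(choices[codon % len(choices)]) + symbols
    return " ".join(output)


print(ge_map([0, 1, 0, 0, 1, 1]))  # x + 1
```

Note that the codon value 0 means "recurse" when the pending symbol is `<expr>`, "+" when it is `<op>`, and "x" when it is `<var>`: what a codon does depends on the derivation so far.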

My postdoc, Junghyo Jo, and I have been thinking about a genotype-phenotype mapping as well, but we want to encode a whole dynamical system in the genotype, parameters and all. We can set that up in a way that is pretty close to `nature’, but I’m still trying to get my head around why grammatical evolution is the correct genotype-phenotype map. Obviously, the GE algorithm generates correct code if the grammar is consistent, but is my genome sequentially encoding the code that is then compiled into the executable that is me? That’s probably not the best way to phrase my confusion, but in all honesty I do not see why GE is biologically inspired. Yes, genes code for proteins, but turning a gene into an executable protein is not quite a grammatical production. When the mRNA gets to the ribosome and is translated, the amino acid added at each step does not depend on the amino acids that have already been added. (There are control mechanisms such as secondary structure of the mRNA etc., but let’s keep it simple.) I think what people have in mind is that the executable is the analog of the working folded protein rather than of a string of residues that still needs to be folded. In that case the setup would make some sort of sense: a linear structure is mapped to a complicated active executable, with the compiler as some sort of ribosome. But I still feel that what each succeeding base does should not depend on what the preceding bases did to the derivation (thus far) from the start symbol.

So what do we expect? I’m thinking this genotype-phenotype mapping is not a one-time thing. There should be many different go-to type entry points in the genotype, and the compiled code should execute something that activates some of these go-to points. Thus, there should be several start symbols, and several go-to points. The compiled code should execute and produce a new set of start symbols that then activate their associated go-to points. That’s a more amusing picture but I’m pretty sure that isn’t enough.
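A toy sketch of that picture, just to pin down what I mean: the genome is a table of entry points, each of which yields a product and names the entry points it activates in the next round. The table `GENOME`, the labels, the product names and the function `run_rounds` are all invented for illustration, and the `executed` code is reduced to a simple lookup.

```python
# Hypothetical genome: entry point -> (product expressed, entry points activated next).
GENOME = {
    "A": ("protein_a", ["B", "C"]),
    "B": ("protein_b", ["A"]),
    "C": ("protein_c", []),
}


def run_rounds(genome, active, rounds):
    """Express the active entry points repeatedly and record each round's products."""
    history = []
    active = set(active)
    for _ in range(rounds):
        # Products expressed this round, sorted so the output is reproducible.
        history.append(sorted(genome[label][0] for label in active))
        # The expressed products decide which entry points fire next round.
        active = {nxt for label in active for nxt in genome[label][1]}
        if not active:
            break  # nothing left to activate
    return history


print(run_rounds(GENOME, ["A"], 4))
```

Starting from `A`, this toy genome alternates between expressing `protein_a` and expressing `protein_b` together with `protein_c`, so the phenotype is an ongoing process rather than a single compiled artifact.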
