We no longer offer ftp downloads. If there is a file you need referenced here, please contact me by email and I will get it to you.
Wed Jan 12 00:02:26 2005 Goal seeking
code
Posted by Tony Lawrence
Referencing: /Blog/B1226.html
This is an extremely unsophisticated program that demonstrates evolutionary DNA. You start by passing it a string:
./goal.pl Goal
The goal, starting from two strings "abcdefg" and "gfedcba", is to end up with your string duplicated four times: "GoalGoalGoalGoal". It does this by having children that are produced by mating with other strings in its memory. The child produced is just position one from string1, position two from string two, etc. So starting with "abcdefg" and "gfedcba", the first "child" would be "afcdebg". However, random mutations are introduced to the children. Each string is then scored, and the highest scores get to reproduce again. This reproduction pattern tends toward similar parents producing similar children - much like real life.
Mutations here are more apt to change the case of a letter than change it outright, but they are aggressive. You can easily change these factors to see how they affect the results. So that the string can grow, letters can be doubled, though they can also be taken away. The strings are held to a maximum size of four times the length of your goal string, and the total number of strings is trimmed to a small number so that this doesn't grind away for hours and hours sucking up cpu and memory..
It also does a little Permian Extinction now and then - which brings some fresh faces into the gene pool. Simply, we just randomly wipe out up to 75% of the strings on each generation, thus giving the new strings more chance to mate.
It doesn't take long for the first patterns to appear. It's interesting that "bad" strings often hold good positions for quite a while, but by a few thousand generations, the top scoring strings have at least one correctly anchored pattern.
It may take quite a while for better patterns to emerge. People who prefer to believe that life came from "intelligent design" sometimes forget how much time the primitive life of this planet had for its patterns to emerge. For hundreds of millions of years, simple living things reproduced over billions and billions of generations, gathering improvements as mutations failed and succeeded or made no difference at all. Bacteria might reproduce every twenty minutes. How many generations is that over a few hundred million years? What improvements can come about? Well, we are the results of just that. Life on this planet had many more generations available than you are likely to have the patience or time for on your computer.
I've recently been re-reading John McPhee's Annals of the Former World. I have to keep a large dictionary beside me while reading this, which amuses my wife greatly as she doesn't often see me nonplussed by mere words. Nevertheless, it's a great read, and while it is starting to get a bit out of date with current knowledge, I still enjoy it. McPhee notes that for most of us, the concept of geographic time is beyond real comprehension - 100 million years or so is just a number, it's nothing that we can really appreciate. Analogies like "if all of the earth's history were the length of your arm, all of human history could be erased by a fingernail file" don't really help, but it is these incredible lengths of time that made complex life possible.
The scoring here rewards simple goals, and gives higher scores for better matches. The scoring is deliberately imperfect - though it is programatically interesting:
sub score {
my $trg=shift;
my $len=length($Goal);
my $lct=lc($trg);
my $s=0;
my $G=$Goal;
my $g=lc($Goal);
# This looks for matches in position. If the goal is "Abc" and the
# target is "Xyz-Mno-123-456", this compares "abc" to "xyz", then
# "Abc" to "Xyz, then "ab" to "xy" (and "Ab" to "Xy"),
# "a" to "x", "bc" to "yz", "b" to "y", and "c" to "z".
#
# It then moves to "abc" compared to "mno", "ab" to "mn", etc.
#
# The longer the comparison, the more score for a match. "Abc"
# compared to "abc-xyz-mno-sdf" will match all of the case insensitive cases
# for "Abc" to "abc" and will match "bc" to "bc" and "c" to "c"
# for both insensitive and case sensitive tests.
for ($y=0;$y< length($trg);$y+=$len) {
for ($z=0;$z<$len;$z++) {
for ($zz=$len-$z;$zz>0;$zz--) {
$s+=(10 * $zz) if substr($g,$z,$zz) eq substr($lct,$y+$z,$zz);
$s+=(20 * $zz) if substr($G,$z,$zz) eq substr($trg,$y+$z,$zz);
}
}
}
if ( $trg eq "$Goal$Goal$Goal$Goal") {
print "$s $trg - DONE!\n";
exit 0;
}
return $s;
}
Crude as it is, the strings slowly get closer to the desired
result. Just leave it alone for a few hours, and you should see the
strings building toward the goal. There will be imperfect versions
along the way; for example this has "Fog" as its target:
-----Fog----(Population 14)---- 01 of generation 00000550: FJg-BCg-bbb-bAA 90 02 of generation 00000550: CAg-ggg-AAb-bgg 90 03 of generation 00000550: FBg-ggg-bbb-AAA 90 04 of generation 00000550: FJg-ggg-bbB-AAA 90 05 of generation 00000550: rOg-ABg-bAb-bBb 90 06 of generation 00000550: rOg-ABg-bbb-bAb 90 07 of generation 00000550: Ccg-Bgg-AAb-bgg 90 08 of generation 00000550: AAA-AAb-bbb-bbB 0 09 of generation 00000550: ABa-Abb-bbb-bbb 0 10 of generation 00000550: ABA-abb-bbb-bbb 0
After just 550 generations, this is off to a good start.
Shorter strings of course get good patterns early.
-----Xy----(Population 24)---- 01 of generation 00000625: XG-XC-AB-XA 90 02 of generation 00000625: XG-XB-AB-XC 90 03 of generation 00000625: XD-XB-AB-XA 90 04 of generation 00000625: XA-XA-BB-XA 90 05 of generation 00000625: XB-XC-AB-XA 90 06 of generation 00000625: XG-XA-CB-XB 90 07 of generation 00000625: XG-XB-CB-XA 90 08 of generation 00000625: Aa-Xb-AB-XB 60 09 of generation 00000625: Aa-Xb-AA-XB 60 10 of generation 00000625: Aa-Xb-BC-XA 60This has more partial matches, and is well on its way to perfection. As it gets closer, though, changing becomes harder, because most changes won't score well enough to beat these already good scores. The change has to be more beneficial, which is just as it works in the real world: succesful organisms tend not to evolve quickly.
Of course, we get much quicker results for a single letter goal:
-----M----(Population 22)---- 01 of generation 00001025: M-M-F-C 60 02 of generation 00001025: B-M-M-C 60 03 of generation 00001025: M-M-C-D 60 04 of generation 00001025: M-M-G-C 60 05 of generation 00001025: O-M-M-B 60 06 of generation 00001025: M-M-A-E 60 07 of generation 00001025: M-M-F-A 60 08 of generation 00001025: D-M-C-A 30 09 of generation 00001025: C-B-M-A 30 10 of generation 00001025: A-M-C-A 30 At Generation 1034 with score 120 MMMM - DONE!
It is is fun to play with various mutation/extinction schemes just to see what happens, though I don't think we have any scientific validity here. This was fun to write, and fun to play with, but it's not anything like the Avida project.
Perl source at ftp://ftp.aplawrence.com/pub/goal.pl for all the other off-balance folks.
Have you tried Searching this site?
Unix/Linux/Mac OS X support by phone, email or on-site: Support Rates
This is a Unix/Linux resource website. It contains technical articles about Unix, Linux and general computing related subjects, opinion, news, help files, how-to's, tutorials and more. We appreciate comments and article submissions.
Many of the products and books I review are things I purchased for my own use. Some were given to me specifically for the purpose of reviewing them. I resell or can earn commissions from the sale of some of these items. Links within these pages may be affiliate links that pay me for referring you to them. That's mostly insignificant amounts of money; whenever it is not I have made my relationship plain. I also may own stock in companies mentioned here. If you have any question, please do feel free to contact me.
Specific links that take you to pages that allow you to purchase the item I reviewed are very likely to pay me a commission. Many of the books I review were given to me by the publishers specifically for the purpose of writing a review. These gifts and referral fees do not affect my opinions; I often give bad reviews anyway.
We use Google third-party advertising companies to serve ads when you visit our website. These companies may use information (not including your name, address, email address, or telephone number) about your visits to this and other websites in order to provide advertisements about goods and services of interest to you. If you would like more information about this practice and to know your choices about not having this information used by these companies, click here.
Click here to add your comments
Don't miss responses! Subscribe to Comments by RSS or by Email
Click here to add your comments
If you want a picture to show with your comment, go get a Gravatar