Sun Dec 19 10:48:06 2004 Linux clusters
Posted by Drag
Search Keys: linux|clusters
Referencing: /Words/2004_12_19.html
Right now I am working on my own cluster: 3 machines using the OpenSSI project. It's a similar concept to OpenMosix, and it uses some of OpenMosix's algorithms for its load balancing.
It uses several software projects and mixes them together in order to try to create a single system image cluster.
Basically you get one root filing system, one unified /dev/ directory, network-transparent filing systems, and one kernel on each machine, all working together as one big machine. Added to that you have high availability features such as mirrored root filing systems, so if the "initnode" fails, any other machine with a copy of the root filing system can take over. It also supports a virtual network address, so that any client machines accessing the cluster will see all the machines as one machine, and the network load gets balanced through all the exposed interfaces. (Generally you run a private network just for the cluster and have a few machines with dual NICs.)
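To make the failover idea a bit more concrete, here is a minimal sketch in Python. It is not OpenSSI code and does not reflect how OpenSSI actually elects a new initnode; the Node class, the has_root_copy flag and the "lowest node number wins" rule are all assumptions made up for illustration. It only shows the rule described above: if the initnode dies, a surviving machine that holds a copy of the root filing system takes over.

```python
from dataclasses import dataclass


@dataclass
class Node:
    number: int
    has_root_copy: bool  # assumption: this node mirrors the root filing system
    alive: bool = True


def pick_initnode(nodes, current):
    """Return the number of the node that should act as initnode.

    Hypothetical rule: keep the current initnode while it is alive,
    otherwise fall back to the lowest-numbered live node with a root copy.
    """
    by_number = {n.number: n for n in nodes}
    if by_number[current].alive:
        return current
    candidates = [n.number for n in nodes if n.alive and n.has_root_copy]
    if not candidates:
        raise RuntimeError("no surviving node holds a copy of the root filing system")
    return min(candidates)


if __name__ == "__main__":
    cluster = [Node(1, True), Node(2, True), Node(3, False)]
    print(pick_initnode(cluster, current=1))  # node 1 is still up
    cluster[0].alive = False                  # simulate the initnode failing
    print(pick_initnode(cluster, current=1))  # a node with a root copy takes over
```

Node 3 never becomes the initnode here because it has no copy of the root filing system, which is the whole point of mirroring it on more than one machine.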
One way to think about it is that your cluster is one big NUMA machine, with the private network being the data interconnect between the individual nodes. The only gotcha is that it doesn't support moving individual threads from multithreaded apps, only forked and individual processes.
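The process-versus-thread limitation can be illustrated with a toy load balancer. This is only a sketch under my own assumptions, not the OpenSSI or OpenMosix algorithm: the process names, the cost numbers and the migrate_one function are invented. The unit that moves between nodes is always a whole process, so a multithreaded app counts as a single entry and all of its threads travel together.

```python
def node_loads(placement, cost, nodes):
    """Sum the cost of every process placed on each node."""
    totals = {n: 0 for n in nodes}
    for proc, node in placement.items():
        totals[node] += cost[proc]
    return totals


def migrate_one(placement, cost, nodes):
    """Move one whole process from the busiest node to the idlest node.

    Returns (process, from_node, to_node), or None if no move would help.
    A process is only worth moving if its cost is smaller than the gap
    between the two nodes; otherwise the imbalance just flips over.
    """
    totals = node_loads(placement, cost, nodes)
    busiest = max(nodes, key=lambda n: totals[n])
    idlest = min(nodes, key=lambda n: totals[n])
    gap = totals[busiest] - totals[idlest]
    movable = [p for p, n in placement.items()
               if n == busiest and 0 < cost[p] < gap]
    if not movable:
        return None
    # Pick the process that leaves the two nodes closest to even.
    proc = min(movable, key=lambda p: abs(gap - 2 * cost[p]))
    placement[proc] = idlest
    return proc, busiest, idlest


if __name__ == "__main__":
    nodes = [1, 2, 3]
    # "render" stands in for a multithreaded app: one entry, moved as a unit.
    cost = {"render": 4, "compile": 2, "shell": 1, "httpd": 1}
    placement = {"render": 1, "compile": 1, "shell": 1, "httpd": 2}
    print(migrate_one(placement, cost, nodes))
    print(node_loads(placement, cost, nodes))
```

If the balancer could move individual threads, "render" could be split across nodes; since it can't, the best it can do is relocate the whole thing.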
Very ambitious project. I am running into some problems, of course, but they shouldn't take long to solve.
There are several types of clusters: computational clusters like Beowulf, high availability clusters, load balancing clusters, and a few others.
I expect that in a couple of years it will get to the point where these things will be plug-n-play type affairs and be fully mature and stable. Right now you can buy several different types of clusters from various companies that are tested and such.
Also check out Red Hat's "stateless" Linux. To visualize that, think about Knoppix live CD-ROMs. Now, instead of running a read-only Linux from a CD-ROM, think about loading the OS parts as you need them over a network, with the user having access to their files on a local hard drive, and having that mirrored on backup file servers and accessible through those from other machines. The machines would range from thin X clients to thick client desktops to machines that stay functional when you remove them from the network, like laptops.
They want it to get to the point where you can go up and grab any PC, toss it out a window, and have a replacement installed within minutes with absolutely no loss of data or configuration for the computer's user.
If you can then imagine using something like that in combination with OpenSSI/OpenMosix type projects and IBM's Cell-type stuff, I am thinking that eventually we will have network operating systems instead of just computer operating systems. Entire buildings would be running only a few operating systems, with the client PCs being mostly just another node. Sort of like returning to the mainframe days, except the mainframe is replaced by the cluster and the terminals are the nodes in the cluster.
At least that's what I figure.
OpenSSI project:
http://openssi.org/cgi-bin/view?page=openssi.html
Stateless linux:
http://people.redhat.com/dmalcolm/stateless/stateless-linux-HOWTO-en/
http://people.redhat.com/~hp/stateless/
/Drag/B1199.html copyright December 2004 Drag Sidious All Rights Reserved
---December 19, 2004
Well, makes me wish I would of paid a bit better attention to my grammar when I posted that. Let that be a lesson to you all. :-)
And I think I was slightly incorrect about the unified /dev/ stuff in OpenSSI. I think that is a goal, but right now you use an "onnode" command to access different hard drives, for instance (and for other tasks).
So if you want to find information about a hard drive on node2 with the hdparm utility, you would have to go:
onnode 2 hdparm -i /dev/hda
I am still a newb to this clustering stuff. Very interesting though, I think that it has a lot of future possibilities as it matures.
--Drag
---December 19, 2004
Oh, and so you don't think that it is all very experimental stuff: lots of people currently use Linux clusters, so for many tasks it's very mature. The most powerful computers in the world run Linux on clusters. For instance, the #2 most powerful known computer as per top500 is a 20-machine SGI Beowulf-style cluster. Each machine has 512 CPUs (20 x 512 = 10,240 in total), so the whole thing is very massive.
Another example is Google.com, which is run on several massive clusters of commodity-style PC machines. They use combinations of load balancing clusters, and then they use clusters of those clusters as high availability clusters, so that if a machine goes down in one cluster, another section will kick in seamlessly and take over for the wounded section, leaving techs to repair it at their leisure.
Clusters are also very common for things that need large amounts of computational power but have a relatively low budget, for instance at colleges and astronomical labs, for doing calculations about star movements and stuff like that. They were also used in part to help decode the human genome and develop new drugs and such. Rendering clusters are used in rendering and developing major Hollywood motion pictures like the LOTR series. Anything you need a powerful computer for.
--Drag
---December 19, 2004
Most of those would be considered Beowulf clusters and require special programming techniques to use properly. Stuff like OpenSSI and OpenMosix would be for much more common uses.
In fact OpenMosix is easy enough that you can download Knoppix-style live CDs and build clusters pretty much on the fly. Nice if you have a lot of DVD movies to rip or need to compile a lot of programs. Distcc is also useful for that; it's a front end for gcc that will break up compiling jobs and farm them out to other machines on a network.
--Drag
---December 19, 2004
"Well, makes me wish I would of paid a bit better attention to my grammar when I posted that."
Just tell me anything you want to change and I'll fix it for you. I just saw this as useful and interesting enough to stand alone.
--TonyLawrence