<?xml version="1.0"?>
<article id="index"><artheader><title>Creating SSI Clusters Using UML HOWTO</title><author><firstname>Brian</firstname><othername>J.</othername><surname>Watson</surname><affiliation><address format="linespecific">           <email>Brian.J.Watson@hp.com</email>
        </address></affiliation></author><revhistory><revision><revnumber>1.04</revnumber><date>2002-05-29</date><authorinitials>bjw</authorinitials><revremark>            LDP review
         </revremark></revision><revision><revnumber>1.03</revnumber><date>2002-05-23</date><authorinitials>bpm</authorinitials><revremark>            LDP review
         </revremark></revision><revision><revnumber>1.02</revnumber><date>2002-05-13</date><authorinitials>bjw</authorinitials><revremark>            Fixed minor typos and errors
         </revremark></revision><revision><revnumber>1.00</revnumber><date>2002-05-09</date><authorinitials>bjw</authorinitials><revremark>            Initial release
         </revremark></revision></revhistory><abstract><para>     This is a description of how to create a Single System Image (SSI)
     cluster of virtual User-Mode Linux (UML) machines. After explaining
     how to use the pre-built SSI/UML binaries, this document demonstrates
     what an SSI cluster can do. Then it shows more advanced users
     how to build their own SSI/UML kernels, ramdisks and root images.
     Following that, it provides an overview of how to move to a
     hardware-based SSI cluster. It concludes with a set of links
     and an invitation to contribute to the SSI Clustering project.
    </para></abstract></artheader><sect1 id="intro"><title>Introduction</title><para>      An SSI cluster is a collection of computers that work together as if 
      they are a single highly-available supercomputer. There are at least 
      three reasons to create an SSI cluster of virtual UML machines.
   </para><para>      <itemizedlist><listitem><para>	   Allow new users to easily experiment with SSI clustering,
	   before investing time and hardware resources into creating
	   a hardware-based cluster.
         </para></listitem><listitem><para>	   Provide a friendly testing and debugging environment 
	   for SSI developers.
         </para></listitem><listitem><para>	  Let developers test hundred-node clusters with only 
	  ten or so physical machines.
         </para></listitem></itemizedlist>
   </para><sect2 id="ssioverview"><title>Overview of SSI Clustering</title><para>    The <emphasis>raison d'&#234;tre</emphasis> of the 
    <ulink url="http://ssic-linux.sf.net/">SSI Clustering project</ulink> is to provide 
    a full, highly available SSI environment for Linux. 

    Goals for this project include availability, scalability and 
    manageability, using standard servers. 

    Technology pieces include: membership, single root and single init, 
    single process space and process migration, load leveling, single 
    IPC, device and networking space, and single management space. 
   </para><para>    The SSI project was seeded with HP's NonStop Clusters for 
    UnixWare (NSC) technology. 

    It also leverages other open source technologies, such as Cluster
    Infrastructure (CI), Global File System (GFS), keepalive/spawndaemon, 
    Linux Virtual Server (LVS), and the Mosix load-leveler, to create the best
    general-purpose clustering environment on Linux.  </para><sect3 id="ci"><title>Cluster Infrastructure (CI)</title><para>     The <ulink url="http://ci-linux.sf.net/">CI project</ulink> is developing a common infrastructure 
     for Linux clustering by extending the 
     Cluster Membership Subsystem (CLMS) and 
     Internode Communication Subsystem (ICS) from HP's 
     NonStop Clusters for UnixWare (NSC) code base.
    </para></sect3><sect3 id="gfs"><title>Global File System (GFS)</title><para>     GFS is a parallel physical file system for Linux. It allows multiple
     computers to simultaneously share a single drive. 
     The SSI Clustering project uses GFS for its single, shared root.
     GFS was originally developed and open-sourced by <ulink url="http://www.sistina.com/products_gfs.htm">Sistina Software</ulink>. 
     Later they decided to close the GFS source, which prompted the creation 
     of the <ulink url="http://www.opengfs.org/">OpenGFS project</ulink> 
     to maintain a version of GFS that is still under the GPL. 
    </para></sect3><sect3 id="keepalive"><title>Keepalive/Spawndaemon</title><para>     <ulink url="http://ci-linux.sourceforge.net/keepalive.shtml"><command moreinfo="none">keepalive</command></ulink> is a process monitoring 
     and restart daemon that was ported
     from HP's NonStop Clusters for UnixWare (NSC). It offers 
     significantly more flexibility than the <parameter moreinfo="none">respawn</parameter> 
     feature of <command moreinfo="none">init</command>.
    </para><para>     <ulink url="http://ci-linux.sourceforge.net/spawndaemon.shtml"><command moreinfo="none">spawndaemon</command></ulink> provides a command-line 
     interface for <command moreinfo="none">keepalive</command>. It's used to control which
     processes <command moreinfo="none">keepalive</command> monitors, along with
     various other parameters related to monitoring and restart.
    </para><para>     Keepalive/spawndaemon is currently incompatible with the GFS shared
     root. <command moreinfo="none">keepalive</command> makes use of shared writable 
     memory mapped files, which OpenGFS does not yet support. It's
     only mentioned for the sake of completeness.
    </para></sect3><sect3 id="lvs"><title>Linux Virtual Server (LVS)</title><para>     <ulink url="http://www.LinuxVirtualServer.org/">LVS</ulink> 
     allows you to build highly scalable and highly available 
     network services over a set of cluster nodes. LVS offers various 
     ways to load-balance connections (e.g., round-robin, 
     least connection, etc.) across the cluster. The whole cluster 
     is known to the outside world by a single IP address.
    </para><para>     The SSI project will become more tightly integrated with LVS in the
     future. An advantage will be greatly reduced administrative overhead,
     because SSI kernels have the information necessary to automate most
     LVS configuration. Another advantage will be that the SSI environment 
     allows much tighter coordination among server nodes.
    </para><para>     LVS support is turned off in the current binary release of SSI/UML.
     To experiment with it you must build your own kernel as described in
     <xref linkend="buildkernel"></xref>.
    </para></sect3><sect3 id="mosixll"><title>Mosix Load-Leveler</title><para>     The <ulink url="http://openmosix.sourceforge.net/">Mosix</ulink>
     load-leveler provides automatic load-balancing within a cluster.
     Using the Mosix algorithms, the load of each node is calculated and
     compared to the loads of the other nodes in the cluster. If it's
     determined that a node is overloaded, the load-leveler chooses a
     process to migrate to the best underloaded node.
    </para><para>     Only the load-leveling algorithms have been taken from Mosix. The 
     SSI Clustering project is using its own process migration model, 
     membership mechanism and information sharing scheme.
    </para><para>     The Mosix load-leveler is turned off in the current binary release 
     of SSI/UML.
     To experiment with it you must build your own kernel as described in
     <xref linkend="buildkernel"></xref>.
    </para></sect3></sect2><sect2 id="umloverview"><title>Overview of UML</title><para>    <ulink url="http://user-mode-linux.sf.net/">User-Mode Linux</ulink> (UML)
    allows you to run one or more virtual Linux machines on a host Linux 
    system. It includes virtual block, network, and serial devices to
    provide an environment that is almost as full-featured as a 
    hardware-based machine.
   </para></sect2><sect2 id="audience"><title>Intended Audience</title><para>    The following are various cluster types found in use today. If you
    use or intend to use one of these cluster types, you may want to 
    consider SSI clustering as an alternative or addition.
   </para><para>    <itemizedlist><listitem><para>       High performance (HP) clusters, typified by <ulink url="http://www.beowulf.org/">Beowulf clusters</ulink>, 
       are constructed to run parallel programs (weather simulations, 
       data mining, etc.).
      </para></listitem><listitem><para>       Load-leveling clusters, typified by <ulink url="http://openmosix.sourceforge.net/">Mosix</ulink>, are constructed 
       to allow a user on one node to spread his workload 
       transparently across all nodes in the cluster. This can be 
       very useful for compute intensive, long running jobs that 
       aren't massively parallel.
      </para></listitem><listitem><para>       Web-service clusters, typified by the <ulink url="http://www.LinuxVirtualServer.org/">Linux Virtual 
       Server</ulink> (LVS) project and <ulink url="http://sources.redhat.com/piranha/">Piranha</ulink>, 
       do a different kind 
       of load leveling. Incoming web service requests are 
       load-leveled by a front end system across a set of 
       standard servers.
      </para></listitem><listitem><para>       Storage clusters, typified by <ulink url="http://www.sistina.com/products_gfs.htm">Sistina's 
       GFS</ulink> and the <ulink url="http://www.opengfs.org/">OpenGFS project</ulink>, consist of nodes which supply parallel, 
       coherent, and highly available access to filesystem data. 
      </para></listitem><listitem><para>       Database clusters, typified by <ulink url="http://oracle.com/ip/index.html?rac_home.html">Oracle9i RAC</ulink> (formerly Oracle Parallel Server), 
       consist of nodes which supply 
       parallel, coherent, and HA access to a database. 
      </para></listitem><listitem><para>       High Availability clusters, typified by <ulink url="http://www.steeleye.com/products/linux/">Lifekeeper</ulink>, <ulink url="http://oss.sgi.com/projects/failsafe/">FailSafe</ulink> and <ulink url="http://linux-ha.org/heartbeat/">Heartbeat</ulink>, are also often known as failover 
       clusters. Resources, most importantly applications and 
       nodes, are monitored. When a failure is detected, scripts 
       are used to fail over IP addresses, disks, and filesystems, 
       as well as restarting applications. 
      </para></listitem></itemizedlist>
   </para><para>    For more information about how SSI clustering compares to the
    cluster types above, read Bruce Walker's <ulink url="http://ssic-linux.sf.net/ssi-intro-v4.pdf">Introduction to Single System Image Clustering</ulink>.
   </para></sect2><sect2 id="sysreqs"><title>System Requirements</title><para>    To create an SSI cluster of virtual UML machines,
    you need an Intel x86-based computer running any Linux distribution with
    a 2.2.15 or later kernel. About two gigabytes of available
    hard drive space are needed for each node's swap space, the original disk 
    image, and its working copy.
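     As a rough pre-flight check (a sketch using only standard tools, 
     not anything from the SSI/UML packages), you can confirm the host 
     kernel release and the free space on the file system where you 
     plan to keep the images:
<screen format="linespecific">host$ uname -r
host$ df -h .
</screen>
     Here <command moreinfo="none">uname -r</command> prints the running kernel release, 
     and <command moreinfo="none">df -h .</command> reports the free space on the current 
     directory's file system.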
   </para><para>    A reasonably fast processor and sufficient memory are necessary
    to ensure good performance while running several virtual machines.
    The systems I've used so far have not performed well.
   </para><para>    One was a 400 MHz PII with 192 MB of memory running Sawfish as its
    window manager. Bringing up a three node cluster was quite slow
    and sometimes failed, maybe due to problems with memory pressure
    in either UML or the UML port of SSI.
   </para><para>    Another was a two-way 200 MHz Pentium Pro with 192 MB of memory 
    that used a second machine as its X server. 
    A three node cluster booted quicker and failed less often, 
    but performance was still less than satisfactory.
   </para><para>    More testing is needed to know what the appropriate system requirements
    are. User feedback would be most useful, and can be sent to 
    <email>ssic-linux-devel@lists.sf.net</email>.
   </para></sect2><sect2 id="newversions"><title>New Versions</title><para>    The latest version of this HOWTO will always be made available on
    the <ulink url="http://ssic-linux.sf.net/">SSI project website</ulink>, 
    in a variety of formats:
   </para><para>   <itemizedlist><listitem><para>      <ulink url="http://ssic-linux.sf.net/ssiuml-howto/">HTML</ulink>
     </para></listitem><listitem><para>      <ulink url="http://ssic-linux.sf.net/ssiuml-howto.pdf">PDF</ulink>
     </para></listitem><listitem><para>      <ulink url="http://ssic-linux.sf.net/ssiuml-howto.sgml">SGML 
        source</ulink>
     </para></listitem></itemizedlist>
   </para></sect2><sect2 id="feedback"><title>Feedback</title><para>    Feedback is most certainly welcome for this document. Please
    send your additions, comments and criticisms to the following
    email address: <email>ssic-linux-devel@lists.sf.net</email>.
   </para></sect2><sect2 id="copyright"><title>Copyright Information</title><para>    This document is copyrighted &#169; 2002 Hewlett-Packard Company and is
    distributed under the terms of the Linux Documentation Project
    (LDP) license, stated below.
   </para><para>    Unless otherwise stated, Linux HOWTO documents are
    copyrighted by their respective authors. Linux HOWTO documents may
    be reproduced and distributed in whole or in part, in any medium
    physical or electronic, as long as this copyright notice is
    retained on all copies. Commercial redistribution is allowed and
    encouraged; however, the author would like to be notified of any
    such distributions.
   </para><para>    All translations, derivative works, or aggregate works
    incorporating any Linux HOWTO documents must be covered under this
    copyright notice. That is, you may not produce a derivative work
    from a HOWTO and impose additional restrictions on its
    distribution. Exceptions to these rules may be granted under
    certain conditions; please contact the Linux HOWTO coordinator at
    the address given below.
   </para><para>    In short, we wish to promote dissemination of this
    information through as many channels as possible. However, we do
    wish to retain copyright on the HOWTO documents, and would like to
    be notified of any plans to redistribute the HOWTOs.
   </para><para>    If you have any questions, please contact 
    <email>linux-howto@en.tdlp.org</email>
   </para></sect2><sect2 id="disclaimer"><title>Disclaimer</title><para>    No liability for the contents of this document can be accepted.
    Use the concepts, examples and other content at your own risk.
    As this is a new edition of this document, there may be errors
    and inaccuracies that could damage your system.
    Although such damage is highly unlikely, proceed with caution;
    the author(s) do not take any responsibility for it.
   </para><para>    All copyrights are held by their respective owners, unless
    specifically noted otherwise.  Use of a term in this document
    should not be regarded as affecting the validity of any trademark
    or service mark.
   </para><para>    Naming of particular products or brands should not be seen 
    as endorsements.
   </para><para>    You are strongly recommended to make a backup of your system 
    before major installations, and back up at regular intervals.
   </para></sect2></sect1><sect1 id="gettingstarted"><title>Getting Started</title><para>    This section is a quick start guide for installing and running an SSI
    cluster of virtual UML machines. The most time-consuming part of this 
    procedure is downloading the root image.
  </para><sect2 id="getroot"><title>Root Image</title><para>     First you need to download an <ulink url="http://prdownloads.sf.net/ssic-linux/ssiuml-root-rh72-0.6.5-1.tar.bz2">SSI-ready root image</ulink>. The compressed image
     weighs in at over 150 MB, which will take more than six hours to
     download over a 56 kbps modem, or about 45 minutes over a 500 kbps 
     broadband connection. 
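     One way to fetch the image from the command line, assuming 
     <command moreinfo="none">wget</command> is installed on the host (any download tool 
     will do), is sketched below. The <parameter moreinfo="none">-c</parameter> option lets 
     an interrupted download resume, <parameter moreinfo="none">-P</parameter> chooses the 
     directory to save into, and <command moreinfo="none">bzip2 -t</command> checks the 
     archive's integrity before you extract it.
<screen format="linespecific">host$ wget -c -P ~ http://prdownloads.sf.net/ssic-linux/ssiuml-root-rh72-0.6.5-1.tar.bz2
host$ bzip2 -t ~/ssiuml-root-rh72-0.6.5-1.tar.bz2
</screen>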
   </para><para>     The image is based on Red Hat 7.2. This means the virtual SSI cluster
     will be running Red Hat, but it does not matter which distribution
     you run on the host system. 
     A more advanced user can make a new root
     image based on another distribution. This is described in 
     <xref linkend="buildroot"></xref>.
   </para><para>     After downloading the root image, extract and install it.
   </para><screen format="linespecific">host$ tar jxvf ~/ssiuml-root-rh72-0.6.5-1.tar.bz2
host$ su
host# cd ssiuml-root-rh72
host# make install
host# <keycap moreinfo="none">Ctrl-D</keycap>
   </screen></sect2><sect2 id="umlutils"><title>UML Utilities</title><para>     Download the <ulink url="http://prdownloads.sf.net/user-mode-linux/uml_utilities_20020428.tar.bz2">UML utilities</ulink>. Extract, build, and install them.
   </para><screen format="linespecific">host$ tar jxvf ~/uml_utilities_20020428.tar.bz2
host$ su
host# cd tools
host# make install
host# <keycap moreinfo="none">Ctrl-D</keycap>
   </screen></sect2><sect2 id="ssiumlutils"><title>SSI/UML Utilities</title><para>     Download the <ulink url="http://prdownloads.sf.net/ssic-linux/ssiuml-utils-0.6.5-1.tar.bz2">SSI/UML utilities</ulink>. Extract, build, and install them.
   </para><screen format="linespecific">host$ tar jxvf ~/ssiuml-utils-0.6.5-1.tar.bz2
host$ su
host# cd ssiuml-utils
host# make install
host# <keycap moreinfo="none">Ctrl-D</keycap>
   </screen></sect2><sect2 id="booting"><title>Booting the Cluster</title><para>     Assuming the X Window System is running, or the <parameter moreinfo="none">DISPLAY</parameter>
     variable is set to an available X server, start a two-node 
     cluster with
   </para><screen format="linespecific">host$ ssi-start 2
   </screen><para>     This command boots nodes 1 and 2. It displays each console
     in a new xterm. The nodes run through their early kernel initialization, 
     then seek each other out and form an SSI cluster before booting the
     rest of the way. If you're anxious to see what an SSI cluster can do,
     skip ahead to <xref linkend="playaround"></xref>.
   </para><para>    You'll probably notice that two other consoles are started. One is the
    lock server node, which is an artefact of how the GFS shared root
    is implemented at this time. The lock server is not a node in the cluster, 
    and its console won't give you a login prompt. For more information about 
    the lock server, see <xref linkend="gfslinks"></xref>. The other console
    is for the UML virtual networking switch daemon. It won't give you a 
    prompt, either.
   </para><para>     Note that only one SSI/UML cluster can be running at a time, although
     it can be run as a non-root user.
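     To check whether a cluster is already running before you start 
     another one, a simple approach is to look for 
     <command moreinfo="none">linux-ssi</command> processes on the host. This is a sketch 
     using standard tools; it assumes the UML kernel shows up in the 
     process list as <command moreinfo="none">linux-ssi</command>, the name used when 
     cleaning up after a hung shutdown.
<screen format="linespecific">host$ ps -e | grep linux-ssi
</screen>
     No output suggests that no cluster is currently running.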
   </para><para>     The argument to <command moreinfo="none">ssi-start</command> is the number of nodes 
     that should be in the cluster. It must be a number between 1 and 15. 
     If this argument is omitted, it defaults to 3.
     The fifteen node limit is arbitrary,
     and can be easily increased in future releases.
   </para><para>     To substitute your own SSI/UML files for the ones
     in <filename moreinfo="none">/usr/local/lib</filename> and 
     <filename moreinfo="none">/usr/local/bin</filename>, provide your pathnames in
     <filename moreinfo="none">~/.ssiuml/ssiuml.conf</filename>. 
     Values to override are
     <parameter class="option" moreinfo="none">KERNEL</parameter>,
     <parameter class="option" moreinfo="none">ROOT</parameter>,
     <parameter class="option" moreinfo="none">CIDEV</parameter>,
     <parameter class="option" moreinfo="none">INITRD</parameter>, and
     <parameter class="option" moreinfo="none">INITRD_MEMEXP</parameter>.
     This feature is only needed by an advanced user.
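     As an illustration only, an override file might look like the 
     following. This sketch assumes that 
     <filename moreinfo="none">ssiuml.conf</filename> is read as plain shell-style 
     <literal moreinfo="none">NAME=value</literal> assignments, which is not confirmed here, 
     and the pathnames are hypothetical. Check the 
     <command moreinfo="none">ssi-start</command> script for the exact format before 
     relying on it.
<screen format="linespecific"># ~/.ssiuml/ssiuml.conf
# Hypothetical example: the file format and all paths are assumptions.
KERNEL=/home/user/linux/linux
ROOT=/home/user/images/my-root-image
INITRD=/home/user/images/my-initrd
</screen>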
   </para></sect2><sect2 id="bootingindiv"><title>Booting an Individual Node</title><para>     Add nodes 3 and 5 to the cluster with
   </para><screen format="linespecific">host$ ssi-add 3 5
   </screen><para>     The arguments taken by <command moreinfo="none">ssi-add</command> are an arbitrary 
     list of node numbers. The node numbers must be between 1 and 15.
     At least one node number must be provided. For
     any node that is already up, <command moreinfo="none">ssi-add</command> 
     ignores it and moves on to the next argument in the list.
   </para></sect2><sect2 id="crashingindiv"><title>Crashing an Individual Node</title><para>     Simulate a crash of node 3 with
   </para><screen format="linespecific">host$ ssi-rm 3
   </screen><para>     Note that this command does not inform the other nodes about the crash.
     They must discover it through the cluster's node monitoring mechanism.
   </para><para>     The arguments taken by <command moreinfo="none">ssi-rm</command> are an arbitrary 
     list of node numbers. At least one node number must be provided.
   </para></sect2><sect2 id="shutdown"><title>Shutting Down the Cluster</title><para>    You can take down the entire cluster at once with
   </para><screen format="linespecific">host$ ssi-stop
   </screen><para>    If <command moreinfo="none">ssi-stop</command> hangs, interrupt it and shoot all the
    <command moreinfo="none">linux-ssi</command> processes before trying again.
   </para><screen format="linespecific">host$ killall -9 linux-ssi
host$ ssi-stop
   </screen><para>    Eventually, it should be possible to take down the cluster by running 
    <command moreinfo="none">shutdown</command> as root on any one of its consoles. This
    does not work just yet.
   </para></sect2></sect1><sect1 id="playaround"><title>Playing Around</title><para>   Bring up a three node cluster with <command moreinfo="none">ssi-start</command>.
   Log in to all three consoles as <userinput moreinfo="none">root</userinput>. The 
   initial password is <userinput moreinfo="none">root</userinput>, but you'll be
   forced to change it the first time you log in.
  </para><para>   The following demos should familiarize you with what an SSI 
   cluster can do.
  </para><sect2 id="procmove"><title>Process Movement, Inheriting Open Files and Devices</title><para>    Start <command moreinfo="none">dbdemo</command> on node 1.
   </para><screen format="linespecific">node1# cd ~/dbdemo
node1# ./dbdemo alphabet
   </screen><para>    The <command moreinfo="none">dbdemo</command> program "processes" records from the file 
    given as an argument. In this case, it's <filename moreinfo="none">alphabet</filename>, 
    which contains the ICAO alphabet used by aviators. For each record,
    <command moreinfo="none">dbdemo</command> writes the data to its terminal device
    and spins in a busy loop for a second to simulate an intensive calculation.
   </para><para>    The <command moreinfo="none">dbdemo</command> program is also listening on its terminal
    device for certain command keys.
   </para><table><title>Command Keys for <command moreinfo="none">dbdemo</command></title><tgroup cols="2"><thead><row><entry>Key</entry><entry>Description</entry></row></thead><tbody><row><entry>        <keycap moreinfo="none">1</keycap>-<keycap moreinfo="none">9</keycap>
       </entry><entry>        move to that node and continue with the next record
       </entry></row><row><entry>        <keycap moreinfo="none">Enter</keycap>
       </entry><entry>        periodically moves to a random node until you press a key
       </entry></row><row><entry>        <keycap moreinfo="none">q</keycap>
       </entry><entry>        quit
       </entry></row></tbody></tgroup></table><para>    Move <command moreinfo="none">dbdemo</command> to different nodes. Note that it continues
    to send output to the console where it was started, and that it continues
    to respond to keypresses from that console. This demonstrates that although
    the process is running on another node, it can remotely read and write
    the device it had open.
   </para><para>    Also note that when a process moves, it preserves its file offsets.
    After moving, <command moreinfo="none">dbdemo</command> continues processing records 
    from <filename moreinfo="none">alphabet</filename> as if nothing had happened.
   </para><para>    To confirm that the process moved to a new node, get its PID and use
    <command moreinfo="none">where_pid</command>. You can do this on any node.
   </para><screen format="linespecific">node3# ps -ef | grep dbdemo
node3# where_pid <emphasis>&lt;pid&gt;</emphasis>
2
   </screen><para>    If you like, you can <ulink url="http://ssic-linux.sourceforge.net/dbdemo.tar.bz2">download the source</ulink> for <command moreinfo="none">dbdemo</command>.
    It's also available as a tarball in the 
    <filename moreinfo="none">/root/dbdemo</filename> directory.
   </para></sect2><sect2 id="distproc"><title>Clusterwide PIDs, Distributed Process Relationships 
   	and Access, Clusterwide Job Control and Single Root</title><para>    From node 1's console, start up <command moreinfo="none">vi</command> on node 2.
    The <command moreinfo="none">onnode</command> command uses the SSI kernel's
    <function moreinfo="none">rexec</function> system call to remotely execute
    <command moreinfo="none">vi</command>.
   </para><screen format="linespecific">node1# onnode 2 vi /tmp/newfile
   </screen><para>    Confirm that it's on node 2 with <command moreinfo="none">where_pid</command>.
    You need to get its PID first.
   </para><screen format="linespecific">node3# ps -ef | grep vi
node3# where_pid <emphasis>&lt;pid&gt;</emphasis>
2
   </screen><para>    Type some text and save your work.
    On node 3, <command moreinfo="none">cat</command> the file to see the contents.
    This demonstrates the single root file system.
   </para><screen format="linespecific">node3# cat /tmp/newfile
some text
   </screen><para>    From node 3, kill the <command moreinfo="none">vi</command> session running on node 2. 
    You should see
    control of node 1's console given back to the shell.
   </para><screen format="linespecific">node3# kill <emphasis>&lt;pid&gt;</emphasis>
   </screen></sect2><sect2 id="clustfifos"><title>Clusterwide FIFOs</title><para>    Make a FIFO on the shared root.
   </para><screen format="linespecific">node1# mkfifo /fifo
   </screen><para>    <command moreinfo="none">echo</command> something into the FIFO on node 1.
   </para><screen format="linespecific">node1# echo something &gt;/fifo 
   </screen><para>    <command moreinfo="none">cat</command> the FIFO on node 2.
   </para><screen format="linespecific">node2# cat /fifo
something
   </screen><para>    This demonstrates that FIFOs are clusterwide and remotely accessible.
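    Because opening a FIFO for writing normally blocks until a reader 
    opens the other end, the <command moreinfo="none">echo</command> on node 1 does not 
    return until <command moreinfo="none">cat</command> runs on node 2. If you prefer to 
    keep node 1's shell free in the meantime, a variant using ordinary 
    shell job control (a sketch, assuming the cluster preserves the 
    usual FIFO blocking semantics) is:
<screen format="linespecific">node1# echo something &gt;/fifo &amp;
node2# cat /fifo
something
node1# rm /fifo
</screen>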
   </para></sect2><sect2 id="clustdevs"><title>Clusterwide Device Naming and Access</title><para>    On node 3, write "Hello World" to the console of node 1.
   </para><screen format="linespecific">node3# echo "Hello World" &gt;/devfs/node1/console
   </screen><para>    This shows that devices can be remotely accessed from anywhere in the
    cluster. Eventually, the node-specific subdirectories of 
    <filename moreinfo="none">/devfs</filename> will be merged together into a single
    device tree that can be mounted on <filename moreinfo="none">/dev</filename> without
    confusing non-cluster aware applications.
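    By analogy, and assuming each node's devices appear under a 
    subdirectory named after the node as in the example above, you 
    could list node 1's devices from node 3, or write to node 2's 
    console from node 1. The listing's output is not shown because it 
    depends on your configuration.
<screen format="linespecific">node3# ls /devfs/node1
node1# echo "Hello from node 1" &gt;/devfs/node2/console
</screen>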
   </para></sect2></sect1><sect1 id="buildkernel"><title>Building a Kernel and Ramdisk</title><para>   Building your own kernel and ramdisk is necessary if you want to
  </para><para>   <itemizedlist><listitem><para>      customize the kernel configuration,
     </para></listitem><listitem><para>      keep up with the absolute latest SSI code available through CVS,
     </para></listitem><listitem><para>      or test your SSI bugfix or kernel enhancement with UML.
     </para></listitem></itemizedlist>
  </para><para>   Otherwise, feel free to skip this section.
  </para><sect2 id="getssi"><title>Getting SSI Source</title><para>    SSI source code is available as official release tarballs and through CVS.
    The CVS repository contains the latest, bleeding-edge code. It can be less
    stable than the official release, but it has features and bugfixes that 
    the release does not have.
   </para><sect3 id="getssirelease"><title>Official Release</title><para>     The latest SSI release can be found at the top of this <ulink url="http://sourceforge.net/project/showfiles.php?group_id=32541">release list</ulink>. At the time of this writing, the latest
     release is 0.6.5.
    </para><para>     Download the latest release. Extract it.
    </para><screen format="linespecific">host$ tar jxvf ~/ssi-linux-2.4.16-v0.6.5.tar.bz2
    </screen><para>     Determine the corresponding kernel version number from the release name. 
     It appears before the SSI version number. For the 0.6.5 release, 
     the corresponding kernel version is 2.4.16.
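      If you script your builds, the kernel version can be pulled out 
      of the release name with standard shell parameter expansion. This 
      is only a sketch: the variable names are illustrative, and it 
      assumes the release name keeps the pattern seen in 
      <filename moreinfo="none">ssi-linux-2.4.16-v0.6.5</filename>.
<screen format="linespecific">host$ release=ssi-linux-2.4.16-v0.6.5
host$ kernel=${release#ssi-linux-}
host$ kernel=${kernel%-v*}
host$ echo $kernel
2.4.16
</screen>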
    </para></sect3><sect3 id="getssicvs"><title>CVS Checkout</title><para>     Follow these <ulink url="http://sourceforge.net/cvs/?group_id=32541">instructions</ulink> to do a CVS checkout of the latest SSI code.
     The module name is <emphasis>ssic-linux</emphasis>.
    </para><para>     You also need to check out the latest CI code. Follow these 
     <ulink url="http://sourceforge.net/cvs/?group_id=32543">instructions</ulink> to do that.
     The module name is <emphasis>ci-linux</emphasis>.
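     Whatever repository strings the linked instructions give you, the 
     checkouts themselves follow the usual CVS pattern sketched below. 
     Here <emphasis>ssi-cvsroot</emphasis> and <emphasis>ci-cvsroot</emphasis> 
     are placeholders for those repository strings, not actual values, 
     and the <command moreinfo="none">login</command> step applies only to anonymous 
     pserver access.
<screen format="linespecific">host$ cvs -d <emphasis>ssi-cvsroot</emphasis> login
host$ cvs -z3 -d <emphasis>ssi-cvsroot</emphasis> co ssic-linux
host$ cvs -d <emphasis>ci-cvsroot</emphasis> login
host$ cvs -z3 -d <emphasis>ci-cvsroot</emphasis> co ci-linux
</screen>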
    </para><para>     To do a developer checkout, you must be a CI or SSI developer.
     If you are interested in becoming a developer, read 
     <xref linkend="debugging"></xref> and <xref linkend="newfeatures"></xref>.
    </para><para>     Determine the corresponding kernel version with
    </para><screen format="linespecific">host$ head -4 ssic-linux/ssi-kernel/Makefile
VERSION = 2
PATCHLEVEL = 4
SUBLEVEL = 16
EXTRAVERSION =
    </screen><para>     In this case, the corresponding kernel version is 2.4.16. If you're
     paranoid, you might want to make sure the corresponding kernel version
     for CI is the same.
    </para><screen format="linespecific">host$ head -4 ci-linux/ci-kernel/Makefile
VERSION = 2
PATCHLEVEL = 4
SUBLEVEL = 16
EXTRAVERSION =
    </screen><para>     They will only differ when I'm merging them up to a new kernel version.
     There is a window between checking in the new CI code and the new SSI
     code. I'll do my best to minimize that window. If you happen to see it,
     wait a few hours, then update your sandboxes.
    </para><screen format="linespecific">host$ cd ssic-linux
host$ cvs up -d
host$ cd ../ci-linux
host$ cvs up -d
host$ cd ..
    </screen></sect3></sect2><sect2 id="base"><title>Getting the Base Kernel</title><para>    Download the appropriate kernel source. Get the version you
    determined in <xref linkend="getssi"></xref>. Kernel source can be found
    on this <ulink url="http://www.kernel.org/pub/linux/kernel/v2.4/">U.S. server</ulink> or any one of these <ulink url="http://kernel.org/mirrors/">mirrors</ulink> around the world.
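    For example, assuming <command moreinfo="none">wget</command> is available and that 
    the tarball follows the usual 
    <filename moreinfo="none">linux-VERSION.tar.bz2</filename> naming on that server, 
    the 2.4.16 source could be fetched into your home directory with:
<screen format="linespecific">host$ wget -c -P ~ http://www.kernel.org/pub/linux/kernel/v2.4/linux-2.4.16.tar.bz2
</screen>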
   </para><para>    Extract the source. This will take a little time.
   </para><screen format="linespecific">host$ tar jxvf ~/linux-2.4.16.tar.bz2
   </screen><para>    or
   </para><screen format="linespecific">host$ tar zxvf ~/linux-2.4.16.tar.gz
   </screen></sect2><sect2 id="applyssi"><title>Applying SSI Kernel Code</title><para>    Follow the appropriate instructions, based on whether you downloaded
    an official SSI release or did a CVS checkout.
   </para><sect3 id="applyssirelease"><title>Official Release</title><para>     Apply the patch in the SSI source tree.
    </para><screen format="linespecific">host$ cd linux
host$ patch -p1 &lt;../ssi-linux-2.4.16-v0.6.5/ssi-linux-2.4.16-v0.6.5.patch
    </screen></sect3><sect3 id="applyssicvs"><title>CVS Checkout</title><para>     Apply the UML patch from either the CI or SSI sandbox. It will fail
     on patching <filename moreinfo="none">Makefile</filename>. Don't worry about this.
    </para><screen format="linespecific">host$ cd linux
host$ patch -p1 &lt;../ssic-linux/3rd-party/uml-patch-2.4.18-22
    </screen><para>     Copy CI and SSI code into place.
    </para><screen format="linespecific">host$ cp -alf ../ssic-linux/ssi-kernel/. .
host$ cp -alf ../ci-linux/ci-kernel/. .
    </screen><para>     Apply the GFS patch from the SSI sandbox.
    </para><screen format="linespecific">host$ patch -p1 &lt;../ssic-linux/3rd-party/opengfs-ssi.patch
    </screen><para>     Apply any other patch from 
     <filename moreinfo="none">ssic-linux/3rd-party</filename> at your discretion. 
     These patches have had little or no testing in the UML environment. 
     The KDB patch is rather useless in this environment.
    </para></sect3></sect2><sect2 id="kernbuild"><title>Building the Kernel</title><para>    Configure the kernel with the provided configuration
    file. The following commands assume you are still in the kernel source
    directory.
   </para><screen format="linespecific">host$ cp config.uml .config
host$ make oldconfig ARCH=um
   </screen><para>    Build the kernel image and modules.
   </para><screen format="linespecific">host$ make dep linux modules ARCH=um
   </screen></sect2><sect2 id="gfshost"><title>Adding GFS Support to the Host</title><para>    To install the kernel you must be able to loopback mount the 
    GFS root image. You need to do a few things to the
    host system to make that possible.
   </para><para>    <ulink url="http://opengfs.org/sourceframe.html">Download</ulink> any version of OpenGFS <emphasis>after</emphasis>
    0.0.92, or <ulink url="http://sourceforge.net/cvs/?group_id=34688">check out</ulink> the latest source from CVS.
   </para><para>    Apply the appropriate kernel patches from the 
    <filename moreinfo="none">kernel_patches</filename> directory to your kernel source tree.
    Make sure you enable the /dev filesystem, but
    do <emphasis>not</emphasis> have it automatically mount at boot.
    (When you configure the kernel, select 'File systems -&gt; /dev
    filesystem support' and unselect 'File systems -&gt; /dev filesystem
    support -&gt; Automatically mount at boot'.)
    Build the kernel as usual, install it, rewrite your boot block and 
    reboot. 
   </para><para>    Configure, build and install the GFS modules and utilities.
   </para><screen format="linespecific">host$ cd opengfs
host$ ./autogen.sh --with-linux_srcdir=<emphasis>host_kernel_source_tree</emphasis>
host$ make
host$ su
host# make install
   </screen><para>    Configure two aliases for one of the host's network devices. The first
    alias should be 192.168.50.1, and the other should be 192.168.50.101.
    Both should have a netmask of 255.255.255.0.
   </para><screen format="linespecific">host# ifconfig eth0:0 192.168.50.1 netmask 255.255.255.0
host# ifconfig eth0:1 192.168.50.101 netmask 255.255.255.0
   </screen><para>    <command moreinfo="none">cat</command> the contents of 
    <filename moreinfo="none">/proc/partitions</filename>. Select two device names
    that you're not using for anything else, and make two loopback devices
    with their names. For example:
   </para><screen format="linespecific">host# mknod /dev/ide/host0/bus0/target0/lun0/part1 b 7 1
host# mknod /dev/ide/host0/bus0/target0/lun0/part2 b 7 2
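   </screen><para>    Block major number 7 belongs to the Linux loop driver, so minor
    numbers 1 and 2 make the new nodes refer to the same devices as
    <filename moreinfo="none">/dev/loop1</filename> and
    <filename moreinfo="none">/dev/loop2</filename>. As an optional check, list each
    loop device next to its new name. The major and minor numbers in the
    two lines of output should match. This assumes that your host names
    its loop devices <filename moreinfo="none">/dev/loopN</filename>, as the
    <command moreinfo="none">losetup</command> commands later in this document do, and
    that you used the device names from the example above.
   </para><screen format="linespecific">host# ls -l /dev/loop1 /dev/ide/host0/bus0/target0/lun0/part1
host# ls -l /dev/loop2 /dev/ide/host0/bus0/target0/lun0/part2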
   </screen><para>    Finally, load the necessary GFS modules and start the lock server daemon.
   </para><screen format="linespecific">host# modprobe gfs
host# modprobe memexp
host# memexpd
host# <keycap moreinfo="none">Ctrl-D</keycap>
   </screen><para>    Your host system now has GFS support.
   </para></sect2><sect2 id="kerninst"><title>Installing the Kernel</title><para>    Loopback mount the shared root.
   </para><screen format="linespecific">host$ su
host# losetup /dev/loop1 root_cidev
host# losetup /dev/loop2 root_fs
host# passemble
host# mount -t gfs -o hostdata=192.168.50.1 /dev/pool/pool0 /mnt
   </screen><para>    Install the modules into the root image.
   </para><screen format="linespecific">host# make modules_install ARCH=um INSTALL_MOD_PATH=/mnt
host# <keycap moreinfo="none">Ctrl-D</keycap>
   </screen></sect2><sect2 id="gfsuml"><title>Building GFS for UML</title><para>    You have to repeat some of the steps you did in <xref linkend="gfshost"></xref>.
    Extract another copy of the OpenGFS source. Call it
    <filename moreinfo="none">opengfs-uml</filename>. Add the line marked with a leading
    plus sign in the following excerpt to
    <filename moreinfo="none">make/modules.mk.in</filename>, leaving out the plus sign itself.
   </para><programlisting format="linespecific"> KSRC		:= /root/linux-ssi
 
 INCL_FLAGS	:= -I. -I.. -I$(GFS_ROOT)/src/include -I$(KSRC)/include \
+		    -I$(KSRC)/arch/um/include \
 		    $(EXTRA_INCL)
 DEF_FLAGS	:= -D__KERNEL__ -DMODULE  $(EXTRA_FLAGS)
 OPT_FLAGS	:= -O2 -fomit-frame-pointer 
   </programlisting><para>    Configure, build and install the GFS modules and utilities for UML.
   </para><screen format="linespecific">host$ cd opengfs-uml
host$ ./autogen.sh --with-linux_srcdir=<emphasis>UML_kernel_source_tree</emphasis>
host$ make
host$ su
host# make install DESTDIR=/mnt
   </screen></sect2><sect2 id="initrdbuild"><title>Building the Ramdisk</title><para>    Change root into the loopback mounted root image, and use the 
    <parameter class="command" moreinfo="none">--uml</parameter> argument to 
    <command moreinfo="none">cluster_mkinitrd</command> to build a ramdisk.
   </para><screen format="linespecific">host# /usr/sbin/chroot /mnt
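host# cluster_mkinitrd --uml initrd-ssi.img 2.4.16-21um
host# <keycap moreinfo="none">Ctrl-D</keycap>
   </screen><para>    The <keycap moreinfo="none">Ctrl-D</keycap> leaves the chroot environment.
    Move the new ramdisk out of the root image, and assign ownership
    to the appropriate user. Then unmount the root image and release the
    pool and loopback devices.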
   </para><screen format="linespecific">host# mv /mnt/initrd-ssi.img ~<emphasis>username</emphasis>
host# chown <emphasis>username</emphasis> ~<emphasis>username</emphasis>/initrd-ssi.img
host# umount /mnt
host# passemble -r all
host# losetup -d /dev/loop1
host# losetup -d /dev/loop2
host# <keycap moreinfo="none">Ctrl-D</keycap>
host$ cd ..
   </screen></sect2><sect2 id="bootingkern"><title>Booting the Cluster</title><para>    To make <command moreinfo="none">ssi-start</command> use the new kernel and ramdisk images,
    set
    <parameter class="option" moreinfo="none">KERNEL</parameter> and
    <parameter class="option" moreinfo="none">INITRD</parameter> in 
    <filename moreinfo="none">~/.ssiuml/ssiuml.conf</filename> to their pathnames.
    An example for <parameter class="option" moreinfo="none">KERNEL</parameter> 
    would be <filename moreinfo="none">~/linux/linux</filename>.
    An example for <parameter class="option" moreinfo="none">INITRD</parameter>
    would be <filename moreinfo="none">~/initrd-ssi.img</filename>.
   </para><para>    Stop the currently running cluster, then start it again.
   </para><screen format="linespecific">host$ ssi-stop
host$ ssi-start
   </screen><para>    You should see a three-node cluster booting with your new kernel.
    Feel free to take it through the exercises in <xref linkend="playaround"></xref>
    to make sure it's working correctly.
   </para></sect2></sect1><sect1 id="buildroot"><title>Building a Root Image</title><para>   Building your own root image is necessary if you want to use a 
   distribution other than Red Hat 7.2. Otherwise, feel free to skip 
   this section.
  </para><para>   These instructions describe how to build a Red Hat 7.2 image.
   At the end of this section is a brief discussion of how other
   distributions might differ. Building a root image for another
   distribution is left as an exercise for the reader.
  </para><sect2 id="baseroot"><title>Base Root Image</title><para>    Download the <ulink url="http://prdownloads.sf.net/user-mode-linux/root_fs.rh72.pristine.bz2">Red Hat 7.2 root image</ulink> from the User-Mode Linux (UML) project.
    As with the root image you downloaded in <xref linkend="getroot"></xref>,
    it is over 150MB.
   </para><para>    Extract the image.
   </para><screen format="linespecific">host$ bunzip2 -c root_fs.rh72.pristine.bz2 &gt;root_fs.ext2
   </screen><para>    Loopback mount the image.
   </para><screen format="linespecific">host$ su
host# mkdir /mnt.ext2
host# mount root_fs.ext2 /mnt.ext2 -o loop,ro
   </screen></sect2><sect2 id="gfsroot"><title>GFS Root Image</title><para>    Make a blank GFS root image. You also need to create an 
    accompanying lock table image. Be sure you've added support
    for GFS to your host system by following the instructions in
    <xref linkend="gfshost"></xref>.
   </para><screen format="linespecific">host# dd of=root_cidev bs=1024 seek=4096 count=0
host# dd of=root_fs bs=1024 seek=2097152 count=0
host# chmod a+w root_cidev root_fs
host# losetup /dev/loop1 root_cidev
host# losetup /dev/loop2 root_fs
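   </screen><para>    Because <command moreinfo="none">dd</command> is given
    <parameter moreinfo="none">count=0</parameter>, it writes no data. It only extends
    each file to the offset <parameter moreinfo="none">seek</parameter> times
    <parameter moreinfo="none">bs</parameter>, which makes
    <filename moreinfo="none">root_cidev</filename> 4096 x 1024 bytes (4 MB) and
    <filename moreinfo="none">root_fs</filename> 2097152 x 1024 bytes (2 GB). On a host
    file system that supports sparse files, the images occupy almost no
    disk space until data is written to them. As an optional check, which
    is not part of the original procedure, compare the allocated blocks in
    the first column of the following listing with the byte count in the
    size column.
   </para><screen format="linespecific">host# ls -ls root_cidev root_fs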
   </screen><para>    Enter the following pool information into a file 
    named <filename moreinfo="none">pool0cidev.cf</filename>.
   </para><programlisting format="linespecific">poolname pool0cidev
subpools 1
subpool 0 0 1 gfs_data
pooldevice 0 0 /dev/loop1 0
   </programlisting><para>    Enter the following pool information into a file 
    named <filename moreinfo="none">pool0.cf</filename>.
   </para><programlisting format="linespecific">poolname pool0
subpools 1
subpool 0 0 1 gfs_data
pooldevice 0 0 /dev/loop2 0
   </programlisting><para>    Write the pool information to the loopback devices.
   </para><screen format="linespecific">host# ptool pool0cidev.cf
host# ptool pool0.cf
   </screen><para>    Create the pool devices.
   </para><screen format="linespecific">host# passemble
   </screen><para>    Enter the following lock table into a file named 
    <filename moreinfo="none">gfscf.cf</filename>.
   </para><programlisting format="linespecific">datadev:	/dev/pool/pool0
cidev:		/dev/pool/pool0cidev
lockdev:	192.168.50.101:15697
cbport:		3001
timeout:	30
STOMITH: NUN
name:none
node: 192.168.50.1	1	SM: none
node: 192.168.50.2	2	SM: none
node: 192.168.50.3	3	SM: none
node: 192.168.50.4	4	SM: none
node: 192.168.50.5	5	SM: none
node: 192.168.50.6	6	SM: none
node: 192.168.50.7	7	SM: none
node: 192.168.50.8	8	SM: none
node: 192.168.50.9	9	SM: none
node: 192.168.50.10	10	SM: none
node: 192.168.50.11	11	SM: none
node: 192.168.50.12	12	SM: none
node: 192.168.50.13	13	SM: none
node: 192.168.50.14	14	SM: none
node: 192.168.50.15	15	SM: none
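   </programlisting><para>    The fifteen node lines follow a simple pattern, so they can be
    generated rather than typed. The function below is only a sketch, and
    its name <command moreinfo="none">gfs_node_lines</command> is made up for this
    example. It assumes the tab-separated layout and the 192.168.50.N
    addresses shown above. Append its output to the seven header lines of
    <filename moreinfo="none">gfscf.cf</filename>.
   </para><programlisting format="linespecific"># gfs_node_lines COUNT
# Print one node line per cluster node, numbered 1 through COUNT.
gfs_node_lines() {
    i=1
    while [ "$i" -le "$1" ]; do
        printf 'node: 192.168.50.%d\t%d\tSM: none\n' "$i" "$i"
        i=$((i + 1))
    done
}

# Print the fifteen lines used in this HOWTO.
gfs_node_lines 15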
   </programlisting><para>    Write the lock table to the cidev pool device.
   </para><screen format="linespecific">host# gfsconf -c gfscf.cf
   </screen><para>    Format the root disk image.
   </para><screen format="linespecific">host# mkfs_gfs -p memexp -t /dev/pool/pool0cidev -j 15 -J 32 -i /dev/pool/pool0
   </screen><para>    Mount the root image.
   </para><screen format="linespecific">host# mount -t gfs -o hostdata=192.168.50.1 /dev/pool/pool0 /mnt
   </screen><para>    Copy the ext2 root to the GFS image.
   </para><screen format="linespecific">host# cp -a /mnt.ext2/. /mnt
   </screen><para>    Clean up.
   </para><screen format="linespecific">host# umount /mnt.ext2
host# rmdir /mnt.ext2
host# <keycap moreinfo="none">Ctrl-D</keycap>
host$ rm root_fs.ext2
   </screen></sect2><sect2 id="getct"><title>Getting Cluster Tools Source</title><para>    Cluster Tools source code is available as official release tarballs and 
    through CVS.  The CVS repository contains the latest, bleeding-edge code. 
    It can be less stable than the official release, but it has features and 
    bugfixes that the release does not have.
   </para><sect3 id="getctrelease"><title>Official Release</title><para>     The latest release can be found at the top of 
     the Cluster-Tools section of this <ulink url="http://sourceforge.net/project/showfiles.php?group_id=32543">release list</ulink>. At the time of this writing, the latest
     release is 0.6.5.
    </para><para>     Download the latest release. Extract it.
    </para><screen format="linespecific">host$ tar jxvf ~/cluster-tools-0.6.5.tar.bz2
    </screen></sect3><sect3 id="getctcvs"><title>CVS Checkout</title><para>     Follow these <ulink url="http://sourceforge.net/cvs/?group_id=32543">instructions</ulink> to do a CVS checkout of the latest Cluster Tools 
     code.  The modulename is <emphasis>cluster-tools</emphasis>.
    </para><para>     To do a developer checkout, you must be a CI developer.
     If you are interested in becoming a developer, read 
     <xref linkend="debugging"></xref> and <xref linkend="newfeatures"></xref>.
    </para></sect3></sect2><sect2 id="buildtools"><title>Building and Installing Cluster Tools</title><screen format="linespecific">host$ su
host# cd cluster-tools
host# make install_ssi_redhat UML_ROOT=/mnt
   </screen></sect2><sect2 id="rootmodules"><title>Installing Kernel Modules</title><para>    If you built a kernel, as described in <xref linkend="buildkernel"></xref>,
    then follow the instructions in <xref linkend="kerninst"></xref> and
    <xref linkend="gfsuml"></xref> to install kernel and GFS modules onto your
    new root.
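   </para><para>    Otherwise, mount the old root image and copy the modules directory from 
    <filename moreinfo="none">/mnt/lib/modules</filename> to a temporary location on the host.
    Then unmount the old image, mount the new root image, and copy the
    modules into its <filename moreinfo="none">/lib/modules</filename> directory.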
   </para></sect2><sect2 id="rootconfig"><title>Configuring the Root</title><para>    Remake the ubd device nodes in the root image. At some point, the UML team changed the device
    numbering scheme. The old scheme used major and minor numbers
    <parameter moreinfo="none">98,1</parameter> for <filename moreinfo="none">dev/ubd/1</filename>,
    <parameter moreinfo="none">98,2</parameter> for <filename moreinfo="none">dev/ubd/2</filename>, etc.
    The current scheme uses
    <parameter moreinfo="none">98,16</parameter> for <filename moreinfo="none">dev/ubd/1</filename>,
    <parameter moreinfo="none">98,32</parameter> for <filename moreinfo="none">dev/ubd/2</filename>, etc.
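   </para><para>    In other words, the minor number for
    <filename moreinfo="none">dev/ubd/N</filename> is now N times 16. The function
    below is a sketch of how the nodes could be remade, not a tool from the
    SSI or UML distributions. It prints the commands rather than running
    them, so that you can review them before anything is changed. The
    function name <command moreinfo="none">ubd_mknod_cmds</command>, the file name
    <filename moreinfo="none">ubd_mknod.sh</filename>, and the choice of eight devices
    are assumptions made for this example.
   </para><programlisting format="linespecific"># ubd_mknod_cmds DIR COUNT
# Print commands that recreate DIR/0 through DIR/(COUNT-1) as block
# devices with major 98 and a minor of 16 times the device number.
ubd_mknod_cmds() {
    n=0
    while [ "$n" -lt "$2" ]; do
        echo "rm -f $1/$n"
        echo "mknod $1/$n b 98 $((n * 16))"
        n=$((n + 1))
    done
}
   </programlisting><para>    With the new root image mounted on <filename moreinfo="none">/mnt</filename>,
    and assuming the directory <filename moreinfo="none">/mnt/dev/ubd</filename>
    exists in it, review the output and then pipe it to a root shell.
   </para><screen format="linespecific">host# . ./ubd_mknod.sh
host# ubd_mknod_cmds /mnt/dev/ubd 8
host# ubd_mknod_cmds /mnt/dev/ubd 8 | sh
   </screen><para>    Afterwards, <command moreinfo="none">ls -l /mnt/dev/ubd</command> should show
    minor numbers 0, 16, 32 and so on.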
   </para><para>    Comment and uncomment the appropriate lines in 
    <filename moreinfo="none">/mnt/etc/inittab.ssi</filename>. Search for the
    phrase 'For UML' to see which lines to change. Basically,
    you should disable the DHCP daemon, and change the getty
    to use <filename moreinfo="none">tty0</filename> rather than <filename moreinfo="none">tty1</filename>.
   </para><para>    You may want to strip down the operating system so that it boots more quickly.
    For the prepackaged root image, I removed the following files.
   </para><programlisting format="linespecific">/etc/rc3.d/S25netfs
/etc/rc3.d/S50snmpd
/etc/rc3.d/S55named
/etc/rc3.d/S55sshd
/etc/rc3.d/S56xinetd
/etc/rc3.d/S80sendmail
/etc/rc3.d/S85gpm
/etc/rc3.d/S85httpd
/etc/rc3.d/S90crond
/etc/rc3.d/S90squid
/etc/rc3.d/S90xfs
/etc/rc3.d/S91smb
/etc/rc3.d/S95innd
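   </programlisting><para>    The paths above are relative to the root image, not to the host. If
    you work from the host with the image mounted on
    <filename moreinfo="none">/mnt</filename>, prefix each path with the mount point so
    that the host's own startup files are left alone. The function below is
    a sketch of one way to do that. Its name
    <command moreinfo="none">strip_rc3</command> is made up for this example, and it
    refuses to run unless it is given an explicit root directory.
   </para><programlisting format="linespecific"># strip_rc3 ROOT
# Remove the startup files listed above from ROOT/etc/rc3.d.
strip_rc3() {
    # Refuse to run without an explicit root, to protect the host's /etc.
    [ -n "$1" ] || return 1
    for s in S25netfs S50snmpd S55named S55sshd S56xinetd S80sendmail \
             S85gpm S85httpd S90crond S90squid S90xfs S91smb S95innd
    do
        rm -f "$1/etc/rc3.d/$s"
    done
}

# Example, with the new root image mounted on /mnt:
#   strip_rc3 /mnt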
   </programlisting><para>    You might also want to copy <command moreinfo="none">dbdemo</command> and its associated
    <filename moreinfo="none">alphabet</filename> file into <filename moreinfo="none">/root/dbdemo</filename>.
    This lets you run the demo described in <xref linkend="procmove"></xref>.
   </para></sect2><sect2 id="buildumount"><title>Unmounting the Root Image</title><screen format="linespecific">host# umount /mnt
host# passemble -r all
host# losetup -d /dev/loop1
host# losetup -d /dev/loop2
   </screen></sect2><sect2 id="otherdists"><title>Distributions Other Than Red Hat</title><para>    Cluster Tools has make rules for Caldera and 
    Debian, in addition to Red Hat.
    Respectively, the rules are 
    <parameter class="command" moreinfo="none">install_ssi_caldera</parameter> and 
    <parameter class="command" moreinfo="none">install_ssi_debian</parameter>.
   </para><para>    The main difference between the distributions is the 
    <filename moreinfo="none">/etc/inittab.ssi</filename> installed. It is the
    inittab used by the clusterized <command moreinfo="none">init.ssi</command>
    program. It is based on the distribution's 
    <filename moreinfo="none">/etc/inittab</filename>, but has some cluster-specific
    enhancements that are recognized by <command moreinfo="none">init.ssi</command>.
   </para><para>    There is also some logic in the <filename moreinfo="none">/etc/rc.d/rc.nodeup</filename>
    script to detect which distribution it's on. This script is run whenever
    a node joins the cluster, and it needs to do different things for 
    different distributions.
   </para><para>    Finally, there are some modifications to the networking scripts to
    prevent them from tromping on the cluster interconnect configuration.
    They're a short-term hack, and they've only been implemented for
    Red Hat so far. The modified files are 
    <filename moreinfo="none">/etc/sysconfig/network-scripts/ifcfg-eth0</filename> and
    <filename moreinfo="none">/etc/sysconfig/network-scripts/network-functions</filename>.
   </para></sect2></sect1><sect1 id="physical"><title>Moving to a Hardware-Based Cluster</title><para>   If you plan to use SSI clustering in a production system, you probably 
   want to move to a hardware-based cluster. That way you can take advantage
   of the high availability and scalability that a hardware-based SSI cluster
   can offer.
  </para><para>   Hardware-based SSI clusters have significantly higher availability. If
   a UML host kernel panics, or the host machine has a hardware failure,
   its UML-based SSI cluster goes down. In a hardware-based cluster, on the
   other hand, if one of the SSI kernels panics, or one of the nodes has a
   hardware failure, the cluster continues to run. Centralized kernel
   services can fail over to a new node, and critical user-mode programs
   can be restarted by the application monitoring and restart daemon.
  </para><para>   Hardware-based SSI clusters also have significantly higher scalability.
   Each node has one or more CPUs that truly work in parallel, whereas
   a UML-based cluster merely simulates having multiple nodes by time-sharing
   on the host machine's CPUs. Adding nodes to a hardware-based cluster
   increases the volume of work it can handle, but adding nodes to a UML-based
   cluster bogs it down with more processes to run on the same number of CPUs.
  </para><sect2 id="physreqs"><title>Requirements</title><para>    You can build hardware-based SSI clusters with x86 or Alpha machines.
    More architectures, such as IA64, may be added in the future. Note that
    an SSI cluster must be homogeneous. You cannot mix architectures
    in the same cluster.
   </para><para>    The cluster interconnect must support TCP/IP networking. 100 Mbps 
    ethernet is acceptable. For security reasons, it should be a
    private network. Each node should have a second network interface for
    external traffic.
   </para><para>    Right now, the most expensive requirement of an SSI cluster is the 
    shared drive, required for the shared GFS root. This will no longer
    be a requirement when CFS, which is described below, is available.
    The typical configuration for
    the shared drive is a hardware RAID disk cabinet attached to all
    nodes with a Fibre Channel SAN. For a two-node cluster, it is also
    possible to use shared SCSI, but it is not directly supported by
    the current cluster management tools.
   </para><para>    The GFS shared root also requires one Linux machine outside of the
    cluster to be the lock server. It need not be the same architecture
    as the nodes in the cluster. It just has to run 
    <command moreinfo="none">memexpd</command>, a user-mode daemon. Eventually, GFS will
    work with a Distributed Lock Manager (DLM). This would
    eliminate the need for the external lock server, which is a single 
    point of failure. It could also free up the machine to be another
    node in your cluster.
   </para><para>    In the near future, the Cluster File System (CFS) will be an option
    for the shared root. It is a stateful NFS that uses a token mechanism
    to provide tight coherency guarantees. With CFS, the shared root
    can be stored on the internal disk of one of the nodes. The on-disk 
    format can be any journalling file system, such as ext3 or ReiserFS.
   </para><para>    The initial version of CFS will not provide high availability. 
    Future versions of CFS will allow the root to be mirrored across
    the internal disks of two nodes. A technology such as the Distributed
    Replicated Block Device (DRBD) would be used for this. This is a low-cost
    solution for the shared root, although it has a performance penalty.
   </para><para>    Future versions will also allow the root to be stored on a 
    disk shared by two or
    more nodes, but not necessarily shared by all nodes. 
    If the CFS server node crashes, its responsibilities
    would fail over to another node attached to the shared disk.
   </para></sect2><sect2 id="physresources"><title>Resources</title><para>    Start with the <ulink url="http://ssic-linux.sourceforge.net/install.shtml">installation instructions for SSI</ulink>.
   </para><para>    If you'd like to install SSI from CVS code, follow <ulink url="http://sourceforge.net/cvs/?group_id=32541">these instructions</ulink> to check out modulename 
    <emphasis>ssic-linux</emphasis>, and <ulink url="http://sourceforge.net/cvs/?group_id=32543">these instructions</ulink> to check out modulenames
    <emphasis>ci-linux</emphasis> and <emphasis>cluster-tools</emphasis>.

    Read the <filename moreinfo="none">INSTALL</filename> and 
    <filename moreinfo="none">INSTALL.cvs</filename> files in both the
    <filename moreinfo="none">ci-linux</filename> and <filename moreinfo="none">ssic-linux</filename>
    sandboxes. Also look at the <filename moreinfo="none">README</filename> file in the
    <filename moreinfo="none">cluster-tools</filename> sandbox.
   </para><para>    For more information, read <xref linkend="moreinfo"></xref>.
   </para></sect2></sect1><sect1 id="moreinfo"><title>Further Information</title><para>   Here are some links to information on SSI clusters, CI clusters, GFS,
   UML, and other clustering projects.
  </para><sect2 id="ssilinks"><title>SSI Clusters</title><para>    Start with the <ulink url="http://ssic-linux.sf.net/">SSI project 
    homepage</ulink>. In particular, the <ulink url="http://ssic-linux.sourceforge.net/docs.shtml">documentation</ulink> may be of interest.

    The SourceForge <ulink url="http://sourceforge.net/projects/ssic-linux/">project summary page</ulink> also has some useful information.
   </para><para>    If you have a question or concern, post it to the
    <email>ssic-linux-devel@lists.sf.net</email> mailing list.
    If you'd like to subscribe, you can do so through this <ulink url="http://lists.sourceforge.net/lists/listinfo/ssic-linux-devel">web form</ulink>.
   </para><para>    If you are working from a CVS sandbox, you may also want to sign up
    for the ssic-linux-checkins mailing list to receive
    checkin notices. You can do that through this <ulink url="http://lists.sourceforge.net/lists/listinfo/ssic-linux-checkins">web form</ulink>.
   </para></sect2><sect2 id="cilinks"><title>CI Clusters</title><para>    Start with the <ulink url="http://ci-linux.sf.net/">CI project 
    homepage</ulink>. In particular, the <ulink url="http://ci-linux.sourceforge.net/docs.shtml">documentation</ulink> may be of interest.

    The SourceForge <ulink url="http://sourceforge.net/projects/ci-linux/">project summary page</ulink> also has some useful information.
   </para><para>    If you have a question or concern, post it to the
    <email>ci-linux-devel@lists.sf.net</email> mailing list.
    If you'd like to subscribe, you can do so through this <ulink url="http://lists.sourceforge.net/lists/listinfo/ci-linux-devel">web form</ulink>.
   </para><para>    If you are working from a CVS sandbox, you may also want to sign up
    for the ci-linux-checkins mailing list to receive
    checkin notices. You can do that through this <ulink url="http://lists.sourceforge.net/lists/listinfo/ci-linux-checkins">web form</ulink>.
   </para></sect2><sect2 id="gfslinks"><title>GFS</title><para>    SSI clustering currently depends on the Global File System (GFS) to
    provide a single root. The open-source version of GFS is maintained
    by the <ulink url="http://www.opengfs.org/">OpenGFS project</ulink>.
    They also have a SourceForge <ulink url="http://sourceforge.net/projects/opengfs/">project summary page</ulink>.
   </para><para>    Right now, GFS requires either a DMEP-equipped shared drive or a lock
    server outside the cluster. The lock server is the only software solution
    for coordinating disk access, and it is not truly HA. There are plans to 
    make OpenGFS support IBM's <ulink url="http://oss.software.ibm.com/dlm/">Distributed Lock Manager</ulink> (DLM), which would distribute the
    lock server's responsibilities across all the nodes in the cluster. 
    If any node fails, the locks it managed would fail over to other nodes.
    This would be a true HA software solution for coordinating disk access.
   </para><para>    If you have a question or concern, post it to the
    <email>opengfs-users@lists.sf.net</email> mailing list.
    If you'd like to subscribe, you can do so through this <ulink url="http://lists.sourceforge.net/lists/listinfo/opengfs-users">web form</ulink>.
   </para></sect2><sect2 id="umllinks"><title>UML</title><para>    The User-Mode Linux (UML) project has a <ulink url="http://user-mode-linux.sf.net/">homepage</ulink> and a SourceForge <ulink url="http://sourceforge.net/projects/user-mode-linux/">project summary page</ulink>.
   </para><para>    If you have a question or concern, post it to the
    <email>user-mode-linux-user@lists.sf.net</email> mailing list.
    If you'd like to subscribe, you can do so through this <ulink url="http://lists.sourceforge.net/lists/listinfo/user-mode-linux-user">web form</ulink>.
   </para></sect2><sect2 id="clusterlinks"><title>Other Clustering Projects</title><para>    Other clustering projects include
    <ulink url="http://openmosix.sourceforge.net/">Mosix</ulink>,
    <ulink url="http://www.LinuxVirtualServer.org/">Linux Virtual Server</ulink>,
    <ulink url="http://www.beowulf.org/">Beowulf</ulink>,
    <ulink url="http://linux-ha.org/">HA Linux</ulink> and
    <ulink url="http://oss.sgi.com/projects/failsafe/">FailSafe</ulink>.
   </para></sect2></sect1><sect1 id="contributing"><title>Contributing</title><para>   If you'd like to contribute to the SSI project, you can do so by
   testing it, writing documentation, fixing bugs, or working on new
   features.
  </para><sect2 id="testing"><title>Testing</title><para>     While using the SSI clustering software, you may run into bugs or
     features that don't work as well as they should. If so, browse the
     <ulink url="http://sf.net/tracker/?atid=405834&amp;group_id=32541&amp;func=browse">SSI</ulink> and <ulink url="http://sf.net/tracker/?atid=405830&amp;group_id=32543&amp;func=browse">CI</ulink> bug databases to see if someone has seen the same problem.
     If not, either <ulink url="http://sf.net/tracker/?atid=405834&amp;group_id=32541&amp;func=add">post a bug</ulink> yourself or post a message to
     <email>ssic-linux-devel@lists.sf.net</email> to discuss the issue further.
    </para><para>     It is important to be as specific as you can in your bug report or
     posting. Simply saying that the SSI kernel doesn't boot or that it
     panics is not enough information to diagnose your problem.
    </para></sect2><sect2 id="documentation"><title>Documentation</title><para>     There is already some documentation for <ulink url="http://ssic-linux.sourceforge.net/docs.shtml">SSI</ulink> and <ulink url="http://ci-linux.sourceforge.net/docs.shtml">CI</ulink>, but more would certainly be welcome. If you'd like
     to write instructions for users or internals documentation for developers,
     post a message to <email>ssic-linux-devel@lists.sf.net</email> to
     express your interest.
    </para></sect2><sect2 id="debugging"><title>Debugging</title><para>     Debugging is a great way to get your feet wet as a developer.
     Browse the <ulink url="http://sf.net/tracker/?atid=405834&amp;group_id=32541&amp;func=browse">SSI</ulink> and <ulink url="http://sf.net/tracker/?atid=405830&amp;group_id=32543&amp;func=browse">CI</ulink> bug databases to see what problems need to be fixed. If
     a bug looks interesting, but is assigned to a developer, contact them 
     to see if they are actually working on it.
    </para><para>     After fixing the problem, send your patch to 
     <email>ssic-linux-devel@lists.sf.net</email> or
     <email>ci-linux-devel@lists.sf.net</email>. If it looks good, a developer
     will check it into the repository. After submitting a few patches, you'll
     probably be invited to become a developer yourself. Then you'll be able
     to check in your own work.
    </para></sect2><sect2 id="newfeatures"><title>Adding New Features</title><para>     After fixing a bug or two, you may be inclined to work on enhancing or
     adding an SSI feature. You can look over the <ulink url="http://ssic-linux.sourceforge.net/index.shtml#projects">SSI</ulink> and <ulink url="http://ci-linux.sourceforge.net/index.shtml#projects">CI</ulink> project lists for ideas, or you can suggest something
     of your own. Before you start working on a feature,
     discuss it first on <email>ssic-linux-devel@lists.sf.net</email> or
     <email>ci-linux-devel@lists.sf.net</email>.
    </para></sect2></sect1><sect1 id="remarks"><title>Concluding Remarks</title><para>   I hope you find SSI clustering technology useful for applications
   that demand availability, scalability, and manageability
   at the same time.
  </para><para>   If you have any questions or comments, don't hesitate to post them
   to <email>ssic-linux-devel@lists.sf.net</email>.
  </para></sect1></article>

