A little note on SUN Cluster topology…
Gathered By: John Kazerooni
How do clusters provide HA?
By providing a redundant server or servers, a cluster recovers critical application services and data automatically after any single hardware or software failure.
Hardware Environment:
· Cluster nodes running Solaris 8 or later;
· Separate boot disks on each node. It is good practice to mirror your boot disk;
· At least one public network interface per system per subnet (two is ideal);
· A redundant private network or cluster transport interface (IPMP-IP Multi-pathing software);
· Dual-hosted storage area (mirrored disk storage);
· One terminal concentrator; and
· An administrator console.
Glossary:
“tc” Terminal Concentrator.
“ccr” Cluster Configuration Repository - Uses the private net when updates are needed, in a 5 sec interval. Check the /etc/cluster/ccr/infrastructure database file.
“cmm” Cluster Membership Monitor - Monitors the private net in a 5 sec interval. Checks in with the ff (failfast) driver of opposing nodes.
“sma” Switch Management Agent - Monitors the private net in a 1 sec interval.
“pnmd” Public Network Monitor Daemon - Monitors the public net in a 5 sec interval.
“IPMP” IP Multi-pathing software - One failover group per node and per subnet.
“rgmd” Resource Group Management Daemon - Manages resources within resource groups, for example restarting or shutting down databases and applications.
“gif” Global Interface File - For scalable resources within scalable resource groups. Used to distribute clients to instances within the cluster.
“quorum” Predefined disk that handles SCSI reservations from nodes to resolve a split-brain (split cluster) situation.
Quorum Rules:
Quorum device votes = number of nodes attached to the device - 1.
Each node has one quorum vote.
Votes needed to win: a majority (50% + 1) of all possible votes.
“ws” Administration Workstation.
“amnesia” The condition in which a cluster starts up without knowing its latest state, for example when a node boots with stale configuration data.
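The quorum rules in the glossary can be worked through with a little arithmetic. The following is only a sketch: the function name quorum_summary is made up for illustration, and it assumes a single quorum device attached to every node and reads "50% + 1" as an integer majority.

```shell
#!/bin/sh
# quorum_summary NODES (hypothetical helper, not a Sun Cluster command)
# Assumes one quorum device attached to all NODES nodes.
quorum_summary() {
    nodes=$1
    device_votes=$((nodes - 1))        # device votes = attached nodes - 1
    total=$((nodes + device_votes))    # each node contributes one vote
    needed=$((total / 2 + 1))          # integer majority of all possible votes
    echo "nodes=$nodes device_votes=$device_votes total=$total needed=$needed"
}

quorum_summary 2
quorum_summary 3
```

Under these assumptions a two-node cluster has three possible votes, so one node plus the quorum device is enough to keep the cluster running.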
What is a Global file system?
It is a file system that is simultaneously available on all nodes, regardless of their physical location.
It is normally defined in the /etc/vfstab file, or it can be mounted from the command line as follows:
# mount -o global,logging /dev/vx/dsk/ora-dg/system01_vol /global/system
Important Notes to Remember
You might use Network Information Service (NIS) or Domain Name Service (DNS) to resolve host names, but it is also very important to resolve the names locally (/etc/hosts) on the cluster hosts and the administrator console. This helps if the naming service fails. You cannot start the cconsole program unless it can first resolve the host names in the /etc/clusters file.
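A quick way to check local name resolution is to compare the node names in the clusters file with the local hosts file. The sketch below assumes each /etc/clusters line has the form "clustername node1 node2 ..."; the function name missing_hosts is made up for illustration.

```shell
#!/bin/sh
# missing_hosts CLUSTERS_FILE HOSTS_FILE (hypothetical helper)
# Prints every node name listed in CLUSTERS_FILE that has no entry in HOSTS_FILE.
# Assumed clusters file format: "clustername node1 node2 ..."
missing_hosts() {
    clusters_file=$1
    hosts_file=$2
    # Skip blank and comment lines; field 1 is the cluster name, the rest are nodes.
    awk 'NF == 0 || $1 ~ /^#/ { next } { for (i = 2; i <= NF; i++) print $i }' "$clusters_file" |
    while read -r node; do
        # A hosts line is "address name [alias ...]"; match the node as a whole field.
        if ! awk -v n="$node" '$1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
                               END { exit !found }' "$hosts_file"; then
            echo "$node"
        fi
    done
}

# Typical use on the administrator console:
#   missing_hosts /etc/clusters /etc/hosts
```

No output means every node named in the clusters file can be resolved from the local hosts file.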
How to install the Cluster Console Software?
· Log in as the root user on the administrative console.
· Go to the SUN Cluster packages directory.
· Use the “pkgadd” command to install the Cluster console software package.
# pkgadd -d . SUNWccon
· Verify that the following path and variables are set in the /.profile file:
PATH=$PATH:/opt/SUNWcluster/bin
MANPATH=$MANPATH:/opt/SUNWcluster/man
EDITOR=/usr/bin/vi
export PATH MANPATH EDITOR
· Execute the user profile to apply the changes:
# . /.profile
· Verify that the /etc/clusters file has a single line entry for each cluster.
· Verify that the /etc/serialports file has an entry for each cluster host describing the connection path.
· Turn on the TC power and all of the cluster hosts.
· Start the cconsole tool on the administrative console.
# cconsole cluster-name &
· Place the cursor in the cconsole command window and start typing. You should see a response in all of the cluster host windows.
· Use the ccp control panel by starting the ccp tool (# ccp cluster-name &). It is very useful if you must use crlogin and ctelnet in addition to cconsole.
How to identify Cluster Transport Interfaces (Private Network)?
· Check for network interfaces.
# prtconf (look for SUNW,hme instance #1 and SUNW,hme instance #2)
· Check which interfaces are up (in other words, plumbed).
# ifconfig -a
· Bring up the unplumbed interfaces for testing. You can use an unused subnet address just to test connectivity across the private net.
# ifconfig hme1 plumb
# ifconfig hme1 190.160.1.1 up
· Check that the nodes can ping each other across each private network.
# ping 190.160.1.1
· Check that the interfaces you are looking at are not actually on the public net.
# snoop -d hme1 (hope to see nothing…)
· After you have identified the new network interfaces, bring them down again. It is very important to remember that the cluster installation fails if your transport network interfaces are still up from testing.
# ifconfig hme1 down unplumb
# ifconfig hme2 down unplumb
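Because leftover test interfaces make the installation fail, it can help to generate the cleanup commands for every interface you touched and review them before running anything. This is a minimal sketch; the function name unplumb_commands is made up for illustration, and it only prints commands rather than executing them.

```shell
#!/bin/sh
# unplumb_commands IFNAME... (hypothetical helper)
# Prints, but does not run, the cleanup command for each test interface.
unplumb_commands() {
    for ifname in "$@"; do
        echo "ifconfig $ifname down unplumb"
    done
}

unplumb_commands hme1 hme2
```

After reviewing the printed lines, they could be run as root on the node, for example by piping the output to sh.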
How to install the SUN Cluster Server Software?
· Check that the boot disks have a 100-Mbyte /globaldevices partition and a small partition for use by Solstice DiskSuite software replicas.
# mount (To see the logical path to the boot disk on each node, such as /dev/dsk/c0t0d0.)
· Check that each boot disk meets the SUN Cluster requirements.
# /usr/sbin/prtvtoc /dev/dsk/c0t0d0s2 (Slice 2)
· Check that the /.profile file on each cluster node contains the following paths and variables.
PATH=$PATH:/usr/cluster/bin:/etc/vx/bin
MANPATH=$MANPATH:/usr/cluster/man:/usr/share/man:/opt/VRTS/man
TERM=dtterm
EDITOR=vi
export PATH MANPATH TERM EDITOR
· On all nodes, create a .rhosts file:
Host-name1 +
Console-name +
· On all nodes, edit the /etc/default/login file and comment out the CONSOLE=/dev/console line.
· Use the cconsole window to log in to node1 as the root user.
· Start the node1 cluster installation.
# scinstall
· As your installation proceeds, do the following:
o Select option 1 to establish a new cluster.
o Allow the framework packages to be added.
o Furnish your assigned cluster name.
o Allow sccheck to run.
o Furnish the name of the second node, which will be added later.
o Check the list of node names.
o Reply no to using DES authentication.
o Use the default cluster private network and netmask values.
o Configure the cluster transport.
o Accept the default global device file system.
o Reply no to the automatic reboot question.
· Add any applicable SUN Cluster patches.
· Reboot your node.
· Now log in to the second node as the root user and, as you perform the Cluster installation, do the following:
o Select option 2.
o Give the name of a sponsoring node.
o Give the name of the cluster that you want to join. (Type “scconf -p | more” if you have forgotten the name.)
o Use auto-discovery for the transport adapters.
o Select the default for the global device directory.
o Reply no to the automatic reboot question.
o Add patches if needed.
o Reboot the node.
· Define your quorum devices.
# scstat -p (Remember that only the first node has a vote. No quorum device has been assigned.)
# scdidadm -L (On node1, check the DID devices that you intend to configure as quorum disks. For example, if you had three nodes you would want two quorum disk devices.)
Note: The first few DID devices might be local disks. Make sure that the DID devices are disks in a storage array and are connected to more than one node.
# scsetup (Give the name of the DID device, that is, the global device you selected in the previous step.)
Note: Reply “yes” to the reset installmode question and type “q” to quit the scsetup utility.
· Perform the following steps on all nodes to complete the Network Time Protocol (NTP) configuration.
# vi /etc/inet/ntp.conf.cluster (Remove the lines that do not apply, peer clusternode3-priv …, and leave only entries 1 and 2 for two nodes.)
# scstat -q (You should see two quorum votes present and a quorum device for a two-node configuration.)
# scdidadm -L (Each shared, dual-ported DID device should show a logical path from each cluster node.)
# scconf -p (All node names, transport configuration, and quorum device information should be complete.)
· Test basic Cluster Operation:
Log in to each node as the root user.
# scshutdown -y -g 15 (To shut down all cluster nodes.)
Boot node1; it should come up and join the cluster. Then boot node2; it should also come up and join the cluster.
· Verify or check basic Cluster Status.
# scstat -q (To verify the current cluster membership.)
Note the quorum configuration from the previous step.
# sccheck -v (Verify that the global device vfstab entries are correct on all nodes.)
# scinstall -pv (Verify the version of the currently installed SUN Cluster.)
How to maintain a Cluster node?
· # scconf -c -q node=node2,maintstate (On node1, use this command to place node2 into a maintenance state.)
· # scstat -q | grep "Quorum votes" (The number of possible quorum votes should be reduced.)
· Make your changes.
· Boot node2; that should reset its maintenance state.
· Verify the quorum votes.
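To compare the vote counts before and after maintenance, the "Quorum votes" lines can be pulled out of the scstat -q output. This is only a sketch: the function name quorum_votes is made up, and the sample text is an assumed layout of the quorum summary, so check it against the real output on your cluster.

```shell
#!/bin/sh
# quorum_votes FIELD (hypothetical helper)
# Reads `scstat -q` style text on stdin and prints the number that follows
# "Quorum votes FIELD:" (FIELD is possible, needed, or present).
quorum_votes() {
    awk -v want="Quorum votes $1:" 'index($0, want) { print $NF; exit }'
}

# Assumed sample of the quorum summary; real output may differ.
sample='-- Quorum Summary --

  Quorum votes possible:      3
  Quorum votes needed:        2
  Quorum votes present:       3'

printf '%s\n' "$sample" | quorum_votes possible
```

On a live node the same function could be fed directly, for example: scstat -q | quorum_votes possible.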
How to boot Nodes in non-cluster mode?
· ok boot -x (Boots the node without joining the cluster.)
How to navigate the SunPlex manager?
· In your browser, go to https://nodename:3000 (If you have a problem, you might need to disable the proxy settings in your web browser or set exceptions for them.)
Some important notes on how to install and use VERITAS Volume Manager in a cluster environment…
· # /usr/cluster/bin/scvxinstall (To install Volume Manager on all cluster nodes. Make sure to encapsulate the root disk.)
# vxprint (To check that the rootdg disk group volumes are unique between nodes.)
# ls -l /dev/vx/rdsk/rootdg (To verify that the rootdg disk group minor device numbers are different on each node.)
# scvxinstall -s (On all nodes, to check that all the steps have completed successfully.)
· Create a disk group and a volume.
# vxdiskadd c#t#d# c#t#d# (Answer the questions as prompted.)
# vxdg list (Check the status of the disk group.)
# vxdisk list
# ls -l /dev/vx/dsk/ora_dg (On node1, check that the new ora_dg disk group is globally linked.)
# vxassist -g ora_dg make system_vol 500m layout=mirror (Create a volume.)
You may repeat the same steps on node2 for a different application.
How to register a disk group, create, test, manage, and remove a global file system?
· Register a disk group.
# scconf -a -D type=vxvm,name=ora_dg,nodelist=node1:node2,preferenced=true,failback=enabled
Note: Put the local node (node1) first in the node list. You could also use the scsetup command.
# scstat -D (To check the status of the disk groups.)
· Create a global file system.
# newfs /dev/vx/rdsk/ora_dg/system_vol (On node1, create a file system on the volume.)
On all nodes:
# mkdir /global/system
# vi /etc/vfstab (Add the following line.)
/dev/vx/dsk/ora_dg/system_vol /dev/vx/rdsk/ora_dg/system_vol /global/system ufs 2 yes global,logging
# mount /global/system
# mount
# ls /global/system
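Since the same vfstab entry must be added on every node, generating the line once and pasting it everywhere avoids typing mistakes. The sketch below assumes a VxVM volume holding a UFS file system; the function name global_vfstab_line is made up for illustration. The seven tab-separated fields are: device to mount, device to fsck, mount point, file system type, fsck pass, mount at boot, and mount options.

```shell
#!/bin/sh
# global_vfstab_line VOLUME MOUNT_POINT (hypothetical helper)
# VOLUME is the part after /dev/vx/dsk/, for example ora_dg/system_vol.
# Prints one tab-separated vfstab entry for a global UFS file system.
global_vfstab_line() {
    printf '/dev/vx/dsk/%s\t/dev/vx/rdsk/%s\t%s\tufs\t2\tyes\tglobal,logging\n' "$1" "$1" "$2"
}

global_vfstab_line ora_dg/system_vol /global/system
```

The printed line is meant to be reviewed and then appended to /etc/vfstab by hand on each node.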
· Test your Global File system.
# cd /global/system (On node2.)
# umount /global/system (On node1. You should get an error because the file system is busy.)
· Manage your disk device groups.
# scstat -D
# vxdg list
# vxassist -g ora_dg make users_vol 50m layout=mirror
# vxprint users_vol
# scsetup - or - scconf -c -D name=ora_dg,sync (To register the changes.)
· Remove a volume from a Disk Device Group.
# vxedit -g ora_dg -rf rm users_vol (To remove users_vol from the disk group.)
# scconf -c -D name=ora_dg,sync
How to migrate a device group from node1 to node2?
· Check the current device group configuration.
# scconf -p | grep group
· From node1, switch the ora_dg device group to node2.
# scswitch -z -D ora_dg -h node2
If you reboot node1, the ora_dg disk group will automatically migrate back to node1.
How to manage volumes using Solaris Volume Manager (Solstice DiskSuite)?
· Initializing the Solaris Volume Manager Software Local Metadbs
# format (On each node, check that the boot disk has a small unused slice available. A typical path is c0t0d0s7.)
# metadb -a -f -c 3 /dev/dsk/c0t0d0s7 (On each node, create three replicas on the unused boot disk slice.)
# metadb (On each node, check that the replicas are configured and operational.)
· Configuring the Solaris Volume Manager Disksets
# scdidadm -L (List all of the available DID devices and record the logical paths of the disks on which you want to create the oracle disksets and volumes.)
Note: Make sure the disks are dual-hosted and available to more than one cluster node.
# metaset -s orads -a -h node1 node2 (On node1, create the orads diskset and configure the nodes that are physically connected to it.)
# metaset -s orads -a -m node1 node2 (Add the diskset mediators to the newly created diskset.)
# metaset -s orads -a /dev/did/rdsk/primary /dev/did/rdsk/mirror (Add the primary and mirror disks to the orads diskset.)
# metaset -s orads (Check the status of the orads diskset.)
# medstat -s orads (Check the status of the orads diskset mediators.)
# scstat -D
· Configuring the Solaris Volume Manager Volumes on the orads diskset
# metainit -s orads d0 1 1 /dev/did/rdsk/primarys0 (Create a submirror on the primary disk in the orads diskset.)
# metainit -s orads d1 1 1 /dev/did/rdsk/mirrors0 (Create a submirror on the mirror disk in the orads diskset.)
# metainit -s orads d900 -m d0 (Create a volume, d900, and add the d0 submirror to it.)
# metattach -s orads d900 d1 (Attach the second submirror, d1, to the volume d900.)
# metainit -s orads d901 -p d900 750m (Create a 750-Mbyte partition on top of your mirror. This is the volume you will use.)
# metastat -s orads (Check the status of the new volume.)
· Creating a Global file system.
# newfs /dev/md/orads/rdsk/d901 (On node1, create a file system on d901 in the orads diskset.)
# mkdir /global/ora (On each node, create a global mount point for the new file system.)
On all nodes, add a mount entry in the /etc/vfstab file:
/dev/md/orads/dsk/d901 /dev/md/orads/rdsk/d901 /global/ora ufs 2 yes global,logging
# mount /global/ora (On node1, mount the /global/ora file system.)
# mount (Check that the file system is mounted on all nodes.)
How to configure and test IPMP (IP Multi-pathing)?
· Checking the local-mac-address? variable.
# eeprom "local-mac-address?"
It is set to true unless somebody changed it back manually. If you have to change it to true, you must reboot.
· Checking the Adapters for the IPMP Group.
# ls -l /etc/hostname.* (Check for public network adapters.)
# ifconfig -a (Check for public network adapters.)
# ifconfig ifname plumb (Make sure a private transport interface is up.)
# snoop -d ifname (Then, in another window or on another node: # ping -s pubnet_broadcast_addr)
· Entering and checking test addresses in the /etc/hosts file.
Although it is not required to have all test IP addresses on each node, it is a good idea to do so.
# vi /etc/hosts (Add all test IP addresses. For example:)
# IPMP Test Addresses
156.23.4.195 svr190-qfe1-test
156.23.4.196 svr190-qfe2-test
· Creating /etc/hostname.xxx files - the following are just examples:
# vi /etc/hostname.qfe1
svr190 group oracle up addif svr190-qfe1-test -failover backup up
# vi /etc/hostname.qfe2
svr190-qfe2-test group oracle -failover backup up
· Reboot and verify that IPMP is configured.
Reboot the node.
# ifconfig -a (Check the new IPMP configuration.)
# scstat -i (Check IPMP cluster-wide status.)
· Testing IPMP failover and failback.
# ping -s nodename (From outside of the cluster, ping a cluster node.)
If you have physical access to your cluster, unplug the Ethernet cable from the adapter that currently hosts the node's physical interface.
Or sabotage your adapter with: # ifconfig adapter_name modinsert ldterm@2
# more /var/adm/messages (Check the node messages, or watch the console.)
# scstat -i (Observe the output.)
Reconnect the cable, or use the following to repair the sabotage: # ifconfig adapter_name modremove ldterm@2
# scstat -i (Observe the output again.)
Introducing Data Services in the Cluster
· Preparing to register and configure the SUN Cluster HA for “ora” Data Service
# su - root (On node1.)
# scstat -p (Check that your cluster is active.)
# df -k (Check that the /global/ora file system is mounted.)
# vi /etc/nsswitch.conf (Check for the line “cluster files nis”; if it is missing, add it.)
# vi /etc/hosts (Make sure you have an entry for each node; if not, add them.)
-- Do the following on only one node.
# cd /global/ora
# mkdir admin
# cd admin
# mkdir SUNW.ora
# cd SUNW.ora
# vi dfstab.ora-res (Add the line: “share -F ora -o rw /global/ora/data”)
# cd /global/ora
# mkdir /global/ora/data
# chmod 777 /global/ora/data
# touch /global/ora/data/sample.ora
· Registering and configuring the SUN Cluster HA for ora data service
From any node, do the following to register the resource types for the SUN Cluster HA for ora data service.
# scrgadm -a -t SUNW.ora
# scrgadm -a -t SUNW.HAStoragePlus
# scrgadm -p
Do the following to create the failover resource group.
# scrgadm -a -g ora-rg -h node1,node2 -y Pathprefix=/global/ora/admin
# scrgadm -a -L -g ora-rg -l clustername-ora (Add the logical hostname resource to the resource group.)
Do the following to create the SUNW.HAStoragePlus resource.
# scrgadm -a -j ora-stor -g ora-rg -t SUNW.HAStoragePlus -x FilesystemMountpoints=/global/ora -x AffinityOn=True
Do the following to create the SUNW.ora resource.
# scrgadm -a -j ora-res -g ora-rg -t SUNW.ora -y Resource_dependencies=ora-stor
Do the following to enable the resources and the resource monitors, manage the resource group, and switch the resource group into the online state.
# scswitch -Z -g ora-rg
# scstat -g (Check that the data service is online.)
· Verifying access by ora clients
# ls /net/clustername-ora/global/ora/data (On the administration workstation, check that you can access your cluster files.)
Write a script named test.myora that appends a timestamp to test.myfile every second.
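The test.myora script could look something like the following. This is a minimal sketch, not the author's script: the function name stamp_loop and its optional COUNT and INTERVAL arguments are assumptions added so the loop can be bounded during a dry run, and the location of test.myfile is left to you.

```shell
#!/bin/sh
# test.myora (sketch): append a timestamp to a file at a fixed interval.
# stamp_loop FILE [COUNT] [INTERVAL]
#   COUNT    number of timestamps to write; 0 (the default) means run until interrupted
#   INTERVAL seconds to sleep between timestamps (default 1)
stamp_loop() {
    file=$1
    count=${2:-0}
    interval=${3:-1}
    i=0
    while [ "$count" -eq 0 ] || [ "$i" -lt "$count" ]; do
        date >> "$file"
        i=$((i + 1))
        sleep "$interval"
    done
}

# Example (path is an assumption): write to the shared data directory until Ctrl-C.
#   stamp_loop /net/clustername-ora/global/ora/data/test.myfile
```

While the loop runs on the administration workstation, gaps between consecutive timestamps in the file show how long clients were blocked during a switchover.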
· Observing SUN Cluster HA for ora failover behavior.
On one node, determine the name of the node currently hosting the data service.
# scswitch -z -h dest-node -g ora-rg (Transfer control of the ora service from one node to another.)
Use the mount and share commands on all nodes to verify which file systems the nodes are now mounting and exporting.
# ifconfig -a (To observe the additional IP address associated with…)
# scswitch -z -h dest-node -g ora-rg (To transfer control of the ora service back to its preferred host.)
Good Luck!