dw5304 Posted July 8, 2009

So I've been experimenting with ZFS and iSCSI already and currently have it working with Solaris and ESXi/Windows systems. I've been trying to find a way of including AVS into the mix. AVS is Sun StorageTek Availability Suite: http://www.opensolaris.org/os/project/avs/...9CE8DC479389E2A

Has anyone had any experience with this? I would like to have a live offsite backup, but I have not been able to get the remote mirror working:

sndradm -e localhost /dev/rdsk/c2d0s1 /dev/rdsk/c2d0s0 server1 /dev/rdsk/c2s0s1 /dev/rdisk/c2s0s0 ip sync

cannot open /dev/rdsk/c2d0s1

Any ideas?
dw5304 Posted July 13, 2009 (Author)

So after three days of fiddling with it I finally figured it out. I now have a ZFS + AVS iSCSI data store for live data backup. Woot, +1 for me, lol.

Quoting the write-up:

Well, I have just finished deploying a solution like this for a large web server farm and thought it would be nice to share what we did and how it worked with the larger community. When researching this solution there was quite a lot of information out there, but it was all in different contexts and none of it directly related to the rest. By the end of this how-to you will have a solution somewhat similar to the below.

To achieve this you will need:

* 2 servers for the file servers. We used a pair of IBM xSeries, each with a single quad-core 2 GHz Xeon and 16 GB of RAM.
* A dedicated gigabit switch. I prefer dumb L2 switches for this sort of application, as it lessens the chance of performance issues stemming from L3 routing and VLAN tagging.
* 2 iSCSI chassis. We used Infortrend EonStors (A12E-G2121-2, to be exact); we chose to back the file servers with these iSCSI shelves because we had two on hand. If we had not had them, it is quite likely we would have used something like an IBM DS3200.
* A spare gigabit NIC in each of the servers you want to hook into the storage.
* 2 four-port gigabit NICs for the file servers.
The first step is to install OpenSolaris on the servers – you can get it here: http://opensolaris.org/os/downloads/

Once you have it, install it on the first server and let it use its defaults. The first thing I do with a new OpenSolaris install is disable the annoying "boot into GUI" (note for Sun: make Solaris default to a console boot – this isn't Windows, guys – or at least give us an install option to change this). To disable the GUI boot, edit the GRUB config (/rpool/boot/grub/menu.lst) and change the kernel$ line to read:

kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS

Once this is done you need to point pkg at the dev repository (there are bugs in the iSCSI code that panic x64 boxes prior to b106):

pfexec pkg set-authority -P -O http://pkg.opensolaris.org/dev dev

Then update the server to the latest and greatest release:

pfexec pkg image-update

Check your menu.lst after you have done this and verify that the system is still going to console boot. Once the server is all up to date, the real work can begin. I am leaving out details like setting up the IP addressing for the NICs, as I assume you can do that; if not, leave a comment and I will expand the article to cover that too.

In our case, as we had used iSCSI units to hold the disks, we ran a pair of cables from the iSCSI unit to the server, and on the iSCSI unit we mapped 3 drives to each of its Ethernet interfaces. With that done, we set up the iSCSI initiator on the servers like so:

pfexec pkg install SUNWiscsi
pfexec iscsiadm add discovery-address 10.0.0.1 (this is interface 1 on the iSCSI box)
pfexec iscsiadm add discovery-address 10.0.1.1 (this is interface 2 on the iSCSI box)
pfexec iscsiadm modify discovery -t enable

We should now be in a position where the disks are being exported by the iSCSI unit and the OpenSolaris server will see them – verify this like so:

pfexec format

Our output looks as follows:

AVAILABLE DISK SELECTIONS:
0.
c3t0d0 /pci@0,0/pci8086,25e3@3/pci1014,9580@0/disk@0,0
1. c5t2d0 /iscsi/disk@0000iqn.2002-10.com.infortrend,0
2. c5t3d0 /iscsi/disk@0000iqn.2002-10.com.infortrend,1
3. c5t4d0 /iscsi/disk@0000iqn.2002-10.com.infortrend,2
4. c5t5d0 /iscsi/disk@0000iqn.2002-10.com.infortrend,0
5. c5t6d0 /iscsi/disk@0000iqn.2002-10.com.infortrend,1
6. c5t7d0 /iscsi/disk@0000iqn.2002-10.com.infortrend,2

You will need to create a zpool on one of the disks temporarily so you can correctly size the SNDR bitmap:

zpool create -f temp c5t7d0; zpool destroy temp
dsbitmap -r /dev/rdsk/c5t7d0s0 | tee /tmp/vol_size

Now that we have the bitmap sizing done we can go about setting up the disks. The way to do this feels a little hacky to me, but sadly Sun doesn't provide a really nice way to do it (the following code comes from http://blogs.sun.com/AVS/entry/avs_and_zfs_seamless):

# VOL_SIZE="`cat /tmp/vol_size | grep 'size: [0-9]' | awk '{print $5}'`"
# BMP_SIZE="`cat /tmp/vol_size | grep 'Sync ' | awk '{print $3}'`"
# SVM_SIZE=$(((((BMP_SIZE+(16-1))/16)*16)*2))
# ZFS_SIZE=$((VOL_SIZE-SVM_SIZE))
# SVM_OFFS=$((34+ZFS_SIZE))
# echo "Original volume size: $VOL_SIZE, Bitmap size: $BMP_SIZE"
# echo "SVM soft partition size: $SVM_SIZE, ZFS vdev size: $ZFS_SIZE"
# find /dev/rdsk/c5*s0 | xargs -n1 fmthard -d 0:4:0:34:$ZFS_SIZE
# find /dev/rdsk/c5*s0 | xargs -n1 fmthard -d 1:4:0:$SVM_OFFS:$SVM_SIZE
# find /dev/rdsk/c5*s0 | xargs -n1 prtvtoc | egrep "^ [01]|partition map"

Re-use the find command from above, with the additional selection of only even-numbered disks, placing slice 1 of all selected disks into the SVM metadevice d101:

# find /dev/rdsk/c5*[246]s1 | xargs -I {} echo 1 {} | xargs metainit d101 `find /dev/rdsk/c5*[246]s1 | wc -l`

Then do the odd-numbered disks:

# find /dev/rdsk/c5*[135]s1 | xargs -I {} echo 1 {} | xargs metainit d102 `find /dev/rdsk/c5*[135]s1 | wc -l`

Now mirror metadevices d101 and d102 into mirror d100 – this will give us nice redundancy for the bitmap metadata:

# metainit d100 -m d101 d102
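To make the sizing arithmetic above concrete, here is a worked example with made-up numbers (a hypothetical 100000-block volume whose dsbitmap run reported a 521-block bitmap) – a sketch only, the real values come from your own dsbitmap output:

```shell
# Hypothetical inputs: volume size and SNDR bitmap size, in 512-byte blocks.
VOL_SIZE=100000
BMP_SIZE=521

# Round the bitmap up to a multiple of 16 blocks, then double it to size
# the SVM soft-partition area carved off the end of the disk.
SVM_SIZE=$(((((BMP_SIZE+(16-1))/16)*16)*2))
ZFS_SIZE=$((VOL_SIZE-SVM_SIZE))   # what remains for the ZFS slice (slice 0)
SVM_OFFS=$((34+ZFS_SIZE))         # slice 1 starts after the 34-block label area

echo "SVM_SIZE=$SVM_SIZE ZFS_SIZE=$ZFS_SIZE SVM_OFFS=$SVM_OFFS"
```

With these inputs the bitmap rounds up from 521 to 528 blocks, doubles to 1056, leaving 98944 blocks for the ZFS slice starting at block 34.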
We need to allocate bitmap volumes out of the soft partitions for each SNDR replica:

# OFFSET=1
# for n in `find /dev/rdsk/c5*s1 | grep -n s1 | cut -d ':' -f1 | xargs`
do
    metainit d$n -p /dev/md/rdsk/d100 -o $OFFSET -b $BMP_SIZE
    OFFSET=$((OFFSET + BMP_SIZE + 1))
done

Do the above on both servers and we are almost done; we just have to set up the zpool, NFS and the actual replication. To get the replication going you need to do the following (on both servers):

# DISK=1
# for ZFS_DISK in `find /dev/rdsk/c5*s0`
do
    sndradm -nE NODE-A $ZFS_DISK /dev/md/rdsk/d$DISK NODE-B $ZFS_DISK /dev/md/rdsk/d$DISK ip sync g replicated-pool-of-win
    DISK=$((DISK + 1))
done

NODE-A and NODE-B need entries in each other's /etc/hosts so that they can find each other. We now create the zpool on NODE-A (this will be replicated to NODE-B once replication is turned on):

# find /dev/rdsk/c5*s0 | xargs zpool create replicated-pool-of-win

Now we turn on the actual replication:

# sndradm -g replicated-pool-of-win -nu
# sndradm -g replicated-pool-of-win -P
# metastat -P

With that done you can check your handiwork by running this on NODE-B:

# zpool status replicated-pool-of-win

We now need to set up the NFS clustering.

Register resource types:

scrgadm -a -t SUNW.HAStoragePlus
scrgadm -a -t SUNW.nfs

Create failover resource groups:

scrgadm -a -g nfs-rg1 -h node-a,node-b -y PathPrefix=/nfs_data -y Failback=true

Add logical hostname resources to the resource groups:

scrgadm -a -j nfs-lh-rs1 -L -g nfs-rg1 -l log-name1

Create a dfstab file for each NFS resource:

mkdir -p /nfs_data/SUNW.nfs /nfs_data/share
echo 'share -F nfs -o rw /nfs_data/share' > /nfs_data/SUNW.nfs/dfstab.share1

Configure device groups:

scconf -c -D name=nfs1,nodelist=node-a:node-b,failback=enabled

Create HAStoragePlus resources:

scrgadm -a -j nfs-hastp-rs1 -g nfs-rg1 -t SUNW.HAStoragePlus -x FilesystemMountPoints=/nfs_data -x AffinityOn=True

Share:

share -F nfs -o rw /nfs_data/share

Bring the groups online:

scswitch -Z -g nfs-rg1
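To see how the OFFSET loop above packs the bitmap soft partitions into d100, here is a sketch with made-up numbers (a 528-block bitmap and six replicas, d1 through d6); each soft partition starts one block past the end of the previous one:

```shell
# Made-up bitmap size; in the real procedure this comes from dsbitmap.
BMP_SIZE=528
OFFSET=1
for n in 1 2 3 4 5 6
do
    # On a real node this line would be:
    #   metainit d$n -p /dev/md/rdsk/d100 -o $OFFSET -b $BMP_SIZE
    echo "d$n: offset=$OFFSET length=$BMP_SIZE"
    OFFSET=$((OFFSET + BMP_SIZE + 1))
done
```

This prints offsets 1, 530, 1059, 1588, 2117 and 2646, showing that consecutive replicas never overlap inside the mirrored metadevice.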
Create NFS resources:

scrgadm -a -j share1 -g nfs-rg1 -t SUNW.nfs -y Resource_dependencies=nfs-hastp-rs1

Change the number of NFS threads – on each node edit the file /opt/SUNWscnfs/bin/nfs_start_daemons and instead of

DEFAULT_NFSDCMD="/usr/lib/nfs/nfsd -a 16"

put

DEFAULT_NFSDCMD="/usr/lib/nfs/nfsd -a 1024"

Enable NFS resources:

scswitch -e -j share1

Switch resource groups to check the cluster:

scswitch -z -h node2 -g nfs-rg1

Now we set up IPMP on both nodes to give us a floating VIP. On node-a do this:

cat > /etc/hostname.bge0 << eof
node1 netmask + broadcast + group sc_ipmp0 up \
addif 10.1.1.1 netmask + broadcast + -failover -standby deprecated up
eof

And on node-b do this:

cat > /etc/hostname.bge0 << eof
node2 netmask + broadcast + group sc_ipmp0 up \
addif 10.1.1.1 netmask + broadcast + -failover -standby deprecated up
eof

And we are done – you will now have a fully redundant failover NFS server!
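The nfsd thread-count edit described above can be scripted instead of done by hand. This is a sketch demonstrated on a temp copy; on a real node the target file would be /opt/SUNWscnfs/bin/nfs_start_daemons (back it up first):

```shell
# Demonstrate the edit on a temporary stand-in for nfs_start_daemons.
f=$(mktemp)
echo 'DEFAULT_NFSDCMD="/usr/lib/nfs/nfsd -a 16"' > "$f"

# Bump the nfsd thread count from 16 to 1024.
sed 's|nfsd -a 16|nfsd -a 1024|' "$f" > "$f.new" && mv "$f.new" "$f"

cat "$f"
```

Scripting it keeps both cluster nodes consistent; doing the same edit by hand on each node is where typos creep in.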