
ZFS iSCSI AVS server?



So I have been experimenting with ZFS and iSCSI already and currently have it working with Solaris and ESXi/Windows systems.

I have been trying to find a way of including AVS (Sun StorageTek Availability Suite) in the mix.

Has anyone had any experience with this?

I would like to have a live offsite backup but have not been able to get the remote mirror working.

sndradm -e localhost /dev/rdsk/c2d0s1 /dev/rdsk/c2d0s0 server1 /dev/rdsk/c2s0s1 /dev/rdisk/c2s0s0 ip sync

cant no open /dev/rdsk/c2d0s1

any idea?


So after three days of puttering with it I finally figured it out. I now have a ZFS/AVS/iSCSI data store for live data backup.

Woot +1 for me lol.


Well I have just finished deploying a solution like this for a large web server farm and thought it would be nice to share what we did and how it worked with the larger community. When researching this solution there was quite a lot of information out there but it was all in different contexts and none of it directly related to each other.

By the end of this how to you will have a solution somewhat similar to the below.

To achieve this you will need :

* 2 servers for the file servers; we used a pair of IBM xSeries, each with a single quad-core 2GHz Xeon and 16GB of RAM.

* A dedicated gigabit switch. I prefer dumb L2 switches for this sort of application as it lessens the chance of performance issues stemming from L3 and VLAN tagging.

* 2 iSCSI chassis; we used Infortrend EonStors (A12E-G2121-2 to be exact). We chose to back the file servers with these iSCSI shelves because we already had two. If we had not had them, we would quite likely have used something like an IBM DS3200.

* A spare gigabit NIC in each of the servers you want to hook into the storage

* 2 4-port gigabit NICs for the file servers.

The first step is to install OpenSolaris on the servers – you can get it here : http://opensolaris.org/os/downloads/

Once you have the image, install it on the first server and let it use its defaults. The first thing I do with a new OpenSolaris install is disable the annoying "boot into GUI". Note for Sun: make Solaris default to a console boot, this isn't Windows guys, or at least give us an install option to change this.

To disable the GUI boot you need to make the following change to the grub config (/rpool/boot/grub/menu.lst).

Change the kernel$ line to read:

kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS

Once this is done you need to set pkg to use dev for its packages (there are some bugs in the iSCSI code that make it panic x64 boxes prior to b106)

pfexec pkg set-authority -PO http://pkg.opensolaris.org/dev dev

Then update the server to the latest and greatest release

pfexec pkg image-update

Check your menu.lst after you have done this and verify that the system is still going to console boot.
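A quick way to do that check is to grep the file rather than eyeball it. The sketch below is only a rough helper: the function name is made up, and it assumes that a graphical boot entry is recognisable by a "splashimage" directive or by "console=graphics" on the kernel$ line, which may not hold for every build.

```shell
#!/bin/sh
# Report whether a GRUB menu.lst still asks for the graphical console.
# Assumption: GUI boot entries carry "console=graphics" on the kernel$ line
# and/or a "splashimage" directive; adjust the patterns if your file differs.
check_console_boot() {
    menu="$1"
    if [ ! -r "$menu" ]; then
        echo "cannot read $menu" >&2
        return 2
    fi
    if grep -E '^[[:space:]]*(splashimage|kernel\$.*console=graphics)' "$menu" >/dev/null; then
        echo "graphical boot still configured in $menu"
        return 1
    fi
    echo "console boot configured in $menu"
    return 0
}

# Example (path as given in the post):
# check_console_boot /rpool/boot/grub/menu.lst
```

Run it again after every image-update, since the update can write a fresh boot entry.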

Once the server is all up to date the real work can begin. I am leaving out details like setting up the IP addressing for the NICs as I assume you can do that; if not, leave a comment and I will expand the article to cover that too.

In our case, as we had used iSCSI units to hold the disks, we ran a pair of cables from the iSCSI unit to the server and then on the iSCSI unit we mapped 3 drives to each of its Ethernet interfaces. With that done we then set up the iSCSI initiator on the servers like so:

pfexec pkg install SUNWiscsi

pfexec iscsiadm add discovery-address <IP of interface 1 on the iSCSI box>

pfexec iscsiadm add discovery-address <IP of interface 2 on the iSCSI box>

pfexec iscsiadm modify discovery -t enable

We should now be in a position where the disks are being exported by the iSCSI unit and the OpenSolaris server can see them. Verify this like so:

pfexec format

Our output looks as follows


0. c3t0d0
1. c5t2d0
2. c5t3d0
3. c5t4d0
4. c5t5d0
5. c5t6d0
6. c5t7d0


You will need to create a zpool on one of the disks temporarily so you can correctly size the sndr bitmap

zpool create -f temp c5t7d0; zpool destroy temp

dsbitmap -r /dev/rdsk/c5t7d0s0 | tee /tmp/vol_size

Now that we have the bitmap sizing done we can go about setting up the disks. The way to do this feels like a bit of a hack to me, but sadly Sun doesn't provide a really nice way to do it (the following code comes from http://blogs.sun.com/AVS/entry/avs_and_zfs_seamless)

# VOL_SIZE="`cat /tmp/vol_size| grep 'size: [0-9]' | awk '{print $5}'`"

# BMP_SIZE="`cat /tmp/vol_size| grep 'Sync ' | awk '{print $3}'`"

# SVM_SIZE=$((((BMP_SIZE+((16-1)/16))*16)*2))

# ZFS_SIZE=$((VOL_SIZE-SVM_SIZE))

# SVM_OFFS=$(((34+ZFS_SIZE)))

# echo "Original volume size: $VOL_SIZE, Bitmap size: $BMP_SIZE"

# echo "SVM soft partition size: $SVM_SIZE, ZFS vdev size: $ZFS_SIZE"

# find /dev/rdsk/c5*s0 | xargs -n1 fmthard -d 0:4:0:34:$ZFS_SIZE

# find /dev/rdsk/c5*s0 | xargs -n1 fmthard -d 1:4:0:$SVM_OFFS:$SVM_SIZE

# find /dev/rdsk/c5*s0 | xargs -n1 prtvtoc | egrep "^ [01]|partition map"
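To make the slice arithmetic easier to follow, here is a small sketch that runs the same calculation on made-up numbers. The function name and the sample sizes are hypothetical, and taking ZFS_SIZE as VOL_SIZE minus SVM_SIZE is an assumption about how the remaining space is handed to ZFS.

```shell
#!/bin/sh
# Split a volume of VOL_SIZE blocks into a ZFS slice (slice 0) and an SVM
# slice (slice 1) that will hold the SNDR bitmaps.
compute_layout() {
    VOL_SIZE="$1"; BMP_SIZE="$2"
    # Same expression as in the post, only spaced out for readability.
    SVM_SIZE=$(( ((BMP_SIZE + ((16 - 1) / 16)) * 16) * 2 ))
    # Assumed definition: ZFS gets whatever the SVM slice does not use.
    ZFS_SIZE=$(( VOL_SIZE - SVM_SIZE ))
    # Slice 0 starts at block 34, so slice 1 starts right after it.
    SVM_OFFS=$(( 34 + ZFS_SIZE ))
    echo "$SVM_SIZE $ZFS_SIZE $SVM_OFFS"
}

# Made-up numbers, not taken from a real dsbitmap run.
compute_layout 1000000 10
```

Note that (16-1)/16 is integer division and evaluates to 0, so as written the SVM slice is simply BMP_SIZE * 32 blocks.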

Re-use the find command from above, this time selecting only the even-numbered targets and placing slice 1 of each selected disk into the SVM metadevice d101. With the disk names shown above (c5t2d0 to c5t7d0) the pattern has to pin the target digit; a looser pattern such as c5*[246]s1 would be tested against the 0 in d0 and match nothing.

# find /dev/rdsk/c5t[246]d0s1 | xargs -I {} echo 1 $1\{} | xargs metainit d101 `find /dev/rdsk/c5t[246]d0s1 | wc -l`

Then do the odd-numbered targets

# find /dev/rdsk/c5t[357]d0s1 | xargs -I {} echo 1 $1\{} | xargs metainit d102 `find /dev/rdsk/c5t[357]d0s1 | wc -l`
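Before running metainit it is worth checking what the even and odd patterns actually select. The sketch below uses a scratch directory with file names shaped like the format listing earlier in the post (c5t2d0 to c5t7d0); the helper name is made up, and the patterns would need adjusting for other controller or target numbers.

```shell
#!/bin/sh
# Count how many device names in a directory match a glob pattern.
count_matches() {
    dir="$1"; pattern="$2"
    find "$dir" -name "$pattern" | wc -l | tr -d ' '
}

# Scratch directory standing in for /dev/rdsk, one slice-1 entry per disk.
demo=$(mktemp -d)
for t in 2 3 4 5 6 7; do : > "$demo/c5t${t}d0s1"; done

# The bracket in the loose pattern lines up with the "0" of "d0", not the target.
echo "loose even pattern:  $(count_matches "$demo" 'c5*[246]s1')"
echo "pinned even pattern: $(count_matches "$demo" 'c5t[246]d0s1')"
echo "pinned odd pattern:  $(count_matches "$demo" 'c5t[357]d0s1')"
rm -rf "$demo"
```

On a real box the same check is just the find command on its own, without the metainit part of the pipeline.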

Now mirror metadevices d101 and d102 into mirror d100; this will give us nice redundancy for the metadata

# metainit d100 -m d101 d102

We need to allocate bitmap volumes out of the soft partitions for each SNDR replica


# for n in `find /dev/rdsk/c5*s1 | grep -n s1 | cut -d ':' -f1 | xargs`
do
    metainit d$n -p /dev/md/rdsk/d100 -o $OFFSET -b $BMP_SIZE
done
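The loop above relies on an OFFSET variable that has to move forward for every bitmap volume, otherwise the soft partitions would overlap. The post does not show how it is set, so the following dry run is only a guess at a workable scheme: it assumes OFFSET starts at block 1 and that each soft partition is followed by a one-block gap. It prints the metainit commands instead of running them.

```shell
#!/bin/sh
# Print (do not run) one metainit command per bitmap volume.
# Assumptions: OFFSET starts at block 1 and advances by BMP_SIZE + 1 each time;
# neither value is given in the post.
plan_bitmaps() {
    count="$1"; BMP_SIZE="$2"
    OFFSET=1
    n=1
    while [ "$n" -le "$count" ]; do
        echo "metainit d$n -p /dev/md/rdsk/d100 -o $OFFSET -b $BMP_SIZE"
        OFFSET=$(( OFFSET + BMP_SIZE + 1 ))
        n=$(( n + 1 ))
    done
}

# Three disks with a made-up bitmap size of 10 blocks.
plan_bitmaps 3 10
```

Check the printed offsets against the size of d100 before running anything for real.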



Do the above on both servers and we are almost done; we just have to set up the zpool, NFS and the actual replication.

To get the replication going you need to do the following (on both servers):

# DISK=1

# for ZFS_DISK in `find /dev/rdsk/c5*s0`
do
    sndradm -nE NODE-A $ZFS_DISK /dev/md/rdsk/d$DISK NODE-B $ZFS_DISK /dev/md/rdsk/d$DISK ip sync g replicated-pool-of-win
    DISK=$(((DISK + 1)))
done


NODE-A and NODE-B need to have entries in each other's /etc/hosts so that they can find each other.
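A small check like the one below can confirm the hosts entries before sndradm is run. It is a sketch only: the function name is made up, and it looks at the hosts file directly rather than asking the resolver, so it will not notice names that come from DNS or NIS.

```shell
#!/bin/sh
# Return 0 if NAME appears as a hostname or alias in a hosts-format file.
# Note: on Solaris use nawk or /usr/xpg4/bin/awk, the old /usr/bin/awk
# does not understand -v or sub().
has_host_entry() {
    file="$1"; name="$2"
    awk -v h="$name" '
        { sub(/#.*/, "") }                                   # drop comments
        { for (i = 2; i <= NF; i++) if ($i == h) found = 1 } # field 1 is the IP
        END { exit found ? 0 : 1 }
    ' "$file"
}

# Example, run on NODE-A:
# has_host_entry /etc/hosts NODE-B || echo "NODE-B missing from /etc/hosts"
```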

We now create the zpool on NODE-A (this will be replicated to NODE-B once that's turned on)

# find /dev/rdsk/c5*s0 | xargs zpool create replicated-pool-of-win

Now we will turn on the actual replication

# sndradm -g replicated-pool-of-win -nu

# sndradm -g replicated-pool-of-win -P

# metastat -P

With that done you can check your handiwork by running this on NODE-B

# zpool status replicated-pool-of-win

We now need to set up the NFS clustering

Register resource types :

scrgadm -a -t SUNW.HAStoragePlus

scrgadm -a -t SUNW.nfs

Create failover resource groups :

scrgadm -a -g nfs-rg1 -h node-a,node-b -y PathPrefix=/nfs_data -y Failback=true

Add logical hostname resources to the resource groups :

scrgadm -a -j nfs-lh-rs1 -L -g nfs-rg1 -l log-name1

Create dfstab file for each NFS resource :

mkdir -p /nfs_data/SUNW.nfs /nfs_data/share

echo 'share -F nfs -o rw /nfs_data/share' > /nfs_data/SUNW.nfs/dfstab.share1

Configure device groups :

scconf -c -D name=nfs1,nodelist=node-a:node-b,failback=enabled

Create HAStoragePlus resources :

scrgadm -a -j nfs-hastp-rs1 -g nfs-rg1 -t SUNW.HAStoragePlus -x FilesystemMountPoints=/nfs_data -x AffinityOn=True

Share :

share -F nfs -o rw /nfs_data/share

Bring the groups online :

scswitch -Z -g nfs-rg1

Create NFS resources :

scrgadm -a -j share1 -g nfs-rg1 -t SUNW.nfs -y Resource_dependencies=nfs-hastp-rs1

Change the number of NFS threads: on each node edit the file /opt/SUNWscnfs/bin/nfs_start_daemons and, instead of

DEFAULT_NFSDCMD="/usr/lib/nfs/nfsd -a 16"

use

DEFAULT_NFSDCMD="/usr/lib/nfs/nfsd -a 1024"
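If you would rather script that edit than do it by hand on each node, something along these lines should work. It is a sketch with a made-up function name; it keeps a .orig copy next to the file and avoids "sed -i", which not every sed supports.

```shell
#!/bin/sh
# Rewrite the nfsd thread count in nfs_start_daemons, keeping a backup copy.
set_nfsd_threads() {
    file="$1"; threads="$2"
    cp "$file" "$file.orig" || return 1
    # Replace only the number that follows "/usr/lib/nfs/nfsd -a ".
    sed "s|\(/usr/lib/nfs/nfsd -a \)[0-9][0-9]*|\1$threads|" "$file.orig" > "$file"
}

# Example, run on each node:
# set_nfsd_threads /opt/SUNWscnfs/bin/nfs_start_daemons 1024
```

Writing through a redirect keeps the original file's owner and permissions, which matters here because the cluster framework executes this script.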

Enable NFS resources :

scswitch -e -j share1

Switch resource groups to check the cluster :

scswitch -z -h node-b -g nfs-rg1

Now we set up IPMP on both nodes to give us a floating VIP

On node-a do this :

cat > /etc/hostname.bge0 << eof
node1 netmask + broadcast + group sc_ipmp0 up \
addif netmask + broadcast + -failover -standby deprecated up
eof


And on node-b do this :

cat > /etc/hostname.bge0 << eof
node2 netmask + broadcast + group sc_ipmp0 up \
addif netmask + broadcast + -failover -standby deprecated up
eof


And we are done – you will now have a fully redundant failover NFS server!
