Virtualized ODA X6-2HA – working with VMs

It’s been a while since I built a virtualized ODA with VMs on a shared repo, so I thought I’d go through the basic steps.

  1. install the OS
    1. install the Virtualized ISO image
    2. configure networking
    3. install the ODA_BASE patch
    4. deploy ODA_BASE
    5. configure networking in ODA_BASE
    6. run the configurator in ODA_BASE to complete the deployment
  2. create a shared repository.  This is where your specific situation comes into play.  Depending on your hardware, you may have more or less space in DATA or RECO.  Your DBA will be able to tell you how much they need for each, and where you can borrow a few terabytes (or however much you need) for your VMs
  3. (optionally) create a separate shared repository to store your templates.  Whether this is worthwhile depends on how many of the same kind of VM you’ll be deploying.  If there’s no reason to keep the templates around once you’ve created your VMs, don’t bother with this step
  4. import the template into the repository
    1. download the assembly file from Oracle (it will unzip into an .ova archive file)
    2. ***CRITICAL*** copy the .ova to /OVS on either node’s DOM0, not into ODA_BASE
    3. import the assembly (point it at the file sitting in DOM0’s /OVS)
  5. modify the template config as needed (number of vCPUs, memory, etc.)
  6. clone the template to a VM
  7. add networks to the VM (usually net1 for the first public network, net2 for the second, and net3+ for any VLANs you’ve created)
  8. boot the VM and start the console (the easiest way is to VNC into ODA_BASE and launch it from there)
  9. set up your hostname, networking, etc. the way you want them
  10. reboot the VM to ensure the changes persist
  11. rinse and repeat as needed
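For reference, the workflow above maps onto oakcli commands roughly as follows. Treat this as a sketch: the repo name, size, template name and assembly filename are placeholders of mine, and exact flags vary between ODA software versions, so check oakcli help on your release:

```
# create a shared repo on the DATA disk group (name and size are examples)
oakcli create repo vmrepo1 -dg DATA -size 2048

# import the assembly you copied to /OVS on DOM0
oakcli import vmtemplate ol6tmpl -assembly /OVS/template.ova -repo vmrepo1 -node 0

# adjust the template, then clone it to a VM
oakcli configure vmtemplate ol6tmpl -vcpu 4 -memory 8192M
oakcli clone vm myvm1 -vmtemplate ol6tmpl -repo vmrepo1 -node 0

# attach a network, then boot
oakcli modify vm myvm1 -addnetwork net1
oakcli start vm myvm1
```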

If you need to configure HA, preferred node or any other things, this is the time to do it.

 

Create VM in Oracle VM for x86 using NFS share

I’m using OVM Manager 3.4.2 and OVM Server 3.3.2 to test an upgrade for one of our customers.  I’m using the StarWind iSCSI server to present shared storage to the cluster, but in production you should use enterprise-grade hardware for this.  There is an easier way to do what follows: create an HVM VM, install from an ISO stored in a repository, power the VM off, change the type to PVM and power it back on.  That may not work with all operating systems, however, so here I’ll go over how to create a new PVM VM from an ISO image shared from an NFS server.

* Download ISO (I'm using Oracle Linux 6.5 64bit for this example)
* Copy ISO image to OVM Manager (any NFS server is fine)
* Mount ISO on the loopback device
# mount -o loop /var/tmp/V41362-01.iso /mnt

* Share the folder via NFS
# service nfs start
Starting NFS services: [ OK ]
Starting NFS quotas: [ OK ]
Starting NFS mountd: [ OK ]
Starting NFS daemon: [ OK ]
Starting RPC idmapd: [ OK ]

# exportfs *:/mnt/

# showmount -e
Export list for ovmm:
/mnt *

* Create new VM in OVM Manager
* Edit VM properties and configure as PVM
* Set additional properties such as memory, cpu and network
* At the boot order tab, enter the network boot path formatted like this:
  nfs:{ip address or FQDN of NFS host}:/{path to ISO image top level directory}

For example, our NFS server is 10.2.3.4 and the path where I mounted the ISO is at /mnt.  Leave the {}'s off of course:

  nfs:10.2.3.4:/mnt 

You should be able to boot your VM at this point and perform the install of the OS.
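One note on the exportfs command shown above: it creates a one-off export that disappears when the NFS services restart. If you want the share to survive a restart, the equivalent entry in /etc/exports (the options here are my assumption; read-only is plenty for install media) looks like this:

```
# /etc/exports
/mnt    *(ro,no_subtree_check)
```

After editing the file, run `exportfs -ra` to re-read it and `showmount -e` to verify the export list.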

Nimble PowerShell Toolkit

I was working on an internal project to test the performance of a converged system solution.  The storage component is a Nimble AF7000 from which we’re presenting a number of LUNs.  There are almost 30 LUNs, and I’ve had to create, delete and provision them a number of times throughout the project.  It became extremely tedious to do this through the WebUI, so I decided to see if it could be scripted.

I know you can log into the Nimble via SSH and basically do what I’m trying to do- and I did test this with success.  However, I recently had a customer who wanted to use PowerShell to perform some daily snapshot/clone operations for an Oracle database running on Windows (don’t ask).  We decided to leverage the Nimble PowerShell Toolkit to perform the operations right from the Windows server.  The script was fairly straightforward, although we had to learn a little about PowerShell syntax along the way.  I’ve included a sanitized script below that does what I need.

$arrayname = "IP address or FQDN of array management address"
$nm_uid = "admin"
$nm_password = ConvertTo-SecureString -String "admin" -AsPlainText -Force
$nm_cred = New-Object -TypeName System.Management.Automation.PSCredential -ArgumentList $nm_uid,$nm_password
$initiatorID = Get-NSInitiatorGroup -name {name of initiator group} | select -expandproperty id

# Import Nimble Tool Kit for PowerShell
import-module NimblePowerShellToolKit

# Connect to the array
Connect-NSGroup -group $arrayname -credential $nm_cred

# Create 10 DATA Disks
for ($i=1; $i -le 10; $i++) {
    New-NSVolume -Name DATADISK$i -Size 1048576 -PerfPolicy_id 036462b75de9a4f69600000000000000000000000e -online $true
    $volumeID = Get-NSVolume -name DATADISK$i | select -expandproperty id
    New-NSAccessControlRecord -initiator_group_id $initiatorID -vol_id $volumeID
}

# Create 10 RECO Disks
for ($i=1; $i -le 10; $i++) {
    New-NSVolume -Name RECODISK$i -Size 1048576 -PerfPolicy_id 036462b75de9a4f69600000000000000000000000e -online $true
    $volumeID = Get-NSVolume -name RECODISK$i | select -expandproperty id
    New-NSAccessControlRecord -initiator_group_id $initiatorID -vol_id $volumeID
}

# Create 3 GRID Disks
for ($i=1; $i -le 3; $i++) {
    New-NSVolume -Name GRIDDISK$i -Size 2048 -PerfPolicy_id 036462b75de9a4f69600000000000000000000000e -online $true
    $volumeID = Get-NSVolume -name GRIDDISK$i | select -expandproperty id
    New-NSAccessControlRecord -initiator_group_id $initiatorID -vol_id $volumeID
}

I also wrote the script below to delete the LUNs:

$arrayname = "IP address or FQDN of array management address"  
$nm_uid = "admin"
$nm_password = ConvertTo-SecureString -String "admin" -AsPlainText -Force
$nm_cred = New-Object -TypeName System.Management.Automation.PSCredential -ArgumentList $nm_uid,$nm_password
$initiatorID = Get-NSInitiatorGroup -name {name of initiator group} | select -expandproperty id

# Import Nimble Tool Kit for PowerShell
import-module NimblePowerShellToolKit

# Connect to the array 
Connect-NSGroup -group $arrayname -credential $nm_cred


# Delete 10 DATA Disks
for ($i=1; $i -le 10; $i++) {
    Set-NSVolume -name DATADISK$i -online $false
    Remove-NSVolume -name DATADISK$i
}

# Delete 10 RECO Disks
for ($i=1; $i -le 10; $i++) {
    Set-NSVolume -name RECODISK$i -online $false
    Remove-NSVolume -name RECODISK$i 
}

# Delete 3 GRID Disks
for ($i=1; $i -le 3; $i++) {
    Set-NSVolume -name GRIDDISK$i -online $false
    Remove-NSVolume -name GRIDDISK$i 
}

Obviously you’ll have to substitute some of the values such as $arrayname, $nm_uid, $nm_password and $initiatorID (make sure you remove the {}’s when you put your value here). Storing your password in plain text like this is very insecure, but it was a quick and dirty solution at the time. There are ways to read the password from a locked-down text file and decrypt it into a variable. Or, if you don’t mind being interactive, you can skip providing the credentials and a password dialog box will pop up for you to fill in every time the script runs.
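On the password point, one common approach (a sketch, not from the original script; the file path is a placeholder) is to encrypt the password once with the built-in SecureString cmdlets and read it back at run time, so nothing sits in the script in plain text:

```powershell
# One-time setup (run as the same user that will run the script):
Read-Host -AsSecureString "Array password" | ConvertFrom-SecureString | Out-File C:\scripts\nimble.pwd

# In the script, replace the plain-text password lines with:
$nm_uid = "admin"
$nm_password = Get-Content C:\scripts\nimble.pwd | ConvertTo-SecureString
$nm_cred = New-Object -TypeName System.Management.Automation.PSCredential -ArgumentList $nm_uid,$nm_password
```

The encrypted file only decrypts for the user (and machine) that created it, which is the point.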

It made the project go a lot faster- hopefully you can use this to model different scripts to do other things. The entire command set of the Nimble array is exposed through the toolkit, so there’s not much you can do in the WebUI that you can’t do here. When you download the toolkit, there is a README PDF that goes through all the commands. In PowerShell, you can also get help for each of the commands. For example:

PS C:\Users\esteed> help New-NSVolume

NAME
    New-NSvolume

SYNOPSIS
    Create operation is used to create or clone a volume. Creating volumes requires name and size attributes. Cloning
    volumes requires clone, name and base_snap_id attributes where clone is set to true. Newly created volume will not
    have any access control records, they can be added to the volume by create operation on access_control_records
    object set. Cloned volume inherits access control records from the parent volume.


SYNTAX
    New-NSvolume [-name] <String> [-size] <UInt64> [[-description] <String>] [[-perfpolicy_id] <String>] [[-reserve]
    <UInt64>] [[-warn_level] <UInt64>] [[-limit] <UInt64>] [[-snap_reserve] <UInt64>] [[-snap_warn_level] <UInt64>]
    [[-snap_limit] <UInt64>] [[-online] <Boolean>] [[-multi_initiator] <Boolean>] [[-pool_id] <String>] [[-read_only]
    <Boolean>] [[-block_size] <UInt64>] [[-clone] <Boolean>] [[-base_snap_id] <String>] [[-agent_type] <String>]
    [[-dest_pool_id] <String>] [[-cache_pinned] <Boolean>] [[-encryption_cipher] <String>] [<CommonParameters>]


DESCRIPTION
    Create operation is used to create or clone a volume. Creating volumes requires name and size attributes. Cloning
    volumes requires clone, name and base_snap_id attributes where clone is set to true. Newly created volume will not
    have any access control records, they can be added to the volume by create operation on access_control_records
    object set. Cloned volume inherits access control records from the parent volume.


RELATED LINKS

REMARKS
    To see the examples, type: "get-help New-NSvolume -examples".
    For more information, type: "get-help New-NSvolume -detailed".
    For technical information, type: "get-help New-NSvolume -full".

You can also use the -detailed parameter at the end to get a more complete description of each option. Additionally, you can use -examples to see the commands used in real-world situations. Have fun!

Temperature monitoring script with email alerts

We have quite a bit of expensive equipment in our server room, and we’ve had the A/C fail a couple of times. As a result, I’ve installed a Raspberry Pi Zero with a DS18B20 temperature sensor connected to it to monitor the temperature of the room.  If it goes above a set threshold, it sends an email to the engineers so we can log in and shut things down until the problem is fixed.

 

This project branches off from the one I did earlier on monitoring temperature with a raspberry pi and MRTG.  This too uses MRTG but I won’t get into the details of that- you can see how I set that up here.

 

The big piece here is the alerting logic.  You’d be surprised how fast the temperature can go up in a small room full of gear putting out a lot of heat.  For that reason, I monitor the temperature every minute.  If the current temperature exceeds the threshold set in the script, it fires off an email and sets an alert flag to true.  I did this so we don’t get an email every minute while the temperature is above the threshold.  How irritating would that be?  So another piece of logic in the script checks whether the alert flag has been tripped.  If it has, no further email is sent until the temperature comes back down below the threshold.  Then an all-clear email is sent and the cycle repeats.
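The alert logic in the paragraph above boils down to a small state machine. Here is a minimal sketch in Python (the names and threshold are mine, not from the actual script on GitHub):

```python
THRESHOLD = 80.0  # degrees; pick whatever your room tolerates

def check_temp(temp, alerted):
    """One polling cycle: return (email_message_or_None, new_alert_flag)."""
    if temp > THRESHOLD and not alerted:
        # First crossing above threshold: send exactly one alert email
        return ("ALERT: server room at %.1f degrees" % temp, True)
    if temp <= THRESHOLD and alerted:
        # Temperature recovered: send exactly one all-clear email
        return ("All clear: back down to %.1f degrees" % temp, False)
    # No state change: stay quiet
    return (None, alerted)
```

Run once a minute from cron; only the two transitions generate mail, so a long A/C outage produces one alert and one all-clear rather than an email every minute.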

 

I used the instructions here to set up ssmtp on the Pi.  Since we’re a Comcast customer, I used their email relay, so the instructions for that are a little different.  You can also use your company’s own mail relay if you have one internally that can send email to external addresses.  As has been my practice lately, I’ve uploaded the code to GitHub here for you to do with as you please.

 

As always, if you have any constructive criticism or comments, feel free to leave them below and I’ll get back to you ASAP.

SSH Tunneling with PuTTY

From time to time I have a need to connect to a system inside another remote network (usually my work).  Normally I just ssh in and then jump to the machine I need to be on.  That’s all fine and dandy if you don’t need a GUI.  What if you need to be on the GUI console of the target machine inside the firewall and the firewall doesn’t allow the port you need to use?

 

Enter VNC and PuTTY.  You aren’t limited to doing this with PuTTY or VNC; it’s just that a majority of my work is done from a Windows machine, and I refuse to install the bloated Cygwin package on my machine just to get an SSH command-line session.  Bah.. that’s a story for another day.  Anyway- SSH tunnels can be a bit confusing to the layperson, so I thought I’d do a graphical illustration to help.

 

In this scenario, I will be using my laptop at home to connect to a landing pad UNIX machine at work.  I will then open a tunnel to another machine inside the remote network, establishing a connection to the VNC server running on that machine.  I won’t go into how to set up a VNC server on Linux, as there are plenty of tutorials out there that cover it.  The one thing I will say is: make sure you use a password when you start it up.  Here is a visual example of what the connection looks like:

 

[Diagram: laptop → landing pad → remote server tunnel]

 

Here are some enlarged views so you can see what’s going on.  First we start PuTTY on the laptop.  I’ll show an example of what options you need to select inside the PuTTY connection later.  Once the tunnel is in place, fire up your favorite VNC client and point it to 127.0.0.1 or localhost on port 59001:

[Screenshot: VNC client pointed at 127.0.0.1:59001]

We pointed our VNC client to the address and port of the tunnel we just created, so the traffic is sent through the tunnel to the external landing pad and forwarded on into the remote network:

[Screenshot: traffic entering the tunnel at the landing pad]

Finally, the tunnel terminates on the server inside the remote network and connects the tunnel to port 5901 on that machine:

[Screenshot: tunnel terminating at port 5901 on the internal server]

 

It may seem odd to connect your VNC client to the laptop’s localhost address in order to reach the target machine.  This is because you’re sending that traffic through the SSH tunnel that we set up rather than pointing it directly to the server you want to reach.

 

Now I’ll show you how to configure PuTTY to create the tunnel.  First, fire up PuTTY and populate the username and IP address of the landing pad server in our example (substitute yours, of course).  Leave the port at 22:

[Screenshot: PuTTY Session settings with the landing pad address, port 22]

 

Next, scroll down the left-hand Category pane and select Tunnels.  Here, populate the source port (59001 in my example) and, in the Destination field, the IP address of the final destination server along with the port you want to connect to on that machine (5901 in my example).  Remember, you aren’t putting the IP address of the landing pad here- we want the target server in the Destination field. Once you have the Source port and Destination fields filled in, click Add and it will pop into the window as seen below:

[Screenshot: PuTTY Tunnels settings with source port 59001 and the target server:5901 as the destination]

 

To establish the tunnel, click Open. This launches the PuTTY terminal and prompts you for your password.  In this screenshot I’m logging in as root; generally, though, it’s a good idea to use a non-privileged user to log into any machine:

 

[Screenshot: PuTTY terminal login prompt]

Once you see the user prompt and you’re logged in, the tunnel is in place.  Keep in mind that this SSH session is the only thing keeping the tunnel open: if you log out of the shell, it tears down the tunnel, so keep this window open while you’re using the tunnel.
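Incidentally, if you’re on a machine with a command-line OpenSSH client, the same tunnel can be built with a single command; PuTTY is just doing this with a GUI. Here `landing-pad` and `target-server` are placeholders for the hosts in the example above:

```
# local port 59001 -> (through the landing pad) -> target-server:5901
ssh -L 59001:target-server:5901 user@landing-pad
```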

 

The next step is to launch a VNC Viewer on your laptop and point it to your local machine on port 59001:

[Screenshot: VNC Viewer connecting to localhost:59001]

Click the connect button and you should see the next window prompting you for the password you set up earlier:

[Screenshot: VNC password prompt]

Finally, once you click OK you will be brought to your VNC Desktop on the machine inside the remote network!

[Screenshot: VNC desktop on the machine inside the remote network]

 

So let’s take a step back and review what we’ve effectively done here:

 

Start VNC server:

We have to start a VNC server on the target computer, along with configuring a password to keep everyone else out.  This would have to be done separately.

 

Establish Tunnel:

We first establish the tunnel from the laptop, through the landing pad and finally to the remote server.  I’m making the obvious assumption here that you have the landing pad accessible to the internet on port 22 and that you have an SSH server running that will accept such connections.  You’re effectively logging into the landing pad just like you would on any other day.  The difference here is that we’re also telling PuTTY to set up a tunnel for us pointing to the remote server as well.  Aside from that- your login session will look and feel just the same.

 

Launch VNC Client:

We then start the VNC client on our laptop.  Normally, we would point it directly to the server we want to VNC into.  In our case, we created a tunnel that terminates on your laptop at port 59001.  So we connect our VNC client to the laptop (localhost or 127.0.0.1 should work) and point it to port 59001 instead of the standard port 5901.  The VNC client doesn’t care how the traffic is getting to the VNC server, it just does its job.

Think of this SSH tunnel as kind of a wormhole if that type of thing were to actually exist.  The traditional method of connecting to your remote endpoint would be similar to pointing our space shuttle towards the Andromeda galaxy which is about 2.5 million light years away.  It’s essentially not possible to get there- similar to a firewall that is blocking us.  But what if there were a wormhole that terminated near Earth that ended in the Andromeda galaxy?  If we were to point our space shuttle into the wormhole, theoretically we would pop out the other side at our target.

 

If you do plan on doing something like this, make sure your network administrator is OK with it.  They may flag the traffic as malicious if they’re not sure where it’s coming from, and you may wind up in trouble.  I hope this helps give a basic understanding of how SSH tunnels work.


Internet Ping Meter (part 2 of 2)

Onto the fun stuff!  Below is the Python script that does most of the heavy lifting.  Remember, with Python indentation is critical; it’s actually used to delimit things like functions, rather than more traditional delimiters like {}.  Best practice is to use spaces, not tabs, for indentation, because tabs can render inconsistently and cause problems.  To avoid this, I like to use an editor such as Notepad++ or the Arduino IDE. The Arduino IDE does a great job of taking care of spacing and indentation; it will even go through the entire file and fix any indentation errors automatically. Highly recommended. FYI- you’ll also need to install the PySerial module for this to work:

#!/usr/bin/python

##
## Internet Ping Meter v1.0 - Eric Steed
##
## 01/03/17 - first version - EPS
##
import serial
import sys
import subprocess
import time
latency = 0
ping_targets="8.8.8.8 4.2.2.2 208.67.220.220"
retVal = 0
failLevel = 0
lastLEDStatus = ""

##
## Define array variable alertLevel[] and assign color codes to be sent to the NeoPixel.
## Based on the number of total ping failures, iterate the failLevel by one and
## send the appropriate color code.
##
clearLED = "ic"
alertLevel = []
alertLevel = ["h","g","f","e","d"]

##
## Open the serial port to talk to the NeoPixel. Have to wait for it to initialize
## before we start sending signals
##
port = serial.Serial("/dev/ttyACM0", baudrate=9600, timeout=1)
time.sleep(3)

##
## Green = h
## Greenish Yellow = g
## Yellow = f
## Orange = e
## Red = d
## Black = i
##
## LED #'s
##
## 1-9 = 1-9
## 10 = a
## 11 = b
## 12 = c
##
##
## I'm using a NeoPixel ring with 12 LED Segments to indicate the average latency of
## multiple established servers on the internet.  This way I can tell visually if
## my internet connection is slow, or even down.
##
## To control the NeoPixel, I've assigned specific characters to indicate how many
## LED's to illuminate and what color.  When we tell the NeoPixel to illuminate a
## given number of LED's, we have to account for the fact that the last command
## string that was sent is persistent in that the LED stays lit even when the next
## command string comes in.  For example, if reading 1 determines that 4 LED's
## should be lit, then reading 2 calls for 3 LED's, you wouldn't be able to see that
## because all 4 LED's were still illuminated from the previous cycle.
##
## To account for this, we send an instruction to "illuminate" all 12 LED's with
## the color Black before sending the actual value desired.  This is done by
## assigning a value of 'ic' to the variable clearLED.  I've also added some logic
## at the end of the infinite while loop that says don't send any instructions
## unless there's been a change since the last one.  This gets rid of the blinking
## effect that I was seeing on every update- rather annoying!
##

##
## I'm using the subprocess library for now unless I can get the native Python ping library
## to do it for me.  If stdout is null for a given target, return 0.
##
def doPing(host):
    pingOutput = subprocess.Popen(["ping -c 1 -w 1 " + host + " | grep rtt | awk -F/ '{print $5}' | awk -F. '{print $1}'"], stdout=subprocess.PIPE, shell=True)
    (out, err) = pingOutput.communicate()
    if (out.rstrip('\n') == ''):
        return 0
    else:
        return out.rstrip('\n')

##
## Get average latency from all of the ping targets. Had to cast the output of
## doPing() into an integer to be able to do math against it
##
while True:
    count=0
    for x in ping_targets.split():
        retVal = int(doPing(x))
        #print "latency = [{0}]".format(retVal)
        # print "type = [{0}]".format(type(retVal))
        if (retVal > 0):
            latency += retVal
            count+=1

    ##
    ## If count is zero, that means we were not able to successfully ping
    ## any of the targets and we should start incrementing the failure count.
    ## Furthermore, if we have been incrementing failLevel and we are now
    ## able to ping, reset the failLevel back to 0 at that time.
    ##
    if (count == 0):
        # Increase failure level
        #print "Failed to ping any host"
        failLevel += 1
        if (failLevel > 4):
            failLevel = 4

    else:
        latency=(latency/count)
        failLevel = 0

    ##
    ## Set LEDStatus to the appropriate value based on latency and failure count
    ##

    #print "Average Latency = [{0}]".format(latency)

    if (latency > 1) and (latency <= 10):
        #print "1-10"
        LEDStatus = clearLED + alertLevel[failLevel] + "1"
    elif (latency >= 11) and (latency <= 20):
        #print "11-20"
        LEDStatus = clearLED + alertLevel[failLevel] + "2"
    elif (latency >= 21) and (latency <= 30):
        #print "21-30"
        LEDStatus = clearLED + alertLevel[failLevel] + "3"
    elif (latency >= 31) and (latency <= 40):
        #print "31-40"
        LEDStatus = clearLED + alertLevel[failLevel] + "4"
    elif (latency >= 41) and (latency <= 50):
        #print "41-50"
        LEDStatus = clearLED + alertLevel[failLevel] + "5"
    elif (latency >= 51) and (latency <= 60):
        #print "51-60"
        LEDStatus = clearLED + alertLevel[failLevel] + "6"
    elif (latency >= 61) and (latency <= 70):
        #print "61-70"
        LEDStatus = clearLED + alertLevel[failLevel] + "7"
    elif (latency >= 71) and (latency <= 80):
        #print "71-80"
        LEDStatus = clearLED + alertLevel[failLevel] + "8"
    elif (latency >= 81) and (latency <= 90):
        #print "81-90"
        LEDStatus = clearLED + alertLevel[failLevel] + "9"
    elif (latency >= 91) and (latency <= 100):
        #print "91-100"
        LEDStatus = clearLED + alertLevel[failLevel] + "a"
    else:
        #print "latency greater than 100"
        LEDStatus = clearLED + alertLevel[failLevel] + "c"

    ##
    ## If the latency is within a different range than the last iteration, send
    ## the command to update the LED count on the NeoPixel.  Otherwise you get
    ## a rather annoying blinking effect as the LED's are updated even if it's the
    ## same measurement as the last time.
    ##
    if (LEDStatus != lastLEDStatus):
        port.write(LEDStatus)
        lastLEDStatus = LEDStatus

    #time.sleep(5)
    #print LEDStatus
    latency = 0

I left the debugging code in the script; uncomment those print statements and watch the terminal as the script runs to see what’s going on. Most of the script is fairly straightforward, so I won’t dwell too much on explaining it step by step.
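One note on the long if/elif ladder that picks the LED count: since each LED covers a 10 ms bucket, the same mapping can be computed arithmetically. A possible simplification (my sketch; `led_code` is not part of the original script):

```python
def led_code(latency):
    """Map average latency (ms) to the count character the Arduino expects."""
    if latency <= 1 or latency > 100:
        return "c"                    # same as the else branch: light all 12
    bucket = (latency - 1) // 10      # 2-10 -> 0, 11-20 -> 1, ... 91-100 -> 9
    return "123456789a"[bucket]
```

The assignment then collapses to `LEDStatus = clearLED + alertLevel[failLevel] + led_code(latency)`.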

Now, onto the Arduino code. I’m using the Arduino basically as a driver for the NeoPixel. Again- I could have probably just used the Pi by itself, but what fun would that be?

#include <Adafruit_NeoPixel.h>

//
// Internet Ping Meter v1.0 - Eric Steed
//
// 01/03/17 - first version - EPS
//
// Set up variables
byte leds = 0;
uint8_t delayVal = 30;

// Set the PIN number that the NeoPixel is connected to
#define PIN   7

// How bright are the LED's (0-255)
#define INTENSITY 60

// Set color to Green to start
uint8_t  r = 0;
uint8_t  g = INTENSITY;
uint8_t  b = 0;

// Set the number of pixels on the NeoPixel
#define NUMPIXELS   12

// When we setup the NeoPixel library, we tell it how many pixels, and which pin to use to send signals.
Adafruit_NeoPixel pixels = Adafruit_NeoPixel(NUMPIXELS, PIN, NEO_GRB + NEO_KHZ800);

// Initialize everything and prepare to start
void setup()
{
  uint8_t i;

  // Set up the serial port for communication
  Serial.begin(9600);
  Serial.println("Started Serial Monitor");

  // This initializes the NeoPixel library.
  pixels.begin();

  // This sets all the pixels to "off"
  for (i = 0; i < NUMPIXELS; i++) {
    pixels.setPixelColor(i, pixels.Color(0, 0, 0));
    pixels.show();
  }

  // Cycle each pixel through the primary colors to make sure they work, then turn them all off
  // Red
  for (i = 0; i < NUMPIXELS; i++) {
    pixels.setPixelColor(i, pixels.Color(INTENSITY, 0, 0));
    pixels.show();
    delay(delayVal);
  }

  // Green
  for (i = 0; i < NUMPIXELS; i++) {
    pixels.setPixelColor(i, pixels.Color(0, INTENSITY, 0));
    pixels.show();
    delay(delayVal);
  }

  // Blue
  for (i = 0; i < NUMPIXELS; i++) {
    pixels.setPixelColor(i, pixels.Color(0, 0, INTENSITY));
    pixels.show();
    delay(delayVal);
  }

  // White
  for (i = 0; i < NUMPIXELS; i++) {
    pixels.setPixelColor(i, pixels.Color(INTENSITY, INTENSITY, INTENSITY));
    pixels.show();
    delay(delayVal);
  }

  // Turn off all LED's
  for (i = 0; i < NUMPIXELS; i++) {
    pixels.setPixelColor(i, pixels.Color(0, 0, 0));
    pixels.show();
  }
}

// Main loop
//
// When sending LED signals, send the color code first, then the number of LED's to
// turn on.  For example 6 Green LED's would be h6, 11 Red LED's would be db, all
// 12 LED's to Black would be ic
void loop()
{
  uint8_t i;
  if (Serial.available())
  {
    char ch = Serial.read();
    // Serial.print("ch = ");
    // Serial.println(ch);
    int led = ch - '0';

    // Serial.print("led = ");
    // Serial.println(led);

    // Set Color of LED based on how many fails in a row
    //RED = 52(d)
    //ORANGE = 53(e)
    //YELLOW = 54(f)
    //YELLOW-GREEN = 55(g)
    //GREEN = 56(h)
    //BLACK = 57(i)

    switch (led) {
      // Set color to RED
      case 52: {
          r = INTENSITY;
          g = 0;
          b = 0;
        }
        break;

      // Set color to ORANGE
      case 53: {
          r = INTENSITY;
          g = (INTENSITY / 2);
          b = 0;
        }
        break;

      // Set color to YELLOW
      case 54: {
          r = INTENSITY;
          g = INTENSITY;
          b = 0;
        }
        break;

      // Set color to YELLOW-GREEN
      case 55: {
          r = (INTENSITY / 2);
          g = INTENSITY;
          b = 0;
        }
        break;

      // Set color to GREEN
      case 56: {
          r = 0;
          g = INTENSITY;
          b = 0;
        }
        break;

      // Set color to BLACK
      case 57: {
          r = 0;
          g = 0;
          b = 0;
        }
        break;

      // To save on code, if we receive a 0 through a 9, turn on that
      // number of LED's
      case 0 ... 9:
        for (i = 0; i < led; i++) {
          pixels.setPixelColor(i, pixels.Color(r, g, b));
          pixels.show();
        }
        break;

      // If we receive an "a", turn on 10 LED's
      case 49:
        for (i = 0; i < 10; i++) {
          pixels.setPixelColor(i, pixels.Color(r, g, b));
          pixels.show();
        }
        break;

      // If we receive a "b", turn on 11 LED's
      case 50:

        for (i = 0; i < 11; i++) {
          pixels.setPixelColor(i, pixels.Color(r, g, b));
          pixels.show();
        }
        break;

      // If we receive a "c", turn on 12 LED's
      case 51:

        for (i = 0; i < 12; i++) {
          pixels.setPixelColor(i, pixels.Color(r, g, b));
          pixels.show();
        }
        break;

      // For testing, insert a delay if we see a ,
      case -4:
        delay(delayVal * 10);
        break;

      default:
        // if nothing else matches, do the default
        // default is optional
        break;
    }
    // I had to add this bit of code to fix a problem where the Arduino buffer
    // apparently filled up after a very short time.  It would set the LED's on
    // but then pause for 2-3 seconds before it would receive the next command.
    // This tells the Arduino to flush out the buffer immediately.
    Serial.flush();
  }
}

If it’s not already evident, I’m not very adept at either Python or Arduino coding; I’m just starting out. The most frustrating thing for me is stumbling over syntax issues. Nine times out of ten I know it’s possible to do something, but I just can’t get the syntax right or use the correct modules. All this comes with time, so maybe in a year this code would be half or a third the size it is right now.

Once you have everything installed and tested (you can turn the LED’s on and off), you have to connect the Pi to your network. I would consider this a single-purpose device and not put anything else on it that could interfere with the script and its timing. They’re cheap enough that you should be able to justify this.

 

You can find this code on Github at https://github.com/esteed/Internet-Ping-Meter.  Please feel free to make modifications and generate a pull request- I’m always looking for a better mousetrap!

 

I hope this has helped you even a little bit. I had a great time setting it up and I look forward to making enhancements. The first one will be to indicate current upstream and downstream throughput using white and blue LED’s basically overlaid on the top of the latency indicators. Wish me luck!!

Internet Ping Meter (part 1 of 2)

THE INTERNET IS DOWN!!

How many of you “home IT support technicians” have heard this before?  I hear it a lot, so I decided to create a device that notifies me visually when problems occur.  Sometimes it winds up being a flaky wifi router that either reboots itself or just needs to take a breath.  Other times it’s our Comcast connection, in which case I can’t do anything other than call and report an outage.  The kids seem to have a hard time understanding that, even though I’ve explained it to them a hundred times.

A little background on the reason for this project: at the company I work for, we employ a WAN load balancer which uses a series of pings to major internet presences such as Google, AT&T or OpenDNS servers.  Basically, the device pings each of those addresses once per second and, based on specific criteria, can determine whether one of the two internet connections is down and take appropriate action.

This is what made me decide to develop my version of the ping meter.  There are a number of projects like this for the Raspberry Pi that involve some sort of visual representation.  I wanted to put together a project that incorporated the Raspberry Pi, an Arduino board and a NeoPixel ring.  This was mainly a project for me to learn how to integrate multiple devices.  Honestly, I probably could have done this without the Arduino, but I wanted to challenge myself a little.

At this point, I have the device working the way I want.  My next challenge is to package it into something more aesthetically pleasing.  WAF (Wife Acceptance Factor) is an important aspect of any geek project like this if it’s gonna be displayed somewhere visible.  I’m thinking maybe a small picture frame or some sort of glass object that looks nice.

Here is a list of the parts you’ll need:

  • Raspberry Pi (any model should work)
  • Arduino board (I used an UNO but even that is overkill)
  • NeoPixel LED ring (12 LED segments)
  • Micro SD card (at least 4 GB)
  • USB Type A to USB Type B cable (printer/scanner cable)
  • 5V Micro USB power source (an iPad charging brick is perfect)

I haven’t tested using a Pi Zero yet, but I don’t see why it wouldn’t work.  I also have an Arduino Trinket (5V version) that I’m trying to use for this, however out of the box it doesn’t support serial communication.  For size reasons, that combination would be perfect for just about any implementation where room is an issue.  You could just as easily use a larger NeoPixel ring or even a strip with some very minor code modifications.

There are two programs that make this system work.  One is the “firmware” that you load onto the Arduino board itself.  The other is the python script that runs on the Pi.  Basically, I use the Pi to ping 3 different IP addresses and use the NeoPixel ring to display the average ping latency in LED segments.  If I can’t ping all three, then I start to progressively change the color of the LEDs from green to red.  Throughout this project I learned a lot about programming in python, Arduino and interacting with external physical devices.  I first started by just getting the LEDs to turn on and off.  I borrowed a lot of code from examples and implemented the same routines to get the NeoPixel to do what I wanted.
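The display logic described above can be sketched in a few lines of Python. This is a hypothetical reconstruction, not the actual project code: the function name, the thresholds and the 200 ms full-scale value are all my assumptions.

```python
# Hypothetical sketch of the Pi-side logic: average the latency of three
# pings, light a proportional number of the 12 NeoPixel segments, and
# shift the color from green toward red as pings fail.  Thresholds and
# names are assumptions, not the real Internet-Ping-Meter code.

NUM_LEDS = 12
MAX_LATENCY_MS = 200.0  # latency that lights the full ring (assumed)

def ring_state(latencies_ms):
    """latencies_ms: one entry per target host; None means the ping failed."""
    ok = [l for l in latencies_ms if l is not None]
    failures = len(latencies_ms) - len(ok)
    if not ok:
        return NUM_LEDS, (255, 0, 0)          # all hosts down: solid red
    avg = sum(ok) / len(ok)
    lit = max(1, min(NUM_LEDS, round(avg / MAX_LATENCY_MS * NUM_LEDS)))
    # blend green -> red in proportion to how many hosts failed
    frac = failures / len(latencies_ms)
    color = (int(255 * frac), int(255 * (1 - frac)), 0)  # (R, G, B)
    return lit, color

print(ring_state([20.0, 25.0, 30.0]))   # low latency, all hosts up
print(ring_state([100.0, None, None]))  # two of three hosts down
```

The real script would feed `ring_state` with results from a ping library or a `ping -c 1` subprocess and push the colors to the NeoPixel over serial to the Arduino.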

I tried to sprinkle comments throughout the code to explain what I’m doing and why.  Most of these were added after I made a breakthrough in something that had been kicking my ass for a while, so I would know how to fix the problem the next time around.  I won’t focus a lot on how to install the OS on your Pi or how to download code to the Arduino- there are a LOT of helpful resources on the internet that can walk you through it.  Also, in the spirit of this being a learning exercise for me, I think it’s valuable for someone starting out fresh to do the research and have a basic understanding of what’s going on rather than just copying and pasting code.  If you’re trying to put this together and run into problems, feel free to comment on the article and I’ll do my best to answer questions.

In the next article, I’ll show you the code and how it all works.  Stay tuned!

Using VVOLs with vSphere 6 and Nimble

VMware Virtual Volumes (VVOLs) represent a major paradigm shift from the way storage is used in VMware today.

Below is a short 5 minute video that explains the basic concept of VVOLs.

 

Additionally, knowing the difference between communicating with LUNs as in the old world and communicating with PEs (Protocol Endpoints) is crucial to understanding what VVOLs brings to the table and why.

In short, a PE is actually a special LUN on the storage array that the ESXi server uses to communicate with the array.  It’s not a LUN in the traditional sense, but more like a logical gateway to talk to the array.  I would say in some ways it’s similar in function to a gatekeeper LUN on an EMC array.  That LUN in turn maps to multiple sub-LUNs that make up the VM’s individual storage-related components (.vmdk, .vswp, .vmsd, .vmsn, etc.).  When the host wants to talk to a LUN, it sends the request to the address of the PE “LUN” with an offset address of the actual LUN on the storage array.  Two things immediately came to mind once I understood this concept:

  1. Since all communication related to the sub-volumes is a VASA function, what happens when vCenter craps the bed?
  2. If I only have 1 PE, isn’t that going to be a huge bottleneck for storage I/O?

The answers to these and other questions are handily dealt with in a post here by VMware vExpert Paul Meehan.  Again- the short version is that vCenter is not needed after the host boots and gets information on PEs and address offsets.  When it IS needed, however, is during a host boot.  Secondly, I/O traffic actually goes through the individual volumes, not the PE.  Remember, the PE is a logical LUN that serves as a map to the actual volumes underneath.
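As a purely conceptual sketch (not real VASA or SCSI code), you can think of a PE as a lookup table: the host addresses the one PE LUN plus an offset, and the array resolves that to the per-VM volume underneath. All names and offsets here are invented for illustration.

```python
# Conceptual illustration only: a Protocol Endpoint behaves like a routing
# table that maps "PE + offset" onto the VM's individual storage objects.
# Nothing here models the actual VASA protocol.

class ProtocolEndpoint:
    def __init__(self, lun_id):
        self.lun_id = lun_id   # the one "real" LUN the host sees
        self.offsets = {}      # offset -> backing volume on the array

    def bind(self, offset, volume):
        self.offsets[offset] = volume

    def route(self, offset):
        # the host sends I/O to the PE's address plus an offset; the
        # array resolves that to the actual per-VM volume underneath
        return self.offsets[offset]

pe = ProtocolEndpoint(lun_id=0)
pe.bind(0x100, "vm01.vmdk")   # made-up offsets and volume names
pe.bind(0x101, "vm01.vswp")
print(pe.route(0x100))
```

This is also why the PE itself is not an I/O bottleneck: it only resolves the destination, and the data path goes to the individual volumes.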

This brings me to the next video- understanding PEs.  This link starts about 12 minutes into an hour-long presentation where PEs are discussed.  Feel free to watch the entire video if you want to learn more!

 

Finally, let’s walk through how to set up VVOLs on your Nimble array.  There are a few prerequisites before you can start:

  • NOS version 3.x or newer
  • vSphere 6.x or newer

Here’s the step by step process:

  1. Connect to web interface of local array
  2. Click on Administration -> VMware integration
  3. Fill in the following information
    • vCenter Name (this can be a vanity name- doesn’t have to be the address of the host)
    • choose the proper subnet on your Nimble array to communicate with vCenter
    • vCenter Host (FQDN or IP address)
    • Credentials
    • Check Web Client and VASA Provider
    • Click Save (This registers vCenter with the storage array and installs the VASA 2.0 provider)
  4. Navigate to Manage -> Storage Pools
  5. Select the Pool in which you want to create the VVOLs (for most sites this will be default)
  6. Click New Folder
  7. Change the Management Type to VMware Virtual Volumes
  8. Give the folder a Name and Description
  9. Set the size of the folder
  10. Choose the vCenter that you registered above, then click Create

Now you have a storage container on the Nimble array that you can use to create VVOLs.  Let’s look at the VMware side now:

  1. Connect to the vSphere web client for your vCenter 6 instance (this will not work with the thick client)
  2. Navigate to Storage and highlight your ESX server
  3. Click on Datastores on the right side of the window
  4. Click on the icon to create a new datastore
  5. Select your location (datacenter) then click next
  6. Select type VVOL then click next
  7. You should see at least one container- click next.  If not, try rescanning your HBAs in the web client and start again
  8. Assign which host(s) will need access to the VVOL and click next
  9. On the summary screen- click finish

You should now see a new datastore.  Now let’s create a VM in the datastore and see what it looks like in the Nimble web interface!

  1. In vCenter, navigate to hosts and clusters
  2. Right click on your host to create a new virtual machine
  3. Click next under creation type to create a new virtual machine
  4. Give the VM a name, select the location where it should be created and click next
  5. Select the VVOL no requirements policy under VM storage policy
  6. Select the VVOL datastore that is compatible and click next
  7. Select ESXi 6.0 and later under the VM compatibility drop-down and click next
  8. Choose the appropriate guest OS family and version then click next
  9. Adjust the virtual hardware to meet your needs and click next
  10. At the summary screen, verify all settings are correct and click Finish

Now if you navigate to Manage volumes in your Nimble web interface, you will see multiple volumes for each VM you created.  Instead of putting all the .vmdk, .vmx, .vswp and other files inside a single datastore on a single LUN, each object is its own volume.  This is what allows you to set performance policies on a per-VM basis, because each volume can be treated differently.  You can set a high performance policy on your production VMs and a low one on dev/test, for example.  Normally you would have to split your VMs into separate datastores and manage the performance policies at the datastore level.  The problem with this is that you still have no visibility into each VM in that datastore at the storage layer.  With VVOLs, you can see latency, throughput and even noisy neighbor information on a per-VM basis in the Nimble web interface!

 

Windows Wifi troubleshooting tools

Have you ever tried connecting your laptop to a Wifi network and for one reason or another it failed?  It can be extremely frustrating, even to a seasoned vet who knows their way around Windows.  The big problem is that you get virtually no information from the connection failure.  No logs, no error codes, nothing.

There is a reporting tool built into Windows 8 and newer that will compile a very detailed report showing each connection attempt, its status and a ton of other stuff.  Here’s how to run the report and where it gets put:

 

  • Open a command prompt as administrator
  • Run the following command
    • netsh wlan show wlanreport
  • Note the path where the html file is generated.  It should be C:\ProgramData\Microsoft\Windows\WlanReport\wlan-report-latest.html
  • Open your favorite web browser and point it to that file.  Voila!

 

Here’s a snippet of what some of the report looks like:

[Screenshot: WLAN report session list]

 

There is a LOT more information below this including extremely detailed information about all the network adapters on the system, the system itself and some script output.  If you click on the items in the session list above, it will bring you to a detailed log of that session and why it was able to or not able to connect.

 

Suffice it to say this is an invaluable tool to review logs and information all in one place.

ODA Patching – get ahead of yourself?

I was at a customer site deploying an X5-2 ODA.  They are standardizing on the 12.1.2.6.0 patch level.  Even though 12.1.2.7.0 is currently the latest, they don’t want to be on the bleeding edge.  Recall that the 12.1.2.6.0 patch doesn’t include infrastructure patches (mostly firmware), so you have to install 12.1.2.5.0 first, run the --infra patch to get the firmware and then update to 12.1.2.6.0.

 

We unpacked the 12.1.2.5.0 patch on both systems and then had an epiphany.  Why don’t we just unpack the 12.1.2.6.0 patch as well and save some time later?  What could possibly go wrong?  Needless to say, when we went to install or even verify the 12.1.2.5.0 patch it complained as follows:

ERROR: Patch version must be 12.1.2.6.0

 

Ok, so there has to be a way to clean that patch off the system so I can use 12.1.2.5.0 right?  I stumbled across the oakcli manage cleanrepo command and thought for sure that would fix things up nicely.  Ran it and I got this output:

 


[root@CITX-5ODA-ODABASE-NODE0 tmp]# oakcli manage cleanrepo --ver 12.1.2.6.0
Deleting the following files...
Deleting the files under $OAK_REPOS_HOME/pkgrepos/orapkgs/OAK/12.1.2.6.0/Base
Deleting the files under /DOM0OAK/12.1.2.6.0/Base
Deleting the files under $OAK_REPOS_HOME/pkgrepos/thirdpartypkgs/Firmware/Disk/Seagate/ST95000N/SF04/Base
Deleting the files under $OAK_REPOS_HOME/pkgrepos/thirdpartypkgs/Firmware/Disk/Seagate/ST95001N/SA03/Base
Deleting the files under $OAK_REPOS_HOME/pkgrepos/thirdpartypkgs/Firmware/Disk/WDC/WD500BLHXSUN/5G08/Base
Deleting the files under $OAK_REPOS_HOME/pkgrepos/thirdpartypkgs/Firmware/Disk/HGST/H101860SFSUN600G/A770/Base
Deleting the files under $OAK_REPOS_HOME/pkgrepos/thirdpartypkgs/Firmware/Disk/Seagate/ST360057SSUN600G/0B25/Base
Deleting the files under $OAK_REPOS_HOME/pkgrepos/thirdpartypkgs/Firmware/Disk/HITACHI/H106060SDSUN600G/A4C0/Base
Deleting the files under $OAK_REPOS_HOME/pkgrepos/thirdpartypkgs/Firmware/Disk/HITACHI/H109060SESUN600G/A720/Base
Deleting the files under $OAK_REPOS_HOME/pkgrepos/thirdpartypkgs/Firmware/Disk/HITACHI/HUS1560SCSUN600G/A820/Base
Deleting the files under $OAK_REPOS_HOME/pkgrepos/thirdpartypkgs/Firmware/Disk/HGST/HSCAC2DA6SUN200G/A29A/Base
Deleting the files under $OAK_REPOS_HOME/pkgrepos/thirdpartypkgs/Firmware/Disk/HGST/HSCAC2DA4SUN400G/A29A/Base
Deleting the files under $OAK_REPOS_HOME/pkgrepos/thirdpartypkgs/Firmware/Disk/STEC/ZeusIOPs-es-G3/E12B/Base
Deleting the files under $OAK_REPOS_HOME/pkgrepos/thirdpartypkgs/Firmware/Disk/STEC/Z16IZF2EUSUN73G/9440/Base
Deleting the files under $OAK_REPOS_HOME/pkgrepos/thirdpartypkgs/Firmware/Expander/ORACLE/DE2-24P/0018/Base
Deleting the files under $OAK_REPOS_HOME/pkgrepos/thirdpartypkgs/Firmware/Expander/ORACLE/DE2-24C/0018/Base
Deleting the files under $OAK_REPOS_HOME/pkgrepos/thirdpartypkgs/Firmware/Expander/ORACLE/DE3-24C/0291/Base
Deleting the files under $OAK_REPOS_HOME/pkgrepos/thirdpartypkgs/Firmware/Controller/LSI-es-Logic/0x0072/11.05.03.00/Base
Deleting the files under $OAK_REPOS_HOME/pkgrepos/thirdpartypkgs/Firmware/Controller/LSI-es-Logic/0x0072/11.05.03.00/Base
Deleting the files under $OAK_REPOS_HOME/pkgrepos/thirdpartypkgs/Firmware/Ilom/SUN/X4370-es-M2/3.0.16.22.f-es-r100119/Base
Deleting the files under $OAK_REPOS_HOME/pkgrepos/thirdpartypkgs/Firmware/Disk/HITACHI/H109090SESUN900G/A720/Base
Deleting the files under $OAK_REPOS_HOME/pkgrepos/thirdpartypkgs/Firmware/Disk/STEC/Z16IZF4EUSUN200G/944A/Base
Deleting the files under $OAK_REPOS_HOME/pkgrepos/thirdpartypkgs/Firmware/Disk/HGST/H7240AS60SUN4.0T/A2D2/Base
Deleting the files under $OAK_REPOS_HOME/pkgrepos/thirdpartypkgs/Firmware/Disk/HGST/H7240B520SUN4.0T/M554/Base
Deleting the files under $OAK_REPOS_HOME/pkgrepos/thirdpartypkgs/Firmware/Disk/HGST/H7280A520SUN8.0T/P554/Base
Deleting the files under $OAK_REPOS_HOME/pkgrepos/thirdpartypkgs/Firmware/Expander/SUN/T4-es-Storage/0342/Base
Deleting the files under $OAK_REPOS_HOME/pkgrepos/thirdpartypkgs/Firmware/Controller/LSI-es-Logic/0x0072/11.05.03.00/Base
Deleting the files under $OAK_REPOS_HOME/pkgrepos/thirdpartypkgs/Firmware/Controller/LSI-es-Logic/0x005d/4.230.40-3739/Base
Deleting the files under $OAK_REPOS_HOME/pkgrepos/thirdpartypkgs/Firmware/Controller/LSI-es-Logic/0x0097/06.00.02.00/Base
Deleting the files under $OAK_REPOS_HOME/pkgrepos/thirdpartypkgs/Firmware/Controller/Mellanox/0x1003/2.11.1280/Base
Deleting the files under $OAK_REPOS_HOME/pkgrepos/thirdpartypkgs/Firmware/Ilom/SUN/X4170-es-M3/3.2.4.26.b-es-r101722/Base
Deleting the files under $OAK_REPOS_HOME/pkgrepos/thirdpartypkgs/Firmware/Ilom/SUN/X4-2/3.2.4.46.a-es-r101689/Base
Deleting the files under $OAK_REPOS_HOME/pkgrepos/thirdpartypkgs/Firmware/Ilom/SUN/X5-2/3.2.4.52-es-r101649/Base
Deleting the files under $OAK_REPOS_HOME/pkgrepos/orapkgs/HMP/2.3.4.0.1/Base
Deleting the files under /DOM0HMP/2.3.4.0.1/Base
Deleting the files under $OAK_REPOS_HOME/pkgrepos/orapkgs/IPMI/1.8.12.4/Base
Deleting the files under /DOM0IPMI/1.8.12.4/Base
Deleting the files under /JDK/1.7.0_91/Base
Deleting the files under $OAK_REPOS_HOME/pkgrepos/orapkgs/ASR/5.3.1/Base
Deleting the files under $OAK_REPOS_HOME/pkgrepos/orapkgs/OPATCH/12.1.0.1.0/Patches/6880880
Deleting the files under $OAK_REPOS_HOME/pkgrepos/orapkgs/OPATCH/12.0.0.0.0/Patches/6880880
Deleting the files under $OAK_REPOS_HOME/pkgrepos/orapkgs/OPATCH/11.2.0.4.0/Patches/6880880
Deleting the files under $OAK_REPOS_HOME/pkgrepos/orapkgs/GI/12.1.0.2.160119/Patches/21948354
Deleting the files under $OAK_REPOS_HOME/pkgrepos/orapkgs/DB/12.1.0.2.160119/Patches/21948354
Deleting the files under $OAK_REPOS_HOME/pkgrepos/orapkgs/DB/11.2.0.4.160119/Patches/21948347
Deleting the files under $OAK_REPOS_HOME/pkgrepos/orapkgs/DB/11.2.0.3.15/Patches/20760997
Deleting the files under $OAK_REPOS_HOME/pkgrepos/orapkgs/DB/11.2.0.2.12/Patches/17082367
Deleting the files under $OAK_REPOS_HOME/pkgrepos/orapkgs/OEL/6.7/Patches/6.7.1
Deleting the files under $OAK_REPOS_HOME/pkgrepos/orapkgs/OVM/3.2.9/Patches/3.2.9.1
Deleting the files under $OAK_REPOS_HOME/pkgrepos/orapkgs/OVS/12.1.2.6.0/Base
Deleting the files under $OAK_REPOS_HOME/pkgrepos/thirdpartypkgs/Firmware/Controller/LSI-es-Logic/0x0072/11.05.02.00/Base
Deleting the files under $OAK_REPOS_HOME/pkgrepos/thirdpartypkgs/Firmware/Controller/LSI-es-Logic/0x0072/11.05.02.00/Base
Deleting the files under $OAK_REPOS_HOME/pkgrepos/orapkgs/GI/12.1.0.2.160119/Base

 

So I assumed that this fixed the problem.  Nope…

 


[root@CITX-5ODA-ODABASE-NODE0 tmp]# oakcli update -patch 12.1.2.5.0 --verify

ERROR: Patch version must be 12.1.2.6.0

 

 

Ok, so more searching of the CLI manual and the oakcli help pages came up with bupkiss.  So I decided to do an strace of the oakcli command I had just run.  As usual, there was a LOT of garbage I didn’t care about or didn’t know what it was doing.  I did find, however, that it was reading the contents of a file that looked interesting to me:

 


[pid 5509] stat("/opt/oracle/oak/pkgrepos/System/VERSION", {st_mode=S_IFREG|0777, st_size=19, ...}) = 0
[pid 5509] open("/opt/oracle/oak/pkgrepos/System/VERSION", O_RDONLY) = 3
[pid 5509] read(3, "version=12.1.2.6.0\n", 8191) = 19
[pid 5509] read(3, "", 8191) = 0
[pid 5509] close(3) = 0
[pid 5509] fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
[pid 5509] mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f159799d000
[pid 5509] write(1, "\n", 1
) = 1
[pid 5509] write(1, "ERROR: Patch version must be 12."..., 40ERROR: Patch version must be 12.1.2.6.0
) = 40
[pid 5509] exit_group(0) = ?

 

There were a dozen or so lines after that, but I had what I needed.  Apparently /opt/oracle/oak/pkgrepos/System/VERSION contains the version of the latest patch that has been unpacked.  The system software version is kept somewhere else, because after I unpacked the 12.1.2.6.0 patch, I ran an oakcli show version and it reported 12.1.2.5.0.  But the VERSION file referenced earlier said 12.1.2.6.0.  I assume when I unpacked the 12.1.2.6.0 patch, it updated this file.  So what I wound up doing is changing the VERSION file back to 12.1.2.5.0 as well as deleting the folder /opt/oracle/oak/pkgrepos/System/12.1.2.6.0.  Once I did this, everything worked as I expected.  I was able to verify and install the --infra portion of 12.1.2.5.0 and continue on my merry way.
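The manual fix above can be scripted. This is an unsupported sketch that just mimics the hand edits described (rewrite VERSION, remove the unpacked patch directory); the function name is mine, and the repo root is parameterized so you can rehearse it against a scratch tree before touching /opt/oracle/oak/pkgrepos.

```python
# Unsupported sketch of the manual rollback: point VERSION back at the
# patch level you actually want and remove the newer unpacked patch
# directory.  This mimics hand edits, not an oakcli operation.
import shutil
import tempfile
from pathlib import Path

def roll_back_unpacked_patch(repo_root, old_ver, new_ver):
    system = Path(repo_root) / "System"
    version_file = system / "VERSION"
    # keep a backup of the original VERSION file before rewriting it
    shutil.copy2(version_file, version_file.with_suffix(".bak"))
    version_file.write_text("version=%s\n" % old_ver)
    unpacked = system / new_ver
    if unpacked.is_dir():
        shutil.rmtree(unpacked)
    return version_file.read_text()

# Rehearse on a scratch tree instead of the real repo:
root = Path(tempfile.mkdtemp())
(root / "System" / "12.1.2.6.0").mkdir(parents=True)
(root / "System" / "VERSION").write_text("version=12.1.2.6.0\n")
print(roll_back_unpacked_patch(root, "12.1.2.5.0", "12.1.2.6.0"))
```

Only after the rehearsal looks right would you run it with `repo_root` set to the real pkgrepos path, and even then a manual backup of the whole System directory first is cheap insurance.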

 

This highlights the fact that there isn’t a known way (to me at least) to delete an unpacked patch via oakcli or any python scripts I’ve been able to find yet.  Also- as an aside, I tried just deleting the VERSION file assuming it would be rebuilt by oakcli, but it wasn’t.  I got this:

 


[root@CITX-5ODA-ODABASE-NODE0 System]# oakcli update -patch 12.1.2.5.0 --verify
ERROR : Couldn't find the VERSION file to extract the current allowed version

 

So I just recreated the file and all was good.  I was hoping that the oak software didn’t maintain some sort of binary formatted database that kept track of all this information- I think I got lucky in this case.  Hope this helps someone out in a pinch!

Troubleshooting ODA Network connectivity

Setting up an ODA in a customer’s environment can either go very well or give you lots of trouble.  It all depends on having your install checklist completed, reviewed by the customer and any questions answered ahead of time.

 

I’ve installed dozens of ODAs in a variety of configurations, ranging from a simple bare metal install to a complex virtualized install with multiple VMs and networks.  Now understand that I’m not a network engineer, nor do I play one on TV, but I know enough about networking to have a civil conversation with a 2nd level network admin without getting too far out of my comfort zone.  Knowing this, I can certainly appreciate the level of complexity involved in configuring and supporting an enterprise grade network.

 

Having said that, I find that when there are issues with a deployment, whether it’s an ODA, ZFS appliance, Exadata or other device, at least 80% of the time a network misconfiguration is the culprit.  I can’t tell you how many times I’ve witnessed misconfigurations where the network admin swore up and down that everything was set correctly but in fact was wrong.  It usually involves checking, re-checking and checking yet again to finally uncover the culprit.  Below, I’ll outline some of the snafus I’ve been involved with and the troubleshooting that can help resolve the issue.

 


 

  • Cabling: Are you sure the cables are all plugged into the right place?

If you didn’t personally cable the ODA and you’re having network issues, don’t go too long without validating the cable configuration yourself.  In this case, the fancy setup charts are a lifesaver!  On the X5-2 ODAs for example, the InfiniBand private interconnect is replaced by the 10gb fiber ethernet option if the customer needs 10gb ethernet over fiber.  There is only one expansion slot available, so unfortunately it’s either/or.  As a result, the private interconnect is then facilitated by net0 and net1 with crossover cables (green and yellow) between the two compute nodes instead of the InfiniBand cables.  This can be missed very easily.  Also make sure the storage cables are all connected to the proper ports for your configuration- whether it’s one storage shelf or two.  Cabling mistakes will typically be caught shortly after deploying the OS image, whether it’s virtualized or bare metal- there’s a storagetopology check that runs during the install process that will catch most of them- but best not to chance it.

  • Switch configuration: Trunk port vs. Access port

When you configure a switch port, you need to tell the switch what kind of traffic will pass through that port.  One of the important items is which network(s) the server attached to this port needs to talk on.  If you’re configuring a standalone physical server, chances are you won’t have a need to talk on more than one VLAN.  In this case, it’s usually appropriate to configure the switch port as an access port.  You can still put the server on a non-default VLAN (a VLAN other than 1), but the VLAN “tags” get stripped off at the switch and the server never sees them.

If however you’re setting up a VMware server or a machine that uses virtualization technology, it’s more likely that the VMs that run on that server may indeed need to talk on more than one VLAN through the same network adapter(s).  In this case, you would need to set the port mode to trunked.  You then need to make sure to assign all the VLANs that the server will need to communicate on to that trunk port.  The server is then responsible for analyzing the VLAN tags and passing the traffic to the appropriate destination on the server.  This is one of the areas where the switch is usually configured incorrectly.  Most of the time, the network engineer fails to configure trunk mode on the port, forgets to assign the proper VLANs to the port, or misconfigures the native VLAN on the port.

There is a difference between the default VLAN and a native VLAN.  The default VLAN is always present and is typically needed for intra-network device communication to take place.  Things like Cisco’s CDP protocol use this VLAN.  The native VLAN, if configured, is treated similarly to an access port from the perspective of the network adapter on the server.  The server NIC does not need a VLAN interface configured on top of it to be able to talk on the native VLAN.  If you want to talk on any other VLAN on this port, however, you would need to configure a VLAN interface on the server to be able to receive those packets.  I’ve not seen the native VLAN used in a lot of configurations where more than one VLAN is needed, but it is most certainly a valid configuration.  Have the network team check these settings and make sure you understand how they apply to your device.

  • Switch configuration: Aggregated ports vs. regular ports

Most switches have the ability to cobble together 2 to as many as 8 ports to provide higher throughput and utilization of the ports as well as redundancy at the same time.  This is referred to in different ways depending on your switch vendor.  Cisco calls it EtherChannel, HP calls it Dynamic LACP trunking, while Extreme Networks refers to it as sharing (LAG).  However you want to refer to it, it’s an implementation of the IEEE 802.3ad standard, commonly referred to as Link Aggregation or LACP (Link Aggregation Control Protocol).  Normally when you want to configure a pair of network interfaces on a server together, it’s usually to provide redundancy and avoid a SPOF (Single Point Of Failure).  I’ll refer to the standard Linux implementation mainly because I’m familiar with the different methods of load balancing that are typically employed.  This isn’t to say that other OSes don’t have this capability (almost all do), I’m just not very experienced with all of them.

Active-Backup (Linux bonding driver mode=1) is a very simple implementation in which a primary interface is used for all traffic until that interface fails.  The traffic then moves over to the backup interface and communication is restored almost seamlessly.  There are other load balancing modes besides this one that don’t require any special configuration on the switch, each with its own strengths and weaknesses.

LACP, which does require a specific configuration on the switch ports involved in order to work, tends to be more performant while still maintaining redundancy.  The main reason for this is that there is out of band communication via the multicast group MAC address (01:80:c2:00:00:02) between the network driver on the server and the switch to keep both partners up to date on the status of the link.  This allows both ports to be utilized with an almost 50/50 split to evenly distribute the load between the totality of all the NICs in the LACP group, effectively doubling (or better) throughput.

The reason I’m talking about this in the first place is because of the configuration that needs to be in place on the switch if you’re to use LACP.  If you configure your network driver for Active-Backup mode but the switch ports are set to LACP, you likely won’t see any packets at all on the server.  Likewise, if you have LACP configured on the server but the switch isn’t properly set up to handle it, you’ll get the same result.  This is another setting that commonly gets misconfigured.  Other parameters such as STP (Spanning Tree Protocol), lacp_rate and passive vs. active LACP are some of the more common misconfigurations.  Also, sometimes the configuration has to be split between two switches (again- no SPOF) and an MLAG configuration needs to be properly set up in order to allow LACP to work between switches.  Effectively, MLAG is one way of making two switches appear as one from a network protocol perspective and is required to span multiple switches within a LACP port group.  The takeaway here is to have the network admin verify their configuration on the switch(es) and ports involved.
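One quick way to check the server side of this mismatch is to read the bonding driver's status out of /proc/net/bonding and compare the reported mode against what the switch ports are configured for. The helper below is a sketch of mine; the parsing follows the standard status format the Linux bonding driver produces.

```python
# Sketch: report which bonding mode a Linux bond is actually running, so
# it can be compared with the switch-side configuration.  The helper name
# is mine; the "Bonding Mode:" line is standard /proc/net/bonding output.

def bond_mode(proc_text):
    for line in proc_text.splitlines():
        if line.startswith("Bonding Mode:"):
            return line.split(":", 1)[1].strip()
    return None

# Sample of what the kernel reports for mode=4 (LACP):
sample = """Ethernet Channel Bonding Driver: v3.7.1
Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2 (0)
"""
print(bond_mode(sample))

# On a live system you would instead read
#   open("/proc/net/bonding/bond0").read()
# and expect "fault-tolerance (active-backup)" for mode=1.
```

If the server says active-backup while the switch ports are in an LACP group (or vice versa), you've found the mismatch described above.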

  • Link speed: how fast can the server talk on the network?

Sometimes a server is capable of communicating at 10gb/s versus the more common 1gb/s either via copper or fiber media (most typically).  It used to be that you had to force switches to talk at 1gb/s in order for the server to negotiate that speed.  This was back when 1gb/s was newer and the handshake protocol that takes place between the NIC and the switch port at connection time was not as mature as it is now.  However, as a holdover from those halcyon days of yore, some network admins are prone to still set port speeds manually rather than letting them auto-negotiate like a good network admin should.  Thus you have servers connecting at 1gb/s when they should be running at 10gb/s.  Again- just something to keep in mind if you’re having speed issues.

  • Cable Quality: what speed is your cable rated at?

There are currently four common ratings for copper ethernet cables.  They are by no means the only ones, but these are the most commonly used in datacenters.  They all have to do with how fast you can send data through the cables.  Cat 5 is capable of transmitting up to 1gb/s.  Cat 5e was an improvement on Cat 5 that introduced enhancements to limit crosstalk (interference) between the 8 strands of a standard ethernet cable.  Cat 6 and 6a are further improvements on those standards, now allowing speeds of up to 10gb/s or more.  Basically, the newer the Cat number/letter, the faster you can safely transmit data without loss or corruption.  The reason I mention this is that I’ve been burned on more than one occasion when using Cat 5 for 1gb/s and had too much crosstalk, which severely limited throughput and resulted in a lot of collisions.  Replacing the cable with a new Cat 5 or higher-rated cable almost always fixed the problem.  If you’re having communication problems, rule this out early on so you’re not chasing your tail in other areas.

  • IP Networking: Ensuring you have accurate network configurations

I’ve had a lot of problems in this area.  The biggest problem seems to be the fact that not all customers have taken the time to review and fill out the pre-install checklist.  This checklist prompts you for all the networking information you’ll need to do the install.  If you’ve been given IP information, before you tear your hair out make sure it’s correct.  I’ve been given multiple configurations at the same customer for the same appliance, and each time there was something critical wrong that kept me from talking on the network.  Configuring VLANs can be especially trying because if you have it wrong, you just won’t see any traffic.  With regular non-VLAN configurations, if you put yourself on the wrong physical switch port or network, you can always sniff the network (tcpdump is now installed as part of the ODA software).  This doesn’t really work with VLAN traffic.  Other things to verify would be your subnet mask and default gateway.  If either of these is misconfigured, you’re gonna have problems.  Also, as I mentioned earlier, don’t make the mistake of assuming you have to create a VLAN interface on the ODA just because you’re connected to a trunked port.  Remember the native VLAN traffic is passed on to the server with the VLAN tags stripped off, so it uses a regular network interface (i.e. net1).
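A small sketch for sanity-checking the IP details on that checklist before you start blaming switches: confirm the address/netmask parse cleanly and that the default gateway actually falls inside the subnet. The helper name and the example values are mine.

```python
# Sketch: validate an address/netmask/gateway triple from an install
# checklist.  Example values are made up; the checks use the stdlib
# ipaddress module.
import ipaddress

def check_ip_config(address, netmask, gateway):
    iface = ipaddress.ip_interface("%s/%s" % (address, netmask))
    gw = ipaddress.ip_address(gateway)
    # a gateway outside the subnet is unreachable without another route
    if gw not in iface.network:
        return "gateway %s is not in subnet %s" % (gw, iface.network)
    return "ok: %s on %s via %s" % (iface.ip, iface.network, gw)

print(check_ip_config("10.1.40.21", "255.255.255.0", "10.1.40.1"))
print(check_ip_config("10.1.40.21", "255.255.255.0", "10.1.41.1"))
```

It won't catch a wrong-but-plausible subnet, but it flags the transposed-octet and fat-fingered-netmask mistakes that eat hours on site.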

These are just some of the pitfalls you may encounter.  I hope some of this has helped!

How to create VLANs in DOM0 on a virtualized ODA


I’ve been working with a local customer the last week or so to help them set up a pair of ODAs in virtualized mode.  In one of the datacenters, they needed it to be on a VLAN- including DOM0.  Normally, I just configure net1 for the customer’s network and I’m off to the races.  In this case, there are a few additional steps we have to do.

First thing you’ll need to do is install the ODA software from the install media.  Once this is done, you need to log into the console since we don’t have any IP information configured yet.  Below is a high level checklist of the steps needed to complete this activity:

 

  • Determine which VLAN DOM0 needs to be on
  • Pick a name for the VLAN interface.  It doesn’t have to be eth2 or anything like that.  I usually go with “VLAN456” if my VLAN ID is 456 so it’s self-descriptive.
  • Run the following command in DOM0 on node 0 (assuming your VLAN ID is 456)

# oakcli create vlan VLAN456 -vlanid 456 -if bond0

 

At this point, you’ll have the following structures in place on each compute node:

[Diagram: bond0 with VLAN interfaces and bridges on each compute node]

 

We now have networking set up so that eth2 and eth3 are bonded together (bond0).  Then we put a VLAN bond interface (bond0.456) on top of the bond pair.  Finally we create a VLAN bridge (VLAN456) that can be used to forward that network into the VM, and also allow DOM0 to talk on that VLAN.  I’ve shown in the example above what it looks like to connect more than one VLAN to a bond pair.  If you need access to both VLANs from within DOM0, then each VLAN interface on each node will need an IP address assigned to it.  You’ll need to rerun configure firstnet for each interface.  Note also that if you need to access more than one VLAN from a bond pair, you’ll need to set the switch ports that eth2 and eth3 are connected to into trunked mode so they can pass more than a single VLAN.  Your network administrator will know what this means.

 

 

After that’s in place, you can continue to deploy ODA_BASE, do a configure firstnet in ODA_BASE (remember to assign the VLAN interface to ODA_BASE), yadda yadda…

 

Then, as you configure ODA_BASE and create your VM(s), the NetBack and NetFront drivers are created that are responsible for plumbing the network into the VM.  Here’s a completed diagram with a VM that has access to both VLANs:

[Diagram: completed configuration showing a VM with access to both VLANs]

 

Happy Hunting!

 

 

UPDATE: The way this customer wound up configuring their switches at the end of the day was to put the ODA and ODA_BASE on the native VLAN.  In this case, even though the switch port is trunked to carry one or more VLANs at a time, the native VLAN traffic is passed untagged down to the server.  This means you do not need a special VLAN interface on the ODA to talk on this network- just use the regular net1 or net2 interface.  If you want to talk on any other VLANs through that switch port, you will need to follow the procedure above and configure a VLAN interface for each of them.

OVM Disaster Recovery In A Box (Part 4 of 5)

Now that you’ve touched a file inside the VM, we have a way to prove that the VM replicated to the other side is actually the one we created.  Apparently in my case, faith is overrated.
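The whole check boils down to two commands- the marker path below matches the file we touch in Part 3:

```shell
# In PROD, before replication: drop a marker file inside the VM
touch /var/tmp/ovmprd1

# In DR, after failover: if the marker survived, this is our replicated VM
test -f /var/tmp/ovmprd1 && echo "same VM - replication worked"
```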

 

Now that I’ve fire-hosed a TON of information at you on how to set up your virtual prod and dr sites, this would be a good breaking point to talk a little about how the network looks from a 10,000 foot view.  Here’s a really simple diagram that should explain how things work.  And when I say simple, we’re talking crayon art here folks.  Really- does anyone have a link to any resources on the web or in a book that could help a guy draw better network diagrams?  Ok- I digress.. here’s the diagram:

OVM DR Network Diagram

 

One of the biggest takeaways from this diagram is something that a LOT of people get confused about.  In OVM DR- you do NOT replicate OVM Manager, the POOL filesystem or the OVM servers on the DR side.  In other words, you don’t replicate the operating environment, only the contents therein (i.e. the VMs via their storage repositories).  You basically have a complete implementation of OVM at each location just as if it were a standalone site.  The only difference is that some of the repositories are replicated.  The only other potential difference (and I don’t show it or deal with it in my simulation) is raw LUNs presented to a VM.  Those would have to be replicated at the storage layer as well.

 

I’ve not bothered to mess up the diagram with the VM or Storage networks- you know they’re there and that they’re serving their purpose.  You can see that replication is configured between the PROD Repo LUN and a LUN in DR.  This would be considered an Active/Passive DR solution.  I don’t show it in this scenario, but you could potentially have some DR-only workloads running at the DR site that aren’t replicated back to PROD.  Now, some companies might have a problem with shelling out all that money for the infrastructure at the DR site and having it sit unused until a DR event occurs.  Those companies might just decide to run some of their workload in the DR site and have PROD be its DR.  In this Active/Active scenario, your workflow would be pretty much the same; there are just more VMs and repositories at each site, so you need to be careful and plan well.  Here is what an Active/Active configuration would look like:

OVM DR Network Diagram active active

 

Again- my article doesn’t touch on Active/Active, but you could easily apply the stuff you learn in these 5 articles to accommodate an Active/Active configuration.  We’ll be focusing on Active/Passive just as a reminder.  We now have a Virtual Machine running in PROD to facilitate our replication testing.  Make sure the VM runs and can ping the outside network so we know we have a viable machine.  Don’t be expecting lightning performance either- we’re running a VM inside a VM which is inside of a VM.  Not exactly recommended for production use.  Ok- DO NOT use this as your production environment.  There- all the folks who ignore the warnings on hair dryers about using them in the shower should be covered now.

 

Below are the high level steps used to fail over to your DR site.  Once you’ve accomplished this, make sure to remember failback.  Most people are usually so excited about getting the failover to work that they forget they’ll have to fail back at some point once things have been fixed in PROD.

 

FAILOVER (this works if you’re doing a controlled fail over or if a real failure at prod occurs):

  • Ensure all PROD resources are nominal and functioning properly
  • Ensure all DR resources are nominal and functioning properly
  • Ensure replication between PROD and DR ZFS appliances is in place and replicating
  • On ZFSDR1, stop replication of PROD_REPO
  • On ZFSDR1, clone the PROD_REPO project to a new project named DRFAIL
  • Rescan physical disks on ovmdr1 (you may have to reboot to see the new LUN)
  • Verify the new physical disk appears
  • Rename the physical disk to PROD_REPO_FAILOVER
  • Take ownership of the replicated repository in DR OVM Manager
  • Scan for VMs in the Unassigned VMs folder
  • Migrate the VM to the DR pool
  • Start the VM
  • Check /var/tmp/ and make sure you see the ovmprd1 file that you touched when the VM was running in PROD.  This proves that it’s the same VM
  • Ping something on your network to establish network access
  • Ping or connect to something on the internet to establish external network access
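The "rescan physical disk" step above can usually be done without a reboot by poking the SCSI hosts through sysfs.  Here's a hedged sketch (run it as root on the OVM server; the writable check just makes it safe to paste anywhere):

```shell
# Wildcard rescan ("- - -" = all channels, targets and LUNs) on every SCSI host.
# rescanned counts how many hosts we actually poked.
rescanned=0
for scan in /sys/class/scsi_host/host*/scan; do
    if [ -w "$scan" ]; then
        echo "- - -" > "$scan"
        rescanned=$((rescanned+1))
    fi
done
echo "rescanned $rescanned SCSI host(s)"
```

If the new LUN still doesn't show up after the rescan, fall back to the reboot mentioned above.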

 

FAILBACK:

  • Ensure all PROD resources are nominal and functioning properly
  • Ensure all DR resources are nominal and functioning properly
  • Restart replication in the opposite direction from ZFSDR1 to ZFSPRD1
  • Ensure replication finishes successfully
  • Rescan physical disks on ovmprd1
  • Verify your PROD Repo LUN is still visible and in good health
  • Browse the PROD Repo and ensure your VM(s) are there
  • Power on your VMs in PROD and ensure that whatever data was modified while in DR has been replicated back to PROD successfully.
  • Ping something on your network to establish network access
  • Ping or connect to something on the internet to establish external network access

 

Now that we’ve shown you how all this works, I’ll summarize in part 5.

OVM Disaster Recovery In A Box (Part 5 of 5)

Congratulations on your successful failover!  There’s a lot of manual labor involved in failing OVM resources from one site to another- hence the reason for writing this article.  It’s not by any stretch of the imagination an easy or clear-cut task.  I’ve attempted to show you in great detail how to set up your environment so it’s ready to accommodate Disaster Recovery.  Out of the box, it won’t work without some tweaks, which I’ve shown you in the second and third postings.

 

Since the inception of these posts, Oracle has released a new product offering that is outlined in InfoDoc 1959182.1.  Head on over to MOS and give it a look see.  Basically it’s OVM with OEM integration for the DR failover piece.  That’s an oversimplification but essentially it makes what I laid out in the last 4 posts much easier.

 

I hope this has been an informative series of posts and that someone got some good information out of it!  Take care and happy replicating!

OVM Disaster Recovery In A Box (Part 3 of 5)

Continued from Part 2 of 5:

OVMMDR1

  • create a VM based on the Oracle Linux 64-bit OS
  • rename the VM to ovmmdr1
  • give the VM 4 GB of memory and 2 CPUs
  • give the VM a 30 GB hard drive
  • configure ovmmdr1 with the following network adapters:
    adapter 1: (Host Only) DR Management
    adapter 2: (NAT Network) DRPublic

  • boot the VM and install Oracle Linux 6.5 and select the Desktop server type.
    We do this so you have a GUI to log into- if that’s not a priority for you personally, then just pick Basic server

  • configure the VM with the following information:


Host Name: ovmmdr1
IP Address (eth0): 10.1.12.110
IP Netmask: 255.255.255.0
IP Address (eth1): 192.168.12.110
IP Netmask: 255.255.255.0
Default Router: 10.1.12.1
DNS Server: 127.0.0.1
Root Password: Way2secure

  • turn off iptables and selinux:


[root@ovmmdr1 ~]# service iptables stop ; chkconfig iptables off
iptables: Setting chains to policy ACCEPT: filter [ OK ]
iptables: Flushing firewall rules: [ OK ]
iptables: Unloading modules: [ OK ]
[root@ovmmdr1 ~]#

[root@ovmmdr1 ~]# setenforce 0
setenforce: SELinux is disabled
[root@ovmmdr1 ~]#

[root@ovmmdr1 ~]# vi /etc/selinux/config

NOTE: Set SELINUX=disabled so the file looks like this:

# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#     enforcing - SELinux security policy is enforced.
#     permissive - SELinux prints warnings instead of enforcing.
#     disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of these two values:
#     targeted - Targeted processes are protected,
#     mls - Multi Level Security protection.
SELINUXTYPE=targeted
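If you'd rather not edit the file by hand in vi, a sed one-liner does the same thing.  The demo below runs against a scratch copy so it's safe to try anywhere- on the real VM you'd point it at /etc/selinux/config instead:

```shell
# Demo on a scratch copy; on the VM, target /etc/selinux/config instead
printf 'SELINUX=enforcing\nSELINUXTYPE=targeted\n' > /tmp/selinux-demo
sed -i 's/^SELINUX=.*/SELINUX=disabled/' /tmp/selinux-demo
grep '^SELINUX=' /tmp/selinux-demo    # prints SELINUX=disabled
```

Note that the `^SELINUX=` anchor deliberately leaves the SELINUXTYPE line alone.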

  • add following line to /etc/hosts

    10.1.12.110 ovmmdr1

  • set parameters in /etc/sysconfig/network

    NETWORKING=yes
    HOSTNAME=ovmmdr1
    GATEWAY=192.168.12.1

  • reboot to make selinux disabled permanently

  • attach the OVM Manager 3.3.2 install ISO to the VM
  • run the createOracle.sh script to prep the VM for the installation of OVM Manager


[root@ovmmdr1 mnt]# ./createOracle.sh
Adding group 'oinstall' with gid '54323' ...
groupadd: group 'oinstall' already exists
Adding group 'dba'
groupadd: group 'dba' already exists
Adding user 'oracle' with user id '54322', initial login group 'dba',
supplementary group 'oinstall' and home directory '/home/oracle' ...
User 'oracle' already exists ...
uid=54321(oracle) gid=54322(dba) groups=54322(dba),54321(oinstall)
Creating user 'oracle' succeeded ...
For security reasons, no default password was set for user 'oracle'.
If you wish to login as the 'oracle' user, you will need to set a password for this account.

Verifying user 'oracle' OS prerequisites for Oracle VM Manager ...
oracle soft nofile 8192
oracle hard nofile 65536
oracle soft nproc 2048
oracle hard nproc 16384
oracle soft stack 10240
oracle hard stack 32768
oracle soft core unlimited
oracle hard core unlimited
Setting user 'oracle' OS limits for Oracle VM Manager ...
Altered file /etc/security/limits.conf
Original file backed up at /etc/security/limits.conf.orabackup
Verifying & setting of user limits succeeded ...
Changing '/u01' permission to 755 ...
Changing '/u01/app' permission to 755 ...
Changing '/u01/app/oracle' permission to 755 ...
Modifying iptables for OVM
Adding rules to enable access to:
7002 : Oracle VM Manager https
54322 : Oracle VM Manager core via SSL
123 : NTP
10000 : Oracle VM Manager CLI Tool
iptables: Setting chains to policy ACCEPT: filter [ OK ]
iptables: Flushing firewall rules: [ OK ]
iptables: Unloading modules: [ OK ]
iptables: Applying firewall rules: [ OK ]
iptables: Saving firewall rules to /etc/sysconfig/iptables:[ OK ]
iptables: Setting chains to policy ACCEPT: filter [ OK ]
iptables: Flushing firewall rules: [ OK ]
iptables: Unloading modules: [ OK ]
iptables: Applying firewall rules: [ OK ]
Rules added.
[root@ovmmdr1 mnt]#

NOTE: you will need to gather the UUID of OVM Manager in production and install this instance
with that UUID. Run the following command on OVMMPRD1:

grep UUID /u01/app/oracle/ovm-manager-3/.config
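If you want just the bare UUID to paste into the -u flag in the next step, trim the grep output with cut.  The key=value layout in this scratch demo is an assumption- check what your real .config contains (the UUID value is the one from the installation summary later in this post):

```shell
# Scratch demo; on OVMMPRD1 substitute /u01/app/oracle/ovm-manager-3/.config
printf 'DBTYPE=MySQL\nUUID=0004fb00000100006231d80f2ca9856b\n' > /tmp/ovm-config-demo
grep '^UUID=' /tmp/ovm-config-demo | cut -d= -f2   # prints the bare UUID
```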

  • run the runInstaller.sh script to install OVM Manager


[root@ovmmdr1 mnt]# ./runInstaller.sh -u {UUID from previous step}

Oracle VM Manager Release 3.3.2 Installer

Oracle VM Manager Installer log file:
/var/log/ovmm/ovm-manager-3-install-2015-03-04-170449.log

Please select an installation type:
1: Install
2: Upgrade
3: Uninstall
4: Help

Select Number (1-4): 1

Starting production with local database installation …

Verifying installation prerequisites …
*** WARNING: Recommended memory for the Oracle VM Manager server installation using Local MySql DB is 7680 MB RAM

One password is used for all users created and used during the installation.
Enter a password for all logins used during the installation: Way2secure
Enter a password for all logins used during the installation (confirm): Way2secure

Please enter your fully qualified domain name, e.g. ovs123.us.oracle.com, (or IP address) of your management server for SSL certification generation, more than one IP address are detected: 10.1.12.110 192.168.12.110 [ovmmdr1]: ovmmdr1

Verifying configuration …

Start installing Oracle VM Manager:
1: Continue
2: Abort

Select Number (1-2): 1

Step 1 of 9 : Database Software…
Installing Database Software…
Retrieving MySQL Database 5.6 …
Unzipping MySQL RPM File …
Installing MySQL 5.6 RPM package …
Configuring MySQL Database 5.6 …
Installing MySQL backup RPM package …

Step 2 of 9 : Java …
Installing Java …

Step 3 of 9 : Database schema …
Creating database ‘ovs’ …
Creating database ‘appfw’
Creating user ‘ovs’ for database ‘ovs’…
Creating user ‘appfw’ for database ‘appfw’

Step 4 of 9 : WebLogic and ADF…
Retrieving Oracle WebLogic Server 12c and ADF …
Installing Oracle WebLogic Server 12c and ADF …
Applying patches to Weblogic …

Step 5 of 9 : Oracle VM …
Installing Oracle VM Manager Core …
Retrieving Oracle VM Manager Application …
Extracting Oracle VM Manager Application …

Retrieving Oracle VM Manager Upgrade tool …
Extracting Oracle VM Manager Upgrade tool …
Installing Oracle VM Manager Upgrade tool …

Step 6 of 9 : Domain creation …
Creating Oracle WebLogic Server domain …
Starting Oracle WebLogic Server 12c …
Creating Oracle VM Manager user ‘admin’ …

Retrieving Oracle VM Manager CLI tool …
Extracting Oracle VM Manager CLI tool…
Installing Oracle VM Manager CLI tool …

Step 7 of 9 : Deploy …
Configuring Https Identity and Trust…
Deploying Oracle VM Manager Core container …
Configuring Client Cert Login…
Deploying Oracle VM Manager UI Console …
Deploying Oracle VM Manager Help …
Disabling HTTP access …

Step 8 of 9 : Oracle VM Tools …

Retrieving Oracle VM Manager Shell & API …
Extracting Oracle VM Manager Shell & API …
Installing Oracle VM Manager Shell & API …

Retrieving Oracle VM Manager Wsh tool …
Extracting Oracle VM Manager Wsh tool …
Installing Oracle VM Manager Wsh tool …

Retrieving Oracle VM Manager Tools …
Extracting Oracle VM Manager Tools …
Installing Oracle VM Manager Tools …
Copying Oracle VM Manager shell to ‘/usr/bin/ovm_shell.sh’ …
Installing ovm_admin.sh in ‘/u01/app/oracle/ovm-manager-3/bin’ …
Installing ovm_upgrade.sh in ‘/u01/app/oracle/ovm-manager-3/bin’ …

Step 9 of 9 : Start OVM Manager …
Enabling Oracle VM Manager service …
Shutting down Oracle VM Manager instance …
Starting Oracle VM Manager instance …
Waiting for the application to initialize …
Oracle VM Manager is running …

Please wait while WebLogic configures the applications…
Oracle VM Manager installed.

Installation Summary

Database configuration:
Database type : MySQL
Database host name : localhost
Database name : ovs
Database listener port : 49500
Database user : ovs

Weblogic Server configuration:
Administration username : weblogic

Oracle VM Manager configuration:
Username : admin
Core management port : 54321
UUID : 0004fb00000100006231d80f2ca9856b

Passwords:
There are no default passwords for any users. The passwords to use for Oracle VM Manager, Database, and Oracle WebLogic Server have been set by you during this installation. In the case of a default install, all passwords are the same.

Oracle VM Manager UI:
https://ovmmdr1:7002/ovm/console
Log in with the user ‘admin’, and the password you set during the installation.

Note that you must install the latest ovmcore-console package for your Oracle Linux distribution to gain VNC and serial console access to your Virtual Machines (VMs).
Please refer to the documentation for more information about this package.

For more information about Oracle Virtualization, please visit:
http://www.oracle.com/virtualization/

Oracle VM Manager installation complete.

Please remove configuration file /tmp/ovm_configcKjMF_.
[root@ovmmdr1 mnt]#


  • Install ovmcore-console-1.0-41.el6.noarch.rpm on ovmmdr1

yum install -y /var/tmp/ovmcore-console-1.0-41.el6.noarch.rpm

OVMPRD1
==============================================

  • create a VM based on the Oracle Linux 64-bit OS
  • rename the VM to ovmprd1
  • give the VM 2 GB of memory and 2 CPUs
  • give the VM a 6 GB hard drive
  • configure ovmprd1 with the following network adapters:
    adapter 1: (Host Only) Prod Management
    adapter 2: (NAT Network) ProdPublic
    adapter 3: (Host Only) Prod Storage
    adapter 4: (Host Only) Prod Storage

  • boot the VM and install OVM Server 3.3.2

  • configure the VM with the following settings:


Host Name: ovmprd1
IP Address (eth0): 10.1.11.101
IP Netmask: 255.255.255.0
Default Router: 10.1.11.1
DNS Server: 192.168.11.110
OVS Agent Password: Way2secure
Root Password: Way2secure

  • Once the VM has booted fully and is at the splash screen you can continue to the next step


* log into PROD OVM Manager
* discover the PROD OVM server
* choose Servers and VM's tab
* click on PROD OVM server
* select the "Bond Ports" perspective
* create a new bond with the following parameters:

Interface Name: bond1
Addressing: static
IP Address: 172.16.11.101
Mask: 255.255.255.0
MTU: 1500
Description: (optional)
Bonding: Load Balanced
Selected Ports: eth2 and eth3

  • choose Networking tab
  • select the Network labeled 10.1.11.0 and configure with the following parameters:

** Configuration Tab **

Name: Management
Description: (optional)
Network Uses: Check Management, Live Migrate and Cluster Heartbeat

** Ports Tab **

Port Name: bond0

** VLAN Interfaces **

None

== NETWORK CONFIGURATION ==

* Create a new network
* select "Create a Network with Ports/Bond Ports/VLAN Interfaces" radio button and click next
* Give it a name of "Storage", select the "Storage" checkbox then click next
* add bond1 from ovmprd1 and click ok
* click next - there will not be any VLAN interfaces so click Finish
* Create a new network
* select "Create a Network with Ports/Bond Ports/VLAN Interfaces" radio button and click next
* Give it a name of "Public", select the "Virtual Machine" checkbox then click next
* add eth1 from ovmprd1 and click ok
* click next - there will not be any VLAN interfaces so click Finish

== STORAGE CONFIGURATION ==

* Click on the "Storage" tab
* Discover SAN Server
* Assign name of "PROD-ZFS"
* Make sure Storage Type says "iSCSI Storage Server"
* Make sure Storage Plug-in says "Oracle Generic SCSI Plugin"
* Click next
* Add an Access Host with IP address of 172.16.11.100
* Click next
* Add ovmprd1 to Selected Servers then click next
* Edit the default access group
* On the storage initiators tab, add ovmprd1's iqn to Selected Storage Initiators then click ok
* Click Finish
* Highlight the PROD-ZFS SAN server and click Refresh SAN Server
* Verify that two physical disks are visible
* Rename the 12 GB LUN to PROD-PoolFS
* Rename the 30 GB LUN to PROD-REPO

== SERVER POOL CREATION ==

* Click on the Servers and VM's tab
* Create a new Server Pool called PROD
* Give it a VIP of 10.1.11.102
* Select Physical Disk radio button
* Select Storage Location and choose the PROD-PoolFS LUN
* Click next
* move ovmprd1 to Selected servers then click finish

== STORAGE REPOSITORY CREATION ==

* Click on the Repositories tab
* Create a new Repository called PROD-REPO
* Select the Physical Disk radio button under Repository Location
* Click on the magnifying glass and choose PROD-REPO then click next
* move ovmprd1 to Present to Servers then click finish

== STORAGE REPOSITORY REPLICATION ==

* Log into zfsprd1
* Click on Configuration, then services
* Edit the replication service
* Add a target

Name: ovmdr1
Hostname: 172.16.10.101
Root password: Way2secure

  • Click on shares
  • Edit the PROD-REPO Project
  • Click on the Replication sub group
  • Add an action

Target: ovmdr1
Pool: DR
Update Frequency: Scheduled
Add a Schedule for every half hour at 00 minutes after
Leave the rest of the settings at the default

  • Click Add
  • Hover over the Target near the STATUS column and click on the picture of the two circular
    arrows pointing at each other. This will kick off a manual replication.
  • Monitor replication status until completely replicated (should take about 5 minutes)

==============================================

OVMDR1

  • create a VM based on the Oracle Linux 64-bit OS
  • rename the VM to ovmdr1
  • give the VM 2 GB of memory and 2 CPUs
  • give the VM a 6 GB hard drive
  • configure ovmdr1 with the following network adapters:
    adapter 1: (Host Only) DR Management
    adapter 2: (NAT Network) DRPublic
    adapter 3: (Host Only) DR Storage
    adapter 4: (Host Only) DR Storage
  • boot the VM and install OVM Server 3.3.2
  • configure the VM with the following settings:

Host Name: ovmdr1
IP Address (eth0): 10.1.12.101
IP Netmask: 255.255.255.0
Default Router: 10.1.12.1
DNS Server: 192.168.12.110
OVS Agent Password: Way2secure
Root Password: Way2secure

  • Once the VM has booted fully and is at the splash screen you can continue to the next step
  • Copy the following configuration files from ovmprd1 to ovmdr1. Note that in your installation,
    the bridge name referenced below will be different

/etc/sysconfig/network-scripts/meta-eth1
/etc/sysconfig/network-scripts/ifcfg-{bridge} (example /etc/sysconfig/network-scripts/ifcfg-1080940192)

  • Edit /etc/sysconfig/network-scripts/ifcfg-{bridge} on ovmdr1 to make the
    MAC address match that of eth1 on ovmdr1 but leave the bridge number intact
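A scripted version of that MAC swap might look like the sketch below.  Both the HWADDR key name and the bridge number (taken from the example file name above) are assumptions- check your actual ifcfg file first.  The demo runs against a scratch file so it's safe to try anywhere:

```shell
# Scratch demo; on ovmdr1 the target is the real ifcfg-{bridge} file and NEWMAC
# comes from "ip link show eth1" (HWADDR key name is an assumption - verify it)
printf 'DEVICE=1080940192\nHWADDR=00:11:22:33:44:55\nONBOOT=yes\n' > /tmp/ifcfg-bridge-demo
NEWMAC=aa:bb:cc:dd:ee:ff
sed -i "s/^HWADDR=.*/HWADDR=${NEWMAC}/" /tmp/ifcfg-bridge-demo
grep '^HWADDR=' /tmp/ifcfg-bridge-demo   # prints HWADDR=aa:bb:cc:dd:ee:ff
```

Note the DEVICE line (the bridge number) is left intact, matching the step above.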

  • log into DR OVM Manager

  • discover the DR OVM server
  • choose Servers and VM’s tab
  • click on DR OVM server
  • select the “Bond Ports” perspective
  • create a new bond with the following parameters:

Interface Name: bond1
Addressing: static
IP Address: 172.16.12.101
Mask: 255.255.255.0
MTU: 1500
Description: (optional)
Bonding: Load Balanced
Selected Ports: eth2 and eth3

  • choose Networking tab
  • select the Network labeled 10.1.12.0 and configure with the following parameters:

** Configuration Tab **
Name: Management
Description: (optional)
Network Uses: Check Management, Live Migrate and Cluster Heartbeat

** Ports Tab **
Port Name: bond0

** VLAN Interfaces **
None

== NETWORK CONFIGURATION ==
* Create a new network
* select “Create a Network with Ports/Bond Ports/VLAN Interfaces” radio button and click next
* Give it a name of “Storage”, select the “Storage” checkbox then click next
* add bond1 from ovmdr1 and click ok
* click next – there will not be any VLAN interfaces so click Finish
* Create a new network
* Observe that the Public network is already there. This is done by copying the meta file and the
bridge file from OVMPRD1

== STORAGE CONFIGURATION ==
* Click on the “Storage” tab
* Discover SAN Server
* Assign name of “DR-ZFS”
* Make sure Storage Type says “iSCSI Storage Server”
* Make sure Storage Plug-in says “Oracle Generic SCSI Plugin”
* Click next
* Add an Access Host with IP address of 172.16.12.100
* Click next
* Add ovmdr1 to Selected Servers then click next
* Edit the default access group
* On the storage initiators tab, add ovmdr1’s iqn to Selected Storage Initiators then click ok
* Click Finish
* Highlight the DR-ZFS SAN server and click Refresh SAN Server
* Verify that one physical disk is visible
* Rename the 12 GB LUN to DR-PoolFS

== SERVER POOL CREATION ==
* Click on the Servers and VM’s tab
* Create a new Server Pool called DR
* Give it a VIP of 10.1.12.102
* Select Physical Disk radio button
* Select Storage Location and choose the DR-PoolFS LUN
* Click next
* move ovmdr1 to Selected servers then click finish

== TEMPLATE IMPORT ON PROD ==
* Download template to /var/tmp on ovmprd1 (should be .ova format to proceed- unzip if needed)
* Start Python web server on ovmprd1

python -m SimpleHTTPServer 8080

  • Navigate to the Repositories tab in ovmmprd1
  • Expand the PROD-REPO repository and highlight the Assemblies folder
  • Click on the import VM Assembly button
    VM Template URLs: http://10.1.11.101:8080/OVM_OL6U6_x86_64_PVM.ova

  • Click on assembly that was just imported

  • Create template from assembly
    Assembly Virtual Machines: {select the assembly you just imported}
    VM Template Name: t_OL6.6

  • Click ok

  • Edit the t_OL6.6 template
    Add the public network
    Change sizing to 1 GB of memory and 1 vCPU

  • Clone the t_OL6.6 template to a virtual machine
    Clone Name: ol6.6
    Target Server Pool: PROD

  • Edit ol6.6.0 VM
    Change VM name to ol6.6
    If there is an extra VNIC, remove it

  • Start VM and connect to console

  • Configure VM with hostname and root password
  • Touch a file

touch /var/tmp/ovmprd1

 

Continued in part 4 of 5