Log into an SFTP server with FileZilla using an SSH key

Prerequisites:

  1. create an account on the SFTP server (optionally in a chroot, SFTP-only environment for safety)
  2. generate an RSA SSH key pair with puttygen
  3. save both the public key and the private key (.ppk format)
  4. copy the public key to the appropriate location on the remote system you’re connecting to (usually ~username/.ssh/authorized_keys)
  • copy the private .ppk key you created to the local system
  • open filezilla
  • click file -> Site Manager
  • click New Site
    • Host: {address of sftp server}
    • Port: 22
    • Protocol: sftp
    • Logon Type: key file
    • user: {username created on the sftp server}
    • Key file: {browse to location where you saved the .ppk file}
    • click connect
  • In the left pane, cd to the location of the file you want to upload
  • Drag the file you want to upload from the left pane to the right pane and wait for the upload to complete
  • Close the program

 

DONE!


Log into UNIX servers with key via Putty

  • Create an rsa keypair with no passphrase using puttygen
  • Save the public key
  • Save the private key (.ppk format)
  • Configure PuTTY to use your private key
    • Connection -> SSH -> Auth -> Private key for authentication
  • Configure PuTTY to automatically log you in with your username
    • Connection -> Data -> Auto-login Username
  • Save the profile in PuTTY
  • Copy your public key to ~/.ssh/authorized_keys on each server you want to connect to
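
That last step is where key logins most often fail, because sshd by default refuses key files with loose permissions. Here is a minimal sketch of it, meant to be run on the server as the login user. The function name install_pubkey and the file name mykey.pub are placeholders of mine, and it assumes the key is already in one-line OpenSSH format (the text puttygen shows in its "Public key for pasting into OpenSSH authorized_keys file" box). A key written with puttygen's "Save public key" button is in RFC 4716 format and would need ssh-keygen -i first.

```shell
#!/bin/sh
# install_pubkey PUBKEY_FILE [SSH_DIR]
# Appends a one-line OpenSSH public key to authorized_keys and tightens
# permissions. SSH_DIR defaults to ~/.ssh. Hypothetical helper, not part
# of PuTTY or OpenSSH.
install_pubkey() {
    pubkey_file=$1
    ssh_dir=${2:-"$HOME/.ssh"}
    auth_file="$ssh_dir/authorized_keys"

    mkdir -p "$ssh_dir" || return 1
    chmod 700 "$ssh_dir" || return 1
    touch "$auth_file" || return 1

    # Only append if this exact key line is not already present.
    if ! grep -qxF -f "$pubkey_file" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi
    chmod 600 "$auth_file"
}

# Example:
#   ssh-keygen -i -f putty_saved.pub > mykey.pub   # only for RFC 4716 files
#   install_pubkey mykey.pub
```

Running it twice with the same key leaves a single entry, so it is safe to re-run on each server.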

Oracle X7-2 HA ODA fiber interface issues

I was working with a customer to deploy an X7-2HA ODA a while back.  They opted to use 10GbE fiber (as most customers do) for their public network interface.  Early on I ran into what turned out to be a bug in the driver for the onboard SFP ports.  Those ports can negotiate 25Gb as well as 10Gb, and that is what caused my problem: the fiber switch ports couldn’t do 25Gb, which is what the onboard adapters were trying to negotiate.

I applied the updated driver to the NIC and rebooted.  Nothing.  No link no packets.  I verified with ethtool that the NIC was only advertising 10gb as its max speed and not 25 like before.  After some troubleshooting, I disabled autonegotiation and forced each of the two adapters to 10gb.  A few seconds after this the link came up and I was able to ping my default gateway!

After reboot however, same thing- no link.


 

I wound up having to put the following lines at the end of /etc/rc.local on each compute node:

# btbond1 - force em2/em3 to 10 gig Ethernet speed
/sbin/ethtool -s em2 autoneg off speed 10000
/sbin/ethtool -s em3 autoneg off speed 10000
sleep 10
ping -c 5 {default gateway}
sleep 10
ping -c 5 {default gateway}

This basically ensures the adapters get forced to 10gb and makes them ping to “wake up” the interface at the end of the system boot process.  Make sure you don’t forget to limit the ping to 5 packets or something reasonable, otherwise guess what you’re going to see on your console every second until eternity?
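
The fixed sleeps did the job for me, but the delay a link needs can vary from boot to boot. One alternative is to poll for link after forcing the speed. This is only a sketch, assuming the driver reports link state through the standard sysfs carrier file. The wait_for_carrier helper and its optional sysfs directory argument are my own illustration, not part of the ODA tooling.

```shell
#!/bin/sh
# wait_for_carrier IFACE [TIMEOUT_SECONDS] [SYSFS_NET_DIR]
# Polls SYSFS_NET_DIR/IFACE/carrier once a second until it reads 1
# (link up) or the timeout expires. Returns 0 on link, 1 on timeout.
wait_for_carrier() {
    iface=$1
    timeout=${2:-30}
    net_dir=${3:-/sys/class/net}
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        # Reading carrier can fail while the interface is down, so a
        # read error is treated the same as "no link yet".
        state=$(cat "$net_dir/$iface/carrier" 2>/dev/null || true)
        if [ "$state" = "1" ]; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1
}

# Possible use in /etc/rc.local after the ethtool lines:
#   wait_for_carrier em2 30 || echo "em2: no link after 30s" >&2
```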

I’ve been assured this has been fixed in future driver releases- and I’m pretty confident that it will be or there are going to be quite a few pissed off Oracle customers out there!

Rescan virtual disk in a VMware linux VM without reboot

I’ve run into this situation a number of times.  I get a request from a user to resize a filesystem and add some space to it.  Thankfully, through the magic of virtualization I can change the size of the disk on the fly (assuming there are no snapshots in effect).

 

Resizing the disk in VMware goes fine, so I log into the VM.  Normally for a physical machine with SAN or even SCSI disks, I’d go to /sys/class/scsi_host/ and figure out which adapter the disk I want to resize is sitting on.  Usually a combination of fdisk and lsscsi will give me the info I need here.  Good to go!  So I cd into the hostX folder that represents the right controller.  Here’s where things go south.  An all HCTL scan reliably picks up newly presented disks, but it has never made an existing disk report its new size:
 

echo "- - -" > scan

 

By the way, when I mentioned sending an all HCTL scan, let me explain what that means. HCTL stands for scsi_Host, Channel, Target and Lun. When you’re looking at the output of lsscsi as you’ll see below shortly, you’ll see some numbers separated by colons like such:

[2:0:3:0] disk VMware Virtual disk 1.0 /dev/sdd

The four numbers here represent the following
2 = scsi_host
    This is the numeric iteration of the host bus adapter that the scsi device is connected to.  Think of a scsi card or fiber channel card here.  The first scsi_host is usually the internal ones built into most servers as they usually get enumerated first during POST.

0 = Channel
    This is the channel on the HBA that is being referred to.  Think of a dual channel SCSI card or a dual port Fiber Channel HBA.  

3 = Target
    This refers to the SCSI target of the device we're looking at.  In this case, a good description would be an internal SCSI card that has a tape drive, CDROM and a couple hard drives attached to it.  Each of those devices would have a separate "target" to address that device specifically.

0 = Logical Unit Number or LUN
    This is the representation of a sub unit of a target.  A good example would be an optical disk library where the drive itself gets a SCSI target and assumes LUN 0, then the optical disks themselves get assigned LUNs so they can be addressed.  This also more commonly comes into play when you have a SAN that is exporting multiple disks (a.k.a. LUNs).  Say you have an EMC SAN that is presenting 30 disks to a server.  Based on conventional SCSI limitations, most cards can only address up to 15 targets per HBA channel (some will do up to 24 but they are extremely rare).  In this scenario you would need a couple SCSI HBAs to see all those disks.  Now picture thousands of disks... You see where I'm going with this.
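
To make the mapping concrete, here is a small sketch that splits an HCTL tuple as printed by lsscsi into its four parts and prints the matching entry under /sys/class/scsi_device. The function name hctl_to_sysfs is mine, and it only handles strings, so nothing is written to /sys.

```shell
#!/bin/sh
# hctl_to_sysfs "[H:C:T:L]"
# Strips the brackets lsscsi prints, splits the tuple on colons and
# prints the four fields plus the sysfs device directory.
hctl_to_sysfs() {
    hctl=$(printf '%s' "$1" | tr -d '[]')
    old_ifs=$IFS
    IFS=:
    # Unquoted on purpose: split on ':' into host, channel, target, lun.
    set -- $hctl
    IFS=$old_ifs
    if [ "$#" -ne 4 ]; then
        echo "expected H:C:T:L, got '$hctl'" >&2
        return 1
    fi
    echo "host=$1 channel=$2 target=$3 lun=$4"
    echo "/sys/class/scsi_device/$1:$2:$3:$4/device"
}

# Example:
#   hctl_to_sysfs "[2:0:3:0]"
```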

I’ve always had luck seeing newly presented disks by doing an all HCTL scan, even in VMware.  But I always wound up having to reboot the damn VM just to get it to recognize the new size of the disk.  Well I stumbled upon a slightly different process today that lets me do what I’ve been trying to do.  Here’s the breakdown:

  • Determine which disk you want to resize.  fdisk -l usually does the trick:
[root@iscsi ~]# fdisk -l

Disk /dev/sda: 32.2 GB, 32212254720 bytes
64 heads, 32 sectors/track, 30720 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0009e4a5

Device Boot Start End Blocks Id System
/dev/sda1 * 2 501 512000 83 Linux
Partition 1 does not end on cylinder boundary.
/dev/sda2 502 30720 30944256 8e Linux LVM
Partition 2 does not end on cylinder boundary.

Disk /dev/mapper/VolGroup-lv_root: 27.5 GB, 27455913984 bytes
255 heads, 63 sectors/track, 3337 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/mapper/VolGroup-lv_swap: 4227 MB, 4227858432 bytes
255 heads, 63 sectors/track, 514 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdb: 1099.5 GB, 1099511627776 bytes
255 heads, 63 sectors/track, 133674 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x02020202

Disk /dev/sdc: 12.9 GB, 12884901888 bytes
64 heads, 32 sectors/track, 12288 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

WARNING: GPT (GUID Partition Table) detected on '/dev/sdd'! The util fdisk doesn't support GPT. Use GNU Parted.

Disk /dev/sdd: 16.6 GB, 32212254720 bytes
256 heads, 63 sectors/track, 3900 cylinders
Units = cylinders of 16128 * 512 = 8257536 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Device Boot Start End Blocks Id System
/dev/sdd1 1 2081 16777215+ ee GPT
[root@iscsi ~]#
  • Ok so /dev/sdd is the disk I want to resize.  I go resize it in VMware and re-run fdisk but still get the same size- no shock there.
  • Now let’s use the lsscsi command to show us some information about the scsi devices that the OS sees.  You may have to install this tool first.
[root@iscsi ~]# lsscsi -v
[1:0:0:0] cd/dvd NECVMWar VMware IDE CDR10 1.00 /dev/sr0
dir: /sys/bus/scsi/devices/1:0:0:0 [/sys/devices/pci0000:00/0000:00:07.1/host1/target1:0:0/1:0:0:0]
[2:0:0:0] disk VMware Virtual disk 1.0 /dev/sda
dir: /sys/bus/scsi/devices/2:0:0:0 [/sys/devices/pci0000:00/0000:00:15.0/0000:03:00.0/host2/target2:0:0/2:0:0:0]
[2:0:1:0] disk Nimble Server 1.0 /dev/sdb
dir: /sys/bus/scsi/devices/2:0:1:0 [/sys/devices/pci0000:00/0000:00:15.0/0000:03:00.0/host2/target2:0:1/2:0:1:0]
[2:0:2:0] disk Nimble Server 1.0 /dev/sdc
dir: /sys/bus/scsi/devices/2:0:2:0 [/sys/devices/pci0000:00/0000:00:15.0/0000:03:00.0/host2/target2:0:2/2:0:2:0]
[2:0:3:0] disk VMware Virtual disk 1.0 /dev/sdd
dir: /sys/bus/scsi/devices/2:0:3:0 [/sys/devices/pci0000:00/0000:00:15.0/0000:03:00.0/host2/target2:0:3/2:0:3:0]

 

  • I used the -v (verbose) flag to have it tell me more information about each scsi device.  This gives us a great shortcut to decoding where in /sys/class/scsi_device our target resides.
  • To tell the SCSI subsystem to rescan the device, we simply echo a 1 to the rescan file located inside the folder identified above.  In our case, the folder is /sys/devices/pci0000:00/0000:00:15.0/0000:03:00.0/host2/target2:0:3/2:0:3:0.  If you cd into this folder, you see a bunch of entries.  BE CAREFUL here: in most cases these entries are a live view of what the system is seeing or doing.  If you do the wrong thing, you could tell the disk to power off, or maybe something even more destructive.  There’s no failsafe, and the OS isn’t going to ask you if you’re sure you want to do this.  We’re poking around in the live kernel here through the “back door” if you will.  Here’s a listing of what my system shows:
[root@iscsi 2:0:3:0]# ls -la
total 0
drwxr-xr-x 8 root root 0 Mar 21 09:22 .
drwxr-xr-x 4 root root 0 Mar 21 09:22 ..
drwxr-xr-x 3 root root 0 Mar 21 09:22 block
drwxr-xr-x 3 root root 0 Mar 21 09:33 bsg
--w------- 1 root root 4096 Mar 21 09:33 delete
-r--r--r-- 1 root root 4096 Mar 21 09:33 device_blocked
-rw-r--r-- 1 root root 4096 Mar 21 09:33 dh_state
lrwxrwxrwx 1 root root 0 Mar 21 09:33 driver -> ../../../../../../../bus/scsi/drivers/sd
-r--r--r-- 1 root root 4096 Mar 21 09:33 evt_media_change
lrwxrwxrwx 1 root root 0 Mar 21 09:30 generic -> scsi_generic/sg4
-r--r--r-- 1 root root 4096 Mar 21 09:33 iocounterbits
-r--r--r-- 1 root root 4096 Mar 21 09:33 iodone_cnt
-r--r--r-- 1 root root 4096 Mar 21 09:33 ioerr_cnt
-r--r--r-- 1 root root 4096 Mar 21 09:33 iorequest_cnt
-r--r--r-- 1 root root 4096 Mar 21 09:33 modalias
-r--r--r-- 1 root root 4096 Mar 21 09:33 model
drwxr-xr-x 2 root root 0 Mar 21 09:33 power
-r--r--r-- 1 root root 4096 Mar 21 09:33 queue_depth
-r--r--r-- 1 root root 4096 Mar 21 09:33 queue_type
--w------- 1 root root 4096 Mar 21 09:33 rescan
-r--r--r-- 1 root root 4096 Mar 21 09:30 rev
drwxr-xr-x 3 root root 0 Mar 21 09:22 scsi_device
drwxr-xr-x 3 root root 0 Mar 21 09:33 scsi_disk
drwxr-xr-x 3 root root 0 Mar 21 09:33 scsi_generic
-r--r--r-- 1 root root 4096 Mar 21 09:30 scsi_level
-rw-r--r-- 1 root root 4096 Mar 21 09:33 state
lrwxrwxrwx 1 root root 0 Mar 21 09:33 subsystem -> ../../../../../../../bus/scsi
-rw-r--r-- 1 root root 4096 Mar 21 09:33 timeout
-r--r--r-- 1 root root 4096 Mar 21 09:33 type
-rw-r--r-- 1 root root 4096 Mar 21 09:33 uevent
-r--r--r-- 1 root root 4096 Mar 21 09:33 vendor
[root@iscsi 2:0:3:0]#
  • The file we’re interested in is called “rescan”.  The way these generally work is that you poke a value into the kernel by echoing it into the file, as if you were writing to a text file.  The value you write determines what action the kernel takes, and many of these entries take a 1 or 0 for true or false.  In this case, we want the kernel to rescan this device so we echo “1” > rescan.  This tells the kernel to take another look at the device itself and register any changes that have been made since the system first became aware of it at boot time.
[root@iscsi 2:0:3:0]# echo "1" > rescan
[root@iscsi 2:0:3:0]# fdisk -l /dev/sdd

WARNING: GPT (GUID Partition Table) detected on '/dev/sdd'! The util fdisk doesn't support GPT. Use GNU Parted.

Disk /dev/sdd: 32.2 GB, 32212254720 bytes
256 heads, 63 sectors/track, 3900 cylinders
Units = cylinders of 16128 * 512 = 8257536 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Device Boot Start End Blocks Id System
/dev/sdd1 1 2081 16777215+ ee GPT

 

Notice the new size of /dev/sdd is now 32GB where it was 16GB before.  Congratulations!  Ok don’t get all excited just yet, your journey has just begun.  Now you have to update the partition table to reflect the new number of sectors so the OS can use the new space.  Then you have to resize whatever filesystem resides on that disk.  If you’re using LVM or software RAID, you’ll need to make some changes at that level first before messing with the filesystem.  The good news is most if not all of this stuff can be done online without having to reboot.
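
As a rough sketch of what those follow-up steps can look like, here is the sequence for one common layout, where the grown disk holds a single partition that is an LVM physical volume. Everything specific is an assumption on my part: the volume group and logical volume names (datavg/datalv) are placeholders, and growpart comes from the cloud-utils-growpart package, which may not be installed. The /sys/block/<disk>/device/rescan path is a shortcut to the same rescan file used above. The helper prints the commands when DRY_RUN=1 so you can review them before running anything.

```shell
#!/bin/sh
# run CMD...: execute the command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# grow_lvm_disk DISK PARTNUM VG/LV
# Assumed layout: /dev/DISK has partition PARTNUM, which is an LVM PV
# backing the logical volume VG/LV.
grow_lvm_disk() {
    disk=$1
    part=$2
    lv=$3
    # 1. Ask the kernel to re-read the disk size.
    run sh -c "echo 1 > /sys/block/$disk/device/rescan"
    # 2. Grow the partition to fill the disk.
    run growpart "/dev/$disk" "$part"
    # 3. Tell LVM the physical volume got bigger.
    run pvresize "/dev/$disk$part"
    # 4. Extend the LV into the free space and resize the filesystem (-r).
    run lvextend -r -l +100%FREE "/dev/$lv"
}

# Example (prints the four commands without touching the disk):
#   DRY_RUN=1; export DRY_RUN
#   grow_lvm_disk sdd 1 datavg/datalv
```

Take a backup or a snapshot before doing this on a disk you care about, and adjust the steps if the disk carries a plain filesystem or software RAID instead of LVM.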

 

I hope this helps and let me know if you have any questions or input as to how to do this better!

 

Oracle VM for x86: Hard Partitioning Hands On

As most of you likely know, Oracle has stringent licensing rules when it comes to running their software in a virtual environment.  With anything other than Oracle VM Server for x86, you basically have to license every core in the cluster (VMware, Hyper-V, etc).  With OVM, Oracle does accept a specific configuration that satisfies their definition of a “hard partition” where processor licensing is concerned.  This means that if you own 2 processor licenses for Oracle Database EE for example, and are running on a platform that has a .5 license multiplier (such as x86), you are entitled to run that software on 4 cores.

 

Here are the requirements to satisfy the hard partition I mentioned above (taken from a document that is linked in InfoDoc 1529408.1):

To conform to the Oracle hard partition licensing requirement, you must follow the instructions described in this white paper to bind vCPUs to physical CPU threads or cores.

Live migration of CPU pinned virtual machines to another Oracle VM Server is not permitted under the terms of the hard partitioning license. Consequently, for Oracle VM Release 3, any servers running CPU pinned guests must not be included in DRS (Distributed Resource Scheduler) and DPM (Distributed Power Management) policies.

When live migration is used in an Oracle VM server pool, hard partition licensing is not applicable. You must determine the number of virtual machines running the Oracle Software and then license the same number of physical servers (starting with the largest servers based on the CPU core count) up to the total number of the physical servers in the pool. For example, if a customer has a server pool with 32 servers and 20 virtual machines running Oracle Software within the server pool, the customer must license the 20 largest physical servers in the pool. If the customer is running 50 virtual machines with Oracle Software in a pool of 32 physical servers, they need only to license the 32 physical servers in the pool.

Live migration of other virtual machines with non-Oracle software within the server pool is not relevant to Oracle software hard partitioning or has no impact to how Oracle software license is calculated.

“Trusted Partitions” allow subset licensing without limitation on live migration, but only available on the approved Oracle Engineered Systems listed on Oracle licensing policies for partitioned environments.

 

There is more information in that document on how to actually perform the CPU pinning but we don’t need to get into that level of detail just yet.  To summarize- here are the key takeaways you should be aware of when considering using OVM for hard partitioning:

  • The use of hyperthreading or no hyperthreading is irrelevant to Oracle from a licensing perspective
  • vCPUs are bound or “pinned” to physical cores using an OVM Manager utility that must be downloaded and installed on your OVM Manager
  • Live Migration, DRS and DPM are not allowed for pinned VMs
  • You have to choose which vCPUs you want to pin your VM to.  Be careful that you don’t accidentally pin more than one VM to a given set of vCPUs- it’s a completely valid configuration but your performance will go to hell due to contention in the CPU scheduler.
  • Get in the habit of pinning your secondary workloads (applications that don’t require hard partitions) to a set of unused vCPUs.  This way they can’t potentially run on the same vCPU that you just pinned your production database VM to.
  • Make sure when you bind vCPUs that you don’t accidentally cross core boundaries.  It only takes 1 vCPU running on a separate core to mess up your licensing costs.  See my blog post here to get an idea of what I mean.

 

The Real World

Now I want to show you a few things that they don’t talk about in the licensing documents that you are likely to run across in your life as an OVM administrator.

  • live migrate a pinned VM from one OVM Server to another

Capture 2

As you can see above, we have 4 VMs running in this cluster.  Below is an overview of prod_db1.  Take note of the ID field, we’ll use it later to identify the VM:

Capture1

We’re gonna use prod_db1 as our guinea pig for this experiment.  Currently prod_db1 is running on server ovm1 and is pinned to vCPUs 0-3 as noted in the vm.cfg snippet below:

Capture3

I also have a VM running on server ovm2 that is pinned to the very same vCPUs:

Capture4

One would think you couldn’t live migrate the VM from ovm1 to ovm2, since prod_db3 is already pinned to the same vCPUs on ovm2.

Screenshot (7)

 

You certainly can perform the live migration.  Here’s what will happen:

  • The VM will successfully migrate to ovm2
  • prod_db1 will only run on vCPUs 0-3 on ovm2
  • prod_db3 will only run on vCPUs 0-3 on ovm2
  • your performance in both VMs will likely go down the drain
  • you will be out of compliance with Oracle hard partition licensing requirements

 

I’ve had a LOT of people ask me this question, so here’s your proof:

[root@ovm1 ~]# xm vcpu-list
Name ID VCPU CPU State Time(s) CPU Affinity
0004fb00000600000632b8de1db5a014 3 0 20 -b- 3988.0 any cpu
0004fb00000600000632b8de1db5a014 3 1 21 -b- 133.8 any cpu
0004fb00000600008825773ba1661d01 2 0 0 -b- 3083.6 0-3
0004fb00000600008825773ba1661d01 2 1 3 -b- 308.1 0-3
Domain-0 0 0 0 r-- 63990.1 0
Domain-0 0 1 1 r-- 62421.0 1
Domain-0 0 2 2 -b- 16102.8 2
Domain-0 0 3 3 -b- 10355.7 3
Domain-0 0 4 4 -b- 2718.1 4
Domain-0 0 5 5 -b- 9427.4 5
Domain-0 0 6 6 -b- 5660.8 6
Domain-0 0 7 7 -b- 3932.0 7
Domain-0 0 8 8 -b- 2268.0 8
Domain-0 0 9 9 -b- 8477.9 9
Domain-0 0 10 10 -b- 4950.6 10
Domain-0 0 11 11 -b- 4304.6 11
Domain-0 0 12 12 -b- 2001.5 12
Domain-0 0 13 13 -b- 10321.1 13
Domain-0 0 14 14 -b- 5221.5 14
Domain-0 0 15 15 -b- 3515.0 15
Domain-0 0 16 16 -b- 2408.8 16
Domain-0 0 17 17 -b- 9905.2 17
Domain-0 0 18 18 -b- 6105.3 18
Domain-0 0 19 19 -b- 4504.2 19



[root@ovm2 ~]# xm vcpu-list
Name ID VCPU CPU State Time(s) CPU Affinity
Domain-0 0 0 0 -b- 54065.1 0
Domain-0 0 1 1 -b- 10110.4 1
Domain-0 0 2 2 -b- 4909.4 2
Domain-0 0 3 3 -b- 6344.0 3
Domain-0 0 4 4 -b- 1012.4 4
Domain-0 0 5 5 -b- 6506.3 5
Domain-0 0 6 6 -b- 4163.1 6
Domain-0 0 7 7 -b- 1564.5 7
Domain-0 0 8 8 -b- 1367.5 8
Domain-0 0 9 9 -b- 14307.2 9
Domain-0 0 10 10 -b- 4068.7 10
Domain-0 0 11 11 -b- 1799.4 11
Domain-0 0 12 12 -b- 1731.3 12
Domain-0 0 13 13 -b- 5478.0 13
Domain-0 0 14 14 -b- 6983.5 14
Domain-0 0 15 15 -b- 5781.6 15
Domain-0 0 16 16 -b- 723.4 16
Domain-0 0 17 17 r-- 4922.6 17
Domain-0 0 18 18 r-- 3585.3 18
Domain-0 0 19 19 -b- 1705.8 19
0004fb0000060000c9e5303a8dc2c675 3 0 0 -b- 5556.6 0-3
0004fb0000060000c9e5303a8dc2c675 3 1 3 -b- 144.4 0-3
  • Now I live migrate prod_db1 from ovm1 to ovm2

Screenshot (8)

Screenshot (9)

Screenshot (10)

 

Here is the new vcpu-list post-migration:

[root@ovm1 ~]# xm vcpu-list
Name ID VCPU CPU State Time(s) CPU Affinity
0004fb00000600000632b8de1db5a014 3 0 20 -b- 4007.2 any cpu
0004fb00000600000632b8de1db5a014 3 1 21 -b- 134.4 any cpu
Domain-0 0 0 0 r-- 64376.4 0
Domain-0 0 1 1 r-- 62793.1 1
Domain-0 0 2 2 -b- 16201.5 2
Domain-0 0 3 3 -b- 10418.6 3
Domain-0 0 4 4 -b- 2743.2 4
Domain-0 0 5 5 -b- 9486.1 5
Domain-0 0 6 6 -b- 5702.4 6
Domain-0 0 7 7 -b- 3955.7 7
Domain-0 0 8 8 -b- 2279.8 8
Domain-0 0 9 9 -b- 8530.4 9
Domain-0 0 10 10 -b- 4984.4 10
Domain-0 0 11 11 -b- 4328.3 11
Domain-0 0 12 12 -b- 2013.2 12
Domain-0 0 13 13 -b- 10390.7 13
Domain-0 0 14 14 -b- 5257.2 14
Domain-0 0 15 15 -b- 3542.0 15
Domain-0 0 16 16 -b- 2422.3 16
Domain-0 0 17 17 -b- 9969.5 17
Domain-0 0 18 18 -b- 6150.0 18
Domain-0 0 19 19 -b- 4532.5 19



[root@ovm2 ~]# xm vcpu-list
Name ID VCPU CPU State Time(s) CPU Affinity
0004fb00000600008825773ba1661d01 5 0 2 -b- 1.9 0-3
0004fb00000600008825773ba1661d01 5 1 1 -b- 0.2 0-3
Domain-0 0 0 0 -b- 54418.2 0
Domain-0 0 1 1 -b- 10228.5 1
Domain-0 0 2 2 -b- 4939.8 2
Domain-0 0 3 3 -b- 6373.9 3
Domain-0 0 4 4 -b- 1024.7 4
Domain-0 0 5 5 -b- 6547.6 5
Domain-0 0 6 6 -b- 4218.0 6
Domain-0 0 7 7 -b- 1596.2 7
Domain-0 0 8 8 -b- 1374.9 8
Domain-0 0 9 9 -b- 14341.6 9
Domain-0 0 10 10 -b- 4099.5 10
Domain-0 0 11 11 -b- 1822.6 11
Domain-0 0 12 12 -b- 1737.6 12
Domain-0 0 13 13 r-- 5513.4 13
Domain-0 0 14 14 -b- 7016.8 14
Domain-0 0 15 15 -b- 5814.6 15
Domain-0 0 16 16 -b- 731.6 16
Domain-0 0 17 17 -b- 4960.6 17
Domain-0 0 18 18 -b- 3617.2 18
Domain-0 0 19 19 -b- 1714.2 19
0004fb0000060000c9e5303a8dc2c675 3 0 3 -b- 5590.3 0-3
0004fb0000060000c9e5303a8dc2c675 3 1 0 -b- 145.6 0-3

 

You can see that both VMs are pinned to the same vCPUs and they’re still running just fine.  Like I said- it will technically work but you’re shooting yourself in the foot in multiple ways if you do this.  Also keep in mind- if you turn on HA for prod_db1 and ovm1 goes down, the VM will fail to start on ovm2 because of the cpu pinning.  Don’t say I didn’t warn you!
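
Spotting this by eye in a long vcpu-list gets old fast, so here is a small sketch that reads xm vcpu-list output and prints any CPU affinity shared by more than one guest. The function name find_shared_pinning is mine. It only compares affinity strings literally, so it catches two guests both pinned to 0-3 but not a partial overlap such as 0-3 against 2-5.

```shell
#!/bin/sh
# find_shared_pinning: reads `xm vcpu-list` output on stdin and prints
# each CPU affinity that more than one guest is pinned to, followed by
# the guest IDs. Domain-0 and unpinned ("any cpu") guests are ignored.
find_shared_pinning() {
    awk '
        $1 == "Name" || $1 == "Domain-0" || NF < 7 { next }
        {
            # Affinity is everything from field 7 on ("any cpu" is two words).
            aff = $7
            for (i = 8; i <= NF; i++) aff = aff " " $i
            if (aff == "any cpu") next
            # Count each guest once per affinity, however many vCPUs it has.
            if (!(($1, aff) in seen)) {
                seen[$1, aff] = 1
                count[aff]++
                names[aff] = names[aff] " " $1
            }
        }
        END {
            for (a in count)
                if (count[a] > 1) print a ":" names[a]
        }
    '
}

# Usage on an OVM server:
#   xm vcpu-list | find_shared_pinning
```

No output means no two guests on that server share an identical pinning.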

 

  • Apply CPU pinning to a VM online with no reboot

In OVM 3.2 and 3.3, you were able to apply CPU pinning to a VM live without having to restart it.  A bug emerged in OVM 3.4.1 and 3.4.2 that broke this.  However it was fixed in OVM 3.4.3.  So depending on which version of OVM you’re running, you may be able to pin your VMs without having to take a reboot.  Watch and be amazed!

 

Currently running OVM 3.3.3:

[root@ovm1 ~]# cat /etc/ovs-release
Oracle VM server release 3.3.3

 

ovm_vmcontrol utilities are installed:

[root@ovmm ovm_util]# pwd
/u01/app/oracle/ovm-manager-3/ovm_util
[root@ovmm ovm_util]# ls -la
total 44
drwxrwxr-x 5 root root 4096 Jul 2 2014 .
drwxr-xr-x 11 oracle dba 4096 Aug 29 13:04 ..
drwxrwxr-x 2 root root 4096 Jul 2 2014 class
drwxr-xr-x 2 root root 4096 Jul 2 2014 lib
drwxr-xr-x 3 root root 4096 Jul 2 2014 man
-rwxr-xr-x 1 root root 1229 Jul 2 2014 ovm_reporestore
-rwxr-xr-x 1 root root 1227 Jul 2 2014 ovm_vmcontrol
-rwxr-xr-x 1 root root 1245 Jul 2 2014 ovm_vmdisks
-rwxr-xr-x 1 root root 1245 Jul 2 2014 ovm_vmhostd
-rwxr-xr-x 1 root root 1246 Jul 2 2014 ovm_vmmessage
-rwxr-xr-x 1 root root 2854 Jul 2 2014 vm-dump-metrics

 

I have an existing VM that is currently allowed to run on any vCPU on the server:

[root@ovm1 ~]# xm vcpu-list
Name ID VCPU CPU State Time(s) CPU Affinity
0004fb00000600000632b8de1db5a014 3 0 20 -b- 4012.8 any cpu
0004fb00000600000632b8de1db5a014 3 1 21 -b- 134.6 any cpu
Domain-0 0 0 0 -b- 64446.0 0
Domain-0 0 1 1 -b- 62820.1 1
Domain-0 0 2 2 -b- 16213.7 2
Domain-0 0 3 3 -b- 10426.0 3
Domain-0 0 4 4 -b- 2746.1 4
Domain-0 0 5 5 -b- 9499.3 5
Domain-0 0 6 6 -b- 5712.5 6
Domain-0 0 7 7 -b- 3960.2 7
Domain-0 0 8 8 -b- 2282.3 8
Domain-0 0 9 9 -b- 8541.0 9
Domain-0 0 10 10 -b- 4992.0 10
Domain-0 0 11 11 -b- 4334.6 11
Domain-0 0 12 12 -b- 2015.6 12
Domain-0 0 13 13 -b- 10404.4 13
Domain-0 0 14 14 -b- 5265.1 14
Domain-0 0 15 15 -b- 3546.7 15
Domain-0 0 16 16 -b- 2423.7 16
Domain-0 0 17 17 r-- 9983.8 17
Domain-0 0 18 18 -b- 6158.2 18
Domain-0 0 19 19 -b- 4536.8 19

 

Now let’s pin that VM to vCPUs 8-11:

[root@ovmm ovm_util]# ./ovm_vmcontrol -u admin -p ******** -h localhost -v prod_db2 -c vcpuset -s 8-11
Oracle VM VM Control utility 2.0.1.
Connecting with a secure connection.
Connected.
Command : vcpuset
Pinning virtual CPUs
Pinning of virtual CPUs to physical threads '8-11' 'prod_db2' completed.

 

And here’s our proof that the pinning is applied immediately with no reboot:

[root@ovm1 ~]# xm vcpu-list
Name ID VCPU CPU State Time(s) CPU Affinity
0004fb00000600000632b8de1db5a014 3 0 10 -b- 4013.6 8-11
0004fb00000600000632b8de1db5a014 3 1 8 -b- 134.6 8-11
Domain-0 0 0 0 -b- 64454.8 0
Domain-0 0 1 1 -b- 62823.2 1
Domain-0 0 2 2 -b- 16215.2 2
Domain-0 0 3 3 -b- 10427.0 3
Domain-0 0 4 4 -b- 2746.3 4
Domain-0 0 5 5 r-- 9500.6 5
Domain-0 0 6 6 -b- 5713.6 6
Domain-0 0 7 7 -b- 3960.6 7
Domain-0 0 8 8 -b- 2282.5 8
Domain-0 0 9 9 -b- 8542.9 9
Domain-0 0 10 10 -b- 4992.8 10
Domain-0 0 11 11 -b- 4335.0 11
Domain-0 0 12 12 -b- 2015.8 12
Domain-0 0 13 13 -b- 10406.7 13
Domain-0 0 14 14 -b- 5266.4 14
Domain-0 0 15 15 -b- 3547.2 15
Domain-0 0 16 16 -b- 2424.2 16
Domain-0 0 17 17 -b- 9984.8 17
Domain-0 0 18 18 -b- 6159.6 18
Domain-0 0 19 19 -b- 4537.6 19

 

You’ll just have to take my word that I didn’t reboot the VM in between the steps, which is backed up by the time column for that VM (note that it increased a little rather than resetting to 0).

 

 

Well- happy hunting for now!

OVM CPU Pinning


 

Oracle has published a few documents (2240035.1 and 2213691.1 for starters) about CPU pinning in relation to hard partitions for VMs running on OVM.  This is to avoid having to license every core on the server (like you have to with VMware) for Oracle products that are licensed per core or per user.

 

I’m going to provide an excel spreadsheet at the end of this post that will help you visualize which VM is pinned to which CPU and if there is any overlap.  When a VM is not pinned to a given CPU, it is allowed to run on any cpu within the constraints of the Xen scheduler and where it wants the VM to run.  It will take into account things like NUMA and core boundaries to avoid scheduling a VM in a way that is inefficient.

 

You will need to modify this spreadsheet to fit your server configuration.  Use the information in the ovm-hardpart-168217 document to figure out what your system’s CPU topology looks like.

 

A couple things to keep in mind:

  • You cannot live migrate a VM that is pinned.  Technically it will work and the VM will migrate, but Oracle does not allow this based on the terms of their hard partitioning license.  See attached document ovm-hardpart-168217 at the end of this post for more information.
  • When you pin a VM to a vCPU or range of vCPUs, that VM can only run on those vCPUs.  However, if you have other VMs that are not pinned, they can run on any vCPU on the system- including the ones that you just pinned your production database to!  If you have a combination of pinned and unpinned VMs, pin all the other VMs to the range of vCPUs that you want to lock them to.  This way, they can’t run on any vCPUs that you’ve already pinned VMs to.
  • Remember that DOM0 has to be scheduled to run just like the other resources.  Based on how big your system is, OVM will run DOM0 on the first few vCPUs.  This shouldn’t be a problem unless your DOM0 is extremely busy doing work such as processing I/O for the VMs that are running and handling interrupts.  In this case, if you have VMs that are pinned to the same vCPUs as DOM0 you might have some performance problems.  I’ve outlined where DOM0 runs by default for the system size used in the example.
  • Realize that you can pin more than one VM to a vCPU.  I wouldn’t recommend this for obvious performance reasons but it’s possible to do.  This is where the spreadsheet comes in handy.
  • If you’re installing the ovm utilities, which provide ovm_vmcontrol, you may need to enable remote connections first.  If you get an error message stating that there is an error connecting to localhost, perform the steps below.  You have to pay attention to the version of the ovm utilities that you install.  The readme will show you which of the three (currently) versions to install based on the version of OVM you’re running.
  • Below are the steps to enable remote connections (this was taken from Douglas Hawthorne’s blog here).  Note that the steps below should be performed as the root user, not oracle:
[root@melbourne ~]# cd /u01/app/oracle/ovm-manager-3/bin
[root@melbourne bin]# ./secureOvmmTcpGenKeyStore.sh
Generate OVMM TCP over SSH key store by following steps:
Enter keystore password:
Re-enter new password:
What is your first and last name?
 [Unknown]: OVM
What is the name of your organizational unit?
 [Unknown]: melbourne
What is the name of your organization?
 [Unknown]: YAOCM
What is the name of your City or Locality?
 [Unknown]: Melbourne
What is the name of your State or Province?
 [Unknown]: Victoria
What is the two-letter country code for this unit?
 [Unknown]: AU
Is CN=OVM, OU=melbourne, O=YAOCM, L=Melbourne, ST=Victoria, C=AU correct?
 [no]: yes

Enter key password for <ovmm>
 (RETURN if same as keystore password):
Re-enter new password:
[root@melbourne bin]# ./secureOvmmTcp.sh
Enabling OVMM TCP over SSH service

Please enter the Oracle VM manager user name: admin

Please enter the Oracle VM manager user password:

Please enter the password for TCPS key store :

The job of enabling OVMM TCPS service is committed, please restart OVMM to take effect.





[root@melbourne ~]# service ovmm restart
Stopping Oracle VM Manager [ OK ]
Starting Oracle VM Manager [ OK ]

 

If you have any questions- feel free to post them here.  Good luck!

 

 

CPU pinning example

ovm-hardpart-168217

OVM Manager Cipher Mismatch fix

I was installing a virtual OVM 3.3.3 test environment the other day, and when I tried to log into OVM Manager for the first time the browser refused to connect with a cipher mismatch error.

This has to do with the fact that most modern browsers have dropped support for the older RC4 encryption cipher, which is what OVM Manager uses.  There is a “fix” until you update to a newer version that has this bug patched.  See InfoDoc 2099148.1 for all the details, but here’s the meat of it:

 

  • Make a backup of the Weblogic config file
# cd /u01/app/oracle/ovm-manager-3/domains/ovm_domain/config
# cp config.xml config.xml.bak

 

  • Add the following line to the ciphersuite section (search for ciphersuite)
<ciphersuite>TLS_RSA_WITH_AES_128_CBC_SHA</ciphersuite>

 

  • Restart the ovm manager service and all is well
# service ovmm restart
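
Before restarting, it is worth confirming the edit actually landed in config.xml. A minimal sketch of such a check follows. The has_ciphersuite helper is hypothetical and only looks for the literal element text, so it will not notice if the line was pasted outside the right section.

```shell
#!/bin/sh
# has_ciphersuite CONFIG_FILE CIPHER_NAME
# Returns 0 if the file contains a <ciphersuite> element with that name.
has_ciphersuite() {
    grep -qF "<ciphersuite>$2</ciphersuite>" "$1"
}

# Example, run from the config directory:
#   has_ciphersuite config.xml TLS_RSA_WITH_AES_128_CBC_SHA && echo "cipher present"
```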

Configure simple DNS server on RHEL 6

Sometimes when setting up hardware for a customer, it makes things a lot easier if I can simulate their network in our lab.  This allows me to deploy the solution plug and play without having to re-ip a bunch of stuff or wait until I’m on their network to do most of the install.  A couple problems I’ve come across are access to the internet for patches/updates and DNS.

 

I’ve generally used an old Netgear or Linksys router to front the customer’s internal network inside my lab environment and just connect it to the back of our cable modem.  This solves the first problem- internet access.  The other problem is a bit more involved, since you have to have a DNS server on that network (preferably on the same IP address as in the real network when it’s deployed).  I’ve taken to using Linux as a stepping stone.  It’s really simple to install Linux or grab a box that’s already there and plug it into my private sandbox.  Once that’s done, you just need to install and configure a DNS server.  Here is the step by step process (your IP network will be different, just substitute where appropriate). FYI- I’m running Oracle Linux 6.7 with the Red Hat Compatible Kernel for this tutorial. CentOS 6.7 and RHEL 6.7 are no different other than the repositories you point to in order to get your patches.

Let’s install BIND (Berkeley Internet Name Domain), the DNS server software we’ll be using

# yum install -y bind bind-utils
[root@tempDNS ~]# yum install -y bind bind-utils
Loaded plugins: refresh-packagekit, security, ulninfo
Setting up Install Process
public_ol6_latest                                                                                            | 1.4 kB     00:00
Resolving Dependencies
--> Running transaction check
---> Package bind.x86_64 32:9.8.2-0.62.rc1.el6_9.2 will be installed
--> Processing Dependency: bind-libs = 32:9.8.2-0.62.rc1.el6_9.2 for package: 32:bind-9.8.2-0.62.rc1.el6_9.2.x86_64
---> Package bind-utils.x86_64 32:9.8.2-0.37.rc1.el6 will be updated
---> Package bind-utils.x86_64 32:9.8.2-0.62.rc1.el6_9.2 will be an update
--> Running transaction check
---> Package bind-libs.x86_64 32:9.8.2-0.37.rc1.el6 will be updated
---> Package bind-libs.x86_64 32:9.8.2-0.62.rc1.el6_9.2 will be an update
--> Finished Dependency Resolution

Dependencies Resolved

====================================================================================================================================
 Package                   Arch                  Version                                     Repository                        Size
====================================================================================================================================
Installing:
 bind                      x86_64                32:9.8.2-0.62.rc1.el6_9.2                   public_ol6_latest                4.0 M
Updating:
 bind-utils                x86_64                32:9.8.2-0.62.rc1.el6_9.2                   public_ol6_latest                188 k
Updating for dependencies:
 bind-libs                 x86_64                32:9.8.2-0.62.rc1.el6_9.2                   public_ol6_latest                891 k

Transaction Summary
====================================================================================================================================
Install       1 Package(s)
Upgrade       2 Package(s)

Total download size: 5.1 M
Downloading Packages:
(1/3): bind-9.8.2-0.62.rc1.el6_9.2.x86_64.rpm                                                                | 4.0 MB     00:00
(2/3): bind-libs-9.8.2-0.62.rc1.el6_9.2.x86_64.rpm                                                           | 891 kB     00:00
(3/3): bind-utils-9.8.2-0.62.rc1.el6_9.2.x86_64.rpm                                                          | 188 kB     00:00
------------------------------------------------------------------------------------------------------------------------------------
Total                                                                                               3.8 MB/s | 5.1 MB     00:01
warning: rpmts_HdrFromFdno: Header V3 RSA/SHA256 Signature, key ID ec551f03: NOKEY
Retrieving key from file:///etc/pki/rpm-gpg/RPM-GPG-KEY-oracle
Importing GPG key 0xEC551F03:
 Userid : Oracle OSS group (Open Source Software group) 
 Package: 6:oraclelinux-release-6Server-7.0.5.x86_64 (@anaconda-OracleLinuxServer-201507280245.x86_64/6.7)
 From   : /etc/pki/rpm-gpg/RPM-GPG-KEY-oracle
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
  Updating   : 32:bind-libs-9.8.2-0.62.rc1.el6_9.2.x86_64                                                                       1/5
  Updating   : 32:bind-utils-9.8.2-0.62.rc1.el6_9.2.x86_64                                                                      2/5
  Installing : 32:bind-9.8.2-0.62.rc1.el6_9.2.x86_64                                                                            3/5
  Cleanup    : 32:bind-utils-9.8.2-0.37.rc1.el6.x86_64                                                                          4/5
  Cleanup    : 32:bind-libs-9.8.2-0.37.rc1.el6.x86_64                                                                           5/5
  Verifying  : 32:bind-utils-9.8.2-0.62.rc1.el6_9.2.x86_64                                                                      1/5
  Verifying  : 32:bind-9.8.2-0.62.rc1.el6_9.2.x86_64                                                                            2/5
  Verifying  : 32:bind-libs-9.8.2-0.62.rc1.el6_9.2.x86_64                                                                       3/5
  Verifying  : 32:bind-libs-9.8.2-0.37.rc1.el6.x86_64                                                                           4/5
  Verifying  : 32:bind-utils-9.8.2-0.37.rc1.el6.x86_64                                                                          5/5

Installed:
  bind.x86_64 32:9.8.2-0.62.rc1.el6_9.2

Updated:
  bind-utils.x86_64 32:9.8.2-0.62.rc1.el6_9.2

Dependency Updated:
  bind-libs.x86_64 32:9.8.2-0.62.rc1.el6_9.2

Complete!

Ok, now that we have that done, let’s do a system update to make sure we have all the latest bits and bytes. If this is a production system, consult your company’s policy on updates and patches before doing this. I don’t want to be responsible for breaking the other applications on this server.

[root@tempDNS ~]# yum update -y
Loaded plugins: refresh-packagekit, security, ulninfo
Setting up Update Process
Resolving Dependencies
--> Running transaction check
---> Package ConsoleKit.x86_64 0:0.4.1-3.el6 will be updated
---> Package ConsoleKit.x86_64 0:0.4.1-6.el6 will be an update
---> Package ConsoleKit-libs.x86_64 0:0.4.1-3.el6 will be updated
..
..
..
Complete!

At this point, I generally recommend a reboot so any updates that require one are taken care of. It also makes sure the system is at a known good place for our work.

Let’s edit the /etc/named.conf file and replace the options section with our own custom code:

options {
        listen-on port 53 { 127.0.0.1; 192.168.1.50; };
        #listen-on-v6 port 53 { ::1; };
        directory   "/var/named";
        dump-file   "/var/named/data/cache_dump.db";
        statistics-file "/var/named/data/named_stats.txt";
        memstatistics-file "/var/named/data/named_mem_stats.txt";
        allow-query { any; };
        allow-transfer     { localhost; };
        recursion yes;

        dnssec-enable yes;
        dnssec-validation yes;
        dnssec-lookaside auto;

        /* Path to ISC DLV key */
        bindkeys-file "/etc/named.iscdlv.key";

        managed-keys-directory "/var/named/dynamic";
};

Note above that I have added my local IP address to the end of the listen-on line. Now let’s add a couple of zone definitions to /etc/named.conf.

zone "mydomain.com" IN {
                type master;
                file "mydomain.com.zone";
                allow-update { none; };
};

zone "1.168.192.in-addr.arpa" IN {
                type master;
                file "1.168.192.in-addr.arpa";
                allow-update { none; };
};

Obviously change the domain name to your own on the zone line and the file line. Leave the .zone at the end though.

Here are the two files you want to put into /var/named/

mydomain.com.zone

$TTL 86400
@   IN  SOA     ns1.mydomain.com. root.mydomain.com. (
        2017062601  ;Serial
        3600        ;Refresh
        1800        ;Retry
        604800      ;Expire
        86400       ;Minimum TTL
)
; Specify our nameserver
                IN      NS              ns1.mydomain.com.

; Resolve nameserver hostname to IP
ns1             IN      A               192.168.1.50

; Define hostname -> IP pairs which you wish to resolve
gateway         IN      A               192.168.1.1

1.168.192.in-addr.arpa

$TTL 86400
@       IN      SOA     ns1.mydomain.com.        root.mydomain.com. (
                        2017062601
                        21600      ; refresh after 6 hours
                        3600       ; retry after 1 hour
                        604800     ; expire after 1 week
                        86400 )    ; minimum TTL of 1 day
;
@       IN      NS      ns1.mydomain.com.
;
1       IN      PTR     gateway.mydomain.com.

There are a few things I’d like you to note.

1) You have to update the serial number any time you make a change to the zone file (forward or reverse). I usually use the format YYYYMMDD## where ## is a sequential number starting with 01. This way if you make multiple updates on the same day, any secondary name servers that transfer the zone will know which version is current.

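2) Take notice of the . at the end of the host names in the zone files, for example in the reverse zone file. These have to be there- the trailing dot marks the name as fully qualified, ending at the root of the domain hierarchy. Without it, the server appends the zone name to whatever you typed, so the name ends up with the zone tacked on a second time.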

3) In my example above, I also have an entry for gateway.mydomain.com which has an IP address of 192.168.1.1. This is not normally something you would need or want to do but I wanted to show the syntax of how to do it.

4) For every record you want to add to DNS, it’s a good idea to make sure you also add a reverse record. This lets you do an nslookup or dig against the IP address and it will return the name. A lot of stuff will break or at the very least give you problems if it’s not in place so just get in the habit of doing it.
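
Notes 1 and 4 are the two I forget most often, so here is a minimal Python sketch that covers both. It is only an illustration: the function names next_serial and ptr_record are made up, and ptr_record assumes a /24 reverse zone like the one in this example. It relies on the standard library ipaddress module to build the in-addr.arpa name.

```python
import datetime
import ipaddress


def next_serial(current, today=None):
    """Return the next zone serial in YYYYMMDD## form.

    If `current` already carries today's date (or a later one), just add
    one. Otherwise start today's sequence at 01. Either way the result is
    larger than `current`, which is what matters for a serial number.
    """
    today = today or datetime.date.today()
    base = int(today.strftime("%Y%m%d")) * 100
    if current >= base + 1:
        return current + 1
    return base + 1


def ptr_record(ip, fqdn):
    """Return (reverse zone name, PTR line) for an IPv4 address.

    Assumes a /24 reverse zone, so the last octet is the record label
    and the other three octets name the zone.
    """
    labels = ipaddress.ip_address(ip).reverse_pointer.split(".")
    zone = ".".join(labels[1:])
    # Trailing dot makes the target fully qualified (see note 2).
    line = "{0}       IN      PTR     {1}.".format(labels[0], fqdn.rstrip("."))
    return zone, line


zone, line = ptr_record("192.168.1.1", "gateway.mydomain.com")
print(zone)
print(line)
```

Bump the serial in both the forward and the reverse file whenever you add a pair of records this way.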

That’s pretty much it. There are a lot of other nuances that I don’t need to get into here. I almost didn’t write this because there are so many tutorials out there that IMHO are written better than mine. Mainly I wanted to keep it for my own use so I know right where to go when I need to install a quick and dirty DNS server. Hopefully one of you will benefit from this.

Enjoy!!

Virtualized ODA X6-2HA – working with VMs

It’s been awhile since I built a virtualized ODA with VMs on a shared repo so I thought I’d go through the basic steps.

  1. install the OS
    1. install Virtual ISO image
    2. configure networking
    3. install ODA_BASE patch
    4. deploy ODA_BASE
    5. configure networking in ODA_BASE
    6. deploy ODA_BASE with configurator
  2. create shared repository.  This is where your specific situation plays out.  Depending on your hardware you may have less or more space in DATA or RECO.  Your DBA will be able to tell you how much they need for each and where you can borrow a few terabytes (or however much you need) for your VMs
  3. (optionally) create a separate shared repository to store your templates.  This all depends on how many of the same kind of VM you’ll be deploying.  If it makes no sense to keep the templates around once you create your VMs then don’t bother with this step
  4. import template into repository
    1. download the assembly file from Oracle (it will unzip into an .ova archive file)
    2. ***CRITICAL*** copy the .ova to /OVS on either node’s DOM0, not into ODA_BASE
    3. import the assembly (point it to the file sitting in DOM0 /OVS)
  5. modify template config as needed (# of vCPUs, Memory, etc)
  6. clone the template to a VM
  7. add network to VM (usually net1 for first public network, net2 for second and net3+ for any VLANs you’ve created)
  8. boot VM and start console (easiest way is to VNC into ODA_BASE and launch it from there)
  9. set up your hostname, networking, etc the way you want it
  10. reboot VM to ensure changes persist
  11. rinse and repeat as needed

If you need to configure HA, preferred node or any other things, this is the time to do it.

 

Create VM in Oracle VM for x86 using NFS share

I’m using OVM Manager 3.4.2 and OVM Server 3.3.2 to test an upgrade for one of our customers.  I am using Starwind iSCSI server to present the shared storage to the cluster but in production you should use enterprise grade hardware to do this.  There’s an easier way to do this- create an HVM VM and install from an ISO stored in a repository.  Then power the VM off and change the type to PVM then power on.  This may not work with all operating systems however so I’m going over how to create a new PVM VM from an ISO image shared from an NFS server.

* Download ISO (I'm using Oracle Linux 6.5 64bit for this example)
* Copy ISO image to OVM Manager (any NFS server is fine)
* Mount ISO on the loopback device
# mount -o loop /var/tmp/V41362-01.iso /mnt

* Share the folder via NFS
# service nfs start
Starting NFS services: [ OK ]
Starting NFS quotas: [ OK ]
Starting NFS mountd: [ OK ]
Starting NFS daemon: [ OK ]
Starting RPC idmapd: [ OK ]

# exportfs *:/mnt/

# showmount -e
Export list for ovmm:
/mnt *

* Create new VM in OVM Manager
* Edit VM properties and configure as PVM
* Set additional properties such as memory, cpu and network
* At the boot order tab, enter the network boot path formatted like this:
  nfs:{ip address or FQDN of NFS host}:/{path to ISO image top level directory}

For example, our NFS server is 10.2.3.4 and the path where I mounted the ISO is at /mnt.  Leave the {}'s off of course:

  nfs:10.2.3.4:/mnt 

You should be able to boot your VM at this point and perform the install of the OS.

Nimble PowerShell Toolkit

I was working on an internal project to test performance of a converged system solution.  The storage component is a Nimble AF7000 from which we’re presenting a number of LUNs.  There are almost 30 LUNs and I’ve had to create, delete and provision them a number of times throughout the project.  It became extremely tedious to do this through the WebUI so I decided to see if it could be scripted.

I know you can log into the nimble via ssh and basically do what I’m trying to do- and I did test this with success.  However I’ve recently had a customer who wanted to use PowerShell to perform some daily snapshot/clone operations for Oracle database running on windows (don’t ask).  We decided to leverage the Nimble PowerShell Toolkit to perform the operations right from the windows server.  The script was fairly straightforward, although we had to learn a little about PowerShell syntax and such.  I’ve included a sanitized script below that basically does what I need to.

$arrayname = "IP address or FQDN of array management address"
$nm_uid = "admin"
$nm_password = ConvertTo-SecureString -String "admin" -AsPlainText -Force
$nm_cred = New-Object -TypeName System.Management.Automation.PSCredential -ArgumentList $nm_uid,$nm_password

# Import Nimble Tool Kit for PowerShell
import-module NimblePowerShellToolKit

# Connect to the array
Connect-NSGroup -group $arrayname -credential $nm_cred

# Look up the initiator group ID (needs the module loaded and the array connected)
$initiatorID = Get-NSInitiatorGroup -name {name of initiator group} | select -expandproperty id

# Create 10 DATA Disks
for ($i=1; $i -le 10; $i++) {
    New-NSVolume -Name DATADISK$i -Size 1048576 -PerfPolicy_id 036462b75de9a4f69600000000000000000000000e -online $true
    $volumeID = Get-NSVolume -name DATADISK$i | select -expandproperty id
    New-NSAccessControlRecord -initiator_group_id $initiatorID -vol_id $volumeID
}

# Create 10 RECO Disks
for ($i=1; $i -le 10; $i++) {
    New-NSVolume -Name RECODISK$i -Size 1048576 -PerfPolicy_id 036462b75de9a4f69600000000000000000000000e -online $true
    $volumeID = Get-NSVolume -name RECODISK$i | select -expandproperty id
    New-NSAccessControlRecord -initiator_group_id $initiatorID -vol_id $volumeID
}

# Create 3 GRID Disks
for ($i=1; $i -le 3; $i++) {
    New-NSVolume -Name GRIDDISK$i -Size 2048 -PerfPolicy_id 036462b75de9a4f69600000000000000000000000e -online $true
    $volumeID = Get-NSVolume -name GRIDDISK$i | select -expandproperty id
    New-NSAccessControlRecord -initiator_group_id $initiatorID -vol_id $volumeID
}

I also wrote a script to delete the LUNs below:

$arrayname = "IP address or FQDN of array management address"
$nm_uid = "admin"
$nm_password = ConvertTo-SecureString -String "admin" -AsPlainText -Force
$nm_cred = New-Object -TypeName System.Management.Automation.PSCredential -ArgumentList $nm_uid,$nm_password

# Import Nimble Tool Kit for PowerShell
import-module NimblePowerShellToolKit

# Connect to the array
Connect-NSGroup -group $arrayname -credential $nm_cred

# Look up the initiator group ID (needs the module loaded and the array connected)
$initiatorID = Get-NSInitiatorGroup -name {name of initiator group} | select -expandproperty id


# Delete 10 DATA Disks
for ($i=1; $i -le 10; $i++) {
    Set-NSVolume -name DATADISK$i -online $false
    Remove-NSVolume -name DATADISK$i
}

# Delete 10 RECO Disks
for ($i=1; $i -le 10; $i++) {
    Set-NSVolume -name RECODISK$i -online $false
    Remove-NSVolume -name RECODISK$i 
}

# Delete 3 GRID Disks
for ($i=1; $i -le 3; $i++) {
    Set-NSVolume -name GRIDDISK$i -online $false
    Remove-NSVolume -name GRIDDISK$i 
}

Obviously you’ll have to substitute some of the values such as $arrayname, $nm_uid, $nm_password and $initiatorID (make sure you remove the {}’s when you put your value here). Keeping the password in plain text inside the script is very insecure, but it was a quick and dirty solution at the time. There are ways to keep the password encrypted in a protected file and read it back into a variable when the script runs. Or if you don’t mind being interactive, you can skip providing the credentials and it will pop up a password dialog box for you to enter them every time the script runs.

It made the project go a lot faster- hopefully you can use this to model different scripts to do other things. The entire command set of the Nimble array is basically exposed through the toolkit so there’s not a whole lot you can’t do here that you could in the WebUI. When you download the toolkit- there is a README PDF that goes through all the commands. When in PowerShell, you can also get help for each of the commands. For example:

PS C:\Users\esteed> help New-NSVolume

NAME
    New-NSvolume

SYNOPSIS
    Create operation is used to create or clone a volume. Creating volumes requires name and size attributes. Cloning
    volumes requires clone, name and base_snap_id attributes where clone is set to true. Newly created volume will not
    have any access control records, they can be added to the volume by create operation on access_control_records
    object set. Cloned volume inherits access control records from the parent volume.


SYNTAX
    New-NSvolume [-name] <String> [-size] <UInt64> [[-description] <String>] [[-perfpolicy_id] <String>] [[-reserve]
    <UInt64>] [[-warn_level] <UInt64>] [[-limit] <UInt64>] [[-snap_reserve] <UInt64>] [[-snap_warn_level] <UInt64>]
    [[-snap_limit] <UInt64>] [[-online] <Boolean>] [[-multi_initiator] <Boolean>] [[-pool_id] <String>] [[-read_only]
    <Boolean>] [[-block_size] <UInt64>] [[-clone] <Boolean>] [[-base_snap_id] <String>] [[-agent_type] <String>]
    [[-dest_pool_id] <String>] [[-cache_pinned] <Boolean>] [[-encryption_cipher] <String>] [<CommonParameters>]


DESCRIPTION
    Create operation is used to create or clone a volume. Creating volumes requires name and size attributes. Cloning
    volumes requires clone, name and base_snap_id attributes where clone is set to true. Newly created volume will not
    have any access control records, they can be added to the volume by create operation on access_control_records
    object set. Cloned volume inherits access control records from the parent volume.


RELATED LINKS

REMARKS
    To see the examples, type: "get-help New-NSvolume -examples".
    For more information, type: "get-help New-NSvolume -detailed".
    For technical information, type: "get-help New-NSvolume -full".

You can also use the -detailed parameter at the end to get a more complete description of each option. Additionally you can use -examples to see the commands used in real world situations. Have fun!

Temperature monitoring script with email alerts

We have quite a bit of expensive equipment in our server room and we’ve had the A/C fail a couple times. As a result, I’ve installed a raspberry pi zero with a DS18B20 temperature sensor connected to it to monitor the temperature of the room.  If it goes above a set threshold, it will send an email to the engineers so we can log in and shut stuff down until the problem is fixed.

 

This project branches off from the one I did earlier on monitoring temperature with a raspberry pi and MRTG.  This too uses MRTG but I won’t get into the details of that- you can see how I set that up here.

 

The big piece here is the alerting logic.  You’d be surprised how fast the temp can go up in a small room with lots of gear putting out a lot of heat.  For that reason, I monitor the temperature every minute.  If the current temperature exceeds the threshold I set in the script, it fires an email and sets an alert flag to true.  The reason I did this is so we don’t get an email every minute while the temperature is above threshold.  How irritating would that be?  So another piece of logic in the script checks to see if the alert flag has been tripped.  If it has, no email is sent until the temperature comes down below the threshold.  Then an all clear email is sent and the cycle repeats itself.
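
The flag logic described above boils down to a tiny state machine. Here is a minimal Python sketch of it. It is not the actual script on GitHub: the function name check_temperature, the threshold of 80 and the sample readings are all made up for illustration, and it assumes the caller keeps the flag between runs (in a small file, for example, if the check is started fresh every minute).

```python
def check_temperature(temp, threshold, alert_active):
    """Decide what to do for a single temperature reading.

    Returns (message, alert_active). message is "alert" when the
    threshold is first exceeded, "all clear" when the temperature has
    dropped back below it, and None when no email should be sent.
    """
    if temp > threshold and not alert_active:
        return "alert", True
    if temp < threshold and alert_active:
        return "all clear", False
    return None, alert_active


# Simulate five one-minute readings against a threshold of 80.
alert_active = False
for temp in [74.0, 81.5, 83.0, 79.0, 75.0]:
    message, alert_active = check_temperature(temp, 80.0, alert_active)
    if message:
        print("{0}: {1}".format(message, temp))
```

Only two of the five readings produce a message: the first one over the threshold and the first one back under it. Everything in between stays quiet, which is the whole point of the flag.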

 

I used the instructions here to set up ssmtp on the pi.  In my case, I used our comcast email relay since we have comcast so the instructions for that are a little different.  You can also use your company’s own mail relay if you have one internally that can be used to send email to external addresses.  As has been my practice lately, I’ve uploaded the code to GitHub here for you to do with as you please.

 

As always, if you have any constructive criticism or comments, feel free to leave them below and I’ll get back to you ASAP.

SSH Tunneling with PuTTY

From time to time I have a need to connect to a system inside another remote network (usually my work).  Normally I just ssh in and then jump to the machine I need to be on.  That’s all fine and dandy if you don’t need a GUI.  What if you need to be on the GUI console of the target machine inside the firewall and the firewall doesn’t allow the port you need to use?

 

Enter VNC and PuTTY.  You aren’t limited to doing this with PuTTY or VNC.  It’s just that a majority of my work is done from a windows machine and I refuse to install the bloated CYGWIN app on my machine just to get an ssh command line session.  Bah.. that’s a story for another day.  Anyway- SSH tunnels can be a bit confusing to the lay person so I thought I’d do a graphical illustration to help that out.

 

In this scenario, I will be using my laptop at home to connect into a landing pad UNIX machine at work.  I will then open a tunnel to another machine inside the remote network that will establish a connection to the VNC server running on that machine.  I won’t go into how to set up a VNC Server on linux as there are plenty of tutorials out there that will cover it.  The one thing I will say is make sure you use a password when you start it up.  This is a visual example of what the connection looks like:

 

capture

 

Here are some enlarged views so you can see what’s going on.  First we start PuTTY on the laptop.  I’ll show an example of what options you need to select inside the Putty connection later.  Once the tunnel is in place, fire up your favorite VNC client and point it to 127.0.0.1 or localhost on port 59001:

capture1

We pointed our VNC client to the address and port of the tunnel we just created, so the traffic is sent through the tunnel to the external Landing Pad and is forwarded on into the remote network:

capture2

Finally, the tunnel terminates on the server inside the remote network and connects the tunnel to port 5901 on that machine:

capture3

 

It may seem odd to connect your VNC client to the laptop’s localhost address in order to reach the target machine.  This is because you’re sending that traffic through the SSH tunnel that we set up rather than pointing it directly to the server you want to reach.

 

Now I’ll show you how to configure PuTTY to create the tunnel.  First, fire up Putty and populate the username and IP address of the landing pad server in our example (substitute yours of course).  Leave the port at 22:

capture4

 

Next, scroll down on the left hand side in the Category window and select Tunnels.  Here, populate the source port (59001 in my example), the IP address of the final destination server along with the port you want to connect to on that machine (5901 in my example).  Remember, you aren’t putting the IP address of the landing pad here- we want the target server in the Destination field. Once you have the Source port and Destination fields filled in, click Add and it will pop into the window as seen below:

capture5

 

To establish the tunnel, click Open. This will launch the PuTTY terminal and prompt you for your password.  In this screenshot I’m using root to log in; however, it’s generally a good idea to use a non-privileged user to log into any machine:

 

capture6

Once you see the user prompt and you’re logged in, the tunnel is now in place.  Keep in mind that this SSH session you have open is the only thing keeping that tunnel open.  If you log out of the shell, it also tears down the tunnel so keep this window open while you’re using the tunnel.

 

The next step is to launch a VNC Viewer on your laptop and point it to your local machine on port 59001:

capture7

Click the connect button and you should see the next window prompting you for the password you set up earlier:

capture8

Finally, once you click OK you will be brought to your VNC Desktop on the machine inside the remote network!

capture9

 

So let’s take a step back and review what we’ve effectively done here:

 

Start VNC server:

We have to start a VNC server on the target computer, along with configuring a password to keep everyone else out.  This would have to be done separately.

 

Establish Tunnel:

We first establish the tunnel from the laptop, through the landing pad and finally to the remote server.  I’m making the obvious assumption here that you have the landing pad accessible to the internet on port 22 and that you have an SSH server running that will accept such connections.  You’re effectively logging into the landing pad just like you would on any other day.  The difference here is that we’re also telling PuTTY to set up a tunnel for us pointing to the remote server as well.  Aside from that- your login session will look and feel just the same.

 

Launch VNC Client:

We then start the VNC client on our laptop.  Normally, we would point it directly to the server we want to VNC into.  In our case, we created a tunnel that terminates on your laptop at port 59001.  So we connect our VNC client to the laptop (localhost or 127.0.0.1 should work) and point it to port 59001 instead of the standard port 5901.  The VNC client doesn’t care how the traffic is getting to the VNC server, it just does its job.

Think of this SSH tunnel as kind of a wormhole if that type of thing were to actually exist.  The traditional method of connecting to your remote endpoint would be similar to pointing our space shuttle towards the Andromeda galaxy which is about 2.5 million light years away.  It’s essentially not possible to get there- similar to a firewall that is blocking us.  But what if there were a wormhole that terminated near Earth that ended in the Andromeda galaxy?  If we were to point our space shuttle into the wormhole, theoretically we would pop out the other side at our target.
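
If you’re on a machine with the OpenSSH client instead of PuTTY, the same tunnel is a single ssh command with the -L option. The Python 3 sketch below just builds that command and checks whether the local end of the tunnel is answering. The host names, the address 10.0.0.5 and the function names are placeholders I made up for illustration, so substitute your own.

```python
import socket


def tunnel_command(gateway, target_host, target_port, local_port, user):
    """Build the OpenSSH command line for a local port forward.

    -L local_port:target_host:target_port forwards the local port
    through the gateway to the target. -N tells ssh not to run a remote
    command, so the session only carries the tunnel.
    """
    forward = "{0}:{1}:{2}".format(local_port, target_host, target_port)
    return ["ssh", "-N", "-L", forward, "{0}@{1}".format(user, gateway)]


def tunnel_is_up(local_port, host="127.0.0.1", timeout=2.0):
    """Return True if something is listening on the local tunnel port."""
    try:
        with socket.create_connection((host, local_port), timeout=timeout):
            return True
    except OSError:
        return False


# Hypothetical values matching the ports used in this post.
command = tunnel_command("landingpad.example.com", "10.0.0.5", 5901, 59001, "esteed")
print(" ".join(command))
```

Just like the PuTTY window, the ssh process has to stay running for as long as you want the tunnel open.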

 

If you do plan on doing something like this, make sure your network administrator is ok with it.  They may detect the traffic as malicious if they’re not sure where it’s coming from and you may wind up in trouble.  I hope this helps give a basic understanding of how SSH Tunnels work.

 

 

 

 

 

Internet Ping Meter (part 2 of 2)

Onto the fun stuff!  Below is the Python script that does most of the heavy lifting.  Remember with Python, indentation is critical.  It’s actually used to delimit blocks such as function bodies rather than more traditional delimiters like {}.  Best practice is to use spaces, not tabs, for indentation because tabs can be rendered inconsistently and cause problems.  To avoid this, I like to use an editor such as Notepad++ or the Arduino IDE. It does a great job of taking care of the spacing and indentation. It will even go through the entire script and fix any indentation errors you have automatically. Highly recommended. FYI- You’ll also need to install the PySerial module for this to work:

#!/usr/bin/python

##
## Internet Ping Meter v1.0 - Eric Steed
##
## 01/03/17 - first version - EPS
##
import serial
import sys
import subprocess
import time
latency = 0
ping_targets="8.8.8.8 4.2.2.2 208.67.220.220"
retVal = 0
failLevel = 0
lastLEDStatus = ""

##
## Define array variable alertLevel[] and assign color codes to be sent to the NeoPixel.
## Based on the number of total ping failures, iterate the failLevel by one and
## send the appropriate color code.
##
clearLED = "ic"
alertLevel = []
alertLevel = ["h","g","f","e","d"]

##
## Open the serial port to talk to the NeoPixel. Have to wait for it to initialize
## before we start sending signals
##
port = serial.Serial("/dev/ttyACM0", baudrate=9600, timeout=1)
time.sleep(3)

##
## Green = h
## Greenish Yellow = g
## Yellow = f
## Orange = e
## Red = d
## Black = i
##
## LED #'s
##
## 1-9 = 1-9
## 10 = a
## 11 = b
## 12 = c
##
##
## I'm using a NeoPixel ring with 12 LED Segments to indicate the average latency of
## multiple established servers on the internet.  This way I can tell visually if
## my internet connection is slow, or even down.
##
## To control the NeoPixel, I've assigned specific characters to indicate how many
## LED's to illuminate and what color.  When we tell the NeoPixel to illuminate a
## given number of LED's, we have to account for the fact that the last command
## string that was sent is persistent in that the LED stays lit even when the next
## command string comes in.  For example, if reading 1 determines that 4 LED's
## should be lit, then reading 2 calls for 3 LED's, you wouldn't be able to see that
## because all 4 LED's were still illuminated from the previous cycle.
##
## To account for this, we send an instruction to "illuminate" all 12 LED's with
## the color Black before sending the actual value desired.  This is done by
## assigning a value of 'ic' to the variable clearLED.  I've also added some logic
## at the end of the infinite while loop that says don't send any instructions
## unless there's been a change since the last one.  This gets rid of the blinking
## effect that I was seeing on every update- rather annoying!
##

##
## I'm using the subprocess library for now unless I can get the native Python ping library
## to do it for me.  If stdout is null for a given target, return 0.
##
def doPing(host):
    import os,platform
    pingOutput = subprocess.Popen(["ping -c 1 -w 1 " + host + " | grep rtt | awk -F/ '{print $5}' | awk -F. '{print $1}'"], stdout=subprocess.PIPE, shell=True)
    (out, err) = pingOutput.communicate()
    if (out.rstrip('\n') == ''):
        return 0
    else:
        return out.rstrip('\n')

##
## Get average latency from all of the ping targets. Had to cast the output of
## doPing() into an integer to be able to do math against it
##
while True:
    count=0
    for x in ping_targets.split():
        retVal = int(doPing(x))
        #print "latency = [{0}]".format(retVal)
        # print "type = [{0}]".format(type(retVal))
        if (retVal > 0):
            latency += retVal
            count+=1

    ##
    ## If count is zero, that means we were not able to successfully ping
    ## any of the targets and we should start incrementing the failure count.
    ## Furthermore, if we have been incrementing failLevel and we are now
    ## able to ping, reset the failLevel back to 0 at that time.
    ##
    if (count == 0):
        # Increase failure level
        #print "Failed to ping any host"
        failLevel += 1
        if (failLevel > 4):
            failLevel = 4

    else:
        latency=(latency/count)
        failLevel = 0

    ##
    ## Set LEDStatus to the appropriate value based on latency and failure count
    ##

    #print "Average Latency = [{0}]".format(latency)

    if (latency > 1) and (latency <= 10):
        #print "1-10"
        LEDStatus = clearLED + alertLevel[failLevel] + "1"
    elif (latency >= 11) and (latency <= 20):
        #print "11-20"
        LEDStatus = clearLED + alertLevel[failLevel] + "2"
    elif (latency >= 21) and (latency <= 30):
        #print "21-30"
        LEDStatus = clearLED + alertLevel[failLevel] + "3"
    elif (latency >= 31) and (latency <= 40):
        #print "31-40"
        LEDStatus = clearLED + alertLevel[failLevel] + "4"
    elif (latency >= 41) and (latency <= 50):
        #print "41-50"
        LEDStatus = clearLED + alertLevel[failLevel] + "5"
    elif (latency >= 51) and (latency <= 60):
        #print "51-60"
        LEDStatus = clearLED + alertLevel[failLevel] + "6"
    elif (latency >= 61) and (latency <= 70):
        #print "61-70"
        LEDStatus = clearLED + alertLevel[failLevel] + "7"
    elif (latency >= 71) and (latency <= 80):
        #print "71-80"
        LEDStatus = clearLED + alertLevel[failLevel] + "8"
    elif (latency >= 81) and (latency <= 90):
        #print "81-90"
        LEDStatus = clearLED + alertLevel[failLevel] + "9"
    elif (latency >= 91) and (latency <= 100):
        #print "91-100"
        LEDStatus = clearLED + alertLevel[failLevel] + "a"
    else:
        #print "latency greater than 101"
        LEDStatus = clearLED + alertLevel[failLevel] + "c"

    ##
    ## If the latency is within a different range than the last iteration, send
    ## the command to update the LED count on the NeoPixel.  Otherwise you get
    ## a rather annoying blinking effect as the LED's are updated even if it's the
    ## same measurement as the last time.
    ##
    if (LEDStatus != lastLEDStatus):
        port.write(LEDStatus)
        lastLEDStatus = LEDStatus

    #time.sleep(5)
    #print LEDStatus
    latency = 0

I left the debugging code in the script, so you can uncomment those lines and watch the terminal as the script runs to see what’s going on. Most of the script is fairly straightforward, so I won’t dwell on explaining it step by step.
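
As an aside, the long if/elif chain in the script can be collapsed into a little arithmetic. The following is only a sketch, not the code from the repository: the helper name `led_count_code` is hypothetical, and it assumes the average latency is in milliseconds. It returns the single character that tells the Arduino how many LED's to light.

```python
import math

def led_count_code(latency_ms):
    """Return the LED-count character for an average latency.

    Hypothetical helper: up to 10 ms -> "1", up to 20 ms -> "2", ...,
    up to 100 ms -> "a" (10 LED's).  Zero or anything above 100 ms
    returns "c", which lights all 12 LED's.
    """
    if latency_ms <= 0 or latency_ms > 100:
        return "c"
    bucket = int(math.ceil(latency_ms / 10.0))  # 1 through 10
    return "123456789a"[bucket - 1]
```

With a helper like this, the whole chain would reduce to something like `LEDStatus = clearLED + alertLevel[failLevel] + led_count_code(latency)`.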

Now, on to the Arduino code. I’m using the Arduino basically as a driver for the NeoPixel. Again, I could probably have just used the Pi by itself, but what fun would that be?

#include <Adafruit_NeoPixel.h>

//
// Internet Ping Meter v1.0 - Eric Steed
//
// 01/03/17 - first version - EPS
//
// Set up variables
byte leds = 0;
uint8_t delayVal = 30;

// Set the PIN number that the NeoPixel is connected to
#define PIN   7

// How bright are the LED's (0-255)
#define INTENSITY 60

// Set color to Green to start
uint8_t  r = 0;
uint8_t  g = INTENSITY;
uint8_t  b = 0;

// Set the number of pixels on the NeoPixel
#define NUMPIXELS   12

// When we setup the NeoPixel library, we tell it how many pixels, and which pin to use to send signals.
Adafruit_NeoPixel pixels = Adafruit_NeoPixel(NUMPIXELS, PIN, NEO_GRB + NEO_KHZ800);

// Initialize everything and prepare to start
void setup()
{
  uint8_t i;

  // Set up the serial port for communication
  Serial.begin(9600);
  Serial.println("Started Serial Monitor");

  // This initializes the NeoPixel library.
  pixels.begin();

  // This sets all the pixels to "off"
  for (i = 0; i < NUMPIXELS; i++) {
    pixels.setPixelColor(i, pixels.Color(0, 0, 0));
    pixels.show();
  }

  // Cycle each pixel through the primary colors to make sure they work, then turn them all off
  // Red
  for (i = 0; i < NUMPIXELS; i++) {
    pixels.setPixelColor(i, pixels.Color(INTENSITY, 0, 0));
    pixels.show();
    delay(delayVal);
  }

  // Green
  for (i = 0; i < NUMPIXELS; i++) {
    pixels.setPixelColor(i, pixels.Color(0, INTENSITY, 0));
    pixels.show();
    delay(delayVal);
  }

  // Blue
  for (i = 0; i < NUMPIXELS; i++) {
    pixels.setPixelColor(i, pixels.Color(0, 0, INTENSITY));
    pixels.show();
    delay(delayVal);
  }

  // White
  for (i = 0; i < NUMPIXELS; i++) {
    pixels.setPixelColor(i, pixels.Color(INTENSITY, INTENSITY, INTENSITY));
    pixels.show();
    delay(delayVal);
  }

  // Turn off all LED's
  for (i = 0; i < NUMPIXELS; i++) {
    pixels.setPixelColor(i, pixels.Color(0, 0, 0));
    pixels.show();
  }
}

// Main loop
//
// When sending LED signals, send the color code first, then the number of LED's to
// turn on.  For example 6 Green LED's would be h6, 11 Red LED's would be db, all
// 12 LED's to Black would be ic
void loop()
{
  uint8_t i;
  if (Serial.available())
  {
    char ch = Serial.read();
    // Serial.print("ch = ");
    // Serial.println(ch);
    int led = ch - '0';

    // Serial.print("led = ");
    // Serial.println(led);

    // Set Color of LED based on how many fails in a row
    //RED = 52(d)
    //ORANGE = 53(e)
    //YELLOW = 54(f)
    //YELLOW-GREEN = 55(g)
    //GREEN = 56(h)
    //BLACK = 57(i)

    switch (led) {
      // Set color to RED
      case 52: {
          r = INTENSITY;
          g = 0;
          b = 0;
        }
        break;

      // Set color to ORANGE
      case 53: {
          r = INTENSITY;
          g = (INTENSITY / 2);
          b = 0;
        }
        break;

      // Set color to YELLOW
      case 54: {
          r = INTENSITY;
          g = INTENSITY;
          b = 0;
        }
        break;

      // Set color to YELLOW-GREEN
      case 55: {
          r = (INTENSITY / 2);
          g = INTENSITY;
          b = 0;
        }
        break;

      // Set color to GREEN
      case 56: {
          r = 0;
          g = INTENSITY;
          b = 0;
        }
        break;

      // Set color to BLACK
      case 57: {
          r = 0;
          g = 0;
          b = 0;
        }
        break;

      // To save on code, if we receive a 0 through a 9, turn on that
      // number of LED's
      case 0 ... 9:
        for (i = 0; i < led; i++) {
          pixels.setPixelColor(i, pixels.Color(r, g, b));
          pixels.show();
        }
        break;

      // If we receive an "a", turn on 10 LED's
      case 49:
        for (i = 0; i < 10; i++) {
          pixels.setPixelColor(i, pixels.Color(r, g, b));
          pixels.show();
        }
        break;

      // If we receive a "b", turn on 11 LED's
      case 50:

        for (i = 0; i < 11; i++) {
          pixels.setPixelColor(i, pixels.Color(r, g, b));
          pixels.show();
        }
        break;

      // If we receive a "c", turn on 12 LED's
      case 51:

        for (i = 0; i < 12; i++) {
          pixels.setPixelColor(i, pixels.Color(r, g, b));
          pixels.show();
        }
        break;

      // For testing, insert a delay if we see a ,
      case -4:
        delay(delayVal * 10);
        break;

      default:
        // if nothing else matches, do the default
        // default is optional
        break;
    }
    // I had to add this bit of code to fix a problem where the Arduino buffer
    // apparently filled up after a very short time.  It would set the LED's on
    // but then pause for 2-3 seconds before it would receive the next command.
    // Note: since Arduino 1.0, Serial.flush() waits for outgoing serial data to
    // finish sending; it no longer discards unread incoming data.
    Serial.flush();
  }
}
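
To make the serial protocol easier to follow, here is a small Python simulation of what the sketch does with each character it receives. It is only an illustration under some assumptions: the names `COLORS`, `COUNTS` and `decode` are made up for this example, and the "," test delay is simply ignored.

```python
# Color letters and count characters, as listed in the sketch's comments.
COLORS = {"d": "red", "e": "orange", "f": "yellow",
          "g": "yellow-green", "h": "green", "i": "black"}
COUNTS = "0123456789abc"  # position in the string = number of LED's

def decode(command):
    """Simulate the Arduino loop: return a list of (color, led_count) updates."""
    color = "green"  # the sketch starts with green selected
    updates = []
    for ch in command:
        if ch in COLORS:
            # A color letter only changes the current color.
            color = COLORS[ch]
        elif ch in COUNTS:
            # A count character lights that many LED's in the current color.
            updates.append((color, COUNTS.index(ch)))
        # Any other character is ignored, like the default case.
    return updates
```

For example, `decode("ich3")` yields a black update on all 12 LED's followed by 3 green LED's, which is the "clear, then color, then count" pattern the comment above `loop()` describes.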

If it’s not already evident, I’m not very adept at either Python or Arduino coding. I’m just starting out. The most frustrating thing for me is stumbling across syntax issues with code. Nine times out of ten, I know it’s possible to do something but I just can’t get the syntax right or find the correct modules. All this comes with time, so maybe in a year this code will be half or a third of the size it is right now.

Once you have everything installed and tested (that is, you can turn the LED’s on and off), you have to connect the Pi to your network. I would treat this as a single-purpose device and not put anything else on it that could interfere with the script and its timing. Pis are cheap enough that you should be able to justify this.

You can find this code on GitHub at https://github.com/esteed/Internet-Ping-Meter.  Please feel free to make modifications and submit a pull request. I’m always looking for a better mousetrap!

I hope this has helped you even a little bit. I had a great time setting it up and I look forward to making enhancements. The first one will be to indicate current upstream and downstream throughput using white and blue LED’s basically overlaid on the top of the latency indicators. Wish me luck!!

Internet Ping Meter (part 1 of 2)

THE INTERNET IS DOWN!!

How many of you “home IT support technicians” have heard this before?  I hear it a lot, so I decided to create a device that would notify me visually when problems occur.  Sometimes it winds up being a flaky wifi router that either reboots or just needs to take a breath.  Other times, it’s our Comcast connection, in which case I can’t do anything other than call and report an outage.  The kids seem to have a hard time understanding that, even though I’ve explained it to them a hundred times.

A little background on the reason for this project.  At the company I work for, we employ a WAN load balancer which uses a series of pings to major internet presences such as Google, AT&T or OpenDNS servers.  Basically, the device pings each of those addresses once per second and, based on specific criteria, can determine whether one of the two internet connections is down and take appropriate action.

This is what made me decide to develop my version of the ping meter.  There are a number of projects like this for the Raspberry Pi that involve some sort of visual representation.  I wanted to put together a project that incorporated the Raspberry Pi, an Arduino board and the NeoPixel ring.  This was mainly a project for me to learn how to integrate multiple devices.  Honestly, I could probably have done this without the Arduino, but I wanted to challenge myself a little.

At this point, I have the device working the way I want.  My next challenge is to package the device into something more aesthetically pleasing.  WAF (Wife Acceptance Factor) is an important aspect to any geek project like this if it’s gonna be displayed somewhere that’s visible.  I’m thinking maybe a small picture frame or maybe some sort of glass object that looks nice.

Here is a list of the parts you’ll need:

  • Raspberry Pi (any model should work)
  • Arduino board (I used an UNO but even that is overkill)
  • NeoPixel LED ring (12 LED segments)
  • Micro SD card (at least 4 GB)
  • USB Type A to USB Type B cable (printer/scanner cable)
  • 5V Micro USB power source (iPad charging brick is perfect)

I haven’t tested using a Pi Zero yet but I don’t see why it wouldn’t work.  I also have an Arduino Trinket (5v version) that I’m trying to use for this; however, out of the box it doesn’t support serial communication.  For size reasons, this combination would be perfect for just about any implementation where room is an issue.  You could just as easily use a larger NeoPixel ring or even a strip with some very minor code modifications.

There are two programs that make this system work.  One is the “firmware” that you load onto the Arduino board itself.  The other is the Python script that runs on the Pi.  Basically, I use the Pi to ping 3 different IP addresses and use the NeoPixel ring to display the average ping latency in LED segments.  If I can’t reach any of the three, I start to progressively change the color of the LED’s from green to red.  Throughout this project I learned a lot about programming in Python, programming the Arduino, and interacting with external physical devices.  I first started by just getting the LED’s to turn on and off.  I borrowed a lot of code from examples and implemented the same routines to get the NeoPixel to do what I wanted.
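
The averaging and failure-counting idea described above can be sketched in a few lines of Python. This is a simplified illustration rather than the script itself: the function name `summarize` and the use of `None` for a ping that got no reply are assumptions made for the example.

```python
def summarize(samples, fail_level, max_fail_level=4):
    """Combine one round of pings into (average_latency, new_fail_level).

    samples is a list of round-trip times, with None for a failed ping.
    """
    good = [s for s in samples if s is not None]
    if not good:
        # No target answered: escalate, but cap the failure level.
        return 0, min(fail_level + 1, max_fail_level)
    # At least one target answered: average the replies and reset.
    return sum(good) / float(len(good)), 0
```

For instance, `summarize([10, 20, None], 2)` returns `(15.0, 0)`, since one reply is enough to reset the failure level, while three failed pings in a row move the level up one step per round until it reaches the cap.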

I tried to sprinkle comments throughout the code to explain what I’m doing and why.  Most of these were added after I made a breakthrough in something that was kicking my ass for a while, so I would know how to fix the problem the next time around.  I won’t focus a lot on how to install the OS on your Pi or how to upload code to the Arduino; there are a LOT of helpful resources on the internet that can walk you through it.  Also, in the spirit of this being a learning exercise for me, I think it’s valuable for someone starting out fresh to do the research and have a basic understanding of what’s going on rather than just copying and pasting code.  If you’re trying to put this together and run into problems, feel free to comment on the article and I’ll do my best to answer questions.

In the next article, I’ll show you the code and how it all works.  Stay tuned!