Luke

Comparing Cucimoc 8.0(1) to 7.1.x

Cucimoc 7.1.x was and is a decent product for the feature set it offers, but as of the beginning of June 2010, Cisco have released Cucimoc 8.0(1).
There are some significant differences between the 2 products.

  • Cucimoc 8 no longer uses the TabUrl area to display the applet/pane; instead it ‘bolts’ itself to the bottom of the screen. Excellent improvement.
  • TabUrl can now be set to a UNC file share or URL to a centrally held config (strictly speaking you could do this in 7, but cucimoc had to be part of it).
  • Conversation History now displays an alert for missed calls, with the number of them missed.
  • The options for device selection move from the OCS Tools menu to the options button on the cucimoc pane itself, which is much better and quicker to get to.
  • You can now connect to MeetingPlace or Cisco Unified MeetingPlace Server (CUMS) from within cucimoc (though MeetingPlace Express won’t work for me, the notes show Cisco Unified MeetingPlace Express VT 2.0 is supported).
  • Place and receive video calls, with greater video support not only from the front pane, but also with the ability to answer as video or voice only from the prompt.

  • You can also connect through to Voicemail and Visual Voicemail; this is essentially done using IMAP.
  • The park feature, which I had some trouble with in 7, works perfectly in 8.

Windows 7 support is there for 32 bit, but a Q2/2011 date is still being suggested on some Cisco documents for full 64 bit support. However, the release notes suggest 64 bit support is already there, with one exception: [On 64-bit editions of Windows 7, you cannot use video when you have Cisco UC Integration for Microsoft Office Communicator set to use your desk phone for phone calls.] (pg8 Table6)
That said, I have it working on 64 bit, on both version 7.1.x and 8.0, but drag and drop calling would not initially work on 7.1.x. This seems common, judging by similar questions on the TechNet message boards. We have got it working, however, by installing both the x86 and x64 C++ 2008 redistributable packages. I will continue to work on this, as it’s a little messy. In addition, more testing shows that on 64 bit versions it’s best to install using the .exe rather than the .msi, as the .exe has C++ and .Net as bundled stubs.

The release notes can be found here:
http://www.cisco.com/en/US/docs/voice_ip_comm/cucimoc/8_0/english/release/cucimocReleaseNote.html
The Installation Guide can be found here:
http://www.cisco.com/en/US/docs/voice_ip_comm/cucimoc/8_0/english/installguide/Installation_Guide_for_Cisco_UC_Integration_for_Microsoft_Office_Communicator_Release_80.pdf

Annoyances? Well, maybe just 1 or 2 :) . If you use extension mobility (EM) and log in to an alternate deskphone you get an alert message saying you have selected an unknown device. This happens in both version 7.1.x and 8.0. You get a handy little instruction to go to the Communicator menu, Tools ->Select device. However, in version 8 they have moved ‘select device’ to the options tab on the cuci pane; it’s just that the alert message still says the exact same thing… a little QA missing.
It can also be a little sluggish on low bandwidth/dsl links (phone call pop, login etc).
Finally, on a few XPSP3 installs I see this when I use alt-tab to flick between apps. Again, poor QA.

Lastly, the voicemail feature: it changes your voicemail icon to red when you have voicemail, which is nice, but it is slow to react and doesn’t extinguish until there is a state change (i.e. hard phone to softphone switch etc).

All that said, I like it, just want to tweak a few more bits.

Sun STK 6140 firmware upgrade

On Saturday we jumped from 06.xx.xx.xx software to the latest 07 software (crystal). We had done a lot of preparation for this event, including practical stuff like vmotioning hosts to alternate storage (even local) and checking backups (repeatedly!). There was also the less practical stuff like talking about it/worrying about it etc.. hot air.

Some pre-requisites:
* Multi-path software: rdac is only supported on 06 series firmware up to and including 06.60.xx.xx; conversely MPIO (Sun) is only supported on 06.60.xx.xx and above. You either need to use 06.60 as an intermediate firmware or plan to migrate your Windows hosts from rdac to mpio on the day. Practically, I opted for MPIO in advance (it worked, but isn’t supported)
* VMWare: Only 3.5Update5 and above is supported.
* Array: Configuration- save; Profile- save; Full Support capture- save. The array has to be ‘green’
* Backups: Make sure they work!

When you move to 07.xx.xx.xx a VMWARE host region is created. It’s recommended that you move your VMWare hosts from the LINUX host region to this new VMWARE one. However, before you do this you need to delete all the access volumes (LUN 31). If you do all out-of-band management you do not need them in any case (for any host region), but in-band or not, with the new firmware you don’t need these luns to run scripts etc for VMWare communication/management. Most importantly, when you use the VMWARE host region they cause VMWare problems: VMWare attempts to mount them etc.
 
Remember, the HBAs on the servers that have VMWare on them would have used a LINUX host region and hence the LINUX HBA recommended settings – you now use the VMWARE settings, which are mostly to leave everything as default (with Update5 and beyond).

Make the host region change BEFORE you bring your array back online. Making this change whilst VMWare hosts are attached will cause a VMWare failure and at worst could result in data loss.

Go into Santricity, locate the mappings view, expand your hosts group for VMWARE and right-click on each host. Select Change Host region, select VMWARE.

Now go ahead, connect the array.
I opted to fire up VMWare hosts, check them, and then Windows. We then put each VMWare host through a maintenance reboot for good measure.

Sun Storagetek and VMWare

Anyone who deals with Sun/Storagetek SAN hardware will know all about firmware upgrades and firmware versioning. It’s tricky to get the right level for you at the right time, and of course just like any firmware, there’s always a newer one to fix bugs you’ll “probably” never have.
At the same time, there are also updates which you need; you just don’t know that until you somehow trigger the bug they fix.
Now as I explained to my boss and colleagues, I could spend days reading Sun bug reports and still be very little wiser, but truth is, they just aren’t published, as a lot of what they hold is commercially sensitive/damaging.

We have 7 Storagetek (Sun->Oracle) arrays, 1 is retired, 2 are now off maintenance and used as scratch area (trade in just doesn’t get you much, it’s more useful working for us). The other 4 are very much live, 3 x 6140 and 1 6540.

Across our 3 6140 arrays we run pretty old firmware, 6.19.xx.xx. Why? Well, partly the old adage of if it ain’t broke.. and also we simply don’t require a lot of the newer feature sets.

Going back nearly 3 years, Sun introduced the ‘crystal’ firmware, the 7.YY.xx.xx range. It introduced many bug fixes, and also removed the 2TB lun limit which the 6 series firmware had. The upgrade wasn’t (isn’t) trivial, and as we had little reason to trip 2TB, we elected to stay put. This is a fully supported thing to do.

All was fine until we hit 4 conditions.
1. We use RVM (remote volume mirroring). 2 of our SANs replicate certain luns to each other. I’d always been dubious about this setup as various things had been done badly (mirror db on same spindles etc). It was an inherited config.
2. We have firmware 6.19.xx.xx
3. We use VMWare 3.5U4.
4. A disk failed in a RAID10 group, where the RVM and VMWare storage was held.

This triggered a lesser known fault where VMWare fails to receive the correct heartbeat SCSI bus response from the array via the vmfs driver, and it corrupts the MFT (master file table).

How?.. Well that’s partly a mystery as neither Sun nor VMWare will give us detail.. Why? It’s commercially sensitive and embarrassing to both the chip manufacturer and VMWare.

So, I’m faced with a fault where, though VMWare can see its volumes, browsing them shows no files. The file data is there, it’s just that the failed heartbeat commands have caused the MFT to be overwritten. The result is guest OS’s on VMWare just stop, or report they can’t read their disk, BSOD etc. Imagine it, a 7 host HA farm, which loses 48 of its 138 guest OS’s…. finance systems, email, SQL, you name it.. it died.
Now without backups and a neat tool that replicates the vmdk files off site (and on site) we would have been f*cked. Even so, it’s a mountain of work to start recovering that many systems, overnight, to be online. On top of this, you are restoring them to a setup which just caused the problem, but where else, this isn’t trivial storage size.
We did it, but that’s not my point here…

So, we discover more detail. Sun engaged VMWare and LSI to look into the issue. VMWare’s analysis was that the corruption occurs in the metadata and heartbeat records, as the vmfs driver has a pending heartbeat update but fails to find the heartbeat slot. This indicates corruption.. You can find indications of the heartbeat corruption in the logs with an error along the lines of “Waiting for timed-out heartbeat”. LSI have also looked into why this happens and identified that VMWare is not following the SCSI specification regarding handling Aborted commands at certain levels of code.

We gathered logs for VMWare and logs for Sun, and in essence the answer back from both parties (who in 1 way or another blame each other… or in reality LSI the chip manufacturer) is to upgrade to the crystal firmware.
The testing done by Sun/LSI/VMWare reproduced the bug, but showed it only happened every 4 hours (a cycle based thing) and only for 1-2 seconds. So if you blow out a disk in those 12 seconds across 24 hrs, you trigger this bug… LUCK.
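
To put a rough number on that luck, here is a back-of-envelope sketch. It assumes the window really is up to 2 seconds once every 4 hours, and that a disk failure is equally likely at any moment of the day. Both are my simplifications, not figures from Sun or LSI.

```python
# Rough odds of a disk failure landing inside the vulnerable window.
# Assumptions (mine): one window of up to 2 seconds every 4 hours, and
# a failure time that is uniformly random across the day.
CYCLE_HOURS = 4
WINDOW_SECONDS = 2
SECONDS_PER_DAY = 24 * 60 * 60

windows_per_day = 24 // CYCLE_HOURS
exposed_seconds = windows_per_day * WINDOW_SECONDS
probability = exposed_seconds / SECONDS_PER_DAY

print(f"{exposed_seconds} exposed seconds per day")
print(f"chance per disk failure: about 1 in {round(1 / probability)}")
```

On those assumptions any single disk failure has something like a 1 in 7200 chance of landing in the window.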

The next twist is that at code 6.19.xx.xx you run Sun’s own multipathing software, rdac, which makes sure you only ‘see’ your lun once, as opposed to 4 times. The count depends on the cabling/path redundancy you put in; we always see the lun 4 times without rdac (2 HBAs in the host, and 1 link to each controller in the array).
Code 6.60.xx.xx and above allows you to run both rdac and Microsoft’s MPIO (for windows hosts); however, version 7 code onwards only supports MPIO.
I’m not saying that rdac won’t work on the high code, or MPIO won’t work on the low code, but they aren’t SUPPORTED.. magic words in a support agreement.
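
The support rules as I understand them can be sketched as a small lookup. This is only my reading of the support statements (rdac up to and including 6.60, MPIO from 6.60 onwards), and the function names are made up for illustration.

```python
# Support matrix as described above: rdac up to and including 06.60,
# MPIO from 06.60 onwards. "Supported" means covered by the support
# agreement, not "physically works".
def parse_firmware(version: str) -> tuple[int, int]:
    """Turn '06.19.xx.xx' or '7.60.xx.xx' into (major, minor)."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def supported_multipath(version: str) -> set[str]:
    """Return the multipath drivers supported at a firmware level."""
    level = parse_firmware(version)
    drivers = set()
    if level <= (6, 60):
        drivers.add("rdac")
    if level >= (6, 60):
        drivers.add("mpio")
    return drivers


for fw in ("6.19.xx.xx", "6.60.xx.xx", "7.60.xx.xx"):
    print(fw, sorted(supported_multipath(fw)))
```

Only 6.60 supports both drivers, which is why it is the one level where windows hosts can be moved from rdac to MPIO and still be supported.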

To get from 6.19.xx.xx to 7.60.xx.xx we have to go to 6.60.xx.xx in between, else we have no means of converting our windows hosts to MPIO (and testing!! This is production kit remember, in a 24/7/365 operation). Yes, I have to arrange to take down 138 VM guests and about 16 windows hosts.. twice.

Another important detail is that this hasn’t been ‘fixed’ by VMWare, even in vSphere 4; they don’t regard it as their fault (despite it being a SCSI bus standards issue, or their lack of adherence to it). LSI had to write in a separate VMWare host region to take it away from ‘Linux’, so there is a specific region just for VMWare with what are clearly non-standard responses to SCSI based commands requested from the VM hosts…. I find that.. ODD.

The moral of the story? There isn’t one. There is no right/wrong way here in firmware terms, it’s horses for courses. It’s pure chance we hit a disk failure in that situation. We get a fair number of regular disk failures on this array, it’s the nature of the beast. This 1 time, we triggered the event.

OCS 2007 R2 XMPP to Google federation failing

I’ve been spending some time getting the Microsoft OCS2007R2 XMPP gateway working. In essence, it provides OCS users with Jabber/XMPP connectivity outside of the organisation. Nominally, we want to use it to connect to Google, so GTalk/gmail/googlemail users. We could of course use it to connect to others via stds based XMPP, jabber.org users etc.
I had hoped to use XMPP to connect to facebook chat, but although facebook have provided 1 bugfix to allow the use of xmpp clients to connect to it, there is no server to server (S2S) support. More detail here.

So, for now, it’s Google. I went through my config, and provisionally I went with a simple solution of an Edge server in the DMZ with an XMPP gateway (single NIC) also in the DMZ. Both boxes are non-domain integrated for security.

My biggest issue was in getting the MTLS connection between Edge(outside NIC) and XMPP. I just couldn’t get it to create the connection, Edge would ignore the cert provided by XMPP. I solved this eventually by installing the respective certificates on each server as trusted roots and presto, it worked.

I was using my own @googlemail.com account for testing and it just wouldn’t work. I went over and over my config to no avail. So I went back to the web. Lo and behold, a patch for XMPP: KB979311.
In particular it resolves: XMPP federation to gmail.com works. However, XMPP federation to googlemail.com does not work.

So, I install the patch, and 1 reboot later, it works! Phone the boss, sit back and grin.

End? No, of course not, about 2 hours later it stops. Random, it just stops. I checked the config, despite knowing I had changed nothing. I find nothing untoward, of course.
So I install wireshark and start to watch traffic. Eventually I fixate on DNS (port 53). This is partly based on experience and partly because it’s the only variable beyond my control as such.
The XMPP session (port 5269) to Google is done via dialback over TCP. In essence, your XMPP server does a service location record (SRV) lookup based on the destination email address suffix (googlemail.com or gmail.com etc), so it does a DNS query for _xmpp-server._tcp.googlemail.com.
This then returns a host name (or in Google’s case a cluster of host names). Your server then does an A record lookup for the name supplied and attempts to connect to the resulting IP address.
At the same time as you’re doing this, the Google server does a reverse check based on your source email suffix: IT does an SRV lookup for _xmpp-server._tcp.lukedarby.co.uk, then an A record lookup on the resulting name, and compares this to the source IP address. If they match, you have a connection.. a dialback.
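
As a toy model of that check, the sketch below uses a plain dict in place of real DNS. The host name xmpp.lukedarby.co.uk and the 192.0.2.x addresses are made up for illustration, and this only mirrors my simplified description above: the real dialback protocol also exchanges a verification key between the servers.

```python
# A toy model of the dialback check described above. The DNS data is a
# plain dict standing in for real lookups; the host name and addresses
# are hypothetical (192.0.2.x is a documentation-only range).
SRV_RECORDS = {
    "_xmpp-server._tcp.lukedarby.co.uk": [(10, "xmpp.lukedarby.co.uk")],
}
A_RECORDS = {
    "xmpp.lukedarby.co.uk": ["192.0.2.10"],
}


def allowed_addresses(domain: str) -> set[str]:
    """SRV lookup for the domain, then an A lookup for each target."""
    srv_name = f"_xmpp-server._tcp.{domain}"
    addresses = set()
    for _priority, target in sorted(SRV_RECORDS.get(srv_name, [])):
        addresses.update(A_RECORDS.get(target, []))
    return addresses


def dialback_ok(claimed_domain: str, source_ip: str) -> bool:
    """Accept only if the source IP matches what DNS says for the domain."""
    return source_ip in allowed_addresses(claimed_domain)


print(dialback_ok("lukedarby.co.uk", "192.0.2.10"))
print(dialback_ok("lukedarby.co.uk", "192.0.2.99"))
```

The point is simply that both sides depend on DNS answers they don’t control, which is why DNS was the first place I looked.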

I watched these lookups, they seem fine, an SRV lookup for google gets:

_xmpp-server._tcp.google.com    SRV service location:
          priority       = 5
          weight         = 0
          port           = 5269
          svr hostname   = xmpp-server.l.google.com
_xmpp-server._tcp.google.com    SRV service location:
          priority       = 20
          weight         = 0
          port           = 5269
          svr hostname   = xmpp-server1.l.google.com
_xmpp-server._tcp.google.com    SRV service location:
          priority       = 20
          weight         = 0
          port           = 5269
          svr hostname   = xmpp-server2.l.google.com
_xmpp-server._tcp.google.com    SRV service location:
          priority       = 20
          weight         = 0
          port           = 5269
          svr hostname   = xmpp-server3.l.google.com
_xmpp-server._tcp.google.com    SRV service location:
          priority       = 20
          weight         = 0
          port           = 5269
          svr hostname   = xmpp-server4.l.google.com

google.com      nameserver = ns1.google.com
google.com      nameserver = ns2.google.com
google.com      nameserver = ns3.google.com
google.com      nameserver = ns4.google.com
xmpp-server.l.google.com        internet address = 74.125.47.125
xmpp-server1.l.google.com       internet address = 74.125.155.125
xmpp-server2.l.google.com       internet address = 74.125.47.125
xmpp-server3.l.google.com       internet address = 74.125.45.125
xmpp-server4.l.google.com       internet address = 74.125.45.125

As you can see there are 5 entries returned, which only ever seem to resolve to 3 different IP addresses:
74.125.155.125
74.125.47.125
74.125.45.125
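
A quick way to see the overlap is to group the names by address. This is a small sketch using only the A records copied from the lookup output above.

```python
from collections import defaultdict

# A records copied from the lookup output above.
A_RECORDS = {
    "xmpp-server.l.google.com": "74.125.47.125",
    "xmpp-server1.l.google.com": "74.125.155.125",
    "xmpp-server2.l.google.com": "74.125.47.125",
    "xmpp-server3.l.google.com": "74.125.45.125",
    "xmpp-server4.l.google.com": "74.125.45.125",
}

# Invert the mapping: one list of host names per IP address.
names_by_ip = defaultdict(list)
for name, ip in A_RECORDS.items():
    names_by_ip[ip].append(name)

for ip, names in sorted(names_by_ip.items()):
    print(ip, "<-", ", ".join(names))
```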

Although all of this looks ok, I suspected these are load balanced addresses fronting a farm of real servers, and that 1 or more of these servers are legacy gmail configured boxes that just don’t account for googlemail.com addressing. After all, googlemail.com was an afterthought for them when they had rights issues introducing gmail to the UK.
Hey!, why can’t Google be fallible!

I decided to try and prove this, and a quick and dirty way was to use a hosts file, so that when my xmpp server did the A record lookups it got the result I staged in the hosts file.

I chose 1 of the addresses above, and dropped it in (155) for all 5 names.
Sure enough, googlemail.com federation works, simply sprung into life.
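
For anyone wanting to repeat the test, the hosts entries can be generated rather than typed. A minimal sketch: the names and address are the ones from my lookups above, and the helper function is just something I made up for the purpose.

```python
# Build hosts-file lines that pin every Google XMPP name to one address.
# The address is the one I tested with; the names come from the SRV lookup.
PINNED_IP = "74.125.155.125"
XMPP_HOSTS = [
    "xmpp-server.l.google.com",
    "xmpp-server1.l.google.com",
    "xmpp-server2.l.google.com",
    "xmpp-server3.l.google.com",
    "xmpp-server4.l.google.com",
]


def hosts_lines(ip: str, names: list[str]) -> list[str]:
    """Return one 'address<TAB>name' line per host name."""
    return [f"{ip}\t{name}" for name in names]


print("\n".join(hosts_lines(PINNED_IP, XMPP_HOSTS)))
```

On Windows the output goes into %SystemRoot%\System32\drivers\etc\hosts. Swapping PINNED_IP for each of the other two addresses in turn is the obvious way to find which one misbehaves.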

Next thing is to try and work out which of the above don’t work, I suspect it’s 45, but I don’t know… yet.

So there you go, if like me you’re struggling, that’s why, Google infrastructure folk are as lazy as the rest of us :D

Note: Another, simpler solution is to convert your @googlemail.com account to @gmail.com, as Google is moving away from googlemail over to gmail now. You can do so here, you’ll need to sign into your account.

CUCIMOC ldap tribulations

I have spent the best part of 3 weeks wrestling with CUCIMOC. It’s fair to say I haven’t been the biggest supporter of this particular piece of software during this time. I respect the feature set, but I can dial a colleague with almost as few clicks on the handset as I can through cucimoc, and the same goes for creating conference calls etc.
One document I would say is prescribed reading is this article. It holds loads of information, but imho is not very clear about valuable points.

Out of the box, getting the integration with CM7 was quite simple, we put the necessary framework devices in place, logged into CUCIMOC with telephonenumber and ‘pin’. All good, or so we thought….
Then we went through the process of integrating the CUCM7 servers with AD, opting to use telephonenumber as the primary login mechanism for handsets (who wants to tap out first.last on their 7960 when they use extension mobility!!).
Straight after that, the CTI control of the hard phone (7940/7960) instantly broke. The softphone option would work on occasion, but we simply couldn’t get the hard phone to work again.
After a long week of trying various things in our test lab, it all came down to the selection of login choice, pin number and password. We are currently CUCM4.x users and in that environment we use pin and password interchangeably, but in CM7 with ldap/AD integration they become 2 separate items: your pin logs you into a hard phone device, and your password is integral to anything you sign into under software emulation of phone devices.
Armed with this in our heads, we went back through our CUCM7 (with AD integration) config, placed all the framework services into the system, then logged into CUCIMOC with telephone number and password. Hooray, RCC/CTI works!

So, that working reliably and predictably, we moved onto the final section of getting ldap to work from the client for CSF data. The CSF data comes into play when someone rings you who isn’t in your Outlook contacts, isn’t a MOC user, but is held in your directory. CSF facilitates that you get a name to reflect against an incoming phone number. This is done via your client talking ldap to AD to retrieve a name for the phone extension. I have done several attempts at getting this to work, but each time I ended up with a disconnected session in the ‘server status’ section of CUCIMOC.
I used wireshark to sniff this conversation, and saw I was getting: W80090308: LdapErr: DSID-0C090334, comment: AcceptSecurityContext error, data 52e, vece.
Several Googles later I was left confused: was this indeed a context auth error, a password error or an invalid Kerberos token? I went over the wireshark packet trace again and noted that although my username etc was parsing correctly, the password had ‘123456’ in clear text. This is the pin I was using in the test lab! So here it was passing correct AD creds with the pin number. I changed the login field to use telephone number etc and got variants of this pairing of pin/password/extension no/sAMAccountName. Never the combo I needed!
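
For what it’s worth, the ‘data’ value in that message is a Win32 error number written in hex, and 52e is the bad username or password one. Here is a small sketch of decoding it; the table only lists the sub-codes I know of, and the function name is my own.

```python
import re

# Common AD sub-codes seen in the "data" field of AcceptSecurityContext
# errors. They are Win32 error numbers written in hex. Not a full list.
AD_BIND_ERRORS = {
    0x525: "user not found",
    0x52E: "invalid credentials (bad username or password)",
    0x530: "not permitted to log on at this time",
    0x531: "not permitted to log on at this workstation",
    0x532: "password expired",
    0x533: "account disabled",
    0x701: "account expired",
    0x773: "user must reset password",
    0x775: "account locked out",
}


def decode_bind_error(message: str) -> str:
    """Pull the hex 'data' code out of an LdapErr string and describe it."""
    match = re.search(r"data\s+([0-9a-fA-F]+)", message)
    if match is None:
        return "no data code found"
    code = int(match.group(1), 16)
    meaning = AD_BIND_ERRORS.get(code, "unknown sub-code")
    return f"data {code:x} (Win32 error {code}): {meaning}"


error = ("W80090308: LdapErr: DSID-0C090334, comment: "
         "AcceptSecurityContext error, data 52e, vece")
print(decode_bind_error(error))
```

Which fits what the packet trace showed: right username, wrong secret.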
I kept putting ldap://<ldapservername> into a browser and would get an error like this:

I then went over the sample data offered with the CUCIMOC client (cucimoc-Admin-ffr.7-1-3.zip) and in particular the file ..\Config\SampleCUCIMOC-CUCSFAdminData.bat.
This clearly defines what entries you will need for stand alone or ADM configured machines via policy. Not only this, but it provides a means (via the batch file) for deploying these settings in basic login scripts etc.
I studied these values, comparing them over and over again with my own, held in my HKCU registry. I could see nothing that helped me, but I keenly tried any variant I could think of. One key kept jumping out at me though, as something I would need to give careful consideration, namely: POLICY_CREDENTIALS_IsLdapSynchronizedWithCucm. Now I’d always assumed that as we had integrated CUCM into AD, I would have to have that set to true, and so I did. Again, rebooting between each change to be sure they were taking effect, I was unsuccessful.
So I went back to wireshark/thinking/reading and discussing. A chance conversation with our CUCM Admin got me closer to the pin vs password conundrum I highlighted above. They are 2 different things in CUCM7 integrated to AD. I was used to CUCM4.
I went back to CUCIMOC and logged it in correctly, with my phone extension as the username and my AD password as the password. Hoorah, MOC logs in, phone control works for CTI and softphone, but… ldap is still disconnected.
I started to read the documentation again, thinking about pins/passwords/samaccountname/userprincipalname/telephone number etc. I then re-read this article, which made me start to think about the POLICY_CREDENTIALS_IsLdapSynchronizedWithCucm string value. What if I changed that? That would allow me to specify ldap creds, surely.
Changing this to ‘false’ then provides exactly the change highlighted in the document, specify samaccountname and password and bingo.. ldap working at last!
Something I was struggling to find during this little process, was a WORKING example of the registry settings, so hopefully to save you some pain, here are mine.

Why isn’t this documented more clearly? If you make the seemingly inane choice to use telephonenumber as your login mechanism whilst integrating CUCM with AD, you lose the ability to get ldap to auth properly unless you specify a username and password separately for client ldap, and you HAVE to set POLICY_CREDENTIALS_IsLdapSynchronizedWithCucm=”false”.

Hurrah it works! I’ll not get those tedious hours of my life back though….

Luke Darby
Technology Infrastructure | Media | Communication | Broadcasting
United Kingdom
