Sunday, October 29, 2017

Upgrading Virtual Container Host (VCH) to a newer version

VMware recently released VIC 1.2.1 with some bug fixes. The process of upgrading vSphere Integrated Containers from 1.1.x to 1.2.x, or from 1.2.x to 1.2.y, is pretty straightforward. You can refer to the documentation and follow the pre- and post-upgrade tasks.

This post will demonstrate how to upgrade an existing VCH to a newer version using the vic-machine upgrade command-line utility.

Assumptions -

1) You have upgraded VIC successfully.
2) You have downloaded the latest VIC Engine bundle, which provides the vic-machine utility.

Let's start by listing all running VCHs -

             ./vic-machine-linux ls
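In a real environment vic-machine also needs to know which vCenter Server to talk to, so a fuller invocation looks something like this (the target, user, and thumbprint values below are placeholders for your own environment):

             ./vic-machine-linux ls --target vcenter.example.com \
                  --user 'administrator@vsphere.local' \
                  --thumbprint <certificate-thumbprint>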

As you can see, Project_A and Project_B are on version 1.2.0, while swarm_test and vch_test are on v1.2.1.

Let's upgrade Project_A to v1.2.1.

       ./vic-machine-linux upgrade --name Project_A --compute-resource XXX
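As with ls, the vCenter target and credentials are needed in practice; this sketch reuses the same placeholder values as above. vic-machine upgrade also provides a --rollback option to revert the VCH to the pre-upgrade snapshot if something goes wrong:

       ./vic-machine-linux upgrade --name Project_A --compute-resource XXX \
            --target vcenter.example.com \
            --user 'administrator@vsphere.local' \
            --thumbprint <certificate-thumbprint>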

Here is what is happening in the background - 
  • Validate that the configuration of the existing VCH is compatible with the new version.
  • Upload the new appliance.iso and bootstrap.iso files to the VCH.
  • Create a snapshot.
  • Power off the VCH.
  • Boot from the new appliance.iso.
  • Delete the snapshot.
Here is what you see in the vSphere client - 



Note: I upgraded several VCHs and the upgrades went through successfully, but the snapshot was never deleted. This seems to be a bug that may be fixed in a later version. Manually deleting the snapshot had no adverse effect on the VCH.

Also note that if you are mapping container ports to the VCH, you will have an outage during the upgrade. If the containers are running on a container network, there will be no outage.

After you upgrade the VCH, any new containers will boot from the new bootstrap.iso.

For further troubleshooting please refer to the documentation.

Saturday, October 21, 2017

vSphere Integrated Containers (Docker Swarm) - Part 2


In my last post we created a Docker Swarm using DCH. Now let's look at a few cool things you can do with it.

The setup remains the same as in the last post.


Scaling up and down


In our original example we had 6 replicas running.

Run the following to check the running processes -


             docker -H 10.156.134.141 service ps web


Let's bump it up to 10. To do this we use the docker service scale command -

            docker -H 10.156.134.141 service scale web=10

And then check that the instances have been scaled to 10.
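The same service ps command from before shows the new tasks:

            docker -H 10.156.134.141 service ps web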




To scale down, just run the same command with the desired number of instances -

            docker -H 10.156.134.141 service scale web=6

And check that the instances have been scaled back down to 6.




Draining the Node


Before draining a node, let's take note of how many instances are running and which nodes they are running on. In my setup there are:


  • 1 on manager1
  • 1 on worker1
  • 2 on worker2
  • 2 on worker3
Run the following command to drain worker3 -

            docker -H 10.156.134.141 node update --availability drain worker3


As you can see, web.8 and web.9 are shut down on worker3 and rescheduled on manager1 and worker1 respectively.
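When the maintenance is done, you can put the node back into the scheduling pool by setting its availability back to active. Note that Swarm does not rebalance existing tasks automatically; the node will only receive new tasks:

            docker -H 10.156.134.141 node update --availability active worker3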

Applying Rolling Updates 

The simplest of all commands. If the nginx image you deployed needs to be updated, just run - 

docker -H 10.156.134.141 service update --image <imagename>:<version> web
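For example, to move the web service to a newer nginx tag (the tag here is purely illustrative), you can also control how many tasks are updated at a time and the delay between batches:

            docker -H 10.156.134.141 service update \
                --update-parallelism 2 \
                --update-delay 10s \
                --image nginx:1.13 web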

Removing a service

Be careful using this command. It does not ask for confirmation.

docker -H 10.156.134.141 service rm web
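A quick service ls afterwards confirms the service is gone:

            docker -H 10.156.134.141 service ls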


vSphere Integrated Containers (Docker Swarm)

vSphere Integrated Containers v1.2 includes the ability to provision native Docker container hosts (DCH). The DCH is distributed by VMware through Docker Hub.

VIC v1.2 does not support docker build or docker push, so developers can use the dch-photon image to perform these operations, as well as to deploy a swarm (which is not natively supported in VIC).

dch-photon is also pre-loaded in the default-project in the vSphere Integrated Containers Registry. If you are not familiar with dch-photon, please read the VIC Admin Guide for reference.

Let's jump right in and see how a developer can use DCH to create a Docker Swarm and deploy an application.

You could write a simple shell script that deploys the Docker swarm manager node and then creates and joins the worker nodes to the swarm. In this example I will deploy the manager and worker nodes manually.
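As a flavor of what such a script might look like, here is a minimal sketch that provisions the three workers; it assumes the same VCH endpoint, network names, and image tag used in the manual steps below:

        #!/bin/sh
        # Sketch: provision and start three worker DCH instances against the VCH.
        VCH=10.156.134.35

        for i in 1 2 3; do
            # One volume per worker for its image cache
            docker -H $VCH volume create --opt Capacity=10GB --name worker$i
            # Create the worker DCH on the container network
            docker -H $VCH create -v worker$i:/var/lib/docker \
                --net VIC-Container \
                --name worker$i \
                --hostname=worker$i \
                vmware/dch-photon:17.06
            # Attach the bridge network so the workers can talk to the manager
            docker -H $VCH network connect bridge worker$i
            docker -H $VCH start worker$i
        done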

Create a Virtual Container Host (VCH) - this will act as the endpoint for deploying the manager and worker DCH instances. I have deployed a VCH named swarm_test with a container network IP range, so the containers will pick an IP from the range provided. The VCH IP is 10.156.134.35.

Create a Docker volume for the manager's image cache -

                docker -H 10.156.134.35 volume create --opt Capacity=10GB --name registrycache

Create a volume for each worker's image cache -

                docker -H 10.156.134.35 volume create --opt Capacity=10GB --name worker1
                docker -H 10.156.134.35 volume create --opt Capacity=10GB --name worker2
                docker -H 10.156.134.35 volume create --opt Capacity=10GB --name worker3

Let's create the manager instance -

               docker -H 10.156.134.35 create -v registrycache:/var/lib/docker \
                   --net VIC-Container \
                   --name manager1 \
                   --hostname=manager1 \
                   vmware/dch-photon:17.06
Create the worker instances worker1, worker2 and worker3. The command for worker1 is shown below; repeat it for worker2 and worker3, substituting the volume, name, and hostname -

               docker -H 10.156.134.35 create -v worker1:/var/lib/docker \
                   --net VIC-Container \
                   --name worker1 \
                   --hostname=worker1 \
                   vmware/dch-photon:17.06

Here is what the deployed setup looks like -









Connect the manager and worker nodes to the appropriate bridge network.

            docker -H 10.156.134.35 network connect bridge manager1
            docker -H 10.156.134.35 network connect bridge worker1
            docker -H 10.156.134.35 network connect bridge worker2
            docker -H 10.156.134.35 network connect bridge worker3

This is where I spent most of my time trying to figure out why my worker nodes were not talking to the manager. I highly recommend reading the network use cases in the documentation. For my network setup I had to combine bridge networks with a container network.

Now, start the master and all worker nodes.

                docker -H 10.156.134.35 start manager1
                docker -H 10.156.134.35 start worker1 worker2 worker3

Create a swarm on the manager (note my manager node IP in the screenshot above) -
                docker -H 10.156.134.141 swarm init --advertise-addr 10.156.134.141

I am advertising the manager IP so the nodes can communicate on it. This is your eth0 in the Docker networking world. The output of the above command will give you a token.


This token is required for the worker nodes to join the swarm. Don't panic if you missed copying the token string. You can get it back by running the following command -

               docker -H 10.156.134.141 swarm join-token worker

Make sure you get the worker token; if you replace worker with manager you will get the manager token instead, which is useful if you want to have more than one manager in your swarm.
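If you are scripting the join, the -q flag prints just the token string, which is convenient to capture in a variable (a small sketch, reusing the manager endpoint from above):

               # Grab only the worker join token from the manager
               TOKEN=$(docker -H 10.156.134.141 swarm join-token -q worker)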

Add each worker to the swarm

      docker -H 10.156.134.142 swarm join --token 4stkz3wrziufhq8qjwkszxpzet6o3tlut1lf9o9ijqkhsvb5va-dtopt9a6bp9r03q52q2ea6mo4 10.156.134.141:2377

Once all worker nodes are added, run the following to list the nodes - 

                      docker -H 10.156.134.141 node ls


Now that the nodes are up, let's create a simple nginx web service.

Create a service 
                     docker -H 10.156.134.141 service create --replicas 6 -p 80:80 --name web nginx

Check the status of the service - 
                 docker -H 10.156.134.141 service ls
                 docker -H 10.156.134.141 service ps web

You can see the replicas are preparing and not ready yet. 













What's happening in the background is that the manager node is distributing the nginx image to the worker nodes, and the orchestration layer is scheduling containers on the manager and worker nodes. If you run docker service ps web again, you will see the service is now running, which means the nginx daemon has been launched and is ready to serve requests.










Go to your favorite browser and hit the IP of any worker or manager node. Even if a container is not scheduled on that particular node, you should still be able to get to the Welcome to nginx! page, because the swarm's ingress routing mesh forwards the published port from every node to a running task. That's the whole idea of a swarm.
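A quick way to confirm this from a shell is to hit each node in turn; the first IP below is the manager's, the second is one of the workers from the join step, and the rest of your node IPs will differ:

            for ip in 10.156.134.141 10.156.134.142; do
                curl -sI http://$ip | head -1
            done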

In my next blog I will talk about scaling up and down, inspecting nodes, draining nodes, removing a service, and applying rolling updates to a service.

Saturday, January 23, 2016

Batch rename files (Useful for renaming track numbers from a mp3 file)



I have around 600 mp3 song files in a folder, and all the tracks have a number before their names. How do you remove the track number from all the files?

The easiest way to do this is with rename on the command line. For example, to remove the prefix abcd from abcd1.txt, abcd2.txt, abcd3.txt, etc. and end up with 1.txt, 2.txt, 3.txt, simply use

rename "abcd*.txt" "////*.txt"

You need the same number of / characters as the number of initial characters you would like to remove, and be sure to put both arguments in double quotes.
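On Linux or macOS, where rename behaves differently (or is the Perl variant), a plain shell loop achieves the same result; this sketch uses the same abcd prefix as the example above:

    # Strip the leading "abcd" from every matching file name
    for f in abcd*.txt; do mv "$f" "${f#abcd}"; done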

Friday, December 20, 2013

How to fix Server Manager Errors after installing updates (HRESULT:0x800F0818 / HRESULT:0x800B0100)


Symptoms
You install several updates. After they install successfully, you notice that you cannot add or remove roles/features in Server Manager.

Resolution
First, run the Microsoft System Update Readiness Tool from here: http://support.microsoft.com/kb/947821
After the scan has completed, check C:\Windows\Logs\CBS\CheckSUR.log. You should see errors like the following:

Checking Package Manifests and Catalogs
(f) CBS MUM Corrupt 0x00000000 servicing\Packages\Package_for_KB978601~31bf3856ad364e35~amd64~~6.0.1.0.mum  Expected file name Package_for_KB978601_server~31bf3856ad364e35~amd64~~6.0.1.0.mum does not match the actual file name
(f) CBS MUM Corrupt 0x00000000 servicing\Packages\Package_for_KB979309~31bf3856ad364e35~amd64~~6.0.1.0.mum  Expected file name Package_for_KB979309_server~31bf3856ad364e35~amd64~~6.0.1.0.mum does not match the actual file name

Or 

(f) CBS MUM Corrupt 0x800B0100 servicing\Packages\Package_for_KB978601~31bf3856ad364e35~amd64~~6.0.1.0.mum servicing\Packages\Package_for_KB978601~31bf3856ad364e35~amd64~~6.0.1.0.cat Package manifest cannot be validated by the corresponding catalog
(f) CBS MUM Corrupt 0x800B0100 servicing\Packages\Package_for_KB979309~31bf3856ad364e35~amd64~~6.0.1.0.mum servicing\Packages\Package_for_KB979309~31bf3856ad364e35~amd64~~6.0.1.0.cat Package manifest cannot be validated by the corresponding catalog

Or
(f) CBS MUM Missing 0x00000002 servicing\packages\Package_114_for_KB955839~31bf3856ad364e35~amd64~~6.0.1.0.mum
(f) CBS MUM Missing 0x00000002 servicing\packages\Package_83_for_KB955839~31bf3856ad364e35~amd64~~6.0.1.0.mum


Further down you will see:

Unavailable repair files:
servicing\packages\Package_for_KB978601~31bf3856ad364e35~amd64~~6.0.1.0.mum
servicing\packages\Package_for_KB979309~31bf3856ad364e35~amd64~~6.0.1.0.mum
servicing\packages\Package_for_KB978601~31bf3856ad364e35~amd64~~6.0.1.0.cat
servicing\packages\Package_for_KB979309~31bf3856ad364e35~amd64~~6.0.1.0.cat
These files need to be copied into %systemroot%\servicing\Packages (typically C:\Windows\servicing\Packages).

1. You first need to gain control over that folder. In order to do this use the following commands:

This makes the currently logged-on user (who needs administrative privileges) the owner of that folder:
takeown /F c:\Windows\Servicing\Packages /D y /R

Then assign full control using:
cacls c:\Windows\Servicing\Packages /E /T /C /G "UserName":F

This will grant you full control over the directory.
Optionally, you can download this ZIP. Inside are two REG files; if you install TakeOwnership.reg, you get a handy Take Ownership entry in the right-click menu whenever you use it on a folder.






2. Now you need to gather the missing or corrupted files from the checksur log:

- Download the KB update packages for the missing files, e.g.:
servicing\packages\Package_for_KB978601~31bf3856ad364e35~amd64~~6.0.1.0.mum

- Unpack them using the following command:
Expand -F:* UpdateKBXXXX.msu x:\DestinationDirectory

After expanding, you will see an UpdateKBXXXX.cab file. Expand it as well:
Expand -F:* UpdateKBXXXX.CAB x:\DestinationDirectoryCAB

Inside this cab you will need to grab two files: update.mum and update.cat

3. Rename the gathered update.mum and update.cat files exactly as they are specified in checksur.log:
Ex.: update.mum for KB978601 will be:
Package_for_KB978601~31bf3856ad364e35~amd64~~6.0.1.0.mum
Do the same for all the other missing/corrupt files and place them into the directory specified in checksur.log (\servicing\Packages).
After these steps the problem should be fixed. No reboot required.
If the Server Manager is not working even after doing these steps, run the Update Readiness Tool again and double-check the steps described above.


Booting Into the System Recovery Options Screen

First you will need to boot your computer into the System Recovery Options screen. This is usually done with the installation DVD, which should be inserted into the optical drive. When the computer boots, press any key to boot from CD or DVD when prompted, select your language preference and then click Repair your computer. A list of installed operating systems should be displayed – select Windows 7 and click Next.

The System Recovery Options screen will appear. Select the first option, Use recovery tools that can help fix problems with Windows, and then select Startup Repair.
(If your computer has a pre-installed recovery partition, the process is a little different. In this case, boot to the Advanced Boot Options screen, select Repair your computer and tap Enter. Next, select the keyboard language type, then your username and password, before selecting Startup Repair in the System Recovery Options screen.)
With Startup Repair selected, Windows will attempt to automate the repair; this might work – otherwise, further action will be required.

Preparing Windows 7 Recovery

If the Startup Repair option fails, you will receive a message reading Windows cannot repair this computer automatically. At the bottom of the message, click View advanced options for system recovery and support to return to System Recovery Options, and instead click Command Prompt.
The black command line interface will open with X:\ selected by default; this is the Windows internal RAM disk that is used by System Repair. You will need to navigate to your Windows system drive, which will by default be on the C: drive.
To open this, type C: and press Enter. Type DIR and press Enter to check that you are in the right drive – the contents listed should include the Program Files, Users and Windows folders.

You will then need to change directory. Enter CD \windows\system32\config and then DIR to check that the correct files and folders are listed:
  • RegBack
  • DEFAULT
  • SAM
  • SECURITY
  • SOFTWARE
  • SYSTEM
With access to the correct directory and the required folders present, enter MD mybackup to create a backup folder. Enter copy *.* mybackup to copy everything to this location, agreeing to the overwrite warnings when they appear.

The RegBack folder stores automatic Windows registry backups. To check if these can be used in restoring your system, enter CD RegBack and then DIR to view the contents. In the folder, you should have the following:
  • DEFAULT, SAM and SECURITY files, each around 262,000 bytes
  • SOFTWARE file, around 26,000,000 bytes
  • SYSTEM file, around 9,900,000 bytes
Note that these figures are approximate, but recognise that if any of these files display a size of zero bytes then you will have to resort to another method of restoring Windows 7.

Running the Windows 7 Recovery

With your RegBack folder containing the data you need to restore Windows 7 and rescue it from the reboot loop, you will be able to copy the contents and use them to get the operating system back up and running again.

Begin by entering copy *.* .. – note the two trailing dots. These indicate that the contents should be pasted to the level above – the Config folder. Agree to all prompts concerning whether you want to overwrite files, and once the process has completed enter exit to close the command prompt.

On the System Recovery Options screen, click Restart to reboot your PC – if everything has gone as it should, Windows 7 should now start correctly!