Wednesday 3 August 2016

Creating HDF5 files for caffe

In this post, I will describe how to create HDF5 files for caffe from a set of images. For a full list of the formats supported by caffe, visit this url. Keep in mind that this is a very general way of creating the HDF5 files; you should be able to easily adapt the code to your needs (e.g. adding another dataset to the file, or changing the names of the datasets).

The 3 major ways to input files to caffe are:

  1. Using an HDF5 file. 
  2. Using an LMDB file. 
  3. Creating a list of the paths to the images in a txt file. 
I strongly recommend using either HDF5 or LMDB files to train your caffe network, since training is much faster than with a list of paths. The Python notebook with the code to create the HDF5 files is here.

For this section, the code contained in the Python notebook is very self-explanatory, and most of the methods have fewer than 10 lines of code, so reading them should be straightforward. I'll just explain the high-level idea of what the code is doing:
  1. Start with a directory of images.
  2. Read all the images.
  3. Scale the images from 0 to 1.
  4. Create sub-images of all the images.
  5. Create a numpy array with all the sub-images.
  6. Give the array the shape required by caffe:
    • Number of items.
    • Number of channels.
    • Height.
    • Width.
  7. Create an H5 file with this array.
This logic is coded in the method getimagesforcaffe. Step 4 is required in most cases: we usually don't work with complete images when training a network; instead, we create small sub-images and input them to the network.
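As a rough sketch of steps 5 to 7 (the full version is in the notebook's getimagesforcaffe), assuming the sub-images are already scaled 3-channel numpy arrays; the dataset name 'data' and the function name are assumptions here, so change them to match what your network's HDF5 data layer expects:

```python
import numpy as np
import h5py

def write_caffe_h5(subimages, outputpath):
    # Stack the sub-images into one array: [number of items, height, width, channels]
    data = np.stack(subimages).astype(np.float32)
    # Reorder to the shape caffe expects: [number of items, channels, height, width]
    data = data.transpose(0, 3, 1, 2)
    # Write the array to an HDF5 file under the dataset name 'data'
    with h5py.File(outputpath, 'w') as f:
        f.create_dataset('data', data=data)
    return data.shape
```

If your network also needs labels, you would add a second create_dataset call with the corresponding array.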

Most of the time we also want to create the sub-images using a stride to get more samples. The method I provide to create the sub-images is getsubimages. You can configure how the sub-images are created using its parameters; by default, it creates 30 x 30 pixel images, with a stride of 10 and without cropping the image.
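The notebook has the real implementation; as a minimal sketch of the same idea (assuming a height x width x channels numpy image, and a hypothetical function name):

```python
import numpy as np

def extract_subimages(image, size=30, stride=10):
    # Slide a size x size window over the image with the given stride,
    # keeping only windows that fit entirely inside the image (no cropping).
    height, width = image.shape[:2]
    subimages = []
    for y in range(0, height - size + 1, stride):
        for x in range(0, width - size + 1, stride):
            subimages.append(image[y:y + size, x:x + size])
    return subimages
```

For example, a 50 x 50 image with the defaults yields 9 overlapping 30 x 30 sub-images (window origins at 0, 10 and 20 along each axis).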

The Python notebook also has a method to show the images contained in the H5 file, just for testing purposes; the method is called showh5images.

Creating a diagram for a caffe neural network architecture

In this tutorial, I will explain how to save an image with the diagram of a caffe neural network architecture. The Python notebook with the code is in this url.

If we want to create a visualisation of a caffe neural network, we can use caffe's Python library. The first requirement is to install GraphViz and pydot. You can find instructions for GraphViz here.

Now, these are the Python imports required:

import sys

from google.protobuf import text_format

caffe_library = 'caffelibrary/'
sys.path.insert(0, caffe_library)
import caffe
import caffe.draw

This is the method we can use to create the diagram:

def create_network_architecture_diagram(networkfilepath, fileextension='png'):
    # Read the prototxt file with the network architecture
    with open(networkfilepath, "r") as f:
        net_architecture_text = f.read()

    # Parse the text into a NetParameter protobuf message
    net_architecture = caffe.proto.caffe_pb2.NetParameter()
    text_format.Merge(net_architecture_text, net_architecture)
    # Draw the network left-to-right ("LR") and save it next to the prototxt file
    caffe.draw.draw_net_to_file(net_architecture, networkfilepath + '.' + fileextension, "LR")


We can optionally select the format of the output; any file extension supported by GraphViz is valid, e.g. png, jpg, svg or eps. An example of how to use it is shown here:

create_network_architecture_diagram("Path to the prototxt network file", "eps")

The following diagram corresponds to the architecture of the network "net surgery" provided in the caffe examples:


Monday 1 August 2016

Testing a Caffe Model using Python

I've been completely absorbed by my master's degree at UCL and my part-time job at Satalia, but it is time to start a series of posts about the work I'm developing for my dissertation.

I'll start with this post, related to the caffe framework.

This first tutorial is about how to test a pre-trained Caffe model using caffe's Python library. Using Python for testing Caffe is very useful; however, in contrast with the installation docs, there is not much documentation on how to use Caffe's Python library apart from the Python notebooks found in the GitHub repository.

The Python notebook with this code can be found here.



Let's start with the Python library imports that we will need:

from pylab import *
caffe_library = '../caffelibrary/'

import sys
sys.path.insert(0, caffe_library)
import caffe
import os

import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
from sklearn import preprocessing
import scipy.misc

In general, caffe networks are trained using H5 or LMDB files containing multiple images. However, for testing, it is much simpler to input one image at a time and check the results. To do this, it's better to work with a copy of the prototxt network file: remove all the data layers used for training, and modify the data input layer used for testing in the following way:


layer {
  name: "data"
  type: "Input"
  top: "inputdata"
  input_param {
    shape {
      dim: 1
      dim: # of channels
      dim: height
      dim: width
    }
  }
  include { phase: TEST }
}


The height and width of the image can be changed later to match the input size that you want to test. In order to load a pre-trained model, we need the caffe model and the modified prototxt file with the architecture of the network. The following code shows how to load the network:


MODEL_ARCHITECTURE = 'PATH TO prototxt FILE'
MODEL = 'PATH TO PRE-TRAINED MODEL'
net = caffe.Net(MODEL_ARCHITECTURE, MODEL, caffe.TEST)


Because we want to test the current state of the network, we have to pass caffe.TEST as the last parameter to the method; that tells caffe to run the test phase of the network. Once we have the model in memory, we want to test it using images. To do this, we need to scale the input images to the format we used for training. Caffe training is done using numpy arrays of 8-bit unsigned ints, or images scaled into floats from 0 to 1. To do the latter, we can use the following method:

scaler = preprocessing.MinMaxScaler()

def scaleimage(image):
    # Flatten to one row per pixel (one column per colour channel)
    ascolumns = image.reshape(-1, 3)
    # Scale each channel to the range [0, 1]
    t = scaler.fit_transform(ascolumns)
    # Restore the original image shape
    transformed = t.reshape(image.shape)
    return transformed


After scaling, we can use the following method to create the inputs for testing the network:


def createcaffeinput(imagepath):
    # Open the image
    image = np.array(Image.open(imagepath))
    # Scale (0 - 1)
    image = scaleimage(image)
    # [height, width, channels] -> [channels, height, width]
    image = image.transpose(2, 0, 1)
    # Add the leading "number of items" axis expected by caffe
    image = image[np.newaxis, :, :]
    return image
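
As a quick sanity check of the shapes involved, assuming a hypothetical 480 x 640 RGB image:

```python
import numpy as np

# A dummy 480 x 640 RGB image as PIL/numpy would load it: [height, width, channels]
image = np.zeros((480, 640, 3))
# Move channels first, as caffe expects: [channels, height, width]
image = image.transpose(2, 0, 1)
# Add the leading "number of items" axis: [1, channels, height, width]
image = image[np.newaxis, :, :]
print(image.shape)  # (1, 3, 480, 640)
```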


Note that the image is scaled first, and then the shape is transformed. When loading images in Python, the shape is [height, width, channels], but caffe needs the shape as [channels, height, width]. After creating the inputs for caffe, we need to add them to the data blobs of the network. The following method has 3 parameters: net, which is the network; layername, corresponding to the layer where the image will be added; and finally imagepath, containing the path to the image that will be loaded into the caffe blob.


def inputimage(net, layername, imagepath):
    caffeinput = createcaffeinput(imagepath)
    net.blobs[layername].reshape(*caffeinput.shape)
    net.blobs[layername].data[...] = caffeinput


So, if our network has two data input layers, clean and dirty, the following code will fill the data blobs and run a forward pass:


inputimage(net, 'clean', IMAGE_FILE)
inputimage(net, 'dirty', IMAGE_FILE_DIRTY)
out = net.forward()


The variable out will contain the output that the network generates (the value of the loss function). After running a forward pass, we are interested in looking at the results. If we suppose that the last layer containing such results is called result, we can access it in the following way:


net.blobs['result'].data


This Python object will be a numpy array with the same shape as your result layer in caffe. If the result layer outputs images, the array will be a 4-dimensional array. The first dimension contains the number of elements processed, and the other dimensions correspond to the actual result image, so the first element of the result blob is the result image. This is the code to access that element:


result = net.blobs['result'].data[0]


Note how we are only interested in the first element of the array. We can also save this blob to our hard drive as an image with the following code:


resultimage = result.transpose(1, 2, 0)
scipy.misc.imsave('result.jpg', resultimage)


We need to do the transpose before saving to get the [height, width, channels] shape required by the method.


This tutorial has come to an end, please let me know about any comments!