Monday, 1 August 2016

Testing a Caffe Model using Python

I've been completely absorbed by my master's degree at UCL and my part-time job at Satalia, but it is time to start a series of posts about the work I'm doing for my dissertation.




I'll start with posts related to the Caffe framework.

This first tutorial shows how to test a pre-trained Caffe model using Caffe's Python library. Using Python to test Caffe models is very useful; however, in contrast with the installation docs, there is not much documentation on how to use Caffe's Python library apart from the Python notebooks found in the GitHub repository.

The python notebook using this code can be found here.



Let's start with the Python library imports that we will need:

from pylab import *
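# Directory containing Caffe's Python package (adjust to your own setup)
caffe_library = '../caffelibrary/'

import sys
# Put that directory first on the module search path so "import caffe" finds it
sys.path.insert(0, caffe_library)
import caffe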
import os

import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
from sklearn import preprocessing
import scipy.misc

In general, Caffe networks are trained using HDF5 or LMDB files containing many images. For testing, however, it is much simpler to feed in one image at a time and check the results. To do this, it's better to work with a copy of the network's prototxt file: remove all the data layers used for training and modify the data input layers used for testing in the following way:


layer {
  name: "data"
  type: "Input"
  top: "inputdata"
  
  input_param {
    shape {
      dim: 1
      dim: # of channels
      dim: height
      dim: width
    }
  }
  
  include { phase: TEST }
}


The height and width can be changed later to match the size of the image that you want to test. To load a pre-trained model, we need the Caffe model file with the trained weights and the modified prototxt file describing the architecture of the network. The following code shows how to load the network:


MODEL_ARCHITECTURE = 'PATH TO prototxt FILE'
MODEL = 'PATH TO PRE-TRAINED MODEL'
net = caffe.Net(MODEL_ARCHITECTURE, MODEL, caffe.TEST)


Because we want to test the current state of the network, we pass caffe.TEST as the last argument. That tells Caffe that we want to run the test phase of the network. Once we have the model in memory, we want to test it using images. To do this, we need to scale the input images to the format we used for training. Caffe training is done using numpy arrays of 8-bit unsigned integers or images scaled to floats that go from 0 to 1. To do the latter, we can use the following method:
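scaler = preprocessing.MinMaxScaler()

def scaleimage(image):
    # One row per pixel, one column per channel (assumes a 3-channel image)
    ascolumns = image.reshape(-1, 3)
    # Min-max scale each channel to [0, 1]; the scaler is refit on every image
    t = scaler.fit_transform(ascolumns)
    # Restore the original (height, width, channels) shape
    transformed = t.reshape(image.shape)

    return transformed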
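
As a rough illustration of what this scaling does, here is a minimal numpy-only sketch. It assumes that the min-max scaling is applied to each channel independently and that a constant channel maps to 0; the function name minmax_per_channel and the tiny demo image are made up for this example. It also shows the difference from simply dividing by 255: because the range is taken from each image, every image is stretched to its own minimum and maximum.

```python
import numpy as np

def minmax_per_channel(image):
    """Scale each channel of an (H, W, C) image to [0, 1] independently."""
    pixels = image.reshape(-1, image.shape[-1]).astype(np.float64)
    lo = pixels.min(axis=0)
    hi = pixels.max(axis=0)
    # Assumed handling: a constant channel gets a span of 1 to avoid dividing by zero.
    span = np.where(hi > lo, hi - lo, 1.0)
    return ((pixels - lo) / span).reshape(image.shape)

# Hypothetical 2x2 RGB image whose values do not cover the full 0..255 range.
demo = np.array([[[50, 0, 10], [100, 0, 20]],
                 [[150, 0, 30], [200, 0, 40]]], dtype=np.uint8)

scaled = minmax_per_channel(demo)  # each channel stretched to its own min/max
fixed = demo / 255.0               # fixed-range alternative, for comparison
```

Whichever variant is used for testing should match the one used when the training data was prepared.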


After scaling, we can use the following method to create the inputs for testing the network:


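def createcaffeinput(imagepath):
    # Open the image as a (height, width, channels) array
    image = np.array(Image.open(imagepath))
    # Scale (0 - 1)
    image = scaleimage(image)
    # Reorder the axes to (channels, height, width)
    image = image.transpose(2, 0, 1)
    # Add a leading batch dimension: (1, channels, height, width)
    image = image[np.newaxis, :, :]

    return image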
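
The shape changes made by createcaffeinput can be checked on a dummy array without opening any file. The sketch below uses a made-up 4x6 RGB array as a stand-in for a loaded image and follows a single marked pixel through the transpose and the added batch dimension.

```python
import numpy as np

# Hypothetical stand-in for a loaded image: 4 rows, 6 columns, 3 channels.
hwc = np.zeros((4, 6, 3), dtype=np.float32)
hwc[1, 2, 0] = 7.0  # mark one pixel in channel 0 so we can track it

chw = hwc.transpose(2, 0, 1)   # (channels, height, width)
nchw = chw[np.newaxis, :, :]   # (1, channels, height, width)

print(hwc.shape, chw.shape, nchw.shape)
# (4, 6, 3) (3, 4, 6) (1, 3, 4, 6)

# The marked pixel is now addressed as [batch, channel, row, column].
print(nchw[0, 0, 1, 2])
```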


Note that the image is scaled first, and then its shape is transformed. When loading images in Python, the shape comes out as [height, width, channels], but Caffe needs it as [channels, height, width], preceded by a batch dimension, which is what np.newaxis adds. After creating the inputs for Caffe, we need to add them to the data blobs of the network. The following method has 3 parameters: net, which is the network; layername, the name of the input blob (the top of the input layer) where the image will be added; and imagepath, the path to the image that will be loaded into the Caffe blob.


def inputimage(net, layername, imagepath):
    caffeinput = createcaffeinput(imagepath)
    # Resize the blob to match the image, then copy the image data into it
    net.blobs[layername].reshape(*caffeinput.shape)
    net.blobs[layername].data[...] = caffeinput


So, if our network has two data inputs, clean and dirty, the following code will fill the data blobs and run a forward pass:


inputimage(net, 'clean', IMAGE_FILE)
inputimage(net, 'dirty', IMAGE_FILE_DIRTY)
out = net.forward()
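
When it is not obvious which blob holds what, it helps to list every blob name together with the shape of its data. The helper below is a small sketch of that idea; blob_shapes is a name invented for this example, and fake_net is only a stand-in object (not a real caffe.Net) so that the snippet runs without Caffe installed. With a loaded network you would call blob_shapes(net) instead.

```python
from collections import OrderedDict
from types import SimpleNamespace

import numpy as np

def blob_shapes(net):
    """Return (blob name, data shape) pairs for every blob in the network."""
    return [(name, blob.data.shape) for name, blob in net.blobs.items()]

# Stand-in with the same .blobs[name].data layout, using made-up 8x8 inputs.
fake_net = SimpleNamespace(blobs=OrderedDict([
    ('clean', SimpleNamespace(data=np.zeros((1, 3, 8, 8)))),
    ('dirty', SimpleNamespace(data=np.zeros((1, 3, 8, 8)))),
    ('result', SimpleNamespace(data=np.zeros((1, 3, 8, 8)))),
]))

for name, shape in blob_shapes(fake_net):
    print(name, shape)
```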


The variable out will contain the output generated by the network: a dictionary that maps the name of each output blob to its value (here, the value of the loss function). After running a forward pass, we are interested in looking at the results. If we suppose that the last layer containing such results is called result, we can access it in the following way:


net.blobs['result'].data


This Python object will be a numpy array with the same shape that your result layer has in Caffe. If the result layer outputs images, the array will have 4 dimensions. The first dimension indexes the elements processed and the other dimensions correspond to the actual result image, so the first element of the result blob is the result image. This is the code to access that element:


result = net.blobs['result'].data[0]


Note how we are only interested in the first element of the array. We can also save this blob to our hard drive as an image with the following code:


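# Reorder from (channels, height, width) to (height, width, channels)
resultimage = result.transpose(1, 2, 0)
# Note: scipy.misc.imsave was removed in SciPy 1.2, so this needs an older SciPy
scipy.misc.imsave('result.jpg', resultimage)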


We need to transpose before saving because the method expects the shape [height, width, channels].
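
On installations where scipy.misc.imsave is not available, one possible alternative is to convert the result to 8-bit pixels yourself and save it with Pillow. The sketch below is only that, a sketch: the function names are invented for this example, it assumes a 3-channel result whose values already lie between 0 and 1, and it clips anything outside that range, which may not match how imsave treated the data.

```python
import numpy as np

def to_uint8_image(result):
    """Convert a (C, H, W) float array in [0, 1] to an (H, W, C) uint8 array."""
    hwc = result.transpose(1, 2, 0)
    # Assumption: values outside [0, 1] are clipped rather than rescaled.
    clipped = np.clip(hwc, 0.0, 1.0)
    return (clipped * 255.0).round().astype(np.uint8)

def save_result(result, path):
    # Pillow is imported here so the conversion above can be used without it.
    from PIL import Image
    Image.fromarray(to_uint8_image(result)).save(path)

# Hypothetical result blob: 3 channels, 4 rows, 6 columns.
demo = np.random.rand(3, 4, 6).astype(np.float32)
pixels = to_uint8_image(demo)
```

With a real network, the call would look like save_result(net.blobs['result'].data[0], 'result.jpg').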


This tutorial has come to an end. Please let me know if you have any comments!
