Image similarity search with LIRE

Content-based image similarity search — finding the images in a database that are visually similar to a query image on the basis of pixel data alone — is a hot topic, and applications are numerous. Just recently, Amazon introduced a visual similarity search for some products, and Google introduced a search-by-image feature.

In addition to hands-on applications, there is a lot of research on the topic, and the abilities of university-maintained image search engines grow constantly. Unfortunately, implementing state-of-the-art techniques requires a good deal of specialist knowledge and some mathematical sophistication. Fortunately, however, there are simple yet effective techniques for which open source libraries exist.

One such library for Java is LIRE. In this short article, I demonstrate how to implement a basic linear content-based image similarity search using this library.

The aim

Today’s aim is to retrieve all the images in a database of JPEG images that are sufficiently similar to a given search image (also in JPEG format). The sample application will serve this aim indirectly, as follows:

(USER STORY) As a user, I can put JPEG images into a directory and then call a console application with the directory path and the full file path of a JPEG search image. The application then lists the names of all the images in the directory, sorted so that the most similar image comes up first and the most dissimilar image comes up last.

A simple similarity search algorithm

How does this work? In general, how can we determine which images are more similar to the search image than others?

The basic idea is to use a function that maps images into an n-dimensional Euclidean vector space (R^n) such that similar images are mapped to points close to each other and dissimilar images are mapped to points far apart. A vector distance such as the plain Euclidean distance then serves as a dissimilarity measure: the smaller the distance between two vectors, the more similar the corresponding images. To build the result list for our sample application, we just need to sort the array containing all the images in a directory by their distance to the specified search image.
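The distance computation at the heart of this idea takes only a few lines. Here is a minimal Java sketch (the vectors in the demo are made-up values for illustration):

```java
public class EuclideanDistance {

    // Euclidean distance between two feature vectors of equal length
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        // Points close together yield a small distance, points far apart a large one.
        System.out.println(distance(new double[]{1, 2, 3}, new double[]{1, 2, 4})); // 1.0
        System.out.println(distance(new double[]{0, 0, 0}, new double[]{3, 4, 0})); // 5.0
    }
}
```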

A very simple example of such a mapping is the function that maps each image to a three-dimensional vector representing the image’s average color in RGB (i.e. the arithmetic mean of all RGB pixel values). Such a mapping will look somewhat like this (I have not calculated the real means of the pixel colors; the values below are just examples that suffice to grasp the idea):

Assigned average color
Image 1: [138, 157, 111]
Image 2: [158, 175, 105]
Image 3: [194, 92, 100]

The first and the second image are perceived as similar to each other, whereas the third image is perceived as dissimilar to the first two. The mapping respects this perception: the first two vectors have a Euclidean distance of approximately 28, whereas the first and the third vector have a distance of approximately 86, and the second and the third vector a distance of approximately 91.

If our database (directory) contained only the last two images and the first image were the specified search image, our application (algorithm) would print a list with two entries, in which the second image comes up first and the third image comes up second. To find the images sufficiently similar to the search image, we empirically determine a threshold. In the case at hand, it should be greater than 28 but less than 86; then we correctly identify the second image as similar to the search image and the third image as dissimilar to it.
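To make the numbers concrete, here is a small Java check of the distances above and of a threshold applied to them (the threshold value 50 is just an illustrative pick between 28 and 86):

```java
public class ThresholdDemo {

    // Euclidean distance between two average-color vectors
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        double[] first  = {138, 157, 111};  // search image
        double[] second = {158, 175, 105};
        double[] third  = {194, 92, 100};

        double threshold = 50.0;  // empirically chosen: between 28 and 86

        System.out.printf("d(first, second) = %.0f%n", distance(first, second)); // 28
        System.out.printf("d(first, third)  = %.0f%n", distance(first, third));  // 86
        System.out.println("second similar: " + (distance(first, second) < threshold)); // true
        System.out.println("third similar:  " + (distance(first, third) < threshold));  // false
    }
}
```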

Very well. Of course, this method only works because the third, dissimilar image has totally different colors. There are, however, cases where images are dissimilar despite having similar colors, and for those the method above will not be good enough.

Implementation

What do better functions mapping images to vectors look like? Vectors that represent images are called feature vectors, and a variety of them is available. Two popular descriptors, each combining color with a second feature, are CEDD (Color and Edge Directivity Descriptor) and FCTH (Fuzzy Color and Texture Histogram). Both have the desired property that visually similar images are mapped to close points in an n-dimensional vector space. An overview of popular feature vectors can be found in Chatzichristofis’s publications on these descriptors.

Both types of feature vectors are supported by LIRE. In this article I use FCTH, because it delivered the best results in our experiments. Strictly speaking, the FCTH of an image is a 192-bin histogram with three bits per bin. For our purposes, it can also be conceived of as a feature vector of 192 double values, each constrained to one of 0.0, 1.0, …, 7.0. Here is an example:

FCTH vector (first 10 values)
Image 1: [7.0, 6.0, 0.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0, 7.0, …]
Image 2: [6.0, 6.0, 0.0, 0.0, 1.0, 1.0, 1.0, 2.0, 1.0, 7.0, …]
Image 3: [1.0, 0.0, 0.0, 6.0, 6.0, 4.0, 5.0, 3.0, 0.0, 0.0, …]

We obtain the desired result: the visually similar images are assigned similar feature vectors, and the visually dissimilar image is assigned a very different one, so that the Euclidean distance between the vectors of the first two images is smaller than their distances to the vector of the third image.

So, as above, we can assign such a vector to every image, and then calculate Euclidean distances to sort the results.

Now let’s write some code. The LIRE library makes extracting FCTH vectors as arrays of double values very straightforward. All you need to do is create the appropriate DocumentBuilder, which builds a Lucene document containing the feature vector data. This data can then be loaded into an FCTH object, from which a double array can be extracted via the getDoubleHistogram method. Here is how to get the feature vector from the file path of a JPEG image:
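The extraction step might be sketched as follows. The class and method names (DocumentBuilderFactory.getFCTHDocumentBuilder, setByteArrayRepresentation, getDoubleHistogram) are from LIRE, but the exact field index and byte accessor depend on the LIRE/Lucene versions in use, so treat this as a sketch rather than version-exact code:

```java
import java.io.File;
import javax.imageio.ImageIO;

import net.semanticmetadata.lire.DocumentBuilder;
import net.semanticmetadata.lire.DocumentBuilderFactory;
import net.semanticmetadata.lire.imageanalysis.FCTH;
import org.apache.lucene.document.Document;

public class FcthExtractor {

    // Extracts the FCTH feature vector (192 double values) from a JPEG file.
    static double[] extractFcth(String imagePath) throws Exception {
        // A DocumentBuilder that describes images by their FCTH descriptor.
        DocumentBuilder builder = DocumentBuilderFactory.getFCTHDocumentBuilder();
        Document doc = builder.createDocument(ImageIO.read(new File(imagePath)), imagePath);

        // Restore the descriptor from the document's binary field. With newer
        // Lucene versions, the accessor may instead be
        // doc.getFields().get(1).binaryValue().bytes.
        FCTH fcth = new FCTH();
        fcth.setByteArrayRepresentation(doc.getFields().get(0).getBinaryValue());
        return fcth.getDoubleHistogram();
    }
}
```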


The remainder of the code of our sample application is straightforward. All we need to do is build an array containing the feature vectors of all JPEG images in the directory and sort it by the Euclidean distance between the respective feature vector and the feature vector of the search image. In Java, this is quickly done. I have created a sample Maven project on GitHub showing how to implement the sketched idea.
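The sorting step itself needs no LIRE at all. Assuming we already have a feature vector per file, a plain-Java sketch (the names and vectors here are my own for illustration, not taken from the sample project) could read:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SimilaritySorter {

    // Euclidean distance between two feature vectors of equal length
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Returns the file names sorted by ascending distance to the query vector,
    // i.e. most similar first, most dissimilar last.
    static List<String> rank(Map<String, double[]> features, double[] query) {
        List<String> names = new ArrayList<>(features.keySet());
        names.sort(Comparator.comparingDouble(n -> distance(features.get(n), query)));
        return names;
    }

    public static void main(String[] args) {
        Map<String, double[]> features = new LinkedHashMap<>();
        features.put("sunset.jpg", new double[]{7, 6, 0});
        features.put("beach.jpg",  new double[]{6, 6, 0});
        features.put("forest.jpg", new double[]{1, 0, 0});

        // A query vector resembling the first two images:
        System.out.println(rank(features, new double[]{7, 5, 0}));
        // → [sunset.jpg, beach.jpg, forest.jpg]
    }
}
```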

Here is the link: https://github.com/aoldemeier/ImageSimilarityWithLIRE

Enjoy!

Final notes

For large databases of images, the provided sample code is too slow. Fortunately, there are more efficient search algorithms for high-dimensional data such as image feature vectors. A popular technique for large-scale content-based image search, which we employ at Mayflower, is locality-sensitive hashing. That, however, has to be left for another day.
