
Building an Image Search Engine with OpenCV

2016-05-21  胡哈哈哈

A brief introduction to OpenCV

OpenCV was designed for computational efficiency and with a strong focus on real-time applications. Written in optimized C/C++, the library can take advantage of multi-core processing. Enabled with OpenCL, it can take advantage of the hardware acceleration of the underlying heterogeneous compute platform. Adopted all around the world, OpenCV has more than 47 thousand people of user community and estimated number of downloads exceeding 9 million. Usage ranges from interactive art, to mines inspection, stitching maps on the web or through advanced robotics.

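In short, OpenCV (Open Source Computer Vision Library) is computationally efficient and suited to real-time tasks. The library is written in optimized C/C++ and can take advantage of multi-core processing and hardware acceleration. It has a large user community and an estimated 9 million or more downloads, and it is used in a wide range of areas, such as interactive art, mine inspection, and map stitching.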

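0. An image search engine with Python + OpenCV

After seeing the image search engines released by Google and Baidu, I read up on how image search algorithms work, drawing in part on "The complete guide to building an image search engine with Python and OpenCV". I then decided to implement a simple image search engine myself, which would also let me find pictures on my Mac more quickly. Why use OpenCV + Python to implement an image search engine?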

1. How image search works

An image search algorithm can be broken down into roughly the following steps:

2. Algorithm and framework design of the image search engine

Basic steps

Required modules

Wrapper classes and driver programs

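  1. The class member bins holds the number of bins used for the hue, saturation, and value axes of the HSV histogram. Too many bins makes the program slow and makes matching overly strict; too few bins loses precision, so the histogram no longer characterizes the image.
  2. The member function getHistogram(self, image, mask, isCenter) builds the color histogram of the image. image is the image to process, mask selects the region of the image to process, and isCenter says whether that region is the center of the image, so that the center's color features can be weighted more heavily. The weight is 5.0. The histogram is computed with OpenCV's calcHist() and normalized with normalize().
  3. The member function describe(self, image) converts the image from BGR to HSV color space (note that OpenCV reads images as BGR, not RGB). It then builds masks for the top-left, top-right, bottom-left, bottom-right, and center regions. The center mask is an ellipse. This separates the center from the edges, so that getHistogram() can weight the color features of each region differently.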
class ColorDescriptor:
    __slots__ = ["bins"]
    def __init__(self, bins):
        self.bins = bins
    def getHistogram(self, image, mask, isCenter):
        # get histogram
        imageHistogram = cv2.calcHist([image], [0, 1, 2], mask, self.bins, [0, 180, 0, 256, 0, 256])
        # normalize
        imageHistogram = cv2.normalize(imageHistogram, imageHistogram).flatten()
        if isCenter:
            # the center region counts 5x as much as each corner region
            weight = 5.0
            imageHistogram *= weight
        return imageHistogram
    def describe(self, image):
        image = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
        features = []
        # get dimension and center
        height, width = image.shape[0], image.shape[1]
        centerX, centerY = int(width * 0.5), int(height * 0.5)
        # initialize mask dimension
        segments = [(0, centerX, 0, centerY), (0, centerX, centerY, height), (centerX, width, 0, centerY), (centerX, width, centerY, height)]
        # initialize center part
        # integer division: cv2.ellipse() expects integer axis lengths
        axesX, axesY = int(width * 0.75) // 2, int(height * 0.75) // 2
        ellipseMask = numpy.zeros([height, width], dtype="uint8")
        cv2.ellipse(ellipseMask, (centerX, centerY), (axesX, axesY), 0, 0, 360, 255, -1)
        # initialize corner part
        for startX, endX, startY, endY in segments:
            cornerMask = numpy.zeros([height, width], dtype="uint8")
            cv2.rectangle(cornerMask, (startX, startY), (endX, endY), 255, -1)
            cornerMask = cv2.subtract(cornerMask, ellipseMask)
            # get histogram of corner part
            imageHistogram = self.getHistogram(image, cornerMask, False)
            features.append(imageHistogram)
        # get histogram of center part
        imageHistogram = self.getHistogram(image, ellipseMask, True)
        features.append(imageHistogram)
        # return
        return features
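To make the feature layout concrete, here is a minimal pure-Python sketch of where a single HSV pixel lands in the flattened histogram, and of which region a pixel belongs to. It assumes the bins (8, 12, 3) and the ranges [0, 180, 0, 256, 0, 256] used in this article; the helper names hsv_bin_index and region_of are hypothetical and not part of OpenCV, and the region test only approximates the masks drawn by cv2.ellipse() and cv2.rectangle().

```python
# Sketch only: mirrors the bin layout implied by bins=(8, 12, 3) and the
# ranges [0, 180, 0, 256, 0, 256]. Helper names are hypothetical.
BINS = (8, 12, 3)
RANGES = ((0, 180), (0, 256), (0, 256))

def hsv_bin_index(h, s, v, bins=BINS, ranges=RANGES):
    """Return the flattened histogram index of one HSV pixel."""
    indices = []
    for value, count, (low, high) in zip((h, s, v), bins, ranges):
        # uniform bins: scale the value into [0, count)
        indices.append((value - low) * count // (high - low))
    h_bin, s_bin, v_bin = indices
    # row-major flattening, the default order of numpy's flatten()
    return (h_bin * bins[1] + s_bin) * bins[2] + v_bin

def region_of(x, y, width, height):
    """Classify a pixel as 'center' or as one of the four corner regions."""
    center_x, center_y = width // 2, height // 2
    axis_x, axis_y = int(width * 0.75) // 2, int(height * 0.75) // 2
    dx = (x - center_x) / float(axis_x)
    dy = (y - center_y) / float(axis_y)
    if dx * dx + dy * dy <= 1.0:
        return "center"
    vertical = "top" if y < center_y else "bottom"
    horizontal = "left" if x < center_x else "right"
    return vertical + "-" + horizontal

bins_per_region = BINS[0] * BINS[1] * BINS[2]  # values per region histogram
feature_length = bins_per_region * 5           # four corners plus the center
```

With these settings each region contributes 288 values, so one image is described by 1440 numbers in total.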
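  1. The class member dimension. Every image is downsampled to the size given by dimension. Only then can the images be matched uniformly and used to generate the structure (composition) features.
  2. The member function describe(self, image) resizes the image to dimension with bicubic interpolation and returns the resulting matrix for the next step in the search engine core. The BGR-to-HSV conversion is commented out in the code, so the returned matrix stays in BGR (note that OpenCV reads images as BGR, not RGB).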
class StructureDescriptor:
    __slots__ = ["dimension"]
    def __init__(self, dimension):
        self.dimension = dimension
    def describe(self, image):
        image = cv2.resize(image, self.dimension, interpolation=cv2.INTER_CUBIC)
        # image = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
        return image
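The effect of the structure descriptor can be illustrated without OpenCV. The sketch below swaps in nearest-neighbour sampling for the bicubic interpolation that cv2.resize() performs above, so the pixel values will differ from the real descriptor; only the shape of the result is the same. The function name downsample_nearest and the synthetic image are assumptions made for the illustration.

```python
# Sketch only: nearest-neighbour downsampling stands in for
# cv2.resize(..., interpolation=cv2.INTER_CUBIC). Names are hypothetical.
def downsample_nearest(pixels, new_width, new_height):
    """Shrink a row-major grid of (b, g, r) tuples to new_width x new_height."""
    old_height = len(pixels)
    old_width = len(pixels[0])
    result = []
    for y in range(new_height):
        source_y = y * old_height // new_height
        row = [pixels[source_y][x * old_width // new_width]
               for x in range(new_width)]
        result.append(row)
    return result

# synthetic 64x48 image whose channels encode the pixel position
image = [[(x, y, 0) for x in range(64)] for y in range(48)]
small = downsample_nearest(image, 16, 16)

# flatten to the vector form that is compared during the search
flat = [float(channel) for row in small for pixel in row for channel in pixel]
```

A 16 x 16 image with three channels gives 768 values per image, which is why the structure index rows are much shorter than the original images.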
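  1. The class members colorIndexPath and structureIndexPath hold the paths of the color feature index and the structure feature index.
  2. The member function solveColorDistance(self, features, queryFeatures, eps = 1e-5) computes the chi-squared distance between the features and queryFeatures vectors. eps avoids division by zero.
  3. The member function solveStructureDistance(self, structures, queryStructures, eps = 1e-5) computes the same kind of distance for the structure vectors, again with eps to avoid division by zero. The result is rescaled (divided by normalizeRatio) so that the color and structure distances are of comparable magnitude and neither dominates the total.
  4. The member function searchByColor(self, queryFeatures) reads the index with the reader of the csv module and parses each field into a list of floats. The dictionary searchResults stores the distance between the query image and each image in the library: the key is the library image name imageName and the value is the distance.
  5. The member function transformRawQuery(self, rawQueryStructures) converts the raw query image matrix into the vector form used for matching.
  6. The member function searchByStructure(self, rawQueryStructures) works like item 4.
  7. The member function search(self, queryFeatures, rawQueryStructures, limit = 3) adds up the results of searchByColor and searchByStructure to get a total score. A lower score means a smaller combined distance and therefore a better match. It returns the limit best-matching images.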
class Searcher:
    __slots__ = ["colorIndexPath", "structureIndexPath"]
    def __init__(self, colorIndexPath, structureIndexPath):
        self.colorIndexPath, self.structureIndexPath = colorIndexPath, structureIndexPath
    def solveColorDistance(self, features, queryFeatures, eps = 1e-5):
        distance = 0.5 * numpy.sum([((a - b) ** 2) / (a + b + eps) for a, b in zip(features, queryFeatures)])
        return distance
    def solveStructureDistance(self, structures, queryStructures, eps = 1e-5):
        distance = 0
        normalizeRatio = 5e3
        for index in range(len(queryStructures)):
            for subIndex in range(len(queryStructures[index])):
                a = structures[index][subIndex]
                b = queryStructures[index][subIndex]
                distance += (a - b) ** 2 / (a + b + eps)
        return distance / normalizeRatio
    def searchByColor(self, queryFeatures):
        searchResults = {}
        with open(self.colorIndexPath) as indexFile:
            reader = csv.reader(indexFile)
            for line in reader:
                features = []
                for feature in line[1:]:
                    # each field looks like "[0.1 0.2 ...]"; split() drops empty strings
                    feature = feature.replace("[", "").replace("]", "").split()
                    feature = [float(eachValue) for eachValue in feature]
                    features.append(feature)
                distance = self.solveColorDistance(features, queryFeatures)
                searchResults[line[0]] = distance
        # print "feature", sorted(searchResults.iteritems(), key = lambda item: item[1], reverse = False)
        return searchResults
    def transformRawQuery(self, rawQueryStructures):
        queryStructures = []
        for substructure in rawQueryStructures:
            structure = []
            for line in substructure:
                for tripleColor in line:
                    structure.append(float(tripleColor))
            queryStructures.append(structure)
        return queryStructures
    def searchByStructure(self, rawQueryStructures):
        searchResults = {}
        queryStructures = self.transformRawQuery(rawQueryStructures)
        with open(self.structureIndexPath) as indexFile:
            reader = csv.reader(indexFile)
            for line in reader:
                structures = []
                for structure in line[1:]:
                    # strip the array brackets; split() handles repeated whitespace
                    structure = structure.replace("[", "").replace("]", "").split()
                    structure = [float(eachValue) for eachValue in structure]
                    structures.append(structure)
                distance = self.solveStructureDistance(structures, queryStructures)
                searchResults[line[0]] = distance
        # print "structure", sorted(searchResults.iteritems(), key = lambda item: item[1], reverse = False)
        return searchResults
    def search(self, queryFeatures, rawQueryStructures, limit = 3):
        featureResults = self.searchByColor(queryFeatures)
        structureResults = self.searchByStructure(rawQueryStructures)
        results = {}
        for key, value in featureResults.items():
            results[key] = value + structureResults[key]
        results = sorted(results.items(), key = lambda item: item[1])
        return results[ : limit]
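The distance used by both solveColorDistance and solveStructureDistance has the form of a chi-squared distance. A minimal standalone sketch, using plain Python lists instead of the NumPy arrays in the class above (the function name chi2_distance is hypothetical):

```python
# Sketch only: the same formula as in Searcher, on plain lists of floats.
def chi2_distance(hist_a, hist_b, eps=1e-5):
    """Chi-squared distance; eps keeps the denominator away from zero."""
    return 0.5 * sum((a - b) ** 2 / (a + b + eps)
                     for a, b in zip(hist_a, hist_b))

same = chi2_distance([0.2, 0.8], [0.2, 0.8])       # identical histograms
disjoint = chi2_distance([1.0, 0.0], [0.0, 1.0])   # no overlap at all
```

Identical histograms give a distance of exactly zero, which is why the query image itself always ranks first with a score of 0 when it is part of the library. For normalized histograms with no overlap the distance approaches 1.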
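python index.py --dataset dataset --colorindex color_index.csv --structureindex structure_index.csv

dataset is the path of the image library. color_index.csv is the path of the color feature index, and structure_index.csv is the path of the structure (composition) feature index.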

import color_descriptor
import structure_descriptor
import glob
import argparse
import cv2

searchArgParser = argparse.ArgumentParser()
searchArgParser.add_argument("-d", "--dataset", required = True, help = "Path to the directory that contains the images to be indexed")
searchArgParser.add_argument("-c", "--colorindex", required = True, help = "Path to where the computed color index will be stored")
searchArgParser.add_argument("-s", "--structureindex", required = True, help = "Path to where the computed structure index will be stored")
arguments = vars(searchArgParser.parse_args())

idealBins = (8, 12, 3)
colorDesriptor = color_descriptor.ColorDescriptor(idealBins)

output = open(arguments["colorindex"], "w")

for imagePath in glob.glob(arguments["dataset"] + "/*.jpg"):
    imageName = imagePath[imagePath.rfind("/") + 1 : ]
    image = cv2.imread(imagePath)
    features = colorDesriptor.describe(image)
    # write features to file
    features = [str(feature).replace("\n", "") for feature in features]
    output.write("%s,%s\n" % (imageName, ",".join(features)))
# close index file
output.close()

idealDimension = (16, 16)
structureDescriptor = structure_descriptor.StructureDescriptor(idealDimension)

output = open(arguments["structureindex"], "w")

for imagePath in glob.glob(arguments["dataset"] + "/*.jpg"):
    imageName = imagePath[imagePath.rfind("/") + 1 : ]
    image = cv2.imread(imagePath)
    structures = structureDescriptor.describe(image)
    # write structures to file
    structures = [str(structure).replace("\n", "") for structure in structures]
    output.write("%s,%s\n" % (imageName, ",".join(structures)))
# close index file
output.close()
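The index above stores each feature as the printed form of an array, which then has to be parsed back by stripping brackets. One possible alternative, shown here only as a sketch and not as the format this article uses, is to write one plain float per CSV field so that the standard csv module can read the values back directly. The names write_index and read_index are hypothetical, and an in-memory buffer stands in for the index file.

```python
# Sketch only: an alternative index format, one float per CSV field.
import csv
import io

def write_index(handle, entries):
    """Write (image name, feature vector) pairs as CSV rows."""
    writer = csv.writer(handle)
    for name, vector in entries:
        writer.writerow([name] + [repr(float(value)) for value in vector])

def read_index(handle):
    """Read the rows back into a dict of name -> list of floats."""
    index = {}
    for row in csv.reader(handle):
        if row:  # skip blank lines
            index[row[0]] = [float(value) for value in row[1:]]
    return index

buffer = io.StringIO()  # stands in for open("color_index.csv", "w")
write_index(buffer, [("fish.jpg", [0.0, 0.25, 0.5]),
                     ("zebra.jpg", [1.0, 0.0, 0.125])])
buffer.seek(0)
index = read_index(buffer)
```

This flattens the five region histograms into one row, so a reader that needs them separately would have to split the row into equal chunks again.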

python searchEngine.py -c color_index.csv -s structure_index.csv -r dataset -q query/pyramid.jpg 

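dataset is the path of the image library. color_index.csv is the path of the color feature index, structure_index.csv is the path of the structure (composition) feature index, and query/pyramid.jpg is the path of the image to search for.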

searchArgParser = argparse.ArgumentParser()
searchArgParser.add_argument("-c", "--colorindex", required = True, help = "Path to where the computed color index will be stored")
searchArgParser.add_argument("-s", "--structureindex", required = True, help = "Path to where the computed structure index will be stored")
searchArgParser.add_argument("-q", "--query", required = True, help = "Path to the query image")
searchArgParser.add_argument("-r", "--resultpath", required = True, help = "Path to the result path")
searchArguments = vars(searchArgParser.parse_args())

idealBins = (8, 12, 3)
idealDimension = (16, 16)

colorDescriptor = color_descriptor.ColorDescriptor(idealBins)
structureDescriptor = structure_descriptor.StructureDescriptor(idealDimension)
queryImage = cv2.imread(searchArguments["query"])
colorIndexPath = searchArguments["colorindex"]
structureIndexPath = searchArguments["structureindex"]
resultPath = searchArguments["resultpath"]

queryFeatures = colorDescriptor.describe(queryImage)
queryStructures = structureDescriptor.describe(queryImage)

imageSearcher = searcher.Searcher(colorIndexPath, structureIndexPath)
searchResults = imageSearcher.search(queryFeatures, queryStructures)

for imageName, score in searchResults:
    queryResult = cv2.imread(resultPath + "/" + imageName)
    cv2.imshow("Result Score: " + str(int(score)) + " (lower is better)", queryResult)
    cv2.waitKey(0)

cv2.imshow("Query", queryImage)
cv2.waitKey(0)
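The ranking step can also be tried without any images. The sketch below adds up two score dictionaries, as search() does, and picks the best matches with heapq.nsmallest instead of sorting the full list. The function name combine_scores and the sample scores are made up for the illustration.

```python
# Sketch only: combine color and structure distances and keep the best few.
import heapq

def combine_scores(color_scores, structure_scores, limit=3):
    """Return the `limit` images with the lowest total distance."""
    combined = {name: color_scores[name] + structure_scores[name]
                for name in color_scores if name in structure_scores}
    return heapq.nsmallest(limit, combined.items(), key=lambda item: item[1])

color_scores = {"a.jpg": 0.0, "b.jpg": 10.0, "c.jpg": 3.0, "d.jpg": 7.0}
structure_scores = {"a.jpg": 0.0, "b.jpg": 1.0, "c.jpg": 5.0, "d.jpg": 2.0}
best = combine_scores(color_scores, structure_scores)
```

Unlike the loop in search(), this version skips images that are missing from either index instead of raising a KeyError; whether that is the right behaviour depends on how the two index files are kept in sync.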

3. Testing the search engine

Query: fish.jpg

Results (lower score is better):

  1. Score: 0 [image: fish]
  2. Score: 17 [image: fish]
  3. Score: 21 [image: fish]

Query: forest.jpg

Results (lower score is better):

  1. Score: 0 [image: forest]
  2. Score: 33 [image: forest]
  3. Score: 33 [image: forest]

Query: trip.jpg

Results (lower score is better):

  1. Score: 0 [image: trip]
  2. Score: 23 [image: trip]
  3. Score: 24 [image: trip]

Query: zebra.jpg

Results (lower score is better):

  1. Score: 0 [image: zebra]
  2. Score: 23 [image: zebra]
  3. Score: 25 [image: zebra]

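Summary: the search always finds the identical image (the original itself), and the other images returned are broadly consistent with the query. The test succeeded.

4. Python source code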

color_descriptor.py

import cv2
import numpy

class ColorDescriptor:
    __slots__ = ["bins"]
    def __init__(self, bins):
        self.bins = bins
    def getHistogram(self, image, mask, isCenter):
        # get histogram
        imageHistogram = cv2.calcHist([image], [0, 1, 2], mask, self.bins, [0, 180, 0, 256, 0, 256])
        # normalize
        imageHistogram = cv2.normalize(imageHistogram, imageHistogram).flatten()
        if isCenter:
            # the center region counts 5x as much as each corner region
            weight = 5.0
            imageHistogram *= weight
        return imageHistogram
    def describe(self, image):
        image = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
        features = []
        # get dimension and center
        height, width = image.shape[0], image.shape[1]
        centerX, centerY = int(width * 0.5), int(height * 0.5)
        # initialize mask dimension
        segments = [(0, centerX, 0, centerY), (0, centerX, centerY, height), (centerX, width, 0, centerY), (centerX, width, centerY, height)]
        # initialize center part
        # integer division: cv2.ellipse() expects integer axis lengths
        axesX, axesY = int(width * 0.75) // 2, int(height * 0.75) // 2
        ellipseMask = numpy.zeros([height, width], dtype="uint8")
        cv2.ellipse(ellipseMask, (centerX, centerY), (axesX, axesY), 0, 0, 360, 255, -1)
        # initialize corner part
        for startX, endX, startY, endY in segments:
            cornerMask = numpy.zeros([height, width], dtype="uint8")
            cv2.rectangle(cornerMask, (startX, startY), (endX, endY), 255, -1)
            cornerMask = cv2.subtract(cornerMask, ellipseMask)
            # get histogram of corner part
            imageHistogram = self.getHistogram(image, cornerMask, False)
            features.append(imageHistogram)
        # get histogram of center part
        imageHistogram = self.getHistogram(image, ellipseMask, True)
        features.append(imageHistogram)
        # return
        return features

structure_descriptor.py

import cv2

class StructureDescriptor:
    __slots__ = ["dimension"]
    def __init__(self, dimension):
        self.dimension = dimension
    def describe(self, image):
        image = cv2.resize(image, self.dimension, interpolation=cv2.INTER_CUBIC)
        # image = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
        return image

searcher.py

import numpy
import csv
import re

class Searcher:
    __slots__ = ["colorIndexPath", "structureIndexPath"]
    def __init__(self, colorIndexPath, structureIndexPath):
        self.colorIndexPath, self.structureIndexPath = colorIndexPath, structureIndexPath
    def solveColorDistance(self, features, queryFeatures, eps = 1e-5):
        distance = 0.5 * numpy.sum([((a - b) ** 2) / (a + b + eps) for a, b in zip(features, queryFeatures)])
        return distance
    def solveStructureDistance(self, structures, queryStructures, eps = 1e-5):
        distance = 0
        normalizeRatio = 5e3
        for index in range(len(queryStructures)):
            for subIndex in range(len(queryStructures[index])):
                a = structures[index][subIndex]
                b = queryStructures[index][subIndex]
                distance += (a - b) ** 2 / (a + b + eps)
        return distance / normalizeRatio
    def searchByColor(self, queryFeatures):
        searchResults = {}
        with open(self.colorIndexPath) as indexFile:
            reader = csv.reader(indexFile)
            for line in reader:
                features = []
                for feature in line[1:]:
                    # each field looks like "[0.1 0.2 ...]"; split() drops empty strings
                    feature = feature.replace("[", "").replace("]", "").split()
                    feature = [float(eachValue) for eachValue in feature]
                    features.append(feature)
                distance = self.solveColorDistance(features, queryFeatures)
                searchResults[line[0]] = distance
        # print "feature", sorted(searchResults.iteritems(), key = lambda item: item[1], reverse = False)
        return searchResults
    def transformRawQuery(self, rawQueryStructures):
        queryStructures = []
        for substructure in rawQueryStructures:
            structure = []
            for line in substructure:
                for tripleColor in line:
                    structure.append(float(tripleColor))
            queryStructures.append(structure)
        return queryStructures
    def searchByStructure(self, rawQueryStructures):
        searchResults = {}
        queryStructures = self.transformRawQuery(rawQueryStructures)
        with open(self.structureIndexPath) as indexFile:
            reader = csv.reader(indexFile)
            for line in reader:
                structures = []
                for structure in line[1:]:
                    # strip the array brackets; split() handles repeated whitespace
                    structure = structure.replace("[", "").replace("]", "").split()
                    structure = [float(eachValue) for eachValue in structure]
                    structures.append(structure)
                distance = self.solveStructureDistance(structures, queryStructures)
                searchResults[line[0]] = distance
        # print "structure", sorted(searchResults.iteritems(), key = lambda item: item[1], reverse = False)
        return searchResults
    def search(self, queryFeatures, rawQueryStructures, limit = 3):
        featureResults = self.searchByColor(queryFeatures)
        structureResults = self.searchByStructure(rawQueryStructures)
        results = {}
        for key, value in featureResults.items():
            results[key] = value + structureResults[key]
        results = sorted(results.items(), key = lambda item: item[1])
        return results[ : limit]

index.py

import color_descriptor
import structure_descriptor
import glob
import argparse
import cv2

searchArgParser = argparse.ArgumentParser()
searchArgParser.add_argument("-d", "--dataset", required = True, help = "Path to the directory that contains the images to be indexed")
searchArgParser.add_argument("-c", "--colorindex", required = True, help = "Path to where the computed color index will be stored")
searchArgParser.add_argument("-s", "--structureindex", required = True, help = "Path to where the computed structure index will be stored")
arguments = vars(searchArgParser.parse_args())

idealBins = (8, 12, 3)
colorDesriptor = color_descriptor.ColorDescriptor(idealBins)

output = open(arguments["colorindex"], "w")

for imagePath in glob.glob(arguments["dataset"] + "/*.jpg"):
    imageName = imagePath[imagePath.rfind("/") + 1 : ]
    image = cv2.imread(imagePath)
    features = colorDesriptor.describe(image)
    # write features to file
    features = [str(feature).replace("\n", "") for feature in features]
    output.write("%s,%s\n" % (imageName, ",".join(features)))
# close index file
output.close()

idealDimension = (16, 16)
structureDescriptor = structure_descriptor.StructureDescriptor(idealDimension)

output = open(arguments["structureindex"], "w")

for imagePath in glob.glob(arguments["dataset"] + "/*.jpg"):
    imageName = imagePath[imagePath.rfind("/") + 1 : ]
    image = cv2.imread(imagePath)
    structures = structureDescriptor.describe(image)
    # write structures to file
    structures = [str(structure).replace("\n", "") for structure in structures]
    output.write("%s,%s\n" % (imageName, ",".join(structures)))
# close index file
output.close()

searchEngine.py

import color_descriptor
import structure_descriptor
import searcher
import argparse
import cv2

searchArgParser = argparse.ArgumentParser()
searchArgParser.add_argument("-c", "--colorindex", required = True, help = "Path to where the computed color index will be stored")
searchArgParser.add_argument("-s", "--structureindex", required = True, help = "Path to where the computed structure index will be stored")
searchArgParser.add_argument("-q", "--query", required = True, help = "Path to the query image")
searchArgParser.add_argument("-r", "--resultpath", required = True, help = "Path to the result path")
searchArguments = vars(searchArgParser.parse_args())

idealBins = (8, 12, 3)
idealDimension = (16, 16)

colorDescriptor = color_descriptor.ColorDescriptor(idealBins)
structureDescriptor = structure_descriptor.StructureDescriptor(idealDimension)
queryImage = cv2.imread(searchArguments["query"])
colorIndexPath = searchArguments["colorindex"]
structureIndexPath = searchArguments["structureindex"]
resultPath = searchArguments["resultpath"]

queryFeatures = colorDescriptor.describe(queryImage)
queryStructures = structureDescriptor.describe(queryImage)

imageSearcher = searcher.Searcher(colorIndexPath, structureIndexPath)
searchResults = imageSearcher.search(queryFeatures, queryStructures)

for imageName, score in searchResults:
    queryResult = cv2.imread(resultPath + "/" + imageName)
    cv2.imshow("Result Score: " + str(int(score)) + " (lower is better)", queryResult)
    cv2.waitKey(0)

cv2.imshow("Query", queryImage)
cv2.waitKey(0)

searchEngineTest.py

import cv2
import glob
import csv
import re
import numpy
import structure_descriptor

idealDimension = (16, 16)
structureDescriptor = structure_descriptor.StructureDescriptor(idealDimension)

testImage = cv2.imread("query/forest.jpg")
rawQueryStructures = structureDescriptor.describe(testImage)

# index
output = open("structureIndex.csv", "w")

for imagePath in glob.glob("dataset" + "/*.jpg"):
    imageName = imagePath[imagePath.rfind("/") + 1 : ]
    image = cv2.imread(imagePath)
    structures = structureDescriptor.describe(image)
    # write structures to file
    structures = [str(structure).replace("\n", "") for structure in structures]
    output.write("%s,%s\n" % (imageName, ",".join(structures)))
# close index file
output.close()

# searcher

def solveDistance(structures, queryStructures, eps = 1e-5):
    # chi-squared distance between two lists of flattened image rows
    distance = 0
    for index in range(len(queryStructures)):
        for subIndex in range(len(queryStructures[index])):
            a = structures[index][subIndex]
            b = queryStructures[index][subIndex]
            distance += (a - b) ** 2 / (a + b + eps)
    return distance / 5e3

queryStructures = []
for substructure in rawQueryStructures:
    structure = []
    for line in substructure:
        for tripleColor in line:
            structure.append(float(tripleColor))
    queryStructures.append(structure)
searchResults = {}
with open("structureIndex.csv") as indexFile:
    reader = csv.reader(indexFile)
    for line in reader:
        structures = []
        for structure in line[1:]:
            # strip the array brackets; split() handles repeated whitespace
            structure = structure.replace("[", "").replace("]", "").split()
            structure = [float(eachValue) for eachValue in structure]
            print(len(structure))
            structures.append(structure)
        distance = solveDistance(structures, queryStructures)
        searchResults[line[0]] = distance
searchResults = sorted(searchResults.items(), key=lambda item: item[1])

print(searchResults)
