人工智能

tensorflow人脸识别(自己的数据集)

2018-10-03  本文已影响80人  yanghedada

可以在云盘下载打包文件包括API,数据
把原有的文件夹下面的object_detection删掉,这里面的(__init____.py)文件百度云盘上传不了,全都没成功,所以在把文件下来之后object_detection/object_detection/下的内容删掉,把object_detection.zip解压到object_detection里面。
链接:https://pan.baidu.com/s/1BkMpGOF1cVjJl2Hpip-Hpg
提取码:9stc

首先先下载图片

网友ACLJW的爬虫代码简单高效


pachong.py
# @File    : pachong.py

import requests
import re
import os
from pypinyin import pinyin, lazy_pinyin
def getHTMLText(url):
    try:
        r = requests.get(url,timeout=30)
        r.raise_for_status()
        r.encoding = r.apparent_encoding
        return r.text
    except:
        print("")

def getPageUrls(text,name):
    re_pageUrl=r'href="(.+)">\s*<img src="(.+)" alt="'+name
    return re.findall(re_pageUrl,text)

def downPictures(text,root,name,L):
    pageUrls=getPageUrls(text,name)
    titles=re.findall(r'alt="'+name+r'(.+)" ',text)
    for i in range(len(pageUrls)):
        pageUrl=pageUrls[i][0]
        path = root + titles[i]+ "//"
        if not os.path.exists(path):
            os.mkdir(path)
        if not os.listdir(path):
            pageText=getHTMLText(pageUrl)
            totalPics=int(re.findall(r'<em>(.+)</em>)',pageText)[0])
            downUrl=re.findall(r'href="(.+?)" class="">下载图片',pageText)[0]
            cnt=1;
            while(cnt<=totalPics):
                L += 1
                picPath=path+"%s.jpg"%str(L)
                r=requests.get(downUrl)
                with open(picPath,'wb') as f:
                    f.write(r.content)
                    f.close()
                print('{} - 第{}张下载已完成\n'.format(titles[i],L))
                cnt+=1
                nextPageUrl=re.findall(r'href="(.+?)">下一张',pageText)[0]
                pageText=getHTMLText(nextPageUrl)
                downUrl=re.findall(r'href="(.+?)" class="">下载图片',pageText)[0]
    return L

def main():
    name=input("请输入你喜欢的明星的名字:")
    nameUrl="http://www.win4000.com/mt/"+''.join(lazy_pinyin(name))+".html"
    L  = 0
    try:
        text=getHTMLText(nameUrl)
        if not re.findall(r'暂无(.+)!',text):
            root = "C:/Users/yanghe/Desktop/data/"+name+"//"
            if not os.path.exists(root):
                os.mkdir(root)
            L = downPictures(text,root,name, L)
            try:
                nextPage=re.findall(r'next" href="(.+)"',text)[0]
                while(nextPage):
                    nextText=getHTMLText(nextPage)
                    L = downPictures(nextText,root,name,L)
                    nextPage=re.findall(r'next" href="(.+)"',nextText)[0]
            except IndexError:
                print("已全部下载完毕")
    except TypeError:
        print("不好意思,没有{}的照片".format(name))
    return

if __name__ == '__main__':
    main()


打上标签

1.打标签用的软件是是labelImg.exe,这款软件操作简单。
labelImg.exe的快捷键


2.这里需要设置类别:

这里有一个open_dir是照片文件打开的目录,还有一个Ctrl+R更改默认xml文件地址。这里是为了生成和Pascal voc2007数据集一样格式的文件。
每打一张图片就保存一下,点下ok就行了,好像是自动保存的,超级简单。

给戚薇打上标签

图片爬到的质量有问题,大部分是侧脸,柳岩的全是戴帽子的照片,哎!!这明星的写真集照片看的眼都花了

Pascal voc2007数据集简单介绍

具体细节查看点这里:数据集:Pascal voc2007数据集分析
labelImg
在Pascal voc2007中(对于2007_000392.jpg)对于这张图有如下的对应xml文件。(2007_000392.jpg图在下面)

#2007_000392.xml
<annotation>
    <folder>VOC2012</folder>                           
    <filename>2007_000392.jpg</filename>                               //文件名
    <source>                                                           //图像来源(不重要)
        <database>The VOC2007 Database</database>
        <annotation>PASCAL VOC2007</annotation>
        <image>flickr</image>
    </source>
    <size>                                             //图像尺寸(长宽以及通道数)                      
        <width>500</width>
        <height>332</height>
        <depth>3</depth>
    </size>
    <segmented>1</segmented>                                   //是否用于分割(在图像物体识别中01无所谓)
    <object>                                                           //检测到的物体
        <name>horse</name>                                         //物体类别
        <pose>Right</pose>                                         //拍摄角度
        <truncated>0</truncated>                                   //是否被截断(0表示完整)
        <difficult>0</difficult>                                   //目标是否难以识别(0表示容易识别)
        <bndbox>                                                   //bounding-box(包含左下角和右上角xy坐标)
            <xmin>100</xmin>
            <ymin>96</ymin>
            <xmax>355</xmax>
            <ymax>324</ymax>
        </bndbox>
    </object>
    <object>                                                           //检测到多个物体
        <name>person</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>198</xmin>
            <ymin>58</ymin>
            <xmax>286</xmax>
            <ymax>197</ymax>
        </bndbox>
    </object>
</annotation>

在2007_000392.jpg这张图里面有个人在骑马。在xml文件里面object有两个(person和horse),包括信息有是否识别困难,截断,角度等,左下角和右上角的坐标。

2007_000392.jpg

我的类别如下:


把xml文件生成csv文件

这里的path 就是保存xml文件的目录,data是你要保存csv文件的目录。

具体查看请点这里:TensorFlow Object Detection API教程——利用自己制作的数据集进行训练预测和测试
这里记得设置一下你的训练集和测试集的大小,这里是0.67

# -*- coding: utf-8 -*-
import os
import glob
import pandas as pd
import xml.etree.ElementTree as ET

def xml_to_csv(path):
    xml_list = []
    # 读取注释文件
    for xml_file in glob.glob(path + '/*.xml'):
        tree = ET.parse(xml_file)
        root = tree.getroot()
        for member in root.findall('object'):
            value = (root.find('filename').text + '.jpg',
                     int(root.find('size')[0].text),
                     int(root.find('size')[1].text),
                     member[0].text,
                     int(member[4][0].text),
                     int(member[4][1].text),
                     int(member[4][2].text),
                     int(member[4][3].text)
                     )
            xml_list.append(value)
    column_name = ['filename', 'width', 'height', 'class', 'xmin', 'ymin', 'xmax', 'ymax']

    # 将所有数据分为样本集和验证集,一般按照3:1的比例
    train_list = xml_list[0: int(len(xml_list) * 0.67)]
    eval_list = xml_list[int(len(xml_list) * 0.67) + 1: ]

    # 保存为CSV格式
    train_df = pd.DataFrame(train_list, columns=column_name)
    eval_df = pd.DataFrame(eval_list, columns=column_name)
    train_df.to_csv('data/train.csv', index=None)
    eval_df.to_csv('data/eval.csv', index=None)


def main():
    path = './xml'
    xml_to_csv(path)
    print('Successfully converted xml to csv.')

main()

把csv生成TFrecord文件

import os
import io
import pandas as pd
import tensorflow as tf

from PIL import Image
from object_detection.utils import dataset_util
from collections import namedtuple, OrderedDict

flags = tf.app.flags
flags.DEFINE_string('csv_input', '', 'Path to the CSV input')
flags.DEFINE_string('output_path', '', 'Path to output TFRecord')
FLAGS = flags.FLAGS


# 将分类名称转成ID号
def class_text_to_int(row_label):
    if row_label == 'damimi':
        return 1
    elif row_label == 'fanbingbing':
        return 2
    elif row_label == 'liuyan':
        return 3
    elif row_label == 'nazha':
        return 4
    elif row_label == 'xiaowei':  
        return 5
    else:
        print('NONE: ' + row_label)
        None


def split(df, group):
    data = namedtuple('data', ['filename', 'object'])
    gb = df.groupby(group)
    return [data(filename, gb.get_group(x)) for filename, x in zip(gb.groups.keys(), gb.groups)]


def create_tf_example(group, path):
    print(os.path.join(path, '{}'.format(group.filename)))
    with tf.gfile.GFile(os.path.join(path, '{}'.format(group.filename)), 'rb') as fid:
        encoded_jpg = fid.read()
    encoded_jpg_io = io.BytesIO(encoded_jpg)
    image = Image.open(encoded_jpg_io)
    width, height = image.size

    filename = (group.filename + '.jpg').encode('utf8')
    image_format = b'jpg'
    xmins = []
    xmaxs = []
    ymins = []
    ymaxs = []
    classes_text = []
    classes = []

    for index, row in group.object.iterrows():
        xmins.append(row['xmin'] / width)
        xmaxs.append(row['xmax'] / width)
        ymins.append(row['ymin'] / height)
        ymaxs.append(row['ymax'] / height)
        classes_text.append(row['class'].encode('utf8'))
        classes.append(class_text_to_int(row['class']))

    tf_example = tf.train.Example(features=tf.train.Features(feature={
        'image/height': dataset_util.int64_feature(height),
        'image/width': dataset_util.int64_feature(width),
        'image/filename': dataset_util.bytes_feature(filename),
        'image/source_id': dataset_util.bytes_feature(filename),
        'image/encoded': dataset_util.bytes_feature(encoded_jpg),
        'image/format': dataset_util.bytes_feature(image_format),
        'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
        'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
        'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
        'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
        'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
        'image/object/class/label': dataset_util.int64_list_feature(classes),
    }))
    return tf_example


def main(csv_input, output_path, imgPath):
    writer = tf.python_io.TFRecordWriter(output_path)
    path = imgPath
    examples = pd.read_csv(csv_input)
    grouped = split(examples, 'filename')
    for group in grouped:
        tf_example = create_tf_example(group, path)
        writer.write(tf_example.SerializeToString())

    writer.close()
    print('Successfully created the TFRecords: {}'.format(output_path))


if __name__ == '__main__':

    imgPath = './xml' #你的图片路径

    # 生成train.record文件
    output_path = 'data/train.record' #你的record保存路径
    csv_input = 'data/train.csv' #你的csv文件路径
    main(csv_input, output_path, imgPath)

    # 生成验证文件 eval.record
    output_path = 'data/eval.record'
    csv_input = 'data/eval.csv'
    main(csv_input, output_path, imgPath)


设置一下图片路径和record的保存路径就行了

修改ssd_inception_v2_coco.config文件

就是修改一下目录,训练次数,record文件,和lable文件等信息:
num_classes: 5
num_steps: 10000
batch_size: 20
fine_tune_checkpoint:"ssd_inception_v2_coco_2018_01_28/model.ckpt"
train_input_reader:的下面{
input_path: "record/train.record"
label_map_path: "record/label_map.pbtxt"}
test_input_reader:的下面{
input_path: "record/val.record"
label_map_path: "record/label_map.pbtxt"}

# SSD with Inception v2 configuration for MSCOCO Dataset.
# Users should configure the fine_tune_checkpoint field in the train config as
# well as the label_map_path and input_path fields in the train_input_reader and
# eval_input_reader. Search for "PATH_TO_BE_CONFIGURED" to find the fields that
# should be configured.

model {
  ssd {
    num_classes: 5
    box_coder {
      faster_rcnn_box_coder {
        y_scale: 10.0
        x_scale: 10.0
        height_scale: 5.0
        width_scale: 5.0
      }
    }
    matcher {
      argmax_matcher {
        matched_threshold: 0.5
        unmatched_threshold: 0.5
        ignore_thresholds: false
        negatives_lower_than_unmatched: true
        force_match_for_each_row: true
      }
    }
    similarity_calculator {
      iou_similarity {
      }
    }
    anchor_generator {
      ssd_anchor_generator {
        num_layers: 6
        min_scale: 0.2
        max_scale: 0.95
        aspect_ratios: 1.0
        aspect_ratios: 2.0
        aspect_ratios: 0.5
        aspect_ratios: 3.0
        aspect_ratios: 0.3333
        reduce_boxes_in_lowest_layer: true
      }
    }
    image_resizer {
      fixed_shape_resizer {
        height: 300
        width: 300
      }
    }
    box_predictor {
      convolutional_box_predictor {
        min_depth: 0
        max_depth: 0
        num_layers_before_predictor: 0
        use_dropout: false
        dropout_keep_probability: 0.8
        kernel_size: 3
        box_code_size: 4
        apply_sigmoid_to_scores: false
        conv_hyperparams {
          activation: RELU_6,
          regularizer {
            l2_regularizer {
              weight: 0.00004
            }
          }
          initializer {
            truncated_normal_initializer {
              stddev: 0.03
              mean: 0.0
            }
          }
        }
      }
    }
    feature_extractor {
      type: 'ssd_inception_v2'
      min_depth: 16
      depth_multiplier: 1.0
      conv_hyperparams {
        activation: RELU_6,
        regularizer {
          l2_regularizer {
            weight: 0.00004
          }
        }
        initializer {
          truncated_normal_initializer {
            stddev: 0.03
            mean: 0.0
          }
        }
        batch_norm {
          train: true,
          scale: true,
          center: true,
          decay: 0.9997,
          epsilon: 0.001,
        }
      }
    }
    loss {
      classification_loss {
        weighted_sigmoid {
          anchorwise_output: true
        }
      }
      localization_loss {
        weighted_smooth_l1 {
          anchorwise_output: true
        }
      }
      hard_example_miner {
        num_hard_examples: 3000
        iou_threshold: 0.99
        loss_type: CLASSIFICATION
        max_negatives_per_positive: 3
        min_negatives_per_image: 0
      }
      classification_weight: 1.0
      localization_weight: 1.0
    }
    normalize_loss_by_num_matches: true
    post_processing {
      batch_non_max_suppression {
        score_threshold: 1e-8
        iou_threshold: 0.6
        max_detections_per_class: 100
        max_total_detections: 100
      }
      score_converter: SIGMOID
    }
  }
}

train_config: {
  batch_size: 20
  optimizer {
    rms_prop_optimizer: {
      learning_rate: {
        exponential_decay_learning_rate {
          initial_learning_rate: 0.004
          decay_steps: 10000
          decay_factor: 0.95
        }
      }
      momentum_optimizer_value: 0.9
      decay: 0.9
      epsilon: 1.0
    }
  }
  fine_tune_checkpoint: "ssd_inception_v2_coco_2018_01_28/model.ckpt"
  from_detection_checkpoint: true
  # Note: The below line limits the training process to 200K steps, which we
  # empirically found to be sufficient enough to train the pets dataset. This
  # effectively bypasses the learning rate schedule (the learning rate will
  # never decay). Remove the below line to train indefinitely.
  num_steps: 1000
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
  data_augmentation_options {
    ssd_random_crop {
    }
  }
}

train_input_reader: {
  tf_record_input_reader {
    input_path: "record/train.record"
  }
  label_map_path: "record/label_map.pbtxt"
}

eval_config: {
  num_examples: 4952
  # Note: The below line limits the evaluation process to 10 evaluations.
  # Remove the below line to evaluate indefinitely.
  max_evals: 10
}

eval_input_reader: {
  tf_record_input_reader {
    input_path: "record/val.record"
  }
  label_map_path: "record/label_map.pbtxt"
  shuffle: false
  num_readers: 1
  num_epochs: 1
}

训练

这是我的目录图



在cmd下执行,cpu训练一个晚上才666次,这个精度不咋滴。

python train.py \
--logtostderr  \
--train_dir=train \
--pipeline_config_path=ssd_inception_v2_coco.config

生成pb文件

训练666次了。早上起来直接Ctrl+c关掉,如果想继续在666上继续训练,直接执行上面的就可以了,它会读取train 下面的训练文件的。

python export_inference_graph.py  
--pipeline_config_path ssd_inception_v2_coco.config
 --trained_checkpoint_prefix "pb/model.ckpt-666" 
--output_directory pb

生成pb的时候出错了

ValueError: Protocol message RewriterConfig has no "layout_optimizer" field

在/object_detection/exporter.py”文件,将第72行的layout_optimizer与相互更改optimize_tensor_layout就解决问题啦。

测试

test_image.py

import matplotlib.pyplot as plt
import numpy as np
import os 
import tensorflow as tf
from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as vis_util
from PIL import Image


def test():
    #重置图
    tf.reset_default_graph()
    '''
    载入模型以及数据集样本标签,加载待测试的图片文件
    '''
    #指定要使用的模型的路径  包含图结构,以及参数
    PATH_TO_CKPT = 'pb/frozen_inference_graph.pb'
    
    #测试图片所在的路径
    PATH_TO_TEST_IMAGES_DIR = './test_images'
    
    TEST_IMAGE_PATHS = [os.path.join(PATH_TO_TEST_IMAGES_DIR,'{}.jpg'.format(i)) for i in range(1,11) ]
    
    #数据集对应的label pascal_label_map.pbtxt文件保存了index和类别名之间的映射
    PATH_TO_LABELS = "record/label_map.pbtxt"
    
    NUM_CLASSES = 5
     
    #重新定义一个图
    output_graph_def = tf.GraphDef()
    
    with tf.gfile.GFile(PATH_TO_CKPT,'rb') as fid:
        #将*.pb文件读入serialized_graph
        serialized_graph = fid.read()
        #将serialized_graph的内容恢复到图中
        output_graph_def.ParseFromString(serialized_graph)
        #print(output_graph_def)
        #将output_graph_def导入当前默认图中(加载模型)
        tf.import_graph_def(output_graph_def,name='')
        
    print('模型加载完成')    
    
    #载入coco数据集标签文件
    label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
    categories = label_map_util.convert_label_map_to_categories(label_map,max_num_classes = NUM_CLASSES,use_display_name = True)
    category_index = label_map_util.create_category_index(categories)
    
    
    '''
    定义session
    '''
    def load_image_into_numpy_array(image):
        '''
        将图片转换为ndarray数组的形式
        '''
        im_width,im_height = image.size
        return np.array(image.getdata()).reshape((im_height,im_width,3)).astype(np.uint0)
    
    #设置输出图片的大小
    IMAGE_SIZE = (12,8)
    
    #使用默认图,此时已经加载了模型
    detection_graph = tf.get_default_graph()
    
    with tf.Session(graph=detection_graph) as sess:
        for image_path in TEST_IMAGE_PATHS:
            image = Image.open(image_path)
            #将图片转换为numpy格式
            image_np = load_image_into_numpy_array(image)
            
            '''
            定义节点,运行并可视化
            '''
            #将图片扩展一维,最后进入神经网络的图片格式应该是[1,?,?,3]
            image_np_expanded = np.expand_dims(image_np,axis = 0)
            
            '''
            获取模型中的tensor
            '''
            image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
                        
            #boxes用来显示识别结果
            boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
            
            #Echo score代表识别出的物体与标签匹配的相似程度,在类型标签后面
            scores = detection_graph.get_tensor_by_name('detection_scores:0')
            classes = detection_graph.get_tensor_by_name('detection_classes:0')
            num_detections = detection_graph.get_tensor_by_name('num_detections:0')
            
            #开始检查
            boxes,scores,classes,num_detections = sess.run([boxes,scores,classes,num_detections],
                                                           feed_dict={image_tensor:image_np_expanded})
            
            #可视化结果
            vis_util.visualize_boxes_and_labels_on_image_array(
                    image_np,
                    np.squeeze(boxes),
                    np.squeeze(classes).astype(np.int32),
                    np.squeeze(scores),
                    category_index,
                    use_normalized_coordinates=True,
                    line_thickness=8)
            plt.figure(figsize=IMAGE_SIZE)
            print(type(image_np))
            print(image_np.shape)
            image_np = np.array(image_np,dtype=np.uint8)            
            plt.imshow(image_np)
            plt.show()
    
    
                
if __name__ == '__main__':
    test()

如下执行结果:


范冰冰



这是大咪咪




其他人的识别都不是很高,可能和训练图片有关
上一篇下一篇

猜你喜欢

热点阅读