TensorFlow basics
I. Basic tutorial
The official docs recommend building models with the higher-level APIs, but understanding TensorFlow's low-level API is still worthwhile: it shows how things actually work under the hood.
1. Basic building blocks in TensorFlow
Tensor
A tensor is TensorFlow's core unit. It represents raw data and is backed internally by an n-dimensional numpy.ndarray. rank is the number of dimensions and shape describes the extent of each dimension: rank 0 is a scalar, rank 1 a vector, rank 2 a matrix, and this extends to higher ranks.
3. # a rank 0 tensor; a scalar with shape []
[1., 2., 3.] # a rank 1 tensor; a vector with shape [3]
[[1., 2., 3.], [4., 5., 6.]] # a rank 2 tensor; a matrix with shape [2, 3]
[[[1., 2., 3.]], [[7., 8., 9.]]] # a rank 3 tensor with shape [2, 1, 3]
my_image = tf.zeros([10, 299, 299, 3]) # batch x height x width x color
tf.rank() returns a tensor's rank:
r = tf.rank(my_image)
# After the graph runs, r will hold the value 4.
tf.shape() returns a tensor's shape, and tf.reshape(tensor, new_shape) changes it:
rank_three_tensor = tf.ones([3, 4, 5])
matrix = tf.reshape(rank_three_tensor, [6, 10])  # Reshape existing content into
                                                 # a 6x10 matrix
tf.cast(tensor, dtype) converts a tensor's data type:
# Cast a constant integer tensor into floating point.
float_tensor = tf.cast(tf.constant([1, 2, 3]), dtype=tf.float32)
tensor.eval() returns a tensor's value (eval requires an active default session):
constant = tf.constant([1, 2, 3])
tensor = constant * constant
with tf.Session():
    print(tensor.eval())  # returns a numpy.ndarray
A TF program can be viewed as building a graph that relates input tensors to output tensors, and then running that graph to obtain the desired results.
What a Tensor consists of
Data type: float32, int32, string, etc. A tensor's data type is always known, and all of its elements share that type.
Shape: may be only partially known; some shapes are fully determined only at computation time, e.g. shape = [3, ?, 3].
Some special tensors:
- tf.Variable
- tf.constant
- tf.placeholder
- tf.SparseTensor
Except for tf.Variable, all of these are immutable: within a single execution of the graph each tensor has one fixed value, though that value may differ between executions. For example, if a tf.placeholder is fed a batch of random numbers, the fed value can differ on every run, but it does not change during a run.
Note that feeding is not limited to placeholders: a feed can supply a value for any (feedable) tensor in the graph.
tf.Variable
A variable can be given a name via tf.get_variable, and later reused through that name:
my_variable = tf.get_variable("my_variable", [1, 2, 3])  # [1, 2, 3] is the shape
my_int_variable = tf.get_variable("my_int_variable", [1, 2, 3], dtype=tf.int32,
                                  initializer=tf.zeros_initializer)
other_variable = tf.get_variable("other_variable", dtype=tf.int32,
                                 initializer=tf.constant([23, 42]))
When creating a variable you can set its trainable argument to control whether it will be updated during training.
With the low-level API, a variable must be explicitly initialized before use. You can initialize all variables at once, or initialize them individually by hand (which matters when initializers depend on each other):
session.run(tf.global_variables_initializer())
# Now all variables are initialized.
session.run(my_variable.initializer)  # initialize my_variable by hand
When one variable's initializer depends on another, use v.initialized_value():
v = tf.get_variable("v", shape=(), initializer=tf.zeros_initializer())
w = tf.get_variable("w", initializer=v.initialized_value() + 1)
Reading a variable's value during execution:
v = tf.get_variable("v", shape=(), initializer=tf.zeros_initializer())
assignment = v.assign_add(1)
with tf.control_dependencies([assignment]):
    w = v.read_value()  # w is guaranteed to reflect v's value after the
                        # assign_add operation.
Because tf.get_variable can either create or reuse a variable, calling it twice with the same name in the same context raises an error. Specifying a variable scope avoids this problem.
Here, because the calls are in different scopes, both create new variables:
def my_image_filter(input_images):
    with tf.variable_scope("conv1"):
        # Variables created here will be named "conv1/weights", "conv1/biases".
        relu1 = conv_relu(input_images, [5, 5, 32, 32], [32])
    with tf.variable_scope("conv2"):
        # Variables created here will be named "conv2/weights", "conv2/biases".
        return conv_relu(relu1, [5, 5, 32, 32], [32])
Alternatively, set the scope's reuse attribute to True; within that scope variables are then reused, removing the create-vs-reuse ambiguity:
with tf.variable_scope("model"):
    output1 = my_image_filter(input1)
with tf.variable_scope("model", reuse=True):
    output2 = my_image_filter(input2)
Or:
with tf.variable_scope("model") as scope:
    output1 = my_image_filter(input1)
    scope.reuse_variables()
    output2 = my_image_filter(input2)
To avoid mistyping the scope name, capture the scope in a variable:
with tf.variable_scope("model") as scope:
    output1 = my_image_filter(input1)
with tf.variable_scope(scope, reuse=True):
    output2 = my_image_filter(input2)
Computation graphs
With the low-level API, TF programming splits into two parts: building the computation graph (tf.Graph) and executing it (tf.Session), much like RDDs in Spark. Until it is executed, the graph is just a plain graph structure; no tensors flow through it.
Graph structure
- Edges: tensors
- Nodes: operations (computations or definitions)
For example:
a = tf.constant(3.0, dtype=tf.float32)
b = tf.constant(4.0) # also tf.float32 implicitly
total = a + b
print(a)
print(b)
print(total)
Printing without running the graph shows no values:
Tensor("Const:0", shape=(), dtype=float32)
Tensor("Const_1:0", shape=(), dtype=float32)
Tensor("add:0", shape=(), dtype=float32)
Every operation has a unique name. This name is unrelated to the Python identifier it is bound to; for example, a is defined in the Python program, but inside the graph its name is Const.
Collections
TensorFlow provides a set of collections, each with its own purpose. For example, a tf.Variable is added on creation to the "global variables" and "trainable variables" collections: the former holds every variable in the graph, the latter those that can be trained.
Operations can also be named explicitly, which helps with debugging and inspection; printing a Tensor shows the name of the operation that produced it:
c_0 = tf.constant(0, name="c") # => operation named "c"
# Already-used names will be "uniquified".
c_1 = tf.constant(2, name="c") # => operation named "c_1"
# Name scopes add a prefix to all operations created in the same context.
with tf.name_scope("outer"):
    c_2 = tf.constant(2, name="c")  # => operation named "outer/c"
    # Name scopes nest like paths in a hierarchical file system.
    with tf.name_scope("inner"):
        c_3 = tf.constant(3, name="c")  # => operation named "outer/inner/c"
    # Exiting a name scope context will return to the previous prefix.
    c_4 = tf.constant(4, name="c")  # => operation named "outer/c_1"
    # Already-used name scopes will be "uniquified".
    with tf.name_scope("inner"):
        c_5 = tf.constant(5, name="c")  # => operation named "outer/inner_1/c"
Session
tf.Session executes the computation graph. Depending on what is run, it returns tensor values or None (e.g. for a train op). Using with closes the session automatically when the block exits; otherwise call tf.Session.close to release its resources.
# Create a default in-process session.
with tf.Session() as sess:
    # ...
    pass

# Create a remote session.
with tf.Session("grpc://example.org:2222"):
    # ...
    pass
tf.Session(...) accepts arguments specifying the target machine, the graph to execute, and other configuration options:
x = tf.constant([[37.0, -23.0], [1.0, 4.0]])
w = tf.Variable(tf.random_uniform([2, 2]))
y = tf.matmul(x, w)
output = tf.nn.softmax(y)
init_op = w.initializer
with tf.Session() as sess:
    # Run the initializer on `w`.
    sess.run(init_op)

    # Evaluate `output`. `sess.run(output)` will return a NumPy array containing
    # the result of the computation.
    print(sess.run(output))

    # Evaluate `y` and `output`. Note that `y` will only be computed once, and its
    # result used both to return `y_val` and as an input to the `tf.nn.softmax()`
    # op. Both `y_val` and `output_val` will be NumPy arrays.
    y_val, output_val = sess.run([y, output])
Values can be fed to placeholders at run time:
# Define a placeholder that expects a vector of three floating-point values,
# and a computation that depends on it.
x = tf.placeholder(tf.float32, shape=[3])
y = tf.square(x)
with tf.Session() as sess:
    # Feeding a value changes the result that is returned when you evaluate `y`.
    print(sess.run(y, {x: [1.0, 2.0, 3.0]}))  # => "[1.0, 4.0, 9.0]"
    print(sess.run(y, {x: [0.0, 0.0, 5.0]}))  # => "[0.0, 0.0, 25.0]"

    # Raises `tf.errors.InvalidArgumentError`, because you must feed a value for
    # a `tf.placeholder()` when evaluating a tensor that depends on it.
    sess.run(y)

    # Raises `ValueError`, because the shape of `37.0` does not match the shape
    # of placeholder `x`.
    sess.run(y, {x: 37.0})
sess.run can also be configured, for example to record the execution trace and per-operation timings:
y = tf.matmul([[37.0, -23.0], [1.0, 4.0]], tf.random_uniform([2, 2]))
with tf.Session() as sess:
    # Define options for the `sess.run()` call.
    options = tf.RunOptions()
    options.output_partition_graphs = True
    options.trace_level = tf.RunOptions.FULL_TRACE

    # Define a container for the returned metadata.
    metadata = tf.RunMetadata()
    sess.run(y, options=options, run_metadata=metadata)

    # Print the subgraphs that executed on each device.
    print(metadata.partition_graphs)

    # Print the timings of each operation that executed.
    print(metadata.step_stats)