2. Tensor Overview

2022-07-26 hdszzwy

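Tensors are multi-dimensional arrays whose elements all share one type (numbers or strings). If you are familiar with NumPy, tensors are similar to np.arrays. Like Python numbers and strings, all tensors are immutable: you can never change the contents of a tensor, only create a new one.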
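To make the immutability point concrete, here is a minimal sketch. It assumes eager execution (the TensorFlow 2 default), and using tf.tensor_scatter_nd_update to build the modified copy is my choice for illustration, not something the text above prescribes.

```python
import tensorflow as tf

original = tf.constant([1, 2, 3])

# In-place item assignment is rejected because tensors are immutable.
try:
    original[0] = 10
    assignment_failed = False
except TypeError as e:
    assignment_failed = True
    print(f"{type(e).__name__}: {e}")

# To "change" a value, build a new tensor; the original stays untouched.
updated = tf.tensor_scatter_nd_update(original, indices=[[0]], updates=[10])
print(original.numpy())
print(updated.numpy())
```

If you need state that really changes in place, such as model weights, tf.Variable is the usual tool rather than a plain tensor.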

Tensor Basics

Let's start by creating a few simple tensors.

A scalar, or rank-0 tensor, contains a single value and has no axes.

import numpy as np
import tensorflow as tf

rank_0_tensor = tf.constant(4)
print(rank_0_tensor)

Output:

tf.Tensor(4, shape=(), dtype=int32)

A vector, or rank-1 tensor, is like a list of values. A vector has exactly one axis.

rank_1_tensor = tf.constant([2.0, 3.0, 4.0])
print(rank_1_tensor)

Output:

tf.Tensor([2. 3. 4.], shape=(3,), dtype=float32)

A matrix, or rank-2 tensor, has two axes.

rank_2_tensor = tf.constant([[1, 2], [3, 4], [5, 6]], dtype=tf.float16)
print(rank_2_tensor)

Output:

tf.Tensor(
[[1. 2.]
 [3. 4.]
 [5. 6.]], shape=(3, 2), dtype=float16)

Tensors can have more axes. Here is a tensor with three axes:

rank_3_tensor = tf.constant([
  [[0, 1, 2, 3, 4],
   [5, 6, 7, 8, 9]],
  [[10, 11, 12, 13, 14],
   [15, 16, 17, 18, 19]],
  [[20, 21, 22, 23, 24],
   [25, 26, 27, 28, 29]]])

print(rank_3_tensor)

Output:

tf.Tensor(
[[[ 0  1  2  3  4]
  [ 5  6  7  8  9]]

 [[10 11 12 13 14]
  [15 16 17 18 19]]

 [[20 21 22 23 24]
  [25 26 27 28 29]]], shape=(3, 2, 5), dtype=int32)

You can convert a tensor to a NumPy array with either np.array or tensor.numpy():

print(np.array(rank_2_tensor))
print(rank_2_tensor.numpy())

Both print:

[[1. 2.]
 [3. 4.]
 [5. 6.]]

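Most of the time tensors hold integers or floating-point numbers, but they can also hold other types, such as complex numbers and strings.
A tf.Tensor must be rectangular, like a matrix: along each axis, every element has the same size. Special kinds of tensors handle the cases where that does not hold: ragged tensors and sparse tensors.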
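The element types and special tensor kinds mentioned above can be tried out in a few lines. This is only a sketch: the values are made up for illustration, and the constructors used here (tf.ragged.constant, tf.sparse.SparseTensor, tf.sparse.to_dense) are the ones I would reach for, not ones named in the text.

```python
import tensorflow as tf

# Strings and complex numbers are valid element types too.
string_tensor = tf.constant("Gray wolf")
complex_tensor = tf.constant(1 + 2j)
print(string_tensor.dtype, complex_tensor.dtype)

# A ragged tensor allows rows of different lengths.
ragged = tf.ragged.constant([[0, 1, 2, 3], [4, 5], [6, 7, 8], [9]])
print(ragged.shape)  # the ragged axis is reported with unknown length

# A sparse tensor stores only the non-zero values and their coordinates.
sparse = tf.sparse.SparseTensor(indices=[[0, 0], [1, 2]],
                                values=[1, 2],
                                dense_shape=[3, 4])
dense = tf.sparse.to_dense(sparse)
print(dense.numpy())
```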
Tensors support basic math, including element-wise addition, element-wise multiplication, and matrix multiplication.

a = tf.constant([[1, 2],
                 [3, 4]])
b = tf.ones([2, 2], dtype=tf.int32)

print(a)
print(b)
print(tf.add(a, b))
print(tf.multiply(a, b))
print(tf.matmul(a, b))

Output:

tf.Tensor(
[[1 2]
 [3 4]], shape=(2, 2), dtype=int32)
tf.Tensor(
[[1 1]
 [1 1]], shape=(2, 2), dtype=int32)
tf.Tensor(
[[2 3]
 [4 5]], shape=(2, 2), dtype=int32)
tf.Tensor(
[[1 2]
 [3 4]], shape=(2, 2), dtype=int32) 
tf.Tensor(
[[3 3]
 [7 7]], shape=(2, 2), dtype=int32)

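print(a + b)  # element-wise addition, same as tf.add(a, b)
print(a * b)  # element-wise multiplication, same as tf.multiply(a, b)
print(a @ b)  # matrix multiplication, same as tf.matmul(a, b)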

Output:

tf.Tensor(
[[2 3]
 [4 5]], shape=(2, 2), dtype=int32) 
tf.Tensor(
[[1 2]
 [3 4]], shape=(2, 2), dtype=int32) 
tf.Tensor(
[[3 3]
 [7 7]], shape=(2, 2), dtype=int32)

TensorFlow supports many other operations as well.

c = tf.constant([[4.0, 5.0], [10.0, 1.0]])

print(tf.reduce_max(c))
print(tf.math.argmax(c))
print(tf.nn.softmax(c))

Output:

tf.Tensor(10.0, shape=(), dtype=float32)
tf.Tensor([1 0], shape=(2,), dtype=int64)
tf.Tensor(
[[2.6894143e-01 7.3105860e-01]
 [9.9987662e-01 1.2339458e-04]], shape=(2, 2), dtype=float32)

The shape Attribute of a Tensor

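Here is some vocabulary used when talking about tensors:

Shape: the length (number of elements) of each axis of a tensor.
Rank: the number of axes of a tensor. A scalar has rank 0, a vector has rank 1, and a matrix has rank 2.
Axis or Dimension: one particular dimension of a tensor.
Size: the total number of elements in a tensor. For a regular (rectangular) tensor this is the product of the lengths of all axes.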

rank_4_tensor = tf.zeros([3, 2, 4, 5])

print("Type of every element:", rank_4_tensor.dtype)
print("Number of axes:", rank_4_tensor.ndim)
print("Shape of tensor:", rank_4_tensor.shape)
print("Elements along axis 0 of tensor:", rank_4_tensor.shape[0])
print("Elements along the last axis of tensor:", rank_4_tensor.shape[-1])
print("Total number of elements (3*2*4*5): ", tf.size(rank_4_tensor).numpy())

Output:

Type of every element: <dtype: 'float32'>
Number of axes: 4
Shape of tensor: (3, 2, 4, 5)
Elements along axis 0 of tensor: 3
Elements along the last axis of tensor: 5
Total number of elements (3*2*4*5):  120

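Axes are often referred to by their indices, so you should always keep track of what each axis actually means. Axes are often ordered from global to local: the batch axis first, followed by the spatial dimensions (height, width), with the features for each location last.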
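As a small illustration of that axis ordering, consider a hypothetical batch of images. The sizes below (8 images, 32x32 pixels, 3 color channels) are invented for the example and do not come from the text.

```python
import tensorflow as tf

# Assumed layout: [batch, height, width, features]
images = tf.zeros([8, 32, 32, 3])

# Indexing the leading (most global) axis picks one whole image.
first_image = images[0]

# Indexing batch, height and width leaves the features of one pixel.
pixel_features = images[0, 10, 20]

print(images.shape)
print(first_image.shape)
print(pixel_features.shape)
```

Each extra index peels off one axis from the left, moving from the whole batch down to a single location.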

Indexing

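Single-axis indexing
TensorFlow follows the standard Python indexing rules:
-- indexes start at 0;
-- negative indexes count backwards from the end;
-- colons are used for slices (start:stop:step).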
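The three rules above can be checked on a small vector. The tensor values below are arbitrary sample data chosen for the sketch.

```python
import tensorflow as tf

rank_1_tensor = tf.constant([0, 1, 1, 2, 3, 5, 8, 13, 21, 34])

first = rank_1_tensor[0]              # indexes start at 0
last = rank_1_tensor[-1]              # negative indexes count from the end
before_4 = rank_1_tensor[:4]          # slice: everything before index 4
from_4 = rank_1_tensor[4:]            # slice: index 4 to the end
every_other = rank_1_tensor[::2]      # slice with a step of 2
stepped_window = rank_1_tensor[2:7:2]  # start:stop:step combined

print(first.numpy(), last.numpy())
print(before_4.numpy())
print(from_4.numpy())
print(every_other.numpy())
print(stepped_window.numpy())
```

Indexing with a single integer removes the axis (the result is a scalar), while a slice keeps it.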
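Multi-axis indexing
Higher-rank tensors are indexed by passing multiple indices. The same rules as in the single-axis case apply to each axis independently.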
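A short sketch of multi-axis indexing follows. The tensors mirror the ones used earlier in this article, but they are redefined here (with default integer dtype) so the snippet runs on its own.

```python
import tensorflow as tf

rank_2_tensor = tf.constant([[1, 2], [3, 4], [5, 6]])

single_value = rank_2_tensor[1, 1]   # one integer per axis gives a scalar
second_row = rank_2_tensor[1, :]     # integer on axis 0, full slice on axis 1
second_column = rank_2_tensor[:, 1]  # full slice on axis 0, integer on axis 1
skip_first_row = rank_2_tensor[1:, :]

# Same data as the article's rank_3_tensor, shape (3, 2, 5).
rank_3_tensor = tf.reshape(tf.range(30), [3, 2, 5])
last_feature = rank_3_tensor[:, :, 4]  # last element along the final axis

print(single_value.numpy())
print(second_row.numpy())
print(second_column.numpy())
print(skip_first_row.numpy())
print(last_feature.numpy())
```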

Manipulating Shapes

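Reshaping a tensor is very useful. You can reshape a tensor into a different shape, and because the underlying data does not need to be moved, tf.reshape is fast and cheap.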

x = tf.constant([[1], [2], [3]])
reshaped = tf.reshape(x, [1, 3])
print(x.shape)
print(reshaped.shape)

Output:

(3, 1)
(1, 3)

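When you reshape, the data keeps its layout in memory. A new tensor with the requested shape is created, and it points to the same data. TensorFlow uses C-style row-major memory ordering, which means that incrementing the rightmost index corresponds to a single step in memory. Flattening a three-axis tensor lets you see the order in which it is laid out in memory.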

rank_3_tensor = [[[0, 1, 2, 3, 4],
                  [5, 6, 7, 8, 9]],
                 [[10, 11, 12, 13, 14],
                  [15, 16, 17, 18, 19]],
                 [[20, 21, 22, 23, 24],
                  [25, 26, 27, 28, 29]]]
# Passing -1 as the shape flattens the tensor; -1 means "whatever size fits".
print(tf.reshape(rank_3_tensor, [-1]))

Output:

tf.Tensor(
[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
 24 25 26 27 28 29], shape=(30,), dtype=int32)
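The row-major rule can also be written down as arithmetic. The sketch below is plain Python, not TensorFlow internals: flat_offset is a hypothetical helper introduced here to show how a multi-dimensional index maps to a position in the flattened sequence printed above.

```python
from itertools import product


def flat_offset(index, shape):
    """Row-major (C-style) position of a multi-dimensional index."""
    offset = 0
    for i, dim in zip(index, shape):
        # Each axis to the left strides over everything to its right.
        offset = offset * dim + i
    return offset


shape = (3, 2, 5)
nested = [[[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]],
          [[10, 11, 12, 13, 14], [15, 16, 17, 18, 19]],
          [[20, 21, 22, 23, 24], [25, 26, 27, 28, 29]]]
flat = list(range(30))  # the flattened order shown above

# Every element of the nested data sits at its row-major offset.
layout_matches = all(
    nested[i][j][k] == flat[flat_offset((i, j, k), shape)]
    for i, j, k in product(range(3), range(2), range(5))
)

# Moving one step along the rightmost axis moves one slot in memory order.
rightmost_step = flat_offset((0, 0, 1), shape) - flat_offset((0, 0, 0), shape)

print(layout_matches)
print(flat_offset((1, 1, 2), shape))
print(rightmost_step)
```

For the shape (3, 2, 5) this works out to offset = i*10 + j*5 + k, which is why the index (1, 1, 2) lands on the value 17.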

The typical use of tf.reshape is to combine or split adjacent axes. For this 3x2x5 tensor, reshaping to (3x2)x5 or to 3x(2x5) looks like this:

print(tf.reshape(rank_3_tensor, [3*2, 5]))
print(tf.reshape(rank_3_tensor, [3,-1]))

Output:

tf.Tensor(
[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]
 [20 21 22 23 24]
 [25 26 27 28 29]], shape=(6, 5), dtype=int32)
tf.Tensor(
[[ 0  1  2  3  4  5  6  7  8  9]
 [10 11 12 13 14 15 16 17 18 19]
 [20 21 22 23 24 25 26 27 28 29]], shape=(3, 10), dtype=int32)

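tf.reshape can produce any new shape as long as the total number of elements stays the same. If the total would change, TensorFlow raises an error.
The code below does run, but it does nothing useful, and the results are not what you might expect.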

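# You can't reorder axes with reshape. A real axis swap of rank_3_tensor
# (for example with tf.transpose) would give:
# [[[0, 1, 2, 3, 4],
#   [10, 11, 12, 13, 14],
#   [20, 21, 22, 23, 24]],
#  [[5, 6, 7, 8, 9],
#   [15, 16, 17, 18, 19],
#   [25, 26, 27, 28, 29]]]
# This reshape only regroups the row-major data, so it prints something else.
print(tf.reshape(rank_3_tensor, [2, 3, 5]))
# This reshape runs too, but the resulting rows have no useful meaning.
print(tf.reshape(rank_3_tensor, [5, 6]))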

Output:

tf.Tensor(
[[[ 0  1  2  3  4]
  [ 5  6  7  8  9]
  [10 11 12 13 14]]

 [[15 16 17 18 19]
  [20 21 22 23 24]
  [25 26 27 28 29]]], shape=(2, 3, 5), dtype=int32)
tf.Tensor(
[[ 0  1  2  3  4  5]
 [ 6  7  8  9 10 11]
 [12 13 14 15 16 17]
 [18 19 20 21 22 23]
 [24 25 26 27 28 29]], shape=(5, 6), dtype=int32)

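As the output shows, the attempted axis swap runs without error but does not produce the expected result. The meaningless reshape does not raise an error either.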
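For comparison, here is a sketch of how the axis swap could actually be done. It uses tf.transpose, which the text above does not mention; treating it as the intended tool is my assumption. The tensor is rebuilt with tf.range so the snippet is self-contained, and it holds the same values as rank_3_tensor.

```python
import tensorflow as tf

# Same values and shape (3, 2, 5) as the article's rank_3_tensor.
rank_3_tensor = tf.reshape(tf.range(30), [3, 2, 5])

# perm=[1, 0, 2] swaps the first two axes and keeps the last one in place.
swapped = tf.transpose(rank_3_tensor, perm=[1, 0, 2])

# reshape to the same target shape only regroups the row-major data.
reshaped = tf.reshape(rank_3_tensor, [2, 3, 5])

print(swapped.shape)
print(swapped[0].numpy())
print(reshaped[0].numpy())
```

Both results have shape (2, 3, 5), but only the transposed one satisfies swapped[j, i, k] == rank_3_tensor[i, j, k].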
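The code below, by contrast, raises an error, because the requested shape cannot hold the tensor's total number of elements.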

try:
    tf.reshape(rank_3_tensor, [7, -1])
except Exception as e:
    print(f"{type(e).__name__}: {e}")

The error is:

InvalidArgumentError: Input to reshape is a tensor with 30 values, but the requested shape requires a multiple of 7 [Op:Reshape]

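You may also come across shapes that are not fully specified: either one dimension of the shape is None (the length of that axis is unknown), or the whole shape is None (the rank of the tensor is unknown).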
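One place where such shapes show up is when declaring what a traced function accepts. The sketch below is an illustration under that assumption: the text does not say where the None dimensions come from, and row_sums is a hypothetical function made up for the example.

```python
import tensorflow as tf

# A shape with one unknown axis length, and a shape with unknown rank.
partially_known = tf.TensorShape([None, 3])
unknown_rank = tf.TensorShape(None)
print(partially_known.rank, partially_known.is_fully_defined())
print(unknown_rank.rank)


# The signature accepts any number of rows, as long as each row has 3 values.
@tf.function(input_signature=[tf.TensorSpec(shape=[None, 3], dtype=tf.float32)])
def row_sums(x):
    # This print runs while tracing, where the first dimension is unknown.
    print("Static shape while tracing:", x.shape)
    return tf.reduce_sum(x, axis=1)


two_rows = row_sums(tf.ones([2, 3]))
five_rows = row_sums(tf.ones([5, 3]))
print(two_rows.numpy())
print(five_rows.shape)
```

The same traced function handles both inputs because the first axis was left unspecified.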

DTypes

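You can inspect a Tensor's data type through the Tensor.dtype property.
When creating a tf.Tensor you can optionally specify the data type. If you don't, TensorFlow picks a data type that can represent your data: Python integers are converted to tf.int32 and Python floating-point numbers to tf.float32. For other inputs, TensorFlow follows the same rules NumPy uses when converting to arrays.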
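These defaults are easy to check. A minimal sketch with arbitrary values; the NumPy case assumes NumPy's usual float64 default for Python floats.

```python
import numpy as np
import tensorflow as tf

from_int = tf.constant(7)                     # Python int
from_float = tf.constant(7.0)                 # Python float
from_numpy = tf.constant(np.array([1.0, 2.0]))  # keeps NumPy's dtype
explicit = tf.constant(7, dtype=tf.float64)   # dtype given explicitly

print(from_int.dtype)
print(from_float.dtype)
print(from_numpy.dtype)
print(explicit.dtype)
```

Note that a NumPy float array arrives as float64, while a bare Python float becomes float32, so mixing the two in one expression may require an explicit tf.cast.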
You can also cast a Tensor from one type to another:

the_f64_tensor = tf.constant([2.2, 3.3, 4.4], dtype=tf.float64)
the_f16_tensor = tf.cast(the_f64_tensor, dtype=tf.float16)
# Now, cast to an uint8 and lose the decimal precision
the_u8_tensor = tf.cast(the_f16_tensor, dtype=tf.uint8)
print(the_u8_tensor)

Output:

tf.Tensor([2 3 4], shape=(3,), dtype=uint8)
