BigQuery中拆解字符串并平展为多行

2021-02-25  本文已影响0人  Houtasu

我在介绍bigquery的半结构化数据中介绍过我们总表的结构,
但是有些命令的参数个数是不固定的,比如玩家在一次战斗中击杀的敌人种类,
有可能有一种,也有可能有两种三种...
这种在表中是直接用#号拼接在一起的,如下所示:


当需要统计海盗的数量时,这种拼接的就没法统计。
所以需要先把这个拼接的字符串变成两行数据,
这里要用到bigquery的平展数组。可以参考这篇文档
WITH sequences AS
  (SELECT 1 AS id, [0, 1, 1, 2, 3, 5] AS some_numbers
   UNION ALL SELECT 2 AS id, [2, 4, 8, 16, 32] AS some_numbers
   UNION ALL SELECT 3 AS id, [5, 10] AS some_numbers)
SELECT id, flattened_numbers
FROM sequences, sequences.some_numbers AS flattened_numbers;

+------+-------------------+
| id   | flattened_numbers |
+------+-------------------+
|    1 |                 0 |
|    1 |                 1 |
|    1 |                 1 |
|    1 |                 2 |
|    1 |                 3 |
|    1 |                 5 |
|    2 |                 2 |
|    2 |                 4 |
|    2 |                 8 |
|    2 |                16 |
|    2 |                32 |
|    3 |                 5 |
|    3 |                10 |
+------+-------------------+

上面这个例子中,本来是3行数据,some_numbers是一个数组,
然后直接把some_number接到from的后面,就可以把这个数组展开,并且和前面的id是绑定的。
参照这个例子,我们只需要使用split()函数把#号拆成数组,在接到from后面就可以变成两行了。

WITH
  t1 AS (
  SELECT
    uid,
    SPLIT(p.str,'#') AS pirates
  FROM
    `命令表`,
    UNNEST(params) AS p
  WHERE
    cmd='PVE Win'
    AND p.vn = 'v1')
SELECT
  uid,
  pirates
FROM
  t1,
  t1.pirates AS pirates
上一篇 下一篇

猜你喜欢

热点阅读