110. Writing Data to Kudu in Real Time with StreamSets

2022-03-04  Author: 大勇任卷舒

110.1 Demo Environment

110.2 Walkthrough

1. Environment setup

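Enable the binary log in the MariaDB configuration:

server-id=1
log-bin=mysql-bin
binlog_format=ROW

Restart MariaDB and check that it is running:

[root@ip-168-31-16-68 ~]# systemctl restart mariadb
[root@ip-168-31-16-68 ~]# systemctl status mariadb

Create the replication account and grant it privileges:

GRANT ALL on maxwell.* to 'maxwell'@'%' identified by '123456';
GRANT SELECT, REPLICATION CLIENT, REPLICATION SLAVE on *.* to 'maxwell'@'%';
FLUSH PRIVILEGES;

Create the test database and the source table in MariaDB:

create database test;
use test;
create table cdc_test (
       id int,
       name varchar(32)
);

Create the target Kudu table (Impala DDL):

create table cdc_test (
       id int,
       name String,
       primary key(id)
)
       PARTITION BY HASH PARTITIONS 16
STORED AS KUDU;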

2. Create the Pipeline

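JavaScript Evaluator script that builds the output record from OldData (the row as it was before the change, which is what a delete needs):

for (var i = 0; i < records.length; i++) {
  try {
    // Create a fresh output record
    var newRecord = sdcFunctions.createRecord(true);
    // Use the pre-change row image as the record body
    newRecord.value = records[i].value['OldData'];
    // Carry over the event metadata
    newRecord.value.Type = records[i].value['Type'];
    newRecord.value.Database = records[i].value['Database'];
    newRecord.value.Table = records[i].value['Table'];
    log.info(records[i].value['Type']);
    output.write(newRecord);
  } catch (e) {
    // Send record to error
    error.write(records[i], e);
  }
}

JavaScript Evaluator script that builds the output record from Data (the row after the change, for inserts and updates):

for (var i = 0; i < records.length; i++) {
  try {
    // Create a fresh output record
    var newRecord = sdcFunctions.createRecord(true);
    // Use the post-change row image as the record body
    newRecord.value = records[i].value['Data'];
    // Carry over the event metadata
    newRecord.value.Type = records[i].value['Type'];
    newRecord.value.Database = records[i].value['Database'];
    newRecord.value.Table = records[i].value['Table'];
    log.info(records[i].value['Type']);
    output.write(newRecord);
  } catch (e) {
    // Send record to error
    error.write(records[i], e);
  }
}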
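
The reshaping these scripts perform can be tried outside StreamSets. The following is a minimal sketch in plain JavaScript, not the pipeline's actual code: `flattenCdcRecord` and the `sample` object are hypothetical, and the rule that DELETE events take their row from `OldData` while other events use `Data` is an assumption based on the two scripts above.

```javascript
// Hypothetical helper: turn one binlog-style record into the flat map
// that is written downstream. Field names (Type, Database, Table, Data,
// OldData) follow the evaluator scripts in this article.
function flattenCdcRecord(value) {
  // Assumption: DELETE events carry the row in OldData, others in Data.
  const image = value.Type === 'DELETE' ? value.OldData : value.Data;
  if (image === null || typeof image !== 'object') {
    throw new Error('no row image for event type ' + value.Type);
  }
  // Copy the row image so the input record is left untouched,
  // then attach the event metadata.
  return Object.assign({}, image, {
    Type: value.Type,
    Database: value.Database,
    Table: value.Table,
  });
}

// Illustrative record only; not captured from a real pipeline run.
const sample = {
  Type: 'DELETE',
  Database: 'test',
  Table: 'cdc_test',
  Data: null,
  OldData: { id: 1, name: 'fayson-update' },
};
const flat = flattenCdcRecord(sample);
console.log(JSON.stringify(flat));
```

Unlike the evaluator scripts, this sketch copies the row image instead of adding the metadata fields to it in place, which keeps the input record reusable.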

3. Pipeline test

insert into cdc_test values(1, 'fayson');
update cdc_test set name='fayson-update' where id=1;
delete from cdc_test where id=1;
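
To reason about what the target table should contain after each statement, the three changes can be replayed against an in-memory table. This is only a sketch under assumptions: the event type names (`INSERT`, `UPDATE`, `DELETE`), the `applyEvent` helper, and the treatment of inserts and updates as upserts keyed by `id` are illustrative, not taken from the pipeline configuration.

```javascript
// Apply one flattened change event to an in-memory table keyed by id.
function applyEvent(table, event) {
  switch (event.Type) {
    case 'INSERT':
    case 'UPDATE':
      // Assumption: both are handled as an upsert on the primary key.
      table.set(event.id, { id: event.id, name: event.name });
      break;
    case 'DELETE':
      table.delete(event.id);
      break;
    default:
      throw new Error('unexpected event type: ' + event.Type);
  }
  return table;
}

// Events mirroring the three SQL statements above (illustrative values).
const events = [
  { Type: 'INSERT', id: 1, name: 'fayson' },
  { Type: 'UPDATE', id: 1, name: 'fayson-update' },
  { Type: 'DELETE', id: 1 },
];

const kuduTable = new Map();
// Record the table contents after each event.
const snapshots = events.map(function (event) {
  return JSON.stringify(Array.from(applyEvent(kuduTable, event).values()));
});
console.log(snapshots);
```

Comparing each snapshot with a query against the Kudu table after the corresponding statement is one way to confirm that the pipeline applied the change.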

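4. Summary
1. When writing to Kudu, pay attention to the table name: a table created through Impala must be referenced with the impala prefix, in the form impala::<database>.<table>.
2. MySQL CDC requires that the MySQL binary log is enabled and that a replication account exists; the MySQL Binary Log origin in StreamSets effectively acts as a MySQL slave.
3. The Kudu table must already exist before data can be written to it in real time; otherwise the writes fail.
4. The keys of the assembled Map must match the column names of the Kudu table.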
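
Points 1 and 4 can be checked before a pipeline is started. The sketch below is a plain JavaScript illustration, assuming the `impala::<database>.<table>` naming that Impala uses for the Kudu tables it manages. The helper names, the `default` database, and the list of metadata keys to ignore are hypothetical.

```javascript
// Build the Kudu-side name of a table that was created through Impala.
function impalaKuduTableName(database, table) {
  return 'impala::' + database + '.' + table;
}

// Return the keys of a row map that match no Kudu column.
// Keys listed in `ignore` (for example event metadata) are skipped.
function unmatchedKeys(row, columns, ignore) {
  const known = new Set(columns);
  const skipped = new Set(ignore || []);
  return Object.keys(row).filter(function (key) {
    return !known.has(key) && !skipped.has(key);
  });
}

// Assumption: the Kudu table lives in Impala's "default" database.
const tableName = impalaKuduTableName('default', 'cdc_test');

// Columns of the cdc_test table created earlier in this article.
const columns = ['id', 'name'];

// Illustrative row with a wrongly cased key ("Name" instead of "name").
const row = { id: 1, Name: 'fayson', Type: 'INSERT', Database: 'test', Table: 'cdc_test' };
const problems = unmatchedKeys(row, columns, ['Type', 'Database', 'Table']);

console.log(tableName, problems);
```

The comparison is case-sensitive on purpose, so a key such as `Name` is reported rather than silently accepted.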
