
[Repost] Resolving the Segments overlap Error in Kylin

2018-06-15  步闲

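I. Background

While building a Cube incrementally in Kylin, I ran into the following situation:

On the morning of 2018.06.27 I found that the automatic incremental Cube build for 2018.06.26 had failed. I opened the Kylin WebUI to resume the Job manually, but the failed Job was nowhere to be found on the Monitor page.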

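II. Trying to Rebuild the Job

So I decided to rebuild the Job instead, but the build request was rejected with a Segments overlap error.

The cause is that Kylin's metadata already contains a Segment covering the same time range, so a new Segment for that range cannot be built.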
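To make the overlap rule concrete, here is a minimal Python sketch of such a check. It is an illustration, not Kylin's own code: the function name `ranges_overlap` is hypothetical, and it assumes Segment ranges are half-open intervals [start, end) in epoch milliseconds, which is what back-to-back Segment names such as 20180626000000_20180627000000 suggest.

```python
DAY_MS = 24 * 60 * 60 * 1000  # one day in milliseconds


def ranges_overlap(a_start, a_end, b_start, b_end):
    """Return True if [a_start, a_end) and [b_start, b_end) share any instant."""
    return a_start < b_end and b_start < a_end


# Range of a Segment already registered in the metadata (illustrative values).
existing = (1528416000000, 1528416000000 + DAY_MS)

# Building the same day again collides with the existing Segment.
same_day = ranges_overlap(*existing, *existing)

# The next day starts exactly where the existing Segment ends: no overlap.
next_day = ranges_overlap(*existing, existing[1], existing[1] + DAY_MS)

print(same_day, next_day)  # prints: True False
```

Under this assumption, a leftover Segment from a failed build blocks any new build for the same range until that Segment is removed.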

III. Trying to Resume the Job

Since the rebuild failed and the UI showed no Job history, the next idea was to resume the Job through the RESTful API. But the API requires a job ID. Where to get one?
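For context, the resume call could be sketched in Python roughly as follows. This is a sketch under assumptions: the endpoint path `/kylin/api/jobs/{job_id}/resume` with the PUT method and Basic authentication is how I recall Kylin's REST API documentation, so verify it against your version; the helper name, host, credentials, and job ID are all placeholders. The code only builds the request and does not send it.

```python
import base64
import urllib.request


def build_resume_request(base_url, job_id, user, password):
    """Build (but do not send) a PUT request that asks Kylin to resume a job.

    The endpoint path is an assumption based on Kylin's REST API docs.
    """
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/kylin/api/jobs/{job_id}/resume",
        method="PUT",
        headers={"Authorization": f"Basic {token}"},
    )


# Placeholder values; the job ID is exactly the piece that is missing here.
request = build_resume_request("http://localhost:7070", "hypothetical-job-id", "ADMIN", "KYLIN")
print(request.get_method(), request.full_url)

# To actually send it: urllib.request.urlopen(request)
```

Without a real job ID in the URL, the call cannot be made, which is why the search for the ID comes next.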

Off to Google...

https://issues.apache.org/jira/browse/KYLIN-2795

According to this JIRA, Kylin offered no RESTful API for retrieving job IDs through version 2.1; it was only added in 2.2. Our company happened to be running 2.1, of course.

The only option left was to dig into Kylin's metadata. Looking at the kylin_metadata table in HBase, the rowkeys follow a few main patterns:

/acl/ffd1b1fd-eae0-4c2f-9808-26b58b1c21ac
/cube/kylin_cube.json
/cube_desc/kylin_cube.json
/cube_statistics/kylin_cube/3f7de744-8703-4b1d-9ca4-cd283ec9667c.seq
/dict/LOG.DW_APP_KYLIN_VIEW/ACITON_SOURCE/1335fd4c-bb7d-447b-a678-4972d19449aa.dict

The following HBase shell command fetches the metadata of a given Cube:

get 'kylin_metadata','/cube/kylin_cube.json'

This prints the Cube's metadata. In the output, locate the entry for the Segment in question:

{
    "uuid" : "5877617f-5fd1-40b5-bdc9-fd59945135af",
    "name" : "20180626000000_20180627000000",
    "storage_location_identifier" : "KYLIN_8SJTX5SBNM",
    "date_range_start" : 1528416000000,
    "date_range_end" : 1528502400000,
    "source_offset_start" : 0,
    "source_offset_end" : 0,
    "status" : "NEW",
    "size_kb" : 0,
    "input_records" : 0,
    "input_records_size" : 0,
    "last_build_time" : 0,
    "last_build_job_id" : null,
    "create_time_utc" : 1529030216244,
    "cuboid_shard_nums" : { },
    "total_shards" : 0,
    "blackout_cuboids" : [ ],
    "binary_signature" : null,
    "dictionaries" : null,
    "snapshots" : null,
    "rowkey_stats" : [ ]
  }

However, as you can see, "last_build_job_id" is null. What a letdown: a build that never succeeded does not record its job ID in the Segment.
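Scanning a large Cube JSON by eye is error-prone, so the lookup can also be scripted. The Python sketch below assumes the Cube JSON keeps its Segments in a top-level "segments" array (not shown in the excerpt above, so treat it as an assumption); `find_segment` is a hypothetical helper, and the sample data is a trimmed-down stand-in for the real metadata.

```python
import json


def find_segment(cube, segment_name):
    """Return the Segment dict with the given name, or None if it is absent."""
    for segment in cube.get("segments", []):
        if segment.get("name") == segment_name:
            return segment
    return None


# Trimmed-down stand-in for the JSON stored under /cube/kylin_cube.json.
cube = json.loads("""
{
  "name": "kylin_cube",
  "segments": [
    {"name": "20180625000000_20180626000000", "status": "READY",
     "last_build_job_id": "hypothetical-job-id"},
    {"name": "20180626000000_20180627000000", "status": "NEW",
     "last_build_job_id": null}
  ]
}
""")

segment = find_segment(cube, "20180626000000_20180627000000")
print(segment["status"], segment["last_build_job_id"])  # prints: NEW None
```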

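IV. Trying to Delete the Segment and Build Again

So resuming the Job to continue the append build was not an option.

What now?

The next idea was to delete the Segment manually and then submit a new build. But the RESTful API call to delete it returned an error: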

{
    "code": "999",
    "data": null,
    "msg": "Cannot delete segment '20180603000000_20180604000000' as it is neither the first nor the last segment.",
    "stacktrace": "org.apache.kylin.rest.exception.InternalErrorException: Cannot delete segment '20180603000000_20180604000000' as it is neither the first nor the last segment.\n\tat org.apache.kylin.rest.controller.CubeController.deleteSegment(CubeController.java:247)\n\tat sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)\n\tat sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)\n\tat sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\tat java.lang.reflect.Method.invoke(Method.java:606)\n\tat org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:221)\n\tat org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:136)\n\tat org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:110)\n\tat org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:832)\n\tat org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:743)\n\tat org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:85)\n\tat org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:961)\n\tat org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:895)\n\tat org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:967)\n\tat org.springframework.web.servlet.FrameworkServlet.doDelete(FrameworkServlet.java:891)\n\tat javax.servlet.http.HttpServlet.service(HttpServlet.java:656)\n\tat org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:843)\n\tat javax.servlet.http.HttpServlet.service(HttpServlet.java:731)\n\tat 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:303)\n\tat org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)\n\tat org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)\n\tat org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)\n\tat org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)\n\tat org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:316)\n\tat org.springframework.security.web.access.intercept.FilterSecurityInterceptor.invoke(FilterSecurityInterceptor.java:126)\n\tat org.springframework.security.web.access.intercept.FilterSecurityInterceptor.doFilter(FilterSecurityInterceptor.java:90)\n\tat org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:330)\n\tat org.springframework.security.web.access.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:114)\n\tat org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:330)\n\tat org.springframework.security.web.session.SessionManagementFilter.doFilter(SessionManagementFilter.java:122)\n\tat org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:330)\n\tat org.springframework.security.web.authentication.AnonymousAuthenticationFilter.doFilter(AnonymousAuthenticationFilter.java:111)\n\tat org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:330)\n\tat org.springframework.security.web.servletapi.SecurityContextHolderAwareRequestFilter.doFilter(SecurityContextHolderAwareRequestFilter.java:169)\n\tat org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:330)\n\tat 
org.springframework.security.web.savedrequest.RequestCacheAwareFilter.doFilter(RequestCacheAwareFilter.java:48)\n\tat org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:330)\n\tat org.springframework.security.web.authentication.www.BasicAuthenticationFilter.doFilterInternal(BasicAuthenticationFilter.java:213)\n\tat org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)\n\tat org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:330)\n\tat org.springframework.security.web.authentication.AbstractAuthenticationProcessingFilter.doFilter(AbstractAuthenticationProcessingFilter.java:205)\n\tat org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:330)\n\tat org.springframework.security.web.authentication.logout.LogoutFilter.doFilter(LogoutFilter.java:120)\n\tat org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:330)\n\tat org.springframework.security.web.header.HeaderWriterFilter.doFilterInternal(HeaderWriterFilter.java:64)\n\tat org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)\n\tat org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:330)\n\tat org.springframework.security.web.context.request.async.WebAsyncManagerIntegrationFilter.doFilterInternal(WebAsyncManagerIntegrationFilter.java:53)\n\tat org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)\n\tat org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:330)\n\tat org.springframework.security.web.context.SecurityContextPersistenceFilter.doFilter(SecurityContextPersistenceFilter.java:91)\n\tat org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:330)\n\tat 
org.springframework.security.web.FilterChainProxy.doFilterInternal(FilterChainProxy.java:213)\n\tat org.springframework.security.web.FilterChainProxy.doFilter(FilterChainProxy.java:176)\n\tat org.springframework.web.filter.DelegatingFilterProxy.invokeDelegate(DelegatingFilterProxy.java:346)\n\tat org.springframework.web.filter.DelegatingFilterProxy.doFilter(DelegatingFilterProxy.java:262)\n\tat org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)\n\tat org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)\n\tat com.thetransactioncompany.cors.CORSFilter.doFilter(CORSFilter.java:209)\n\tat com.thetransactioncompany.cors.CORSFilter.doFilter(CORSFilter.java:244)\n\tat org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)\n\tat org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)\n\tat org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:220)\n\tat org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)\n\tat org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:505)\n\tat org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:169)\n\tat org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)\n\tat org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:956)\n\tat org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)\n\tat org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:436)\n\tat org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1078)\n\tat org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:625)\n\tat org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:316)\n\tat 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)\n\tat org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)\n\tat java.lang.Thread.run(Thread.java:745)\n",
    "exception": "Cannot delete segment '20180603000000_20180604000000' as it is neither the first nor the last segment.",
    "url": "http://10.15.184.21:7071/kylin/api/cubes/kylin_lc_log_cube_new_2/segs/20180603000000_20180604000000"
}

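As the error message says, the Segment cannot be deleted because it is neither the first nor the last one. It turns out that Segment deletion is restricted: only the first or the last Segment of a Cube can be deleted.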
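The restriction stated in the error message can be sketched as a small Python check. This mirrors the message only, not Kylin's actual implementation; `can_delete` is a hypothetical name, and the list is assumed to be ordered by time range. Apart from the Segment named in the error, the Segment names are made up.

```python
def can_delete(segment_names, target):
    """Only the first or the last Segment in the ordered list may be deleted."""
    if target not in segment_names:
        return False
    return target in (segment_names[0], segment_names[-1])


names = [
    "20180602000000_20180603000000",
    "20180603000000_20180604000000",
    "20180604000000_20180605000000",
]

print(can_delete(names, "20180603000000_20180604000000"))  # prints: False
print(can_delete(names, "20180604000000_20180605000000"))  # prints: True
```

A Segment stuck in the middle of the list therefore cannot be removed through this API, however broken it is.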

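V. Editing the Metadata Directly

What now? A bold idea: since the API refuses to delete the Segment, edit the metadata by hand and remove it for good.

1. Backup

To guard against mistakes, back up the metadata first, so that the Kylin cluster can be restored to its previous state if something goes wrong.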

# Back up the metadata
$ ./bin/metastore.sh backup
# The backup directory is: $KYLIN_HOME/meta_backups/meta_2018_06_27_08_30_49

2. Cleanup

Then I wondered whether cleaning up the unused Segments in HBase first might help, and ran the following commands:

# Check the metadata
$ ./bin/metastore.sh clean
# Remove invalid data
$ ./bin/metastore.sh clean --delete true

3. Edit

That had no effect, so on to editing the metadata!

$ cd  $KYLIN_HOME/meta_backups/meta_2018_06_27_08_30_49/cube
$ vim  kylin_cube.json

Find the following entry, delete it (along with the adjacent comma, so the JSON stays valid), and save.

{
    "uuid" : "5877617f-5fd1-40b5-bdc9-fd59945135af",
    "name" : "20180626000000_20180627000000",
    "storage_location_identifier" : "KYLIN_8SJTX5SBNM",
    "date_range_start" : 1528416000000,
    "date_range_end" : 1528502400000,
    "source_offset_start" : 0,
    "source_offset_end" : 0,
    "status" : "NEW",
    "size_kb" : 0,
    "input_records" : 0,
    "input_records_size" : 0,
    "last_build_time" : 0,
    "last_build_job_id" : null,
    "create_time_utc" : 1529030216244,
    "cuboid_shard_nums" : { },
    "total_shards" : 0,
    "blackout_cuboids" : [ ],
    "binary_signature" : null,
    "dictionaries" : null,
    "snapshots" : null,
    "rowkey_stats" : [ ]
  }
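As an alternative to editing by hand, the same removal can be scripted, which avoids leaving a stray comma behind. The Python sketch below is a minimal version under assumptions: it expects the Segments in a top-level "segments" array of the Cube JSON, and the helper names `remove_segment` and `rewrite_cube_file` are hypothetical. Run it only against the backup copy.

```python
import json


def remove_segment(cube, segment_name):
    """Return a copy of the cube dict without the named Segment."""
    segments = cube.get("segments", [])
    kept = [s for s in segments if s.get("name") != segment_name]
    if len(kept) == len(segments):
        # Refuse to write anything if the name did not match a Segment.
        raise ValueError(f"segment not found: {segment_name}")
    updated = dict(cube)
    updated["segments"] = kept
    return updated


def rewrite_cube_file(path, segment_name):
    """Load a backed-up cube JSON file, drop one Segment, and save it in place."""
    with open(path, encoding="utf-8") as f:
        cube = json.load(f)
    updated = remove_segment(cube, segment_name)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(updated, f, indent=2)
```

From the cube directory of the backup, this would be called as `rewrite_cube_file("kylin_cube.json", "20180626000000_20180627000000")`. The rewritten file will differ in whitespace from the original, but the JSON content is otherwise the same minus the removed Segment.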

4. Restore

Finally, restore the cluster from the edited metadata with the following commands:

$ ./bin/metastore.sh reset

$ ./bin/metastore.sh restore $KYLIN_HOME/meta_backups/meta_2018_06_27_08_30_49

Reload the metadata in the WebUI and submit the Build again. The build succeeded!

Done!
