工作中遇到的一个有趣的边界合并问题

2019-03-24 本文已影响0人 alonwang

前言

更新 Guava的RangeSet完美解决这个问题

实际场景

前端会持续向后端传递一些数字,这些数字的特征是可重复的,通常是连续的.在某些时候后端需要返回这些数据给前端

eg. 前端传递了 1,2,3,4 此时后端的返回应该是1,2,3,4

抽象问题

概念定义

image.png

上面[0,10]称为边界,表示从0到10这个范围.

问题特征

按照最简单的设计,后端将这些数据存在set里,前端需要时返回所有数据即可,但是这就忽略了这些数据的特征.

这些数字通常是连续的

举个例子前端传递了 1,2,3,4,...500 ,那么后端存储1,2,3,4,...500,这不合理,后端浪费了很多存储空间,也增大了网络传输的消耗,理想情况下后端存储的应该是[1,500]

上面说的是通常情况下,那么不通常时呢? 前端传递了 1,2,5,6,4 后端存储的应该是[1,2],[4,6],

基于前端持续传递的特征,后端会取出处理好的边界组(初始时边界组为空),将这个数字插进去,更新边界组.设边界组为list,插入数字为num.会有以下三种情况

num 已经在list中的某个边界内了,不需要更新
num和某个边界相连,eg.list为[1,2],num为3 那么边界组应该更新为[1,3]
num和左右边界相连,eg list为[1,2],[4,5],num为3 那么边界组应该更新为[1,5]

经过上面的处理list肯定是有序(3),不交叉(2,3),不重复(1)的.

抽象出算法

根据上面的分析,问题就可以抽象为向一组有序不交叉不重复边界中插入一个数字

输入: 一组有序不交叉不重复边界list,一个待插入数字num

输出: 变化后的边界组list

放码

主代码如下:

package com.github.alonwang;

import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;

import java.util.List;

public class RangeMergeUtil {

    /**
     * 在一组有序不交叉不重复的数字边界中,插入一个新的数字
     * 1. 如果列表为空,新增边界
     * 2. 如果已存在,无变化
     * 3. 如果相连(左相连,右相连,左右均相连),融合
     * 4. 如果不相连,新增一个边界
     * 详见测试用例
     *
     * @param list 不为null
     * @param num  插入数字
     * @return 插入成功 true, 插入失败 false
     */
    public static boolean insert(List<Bound> list, int num) {
        final Bound newBound = new Bound(num, num);
        if (list.isEmpty()) {
            list.add(newBound);
            return true;
        }

        //num之前的Bound下标
        int previous = -1;
        //num之后的Bound下标
        int next = -1;
        for (int i = 0; i < list.size(); i++) {
            //已包含无需再处理
            int compare = list.get(i).compareTo(num);
            if (compare == 0) {
                return false;
            }

            if (compare == -1) {
                previous = i;
            } else if (next == -1) {
                next = i;
            }
        }

        boolean mergeSuccess = false;
        if (previous != -1 && next != -1) {
            mergeSuccess = mergeBound(list, num, previous, next);
        } else if (previous != -1) {
            mergeSuccess = mergePrevious(num, list, previous);
        } else {
            mergeSuccess = mergeNext(num, list, next);
        }
        //两侧都未发生merge,插入新Bound
        if (!mergeSuccess) {
            list.add(previous + 1, newBound);
        }
        return true;
    }

    private static boolean mergeBound(List<Bound> list, int i, int previous, int next) {
        boolean mergePrevious = mergePrevious(i, list, previous);
        boolean mergeNext = mergeNext(i, list, next);
        //两边都merge成功了,合并previous和next
        if (mergePrevious & mergeNext) {
            Bound previousBound = list.get(previous);
            Bound nextBound = list.get(next);
            list.add(previous + 1, new Bound(previousBound.getStart(), nextBound.getEnd()));
            list.remove(previousBound);
            list.remove(nextBound);
        }
        return mergePrevious | mergeNext;
    }

    private static boolean mergeNext(int i, List<Bound> list, int next) {
        Bound nextBound = list.get(next);
        if (nextBound.getStart() == i + 1) {
            nextBound.setStart(i);
            list.set(next, nextBound);
            return true;
        }
        return false;
    }

    private static boolean mergePrevious(int i, List<Bound> list, int previous) {
        Bound previousBound = list.get(previous);
        if (previousBound.getEnd() == i - 1) {
            previousBound.setEnd(i);
            list.set(previous, previousBound);
            return true;
        }
        return false;
    }


    @AllArgsConstructor
    @NoArgsConstructor
    @Data
    public static class Bound {
        private int start;
        private int end;

        public static Bound of(int start, int end) {
            return new Bound(start, end);
        }

        public int compareTo(int i) {
            if (start > i) {
                return 1;
            } else if (end < i) {
                return -1;
            } else {
                return 0;
            }
        }
    }

}

测试代码如下:

import com.github.alonwang.RangeMergeUtil
import spock.lang.Specification

import static com.github.alonwang.RangeMergeUtil.Bound.of

class RangeMergeUtilTest extends Specification {

    def "test bound"() {

        when:

        def result = RangeMergeUtil.insert(array, i)
        then:
        array[idx] == bound
        array.size() == size
        result == success

        where:
        array                | i  | idx | bound      | size | success
        //为空
        []                   | 3  | 0   | of(3, 3)   | 1    | true
        //已存在
        [of(1, 2), of(7, 9)] | 2  | 0   | of(1, 2)   | 2    | false
        //存在merge,边界变化
        [of(1, 2), of(7, 9)] | 3  | 0   | of(1, 3)   | 2    | true
        [of(1, 2), of(7, 9)] | 6  | 1   | of(6, 9)   | 2    | true
        [of(1, 2), of(7, 9)] | 10 | 1   | of(7, 10)  | 2    | true
        //无merge  新增bound
        [of(3, 4), of(6, 9)] | 1  | 0   | of(1, 1)   | 3    | true
        [of(3, 4), of(6, 9)] | 11 | 2   | of(11, 11) | 3    | true
        //左右merge,bound减少
        [of(3, 4), of(6, 7)] | 5  | 0   | of(3, 7)   | 1    | true


    }
}

## 总结

反思

看到这个问题我的第一想法是用redis的bitmap来搞,这些存储逻辑会很简单,取出时用bitpos来获取左右边界值就行了,但是限于项目中jar包版本,有个指令无法使用.只能放弃(后面leader提醒bitpos是O(n)的,在我这个场景里,最坏情况下会是O(n^2),)

由于这是个临时需求,马上就要验收了,就想着简单点搞全存全取好了,奈何leader说这样不行,虽然当时说了会改,内心还是有点不情愿的.等写出这个算法后,只想说:"leader英明",感觉很幸运,有个靠谱的leader真的会学到很多.

更进一步

写完之后发现这个问题还可以再抽象一下: 向一组有序不交叉不重复边界中插入一个新的边界,这个处理起来会更麻烦,后续有时间会尝试一下.

直觉告诉我肯定有更好的方法去处理这个问题,哪位知晓的请告知

源码已上传github