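Storm Example: Submitting a Geodesic-Distance Topology
Introduction to the Geodesic Algorithm
- Key concepts
Earth's axis: the tilted axis about which the Earth rotates, also called the rotation axis. Its northern end meets the surface at the North Pole and its southern end at the South Pole.
Equator: the longest of the circles traced by points on the Earth's surface as the Earth rotates. The equatorial radius is 6,378.137 km.
Latitude: the angle between the equatorial plane and the line joining a point to the Earth's center, ranging from 0 to 90 degrees. Points north of the equator have north latitude, written N; points south of it have south latitude, written S.
Longitude: the horizontal coordinate of the spherical coordinate system. Concretely, it is the number of degrees by which a place lies east or west of a north-south reference line called the prime meridian.
Prime meridian: the 0-degree line of longitude, also called the Greenwich meridian, which passes through the Royal Observatory at Greenwich, England. East longitude and west longitude are measured on either side of it and meet at 180 degrees.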
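- Geodesic algorithm
In practice, to compute the distance between two places we usually take the latitude and longitude of the center of each place and compute the length of the surface arc between them. The method for computing that surface arc distance is what this article calls the geodesic algorithm.
- Algorithm derivation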
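As shown in the figure, we want the arc distance between points A and B, so the goal is to find the angle between OA and OB, where O is the Earth's center.
Let the first point A have longitude and latitude (jA, wA) and the second point B have (jB, wB). Relative to the 0-degree meridian, east longitudes are taken as positive (Longitude) and west longitudes as negative (-Longitude); north latitudes become 90 - Latitude and south latitudes become 90 + Latitude. After this conversion the two points are written (MLonA, MLatA) and (MLonB, MLatB). Trigonometry then gives the following formula for the distance, where C is the cosine of the angle between OA and OB, R is the Earth's radius, and Arccos is taken to return degrees (hence the factor Pi/180):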
C = sin(MLatA)*sin(MLatB)*cos(MLonA-MLonB) + cos(MLatA)*cos(MLatB)
Distance = R*Arccos(C)*Pi/180
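If only the longitude is given a sign and the latitude is not converted with 90 - Latitude (assuming both points are in the Northern Hemisphere; treating south latitudes as negative values extends the same formula to the Southern Hemisphere), the formula becomes: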
C = sin(wA)*sin(wB) + cos(wA)*cos(wB)*cos(jA-jB)
Distance = R*Arccos(C)*Pi/180
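The signed-latitude formula above translates almost directly into code. The following is a minimal sketch, not the author's implementation: the class name CosineDistance and the method name distanceMeters are hypothetical, and the radius is the equatorial value quoted earlier. Because Math.acos already returns radians, the Pi/180 factor from the formula is not needed.

```java
/** Great-circle distance via the spherical law of cosines (illustrative sketch). */
final class CosineDistance {
    /** Equatorial radius in meters, as quoted in the text. */
    static final double EARTH_RADIUS_M = 6378137.0;

    /**
     * Returns the surface distance in meters between points A and B.
     * Inputs are in degrees; east longitude and north latitude are positive,
     * west longitude and south latitude are negative.
     */
    static double distanceMeters(double jA, double wA, double jB, double wB) {
        double latA = Math.toRadians(wA);
        double latB = Math.toRadians(wB);
        double dLon = Math.toRadians(jA - jB);
        // c is the cosine of the central angle between OA and OB.
        double c = Math.sin(latA) * Math.sin(latB)
                 + Math.cos(latA) * Math.cos(latB) * Math.cos(dLon);
        // Rounding can push c slightly outside [-1, 1], where acos returns NaN.
        c = Math.max(-1.0, Math.min(1.0, c));
        // Math.acos returns radians, so the arc length is simply R * angle.
        return EARTH_RADIUS_M * Math.acos(c);
    }
}
```

For example, CosineDistance.distanceMeters(0, 0, 90, 0) spans a quarter of the equator and returns R * Pi / 2. For points that are very close together the arccos form loses precision, which is one reason a haversine-style formula is often preferred in practice.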
For this case, the derivation can be outlined briefly:
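One way to reach this formula, assuming a perfectly spherical Earth centered at O, is to write the unit vectors from O to each point and take their dot product. The vector names a and b are introduced here only for illustration.

```latex
\begin{aligned}
\mathbf{a} &= (\cos w_A \cos j_A,\ \cos w_A \sin j_A,\ \sin w_A), \\
\mathbf{b} &= (\cos w_B \cos j_B,\ \cos w_B \sin j_B,\ \sin w_B), \\
C = \mathbf{a}\cdot\mathbf{b}
  &= \cos w_A \cos w_B\,(\cos j_A \cos j_B + \sin j_A \sin j_B) + \sin w_A \sin w_B \\
  &= \sin w_A \sin w_B + \cos w_A \cos w_B \cos(j_A - j_B).
\end{aligned}
```

Since both vectors have unit length, the dot product C is the cosine of the angle between OA and OB, and the arc length is R times that angle expressed in radians.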
[Figure: derivation]
- Java implementation
    public static double Distance(Location loc1, Location loc2) {
        final double R = 6378137; // Earth's equatorial radius, in meters
        // Convert latitudes to radians; Math.sin and Math.cos expect radians.
        double lat1 = Math.toRadians(loc1.latitude);
        double lat2 = Math.toRadians(loc2.latitude);
        // Differences in latitude and longitude, in radians.
        double a = lat1 - lat2;
        double b = Math.toRadians(loc1.longitude - loc2.longitude);
        double sa2 = Math.sin(a / 2.0);
        double sb2 = Math.sin(b / 2.0);
        // Haversine form of the great-circle distance, which is numerically
        // better behaved for nearby points than the arccos form.
        return 2 * R * Math.asin(Math.sqrt(sa2 * sa2 + Math.cos(lat1) * Math.cos(lat2) * sb2 * sb2));
    }
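Spout/Bolt Programming
Our goal is to keep generating random coordinate points locally and to compute the distance from each point to a fixed location.
- Location class
First, for convenience, we create a Location class.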
    public class Location {
        public double longitude;
        public double latitude;

        public Location(double lon, double lat) {
            this.longitude = lon;
            this.latitude = lat;
        }

        public String locationInfo() {
            String info = "location:( " + longitude + "," + latitude + " ) ";
            return info;
        }
    }
- RandomLocationSpout class
Create the RandomLocationSpout class, which extends BaseRichSpout and overrides the base class's core methods.
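    // Imports assume the pre-1.0 backtype.storm packages that appear in the worker log.
    import java.util.Map;
    import backtype.storm.spout.SpoutOutputCollector;
    import backtype.storm.task.TopologyContext;
    import backtype.storm.topology.OutputFieldsDeclarer;
    import backtype.storm.topology.base.BaseRichSpout;
    import backtype.storm.tuple.Fields;
    import backtype.storm.tuple.Values;

    public class RandomLocationSpout extends BaseRichSpout {
        SpoutOutputCollector spoutOutputCollector;

        @Override
        public void open(Map conf, TopologyContext context, SpoutOutputCollector collector) {
            // Keep the collector so nextTuple() can emit tuples.
            spoutOutputCollector = collector;
        }

        @Override
        public void nextTuple() {
            // Random latitude in [39, 41) and longitude in [116, 117).
            double lat = 39 + (Math.random() * 2);
            double lon = 116 + Math.random();
            // Emit the point as a "longitude,latitude" string.
            String loc = lon + "," + lat;
            spoutOutputCollector.emit(new Values(loc));
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            // Each tuple carries a single field named "spout".
            declarer.declare(new Fields("spout"));
        }
    }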
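In the nextTuple() method, we randomly generate a coordinate with a latitude between 39 and 41 degrees north and a longitude between 116 and 117 degrees east, and then emit it.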
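The coordinate generation can also be pulled out of the spout so it can be checked without a running Storm cluster. This is a minimal sketch under that assumption: the class RandomLocations and its next method are hypothetical names, and passing in a java.util.Random (instead of calling Math.random()) is a choice made here so that a fixed seed gives repeatable values.

```java
import java.util.Random;

/** Generates random "longitude,latitude" strings in the same ranges as the spout (sketch). */
final class RandomLocations {
    /**
     * Returns a string "lon,lat" with lon in [116, 117) and lat in [39, 41),
     * matching the format the spout emits.
     */
    static String next(Random rng) {
        double lat = 39 + rng.nextDouble() * 2;
        double lon = 116 + rng.nextDouble();
        return lon + "," + lat;
    }
}
```

A spout could then call RandomLocations.next(rng) inside nextTuple() and emit the returned string unchanged.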
- CalculateDistantBolt class
Create the CalculateDistantBolt class, which implements the methods of the IRichBolt interface.
    // Imports assume the pre-1.0 backtype.storm packages that appear in the worker log.
    import java.util.Map;
    import backtype.storm.task.OutputCollector;
    import backtype.storm.task.TopologyContext;
    import backtype.storm.topology.IRichBolt;
    import backtype.storm.topology.OutputFieldsDeclarer;
    import backtype.storm.tuple.Tuple;

    public class CalculateDistantBolt implements IRichBolt {
        private OutputCollector outputCollector;

        public void prepare(Map stormConf, TopologyContext context, OutputCollector collector) {
            outputCollector = collector;
        }

        public void execute(Tuple input) {
            // The spout emits "longitude,latitude" as a single string field.
            String loc = input.getString(0);
            String[] s1 = loc.split(",");
            Location location = new Location(Double.parseDouble(s1[0]), Double.parseDouble(s1[1]));
            // Fixed reference point to measure against.
            Location center = new Location(116.360664, 40.007614);
            double d = Distance(location, center);
            System.out.println("************\n" + location.locationInfo() + "between" + center.locationInfo() + ":\n" + "Distant: " + d + "\n ***********");
        }

        public void cleanup() {
            // Nothing to release.
        }

        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            // This bolt emits no tuples, so there are no fields to declare.
        }

        public Map<String, Object> getComponentConfiguration() {
            return null;
        }

        public static double Distance(Location loc1, Location loc2) {
            final double R = 6378137; // Earth's equatorial radius, in meters
            // Convert latitudes to radians; Math.sin and Math.cos expect radians.
            double lat1 = Math.toRadians(loc1.latitude);
            double lat2 = Math.toRadians(loc2.latitude);
            // Differences in latitude and longitude, in radians.
            double a = lat1 - lat2;
            double b = Math.toRadians(loc1.longitude - loc2.longitude);
            double sa2 = Math.sin(a / 2.0);
            double sb2 = Math.sin(b / 2.0);
            // Haversine form of the great-circle distance.
            return 2 * R * Math.asin(Math.sqrt(sa2 * sa2 + Math.cos(lat1) * Math.cos(lat2) * sb2 * sb2));
        }
    }
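In the execute(Tuple input) method, we take the coordinate emitted by the spout and compute the surface distance from that point to a fixed location (in this example, center is the author's current location), then print the result.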
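The bolt assumes every tuple holds a well-formed "longitude,latitude" string. A more defensive parse might look like the sketch below; the class LocationParser and its parse method are hypothetical helpers, not part of the original code, and the range checks are an added assumption about what counts as a valid coordinate.

```java
/** Parses the "longitude,latitude" strings carried by the tuples (illustrative sketch). */
final class LocationParser {
    /**
     * Returns {longitude, latitude} in degrees.
     * Throws IllegalArgumentException if the input is null, malformed, or out of range.
     */
    static double[] parse(String loc) {
        if (loc == null) {
            throw new IllegalArgumentException("location is null");
        }
        String[] parts = loc.split(",");
        if (parts.length != 2) {
            throw new IllegalArgumentException("expected \"lon,lat\" but got: " + loc);
        }
        try {
            double lon = Double.parseDouble(parts[0].trim());
            double lat = Double.parseDouble(parts[1].trim());
            // Written as negated range checks so that NaN is rejected as well.
            if (!(lon >= -180 && lon <= 180) || !(lat >= -90 && lat <= 90)) {
                throw new IllegalArgumentException("coordinate out of range: " + loc);
            }
            return new double[] {lon, lat};
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("non-numeric coordinate: " + loc, e);
        }
    }
}
```

Inside execute, a bolt could catch the IllegalArgumentException and skip the tuple instead of letting a bad record crash the worker.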
- DistantTopology class
Finally, create the main class DistantTopology, which builds the topology. The main method wires up the spout and the bolt and then submits the topology to Storm.
    // Imports assume the pre-1.0 backtype.storm packages that appear in the worker log.
    import backtype.storm.Config;
    import backtype.storm.LocalCluster;
    import backtype.storm.StormSubmitter;
    import backtype.storm.topology.TopologyBuilder;

    public class DistantTopology {
        private static TopologyBuilder builder = new TopologyBuilder();

        public static void main(String[] args) {
            Config config = new Config();
            // Two executors each for the spout and the bolt; tuples are shuffled between them.
            builder.setSpout("RandomLocationSpout", new RandomLocationSpout(), 2);
            builder.setBolt("CalculateDistantBolt", new CalculateDistantBolt(), 2).shuffleGrouping(
                    "RandomLocationSpout");
            config.setDebug(true);
            // With an argument, submit to the cluster under that name; without one, run in local mode.
            if (args != null && args.length > 0) {
                try {
                    config.setNumWorkers(1);
                    StormSubmitter.submitTopology(args[0], config,
                            builder.createTopology());
                } catch (Exception e) {
                    e.printStackTrace();
                }
            } else {
                config.setMaxTaskParallelism(1);
                LocalCluster cluster = new LocalCluster();
                cluster.submitTopology("wordcount", config, builder.createTopology());
            }
        }
    }
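Submitting the Topology
- Package the project into a jar with the mvn command
- Upload the jar to a host in the cluster
- On that host, run the storm jar command in a terminal to submit the topology
- After a successful submission, check the worker log on the corresponding node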
1041937 [Thread-8-RandomLocationSpout] INFO backtype.storm.daemon.task - Emitting: RandomLocationSpout default [116.61777982507657,39.25343303306309]
1041937 [Thread-14-CalculateDistantBolt] INFO backtype.storm.daemon.executor - Processing received message source: RandomLocationSpout:2, stream: default, id: {}, [116.72828785729372,39.204690956457334]
************
location:( 116.72828785729372,39.204690956457334 ) betweenlocation:( 116.360664,40.007614 ) :
Distant: 88969.36622189148
***********
1041937 [Thread-8-RandomLocationSpout] INFO backtype.storm.daemon.task - Emitting: RandomLocationSpout default [116.03394550808655,40.8784387953902]
1041937 [Thread-14-CalculateDistantBolt] INFO backtype.storm.daemon.executor - Processing received message source: RandomLocationSpout:2, stream: default, id: {}, [116.87086610893954,40.505468257874966]
************
location:( 116.87086610893954,40.505468257874966 ) betweenlocation:( 116.360664,40.007614 ) :
Distant: 71556.36956545932
***********
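The log keeps refreshing with newly computed distances, which confirms that the topology was submitted successfully. Note that the distances in this sample log were computed with latitudes passed to Math.cos in degrees rather than radians, so they are slightly off; with radians, the first pair of points is roughly 94.8 km apart rather than 88.97 km.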