Wednesday, June 23, 2010

Mean Position Algorithm Breakdown

I'm between World Cup matches so I'll take some time to explain how the mean node position function works. Here's the important part of the function:

if ( !node.isExternal() ) {
node.getNodeData().setDistribution(new Distribution(""));
Distribution dist = data.getDistribution();

for (int i = 0; i < node.getNumberOfDescendants(); i++) { // do mean calcs
PhylogenyNode childNode = node.getChildNode(i);
NodeData childData = childNode.getNodeData();
Distribution childDist = childData.getDistribution();
BigDecimal childLat = childDist.getLatitude();
BigDecimal childLong = childDist.getLongitude();

childCoordsLat.add(childLat);
childCoordsLong.add(childLong);

latSum = latSum.add(childLat);
longSum = longSum.add(childLong);

c++;

}

BigDecimal count = new BigDecimal(c);
BigDecimal meanLat = latSum.divide(count,BigDecimal.ROUND_CEILING);
BigDecimal meanLong = longSum.divide(count,BigDecimal.ROUND_CEILING);

dist.setLatitude(meanLat);
dist.setLongitude(meanLong);

Now to break it down.

if ( !node.isExternal() ) {
node.getNodeData().setDistribution(new Distribution(""));
Distribution dist = data.getDistribution();

This function only assigns positions to internal (non-external) nodes, so first we check that the node is internal. Since an internal node doesn't have a distribution yet, we give it an empty one. Then we grab that distribution, here called dist.

Next we iterate through the descendants of the node, collecting the latitude and longitude of each one:

for (int i = 0; i < node.getNumberOfDescendants(); i++) {
PhylogenyNode childNode = node.getChildNode(i);
NodeData childData = childNode.getNodeData();
Distribution childDist = childData.getDistribution();
BigDecimal childLat = childDist.getLatitude();
BigDecimal childLong = childDist.getLongitude();

childCoordsLat.add(childLat);
childCoordsLong.add(childLong);

Now we need the sum over all the children, so we simply add each value to a running total:

latSum = latSum.add(childLat);
longSum = longSum.add(childLong);

and keep track of how many descendants we've seen: c++;
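
To see the accumulation step on its own, here is a minimal, self-contained sketch. It doesn't use the forester classes at all: the class name CoordSum and the method sum are made up for illustration, and a plain list of BigDecimal values stands in for the children's latitudes.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of forester: sums a list of coordinates
// the same way the loop above does with latSum.
class CoordSum {
    static BigDecimal sum(List<BigDecimal> values) {
        BigDecimal total = BigDecimal.ZERO;
        for (BigDecimal v : values) {
            // BigDecimal is immutable, so add() returns a new object
            // and the result has to be assigned back.
            total = total.add(v);
        }
        return total;
    }

    public static void main(String[] args) {
        List<BigDecimal> lats = Arrays.asList(
                new BigDecimal("10.1"), new BigDecimal("10.2"));
        System.out.println(sum(lats)); // prints 20.3
    }
}
```

The assignment back to total is the part that's easy to forget: calling total.add(v) without using the return value leaves total unchanged.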

This last bit is just a convoluted way of getting the mean:

BigDecimal count = new BigDecimal(c);
BigDecimal meanLat = latSum.divide(count,BigDecimal.ROUND_CEILING);
BigDecimal meanLong = longSum.divide(count,BigDecimal.ROUND_CEILING);

dist.setLatitude(meanLat);
dist.setLongitude(meanLong);

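It's convoluted because we're dealing with BigDecimal objects: the count has to be wrapped in a BigDecimal before we can divide by it, and divide() needs a rounding mode because the quotient may not have an exact decimal representation.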
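
One thing worth knowing about the division step above: the two-argument form of divide() keeps the scale (number of decimal places) of the sum, and a ceiling rounding mode always rounds toward positive infinity. The sketch below shows the difference an explicit scale makes. The class name MeanSketch, the method mean, and the choice of 6 decimal places with HALF_UP rounding are all assumptions for illustration, not what the function in this post does.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical helper, not part of forester.
class MeanSketch {
    // Divides a coordinate sum by a count, keeping 6 decimal places
    // (an arbitrary choice for this sketch) and rounding half up.
    // Throws ArithmeticException if count is 0.
    static BigDecimal mean(BigDecimal sum, int count) {
        return sum.divide(BigDecimal.valueOf(count), 6, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        BigDecimal sum = new BigDecimal("20.3"); // 10.1 + 10.2

        // Two-argument form: the result keeps the scale of sum (1 decimal
        // place) and is rounded toward positive infinity.
        System.out.println(sum.divide(new BigDecimal(2), RoundingMode.CEILING)); // prints 10.2

        // Explicit scale and a symmetric rounding mode.
        System.out.println(mean(sum, 2)); // prints 10.150000
    }
}
```

Whether the extra precision matters depends on how many decimal places the input coordinates carry; with coarse inputs the two-argument form can shift a mean noticeably.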
