phyloGeoRef Project Blog

Monday, August 9, 2010

Crossing the 180

I've implemented an algorithm to deal with branches that cross the 180 meridian. The basic algorithm is as follows:

1. Get the arithmetic midpoint of the two longitudes (or the mean, if the node has more than 2 children).
2. Subtract the absolute value of the midpoint from 180.
3. If the original midpoint was negative, leave the value from step 2 as is; otherwise change its sign, so a positive midpoint becomes (180 - midpoint) * -1.

I only use this method if one of the descendants of a node has a different sign than the others and is less than -90 or between 90 and 180. Otherwise I just use the regular midpoint (or the regular mean, if the node has more than 2 children).
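The steps above can be sketched in a few lines of Java. This is only an illustration, not the phyloGeoRef code: the class and method names are made up, plain doubles stand in for BigDecimal, and the check for when to apply the wrap is a deliberately conservative assumption (mixed signs, and every longitude more than 90 degrees from Greenwich) rather than the exact rule described above.

```java
import java.util.Arrays;
import java.util.List;

public class AntimeridianMidpoint {

    // Plain arithmetic mean of the longitudes (step 1).
    static double plainMean(List<Double> longitudes) {
        double sum = 0.0;
        for (double lon : longitudes) {
            sum += lon;
        }
        return sum / longitudes.size();
    }

    // Steps 2 and 3: move the plain mean to the far side of the globe.
    static double wrappedMean(List<Double> longitudes) {
        double mean = plainMean(longitudes);
        double flipped = 180.0 - Math.abs(mean);
        // A negative mean keeps the positive result; otherwise the sign is changed.
        return (mean < 0) ? flipped : -flipped;
    }

    // Assumed (hypothetical) test for "this node straddles the 180 meridian":
    // both signs are present and every child is more than 90 degrees from 0.
    static boolean crossesAntimeridian(List<Double> longitudes) {
        boolean hasPositive = false;
        boolean hasNegative = false;
        for (double lon : longitudes) {
            if (Math.abs(lon) <= 90.0) {
                return false;
            }
            if (lon > 0) hasPositive = true;
            if (lon < 0) hasNegative = true;
        }
        return hasPositive && hasNegative;
    }

    // Picks the wrapped mean only when the children straddle the 180 meridian.
    static double nodeLongitude(List<Double> longitudes) {
        return crossesAntimeridian(longitudes) ? wrappedMean(longitudes) : plainMean(longitudes);
    }

    public static void main(String[] args) {
        // Children at 170 E and 150 W: the plain mean is 10, the wrapped mean is -170.
        System.out.println(nodeLongitude(Arrays.asList(170.0, -150.0)));
        // Children at 10 E and 10 W do not cross, so the plain mean is used.
        System.out.println(nodeLongitude(Arrays.asList(10.0, -10.0)));
    }
}
```

For two children the wrapped value lands on the shorter arc between them. With more than two children it is only an approximation, since a mean of raw longitudes does not account for where each child sits relative to the meridian.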

Monday, July 26, 2010

Cosmopolitan Trees

I've been playing around with globally expansive trees and noticed some funny behavior in Google Earth. When trees are very large, the lines pass through the Earth instead of following its curvature. To get curved lines I have to use tessellated lines. These are fine, but the altitude of these lines must be "clamped to earth." This means all 3-D structure of the tree is lost, i.e., it produces a flat tree along the planet's surface. I have come up with a few ways to solve this issue:

1. Use the 3-D lines for all trees but if they get too large add points along the line to elevate it so it doesn't pass through the earth. One drawback of this is there will be a lot of meaningless points in the KML and it adds clutter to the tree when viewed in Google Earth.

2. Use the 3-D style tree for compact trees and the tessellated lines for cosmopolitan trees. This is the approach I'm partial to, because I think cosmopolitan trees would look too cluttered with the 3-D style.

3. A hybrid system of 3-D lines for the branches that are close together and tessellated lines for far reaching branches. One problem I see with this is attaching a tessellated line to a 3-D line. The 3-D line would have an absolute altitude above the earth whereas the tessellated line would be clamped to the earth. This would add discontinuity to the tree.

4. Create a tree far above the earth's surface with "drop lines" to mark the position of the leaves.
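To make the two line styles concrete, here is a minimal sketch of how each might be written out as a KML LineString. It builds the XML by hand rather than through Jak, and the class name, method name, and sample coordinates are invented for illustration. The KML elements themselves (tessellate, altitudeMode, coordinates) are standard, and tessellation only takes effect when the altitude mode clamps the line to the ground.

```java
public class KmlLineSketch {

    // Builds one <LineString> between two points.
    // followSurface = true  -> tessellated line clamped to the ground (flat tree)
    // followSurface = false -> straight 3-D line at an absolute altitude in meters
    static String lineString(double lon1, double lat1, double lon2, double lat2,
                             double altitudeMeters, boolean followSurface) {
        StringBuilder kml = new StringBuilder("<LineString>\n");
        if (followSurface) {
            kml.append("  <tessellate>1</tessellate>\n");
            kml.append("  <altitudeMode>clampToGround</altitudeMode>\n");
        } else {
            kml.append("  <altitudeMode>absolute</altitudeMode>\n");
        }
        // KML coordinates are written as lon,lat,alt tuples separated by whitespace.
        kml.append("  <coordinates>")
           .append(lon1).append(',').append(lat1).append(',').append(altitudeMeters)
           .append(' ')
           .append(lon2).append(',').append(lat2).append(',').append(altitudeMeters)
           .append("</coordinates>\n");
        kml.append("</LineString>");
        return kml.toString();
    }

    public static void main(String[] args) {
        // A long branch that should hug the surface.
        System.out.println(lineString(-87.6, 41.9, 151.2, -33.9, 0.0, true));
        // A short branch drawn 50 km above the surface.
        System.out.println(lineString(-87.6, 41.9, -83.0, 42.3, 50000.0, false));
    }
}
```

The altitude values are ignored by Google Earth in the clamped case, which is why the tessellated tree comes out flat.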

Thursday, July 8, 2010

KML tree and working code

Trees now look decent; check them out on github. I'm continually adding little touch-ups, so watch for changes. I still have to fix some bugs, but it's getting close to being 100% functional (at least with the test files).

Friday, July 2, 2010

First tree

I've got a very basic tree to appear on Google Earth. It's not much to look at, but the code is finally starting to come together. I'm leaving for Chicago for the holiday weekend so I won't be able to get much done over the weekend, but I'll put the code up on github so you can have a look.

EDIT: the KML file testfile.kml in the phyloGeoRef directory on github has the test tree in it. Just copy that file into Google Earth and the tree should appear.

Thursday, July 1, 2010

KML placemarks working

I've managed to get the KML placemarks working. There is now a pin for each node. Next I will be working on connecting the placemarks and styling the nodes. Files will be up on github momentarily.

Monday, June 28, 2010

Jak working

I finally got Jak working (mostly: it compiles with errors, but if you ignore them the sample file works and you get a very basic KML file, but it's KML nonetheless). To get it to work, JAXB2.xxx must be installed following the instructions here. This is not ideal for my library, as I would like it to be self-contained and I don't want to make the user install other libraries in addition to phyloGeoRef. This may mean I'll have to find another KML writing library or write my own. Considering the time frame, I think it is still early enough that I would be willing to give writing my own KML writers a shot. This would cut into time I would otherwise have spent writing more parsers to support more filetypes. That doesn't mean more filetypes won't be supported; it just makes for more work. At the moment I feel motivated to take on this extra work, and I think I can still fit in additional parser writing. I'll work on it this week alongside my planned work on the tree location algorithms and see how it goes.

EDIT: The more I play around with Jak, the more I discover the cool things it can do. It might be a good idea to keep it after all.

Wednesday, June 23, 2010

Mean Position Algorithm Breakdown

I'm between World Cup matches so I'll take some time to explain how the mean node position function works. Here's the important part of the function:

if ( !node.isExternal() ) {
    // internal nodes have no distribution yet, so attach an empty one
    node.getNodeData().setDistribution(new Distribution(""));
    Distribution dist = data.getDistribution();

    for (int i = 0; i < node.getNumberOfDescendants(); i++) { // do mean calcs
        PhylogenyNode childNode = node.getChildNode(i);
        NodeData childData = childNode.getNodeData();
        Distribution childDist = childData.getDistribution();
        BigDecimal childLat = childDist.getLatitude();
        BigDecimal childLong = childDist.getLongitude();

        // remember each child's coordinates
        childCoordsLat.add(childLat);
        childCoordsLong.add(childLong);

        // running sums for the mean
        latSum = latSum.add(childLat);
        longSum = longSum.add(childLong);

        c++; // number of children seen so far

    }

    BigDecimal count = new BigDecimal(c);
    BigDecimal meanLat = latSum.divide(count, BigDecimal.ROUND_CEILING);
    BigDecimal meanLong = longSum.divide(count, BigDecimal.ROUND_CEILING);

    dist.setLatitude(meanLat);
    dist.setLongitude(meanLong);

Now to break it down.

if ( !node.isExternal() ) {
    node.getNodeData().setDistribution(new Distribution(""));
    Distribution dist = data.getDistribution();

This function only assigns positions to internal (non-external) nodes, so first we check to make sure the node is internal. Next, since this node doesn't have a distribution yet, we need to give it an empty one. Then we can get at the distribution of the node, here called dist.

Next we iterate through all the descendants of the node, collecting the lat and long of each descendant:

for (int i = 0; i < node.getNumberOfDescendants(); i++) {
    PhylogenyNode childNode = node.getChildNode(i);
    NodeData childData = childNode.getNodeData();
    Distribution childDist = childData.getDistribution();
    BigDecimal childLat = childDist.getLatitude();
    BigDecimal childLong = childDist.getLongitude();

    childCoordsLat.add(childLat);
    childCoordsLong.add(childLong);

Now we need the sum over all the children, so we simply add:

latSum = latSum.add(childLat);
longSum = longSum.add(childLong);

and keep track of how many descendants we've seen: c++;

This last bit is just a convoluted way of getting the mean:

BigDecimal count = new BigDecimal(c);
BigDecimal meanLat = latSum.divide(count,BigDecimal.ROUND_CEILING);
BigDecimal meanLong = longSum.divide(count,BigDecimal.ROUND_CEILING);

dist.setLatitude(meanLat);
dist.setLongitude(meanLong);

It's convoluted because we're dealing with BigDecimal objects so we have to do a bunch of stuff to keep the data types happy.
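For comparison, here is a self-contained sketch of the same mean calculation on a plain list of BigDecimal values. It is not the phyloGeoRef code: the class and method names are made up, and it swaps in the three-argument divide with an explicit scale and a RoundingMode. The two-argument form used above keeps the scale of the dividend, so the number of decimal places in the mean depends on the inputs, and the integer rounding constants have been deprecated since Java 9.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.Arrays;
import java.util.List;

public class BigDecimalMean {

    // Mean of a non-empty list, rounded to the given number of decimal places.
    static BigDecimal mean(List<BigDecimal> values, int scale) {
        BigDecimal sum = BigDecimal.ZERO;
        for (BigDecimal v : values) {
            sum = sum.add(v); // BigDecimal is immutable, so reassign the result
        }
        BigDecimal count = new BigDecimal(values.size());
        // An explicit scale avoids an ArithmeticException on non-terminating
        // quotients such as 5 / 3.
        return sum.divide(count, scale, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        List<BigDecimal> lats = Arrays.asList(new BigDecimal("41.9"), new BigDecimal("-33.9"));
        List<BigDecimal> longs = Arrays.asList(new BigDecimal("-87.6"), new BigDecimal("151.2"));
        System.out.println("mean lat:  " + mean(lats, 6));
        System.out.println("mean long: " + mean(longs, 6));
    }
}
```

Six decimal places is an arbitrary choice here; at that precision a coordinate is resolved to well under a meter, which is more than a tree drawn in Google Earth needs.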