Recap and State of the Project, Week 11

Before I get started: As part of my master’s degree, my program requires a write-up of my internship experience. You can see the complete product here, but here’s an excerpt and a YouTube video demoing what Phylet looks like right now:

Impact/Outcomes of Internship:

Deliverables: The outcome was a mixed bag. I made some real advances in the visualization, and I learned a lot, but my progress did not even kind of live up to what I had planned in the schedule I originally proposed in my internship application – I had no idea what I was getting into, and I knew it, but I was still surprised by how much longer every single step took. The visualization itself is not entirely ready for a wider audience but is fairly usable as a tool for researchers to create static images, which is exciting. I intend to continue working on the visualization in the future, but I wish that I could have had a finished product. I’m like a D3 Ikea – I constructed all the pieces, but someone else has to put in the screws.

Learning: This was my first real experience trying to code a big project, and it was a valuable one. It helped me understand that getting stuck and frustrated is not necessarily a sign of failure, or a sign that I’m not good enough – it’s just how coding new things works. Of course, I am still a novice programmer, so perhaps the roadblocks were more frequent than they would have been for someone with more experience. In exchange for the frustration, however, I get to be in a space where anything is theoretically possible and no one else really knows the right answer, either – I think it is a fair trade.

Implications: The Big Data paradigm is just beginning to leak into biology research, and being on the leading edge of that, however briefly, really felt like being on a frontier. Even my half-working Phylet visualization is expanding the toolset of researchers – how awesome is that? The vastness of the space that evolutionary biology works in – both in time and in the size and complexity of the tree of life – feels tailor-made for the tools of big data, but those tools are still underutilized. It’s exciting that my data-crunching skillset opens doors into scientific research that would otherwise be closed to me. I never thought I would like coding as much as I do, but it is just incredible to be able to make real contributions of consequence. It is addictive enough to keep me going through weeks of pounding my head against the coding wall.

Now for the report on the week’s progress:

I’m in the middle of training as a grad student instructor for the University, so I didn’t work quite a full week this week and have some hours to make up.

Did:

Took a stab at solving the multiple parents issue again. Apparently, everyone else who has come across this problem has solved it by making the layout into a force layout instead of a tree layout. I don’t really like this solution, but I felt obligated to at least give it a good try. Here’s a video of how the force viz behaves:

I finally got the shapes right, but I still really dislike how the physics of the force layout cause the tree to move. I don’t like how much the nodes move after each click (I don’t want them to move at all, really, unless they are directly involved in the click action), and each subtree gets all tangled up in the other subtrees.

[Animated GIF of the force layout]

(I just realized the gif-maker site I used made this into a non-looping gif. Apologies.)

Of course, there are constraints I can add to control these things somewhat, but the constraints that I did add took forever to figure out, and the way the viz behaves makes me suspect that my code was pretty buggy. In addition, I had a lot of trouble getting the viz to work with Neo4j-powered data – the viz in the video uses the ever-reliable stock data from D3, flare.json.

In short, there’s no reason that this can’t be a solution, but I don’t particularly like the product and would rather spend my time trying to hack the cluster layout to work with multiple parents. It was a valuable exercise, but I think it’s time to abandon this particular tangent.
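
One way to picture what a static layout with multiple parents might look like: the block below is only a minimal sketch in plain JavaScript, not Phylet's code and not D3's cluster layout. The `links` data and the `layerNodes` and `positionNodes` functions are made-up names for illustration. The idea is to store each node once, give it a depth one below its deepest parent, and compute fixed coordinates from that, so nothing drifts the way it does under force-layout physics.

```javascript
// Hypothetical input: one record per parent-child link, so a child may repeat.
const links = [
  { parent: "root", child: "A" },
  { parent: "root", child: "B" },
  { parent: "A", child: "C" },
  { parent: "B", child: "C" }, // C has two parents
  { parent: "C", child: "D" },
];

// Assign each node a depth equal to its longest path from any root,
// so a node with several parents is placed once, below all of them.
function layerNodes(links) {
  const parentsOf = new Map();
  const nodes = new Set();
  for (const { parent, child } of links) {
    nodes.add(parent);
    nodes.add(child);
    if (!parentsOf.has(child)) parentsOf.set(child, []);
    parentsOf.get(child).push(parent);
  }

  const depth = new Map();
  const visiting = new Set(); // guards against cycles in the input
  function depthOf(id) {
    if (depth.has(id)) return depth.get(id);
    if (visiting.has(id)) throw new Error("cycle detected at " + id);
    visiting.add(id);
    const parents = parentsOf.get(id) || [];
    const d = parents.length === 0
      ? 0
      : 1 + Math.max(...parents.map((p) => depthOf(p)));
    visiting.delete(id);
    depth.set(id, d);
    return d;
  }
  for (const id of nodes) depthOf(id);
  return depth;
}

// Turn depths into static coordinates: y comes from the layer, x from the
// node's position within its layer (evenly spaced across the width).
function positionNodes(links, width, layerHeight) {
  const depth = layerNodes(links);
  const layers = new Map();
  for (const [id, d] of depth) {
    if (!layers.has(d)) layers.set(d, []);
    layers.get(d).push(id);
  }
  const positions = new Map();
  for (const [d, ids] of layers) {
    ids.forEach((id, i) => {
      positions.set(id, {
        x: ((i + 1) * width) / (ids.length + 1),
        y: d * layerHeight,
      });
    });
  }
  return positions;
}

const positions = positionNodes(links, 400, 100);
```

With the sample data, C is placed once in the third row, below both A and B, and the coordinates only change if the layout is recomputed. Drawing would then mean one line per entry in `links` rather than one per tree edge.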

To do:

Same as last week. Keep puzzling out the multiple-parents problem and return to finishing the two alternate views described in last week’s update.
