Documentation on how to set up Neo4j and Phylet is here.
All code will be uploaded to bitbucket here by September 23, 11:59 p.m. EST: https://bitbucket.org/cdphelan/opw-phylet/src/0241cb5e64370faed48ec9cbbc4b77f9c2a3ef01/final_phylet?at=default [sorry – WordPress doesn’t like this as a link for some reason]
The Phylet visualization is now deployable as a visualization tool for researchers. It will probably be most useful to those researchers who are at least somewhat comfortable with fiddling with code. I primarily used the bird database (available as atol.db) in my testing, but it appears to be fairly flexible, as long as the database has a few attributes (given below) Phylet has three primary new features: the tree form, the “breadcrumb” visualization option, and the “backwards” visualization option.
1) Tree form: the visualization is now hierarchical , rather than the free-form force layout from this spring. All nodes start with the life node and branch downwards to a set level; species nodes drop down to a level below all other child nodes, to help with visual organization. Red links indicate conflict, while teal nodes indicate no conflict; the opacity of the link indicates how many sources agree on that connection. Node size correlates with the number of children it has. (It’s fairly easy to rotate this to open from left to right, rather than top to bottom; there are comments in the code to guide this somewhat. )
2) Breadcrumb visualization: accessible by right-clicking (or ctrl-click, on a Mac) on a node. In this mode, all expanded nodes collapse except for the clicked node, leaving a single trail from the clicked node back to the root node, Life. One can re-expand any node by left-clicking (with the exception of the bug, see below), or extending the trail by continuing to right-click.
3) Backwards visualization: accessible by shift-clicking a node. This will prompt a separate visualization to pop up that allows a user to explore all of the clicked node’s parents – the clicked node becomes a root node that opens from the bottom up (rather than top down). This functionality is somewhat limited by the multiple-parents problem (see bug report below).
Overall, the result of my summer of coding is what I’d call an “intermediate deliverable”; it is definitely useful, but its functionality is not as broad as it is intended to be. Originally, the intent was to make a visualization that was accessible to both a scientific and general audience, but some work still needs to be done before it can be used by someone without prior understanding of the data structure, without getting them hopelessly confused. The multiple-parent problem causes the nodes and links to act a little strangely, and the number of nodes that are involved in many of these databases makes the full tree fairly overwhelming, visually.
Sometimes I feel a little guilty – I got so much out of this experience that I feel like it’s not a fair trade; I could never give back as much as I got. Regardless, it was a wonderful experience and I’m glad I had the chance. Thank you to NESCent, Gnome, and particularly Stephen and Gabe, who were wonderful and supportive mentors.
I kept the visualization within the framework of the original Phylet visualization, but of the sake of simplicity, these are the main places where I made changes:
- CSS: various mods to nodes/links
- index.html: uses a static version of d3.min, rather than the ones online; also erased the highlight toggle option
- service.py: primarily used children() and parents(). In children(), there is a count that artificially limits the number of visible children; when the counter reaches a certain number, the loop breaks. You can comment this out/in to turn it off/on (see the code for details).
- gol.js: primarily this.start() and update() and the functions they call; of these, the most important ones that I added are trail(), toggle(), toggleOff(), and load_node().
I made no changes to the search or recreateAction functionalities; I believe they don’t work, perhaps never worked, but making them internet-proof is beyond my expertise. I don’t really know how people use search bars to break sites, for instance.
Requirements for Neo4j format:
The service expects several attributes in a Neo4j node:
- common_name (could easily be replaced with just name)
The scripts for adding all of these, if they are not already in the database, are available on my bitbucket. My scripts are written in Python, but Neo4j’s cypher language can also be used to do stree_children and stree_parents.
- Losing children in breadcrumb mode: This happens fairly regularly but I’ve not been able to isolate what causes it to happen. When in “breadcrumb” mode, a user should be able to re-expand nodes by clicking on them, but sometimes this doesn’t work – d3 loses the stored children nodes at some point.
- Window resizing: Minor, but the Phylet website is responsive and dynamically resizes the svg element with the window size; however, the d3 visualization does not resize, which causes the site to look very messy if it’s not exactly the right size. I didn’t attempt to fix this, as I think Gabe has been working on the website look & feel, and didn’t want to fix a problem that may no longer be relevant.
- Space concerns: Not as much of a problem for the research functionality, so it was a low priority and ended up never getting fixed, but the visualization would be fairly unwieldy for a general audience. The visualization requires a lot of space to open fully – there are just too many nodes in a lot of these databases – but I couldn’t find a solution that made Phylet less overwhelming, visually, without impeding its usability for a researcher. I coded, but did not implement, two potential solutions: artificially limiting the number of children a node can have, and creating an additional “load more” node that, upon being clicked, would load n number of nodes more.
- Speed concerns: Opening a node with a lot of children takes about 5-10 seconds, which is a pretty big lag if you don’t know what’s going on. I don’t know how fixable this is.