Visualization research brain-dump

As Phylet progresses, I’ve moved into the experimentation phase: looking for alternate visualization styles of the tree of life. The current Phylet visualization does a good job of showing conflicts (indications that one or more of the source taxonomies do not agree on how nodes relate). However, its readability is seriously restricted by its unordered format, which dissolves the structure of the tree and leaves a pretty but chaotic visualization.

From what I could find during preliminary research, life visualizations can be divided into three primary categories: network (unordered), tree (ordered), and circular (ordered). As is to be expected, there are some exceptions, but many of them do not improve on the tree’s readability, so I have put them to the side for now.

Unordered:

bird-phylet

The current Phylet visualization, showing a slightly-expanded bird database. The blue node is “Life,” the root node; red nodes are unresolved nodes (nodes with conflict) and green nodes are resolved nodes. Every node with a colored ring around it has more collapsed underneath.

This is what Phylet currently looks like. Its shape is that of a network; order is not incorporated into the visualization. Current visualization tools are built for networks, not ordered trees like the tree of life, which is probably the primary reason this style is used. It is not particularly readable, though it does make it slightly easier to visualize conflicts.
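A minimal Python sketch of how the red/green distinction described in the caption above might be computed. This is not Phylet’s actual code: the function name, the child-to-parent input format, and the sample data (including the invented disagreement over where Corvidae sits) are all assumptions for illustration.

```python
from collections import defaultdict

# Colors taken from the caption above.
COLORS = {"root": "blue", "resolved": "green", "unresolved": "red"}


def classify_nodes(source_taxonomies, root="Life"):
    """Map each node to "root", "resolved", or "unresolved".

    source_taxonomies: {source_name: {child: parent}} (hypothetical format).
    A node counts as unresolved when the sources give it different parents.
    """
    parents_seen = defaultdict(set)
    for child_to_parent in source_taxonomies.values():
        for child, parent in child_to_parent.items():
            parents_seen[child].add(parent)

    status = {root: "root"}
    for node, parents in parents_seen.items():
        if node != root:
            status[node] = "resolved" if len(parents) == 1 else "unresolved"
    return status


# Made-up sample data: the two sources disagree about Corvidae's parent.
sources = {
    "source_a": {"Aves": "Life", "Passeriformes": "Aves", "Corvidae": "Passeriformes"},
    "source_b": {"Aves": "Life", "Passeriformes": "Aves", "Corvidae": "Aves"},
}
status = classify_nodes(sources)
node_colors = {node: COLORS[kind] for node, kind in status.items()}
```

Here `node_colors["Corvidae"]` comes out red because the two sources give it different parents, while nodes the sources agree on come out green.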

Finding the root node in these visualizations necessitates careful searching. This visualization style is really only suited to conveying detailed information: at first glance, one gets the impression that a lot of meaning is packed into the image, but what that meaning is remains anything but clear. Clusters quickly become unreadably complex. Network visualizations are more useful in academic publications, and run the risk of scaring off a more casual observer.

The network visualization style isn’t a new invention, however: scientists were using it at least as early as 1802, when August Johann Georg Carl Batsch published this “table of affinity of the vegetable kingdom”:

old-network-viz

According to Theodore W. Pietsch in Trees of Life, this visualization has been described as being “much like a demented spider’s web”—an indication, perhaps, that network visualizations are not entirely satisfactory (Pietsch 26).

Pros: easy to visualize with existing visualization tools, especially conflicts; allows for clustering (see above), which could be useful (though I’m still too much of a newbie to know for sure).

Cons: unordered, difficult/impossible to discern structure; really only imparts information on a small/detailed scale and isn’t particularly useful as overview.

Tree:

The classic style of life visualization. It is important not to overlook the fact that art depicting the tree of life nearly always takes this form, an indication that it captures people’s attention and imagination.

It allows for order, which is very important for life visualizations: it is, after all, extremely useful to be able to distinguish between species, families, phyla, etc. at a glance. It is also well-suited to what I’ll call “zoomability”: letting a viewer get information at many levels of detail. One can get an understanding of the tree of life just by glancing at it, or by examining any of its branches, twigs, and leaves. This is an essential part of visualizing complex information well, as it makes it possible to present information in an approachable way without oversimplifying or dumbing it down.
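One way to think about “zoomability” in code is as a depth cutoff: show everything down to a chosen level, and summarize whatever is hidden below it. The Python sketch below is only an illustration of that idea; the function names, the parent-to-children dictionary format, and the deliberately tiny toy tree are assumptions, not anything Phylet does today.

```python
def count_leaves(tree, node):
    """Number of leaves (nodes without children) at or below `node`."""
    children = tree.get(node, [])
    if not children:
        return 1
    return sum(count_leaves(tree, child) for child in children)


def visible_at_depth(tree, root, max_depth):
    """Return [(node, depth, hidden_leaf_count)] for nodes within max_depth.

    hidden_leaf_count is 0 unless the node's children are cut off by the
    depth limit, in which case it is the number of leaves hidden beneath it.
    """
    rows = []
    stack = [(root, 0)]
    while stack:
        node, depth = stack.pop()
        children = tree.get(node, [])
        if depth == max_depth and children:
            rows.append((node, depth, count_leaves(tree, node)))
            continue
        rows.append((node, depth, 0))
        # Push in reverse so children come back out in their original order.
        for child in reversed(children):
            stack.append((child, depth + 1))
    return rows


# Toy tree (hypothetical and heavily simplified): parent -> ordered children.
TREE = {
    "Life": ["Bacteria", "Eukaryota"],
    "Eukaryota": ["Animalia", "Fungi", "Plantae"],
    "Animalia": ["Chordata", "Arthropoda"],
}
overview = visible_at_depth(TREE, "Life", 1)
```

With a cutoff of 1, the overview lists Life, Bacteria, and Eukaryota, and marks Eukaryota as hiding four leaves. Raising `max_depth` reveals more of the same tree without changing its order.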

For scientific researchers, there are a few difficulties with the tree form. Phylogenetic trees become problematic when summarizing multiple trees, particularly when dealing with horizontal gene transfer, incomplete lineage sorting, and hybridization; conflict between datasets also presents issues. In constructing a visualization that is usable for both general audiences and researchers, I will have to incorporate what’s important to researchers (retaining the original structure and source information) without risking information overload, which would erode Phylet’s usefulness.

In addition to the problems researchers run into, there are two potential downsides to such a visualization. First, a static image of a complete tree of life would be far too large to be practical. Such images are usually of only a small portion of the tree—for example, Ernst Haeckel’s “Pedigree of Man,” published in 1879.

 647px-Tree_of_life_by_Haeckel

Second, the tree visualization doesn’t lend itself particularly well to visualizing conflicts. Its great strength—its relative simplicity—is lessened when conflicting trees are layered on top of one another. The force layout of d3, which powers the current Phylet visualization, automatically arranges nodes in the most organized way it can; a tree layout would not arrange itself so nicely, a technical challenge that will have to be overcome.
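To give a sense of what an ordered tree layout has to compute, here is a minimal Python sketch of a simple dendrogram-style layout: leaves are spaced evenly in their given order, and each parent is centered over its first and last child. It is not d3’s algorithm, and it does nothing about overlaying conflicting trees; the function name and toy data are assumptions for illustration.

```python
def layout_tree(tree, root):
    """Return {node: (x, depth)} for an ordered tree.

    tree: {parent: [children in display order]} (hypothetical format).
    Leaves get x = 0, 1, 2, ... in order; a parent sits midway between
    its first and last child.
    """
    positions = {}
    next_leaf_x = 0

    def place(node, depth):
        nonlocal next_leaf_x
        children = tree.get(node, [])
        if not children:
            x = float(next_leaf_x)
            next_leaf_x += 1
        else:
            child_xs = [place(child, depth + 1) for child in children]
            x = (child_xs[0] + child_xs[-1]) / 2
        positions[node] = (x, depth)
        return x

    place(root, 0)
    return positions


# Toy tree (hypothetical and heavily simplified).
TREE = {
    "Life": ["Bacteria", "Eukaryota"],
    "Eukaryota": ["Animalia", "Fungi", "Plantae"],
}
positions = layout_tree(TREE, "Life")
```

Unlike a force layout, the result is fully determined by the order of the children, so the same tree always draws the same way. The recursion is fine for a sketch, but a very deep tree would need an iterative version.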

Ernst Haeckel, 1866.


Virtual, dynamic visualizations of the tree of life can at least partly overcome the limitations of the tree. An infinitely zoomable tree makes it easy to see a complete tree at all levels of detail. For practical reasons, such a visualization would not be able to load everything at once: the massive amount of data would crash a browser and make traversing the graph painfully slow. Several visualizations currently online take this zoomable approach. OneZoom, which was released to much fanfare, uses fractals:

one-zoom-out

I think the result is gorgeous, though I am not entirely sold on its practicality. OneZoom appears to be geared toward a general audience. One of the goals of the Phylet visualization—one that could be one of its greatest strengths, if done well—is to be versatile enough to be useful for both a general and a scientific audience. OneZoom’s design choices may exclude it from becoming a useful scientific tool, something for me to keep in mind.

OneZoom Pros: it can toggle between scientific and common names; when fully zoomed into a leaf, it provides a lot of information and a link to Wikipedia; I like the movement through time, though it is probably not practical for Phylet in the short term and possibly not useful for a scientific audience; each node shows the number of species that branch from it.

OneZoom Cons: the fractal design results in a lot of wasted space. Though avoiding information overload is a fine line to walk, it seems as though a little more information could fit at a wide scale. This also makes it difficult for a scientist to use it as a tool, as very little information is conveyed without zooming. Things that, while not necessarily negative, do not fit Phylet’s intended audience well: it does not incorporate conflicts; much of the content (including the colors) is focused on conservation status.

More information about OneZoom here and here.
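On the earlier point that a zoomable tree cannot load everything at once: one common way around this is to fetch a node’s children only when the viewer expands it, and to cache what has already been fetched. The Python sketch below illustrates that idea only. The class name, the `fetch_children` callable, and the in-memory stand-in for a server are all hypothetical; it says nothing about how OneZoom or the Open Tree of Life actually load data.

```python
class LazyTree:
    """Fetches a node's children on first expansion and caches them."""

    def __init__(self, fetch_children):
        # fetch_children: hypothetical callable, node -> iterable of children.
        self._fetch_children = fetch_children
        self._children = {}

    def expand(self, node):
        """Return the node's children, asking the backend only once."""
        if node not in self._children:
            self._children[node] = list(self._fetch_children(node))
        return self._children[node]

    def loaded_nodes(self):
        """Nodes whose children have been fetched so far."""
        return set(self._children)


# Stand-in for a server: a dictionary plus a log of "requests".
BACKEND = {
    "Life": ["Bacteria", "Eukaryota"],
    "Eukaryota": ["Animalia", "Fungi", "Plantae"],
}
requests_made = []


def fake_fetch(node):
    requests_made.append(node)
    return BACKEND.get(node, [])


lazy = LazyTree(fake_fetch)
lazy.expand("Life")
lazy.expand("Life")        # served from the cache, no second request
lazy.expand("Eukaryota")
```

Only the branches a viewer actually opens are ever requested, so the cost grows with what has been explored rather than with the size of the whole tree.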

The Open Tree of Life (which powers Phylet) also has a dynamic tree of life visualization, which can be found here. The visualization itself is minimalist, and does a good job of presenting the information in a straightforward way. However, it is considered a little boring:

otol-viz

The boredom factor is an important thing to keep in mind while designing Phylet: it is not enough to present the information. The information must also, to an extent, be fun to absorb. This isn’t a trivial or silly thing—making the information actually interesting imbues it with meaning that would otherwise be absent. I believe there may be other technical concerns with this viz—if I remember correctly, it may not scale up well.

OToL Pros: Simple, lightweight, easy to understand structure of tree; has space in right sidebar for a lot more information.

OToL Cons: Boring; no further information (such as common names) that would make it more meaningful to a non-expert; may not scale well; hard to look at a large portion of the tree because of space constraints; not entirely sure how well it visualizes conflicts, if at all.

Another tree visualization: DeepTree, which runs on a touch interface.

Circular:

The third major category is the circular style. Most examples are essentially trees rolled up into something more spatially efficient. One such example is iTOL, the Interactive Tree of Life.

circular-viz

Though it is certainly easier to view more data in a more compact space, I’m not a huge fan of this style. One of the big pitfalls of data visualization is using circles when not strictly necessary, as the way our brain processes them can cause problems (like comparing two circles of different sizes and appropriately understanding the difference in their scales). This is a different type of circular visualization, but I think it is still slightly more difficult to get oriented in this layout.

With a virtual tree, the primary advantage that I see with this style—its efficient use of space—is less important. In addition, it seems like scaling this up would present a technical difficulty. In the above visualization, the viz has to lay out everything, down to the species level, all at once. Perhaps this could be circumvented by having a series of concentric circles that a user can expand one at a time? In any case, I’d prefer to use a tree, though it’s good to have this one available as an example if things get too problematic with a tree for some reason.
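The “tree rolled up” idea, and the concentric-circle variant floated above, can be sketched in a few lines of Python: each depth level becomes a ring, and leaf order becomes an angle. This is a toy illustration under my own assumptions (function name, input format, sample data), not how iTOL or any other existing tool computes its layout.

```python
import math


def radial_layout(tree, root, ring_spacing=1.0):
    """Return {node: (x, y)}: depth sets the radius, leaf order sets the angle.

    tree: {parent: [children in display order]} (hypothetical format).
    """
    depth_of = {}
    leaves = []

    def visit(node, depth):
        depth_of[node] = depth
        children = tree.get(node, [])
        if not children:
            leaves.append(node)
        for child in children:
            visit(child, depth + 1)

    visit(root, 0)

    # Spread the leaves evenly around the full circle, in order.
    step = 2 * math.pi / len(leaves)
    angle_of = {leaf: i * step for i, leaf in enumerate(leaves)}

    def assign(node):
        children = tree.get(node, [])
        if children:
            angles = [assign(child) for child in children]
            # A parent sits midway between its first and last child.
            angle_of[node] = (angles[0] + angles[-1]) / 2
        return angle_of[node]

    assign(root)

    positions = {}
    for node, depth in depth_of.items():
        radius = depth * ring_spacing
        angle = angle_of[node]
        positions[node] = (radius * math.cos(angle), radius * math.sin(angle))
    return positions


# Toy tree (hypothetical and heavily simplified).
TREE = {
    "Life": ["Bacteria", "Eukaryota"],
    "Eukaryota": ["Animalia", "Fungi", "Plantae"],
}
positions = radial_layout(TREE, "Life")
```

Because each ring corresponds to one depth level, expanding one circle at a time would amount to limiting how deep the traversal goes before drawing.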

Pros: compact, uses space efficiently.

Cons: circular pattern not as intuitive; scaling up may present technical and design challenges.

Other:

One example that does not fall under the above categories is German naturalist Johann Jakob Kaup’s pentagram visualization. It was published in 1854, five years before the publication of Charles Darwin’s On the Origin of Species.

star-life-viz

Kaup’s depiction of the structure of the crow family is all but impossible to scale up in any particularly useful way, and the meaning of the structure is not immediately apparent, either.
