Visualizing the Specializations of Historians: An Experiment with Networks

This starburst is the result of an experiment in visualizing the relationships between history specializations. The circles (or nodes) represent specializations, and are colored and sized by the number of historians in my sample who claimed that specialization. The lines (or edges) represent connections between subfields, and are sized and colored by the number of historians who specialize in both of the connected specializations. The resulting network is beautifully dense with connections, and by plotting it in Tableau Public, we can explore this universe with filters, zooming, and highlighting. More information and instructions are below the viz. If the filters are slow, you can download the whole thing and open it on your own computer with a free copy of Tableau Reader.

The data is from the American Historical Association’s online directory, which I accessed in January 2015. In the directory, some 15,000 historians list up to three specializations. But this visualization breaks down each specialization even further. Here, a historian listed as specializing in the cultural history of modern Germany and in early modern Europe is in five categories—Germany, modern, cultural, early modern, and Europe.

Spec-Fig1

The largest circle belongs to the temporal category “modern,” which here includes all historians who listed “modern” in their specializations (over 3,000). The line between Germany and modern tells us (if we hover over that line on the dashboard, not on this static image) that 462 historians in this sample have both of these categories in their specializations. And if we wish, we can compare how many historians of Germany specialize in early modern as opposed to modern: the thicker and darker line shows a more frequent connection between Germany and Modern than between Germany and Early-Modern. Repeat this simple network for 15,000 historians, and you get the galaxy above.
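
The counting behind the circles and lines can be sketched in a few lines of Python. This is only an illustration of the idea, not the workflow actually used for the viz: the three sample historians are invented, and the function name `count_nodes_and_edges` is hypothetical. It assumes each historian has already been reduced to a set of category labels.

```python
from collections import Counter
from itertools import combinations

# Invented sample: each historian is a set of category labels.
historians = [
    {"Germany", "Modern", "Cultural", "Early Modern", "Europe"},
    {"Germany", "Modern"},
    {"China", "Modern"},
]

def count_nodes_and_edges(records):
    """Return (node_counts, edge_counts) for a list of category sets."""
    node_counts = Counter()
    edge_counts = Counter()
    for categories in records:
        # Node weight: how many historians claim each category.
        node_counts.update(categories)
        # Edge weight: how many historians claim both categories of a pair.
        # Sorting gives every unordered pair a single, stable key.
        for pair in combinations(sorted(categories), 2):
            edge_counts[pair] += 1
    return node_counts, edge_counts

node_counts, edge_counts = count_nodes_and_edges(historians)
print(node_counts["Modern"])               # 3
print(edge_counts[("Germany", "Modern")])  # 2
```

In this toy sample, “Modern” is the largest node, and the Germany–Modern edge is heavier than the Germany–Early Modern edge, which mirrors the comparison described above.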

This isn’t perfected by any stretch, and I really hope that users take the numbers as relative weights rather than exact counts of how many historians study a given topic or topic pair. I’d like to think of this viz as a proof-of-concept foray rather than a finished product. But I wanted to explore how to get beyond viewing these subjects in utter isolation (like I did here and here), and this diagram is encouraging. Isolation is the least of our problems in this visualization: the network is incredibly dense, even though I only included subjects that were listed by at least twenty historians and only included connections shared by at least two historians. Zoom into the center of the network to get an idea of how many lines radiate from each topic (zoom controls appear in the upper left if you hover over the image). Or use the filters to isolate a single topic. For example, what do environmental historians study in addition to the environment? Here’s what that looks like:

 

Spec-Fig2

It’s important to keep in mind that even though the node sizes change as we apply filters, the number of references to each topic in the data sample is not itself filtered. So here the node size of “Latin-America” is determined by the total number of Latin America historians, not by the number of historians who work in both environmental and Latin American history. For that, look to the size and color of the line connecting the two.
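
To make the distinction concrete, here is a minimal sketch of a hub-style filter. The function name `hub_view`, the `min_connections` parameter, and all of the counts are invented for illustration; they are not figures from the AHA data or the logic inside the Tableau workbook. Each neighbor keeps its unfiltered total (the node size), while the shared count (the line weight) comes from the pair.

```python
def hub_view(topic, node_counts, edge_counts, min_connections=2):
    """Return {neighbor: (neighbor_total, shared_count)} for one hub topic."""
    view = {}
    for (a, b), shared in edge_counts.items():
        # Keep only edges that touch the hub and meet the threshold.
        if topic not in (a, b) or shared < min_connections:
            continue
        neighbor = b if a == topic else a
        # Node size stays the neighbor's overall total;
        # only the edge weight reflects the overlap with the hub.
        view[neighbor] = (node_counts[neighbor], shared)
    return view

# Invented counts for illustration only.
node_counts = {"Environmental": 300, "Latin America": 900, "Canada": 150}
edge_counts = {
    ("Environmental", "Latin America"): 25,
    ("Canada", "Environmental"): 12,
    ("Canada", "Latin America"): 1,
}

view = hub_view("Environmental", node_counts, edge_counts)
print(view["Latin America"])  # (900, 25)
```

The first number in each tuple would drive the circle, and the second would drive the line.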

I was inspired to do this graph during my participation in Indiana University’s information visualization online course. Using the Sci2 tool, I placed the nodes with a force-directed layout, in which the circles act like charged particles, repelling each other, while the edges act like springs, pulling them together. There are fewer connections between Environmental and Development than between Environmental and Canada, but Development is drawn closer to the center by other connections (not visible if we filter only by Environmental).

So that’s my layperson’s understanding of force-directed diagrams. Placement is not exact, but not merely random or aesthetic. And I think bringing physics into the layout gives the visualization an inviting beauty. I’ve only just begun to explore the connections here, but I find something new each time I try a different filter.
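
For readers who want to see the physics metaphor in code, here is a toy spring-and-charge layout. It is a deliberately simplified sketch and not the DrL algorithm that Sci2 applies; the function name `force_layout`, its parameters, the starting positions, and the edge weights are all assumptions made for illustration.

```python
import math

def force_layout(start, edges, iterations=300, repulsion=0.01,
                 spring=0.05, step=0.1, max_move=0.1):
    """Toy force-directed layout: every pair repels, every edge pulls."""
    pos = {name: list(xy) for name, xy in start.items()}
    names = list(pos)
    for _ in range(iterations):
        force = {name: [0.0, 0.0] for name in names}
        # Charged particles: each pair of nodes pushes apart (inverse square).
        for i, a in enumerate(names):
            for b in names[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                dist = math.hypot(dx, dy) or 1e-9
                push = repulsion / dist ** 2
                force[a][0] += push * dx / dist
                force[a][1] += push * dy / dist
                force[b][0] -= push * dx / dist
                force[b][1] -= push * dy / dist
        # Springs: each edge pulls its endpoints together, scaled by weight.
        for (a, b), weight in edges.items():
            dx = pos[b][0] - pos[a][0]
            dy = pos[b][1] - pos[a][1]
            force[a][0] += spring * weight * dx
            force[a][1] += spring * weight * dy
            force[b][0] -= spring * weight * dx
            force[b][1] -= spring * weight * dy
        # Move each node a small, capped distance along its net force.
        for name in names:
            mx, my = step * force[name][0], step * force[name][1]
            length = math.hypot(mx, my)
            if length > max_move:
                mx, my = mx * max_move / length, my * max_move / length
            pos[name][0] += mx
            pos[name][1] += my
    return pos

def distance(pos, a, b):
    return math.hypot(pos[a][0] - pos[b][0], pos[a][1] - pos[b][1])

# Invented weights: a heavy Environmental-Canada edge, a light one to Development.
start = {"Environmental": (0.0, 0.0), "Canada": (1.0, 0.0), "Development": (0.0, 1.0)}
edges = {("Environmental", "Canada"): 5, ("Environmental", "Development"): 1}
pos = force_layout(start, edges)
print(distance(pos, "Environmental", "Canada"))
print(distance(pos, "Environmental", "Development"))
```

With only these two edges, the more heavily connected pair settles closer together. In the full network, a node like Development also feels the pull of its many other connections, which is why its final position cannot be read off a single edge.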

Removing less popular topics and topic pairs (number of occurrences for both set to minimum 100 in the example below) reveals the larger map’s underlying skeleton and a few distinct neighborhoods. The classical world in the lower left (the triangle formed by Greece, Rome, and Ancient) connects to the rest of the network via Medieval. The East Asia polygon is more spread out, with Japan pushed farther out than China and both tied to the center via Modern. An early US neighborhood is discernible at the top of the graph, and so on (you can see all this by hovering over the nodes in the interactive viz, above).

Spec-Fig3

Finally, I included a filter that reduces noise even further for focused looks at particular relationships (use “…or build a network by…”). The first example above uses this filter, but we can add as many nodes as we like. Below, for example, is the network formed by major topical categories: Class, Cultural, Economic, Military, and so on.

Spec-Fig5

With that base network in place, we can quickly add geographies and compare, for example, topical approaches to Germany vs. China. China specialists include more international relations scholars, while Germany specialists include more historians who also specialize in women and gender; both draw a good number of social historians. This is a lot easier to see on the interactive visualization, where these comparisons can be made quickly by gradually increasing the minimum number of connections.

Spec-Fig6

I hope someone finds this visualization interesting or useful, and I hope users will share what they find.

Tips on using the dashboard

  • If the online version is slow and unresponsive, you can download here and open the file in Tableau Reader, available for free here.
  • The zoom and selection controls appear in the upper left when you start moving the mouse pointer around. The Zoom Area tool is useful, but can distort the layout if not kept square. The Home button returns to the initial view.
  • You can pick three specializations to type into the filters, which will show each specialization as a hub at the center of all its associated specializations. To remove a filter, type “No Filter” into the box. There is a list of available topics in the box in the lower left or in the tabs above the dashboard.
  • The “build a network” filter allows you to pick multiple topics and show only those nodes you pick—in other words, not as a hub.
  • The Connections and Specializations filters can be used by sliding or by typing. Typing is a bit more precise. Click on a number to change it. There’s a small “remove filter” icon that appears when hovering over that area.
  • Undo and Reset buttons are at the bottom of the dashboard.
  • Selecting a circle in the diagram will filter the lists below, and hovering over the lists of connections will highlight the selection in the diagram. If the viz is slow or unresponsive, try downloading the viz and working with it on your own computer (download here, viewer here).
  • Selecting a line in the viz will filter the two lists of connections at the bottom, allowing you to see other connections and compare their relative weights. Here again, hovering over an entry will highlight it in the viz (sometimes this happens slowly).
  • Selecting a topic from the list on the bottom right will highlight that topic and filter the next two lists by connections. Hovering over those lists will illuminate those connections.
  • Click selections again to deselect, or click in a white space to clear all selections. Or use “reset.”
  • These interlocking filters can make it easy to get lost. Use the reset or undo button to back up!

About the data and the process

The data is all from the AHA’s online directory, and was downloaded using Google Sheets in January 2015. I then lightly massaged the specializations text so that we don’t have “diplomatic,” “foreign relations,” and “international relations,” for example, as separate categories. But I only did this in obvious cases (I think). Most of the text wrangling went on with the temporal categories: I decided to lump dates into whole centuries if a historian claimed a part of a century or a span that straddled two centuries. So “1877-1914” became “19th-century” and “20th-century.” This is questionable, I know, but since historians can often be so particular with their dates, I was ending up with lots of very small categories. However, I stayed faithful to descriptions like “modern” and “ancient” and did not force a set of years upon them.
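
The century lumping can be illustrated with a short Python sketch. The actual cleanup was done by hand in OpenRefine, so this is only an approximation of the rule: the function names are hypothetical, and it assumes that a year such as 1800 counts as 19th-century and that a span covers every century between its first and last year.

```python
import re

def ordinal(n):
    """Return 1 -> '1st', 2 -> '2nd', 11 -> '11th', 19 -> '19th', and so on."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"

def century_of(year):
    # Assumption: 1800-1899 is treated as the 19th century.
    return year // 100 + 1

def centuries_for_span(text):
    """Map a year or year span to century labels; leave other text alone."""
    years = [int(y) for y in re.findall(r"\d{3,4}", text)]
    if not years:
        # Labels like "modern" or "ancient" carry no years and stay as they are.
        return [text]
    first, last = century_of(min(years)), century_of(max(years))
    return [f"{ordinal(c)}-century" for c in range(first, last + 1)]

print(centuries_for_span("1877-1914"))  # ['19th-century', '20th-century']
print(centuries_for_span("modern"))     # ['modern']
```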

Also important to keep in mind is that I didn’t assign container categories to historians who didn’t list them. So if a historian only listed American South as a specialization, I did not list them under US, unless they also listed US as a specialization. I think it would be worth doing another analysis that forces hierarchical categories, but I haven’t undertaken that yet.

I did all this refining and reshaping of the data in OpenRefine, then used Sci2 to create a co-relational word network. I then applied a DrL (VxOrd) layout and exported the network, along with the x-y coordinates. After some more wrangling in OpenRefine, I imported the result into Tableau Public.
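
The layout of the file that went into Tableau is not described here, so the following is only one plausible shape for it, with hypothetical column names (`edge_id`, `path_order`, and so on) and made-up coordinates. It assumes the common approach of giving each edge two rows, one per endpoint, so that a line can be drawn between the two x-y positions. The count of 462 is the Germany and modern figure quoted earlier.

```python
import csv
import io

def edge_rows(positions, edge_counts):
    """Emit two rows per edge (one per endpoint) with layout coordinates."""
    rows = []
    for (a, b), shared in sorted(edge_counts.items()):
        edge_id = f"{a}|{b}"
        for order, node in enumerate((a, b), start=1):
            x, y = positions[node]
            rows.append({
                "edge_id": edge_id,     # groups the two endpoints of a line
                "path_order": order,    # 1 = start of the line, 2 = end
                "node": node,
                "x": x,
                "y": y,
                "shared": shared,       # number of historians claiming both
            })
    return rows

# Made-up coordinates standing in for the exported layout.
positions = {"Germany": (0.2, 0.5), "Modern": (0.0, 0.0)}
edge_counts = {("Germany", "Modern"): 462}

rows = edge_rows(positions, edge_counts)
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(rows[0]))
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```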

I welcome any questions!
