Update: Graph has now been made interactive.
I was originally put onto network visualisation by Simon Raper and his fantastic post graphing the history of philosophy. I’ve learned a lot in the last week and decided to be ambitious: I wanted to see what the entire network would look like, with everyone on Wikipedia. Well, everyone with an infobox containing ‘influences’ and/or ‘influenced by’. For those unfamiliar with his work, please see his post first, even if it is just for the pictures!
For those new to these types of graphs: the node size represents the number of connections. To create the following graph I used a database version of Wikipedia (DBpedia) to extract all the people with known influences. I then scaled the nodes by their degree of influence. The bigger the node, the bigger the influence that person had on the rest of the network. Nietzsche, Kant, Hegel, Hemingway, Shakespeare, Plato, Aristotle, Kafka, and Lovecraft all appear as the largest nodes. Clustered around these nodes are other personalities: thinkers and authors with similar connections. I used a module to highlight communities in different colours, which revealed sub-networks within the total structure. You’ll notice common themes amongst similarly coloured authors.
- First I queried Snorql and retrieved every person who had a registered ‘influence’ or registered ‘influenced by’ value (restricted to people only so if they were influenced by ‘anime’, they were excluded).
- I then decoded these using a neat little URL decoder and imported them into Microsoft Excel for further processing (removing things like ‘(Musician)’ and other similar syntax).
- ‘Influenced By’ entries were also included by reversing the order of influence. Duplicates found in the ‘influences’ list were then removed. This ensured a more complete dataset.
- I then exported these as a CSV file, imported it into Gephi and proceeded as usual: the Fruchterman-Reingold algorithm followed by Force Atlas 2. I then identified communities using ‘Modularity’ and edited the rest in Preview. Due to the size, I’ve had to zoom in and take snapshots of regions of interest.
- The csv file containing all of the data can be obtained here so you can make your own maps.
- The graph is obviously biased towards Western ideologies and culture. The people entering the information are, after all, primarily English speakers. It must be said: there are a great many influential people missing from the graph.
- The graph is created from the datasets of DBpedia and so is intrinsically incomplete. By exactly how much? Well, I need to run a few more tests. Many human endeavours are sadly missing.
- This work is just trying to demonstrate that by combining the power of new open-source tools with the vast quantity of the information on the Internet, one can create useful and informative networks.
- The community identification was done using an in-built module of Gephi so I apologise if you disagree with some of the groupings.
- What does the word influence actually mean? Material? Ideological? I for one don’t know the motivations behind the connections shown — I am simply using the relations entered into Wikipedia by its contributors. Please make those you share this graph with aware of this crucial point.
- I would like to compare this network to the Indiana Philosophy Ontology Project and the database at Freebase.
- I have already designed a poster version and if there is sufficient interest, I can make this available. Update: Posters are now available! Version 1 & Version 2. Now VERSION 3 (Poster Version)
- Explore other algorithms on faster PCs. I am limited by the processing power of my desktop and had to resort to using Force Atlas 2 in Gephi and good timing. On a faster machine, the other force algorithms might bring out the richness of the network in a more aesthetic manner, but this will have to do for now.
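For anyone who would rather script the clean-up than do it in Excel, the preparation steps in the list above (decoding the names, stripping qualifiers such as ‘(Musician)’, reversing the ‘influenced by’ pairs and removing duplicates) might look roughly like the sketch below. It is an illustration only, not the workflow used for this graph. The function names, the shape of the two input lists, and the `Source,Target` header for the exported edge list are all assumptions.

```python
import csv
import io
import re
from urllib.parse import unquote

# Matches a trailing parenthetical qualifier such as " (Musician)".
_QUALIFIER = re.compile(r"\s*\([^)]*\)\s*$")


def clean_name(raw):
    """Decode a URL-encoded resource name and drop a trailing qualifier."""
    name = unquote(raw).replace("_", " ")
    return _QUALIFIER.sub("", name).strip()


def build_edges(forward_pairs, reverse_pairs):
    """Merge both lists into unique (influencer, influenced) pairs.

    Assumed input shapes (hypothetical, adjust to your own export):
      forward_pairs: (influencer, influenced)
      reverse_pairs: (influenced, influencer), i.e. 'influenced by' rows
    """
    edges = set()  # a set removes duplicate pairs automatically
    for influencer, influenced in forward_pairs:
        edges.add((clean_name(influencer), clean_name(influenced)))
    for influenced, influencer in reverse_pairs:
        # Reverse the order so every edge points the same way.
        edges.add((clean_name(influencer), clean_name(influenced)))
    return sorted(edges)


def to_csv(edges):
    """Return the edge list as CSV text with a Source,Target header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target"])
    writer.writerows(edges)
    return buffer.getvalue()
```

One thing to watch for: stripping the qualifier can merge two different people who share a name, so it is worth checking for collisions before importing the file.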
I restricted the network to only include nodes which have 4 or more connections; otherwise my computer crashes trying to render the full network. It also lets you read the names of the people you are probably more interested in. Doing this, however, does remove many people from the network. Apart from this one selection, nothing else is altered.
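The same kind of cut can be sketched outside Gephi. The snippet below is a minimal illustration, assuming the edges are available as (source, target) pairs; the function names are hypothetical. It counts connections in both directions and measures them once on the full network, so it is a single-pass filter and does not recompute degrees after nodes are removed.

```python
from collections import Counter


def degrees(edges):
    """Count connections per node, ignoring edge direction."""
    counts = Counter()
    for source, target in edges:
        counts[source] += 1
        counts[target] += 1
    return counts


def filter_by_degree(edges, minimum=4):
    """Keep only edges whose two endpoints each have `minimum`+ connections.

    Degrees are taken from the full network and are not recalculated
    after the cut, so some surviving nodes may end up with fewer visible
    connections than `minimum`.
    """
    counts = degrees(edges)
    return [
        (source, target)
        for source, target in edges
        if counts[source] >= minimum and counts[target] >= minimum
    ]
```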
There are a few main communities (roughly):
- Red – 19th/20th century philosophers
- Green – antiquity & enlightenment philosophers
- Pink – enlightenment authors
- Yellow – 19th/20th century authors
- Orange – fiction authors
- Purple – comedians
These can be broken down into further categories, but to avoid flame-wars I’m not going to split the network up much further.
I get a real kick out of starting at one node and travelling down the connections to a distantly related someone else. People in philosophy influenced fantasy writers who influenced comedians. It shows one thing above all: the evolution of ideas is a non-linear process. We too are somewhere in this web, albeit at a smaller scale. We too are the sum of many.
In Gephi, it is far more interactive because you can highlight a node and every node it is connected to becomes highlighted. You can search any name you wish and it becomes highlighted within the network. The list goes on.
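If you download the CSV, this kind of node-to-node wandering can also be automated. Below is a small sketch using a breadth-first search to find the shortest chain of influence between two names. It assumes the edges are (influencer, influenced) pairs, and the function name is made up for illustration.

```python
from collections import deque


def influence_path(edges, start, goal):
    """Return the shortest chain from start to goal, or None if none exists.

    Edges are followed in their stated direction only.
    """
    neighbours = {}
    for source, target in edges:
        neighbours.setdefault(source, []).append(target)

    previous = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk back through `previous` to rebuild the chain.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for nxt in neighbours.get(node, []):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return None
```

Because the search respects direction, a path from a philosopher to a comedian does not imply one going the other way.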
Now for some pretty pictures. Click to enlarge as some text might be hard to read at the resolution set below. For best viewing, click here for the ability to dynamically zoom about the graph.
We now come to the most influential people of the landscape (largest nodes). Keep in mind the various biases involved here. The philosophy biographies on Wikipedia tend to have more detailed content (and info boxes) and so will naturally be larger nodes in the network.
An earlier generation of thinkers. Interestingly the bottom section of the network is essentially chronological – starting from the philosophers of antiquity (far left, green) and ending in the 20th century philosophers (far right, red).
Leave any questions or comments you might have in the comments section below. I’d love to hear of any suggestions for future networks.