Dr. Suat ATAN

Experiments on R, Python and Data Science

03 May 2020

Introduction to Network Analysis with R

Quick Start

Instead of theoretical explanations let’s just think about this scenario. You are monitoring e-mails between these persons A, B, C, D, E, F, G, H without details of the e-mails. Just the traffic of the emails. You have this table:

From To
A B
A C
D E
D F
E G
E H
A H
B H

We say directed because there is an arrow from A to B. This reflects there is a movement or directed relation like sending an e-mail or sending money or calling someone. Keep this in the mind, if there is “sending” this requires a “directed” graph. Of course, two things sometimes do not require direction in their relation. Think about friendship. There is no direction, you and your friend are just connected. This kind of relation is undirected.

#directed
library(igraph)
g <- graph_from_literal(A-+B)
plot(g)

#undirected
library(igraph)
g <- graph_from_literal(A--B)
plot(g)

Now, let’s try to show all relations. Of course, we are doing it literally. If you have this table you can use graph_from_data_frame() function. We are using graph_from_literal() function with a special notation. In directed graphs, edges will be created only if the edge operator includes a arrow head (‘+’) at the end of the edge:

# the line which starts the # are comment not code
# computers isn't so smart and cannot understand us
# except lines below :)
# g is our graph object

g <- graph_from_literal(A-+B,
                        A-+C,
                        D-+E,
                        D-+F,
                        E-+G,
                        E-+H,
                        A-+H,
                        B-+H)
# we are plotting it
plot(g)

Who is who? Understanding Network Metrics

OK. We have visualized our first network diagram. Before any calculation, the diagram is pretty self-explanatory. For instance, A looks like has never received any e-mail. C and F look like just ‘getter’ they didn’t send any e-mail. For this kind of easy network visualization simply works and maybe there is no need for any computation. However, when things are messy we should calculate the importance of each node. Of course, there are more metrics. But let’s keep things simple and go. Are you ready? We will calculate the degree metric of each person (or let’s remember nomenclature: node)

degree(g)
## A B C D E F G H 
## 3 2 1 2 3 1 1 3

That’s all. Each person has a number and each number reflects how many numbers the node has. As you know, the bigger number reflects a bigger interaction. Persons like F and G is not so “active”. Let’s fire them :)

comments powered by Disqus