-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathigraphPresentationFinal.Rmd
360 lines (189 loc) · 8.21 KB
/
igraphPresentationFinal.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
---
title: "igraphPresentationFinal"
author: "Sam Rosenblatt"
date: "12/5/2018"
output: html_document
---
```{r}
library(igraph)
library(igraphdata)
library(visNetwork)
data(package = "igraphdata")
data(foodwebs)
```
Why igraph?
* Pretty fast for most things
* Many other pachages use it
* There are a ton of companion packages. If you want to do something you are working with using igraph, and it doesnt seem like igraph can do it, or igraph does it poorly, chances are theres a package which lets you do it/ do it better better using igraph formats
* Versions exist in R, Python, Matlab, C/C++ so data and pseudocode can be easily transferred between languages
#What are networks?
- A representation of individuals/components of a system and their connections which is defined in a way that allows us to analyze them mathematically and computationally
- Vertices/nodes/actors : Individuals/components
- Edges/ links/ ties : connections between individuals/components
A "graph" in mathematics is simply a set of vertices paired with the corresponding set of edges. A "network" is that, in addition to the other variables and properties associated with the vertices and edges
Examples:
<p style="text-align:left;"><img src="NetworkExamples.png" width="680" height="500"></p>
* [Some Publicly Available Networks](https://icon.colorado.edu/#!/networks)
#Making a network from scratch
```{r}
g<-make_empty_graph( directed = FALSE)
g <- g + vertices('Bio', 'Eco', 'CS', 'Soc', 'Psych', 'Econ')
plot(g, vertex.label.dist=3) #Add a little label distance so the labels dont overlap with the nodes. Easier to see
```
Adding edges is simple
```{r}
g <- g + edges(c('CS','Bio', 'CS','Eco', 'CS','Econ', 'CS','Soc', 'CS','Psych'))
plot(g, vertex.label.dist=3)
```
You dont have to add all your edges at once
```{r}
g <- g + edges(c('Bio','Eco', 'Eco','Econ', 'Soc','Psych', 'Soc','Econ'))
plot(g, vertex.label.dist=3)
```
#Data structures of Networks
##Edge list
If you do not have any isolated nodes sitting unconnected, then just entering the list of connections is enough, without explicitly stating the nodes
```{r}
edgeList <- matrix( c('CS','Bio', 'CS','Eco', 'CS','Econ', 'CS','Soc', 'CS','Psych', 'Bio','Eco', 'Eco','Econ', 'Soc','Psych', 'Soc','Econ' ), nc = 2, byrow = TRUE)
print(edgeList)
edgeList2 <- as_edgelist(g, names = TRUE)
print(edgeList2)
gEdgeList<-graph_from_edgelist(edgeList, directed=FALSE)
plot(gEdgeList, vertex.label.dist=3)
gEdgeList2<-graph_from_edgelist(edgeList2, directed=FALSE)
plot(gEdgeList2, vertex.label.dist=3)
```
You can imagine how this would be useful if your data came in the form of a list of edges, as mine has in the past.
##Adjacency Matrix
```{r}
adjMatG <- as_adjacency_matrix(g) #get the adjacency matrix of g
print(adjMatG)
gAdjMat <- graph_from_adjacency_matrix(adjMatG, mode = "undirected") #create a new network from the adjacency matrix
plot(gAdjMat, vertex.label.dist=3)
```
In general, there are a million ways you can make the same network, and some are easier than others given the data you have and the format it is in. If one way is very hard, consider looking into another way.
#Directed Networks
Directed networks are networks where an a connection between two nodes points from one node to the other in a way where the direciton of the pointing is important. Often the direction implies that something flows in that direction
Examples:
- Food Webs: A points to B if the biomass of A transfers to B (B eats A)
- Needle-sharing network: A points to B if B uses a needle after A (pathogen could spread from A to B)
- Citation Networks: A points to B if A cites B (an idea passes from A to B)
Simple example of directed network:
```{r}
adjMat <-matrix(data = 0, nrow=7,ncol=7)
species <- c('coyote', 'vulture', 'snake', 'grass', 'bug', 'hawk', 'mouse')
rownames(adjMat) <- colnames(adjMat) <- species
print(adjMat)
adjMat["grass","bug"] <-1
adjMat["bug","hawk"] <-1
adjMat["grass","mouse"] <-1
adjMat["mouse","hawk"] <-1
adjMat["mouse","coyote"] <-1
adjMat["mouse","snake"] <-1
adjMat["mouse","vulture"] <-1
adjMat["hawk","coyote"] <-1
adjMat["coyote","vulture"] <-1
adjMat["snake","vulture"] <-1
print(adjMat)
foodWeb <- graph_from_adjacency_matrix(adjMat)
plot(foodWeb, vertex.label.dist=3)
is_weighted(foodWeb)
```
#Weighted Networks
Weighted networks are networks where edges come with "weight" which usually implies something about the "strength" or "length" of that edge.
Examples:
- Transportation networks: weights may imply distance
- Food web: weights can denote energy flux between species
- Water transfer networks: weights can denote rate of transfer
```{r}
#Check if your network is weighted
is_weighted(foodWeb)
adjMat["grass","bug"] <-0.2
adjMat["bug","hawk"] <-1
adjMat["grass","mouse"] <-0.8
adjMat["mouse","hawk"] <-0.2
adjMat["mouse","coyote"] <-0.2
adjMat["mouse","snake"] <-0.2
adjMat["mouse","vulture"] <-0.4
adjMat["hawk","coyote"] <-1
adjMat["coyote","vulture"] <-1
adjMat["snake","vulture"] <-1
print(adjMat)
foodWeb <- graph_from_adjacency_matrix(adjMat, weighted=TRUE)
plot(foodWeb, vertex.label.dist=3)
#Notice that nothing looks different
#Now check if your network is weighted again
is_weighted(foodWeb)
#Label the edges with weights
plot(foodWeb, vertex.label.dist=3, edge.label=E(foodWeb)$weight)
#Labelling edges is good when you need to know the exact values of weights, but it gets messy with big or dense networks(with alot of edges)
#Map edge weights to the widths of plotted edges
E(foodWeb)$width <- E(foodWeb)$weight
plot(foodWeb, vertex.label.dist=3)
#This is good except some edges are too thin
# Modify edge widths to make it look better
E(foodWeb)$width <- E(foodWeb)$weight*4
plot(foodWeb, vertex.label.dist=3)
```
In my example I weighted edges normalized to 1, to signify what percent of a population was being eaten by what, edges can be weighted in many different ways and need not be normalized to one. The method of weighting edges depends on the context of the system.
<p style="text-align:left;"><img src="foodNetworkExample.pdf" width="680" height="500"></p>
#Working with existing data
```{r}
fwc <- foodwebs$CrystalC
plot(fwc, layout=layout_as_tree, vertex.label.dist=1.5)
plot(fwc, layout=layout_as_tree, vertex.label = 1:length(V(fwc)))
visIgraph(fwc, layout="layout_as_tree")
```
# Vertex Attributes
Weight and direction are attributes of edges
Vertices also can have attributes.
So far we have used the vertex attribute "name", but vertices can have all sorts of attributes
```{r}
#Check current vertex attributes
vertex_attr_names(fwc)
```
##Continous vertex attributes
```{r}
V(fwc)$size <- V(fwc)$Biomass #note that we are setting a new vertex attribute
visIgraph(fwc, layout="layout_as_tree")
V(fwc)$size <- log1p(V(fwc)$Biomass)
visIgraph(fwc, layout="layout_as_tree")
```
##Categorical vertex attributes
```{r}
V(fwc)$color <- V(fwc)$ECO
visIgraph(fwc, layout="layout_as_tree")
```
#Network Analysis
Network analysis refers to analyzing existing networks
## Centrality
```{r}
#Degree
V(fwc)$size <- degree(fwc)
visIgraph(fwc, layout="layout_as_tree")
V(fwc)$size <- degree(fwc, mode="in")
visIgraph(fwc, layout="layout_as_tree")
V(fwc)$size <- degree(fwc, mode="out")
visIgraph(fwc, layout="layout_as_tree")
#Betweenness Centrality
V(fwc)$size <- log1p(betweenness(fwc))*5
visIgraph(fwc, layout="layout_as_tree")
#Closeness
#Calculated as the reciprocal of the sum of the length of the shortest paths between the node and all other nodes in the graph. Thus, the more central a node is, the closer it is to all other nodes.
V(fwc)$size <- closeness(fwc)*10^4
visIgraph(fwc, layout="layout_as_tree")
```
#Manipulating networks
```{r}
fwc <- foodwebs$CrystalC
fwc1 <- delete_vertices(fwc, c('Input','Output', "Respiration", 'detritus' ))
#V(fwc1)$size <-1
visIgraph(fwc, layout="layout_as_tree")
```
# Resources I used
- https://igraph.org/r/doc/
- https://jjborrelli.github.io/post/simple-food-web-plotting/
- http://tuvalu.santafe.edu/~aaronc/slides/Clauset_CSSS2014_Networks_1.pdf
- http://www2.unb.ca/~ddu/6634/Lecture_notes/Lec1_intro_handout.pdf
- https://www.rdocumentation.org/packages/visNetwork/versions/2.0.4/topics/visNetwork-igraph
- https://www.jessesadler.com/post/network-analysis-with-r/