The hardware and bandwidth for this mirror is donated by METANET, the Webhosting and Full Service-Cloud Provider.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]metanet.ch.
library(igraph)
library(gsbm)
library(missSBM)
#> Warning: le package 'missSBM' a été compilé avec la version R 4.1.2
library(RColorBrewer)
Les Misérables characters network, encoding interactions between characters of Victor Hugo’s novel, was first created by Donald Knuth as part of the Stanford Graph Base (https://people.sc.fsu.edu/~jburkardt/datasets/sgb/sgb.html). It contains 77 nodes corresponding to characters of the novel, and 254 vertices connecting two characters whenever they appear in the same chapter.
data(les_miserables)
<- les_miserables$A
A<- les_miserables$names
names <- graph_from_adjacency_matrix(A, mode = "undirected")
net V(net)$name <- names
V(net)$color <- "gray80"
<- degree(net, mode="all")
deg V(net)$size <- deg
plot(net, vertex.label.cex = 0.4)
We fit a classical SBM to the graph and represent the graph with nodes proportional to their degrees and colored by community assignment. The number of communities has been selected so as to minimize the ICL criterion.
<- 1:10
vBlocks <- missSBM::estimateMissSBM(A, vBlocks, "node")
collection_sbm #>
#>
#> Adjusting Variational EM for Stochastic Block Model
#>
#> Imputation assumes a 'node' network-sampling process
#>
#> Initialization of 10 model(s).
#> Performing VEM inference
#> Model with 10 blocks.
8 blocks.
Model with 1 blocks.
Model with 2 blocks.
Model with 7 blocks.
Model with 5 blocks.
Model with 6 blocks.
Model with 9 blocks.
Model with 4 blocks.
Model with 3 blocks.
Model with #> Looking for better solutions
#> Pass 1 Going forward +++++++++
1 Going backward +++++++++
Pass
<- round(collection_sbm$bestModel$fittedSBM$probMemberships)
colo <- sapply(1:nrow(A), function(i) which.max(colo[i,]))
colo <- brewer.pal(10, "Set3")
pal3 V(net)$color <- pal3[colo]
V(net)$label <- NA
V(net)$size <- deg
plot(net)
We observe that the main character Jean Valjean is alone in his community, and one of the clusters groups important characters (Thénardier, Éponine, Javert).
The Generalized stochastic Block Model accounts for outlier profiles (hubs, mixed memberships). In this model, nodes are divided into two sets: the inliers which follow a classical SBM, and the outliers, for which we make no assumptions on the connectivity model. These two sets are unknown a priori and are learned automatically by our procedure. Below we represent the result of the clustering, with the detected outliers indicated in red. They correspond to hubs (large center node, Jean Valjean) and nodes with mixed memberships (e.g. smaller central nodes with connections to several clusters).
<- 4
lambda1 <- 5
lambda2 <- gsbm_mcgd(A, lambda1 = lambda1, lambda2 = lambda2)
res <- names[which(colSums(res$S)>0)]
outliers <- svd(res$L)
sv <- sv$u[,1:4]
pc rownames(pc) <- names
<- pc[setdiff(names, outliers),]
pc <- kmeans(pc, centers=4, nstart=50)
com $cluster
com#> Napoleon MlleBaptistine MmeMagloire CountessDeLo
#> 4 4 4 4
#> Geborand Champtercier Cravatte Count
#> 4 4 4 4
#> OldMan Labarre Marguerite MmeDeR
#> 4 4 4 4
#> Isabeau Gervais Tholomyes Listolier
#> 4 4 3 3
#> Fameuil Blacheville Favourite Dahlia
#> 3 3 3 3
#> Zephine MmeThenardier Fauchelevent Bamatabois
#> 3 4 4 2
#> Perpetue Simplice Scaufflaire Woman1
#> 4 4 4 4
#> Judge Champmathieu Brevet Chenildieu
#> 2 2 2 2
#> Cochepaille Pontmercy Boulatruelle Eponine
#> 2 4 4 4
#> Anzelma Woman2 MotherInnocent Gribier
#> 4 4 4 4
#> Jondrette MmeBurgon Gillenormand Magnon
#> 4 4 4 4
#> MlleGillenormand MmePontmercy MlleVaubois LtGillenormand
#> 4 4 4 4
#> BaronessT Mabeuf Enjolras Combeferre
#> 4 1 1 1
#> Prouvaire Feuilly Courfeyrac Bahorel
#> 1 1 1 1
#> Bossuet Joly Grantaire MotherPlutarch
#> 1 1 1 4
#> Gueulemer Babet Claquesous Montparnasse
#> 4 4 4 4
#> Toussaint Child1 Child2 Brujon
#> 4 4 4 4
#> MmeHucheloup
#> 1
<- 1:nrow(A)
colo2 names(colo2) <- names
<- com$cluster
comu which(comu==4)] <- 6
comu[setdiff(names, outliers)] <- pal3[comu]
colo2[<- "red"
colo2[outliers] <- names(A)
labels names(labels) <- names(A)
setdiff(names, outliers)] <- NA
labels[V(net)$label <- NA
V(net)$color <- colo2
V(net)$size <- deg
E(net)$arrow.size <- 5
plot(net, vertex.label.dist=20)
These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.