This repository was archived by the owner on Apr 22, 2020. It is now read-only.

Commit 40efdf2

Mark Needham (mneedham) authored and committed
Harmonic (#543)
* harmonic
* tweaking harmonic and closeness
* tidy up how we display the formula
* better explanation (I think)
* add normalized harmonic
* more accurate like this
* rewording why you might use harmonic centrality
1 parent 2aaffd1 commit 40efdf2

3 files changed: 57 additions and 50 deletions

doc/closeness-centrality.adoc

Lines changed: 21 additions & 12 deletions
@@ -11,22 +11,31 @@ This is important for the availability of knowledge and resources.
1111
== History, Explanation
1212

1313
// tag::explanation[]
14-
Sabidussi (1966) described the sum of the shortest path distances from one node to every other node as the node’s farness.
14+
Sabidussi (1966) described the sum of the shortest path distances from one node to every other node as the node’s farness.
1515
Freeman (1979) used this idea to define closeness centrality of a node as the inverse of Sabidussi’s farness.
1616

1717
Closeness centrality is defined as the total number of relationships separating a node from all others along the shortest possible paths.
1818

1919
The algorithm operates as follows:
2020

2121
* calculate the shortest path for each pair of nodes in the graph
22-
* for each node sum the total distance from the node to all other nodes.
22+
* for each node sum the distance from the node to all other nodes based on those shortest paths
2323

24-
The greater the *raw closeness centrality*, the longer it takes for information originating at random points in the graph to arrive at the node.
24+
The *raw closeness centrality* for a node is then calculated using the following formula:
25+
26+
`raw closeness centrality(node) = 1 / sum(distance from node to all other nodes)`
27+
28+
The greater the raw closeness centrality, the longer it takes for information originating at random points in the graph to arrive at the node.
2529
We could also interpret closeness as the potential ability of a node to reach all other nodes as quickly as possible.
2630

27-
It is important to note that *raw closeness centrality* is an inverse measure of centrality.
31+
It is important to note that raw closeness centrality is an inverse measure of centrality.
2832
i.e. nodes with smaller scores that are the most central.
2933
Our algorithm returns a *normalized closeness centrality* score where nodes with a higher score are more central.
34+
35+
The formula for *normalized closeness centrality* is as follows:
36+
37+
`normalized closeness centrality(node) = (number of nodes - 1) / sum(distance from node to all other nodes)`
38+
3039
// end::explanation[]
3140

3241
== When to use it / use-cases
@@ -111,9 +120,9 @@ k/S| 0.4 0.57 0.67 0.57 0.4 // normalized closeness centrality
111120
.Running algorithm and writing back results
112121
[source,cypher]
113122
----
114-
CALL algo.closeness(label:String, relationship:String,
115-
{write:true, writeProperty:'centrality',graph:'heavy', concurrency:4})
116-
YIELD nodes,loadMillis, computeMillis, writeMillis
123+
CALL algo.closeness(label:String, relationship:String,
124+
{write:true, writeProperty:'centrality',graph:'heavy', concurrency:4})
125+
YIELD nodes,loadMillis, computeMillis, writeMillis
117126
- calculates closeness centrality and potentially writes back
118127
----
119128

@@ -145,7 +154,7 @@ YIELD nodes,loadMillis, computeMillis, writeMillis
145154
.Running algorithm and streaming results
146155
[source,cypher]
147156
----
148-
CALL algo.closeness.stream(label:String, relationship:String,{concurrency:4})
157+
CALL algo.closeness.stream(label:String, relationship:String,{concurrency:4})
149158
YIELD nodeId, centrality - yields centrality for each node
150159
----
151160

@@ -164,7 +173,7 @@ YIELD nodeId, centrality - yields centrality for each node
164173
|===
165174
| name | type | description
166175
| node | long | node id
167-
| centrality | float | closeness centrality weight
176+
| centrality | float | closeness centrality weight
168177
|===
169178

170179
== Cypher projection
@@ -178,7 +187,7 @@ Set `graph:'cypher'` in the config.
178187
include::scripts/closeness-centrality.cypher[tag=cypher-loading]
179188
----
180189

181-
== Versions
190+
== Versions
182191

183192
We support the following versions of the closeness centrality algorithm:
184193

@@ -190,7 +199,7 @@ We support the following versions of the closeness centrality algorithm:
190199

191200
** Only with cypher projection
192201

193-
* [ ] undirected, weighted
202+
* [ ] undirected, weighted
194203

195204

196205
== References
@@ -217,7 +226,7 @@ ifdef::implementation[]
217226
:leveloffset: +1
218227
// copied from: https://github.com/neo4j-contrib/neo4j-graph-algorithms/issues/99
219228

220-
_Closeness Centrality_ of a node is a measure of centrality in a network, calculated as the sum of the length of the shortest paths between the node and all other nodes in the graph.
229+
_Closeness Centrality_ of a node is a measure of centrality in a network, calculated as the sum of the length of the shortest paths between the node and all other nodes in the graph.
221230
Thus the more central a node is, the closer it is to all other nodes.
222231

223232
== Details
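
The formulas added to the explanation section of this file can be checked with a small sketch. The Python below is illustrative only and is not the library's implementation: `bfs_distances`, `closeness`, and the node names are hypothetical, and the five-node path graph is an assumption chosen because it reproduces the `k/S` row (0.4, 0.57, 0.67, 0.57, 0.4) quoted in the hunk header above. It assumes an undirected, unweighted, connected graph.

```python
from collections import deque


def bfs_distances(graph, source):
    """Return hop counts from source to every node reachable from it."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for neighbour in graph[current]:
            if neighbour not in distances:
                distances[neighbour] = distances[current] + 1
                queue.append(neighbour)
    return distances


def closeness(graph, node):
    """Return (raw, normalized) closeness as written in the doc formulas."""
    # The source itself contributes a distance of 0 to the sum.
    farness = sum(bfs_distances(graph, node).values())
    if farness == 0:
        # Isolated node: the formulas are undefined, so report 0 (an assumption).
        return 0.0, 0.0
    raw = 1 / farness
    normalized = (len(graph) - 1) / farness
    return raw, normalized


# Assumed sample graph: the undirected path A - B - C - D - E.
path = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}

for name in path:
    raw, normalized = closeness(path, name)
    print(name, round(raw, 3), round(normalized, 2))
```

Under the formula as written, a larger raw score corresponds to a smaller total distance, so with this definition the middle node `C` gets the highest raw and normalized scores.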

doc/harmonic-centrality.adoc

Lines changed: 32 additions & 36 deletions
Original file line numberDiff line numberDiff line change
@@ -1,47 +1,49 @@
11
= Harmonic Centrality
22

33
// tag::introduction[]
4-
The harmonic mean has been known since the time of Pythagoras and Plato as the mean expressing “harmonious and tuneful ratios”, and later has been employed by musicians to formalize the diatonic scale, and by architects as paradigm for beautiful proportions.[1]
4+
Closeness centrality works best on connected graphs - results may be confused if there are multiple connected components.
55

6-
Social network analysis is a rapid expanding interdisciplinary field, growing from work of sociologists, physicists, historians, mathematicians, political scientists, etc.
7-
Some methods have been commonly accepted in spite of defects, perhaps because of the rareness of synthetic work like (Freeman, 1978; Faust & Wasserman, 1992).
8-
Harmonic centrality was proposed as an alternative index of closeness centrality defined on undirected networks.
9-
Results show its computation on real cases are identical to those of the closeness centrality index, with same computational complexity and we give some interpretations.
10-
An important property is its use in the case of unconnected networks.[2]
6+
Harmonic centrality (also known as 'valued centrality') is a variant of closeness centrality that addresses this problem.
7+
As with many of the centrality algorithms, it originates from the field of social network analysis.
118
// end::introduction[]
129

1310

1411
== History, Explanation
1512

1613
// tag::explanation[]
17-
Main problem of closeness centrality lies in the presence of pairs of unreachable nodes.
18-
We get inspiration from Marchiori and Latora [1]: faced with the problem of providing a sensible notion of “average shortest path” for a generic directed network, they propose to replace the average distance with the harmonic mean of all distances (i.e., the n(n − 1) distances between every pair of distinct nodes).
19-
Indeed, in case a large number of pairs of nodes are not reachable, the average of finite distances can be misleading: a graph might have a very low average distance while it is almost completely disconnected (e.g., a perfect matching has average distance 1/2).
20-
The harmonic mean has the useful property of handling ∞ cleanly.
21-
For example, the harmonic mean of distances of a perfect matching is n − 1: in fact, for every node there is exactly another node at a non-infinite distance, and its distance is 1; so the sum of the inverse of all distances is n, making the harmonic average equal to n(n − 1)/n = n − 1.
22-
In general, for each graph-theoretical notion based on arithmetic averaging or maximization there is an equivalent notion based on the harmonic mean.
23-
If we consider closeness the reciprocal of a denormalized average of distances, it is natural to consider also the reciprocal of a denormalized harmonic mean of distances.[3]
24-
25-
The difference with might seem minor, but actually it is a radical change.
26-
Harmonic centrality is strongly correlated to closeness centrality in simple networks, but naturally also accounts for nodes y that cannot reach x.
27-
Thus, it can be fruitfully applied to graphs that are not strongly connected.[3]
14+
Harmonic centrality was proposed by Marchiori and Latora [1] while trying to come up with a sensible notion of "average shortest path".
15+
They suggested replacing the average distance calculation from the closeness centrality algorithm with the harmonic mean of all distances.
16+
17+
The algorithm operates as follows:
18+
19+
* calculate the shortest path for each pair of nodes in the graph
20+
* for each node determine the distance from the node to all other nodes based on those shortest paths
21+
22+
The *raw harmonic centrality* for a node is then calculated using the following formula:
23+
24+
`raw harmonic centrality(node) = sum(1 / distance from node to every other node excluding itself)`
25+
26+
As with closeness centrality we can also calculate a *normalized harmonic centrality* with the following formula:
27+
28+
`normalized harmonic centrality(node) = sum(1 / distance from node to every other node excluding itself) / (number of nodes - 1)`
29+
30+
The advantage of harmonic centrality is that infinite distances are handled cleanly: an unreachable node contributes 1/∞ = 0 to the sum.
31+
Harmonic centrality and closeness centrality will often come up with similar results, but harmonic centrality can handle graphs that aren't connected. [3]
2832

2933
Harmonic centrality was proposed independently by Dekker (2005)[4], using the name "valued centrality," and by Rochat (2009)[2].
3034
// end::explanation[]
3135

3236
== When to use it / use-cases
3337

3438
// tag::use-case[]
35-
Because harmonic centrality was proposed as an alternative to closeness centrality, they have similar use cases.
36-
As an example, one can consider identifying the location within a city where to place a new public service, so that it is easily accessible for everyone.
37-
Similarly, identifying central people that have ideal social network location for the purpose of information dissemination or network influence.
38-
In such kind of applications, the nodes who can access the entire network faster need to be selected.
39-
As mentioned above it can be fruitfully applied to graphs that are not connected.
39+
Harmonic centrality was proposed as an alternative to closeness centrality, and therefore has similar use cases.
40+
41+
For example, we might use it if we're trying to identify where in the city to place a new public service so that it's easily accessible for residents.
42+
If we're trying to spread a message on social media we could use the algorithm to find the key influencers that can help us achieve our goal.
4043
// end::use-case[]
4144

4245
== Constraints / when not to use it
4346

44-
4547
// tag::constraint[]
4648

4749
// end::constraint[]
@@ -107,8 +109,8 @@ of each cell and multiply by 1/(n-1)
107109
[source,cypher]
108110
----
109111
CALL algo.harmonic(label:String, relationship:String,
110-
{write:true, writeProperty:'centrality',graph:'heavy', concurrency:4})
111-
YIELD nodes,loadMillis, computeMillis, writeMillis
112+
{write:true, writeProperty:'centrality',graph:'heavy', concurrency:4})
113+
YIELD nodes,loadMillis, computeMillis, writeMillis
112114
- calculates harmonic centrality and potentially writes back
113115
----
114116

@@ -140,7 +142,7 @@ YIELD nodes,loadMillis, computeMillis, writeMillis
140142
.Running algorithm and streaming results
141143
[source,cypher]
142144
----
143-
CALL algo.harmonic.stream(label:String, relationship:String,{concurrency:4})
145+
CALL algo.harmonic.stream(label:String, relationship:String,{concurrency:4})
144146
YIELD nodeId, centrality - yields centrality for each node
145147
----
146148

@@ -158,7 +160,7 @@ YIELD nodeId, centrality - yields centrality for each node
158160
|===
159161
| name | type | description
160162
| node | long | node id
161-
| centrality | float | closeness centrality weight
163+
| centrality | float | closeness centrality weight
162164
|===
163165

164166
== Cypher projection
@@ -172,21 +174,21 @@ Set `graph:'cypher'` in the config.
172174
include::scripts/harmonic-centrality.cypher[tag=cypher-loading]
173175
----
174176

175-
== Versions
177+
== Versions
176178

177179
We support the following versions of the harmonic centrality algorithm:
178180

179181
* [*] undirected, unweighted
180182

181-
* [ ] undirected, weighted
183+
* [ ] undirected, weighted
182184

183185

184186
== References
185187

186188
// tag::references[]
187189
* [1] https://arxiv.org/pdf/cond-mat/0008357.pdf
188190

189-
* [2] \https://infoscience.epfl.ch/record/200525/files/[[EN]]ASNA09.pdf?
191+
* [2] \https://infoscience.epfl.ch/record/200525/files/[EN]ASNA09.pdf?
190192

191193
* [3] https://arxiv.org/pdf/1308.2140.pdf
192194

@@ -212,9 +214,3 @@ ifdef::implementation[]
212214

213215
// end::implementation[]
214216
endif::implementation[]
215-
216-
217-
218-
219-
220-
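
The raw and normalized harmonic formulas added in this file can be sketched in the same way. Again this is a hedged illustration rather than the library's code: `bfs_distances`, `harmonic`, and the two-component sample graph are hypothetical, and the graph is assumed to be undirected and unweighted. Nodes that cannot be reached are simply absent from the BFS result, which is equivalent to adding 1/∞ = 0 for each of them.

```python
from collections import deque


def bfs_distances(graph, source):
    """Return hop counts from source to every node reachable from it."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for neighbour in graph[current]:
            if neighbour not in distances:
                distances[neighbour] = distances[current] + 1
                queue.append(neighbour)
    return distances


def harmonic(graph, node):
    """Return (raw, normalized) harmonic centrality as written in the doc formulas."""
    distances = bfs_distances(graph, node)
    # Sum 1/distance over every other reachable node; the node itself is excluded.
    raw = sum(1 / d for target, d in distances.items() if target != node)
    # Single-node graphs have no "other" nodes, so report 0 (an assumption).
    normalized = raw / (len(graph) - 1) if len(graph) > 1 else 0.0
    return raw, normalized


# Assumed sample graph with two components: A - B - C and D - E.
disconnected = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B"],
    "D": ["E"],
    "E": ["D"],
}

for name in disconnected:
    print(name, harmonic(disconnected, name))
```

In this sketch `A` reaches `B` at distance 1 and `C` at distance 2, so its raw score is 1 + 1/2 = 1.5 and its normalized score is 1.5 / 4 = 0.375, even though `D` and `E` are unreachable.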

doc/scripts/harmonic-centrality.cypher

Lines changed: 4 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -14,7 +14,9 @@ CREATE (a)-[:LINK]->(b),
1414
// tag::stream-sample-graph[]
1515

1616
CALL algo.harmonic.stream('Node', 'LINKS') YIELD nodeId, centrality
17-
RETURN nodeId,centrality order by centrality desc limit 20;
17+
RETURN nodeId,centrality
18+
ORDER BY centrality DESC
19+
LIMIT 20;
1820

1921
// end::stream-sample-graph[]
2022

@@ -33,4 +35,4 @@ CALL algo.harmonic(
3335
{graph:'cypher', writeProperty: 'centrality'}
3436
);
3537

36-
// end::cypher-loading[]
38+
// end::cypher-loading[]
