Commit 3e25367

add desc for clustering coefficient algorithm (#22)
* add desc for clustering coefficient algorithm
* update grammar
1 parent 02a18ce commit 3e25367

2 files changed, +19 -11 lines changed


README-CN.md

Lines changed: 10 additions & 6 deletions
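@@ -16,6 +16,7 @@ nebula-algorithm is a [GraphX](https://spark.apache.org/graphx/)-based
 | GraphTriangleCount | whole-graph triangle count | network tightness analysis |
 | BetweennessCentrality | betweenness centrality | key node mining, node influence calculation |
 | DegreeStatic | degree statistics | graph structure analysis |
+| ClusteringCoefficient | clustering coefficient | recommendation, telecom fraud analysis |

 With `nebula-algorithm`, you can submit a `Spark` job that runs the complete algorithm toolkit on data in the `Nebula Graph` database, or call the algorithms in the `lib` library programmatically to run graph computation on a DataFrame.
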
@@ -32,8 +33,6 @@ nebula-algorithm is a [GraphX](https://spark.apache.org/graphx/)-based
 https://repo1.maven.org/maven2/com/vesoft/nebula-algorithm/2.0.0/

 # Use Nebula Algorithm
-
-Limitation: Nebula Algorithm does not automatically encode string ids, so when running graph algorithms the source and destination vertices of edges must be integers (the vid_type of the Nebula Space can be of type String, but the data must be integers).

 * Method 1: submit the nebula-algorithm package directly

@@ -46,9 +45,12 @@ nebula-algorithm is a [GraphX](https://spark.apache.org/graphx/)-based
 ```
 ${SPARK_HOME}/bin/spark-submit --master <mode> --class com.vesoft.nebula.algorithm.Main nebula-algorithm-2.0.0.jar -p application.conf
 ```
+* Limitation
+
+The Nebula Algorithm package does not automatically encode string ids, so when running graph algorithms with the first method, the source and destination vertices of edges must be integers (the vid_type of the Nebula Space can be of type String, but the data must be integers).
 * Method 2: call the nebula-algorithm interface

-In the `lib` library of `nebula-algorithm`, 10 common graph algorithms are provided and can be called programmatically.
+The `nebula-algorithm` `lib` library provides 10 common graph algorithms, which can be called programmatically.
 * Add the dependency to pom.xml
 ```
 <dependency>
@@ -67,18 +69,20 @@ nebula-algorithm is a [GraphX](https://spark.apache.org/graphx/)-based
 val prConfig = new PRConfig(5, 1.0)
 val prResult = PageRankAlgo.apply(spark, data, prConfig, false)
 ```
-
+* If your vertex ids are of type String, see the PageRank [Example](https://github.com/vesoft-inc/nebula-algorithm/blob/master/example/src/main/scala/com/vesoft/nebula/algorithm/PageRankExample.scala).
+This Example converts the ids: it encodes String ids as Long ids and, in the algorithm result, decodes the Long ids back to the original String ids.
+
 For how to call the other algorithms, see the [test examples](https://github.com/vesoft-inc/nebula-algorithm/tree/master/nebula-algorithm/src/test/scala/com/vesoft/nebula/algorithm/lib).

-> Note: By default, the first column of the DataFrame an algorithm runs on is the source vertex, the second is the destination vertex, and the third is the edge weight.
+> Note: By default, the first column of the DataFrame an algorithm runs on is the source vertex, the second is the destination vertex, and the third is the edge weight.

 ## Version match

 | Nebula Algorithm Version | Nebula Version |
 |:------------------------:|:--------------:|
 | 2.0.0 | 2.0.0, 2.0.1 |
 | 2.1.0 | 2.0.0, 2.0.1 |
-| 2.5.0 | 2.5.0 |
+| 2.5.0 | >=2.5.0 |
 | 2.5-SNAPSHOT | nightly |

 ## Contribute
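
The String-id note added in the hunk above describes encoding String ids as Long ids before running an algorithm and decoding them afterwards. A minimal sketch of that round trip in plain Scala follows. It is not the code from the linked PageRankExample, which works on Spark DataFrames; `IdCodecSketch`, `StringEdge`, `LongEdge`, `buildEncoder`, `encode`, and `decode` are hypothetical names chosen for illustration.

```scala
// Hypothetical helper, not part of nebula-algorithm.
object IdCodecSketch {
  // Edge whose endpoints are String ids (e.g. read from a space with a String vid_type).
  final case class StringEdge(src: String, dst: String, weight: Double)
  // Edge whose endpoints are Long ids, the shape an integer-id algorithm needs.
  final case class LongEdge(src: Long, dst: Long, weight: Double)

  // Assign each distinct String id a Long id in order of first appearance.
  def buildEncoder(edges: Seq[StringEdge]): Map[String, Long] =
    edges
      .flatMap(e => Seq(e.src, e.dst))
      .distinct
      .zipWithIndex
      .map { case (id, idx) => id -> idx.toLong }
      .toMap

  // Rewrite every edge so that both endpoints use the encoded Long ids.
  def encode(edges: Seq[StringEdge], enc: Map[String, Long]): Seq[LongEdge] =
    edges.map(e => LongEdge(enc(e.src), enc(e.dst), e.weight))

  // Invert the encoder so a per-vertex result keyed by Long id
  // can be reported under the original String ids.
  def decode[A](result: Map[Long, A], enc: Map[String, Long]): Map[String, A] = {
    val dec = enc.map(_.swap)
    result.map { case (id, value) => dec(id) -> value }
  }
}
```

With the edges `a -> b` and `b -> c`, `buildEncoder` numbers the ids in order of first appearance, and `decode` maps any per-vertex result keyed by those Long ids back to `a`, `b`, and `c`.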

README.md

Lines changed: 9 additions & 5 deletions
@@ -20,6 +20,7 @@ nebula-algorithm is a Spark Application based on [GraphX](https://spark.apache.o
 | GraphTriangleCount | network structure and tightness analysis|
 | BetweennessCentrality | important node digging, node influence calculation|
 | DegreeStatic | graph structure analysis|
+| ClusteringCoefficient | recommendation, telecom fraud analysis|


 You could submit the entire spark application or invoke algorithms in `lib` library to apply graph algorithms for DataFrame.
@@ -41,8 +42,6 @@ You could submit the entire spark application or invoke algorithms in `lib` libr

 ## Use Nebula Algorithm

-Limitation: Due to Nebula Algorithm will not encode string id, thus during the algorithm execution, the source and target of edges must be in Type Int (The `vid_type` in Nebula Space could be String, while data must be in Type Int).
-
 * Option 1: Submit nebula-algorithm package

 * Configuration
@@ -55,6 +54,10 @@ Limitation: Due to Nebula Algorithm will not encode string id, thus during the a
 ${SPARK_HOME}/bin/spark-submit --master <mode> --class com.vesoft.nebula.algorithm.Main nebula-algorithm-2.0.0.jar -p application.conf
 ```

+* Limitation
+
+Because the Nebula Algorithm jar does not encode string ids, the source and target of edges must be of type Int during algorithm execution (the `vid_type` in the Nebula Space can be String, but the data must be of type Int).
+
 * Option 2: Call nebula-algorithm interface

 Now there are 10 algorithms provided in `lib` from `nebula-algorithm`, which could be invoked in a programming fashion as below:
@@ -78,8 +81,9 @@ Limitation: Due to Nebula Algorithm will not encode string id, thus during the a
 val prResult = PageRankAlgo.apply(spark, data, prConfig, false)
 ```

-For other algorithms, please refer to [test cases](https://github.com/vesoft-inc/nebula-algorithm/tree/master/nebula-algorithm/src/test/scala/com/vesoft/nebula/algorithm/lib).
-
+If your vertex ids are Strings, see the [PageRank Example](https://github.com/vesoft-inc/nebula-algorithm/blob/master/example/src/main/scala/com/vesoft/nebula/algorithm/PageRankExample.scala) for how to encode and decode them.
+
+For examples of other algorithms, see [examples](https://github.com/vesoft-inc/nebula-algorithm/tree/master/example/src/main/scala/com/vesoft/nebula/algorithm).
 > Note: The first column of DataFrame in the application represents the source vertices, the second represents the target vertices and the third represents edges' weight.

 ## Version match
@@ -88,7 +92,7 @@ Limitation: Due to Nebula Algorithm will not encode string id, thus during the a
 |:------------------------:|:--------------:|
 | 2.0.0 | 2.0.0, 2.0.1 |
 | 2.1.0 | 2.0.0, 2.0.1 |
-| 2.5.0 | 2.5.0 |
+| 2.5.0 | >=2.5.0 |
 | 2.5-SNAPSHOT | nightly |

 ## Contribute
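
The `ClusteringCoefficient` row added to the algorithm table names the metric without defining it. As a rough illustration, the local clustering coefficient of a vertex is commonly defined as the number of links among its neighbours divided by the number of possible links among them. The sketch below computes it in plain Scala on an undirected edge list. It assumes this standard definition, does not use GraphX, and is not nebula-algorithm's implementation; every name in it is hypothetical.

```scala
// Hypothetical sketch, not nebula-algorithm's ClusteringCoefficient code.
object ClusteringCoefficientSketch {
  // Build an undirected adjacency map, dropping self-loops and duplicate edges.
  def adjacency(edges: Seq[(Long, Long)]): Map[Long, Set[Long]] =
    edges
      .filter { case (s, d) => s != d }
      .flatMap { case (s, d) => Seq(s -> d, d -> s) }
      .groupBy(_._1)
      .map { case (v, pairs) => v -> pairs.map(_._2).toSet }

  // Local coefficient per vertex: 2 * links among neighbours / (k * (k - 1)),
  // where k is the vertex degree. Vertices with k < 2 get 0.0 by convention.
  def local(adj: Map[Long, Set[Long]]): Map[Long, Double] =
    adj.map { case (v, nbrs) =>
      val k = nbrs.size
      if (k < 2) v -> 0.0
      else {
        // Each unordered pair of neighbours is inspected exactly once.
        val links = nbrs.toSeq.combinations(2).count {
          case Seq(a, b) => adj(a).contains(b)
        }
        v -> (2.0 * links) / (k * (k - 1))
      }
    }

  // Average of the local coefficients over all vertices that appear in an edge.
  def average(adj: Map[Long, Set[Long]]): Double = {
    val cs = local(adj).values
    if (cs.isEmpty) 0.0 else cs.sum / cs.size
  }
}
```

For a triangle `1-2-3` with an extra edge `3-4`, vertices 1 and 2 sit in a fully connected neighbourhood, vertex 3 has one link among three neighbours, and vertex 4 has a single neighbour, so the local values differ even though all four vertices belong to one connected component.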
