README.md: 16 additions & 23 deletions
@@ -4,26 +4,23 @@
 
 Your next API to work with [Spark](https://spark.apache.org/)
 
-One day this should become part of https://github.com/apache/spark of repository, consider this as beta-quality software.
+One day this should become part of https://github.com/apache/spark of repository. Consider this as beta-quality software.
 
 ## Goal
 
-This project adds missing layer of compatibility between [Kotlin](https://kotlinlang.org/) and [Spark](https://spark.apache.org/).
+This project adds a missing layer of compatibility between [Kotlin](https://kotlinlang.org/) and [Spark](https://spark.apache.org/).
 
-Despite Kotlin having first-class compatibility API Kotlin developers might want to use familiar features like data
-classes and lambda expressions as simple expressions in curly braces or method references.
+Despite Kotlin having first-class compatibility API, Kotlin developers might want to use familiar features like data classes and lambda expressions as simple expressions in curly braces or method references.
 
 ## Non-goals
 
-There is no goal to replace any currently supported language or provide them with some functionality to support Kotlin
-language.
+There is no goal to replace any currently supported language or provide them with some functionality to support Kotlin language.
 
 ## Installation
 
-Currently, there are no kotlin-spark-api artifacts in maven central, but you can obain copy using JitPack here:
+Currently, there are no kotlin-spark-api artifacts in maven central, but you can obtain copy using JitPack here: [](https://jitpack.io/#JetBrains/kotlin-spark-api)
 
-There is support for `Maven`, `Gradle`, `SBT` and `leinengen` on JitPack.
+There is support for `Maven`, `Gradle`, `SBT`, and `leinengen` on JitPack.
 
 This project does not force you to use any concrete version of spark, but we've only tested it with spark `3.0.0-preview2`.
 We believe it should also work fine with version `2.4.5`
@@ -49,8 +46,7 @@ So if you're using Maven you'll hve to add following into your `pom.xml`:
 </dependency>
 ```
 
-`core` is being compiled against Scala version `2.12` and it means you have to use `2.12` build of spark if you want to
-try out this project.
+`core` is being compiled against Scala version `2.12` and it means you have to use `2.12` build of spark if you want to try out this project.
 
 ## Usage
 
@@ -79,20 +75,18 @@ spark.toDS("a" to 1, "b" to 2)
 Indeed, this produces `Dataset<Pair<String, Int>>`. There are a couple more `toDS` methods which accept different arguments.
 
 Also, there are several interesting aliases in API, like `leftJoin`, `rightJoin` etc.
-Interesting fact about them that they're null-safe by design. For example, `leftJoin` is aware of nullability and returns
-`Dataset<Pair<LEFT, RIGHT?>>`.
-Note that were forcing `RIGHT` to be nullable for you as developer to be able to handle this situation.
+Interesting fact about them that they're null-safe by design. For example, `leftJoin` is aware of nullability and returns `Dataset<Pair<LEFT, RIGHT?>>`.
+Note that were forcing `RIGHT` to be nullable for you as a developer to be able to handle this situation.
 
 We know that `NullPointerException`s are hard to debug in Spark And trying hard to make them happen as rare as possible.
 
 ## Useful helper methods
 
 ### `withSpark`
 
-We provide you with useful function `withSpark`, which accepts everything that may be needed to run spark — properties,
-name, master location and so on. Also it accepts block, which should be launched in spark context.
+We provide you with useful function `withSpark`, which accepts everything that may be needed to run spark — properties, name, master location and so on. It also accepts a block of code to execute inside spark context.
 
-After work block ends `spark.stop()` is called automatically.
+After work block ends,`spark.stop()` is called automatically.
 
 ```kotlin
 withSpark {
@@ -108,8 +102,8 @@ withSpark {
 
 It may easily happen that we need to fork our computation to several paths. To compute things only once we should call `cache`
 method. But there it is hard to control when we're using cached `Dataset` and when not.
-Also it's easy to forget to unpersist cached data, which may make things to break unexpectadle or just take more memory
-then intended.
+It is also easy to forget to unpersist cached data, which may make break things unexpectably or take more memory
+than intended.
 
 To solve these problems we introduce `withCached` function
 
@@ -127,15 +121,14 @@ withSpark {
 }
 ```
 
-Here we're showing cached `Dataset` for debugging purposes then filtering it. `filter` method returns filtered `Dataset`
-and then cached `Dataset` is being unpersisted so we have more memory to call `map` method and collect resulting `Dataset`.
+Here we're showing cached `Dataset` for debugging purposes then filtering it. The `filter` method returns filtered `Dataset` and then the cached `Dataset` is being unpersisted, so we have more memory to call the `map` method and collect the resulting `Dataset`.
 
 ## Examples
 
-You cn find more examples in [examples](https://github.com/JetBrains/kotlin-spark-api/tree/master/examples/src/main/kotlin/org/jetbrains/spark/api/examples) module.
+You can find more examples in [examples](https://github.com/JetBrains/kotlin-spark-api/tree/master/examples/src/main/kotlin/org/jetbrains/spark/api/examples) module.
 
 ## Issues and feedback
 
 Issues and any feedback are very welcome in `Issues` here.
 
-If you find that we missed some important feature — please report and we'll consider adding it.
+If you find that we missed some important features — please report it, and we'll consider adding them.
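
The README text changed in this diff describes `withCached` as caching a `Dataset`, running a block against it, and unpersisting the cached data once the block returns. The snippet below is only a rough, Spark-free sketch of that scoping pattern, not the project's implementation: `FakeDataset` and `withCachedSketch` are hypothetical names invented for illustration, and the real `withCached` signature may differ.

```kotlin
// Hypothetical stand-in for a Spark Dataset; it only tracks whether it is cached.
class FakeDataset<T>(val rows: List<T>) {
    var cached: Boolean = false
        private set

    fun cache(): FakeDataset<T> = apply { cached = true }
    fun unpersist(): FakeDataset<T> = apply { cached = false }
    fun filter(predicate: (T) -> Boolean): FakeDataset<T> = FakeDataset(rows.filter(predicate))
    fun <R> map(transform: (T) -> R): FakeDataset<R> = FakeDataset(rows.map(transform))
}

// Hypothetical helper: caches the receiver, runs the block against it,
// and unpersists in `finally`, so the release happens even if the block throws.
fun <T, R> FakeDataset<T>.withCachedSketch(block: FakeDataset<T>.() -> R): R {
    val cachedDataset = cache()
    try {
        return cachedDataset.block()
    } finally {
        cachedDataset.unpersist()
    }
}

fun main() {
    val source = FakeDataset(listOf(1, 2, 3, 4, 5))
    val result = source
        .withCachedSketch { filter { it % 2 == 1 } } // runs while `source` is cached
        .map { it * 10 }                             // runs after `source` was unpersisted
    println(result.rows)   // [10, 30, 50]
    println(source.cached) // false
}
```

Putting the release step in `finally` is one way to address the "easy to forget to unpersist" problem the README mentions: the caller never calls `unpersist()` directly.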