This change makes CSV parsing honor a user-specified `nullValue`, allowing
CSVs that use something other than an empty string to represent nulls.
It reuses the same flag as CSV saving, `nullValue`, and should be
non-breaking.
It also pushes this behavior into `inferSchema` so that inferred schemas
properly reflect the user-given null value.
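The interaction between a user-supplied `nullValue` and schema inference can be sketched outside Spark. The helpers below (`parse_field`, `infer_type`) are hypothetical, not the package's actual implementation; they only illustrate why null-aware parsing must happen before type inference:

```python
def parse_field(raw, null_value=""):
    """Return None when the raw field matches the configured nullValue."""
    return None if raw == null_value else raw

def _is_int(s):
    try:
        int(s)
        return True
    except ValueError:
        return False

def infer_type(values, null_value=""):
    """Infer a column type, skipping fields equal to nullValue."""
    parsed = [v for v in values if parse_field(v, null_value) is not None]
    if parsed and all(_is_int(v) for v in parsed):
        return "integer"
    return "string"

# With nullValue="NA", the "NA" field is treated as null rather than data,
# so the column can still be inferred as integer.
print(infer_type(["1", "NA", "3"], null_value="NA"))  # integer
print(infer_type(["1", "NA", "3"]))                   # string
```

With the default empty-string `nullValue`, the literal `"NA"` is ordinary data and forces the column to string; once the user declares `nullValue="NA"`, inference sees only the numeric fields.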
Author: Addison Higham <ahigham@instructure.com>
Closes #224 from addisonj/master.
README.md (6 additions, 5 deletions):

When reading files the API accepts several options:
* `inferSchema`: automatically infers column types. It requires one extra pass over the data and is false by default
* `comment`: skip lines beginning with this character. Default is `"#"`. Disable comments by setting this to `null`.
* `codec`: compression codec to use when saving to file. Should be the fully qualified name of a class implementing `org.apache.hadoop.io.compress.CompressionCodec`. Defaults to no compression when a codec is not specified.
* `nullValue`: specify a string that indicates a null value; any fields matching this string will be set to null in the DataFrame
The package also supports saving simple (non-nested) DataFrames. When saving, you can specify the delimiter and whether a header row should be generated for the table. See the following examples for more details.
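The save-side behavior described above, where `nullValue` is the string written out for null fields, can be sketched with Python's standard `csv` module. `save_rows` is a hypothetical stand-in for the package's save path, not its actual implementation:

```python
import csv
import io

def save_rows(rows, columns, delimiter=",", header=True, null_value=""):
    """Write rows to CSV text, emitting null_value for None fields."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter=delimiter, lineterminator="\n")
    if header:
        writer.writerow(columns)
    for row in rows:
        writer.writerow([null_value if v is None else v for v in row])
    return buf.getvalue()

# A None field is written back as the configured null marker, so a
# round-trip with the same nullValue on the read side recovers the null.
print(save_rows([(1, None)], ["id", "name"], delimiter="|", null_value="NA"))
# id|name
# 1|NA
```

Using the same `nullValue` for both reading and saving, as the change does by reusing the existing flag, keeps round-trips symmetric.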