ARROW-10889: [Rust] [Proposal] Add guidelines about usage of `unsafe` #8901

jorgecarleitao · 2020-12-12T19:12:15Z

I would like to propose that we outline and enforce guidelines on the arrow crate implementation with respect to the usage of unsafe.

The background of this proposal are PRs #8645 and #8829. In both cases, while addressing an unrelated issue, they hit undefined behavior (UB) due to an incorrect usage of unsafe in the code base. This UB was very time-consuming to identify and debug: combined, they accounted for more than 12hs of my time.

Safety against undefined behavior is the core premise of the Rust language. In many cases, the maintenance burden (time to find and fix bugs) does not justify the performance improvements and the decrease in motivation in handling them (they are just painful due to how difficult they are to debug). In particular, IMO those 12 hours would have been better spent in other parts of the code if unsafe would have not been used in the first place, which would have been likely translated in faster code or more features.

There are situations where unsafe is necessary, and the guidelines outline these cases. However, I also see many uses of unsafe that are not necessary nor properly documented.

The goal of these guidelines is to motivate contributors of the Rust implementation to be conscious about the maintenance cost of unsafe, and outline specific necessary conditions for any new unsafe to be introduced in the code base.

andygrove · 2020-12-12T19:14:20Z

This is funny timing. I just filed ARROW-10889 for this and then saw this PR!

codecov-io · 2020-12-12T19:18:59Z

Codecov Report

Merging #8901 (ff35ce8) into master (5353c28) will increase coverage by 7.78%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #8901      +/-   ##
==========================================
+ Coverage   75.48%   83.26%   +7.78%     
==========================================
  Files         181      195      +14     
  Lines       41649    48066    +6417     
==========================================
+ Hits        31439    40023    +8584     
+ Misses      10210     8043    -2167

Impacted Files	Coverage Δ
rust/parquet/src/data_type.rs	`78.72% <0.00%> (-5.89%)`	⬇️
rust/parquet/src/util/memory.rs	`89.57% <0.00%> (-1.04%)`	⬇️
rust/parquet/src/file/statistics.rs	`93.80% <0.00%> (-0.60%)`	⬇️
rust/arrow/src/datatypes.rs	`76.45% <0.00%> (-0.41%)`	⬇️
rust/parquet/src/encodings/rle.rs	`92.68% <0.00%> (-0.35%)`	⬇️
rust/parquet/src/schema/parser.rs	`89.86% <0.00%> (-0.34%)`	⬇️
rust/arrow/src/ffi.rs	`70.28% <0.00%> (-0.34%)`	⬇️
rust/parquet/src/record/api.rs	`98.01% <0.00%> (-0.14%)`	⬇️
rust/arrow/src/json/reader.rs	`83.09% <0.00%> (-0.09%)`	⬇️
rust/arrow/src/csv/writer.rs	`78.82% <0.00%> (ø)`
... and 98 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 48fee66...ff35ce8. Read the comment docs.

github-actions · 2020-12-12T19:24:11Z

https://issues.apache.org/jira/browse/ARROW-10889

andygrove

This looks great. Thanks @jorgecarleitao !

andygrove · 2020-12-12T19:49:29Z

It would be really cool to have a GitHub Action that detected unsafe in a PR and auto-posted a link to this documentation, as well as tagging the PR with an unsafe label. I do not know how feasible that is though.

Dandandan · 2020-12-12T20:25:05Z

Very cool proposal. Will take some time soon to read it in detail. I think currently there is a lot of code violating what is proposed, so that's clearly needed 👍

I think it also makes sense to keep track of the amount of unsafe that is used, and try to reduce it as much as possible by introducing safe APIs that are as performant. An example might be: introducing a safe iterator to avoid bounds checking.

Also I think the unsafe code can /should be more tested, e.g. by having more debug_asserts etc. checking for invariants and checking for undefined behavior in the CI.

Dandandan · 2020-12-13T07:24:16Z

rust/arrow/README.md

Maybe 10% is a bit arbitrary and depends on context? Maybe it should.bd demonstrated / be likely that it has an impact on real world usage of arrow?

I agree that this wording would be stronger without a specific bound. I would rather say something like "usage of unsafe for performance reasons is justified only when all other alternatives have been exhausted and the performance benefits are sufficiently large. Performance benefits should be quantified with a bench".

I understand the sentiment.

My concern is that I am not sure I would like to maintain a complex unsafe code for a 6% improvement at this particular phase of the project.

The rational for a concrete number is to impose a bound that we consider to not have the capacity to handle the maintenance cost for "small" improvements, and thus people should not try to focus on those types of improvements (again, at this particular phase of the project).

My concern with "performance benefits are sufficiently large" is that anything bigger than zero is always "sufficiently large" when compared to zero.

What if we write something like

usage of unsafe for performance reasons is justified only when all other alternatives have been exhausted and the performance benefits are sufficiently large (>~10%)

so that we allow other values, but we offer a number for reference?

Could the improvement be expressed by the "hotness" of the code? I can imagine that a 6% improvement in the Buffers is much more valuable than a 11% percentage improvement in a specific kernel.

I like

usage of unsafe for performance reasons is justified only when all other alternatives have been exhausted and the performance benefits are sufficiently large (e.g. >~10%)

@ritchie46 I agree with the sentiments that some performance improvements are more important than others, which is why I was suggesting leaving room for interpretation in the writeup

Yes, I think it would be preferable to have some room to trade safety for performance and vice versa. A change that makes everything a few percent faster might be acceptable, while a 50% improvement on a micro benchmark might not in some cases. And in some cases you maybe want to reduce the amount of safety, but might want to accept a certain negative performance impact.

rust/arrow/README.md

alamb

I think this is a great idea -- thank you for writing it up @jorgecarleitao !

rust/arrow/README.md

alamb · 2020-12-13T11:11:47Z

rust/arrow/README.md

https://doc.rust-lang.org/reference/behavior-considered-undefined.html might be a reasonable citation, though it doesn't quantify the sources of unsafetly.

rust/arrow/README.md

alamb · 2020-12-13T11:13:55Z

rust/arrow/README.md

I suggest being more explicit here -- maybe "Omitting bounds checking for performance" or something

alamb · 2020-12-13T11:17:14Z

rust/arrow/README.md

stylistically, I think we could remove the 'Usage of unsafe for ...' sentences. The rationale for use of unsafe is already explained above so this repetition seems redundant to me.

I also think such wording may give the impression that unsafe is always allowed for these operations, when I think the intent is "unsafe can be used as a last resort"

alamb · 2020-12-13T11:19:39Z

rust/arrow/README.md

I agree that this wording would be stronger without a specific bound. I would rather say something like "usage of unsafe for performance reasons is justified only when all other alternatives have been exhausted and the performance benefits are sufficiently large. Performance benefits should be quantified with a bench".

rust/arrow/README.md

alamb · 2020-12-14T18:24:48Z

@jorgecarleitao I am not sure what has happened to this PR but it now seemingly includes many changes that are unrelated to the readme improvements:

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

jorgecarleitao · 2020-12-14T18:29:26Z

Some mistake on my part. Fixed

alamb

I think this looks good. @sunchao @paddyhoran or @nevi-me would you like to express an opinion on this proposal?

alamb · 2020-12-14T18:40:42Z

rust/arrow/README.md

I like

usage of unsafe for performance reasons is justified only when all other alternatives have been exhausted and the performance benefits are sufficiently large (e.g. >~10%)

@ritchie46 I agree with the sentiments that some performance improvements are more important than others, which is why I was suggesting leaving room for interpretation in the writeup

Dandandan

For me the changes are clear enough and provide a good guideline for using unsafe.

andygrove · 2020-12-15T15:49:52Z

I saw a tweet yesterday from @timClicks showing an example of how the imageproc project documents their unsafe code and I really like it. I think we should do something similiar. Maybe we could add this to the guidelines @jorgecarleitao ?

            // JUSTIFICATION
            //  Benefit
            //      Using checked indexing here makes bench_integral_image_rgb take 1.05x as long
            //      (The results are noisy, but this seems to be reproducible. I've not checked the generated assembly.)
            //  Correctness
            //      x and y are within bounds by definition of in_width and in_height
            let input = unsafe { image.unsafe_get_pixel(x, y) };

Dandandan · 2020-12-15T15:54:57Z

I saw a tweet yesterday from @timClicks showing an example of how the imageproc project documents their unsafe code and I really like it. I think we should do something similiar. Maybe we could add this to the guidelines @jorgecarleitao ?

            // JUSTIFICATION
            //  Benefit
            //      Using checked indexing here makes bench_integral_image_rgb take 1.05x as long
            //      (The results are noisy, but this seems to be reproducible. I've not checked the generated assembly.)
            //  Correctness
            //      x and y are within bounds by definition of in_width and in_height
            let input = unsafe { image.unsafe_get_pixel(x, y) };

Saw that too :) I think it's an interesting approach. In this PR #8900 I added a comment with "SAFETY", but that misses the benefit part.

paddyhoran

LGTM

rust/arrow/README.md

jorgecarleitao · 2020-12-16T06:18:37Z

Thanks all for the suggestions. Really good stuff.

I have incorporated all the changes:

the change of the requirement for performance was re-written as suggested by @alamb and @Dandandan
added a code snipped with the idea from @andygrove
fixed typos identified by @paddyhoran

I have used the term soundness instead of correctness because the former is already defined on an official guide: https://rust-lang.github.io/unsafe-code-guidelines/glossary.html#soundness-of-code--of-a-library and is based on existing literature, while the latter seems to be used only by imageproc.

nevi-me

I don't have any additional comments

alamb · 2020-12-16T21:25:00Z

I think this PR has enough approvals / 👍 so I merged it in. We can continue to iterate on it in the future but this is a great base to work from. Thank you @jorgecarleitao and everyone else who contributed. A true team effort 👍

timClicks · 2020-12-16T21:44:42Z

Thanks all for working hard to keep Arrow as robust as possible

jorgecarleitao requested review from alamb, andygrove, kou, nevi-me, paddyhoran and sunchao December 12, 2020 19:12

github-actions bot added the Component: Rust label Dec 12, 2020

andygrove changed the title ~~[Rust] [Proposal] Add guidelines about usage of unsafe~~ ARROW-10889: [Rust] [Proposal] Add guidelines about usage of unsafe Dec 12, 2020

andygrove approved these changes Dec 12, 2020

View reviewed changes

jorgecarleitao marked this pull request as ready for review December 12, 2020 19:48

Dandandan reviewed Dec 13, 2020

View reviewed changes

rust/arrow/README.md Outdated Show resolved Hide resolved

alamb approved these changes Dec 13, 2020

View reviewed changes

jorgecarleitao and others added 5 commits December 14, 2020 18:28

Added discussion about usage of unsaee

4850d0f

Update rust/arrow/README.md

3462c3b

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

Update rust/arrow/README.md

e26e5fb

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

Update rust/arrow/README.md

8da1cba

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

Re-phrased assert.

0274a17

alamb approved these changes Dec 14, 2020

View reviewed changes

Dandandan approved these changes Dec 14, 2020

View reviewed changes

paddyhoran approved these changes Dec 15, 2020

View reviewed changes

rust/arrow/README.md Outdated Show resolved Hide resolved

rust/arrow/README.md Outdated Show resolved Hide resolved

rust/arrow/README.md Outdated Show resolved Hide resolved

Typos and incorporated changes.

ff35ce8

nevi-me approved these changes Dec 16, 2020

View reviewed changes

alamb closed this in 900f6fe Dec 16, 2020

jorgecarleitao deleted the arrow_unsafe branch December 16, 2020 22:10

asfimport mentioned this pull request Dec 16, 2020

[Rust] Document our approach to unsafe code in README #18413

Closed

ARROW-10889: [Rust] [Proposal] Add guidelines about usage of unsafe #8901

ARROW-10889: [Rust] [Proposal] Add guidelines about usage of unsafe #8901

Uh oh!

Conversation

jorgecarleitao commented Dec 12, 2020

Uh oh!

andygrove commented Dec 12, 2020

Uh oh!

codecov-io commented Dec 12, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

github-actions bot commented Dec 12, 2020

Uh oh!

andygrove left a comment

Choose a reason for hiding this comment

Uh oh!

andygrove commented Dec 12, 2020

Uh oh!

Dandandan commented Dec 12, 2020

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

alamb left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

alamb commented Dec 14, 2020

Uh oh!

jorgecarleitao commented Dec 14, 2020

Uh oh!

alamb left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Dandandan left a comment

Choose a reason for hiding this comment

Uh oh!

andygrove commented Dec 15, 2020

Uh oh!

Dandandan commented Dec 15, 2020

Uh oh!

paddyhoran left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

jorgecarleitao commented Dec 16, 2020

Uh oh!

nevi-me left a comment

Choose a reason for hiding this comment

Uh oh!

alamb commented Dec 16, 2020

Uh oh!

timClicks commented Dec 16, 2020

Uh oh!

ARROW-10889: [Rust] [Proposal] Add guidelines about usage of `unsafe` #8901

ARROW-10889: [Rust] [Proposal] Add guidelines about usage of `unsafe` #8901

codecov-io commented Dec 12, 2020 •

edited

Loading