Prevent mutating the global environment pointer in CommandExt::exec and opt to use execve and resolve path manually#157144
Conversation
…nd opt to use execve and resolve path manually
|
r? @joboet rustbot has assigned @joboet. Use Why was this reviewer chosen?The reviewer was selected based on:
|
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
I highly recommend having a look at the POSIX specification – GNU/Linux mostly follows that, but the specification sometimes leaves some things unspecified and we should be careful not to break those.
Also, this is currently unsound, we cannot perform any memory allocation in do_exec, since it's called after fork for the standard spawn path. You probably need to preallocate some memory when creating the Command (or when setting certain parameters).
| return Err(io::Error::from_raw_os_error(libc::ENOENT)); | ||
| } | ||
| // Path searching does not occur when our file starts with a `/` | ||
| else if file_as_bytes.starts_with(&[b'/']) { |
There was a problem hiding this comment.
This mismatches the POSIX specification, the file is interpreted as path if it contains a slash character in any place.
Also, I'd move this check to before all the path stuff, that's redundant if this path is taken.
| let argv = self.get_argv(); | ||
| let parent_path = crate::env::var("PATH") | ||
| .map(|var| CString::new(var)) | ||
| .unwrap_or(CString::new("/bin:/usr/bin"))?; |
There was a problem hiding this comment.
I don't really like this duplication of the default path. Can you factor this differently?
| return Err(io::Error::from_raw_os_error(libc::ENAMETOOLONG)); | ||
| } else { | ||
| let mut got_perm_denied = None; | ||
| while paths != &[b'\0'] { |
There was a problem hiding this comment.
split_paths allocates for a PathBuf on SplitPath::next. If we want to avoid allocating memory in do_exec, we have to do this manually.
(I still also prefer doing this manually than using split_paths because we don't really need to allocate again if it's not necessary and just grab the u8 subslice)
There was a problem hiding this comment.
Unless we do everything before do_exec...
| None => c"/bin:/usr/bin".to_bytes_with_nul(), | ||
| } | ||
| } | ||
| None => parent_path.as_bytes_with_nul(), |
There was a problem hiding this comment.
I recommend moving the environment variable lookup here so that it's only performed when actually needed.
| // Remove "PATH=" | ||
| Some(p) => &p.to_bytes_with_nul()[5..], | ||
| // Falls back to this if PATH environment variable couldn't be found | ||
| None => c"/bin:/usr/bin".to_bytes_with_nul(), |
There was a problem hiding this comment.
This default is incorrect. E.g. on GNU/Linux you should use confstr(_CS_PATH) (see sys/pal/unix/conf.rs for a safe wrapper). _CS_PATH is probably the right choice in general, but might not be enough to match execvp (POSIX specifies that the fallback is "implementation defined").
There was a problem hiding this comment.
confstr in sys/pal/unix/conf.rs is cfg gated to target_vendor="apple". should I just create a separate confstr function on all other targets, that does something like the following:
let n = unsafe { libc::confstr(libc::_CS_PATH, ptr::null_mut(), 0) };
let mut buf = Vec::with_capacity(n);
let res = unsafe {libc::confstr(libc::_CS_PATH, buf.as_mut_ptr(), n)};
if res == 0 {
return Err(io::Error::last_os_error)
}
let default_path = unsafe { CStr::from_ptr(buf.as_ptr()) }.into(); // as CStringShould this be stored in Command on creation (in which case should we panic or give an empty string)? Or should we compute this within spawn/exec before forking?
| let parent_path = crate::env::var("PATH") | ||
| .map(|var| CString::new(var)) | ||
| .unwrap_or(CString::new("/bin:/usr/bin"))?; | ||
| let envp = maybe_envp.map(|envp| envp.as_ptr()).unwrap_or(*sys::env::environ()); |
There was a problem hiding this comment.
It's probably a good idea to hold the environment read lock if inheriting the environment from the parent.
There was a problem hiding this comment.
Isn't the environment read lock always held before do_exec is called?
do_exec (as far as I can tell) is called from Command::spawn & Command:exec, which I see that they invoke sys::env::read_lock before calling do_exec
There was a problem hiding this comment.
Oh right, fair enough. That means that the parent PATH lookup cannot use env::var_os since RwLock does not allow recursive read locks (you shouldn't use it anyway as it allocates).
There was a problem hiding this comment.
That means that the parent
PATHlookup cannot useenv::var_ossinceRwLockdoes not allow recursive read locks (you shouldn't use it anyway as it allocates).
That's fine, I think I can use libc::getenv(key.as_ptr()) as *const libc::c_char and CStr::from_ptr to do the PATH lookup (this is what getenv does underneath the hood).
| // Ensure that our given file does not exceed the limit set by NAME_MAX | ||
| // Note: `file` is a CStr, so it should be guaranteed to have a nul terminated | ||
| // byte (hence no underflow occurs here) | ||
| else if file_as_bytes.len() - 1 >= libc::NAME_MAX as usize { | ||
| return Err(io::Error::from_raw_os_error(libc::ENAMETOOLONG)); |
There was a problem hiding this comment.
I'd just leave this check to execve...
| && err == libc::ENOEXEC | ||
| { | ||
| let mut new_argv = CStringArray::with_capacity(argv.len() + 2); | ||
| new_argv.push(CString::new("/bin/sh")?); |
There was a problem hiding this comment.
This path is implementation defined – though I imagine it's correct for nearly all platforms.
There was a problem hiding this comment.
Let me know if there's a constant for this. I saw from the glibc archived upstream, it was _PATH_BSHELL, but couldn't find that in libc.
|
Reminder, once the PR becomes ready for a review, use |
…confstr_as_cstring added to get a CString value returned by confstr, added CStringArray::from_ptr, added default_path and shell_argv fields to unix Command, and refactored do_exec code
| libc::execve(path.as_ptr(), argv.as_ptr(), envp.as_ptr()); | ||
| let err = io::Error::last_os_error(); | ||
|
|
||
| // Unsure we should do this on ENOEXEC because that's implementation |
There was a problem hiding this comment.
This was a pending question I had earlier; I currently have shell path fallback embedded in Command, but I'm wondering if that's okay to do.
This PR fixes the bug described in #156951 where
CommandExt::execmutating the global environment pointer could cause correctness issue with concurrentexeccalls (also this function holds an environment read lock, so it shouldn't cause any writes to the global environment pointer anyways). We do everything withinCommandExt::execviaexecvethanexecvp, which means we have to resolve the path manually.I took reference to what was done in an archived (Feb 2026) upstream mirror of glibc's
execvpandexecvpe. I also took reference to #55359 on how they implemented the concerned portion ofCommandExt::execwithexecve. There's also this OpenBSD libc implementation ofexecvpethat I saw, but I didn't deeply take a look at this one and saw how it differ.Other than that, I'm unsure how to handle the error
ETIMEOUT/TimedOutas glibc breaks parsing through thePATHenvironment variable value and returns the error while #55359 continues parsing.Let me know if you also want me to put some of
execvecode in a helper function or refactor it in other ways.