Last preparations before upstreaming Git for Windows' symlink support #2017
base: js/test-symlink-windows
|
/submit |
|
Submitted as pull.2017.git.1765899229.gitgitgadget@gmail.com To fetch this version into To fetch this version to local tag |
4c01018 to
a6a50b0
Compare
|
This patch series was integrated into seen via git@d4817df. |
77dfd22 to
a4c7170
Compare
|
This branch is now known as |
|
There was a status update in the "New Topics" section about the branch Further preparation to upstream symbolic link support on Windows. Comments? source: <pull.2017.git.1765899229.gitgitgadget@gmail.com> |
#endif

char *mingw_getcwd(char *pointer, int len)
{
On the Git mailing list, Patrick Steinhardt wrote (reply to this):
On Tue, Dec 16, 2025 at 03:33:45PM +0000, Johannes Schindelin via GitGitGadget wrote:
> diff --git a/compat/mingw.c b/compat/mingw.c
> index ba1b7b6dd1..7215b127cc 100644
> --- a/compat/mingw.c
> +++ b/compat/mingw.c
> @@ -1251,18 +1251,16 @@ char *mingw_getcwd(char *pointer, int len)
> {
> wchar_t cwd[MAX_PATH], wpointer[MAX_PATH];
> DWORD ret = GetCurrentDirectoryW(ARRAY_SIZE(cwd), cwd);
> + HANDLE hnd;
>
> if (!ret || ret >= ARRAY_SIZE(cwd)) {
> errno = ret ? ENAMETOOLONG : err_win_to_posix(GetLastError());
> return NULL;
> }
> - ret = GetLongPathNameW(cwd, wpointer, ARRAY_SIZE(wpointer));
> - if (!ret && GetLastError() == ERROR_ACCESS_DENIED) {
> - HANDLE hnd = CreateFileW(cwd, 0,
> - FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE, NULL,
> - OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, NULL);
> - if (hnd == INVALID_HANDLE_VALUE)
> - return NULL;
> + hnd = CreateFileW(cwd, 0,
> + FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE, NULL,
> + OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, NULL);
> + if (hnd != INVALID_HANDLE_VALUE) {
> ret = GetFinalPathNameByHandleW(hnd, wpointer, ARRAY_SIZE(wpointer), 0);
> CloseHandle(hnd);
> if (!ret || ret >= ARRAY_SIZE(wpointer))
Okay. Due to the change we now also try calling `GetFileAttributesW()`
in case `CreateFileW()` fails, which wasn't the case before. But I'd
consider that to be a win -- if we cannot figure out the final path
name, then we can at least return the unresolved current working
directory.
Patrick
> @@ -1271,13 +1269,11 @@ char *mingw_getcwd(char *pointer, int len)
> return NULL;
> return pointer;
> }
> - if (!ret || ret >= ARRAY_SIZE(wpointer))
> - return NULL;
> - if (GetFileAttributesW(wpointer) == INVALID_FILE_ATTRIBUTES) {
> + if (GetFileAttributesW(cwd) == INVALID_FILE_ATTRIBUTES) {
> errno = ENOENT;
> return NULL;
> }
> - if (xwcstoutf(pointer, wpointer, len) < 0)
> + if (xwcstoutf(pointer, cwd, len) < 0)
> return NULL;
> convert_slashes(pointer);
> return pointer;
string = ep;
}

return (current & ~negative) | positive;
On the Git mailing list, Patrick Steinhardt wrote (reply to this):
On Tue, Dec 16, 2025 at 03:33:46PM +0000, Johannes Schindelin via GitGitGadget wrote:
> diff --git a/setup.c b/setup.c
> index 7086741e6c..42e4e7a690 100644
> --- a/setup.c
> +++ b/setup.c
> @@ -2611,7 +2611,7 @@ int init_db(const char *git_dir, const char *real_git_dir,
> * have set up the repository format such that we can evaluate
> * includeIf conditions correctly in the case of re-initialization.
> */
> - repo_config(the_repository, platform_core_config, NULL);
> + repo_config(the_repository, git_default_core_config, NULL);
>
> safe_create_dir(the_repository, git_dir, 0);
Two lines further down we call `create_default_files()`, and there we
end up calling `repo_config(the_repository, git_default_config, NULL)`
as one of the first things. We do so after copying templates though, so
indeed this comes too late.
We also cannot really merge these two calls: we need to re-parse the
configuration after having copied over the template, as the template may
contain a gitconfig file itself.
Furthermore, `git_default_core_config()` already knows to call
`platform_core_config()`, as well. So we're not losing any of that
information, either.
All to say that this change makes sense to me and should be safe, as we
don't end up parsing _more_ configuration keys, we only parse a subset
of it a bit earlier.
Patrick
ssize_t strbuf_write(struct strbuf *sb, FILE *f)
{
return sb->len ? fwrite(sb->buf, 1, sb->len, f) : 0;
On the Git mailing list, Patrick Steinhardt wrote (reply to this):
On Tue, Dec 16, 2025 at 03:33:48PM +0000, Karsten Blees via GitGitGadget wrote:
> diff --git a/strbuf.c b/strbuf.c
> index 44a8f6a554..fa4e30f112 100644
> --- a/strbuf.c
> +++ b/strbuf.c
> @@ -566,8 +566,6 @@ ssize_t strbuf_write(struct strbuf *sb, FILE *f)
> return sb->len ? fwrite(sb->buf, 1, sb->len, f) : 0;
> }
>
> -#define STRBUF_MAXLINK (2*PATH_MAX)
> -
> int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint)
> {
> size_t oldalloc = sb->alloc;
> @@ -575,7 +573,7 @@ int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint)
> if (hint < 32)
> hint = 32;
>
> - while (hint < STRBUF_MAXLINK) {
> + for (;;) {
> ssize_t len;
>
> strbuf_grow(sb, hint + 1);
This makes me wonder whether we have a better way to figure out the
actual size of the buffer that we ultimately need to allocate. But
reading through readlink(3p) doesn't indicate anything, and I'm not sure
whether we can always rely on lstat(3p) to return the correct size for
symlink contents on all platforms.
One thing that _is_ noted though is that calling the function with a
buffer size larger than SSIZE_MAX is implementation-defined. It does
make me a bit uneasy in that light to grow indefinitely.
Which makes me wonder whether Windows has a limit for the symlink
contents that we could enforce in theory so that we can reasonably turn
this into a bounded loop again?
Patrick
On the Git mailing list, Johannes Schindelin wrote (reply to this):
Hi Patrick,
On Wed, 17 Dec 2025, Patrick Steinhardt wrote:
> On Tue, Dec 16, 2025 at 03:33:48PM +0000, Karsten Blees via GitGitGadget wrote:
> > diff --git a/strbuf.c b/strbuf.c
> > index 44a8f6a554..fa4e30f112 100644
> > --- a/strbuf.c
> > +++ b/strbuf.c
> > @@ -566,8 +566,6 @@ ssize_t strbuf_write(struct strbuf *sb, FILE *f)
> > return sb->len ? fwrite(sb->buf, 1, sb->len, f) : 0;
> > }
> >
> > -#define STRBUF_MAXLINK (2*PATH_MAX)
> > -
> > int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint)
> > {
> > size_t oldalloc = sb->alloc;
> > @@ -575,7 +573,7 @@ int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint)
> > if (hint < 32)
> > hint = 32;
> >
> > - while (hint < STRBUF_MAXLINK) {
> > + for (;;) {
> > ssize_t len;
> >
> > strbuf_grow(sb, hint + 1);
>
> This makes me wonder whether we have a better way to figure out the
> actual size of the buffer that we ultimately need to allocate. But
> reading through readlink(3p) doesn't indicate anything, and I'm not sure
> whether we can always rely on lstat(3p) to return the correct size for
> symlink contents on all platforms.
>
> One thing that _is_ noted though is that calling the function with a
> buffer size larger than SSIZE_MAX is implementation-defined. It does
> make me a bit uneasy in that light to grow indefinitely.
>
> Which makes me wonder whether Windows has a limit for the symlink
> contents that we could enforce in theory so that we can reasonably turn
> this into a bounded loop again?
https://learn.microsoft.com/en-us/windows/win32/fileio/maximum-file-path-limitation
suggests that the maximum permissible target path should be 32,768. But
that's not _quite_ correct, as
`../t/../Documentation/RelNotes/../../README.md` is a perfectly valid (if
awkward) symlink target.
Still, I would say that 32,768 would make for a fine (still insanely high,
but not so high as to allow malicious symlinks to cause memory problems)
limit.
Sound good?
Johannes
On the Git mailing list, Patrick Steinhardt wrote (reply to this):
On Fri, Dec 19, 2025 at 09:50:15AM +0100, Johannes Schindelin wrote:
> Hi Patrick,
>
> On Wed, 17 Dec 2025, Patrick Steinhardt wrote:
>
> > On Tue, Dec 16, 2025 at 03:33:48PM +0000, Karsten Blees via GitGitGadget wrote:
> > > diff --git a/strbuf.c b/strbuf.c
> > > index 44a8f6a554..fa4e30f112 100644
> > > --- a/strbuf.c
> > > +++ b/strbuf.c
> > > @@ -566,8 +566,6 @@ ssize_t strbuf_write(struct strbuf *sb, FILE *f)
> > > return sb->len ? fwrite(sb->buf, 1, sb->len, f) : 0;
> > > }
> > >
> > > -#define STRBUF_MAXLINK (2*PATH_MAX)
> > > -
> > > int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint)
> > > {
> > > size_t oldalloc = sb->alloc;
> > > @@ -575,7 +573,7 @@ int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint)
> > > if (hint < 32)
> > > hint = 32;
> > >
> > > - while (hint < STRBUF_MAXLINK) {
> > > + for (;;) {
> > > ssize_t len;
> > >
> > > strbuf_grow(sb, hint + 1);
> >
> > This makes me wonder whether we have a better way to figure out the
> > actual size of the buffer that we ultimately need to allocate. But
> > reading through readlink(3p) doesn't indicate anything, and I'm not sure
> > whether we can always rely on lstat(3p) to return the correct size for
> > symlink contents on all platforms.
> >
> > One thing that _is_ noted though is that calling the function with a
> > buffer size larger than SSIZE_MAX is implementation-defined. It does
> > make me a bit uneasy in that light to grow indefinitely.
> >
> > Which makes me wonder whether Windows has a limit for the symlink
> > contents that we could enforce in theory so that we can reasonably turn
> > this into a bounded loop again?
>
> https://learn.microsoft.com/en-us/windows/win32/fileio/maximum-file-path-limitation
> suggests that the maximum permissible target path should be 32,768. But
> that's not _quite_ correct, as
> `../t/../Documentation/RelNotes/../../README.md` is a perfectly valid (if
> awkward) symlink target.
>
> Still, I would say that 32,768 would make for a fine (still insanely high,
> but not so high as to allow malicious symlinks to cause memory problems)
> limit.
>
> Sound good?
> Johannes
Sounds good to me, thanks!
Patrick
On the Git mailing list, Junio C Hamano wrote (reply to this):
Patrick Steinhardt <ps@pks.im> writes:
>> > This makes me wonder whether we have a better way to figure out the
>> > actual size of the buffer that we ultimately need to allocate. But
>> > reading through readlink(3p) doesn't indicate anything, and I'm not sure
>> > whether we can always rely on lstat(3p) to return the correct size for
>> > symlink contents on all platforms.
>> >
>> > One thing that _is_ noted though is that calling the function with a
>> > buffer size larger than SSIZE_MAX is implementation-defined. It does
>> > make me a bit uneasy in that light to grow indefinitely.
>> >
>> > Which makes me wonder whether Windows has a limit for the symlink
>> > contents that we could enforce in theory so that we can reasonably turn
>> > this into a bounded loop again?
>>
>> https://learn.microsoft.com/en-us/windows/win32/fileio/maximum-file-path-limitation
>> suggests that the maximum permissible target path should be 32,768. But
>> that's not _quite_ correct, as
>> `../t/../Documentation/RelNotes/../../README.md` is a perfectly valid (if
>> awkward) symlink target.
>>
>> Still, I would say that 32,768 would make for a fine (still insanely high,
>> but not so high as to allow malicious symlinks to cause memory problems)
>> limit.
>>
>> Sound good?
>> Johannes
>
> Sounds good to me, thanks!
As this is a generic codepath in strbuf.c, platforms that do not
honor Microsoft's promise cited above can break the assumption made
here by going beyond 32k, no?
I am OK if this infinite loop had our own "we are growing the buffer
very long and still getting not-enough-buf error; let's give up"
termination condition.
IOW, a simpler alternative may be
---- >8 ----
Subject: strbuf_readlink(): do not trust PATH_MAX
We have been bitten before by platforms that set PATH_MAX way too
low, far below the length of paths they comfortably support. The
strbuf_readlink() limits the link targets to PATH_MAX, which is a
code path that is broken by such platforms.
Raise the limit to 32kB, which matches the limit of a
platform with such a problem [*].
* https://learn.microsoft.com/en-us/windows/win32/fileio/maximum-file-path-limitation
strbuf.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git c/strbuf.c w/strbuf.c
index 7fb7d12ac0..1c7659bcd2 100644
--- c/strbuf.c
+++ w/strbuf.c
@@ -566,7 +566,11 @@ ssize_t strbuf_write(struct strbuf *sb, FILE *f)
return sb->len ? fwrite(sb->buf, 1, sb->len, f) : 0;
}
-#define STRBUF_MAXLINK (2*PATH_MAX)
+/*
+ * Do not use PATH_MAX, as some platforms set it too low;
+ * 32kB matches what Windows has as the real limit for a pathname.
+ */
+#define STRBUF_MAXLINK (2 * (1 << 15))
int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint)
{
a4c7170 to
b293ce9
Compare
ssize_t strbuf_write(struct strbuf *sb, FILE *f)
{
return sb->len ? fwrite(sb->buf, 1, sb->len, f) : 0;
On the Git mailing list, Junio C Hamano wrote (reply to this):
"Karsten Blees via GitGitGadget" <gitgitgadget@gmail.com> writes:
> From: Karsten Blees <blees@dcon.de>
>
> The `strbuf_readlink()` function refuses to read link targets that
> exceed PATH_MAX (even if a sufficient size was specified by the caller).
>
> As some platforms (*cough* Windows *cough*) support longer paths, remove
> this restriction (similar to `strbuf_getcwd()`).
>
> Signed-off-by: Karsten Blees <blees@dcon.de>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
> strbuf.c | 4 +---
> 1 file changed, 1 insertion(+), 3 deletions(-)
We've been bitten before by platforms that set PATH_MAX too low
(i.e., lower than what they comfortably support), so this is a
welcome change.
> diff --git a/strbuf.c b/strbuf.c
> index 44a8f6a554..fa4e30f112 100644
> --- a/strbuf.c
> +++ b/strbuf.c
> @@ -566,8 +566,6 @@ ssize_t strbuf_write(struct strbuf *sb, FILE *f)
> return sb->len ? fwrite(sb->buf, 1, sb->len, f) : 0;
> }
>
> -#define STRBUF_MAXLINK (2*PATH_MAX)
> -
> int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint)
> {
> size_t oldalloc = sb->alloc;
> @@ -575,7 +573,7 @@ int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint)
> if (hint < 32)
> hint = 32;
>
> - while (hint < STRBUF_MAXLINK) {
> + for (;;) {
> ssize_t len;
>
> strbuf_grow(sb, hint + 1);
I briefly wondered if this would cause us to loop infinitely on a truly
broken platform, where readlink() somehow keeps returning negative,
but we only retry when we got ERANGE (which can be seen several
lines below the postimage of the patch), so we should be safe.
Thanks.
b293ce9 to
ef6dd00
Compare
|
There was a status update in the "Cooking" section about the branch Further preparation to upstream symbolic link support on Windows. Will merge to 'next'? source: <pull.2017.git.1765899229.gitgitgadget@gmail.com> |
|
There was a status update in the "Cooking" section about the branch Further preparation to upstream symbolic link support on Windows. Expecting review responses. cf. <xmqqcy3wh8d1.fsf@gitster.g> source: <pull.2017.git.1765899229.gitgitgadget@gmail.com> |
As pointed out in git-for-windows#1676, the `git rev-parse --is-inside-work-tree` command currently fails when the current directory's path contains symbolic links.

The underlying reason for this bug is that `getcwd()` is supposed to resolve symbolic links, but our `mingw_getcwd()` implementation did not.

We do have all the building blocks for that, though: the `GetFinalPathNameByHandleW()` function will resolve symbolic links. However, we only called that function if `GetLongPathNameW()` failed, for historical reasons: the latter function was supported for a long time, but the former API function was introduced only with Windows Vista, and we used to support also Windows XP. With that support having been dropped, we are free to call the symbolic link-resolving function right away.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
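The POSIX behavior that `mingw_getcwd()` is meant to match can be observed with a small sketch. This is not the Windows code from `compat/mingw.c`; it only illustrates, on a POSIX system, that `getcwd()` reports a path free of symbolic-link components. The helper names `cwd_via_symlink()` and `ends_with()` and the directory names `real` and `link` are hypothetical and exist only for this illustration.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: does `s` end with `suffix`? */
static int ends_with(const char *s, const char *suffix)
{
	size_t n = strlen(s), m = strlen(suffix);

	return n >= m && !strcmp(s + n - m, suffix);
}

/*
 * Hypothetical demo helper: inside the existing directory `base`,
 * create `real/` and a symlink `link` -> `real`, enter the directory
 * through the symlink, and store what getcwd() reports in `out`.
 * Returns 0 on success, -1 on any failure.
 */
static int cwd_via_symlink(const char *base, char *out, size_t outlen)
{
	assert(base && out && outlen > 0);

	if (chdir(base) < 0 || mkdir("real", 0700) < 0 ||
	    symlink("real", "link") < 0)
		return -1;
	if (chdir("link") < 0)
		return -1;
	/* POSIX: the result contains no symbolic-link components. */
	return getcwd(out, outlen) ? 0 : -1;
}
```

After a successful `cwd_via_symlink(base, out, sizeof(out))`, the reported directory ends in `/real` rather than `/link`, which is the property the commit above restores on Windows.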
In Git for Windows, `has_symlinks` is set to 0 by default. Therefore, we need to parse the config setting `core.symlinks` to know if it has been set to `true`. In `git init`, we must do that before copying the templates because they might contain symbolic links.

Even if the support for symbolic links on Windows has not made it to upstream Git yet, we really should make sure that all the `core.*` settings are parsed before proceeding, as they might very well change the behavior of `git init` in a way the user intended.

This fixes git-for-windows#3414

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The `strbuf_readlink()` function calls `readlink()` twice if the hint argument specifies the exact size of the link target (e.g. by passing stat.st_size as returned by `lstat()`). This is necessary because `readlink(..., hint) == hint` could mean that the buffer was too small.

Use `hint + 1` as buffer size to prevent this.

Signed-off-by: Karsten Blees <karsten.blees@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
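The ambiguity described above can be reproduced with a minimal sketch. It assumes a POSIX `readlink()`, which truncates silently; the helper `link_target_fits()` is hypothetical and not part of Git.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of Git: call readlink() once with a
 * buffer of `bufsize` bytes and classify the result.
 *
 * Returns 1 if the whole target is known to fit, 0 if the result may
 * have been truncated, and -1 if readlink() or malloc() failed.
 */
static int link_target_fits(const char *path, size_t bufsize)
{
	char *buf;
	ssize_t len;

	assert(path && bufsize > 0);

	buf = malloc(bufsize);
	if (!buf)
		return -1;
	len = readlink(path, buf, bufsize);
	free(buf);
	if (len < 0)
		return -1;
	/* readlink() truncates silently, so len == bufsize is ambiguous. */
	return (size_t)len < bufsize;
}
```

For a symlink whose target is the six-byte string `target`, `link_target_fits(path, 6)` returns 0 (a second, larger read would be needed), while `link_target_fits(path, 7)` returns 1. Passing `hint + 1` when `hint` is the exact size from `lstat()` therefore settles the question in one call.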
The `strbuf_readlink()` function refuses to read link targets that exceed 2*PATH_MAX (even if a sufficient size was specified by the caller).

The reason that that limit is 2*PATH_MAX instead of PATH_MAX is that the symlink targets do not need to be normalized. After running `ln -s a/../a/../a/../a/../b c`, the target of the symlink `c` will not be normalized to `b` but instead be much longer. As such, symlink targets' lengths can far exceed PATH_MAX.

They are frequently much longer than 2*PATH_MAX on Windows, which actually supports paths up to 32,767 characters, but sets PATH_MAX to 260 for backwards compatibility. For full details, see https://learn.microsoft.com/en-us/windows/win32/fileio/maximum-file-path-limitation

Let's just hard-code the limit used by `strbuf_readlink()` to 32,767 and make it independent of the current platform's PATH_MAX.

Based-on-a-patch-by: Karsten Blees <karsten.blees@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
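A bounded retry loop of the kind discussed in this thread might look roughly like the sketch below. It is a standalone approximation using `malloc()`/`realloc()` rather than Git's `strbuf` API; the names `read_link_bounded()` and `MAXLINK_SKETCH` are hypothetical, and the cap of 32,767 is taken from the commit message above rather than from the final patch.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed cap, mirroring the 32,767-character limit named above. */
#define MAXLINK_SKETCH 32767

/*
 * Hypothetical stand-in for strbuf_readlink(): returns a malloc()ed,
 * NUL-terminated copy of the link target, or NULL on error or when
 * the loop gives up because `hint` reached the cap.
 */
static char *read_link_bounded(const char *path, size_t hint)
{
	char *buf = NULL;

	assert(path);
	if (hint < 32)
		hint = 32;

	while (hint < MAXLINK_SKETCH) {
		char *grown = realloc(buf, hint + 1);
		ssize_t len;

		if (!grown)
			break;
		buf = grown;

		/* hint + 1 bytes: len <= hint proves nothing was cut off. */
		len = readlink(path, buf, hint + 1);
		if (len < 0) {
			/* Retry only if the platform says "buffer too small". */
			if (errno != ERANGE)
				break;
		} else if ((size_t)len <= hint) {
			buf[len] = '\0';
			return buf;
		}
		hint *= 2;
	}
	free(buf);
	return NULL;
}
```

With the symlink from the commit message (`ln -s a/../a/../a/../a/../b c`), `read_link_bounded("c", 0)` returns the unnormalized 21-byte target on the first attempt. Longer targets are picked up after one or more doublings, and any error other than `ERANGE` ends the loop immediately, which is the property Junio checks for in his review.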
Currently, this function hard-codes the directory separator as the forward slash.

However, on Windows the backslash character is valid, too. And we want to call this function in the upcoming support for symlinks on Windows with the symlink targets (which naturally use the canonical directory separator on Windows, which is _not_ the forward slash).

Prepare that function to be useful also in that context.

Signed-off-by: Karsten Blees <karsten.blees@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
|
This patch series was integrated into seen via git@9c83571. |
3521180 to
9823cbb
Compare
After preparing Git's test suite for the upcoming support for symlinks on Windows, this patch series touches up a couple of code paths that might not seem to be related at first, but need to be adjusted for the symlink support to work as expected.
This is based on js/test-symlink-windows.
Changes since v1:
cc: Patrick Steinhardt ps@pks.im