Skip to content

Add .gitattributes and normalize line endings across repo#6267

Open
tianon-sso wants to merge 2 commits into
microsoft:masterfrom
tianon-sso:line-endings
Open

Add .gitattributes and normalize line endings across repo#6267
tianon-sso wants to merge 2 commits into
microsoft:masterfrom
tianon-sso:line-endings

Conversation

@tianon-sso

@tianon-sso tianon-sso commented Jun 5, 2026

Copy link
Copy Markdown
  • set CRLF as the default checkout encoding for this Windows-native project; text=auto handles binary detection (including UTF-16 LE vendored files) without explicit overrides
  • keep *.patch as LF -- vcpkg applies these against LF source trees via git apply
  • the remaining 1,753 staged files are a one-time re-normalization of git's internal LF object storage; git diff --ignore-all-space shows only .gitattributes as a real change

Assisted-By: "claude my eyes right out"

(I'm more than happy to rebase, adjust, discuss, amend, etc)

Microsoft Reviewers: Open in CodeFlow

@tianon-sso tianon-sso requested a review from a team as a code owner June 5, 2026 20:58
@tianon-sso

Copy link
Copy Markdown
Author

I noticed this while looking at something else -- I'm happy to close instead if that's preferred.

I do not know if my employer (Docker) has signed Microsoft's CLA, but I imagine they have. Is there some way we can check? I can't reasonably sign for them, but this is made in the scope of my work (which is generally all open source and freely licensed).

@tianon-sso

Copy link
Copy Markdown
Author

(one could also argue pretty convincingly that the changes I've made here are pretty categorically un-copywritable, and I'd be perfectly comfortable with that interpretation)

florelis
florelis previously approved these changes Jun 5, 2026

@florelis florelis left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm generally okay with this change, but I'd like to have at least one other person in the team weigh in.

I pulled the branch and confirmed there are only whitespace changes besides .gitattributes

I think I'd prefer to have all of the files with the same line endings instead of excluding some types. I would expect all the tooling we use to Just Work® with CRLF.

For the excluded ExternalModules, couldn't we instead do -eol?

For the excluded binary types, I would have expected text=auto to recognize them properly.

@tianon-sso

Copy link
Copy Markdown
Author

❤️

That's all fair and sane, yeah (although -eol isn't a thing - we'd just remove the rule and let text=auto do the right thing naturally); here's a smaller version I'm happy to update to:

# Default: CRLF on checkout - primary tooling is Visual Studio / PowerShell on Windows.
# Git always stores text as LF in the object store; eol= controls the working tree only.
* text=auto eol=crlf

# Patch files must stay LF - git apply and patch(1) match content lines against
# the target source tree, which vcpkg manages as LF internally
*.patch text eol=lf

(should I amend + force-push, or make a new commit for that?)

@tianon-sso

Copy link
Copy Markdown
Author

Aha, I found a case where I've got permission (and we've previously signed a CLA):

@microsoft-github-policy-service agree [company="Docker"]

@tianon-sso

Copy link
Copy Markdown
Author

@microsoft-github-policy-service agree company="Docker"

@Trenly Trenly left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The .editorconfig specifies CRLF for all files with very specific exclusions. The new gitattributes file specifies lf endings more broadly. These two should match in their assumptions of which file types use which ending (and should take into account the suggestions from @florelis above)

@microsoft-github-policy-service microsoft-github-policy-service Bot added the Needs-Author-Feedback Issue needs attention from issue or PR author label Jun 5, 2026
@Trenly

Trenly commented Jun 5, 2026

Copy link
Copy Markdown
Contributor

(should I amend + force-push, or make a new commit for that?)

Just a new commit should be fine - the PRs here get squash merged, so it's nice to preserve the history on the PR

@microsoft-github-policy-service microsoft-github-policy-service Bot added Needs-Attention Issue needs attention from Microsoft and removed Needs-Author-Feedback Issue needs attention from issue or PR author labels Jun 5, 2026
@tianon-sso

Copy link
Copy Markdown
Author

Totally fair/understood; updated! ❤️

$ git diff --ignore-all-space origin/master..HEAD
diff --git a/.gitattributes b/.gitattributes
new file mode 100644
index 00000000..36e0b0c7
--- /dev/null
+++ b/.gitattributes
@@ -0,0 +1,7 @@
+# Default: CRLF on checkout - primary tooling is Visual Studio / PowerShell on Windows.
+# Git always stores text as LF in the object store; eol= controls the working tree only.
+* text=auto eol=crlf
+
+# Patch files must stay LF - git apply matches content lines against the target source
+# tree, which vcpkg manages as LF internally
+*.patch text eol=lf
$ git diff --ignore-all-space --stat origin/master..HEAD
 .gitattributes | 7 +++++++
 1 file changed, 7 insertions(+)

@JohnMcPMS

Copy link
Copy Markdown
Member

Just to be open about this PR, anything that touches 2 thousand files is suspect. I literally cannot be too careful here as I'm sure you can understand. I'm not accusing you of anything, I'm just trying to explain my defensive posture.

I would suggest using --ignore-space-change in the future, as --ignore-all-space can definitely allow for changes to semantics (despite the difficulty in achieving an actionable attack). It also only detected the .gitattributes.

I copied your .gitattributes onto a new branch and did my own renormalize, which was found to have no diff from your branch.

@JohnMcPMS

Copy link
Copy Markdown
Member

/azp run

@azure-pipelines

Copy link
Copy Markdown
Azure Pipelines successfully started running 1 pipeline(s).

@tianon-sso

tianon-sso commented Jun 9, 2026

Copy link
Copy Markdown
Author

100% understood, appreciated, and if I could make this change without touching all the files I would!

I could revert all the normalization and only do the .gitattributes change if you'd rather let normalization happen as each file gets changed, but then it clutters future PRs and makes history more annoying. We could also revert this to just the .gitattributes file and have someone from the team do a follow-up instead -- I'm happy to do whatever! ❤️

(but yeah, to be clear/repeat myself, your careful review and skepticism is 100% warranted, appreciated, and understood! 💖)

@JohnMcPMS

Copy link
Copy Markdown
Member

I could revert all the normalization and only do the .gitattributes change if you'd rather let normalization happen as each file gets changed, but then it clutters future PRs and makes history more annoying. We could also revert this to just the .gitattributes file and have someone from the team do a follow-up instead -- I'm happy to do whatever! ❤️

I considered asking for that, but I would just --renormalize as I did anyway and with the outputs being identical I can be happy with that.

I did notice that there are some test failures (likely due to certs expiring) so don't be concerned if you see those. I am going to fix them and will need to do another build after.

@JohnMcPMS

Copy link
Copy Markdown
Member

Given the conflicts, if you want to just revert to .gitattributes only I can shepherd a --renormalize PR through.

- set CRLF as the default checkout encoding for this Windows-native project, with LF overrides for cross-platform formats (`.md`, `.json`, `.js`, `.txt`, `.patch`, git config files)
- mark `src/PowerShell/ExternalModules/**` as `-text` to preserve vendored files byte-for-byte -- several are UTF-16 LE encoded
- the remaining 1,753 staged files are a one-time re-normalization of git's internal LF object storage; `git diff --ignore-all-space` shows only `.gitattributes` as a real change

Assisted-By: "claude my eyes right out"
- drop explicit LF overrides for `.md`, `.json`, `.js`, etc. -- `text=auto eol=crlf` handles those correctly; no tooling in this repo requires LF for those types
- drop `src/PowerShell/ExternalModules/** -text` -- `text=auto` already detects UTF-16 LE files as binary natively
- drop explicit `binary` entries -- redundant with `text=auto` binary detection
- keep `*.patch text eol=lf` -- vcpkg applies these against LF source trees via `git apply`; CRLF patch content risks mismatched hunk context
@tianon-sso

Copy link
Copy Markdown
Author

Successfully rebased/reverted to the single file change 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Needs-Attention Issue needs attention from Microsoft

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants