-
Notifications
You must be signed in to change notification settings - Fork 15.7k
[dsymutil] Add new argument allow_invalid_macho #173503
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
@llvm/pr-subscribers-debuginfo Author: Ch1p (Ch111p) ChangesSimilar to the issue discussed in #165940 and in some apps I’ve seen at work, we’ve observed dSYM files being generated where a section’s offset exceeds 4GB. This seems to happen because many DWARF sections produced for DWARF4 end up with an alignment of 0. Before PR #168925 (which added validation for sections with a zero alignment attribute), this could go unnoticed and result in an invalid MachO. However, such a dSYM is effectively an invalid Mach-O file today. Unless/until Mach-O explicitly defines standards like: Full diff: https://github.com/llvm/llvm-project/pull/173503.diff 6 Files Affected:
diff --git a/llvm/tools/dsymutil/DwarfLinkerForBinary.cpp b/llvm/tools/dsymutil/DwarfLinkerForBinary.cpp
index 1fc5bba602d8b..68756a891d90d 100644
--- a/llvm/tools/dsymutil/DwarfLinkerForBinary.cpp
+++ b/llvm/tools/dsymutil/DwarfLinkerForBinary.cpp
@@ -883,7 +883,7 @@ bool DwarfLinkerForBinary::linkImpl(
ObjectType == Linker::OutputFileType::Object)
return MachOUtils::generateDsymCompanion(
Options.VFS, Map, *Streamer->getAsmPrinter().OutStreamer, OutFile,
- RelocationsToApply);
+ RelocationsToApply, Options.AllowInvalidMachO);
Streamer->finish();
return true;
diff --git a/llvm/tools/dsymutil/LinkUtils.h b/llvm/tools/dsymutil/LinkUtils.h
index c333a3d4afee0..d950f5f19474c 100644
--- a/llvm/tools/dsymutil/LinkUtils.h
+++ b/llvm/tools/dsymutil/LinkUtils.h
@@ -123,6 +123,10 @@ struct LinkOptions {
bool IncludeSwiftModulesFromInterface = false;
/// @}
+ /// Whether to allow generating Mach-O files that exceed certain format
+ /// limitations, such as sections with file offsets greater than 4GB.
+ bool AllowInvalidMachO = false;
+
LinkOptions() = default;
};
diff --git a/llvm/tools/dsymutil/MachOUtils.cpp b/llvm/tools/dsymutil/MachOUtils.cpp
index fba698896618b..deb2225dc22c6 100644
--- a/llvm/tools/dsymutil/MachOUtils.cpp
+++ b/llvm/tools/dsymutil/MachOUtils.cpp
@@ -324,7 +324,7 @@ static void transferSegmentAndSections(
// Write the __DWARF segment load command to the output file.
static bool createDwarfSegment(const MCAssembler& Asm,uint64_t VMAddr, uint64_t FileOffset,
uint64_t FileSize, unsigned NumSections,
- MachObjectWriter &Writer) {
+ MachObjectWriter &Writer, bool AllowInvalidMachO) {
Writer.writeSegmentLoadCommand("__DWARF", NumSections, VMAddr,
alignTo(FileSize, 0x1000), FileOffset,
FileSize, /* MaxProt */ 7,
@@ -340,7 +340,7 @@ static bool createDwarfSegment(const MCAssembler& Asm,uint64_t VMAddr, uint64_t
VMAddr = alignTo(VMAddr, Alignment);
FileOffset = alignTo(FileOffset, Alignment);
}
- if (FileOffset > UINT32_MAX)
+ if (FileOffset > UINT32_MAX && !AllowInvalidMachO)
return error("section " + Sec->getName() +
"'s file offset exceeds 4GB."
" Refusing to produce an invalid Mach-O file.");
@@ -374,7 +374,7 @@ bool generateDsymCompanion(
llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem> VFS, const DebugMap &DM,
MCStreamer &MS, raw_fd_ostream &OutFile,
const std::vector<MachOUtils::DwarfRelocationApplicationInfo>
- &RelocationsToApply) {
+ &RelocationsToApply, bool AllowInvalidMachO) {
auto &ObjectStreamer = static_cast<MCObjectStreamer &>(MS);
MCAssembler &MCAsm = ObjectStreamer.getAssembler();
auto &Writer = static_cast<MachObjectWriter &>(MCAsm.getWriter());
@@ -586,7 +586,7 @@ bool generateDsymCompanion(
// Write the load command for the __DWARF segment.
if (!createDwarfSegment(MCAsm, DwarfVMAddr, DwarfSegmentStart, DwarfSegmentSize,
- NumDwarfSections, Writer))
+ NumDwarfSections, Writer, AllowInvalidMachO))
return false;
assert(OutFile.tell() == LoadCommandSize + HeaderSize);
diff --git a/llvm/tools/dsymutil/MachOUtils.h b/llvm/tools/dsymutil/MachOUtils.h
index 0229647b00f85..d276a21b6de72 100644
--- a/llvm/tools/dsymutil/MachOUtils.h
+++ b/llvm/tools/dsymutil/MachOUtils.h
@@ -59,7 +59,7 @@ bool generateDsymCompanion(
llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem> VFS, const DebugMap &DM,
MCStreamer &MS, raw_fd_ostream &OutFile,
const std::vector<MachOUtils::DwarfRelocationApplicationInfo>
- &RelocationsToApply);
+ &RelocationsToApply, bool AllowInvalidMachO);
std::string getArchName(StringRef Arch);
} // namespace MachOUtils
diff --git a/llvm/tools/dsymutil/Options.td b/llvm/tools/dsymutil/Options.td
index 571f90c1e46f5..036094f5eeca5 100644
--- a/llvm/tools/dsymutil/Options.td
+++ b/llvm/tools/dsymutil/Options.td
@@ -108,6 +108,12 @@ def fat64: F<"fat64">,
def update: F<"update">,
HelpText<"Updates existing dSYM files to contain the latest accelerator tables and other DWARF optimizations.">,
Group<grp_general>;
+
+def allow_invalid_macho: F<"allow-invalid-macho">,
+ HelpText<"Allow generating Mach-O files that exceed certain format limitations, such as sections with file offsets greater than 4GB."
+ "Use with caution, as such files may not be compatible with all tools.">,
+ Group<grp_general>;
+
def: Flag<["-"], "u">,
Alias<update>,
HelpText<"Alias for --update">,
diff --git a/llvm/tools/dsymutil/dsymutil.cpp b/llvm/tools/dsymutil/dsymutil.cpp
index a71c57c60593c..0a3af1bdb0cec 100644
--- a/llvm/tools/dsymutil/dsymutil.cpp
+++ b/llvm/tools/dsymutil/dsymutil.cpp
@@ -315,6 +315,8 @@ static Expected<DsymutilOptions> getOptions(opt::InputArgList &Args) {
Options.LinkOpts.Fat64 = Args.hasArg(OPT_fat64);
Options.LinkOpts.KeepFunctionForStatic =
Args.hasArg(OPT_keep_func_for_static);
+ Options.LinkOpts.AllowInvalidMachO =
+ Args.hasArg(OPT_allow_invalid_macho);
if (opt::Arg *ReproducerPath = Args.getLastArg(OPT_use_reproducer)) {
Options.ReproMode = ReproducerMode::Use;
|
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
b3be6e6 to
9ca9117
Compare
|
Hello @JDevlieghere ! Could you help me review this PR? |
|
I had some trouble following the PR description, but here's what I think is going on. Please correct me if I misunderstood anything.
Separately, as long as the actual offset is less than UINT32_MAX and sections are ordered, this can be made to work, even if the computed offset exceeds it, but as you point out that's not standard and currently only supported by LLVM's libObject (as of #165940). Presumably, the motivation for this new flag is to support the latter, rather than just provide an escape hatch for your previous fix? Because without that you're just generating Mach-Os you can't read. Assuming that's the case, then I think a new flag to allow this is reasonable given:
|
|
Sorry for the ambiguous PR description. And yes, your understanding matches what I’m seeing.
Agreed. I’ll add a comment at the relevant point in the code.
Agreed. I’ll rename the flag to be more descriptive and update the help text. |
llvm/tools/dsymutil/Options.td
Outdated
| HelpText<"Updates existing dSYM files to contain the latest accelerator tables and other DWARF optimizations.">, | ||
| Group<grp_general>; | ||
|
|
||
| def allow_slice_section_header_offset_overflow : F<"allow-slice-section-header-offset-overflow">, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Let's drop the slice part. We should include that in the description (to disambiguate with the fat64 header flag) but I think it's a bit too verbose for this option. I'd drop it form the variable name too (i.e AllowSectionHeaderOffsetOverflow).
| def update: F<"update">, | ||
| HelpText<"Updates existing dSYM files to contain the latest accelerator tables and other DWARF optimizations.">, | ||
| Group<grp_general>; | ||
|
|
||
| def allow_slice_section_header_offset_overflow : F<"allow-slice-section-header-offset-overflow">, | ||
| HelpText<"Allow emitting Mach-O where, within a single slice, section header offsets (section.offset, 32-bit) exceed the 4GB limit (non-standard). " | ||
| "This option overrides dsymutil's per-slice 4GB check when writing section headers. " | ||
| "Use with caution: some tools may not be able to read the output.">, | ||
| Group<grp_general>; | ||
|
|
||
| def: Flag<["-"], "u">, | ||
| Alias<update>, | ||
| HelpText<"Alias for --update">, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about:
Allow emitting a section offset (32-bit) that, when applied, would exceed the Mach-O 4GB limit. This leverages the fact that the offset can be computed as a 64-bit value if the sections are ordered (always the case for dsymutil). Note that this generates an invalid Mach-O and requires explicit support by the tools that will need to consume it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's much cleaner! I’ll update the help text and drop the 'slice' part.
Similar to the issue discussed in #165940 and in some apps I’ve seen at work, we’ve observed dSYM files being generated where a section’s offset exceeds 4GB.
This seems to happen because many DWARF sections produced for DWARF4 end up with an alignment of 0. Before PR #168925 (which added validation for sections with a zero alignment attribute), this could go unnoticed and result in an invalid MachO.
However, such a dSYM is effectively an invalid Mach-O file today. Unless/until Mach-O explicitly defines standards like:
"32-bit section offsets can exceed 4G but must be ordered for 64-bit offset reconstruction", I think the proper approach is to add a new argument to support this.