Speed up mimetypes.guess_type() for plain file paths #150821
Closed as not planned
Labels: pending, performance, stdlib, type-feature
`mimetypes.guess_type()` accepts either a URL or a filesystem path, so it parses its argument as a URL with `urllib.parse.urlparse()` before looking at the extension. The common argument is a plain file path, which has no URL scheme to find, so the parse, and the `urllib.parse` import it pulls in, are wasted work.

Guessing content types from file names is everywhere: static-file servers, upload handlers, and archive and build tools deciding how to treat each file as they walk a tree of thousands.
Guessing types for 15 real file names sampled from the top-1000 corpus takes 23.4 µs today. When a plain path skips the URL parse and goes straight to extension lookup, the same run takes 11.0 µs, which is 112% faster. Real URLs still take the full parsing path, and results are unchanged for both.
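The idea can be illustrated from user code with a minimal sketch. This is not the proposed patch: `guess_type_fast` is a hypothetical wrapper name, and the check for `":"` is an assumed cheap test for "cannot contain a URL scheme". It relies on `mimetypes.guess_file_type()`, which exists only on Python 3.13 and later, and falls back to `guess_type()` on older versions (where it gains nothing).

```python
import mimetypes

# guess_file_type() was added in Python 3.13; on older versions fall back to
# guess_type(), which makes this wrapper a no-op rather than a speedup.
_guess_path = getattr(mimetypes, "guess_file_type", mimetypes.guess_type)


def guess_type_fast(name, strict=True):
    """Hypothetical wrapper: skip URL handling for names that cannot be URLs."""
    # A URL scheme is always followed by ":", so a string without one
    # cannot carry a scheme and can be treated as a plain file path.
    if isinstance(name, str) and ":" not in name:
        return _guess_path(name, strict=strict)
    # Anything else (real URLs, Windows drive paths, path-like objects)
    # takes the unchanged full path through guess_type().
    return mimetypes.guess_type(name, strict=strict)


if __name__ == "__main__":
    for sample in ("report.pdf", "archive.tar.gz", "http://example.com/index.html"):
        print(sample, guess_type_fast(sample))
```

Names containing `":"`, such as Windows drive paths, deliberately stay on the slow path here, so the sketch trades some coverage for a check that cannot misclassify a URL.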