The performance of creating new datetime formats can be improved #585
Conversation
- Replace direct list concatenation in `ParserStructure.append()` with `ConcatenatedListView` for improved efficiency.
- Add `ConcatenatedListView` implementation to lazily combine two lists without creating a new collection.
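For readers unfamiliar with the idea, here is a minimal sketch of what such a lazy view can look like; this is an illustration of the description, not the PR's actual code:

class ConcatenatedListView<T>(
    val first: List<T>,
    val second: List<T>,
) : AbstractList<T>() {
    override val size: Int
        get() = first.size + second.size

    // Elements of `first` come before elements of `second`; nothing is copied.
    override fun get(index: Int): T =
        if (index < first.size) first[index] else second[index - first.size]
}

With this, a hypothetical `ParserStructure.append` could return `ConcatenatedListView(this.operations, other.operations)` in O(1) instead of `this.operations + other.operations`, which copies both lists.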
@dkhalanskyjb Hello! What do you think about this idea?
Hi! This may help with the first stage (building a parser before the normalisation), but normalisation has quadratic complexity, too, and it wouldn't benefit from the proposed approach, as the lists themselves will also need to be reconstructed. We could extract the happy fast path where there are no adjacent numeric parser operations and simplify normalisation there. That is a common case, so it would be nice to provide brilliant performance there. I'm not yet convinced the common case can't be drastically improved by a better algorithm.
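A sketch of what that "happy fast path" check could look like, under stated assumptions: the type names below are stand-ins for illustration, not the library's internal API. The idea is that if no two adjacent operations both consume digits, normalisation never needs to merge number consumers and can process the operation list in a single linear pass.

// Minimal stand-in types for illustration only; the real parser's classes differ.
sealed interface ParserOperation
class NumberSpanParserOperation : ParserOperation
class PlainStringParserOperation(val string: String) : ParserOperation

// If no two adjacent operations are both numeric, the expensive merging step can be skipped.
fun canSkipNumberMerging(operations: List<ParserOperation>): Boolean =
    operations.zipWithNext().none { (a, b) ->
        a is NumberSpanParserOperation && b is NumberSpanParserOperation
    }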
Now that I think about it, it doesn't even fix the quadratic complexity of the initial stage. Concatenating … Most parsers we concatenate are going to be single-element, so …
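For intuition, reusing the hypothetical `ConcatenatedListView` from the sketch above (so this inherits all of that sketch's assumptions): appending n single-element lists just builds an n-deep, left-leaning chain of views. Each append is O(1), but `AbstractList`'s default iterator goes through `get(index)`, and `get` has to walk down the chain, so a full traversal of the result is still quadratic in n.

fun buildByRepeatedAppend(n: Int): List<Int> {
    var operations: List<Int> = emptyList()
    repeat(n) { i ->
        // each append is O(1), but the view tree gets one level deeper every time
        operations = ConcatenatedListView(operations, listOf(i))
    }
    return operations
}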
Yep, the initial stage also doesn't benefit from this. A quick run of a benchmark shows this:

Before the change: 5.2
After the change: 7.8

Here, less is better (the numbers 5.2 and 7.8 show how long in milliseconds an operation takes). The benchmark itself is creating the datetime format used in Python:

@Benchmark
fun buildFormat(blackhole: Blackhole) {
val v = LocalDateTime.Format {
year()
char('-')
monthNumber()
char('-')
day()
char(' ')
hour()
char(':')
minute()
optional {
char(':')
second()
optional {
char('.')
secondFraction()
}
}
}
blackhole.consume(v)
}
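The snippet above omits its surrounding declarations. Assuming a plain JMH setup (the project's actual benchmark module may be configured differently), a self-contained harness could look roughly like this; the class name is made up and the builder body is shortened for brevity:

import java.util.concurrent.TimeUnit
import kotlinx.datetime.LocalDateTime
import kotlinx.datetime.format.char
import org.openjdk.jmh.annotations.*
import org.openjdk.jmh.infra.Blackhole

@State(Scope.Benchmark)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
open class FormatCreationBenchmark {
    @Benchmark
    fun buildFormat(blackhole: Blackhole) {
        // shortened builder body; the full format from the comment above goes here
        val format = LocalDateTime.Format {
            year(); char('-'); monthNumber(); char('-'); dayOfMonth()
        }
        blackhole.consume(format)
    }
}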
…ability and specificity.
…ure formatting scalability.
- Introduce `ConcatenatedListViewIterator` to enable iteration without materializing combined lists.
- Optimize nested list handling by directly traversing inner `ConcatenatedListView` instances.
…tokens in format definition
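One of the commits above introduces a `ConcatenatedListViewIterator`. The following is a hypothetical reconstruction of that idea, building on the `ConcatenatedListView` sketch shown earlier (and therefore not the PR's actual code): it keeps an explicit stack of pending lists, so nested views are unpacked without materializing a combined list and without recursion.

class ConcatenatedListViewIterator<T>(root: ConcatenatedListView<T>) : Iterator<T> {
    // Lists that still have to be visited; the leftmost pending list sits on top
    // of the stack so elements come out in the original order.
    private val pending = ArrayDeque<List<T>>().apply { addLast(root) }
    private var current: Iterator<T> = emptyList<T>().iterator()

    private fun advance() {
        while (!current.hasNext() && pending.isNotEmpty()) {
            val next = pending.removeLast()
            if (next is ConcatenatedListView<*>) {
                // Unpack a nested view into its two halves instead of iterating it
                // through get(index); the cast is safe because the view only holds Ts.
                @Suppress("UNCHECKED_CAST")
                val view = next as ConcatenatedListView<T>
                pending.addLast(view.second)
                pending.addLast(view.first)
            } else {
                current = next.iterator()
            }
        }
    }

    override fun hasNext(): Boolean {
        advance()
        return current.hasNext()
    }

    override fun next(): T {
        advance()
        return current.next()
    }
}

The view's own `iterator()` would then return this iterator instead of the `AbstractList` default, turning a full traversal of a deeply nested structure from quadratic back into linear.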
I have avoided …

open class ParserStructureConcatBenchmark {
@Param("1", "2", "4", "8", "16", "32", "64", "128", "256", "512", "1024")
var n = 0
@Benchmark
fun largeSerialFormat(blackhole: Blackhole) {
val format = LocalDateTime.Format {
repeat(n) {
char('^')
monthNumber()
char('&')
day()
char('!')
hour()
char('$')
minute()
char('#')
second()
char('@')
}
}
blackhole.consume(format)
}
}
…cture formatting performance
…mplex parser structures
…us checks with bitwise operations
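One of the truncated commit titles above mentions replacing checks with bitwise operations. Purely as an illustration of that technique (the flag names are invented and not taken from the PR), the usual shape is to pack several boolean properties into an Int mask so that a group of conditions collapses into a single bitwise test:

// Invented flags for illustration; the PR's actual flags and checks are not shown here.
private val FLAG_NUMERIC = 1 shl 0
private val FLAG_OPTIONAL = 1 shl 1
private val FLAG_SIGNED = 1 shl 2

// One bitwise test replaces three separate boolean reads and comparisons:
// "numeric, and neither optional nor signed".
fun isPlainNumeric(flags: Int): Boolean =
    (flags and (FLAG_NUMERIC or FLAG_OPTIONAL or FLAG_SIGNED)) == FLAG_NUMERIC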
It's cool that the quadratic complexity can be mitigated in some scenarios, and I still believe that it's currently contributing significantly to the runtime we essentially observe. That said, the PR as a whole looks very much like a Pyrrhic victory to me, as the … (Less is better here.)

Performance improvements are pointless if they negatively impact the actual use cases our users will encounter. Yes, the quadratic complexity was eliminated, but the increase in the resulting constant just seems too high. I propose that we add all the common formats we already provide as benchmarks (…
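As an illustration of that proposal (not an actual addition to the PR), a benchmark class over ISO-like formats built through the public builder API could look like this, reusing the hypothetical JMH harness above together with the corresponding `kotlinx.datetime` imports; the benchmark names and format bodies are illustrative, not the library's exact predefined formats:

open class CommonFormatsBenchmark {
    @Benchmark
    fun isoLocalDate(blackhole: Blackhole) {
        // roughly the predefined ISO date format, rebuilt through the public builder
        blackhole.consume(LocalDate.Format {
            year(); char('-'); monthNumber(); char('-'); dayOfMonth()
        })
    }

    @Benchmark
    fun isoLocalTime(blackhole: Blackhole) {
        // roughly the predefined ISO time format, with optional seconds and fraction
        blackhole.consume(LocalTime.Format {
            hour(); char(':'); minute()
            optional { char(':'); second(); optional { char('.'); secondFraction() } }
        })
    }
}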