feat: Add timestamp nanosecond primitive types #653

Open
zhjwpku wants to merge 2 commits into apache:main from zhjwpku:iceberg-v3-timestamp_ns-timestamptz_ns_types
Conversation

Collaborator

zhjwpku commented May 17, 2026

No description provided.

Collaborator Author

zhjwpku commented May 17, 2026

I chose TypeId::kTimestampNs over TypeId::kTimestampNano (Java uses Nano) to align with the spec. @evindj Please help review the timestamp parsing part when you have time. I changed the fractional seconds handling a bit.

@zhjwpku zhjwpku requested a review from wgtmac May 17, 2026 04:33
template <>
int32_t HashLiteral<TypeId::kTimestampTzNs>(const Literal& literal) {
  return BucketUtils::HashLong(std::get<int64_t>(literal.value()));
}
Member

According to the Iceberg V3 spec and the Java implementation (BucketTimestampNano.java), nanosecond timestamps must be converted to microseconds (divided by 1000) before hashing. This ensures that bucket partitioning is consistent between microsecond and nanosecond precision types for the same logical time.

return BucketUtils::HashLong(std::get<int64_t>(literal.value()) / 1000);

Collaborator Author

Fixed.
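
A note on the suggested `/ 1000`: as the later thread on this PR points out, C++ integer division truncates toward zero, so for pre-1970 values a plain division rounds differently from a floored division. The sketch below shows one way to do a floored nanosecond-to-microsecond conversion before hashing. `FloorNanosToMicros` and `kNanosPerMicro` are hypothetical names used for illustration, not part of the PR, and whether the final fix uses floor semantics is an assumption here rather than something this conversation confirms.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical constant and helper for illustration; not taken from the PR.
constexpr int64_t kNanosPerMicro = 1000;

// Converts nanoseconds to microseconds, rounding toward negative infinity.
constexpr int64_t FloorNanosToMicros(int64_t nanos) {
  int64_t micros = nanos / kNanosPerMicro;  // truncates toward zero
  // For a negative value with a nonzero remainder, truncation rounded up,
  // so step down by one to get the floor.
  if (nanos % kNanosPerMicro < 0) {
    --micros;
  }
  return micros;
}

// Plain division maps -1 ns to microsecond 0; the floored version maps it to -1.
static_assert(-1 / kNanosPerMicro == 0);
static_assert(FloorNanosToMicros(-1) == -1);
static_assert(FloorNanosToMicros(1500) == 1);
```

With a helper like this, the value passed to the hash would be `FloorNanosToMicros(std::get<int64_t>(literal.value()))`, which agrees with plain division for non-negative inputs and differs only for negative inputs that are not a multiple of 1000.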

Comment thread src/iceberg/util/transform_util.cc Outdated
std::string TransformUtil::HumanTimestampNs(int64_t timestamp_nanos) {
  auto tp = std::chrono::time_point<std::chrono::system_clock, std::chrono::seconds>{
      std::chrono::seconds(timestamp_nanos / kNanosPerSecond)};
  auto nanos = timestamp_nanos % kNanosPerSecond;
Member

For negative timestamps (pre-1970), C++'s division (/) and modulo (%) operators truncate towards zero. This causes ParseTimestampNs and HumanTimestampNs to compute an incorrect base time point and a negative fractional part, breaking the string formatting and parsing.

For example, 1969-12-31T23:59:59.123456789 parses to -876543211 nanos. Passing this back here yields 0 for seconds and -876543211 for nanos, resulting in 1970-01-01T00:00:00.-876543211.

Consider using std::chrono::floor to handle the negative values correctly (note: the original microsecond HumanTimestamp and ParseTimestamp also suffer from this issue).

Collaborator Author

Good catch, fixed with some additional test cases.
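
For readers following the thread, here is a minimal sketch of the `std::chrono::floor` approach the reviewer suggests. It splits an epoch-nanosecond value into whole seconds and a fractional part that is never negative. `SplitTimestamp` and `SplitNanos` are hypothetical names for illustration; the actual fix in `transform_util.cc` is not shown in this conversation and may be structured differently.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical result type for illustration; not taken from the PR.
struct SplitTimestamp {
  int64_t seconds;  // whole seconds since the epoch, floored
  int64_t nanos;    // fractional part, always in [0, 1'000'000'000)
};

// Splits epoch nanoseconds so that the fraction is non-negative even
// for timestamps before 1970.
inline SplitTimestamp SplitNanos(int64_t timestamp_nanos) {
  const std::chrono::nanoseconds total{timestamp_nanos};
  // std::chrono::floor (C++17) rounds toward negative infinity,
  // unlike operator/ which truncates toward zero.
  const auto secs = std::chrono::floor<std::chrono::seconds>(total);
  const std::chrono::nanoseconds frac = total - secs;
  assert(frac.count() >= 0 && frac.count() < 1'000'000'000);
  return {static_cast<int64_t>(secs.count()), static_cast<int64_t>(frac.count())};
}
```

Using the reviewer's example, `SplitNanos(-876543211)` yields `seconds == -1` and `nanos == 123456789`, which corresponds to 1969-12-31T23:59:59.123456789 rather than the malformed 1970-01-01T00:00:00.-876543211.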

2 participants