Skip to content

GH-838: [FlightSQL][JDBC] Fix timestamp unit conversion and sub-millisecond precision in PreparedStatement parameter binding#1081

Open
dmitry-chirkov-dremio wants to merge 5 commits intoapache:mainfrom
dmitry-chirkov-dremio:fix-flightsql-jdbc-timestamp-precision
Open

GH-838: [FlightSQL][JDBC] Fix timestamp unit conversion and sub-millisecond precision in PreparedStatement parameter binding#1081
dmitry-chirkov-dremio wants to merge 5 commits intoapache:mainfrom
dmitry-chirkov-dremio:fix-flightsql-jdbc-timestamp-precision

Conversation

@dmitry-chirkov-dremio
Copy link

@dmitry-chirkov-dremio dmitry-chirkov-dremio commented Mar 21, 2026

Rationale for this change

The FlightSQL JDBC driver's PreparedStatement.setTimestamp() has two bugs (#838):

  1. Unit scaling bug: TimestampAvaticaParameterConverter writes raw epoch millisecond values directly into Arrow timestamp vectors regardless of the vector's time unit. This causes data corruption when the server uses SECOND, MICROSECOND, or NANOSECOND precision — the millisecond value is misinterpreted as the target unit (e.g., epoch millis treated as epoch micros → dates near 1970).

  2. Precision loss bug: The underlying Avatica library serializes java.sql.Timestamp into a long (epoch milliseconds) via TypedValue.ofJdbc(), discarding the sub-millisecond nanoseconds from Timestamp.getNanos() before they reach the Arrow converters. This means even with correct unit scaling, microsecond and nanosecond precision is permanently lost.

What changes are included in this PR?

Fix 1 — Unit scaling (converter layer):

  • Modified TimestampAvaticaParameterConverter to store the ArrowType.Timestamp and scale the epoch millisecond value to the target unit before writing to the vector:
    • SECOND: divide by 1,000
    • MILLISECOND: pass through
    • MICROSECOND: multiply by 1,000
    • NANOSECOND: multiply by 1,000,000

Fix 2 — Sub-millisecond precision preservation (JDBC interception layer):

  • ArrowFlightPreparedStatement: Overrides setTimestamp() to capture the raw java.sql.Timestamp objects (with full nanosecond precision) before Avatica serializes them to millis. Also overrides clearParameters() to clean up.
  • ArrowFlightMetaImpl: Added getRawTimestamps() to bridge the captured timestamps from the statement to the parameter binder.
  • AvaticaParameterBinder: Updated bind() and BinderVisitor to propagate the optional raw Timestamp to converters.
  • TimestampAvaticaParameterConverter: Added bindParameter(vector, typedValue, index, rawTimestamp) overload and convertFromTimestamp(Timestamp) that reconstructs full-precision epoch values from Timestamp.getTime() (epoch seconds) + Timestamp.getNanos() (fractional second in nanos), avoiding the Avatica millis truncation.
  • ListAvaticaParameterConverter, LargeListAvaticaParameterConverter, FixedSizeListAvaticaParameterConverter: Updated to pass null as the raw timestamp argument to the new BinderVisitor constructor.

Are these changes tested?

Yes.

  • Unit tests (TimestampAvaticaParameterConverterTest): Added tests for all 8 vector types (4 units × with/without timezone) for the millis-based scaling, plus 6 new tests for the raw Timestamp path covering micro precision, nano precision, milli truncation, second truncation, and null-rawTimestamp fallback.
  • Integration tests (ArrowFlightPreparedStatementTest): Added testTimestampParameterWithMicrosecondPrecision and testTimestampParameterWithMillisecondPrecision that exercise the full setTimestamp() → interception → binding → mock server validation path.

Are there any user-facing changes?

This PR includes breaking changes to public APIs. PreparedStatement#setTimestamp() now correctly converts timestamp values for non-millisecond precision columns. Previously, timestamps inserted via the FlightSQL JDBC driver into SECOND, MICROSECOND, or NANOSECOND columns were silently corrupted. Sub-millisecond precision (microseconds, nanoseconds) is now preserved when the target column supports it.

This PR contains a "Critical Fix". The bug caused incorrect data to be produced — epoch milliseconds were written as-is into microsecond/nanosecond vectors, resulting in wildly wrong dates (e.g., 2024 → 1970). Additionally, sub-millisecond precision was silently discarded.

Closes #838

@github-actions

This comment has been minimized.

@dmitry-chirkov-dremio dmitry-chirkov-dremio force-pushed the fix-flightsql-jdbc-timestamp-precision branch from c43b2e8 to 6d63367 Compare March 21, 2026 00:27
@dmitry-chirkov-dremio dmitry-chirkov-dremio changed the title GH-838: [FlightSQL][JDBC] Fix timestamp unit conversion in PreparedStatement parameter binding GH-WIP: [FlightSQL][JDBC] Fix timestamp unit conversion in PreparedStatement parameter binding Mar 21, 2026
…ecision in PreparedStatement parameter binding

Intercept raw java.sql.Timestamp objects at the JDBC layer before Avatica
serializes them to epoch milliseconds (losing sub-ms nanos). Propagate the
original Timestamp through ArrowFlightMetaImpl and AvaticaParameterBinder to
TimestampAvaticaParameterConverter, which reconstructs full-precision epoch
values from Timestamp.getTime() + Timestamp.getNanos().

Add unit tests for micro/nano/milli/second precision with raw Timestamp,
and integration tests exercising the full setTimestamp() path through the
mock FlightSQL server.
@dmitry-chirkov-dremio dmitry-chirkov-dremio marked this pull request as ready for review March 21, 2026 03:46
@dmitry-chirkov-dremio dmitry-chirkov-dremio changed the title GH-WIP: [FlightSQL][JDBC] Fix timestamp unit conversion in PreparedStatement parameter binding GH-838: [FlightSQL][JDBC] Fix timestamp unit conversion and sub-millisecond precision in PreparedStatement parameter binding Mar 21, 2026
@dmitry-chirkov-dremio
Copy link
Author

dmitry-chirkov-dremio commented Mar 21, 2026

There might be a preexisting lack of timezone offset handling somewhere near this path however its not relevant to the issue nor the fix.
Edit: #463, #732

@jbonofre jbonofre added the bug-fix PRs that fix a big. label Mar 21, 2026
@jbonofre jbonofre added this to the 20.0.0 milestone Mar 21, 2026
Copy link
Member

@jbonofre jbonofre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good one ! Thanks @dmitry-chirkov-dremio !

I just have a suggestion for the convertFromMillis method.

Copy link
Contributor

@xborder xborder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall LGTM. I'd probably just add a few tests for edge cases. Examples:

  • Change the Timestamp object after setting the parameter. If we're keeping a reference to the object, is it possible to change it after the bind?
  • Setting timestamp with null on a string with multiple parameters
  • Test setObject
  • Use setObject for a timestamp parameter, before/after setTimestamp

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

breaking-change bug-fix PRs that fix a big.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[JAVA] FlightSQL JDBC Driver PreparedStatement#setTimestamp() ignores connection timeStampPrecision

3 participants