SQL Filtering on Timestamps Returning Incorrect Records

I am hoping to find help with a bug I am experiencing within Dremio.

This is occurring for connections out to MySQL, HDFS, and S3. I am querying through the JDBC Driver.

I am currently developing a connector for our BI tool out to Dremio for a client and I am running into an interesting bug. I am not sure if it has been recorded yet.

Of my 1800 integration tests, 11 are failing because of a specific filter against timestamp fields (date & time).

In the below query I am looking for where date_with_null is equal to 2008-12-31T23:59:59.999Z, but the result set is returning records for the timestamp 2009-01-01 00:00:00.000

select ds.date_full as date_full, ds.date_with_null as date_with_null
from "Mysql.integration_tests".connector_test as ds
  where ds.date_with_null = '2008-12-31T23:59:59.999Z' AND ds.date_full in ('2009-12-31T23:59:59.999Z', '2012-01-01T00:00:00.000Z', '2008-12-31T23:59:59.999Z', '2010-12-31T23:59:59.999Z')

To resolve the issue I need to modify the query to be:

select ds.date_full as date_full, ds.date_with_null as date_with_null
from "Mysql.integration_tests".connector_test as ds
  where ds.date_with_null >= '2008-12-31T23:59:59.999Z' AND ds.date_with_null <= '2008-12-31T23:59:59.999Z' AND ds.date_full in ('2009-12-31T23:59:59.999Z', '2012-01-01T00:00:00.000Z', '2008-12-31T23:59:59.999Z', '2010-12-31T23:59:59.999Z')

This is not ideal as most of our query generation is automated and we cannot accommodate for that change easily.

I’ve also tried:

  • replacing the string literal with CAST(‘2008-12-31T23:59:59.999Z’ AS TIMESTAMP) but that does not work either (it performs the same).
  • using an in clause instead of an equals filter, but this too performs incorrectly.

Is this an issue that the team is currently aware of?

Thanks,
Alexander

Would you mind sharing the query profile with us (how-to available here)? I suspect some precision is loss somewhere…

Sure thing. I assumed it was a precision issue, just new to the platform so I am not sure where to begin.

Attached are two query profiles. One for the equals case with the incorrect results, and the other for the “between” case with the correct results.

Thanks!

incorrect_results.zip (3.2 KB)
correct_results.zip (3.2 KB)

It is indeed a precision issue: when pushing the condition to the MySQL source, Dremio casts the string literal into a DATETIME type, but does not specify the precision to apply. MySQL in turn is rounding the number (and not truncating it) which finally causes the extra results to show up.

One possible solution would be to use a timestamp literal, and not a string literal. In that case, Dremio would not add an extra CAST, and MySQL should return the correct results. But note that {ts '2008-12-31T23:59:59.999Z'} is not a valid timestamp literal for Dremio: you have to use {ts '2008-12-31 23:59:59.999'} instead.

Unfortunately that does not resolve the issue either. Unless I misunderstood.

I have executed:

select ds.date_full as date_full, ds.date_with_null as date_with_null
from "Mysql.integration_tests".connector_test as ds
  where ds.date_with_null = {ts '2008-12-31 23:59:59.999'} AND ds.date_full in ('2009-12-31T23:59:59.999Z', '2012-01-01T00:00:00.000Z', '2008-12-31T23:59:59.999Z', '2010-12-31T23:59:59.999Z')

To no success (2009-01-01 00:00:00.0 is still returned). I attached the query profile below. I appreciate any help that can be provided on this.

timestamp-literal.zip (3.2 KB)

Looks like it’s now hitting a separate bug where Dremio is losing precision when manipulating constant expressions. I’ll mention this thread to our internal ticket so that we can verify your queries at the same time.

Thanks for looking into this. Any ETA for adding it to the roadmap/investigate? Just a rough number that I can report back to my boss and relay to our client. We can disable the suite of features that rely on these particular queries for the time being.

It’s currently under investigation, but unfortunately I cannot give you an ETA at this point. But I would suggest to try and reach to our sales and partnership team at contact@dremio.com if you have a business case about this issue.