Unexpected result when using both APPROX_COUNT_DISTINCT and COUNT

The setup is quite simple:

CREATE TABLE $scratch.people (name) AS VALUES ('Bob'), ('Charlie'), ('Alice')

When running this query:

SELECT COUNT(*) FROM (SELECT * FROM $scratch.people WHERE name = 'Barry')

The result for the count is 0 as expected.

However, when running this query:

SELECT APPROX_COUNT_DISTINCT(name), COUNT(*) FROM (SELECT * FROM $scratch.people WHERE name = 'Barry')

The result for the count is null which is unexpected.

@bogdan.coman Thanks a lot for helping on the other issue.
Do you have a hint for that issue as well ?

It’s on the same version: 24.3.0-202312220021110714-f19d865e (from the Docker image)

Hi @jnd77

I don’t know if using COUNT twice is expected, how about if you try:

SELECT APPROX_COUNT_DISTINCT(name) FROM (SELECT * FROM $scratch.people WHERE name = 'Barry')

Does that return the expected result?

Thanks, Bogdan

Hi @bogdan.coman,

Thanks for your reply.

The issue lies with count(*) which should never return null.
I just happened to notice it when using APPROX_COUNT_DISTINCT as well (most likely a corner case when generating the query plan).

Thanks for your help.

Hi @jnd77

I see what you’re saying, not really trying to defend the bug :slight_smile:
I raised a JIRA for it (DX-86597), so the COUNT expression returns 0 instead of null. The order doesn’t matter, it looks like COUNT will return null either way.

Thanks a lot for the report!

Bogdan

Thanks for raising a JIRA ticket. :slight_smile: