Run TPC-DS SQL via API or CLI with context

Hello, I’m really interested in adopting Dremio as a data warehouse for our project, and I’m doing some performance tests.

The benchmark set is TPC-DS, which is well known in the big data world, and I’m trying to run its 99 queries using the Dremio CLI, which I found in Command Line Interface — Dremio client __version__ = '0.13.2' documentation

According to this documentation and the Dremio REST API description, there’s a context parameter that I can use when I run a query to give the path where it should run.

But for some reason I always fail to pass the “context” parameter, and I have no idea which format is correct.

For example, my user name is “abc” and I have a “tpcds” workspace. I tried passing “@abc.tpcds”, “tpcds” and “/tpcds”, and each time my query failed validation (the code I got was 400 every time).

I have two questions:

  1. How should I pass “context” to set the path where my query runs? Can you give me an example?
  2. Do you have a TPC-DS query set that runs well on Dremio? I use the Presto and Spark SQL query sets, but some queries fail.

Thanks in advance!

@indigoblue8848

#1 Does the same query work via the UI?
#2 When you say some queries fail, do you have profiles for them?

  1. The queries that fail via the CLI (API) don’t work via the UI either. I’ll list the failed queries with their error messages below.

  2. I have profiles, but I’m not sure I can share them with you. Because of my company’s security policy, I should be a little careful about uploading files to the internet. If they’re necessary, please let me know.

I have 6 servers, each equipped with 380GB of memory.

conf
DREMIO_MAX_MEMORY_SIZE_MB=320384
DREMIO_MAX_DIRECT_MEMORY_SIZE_MB=200384
DREMIO_MAX_HEAP_MEMORY_SIZE_MB=81920

Here are the failed queries:

query77 Dremio doesn’t support ROLLUP, CUBE or GROUPING SETS functionality.
query67 Dremio doesn’t support ROLLUP, CUBE or GROUPING SETS functionality.
query48 takes too long
query41 Dremio doesn’t currently support non-scalar sub-queries used in an expression
query38 Dremio doesn’t currently support INTERSECT operations.
query24a Query was cancelled because it exceeded the memory limits set by the administrator.
query24b Query was cancelled because it exceeded the memory limits set by the administrator.
query23b Dremio doesn’t currently support non-scalar sub-queries used in an expression
query23a Dremio doesn’t currently support non-scalar sub-queries used in an expression
query22 Dremio doesn’t support ROLLUP, CUBE or GROUPING SETS functionality.
query18 Dremio doesn’t support ROLLUP, CUBE or GROUPING SETS functionality.
query14a Dremio doesn’t currently support INTERSECT operations.
query13 takes too long, then fails with ExecutionSetupException: One or more nodes lost connectivity during query. Identified nodes were [10.182.0.59:0].
query08 Dremio doesn’t currently support INTERSECT operations.
query06 Dremio doesn’t currently support non-scalar sub-queries used in an expression
query05 Dremio doesn’t support ROLLUP, CUBE or GROUPING SETS functionality.
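
To make the pattern in this list easier to see, the failures can be tallied by cause. This is only a rough sketch: the cause labels (`grouping_sets`, `non_scalar_subquery`, and so on) are names made up for illustration, and the mapping is copied by hand from the error messages above.

```python
from collections import Counter

# Query name -> cause label, transcribed from the failure list above.
# The labels are illustrative, not Dremio error codes.
failures = {
    "query77": "grouping_sets",
    "query67": "grouping_sets",
    "query48": "too_slow",
    "query41": "non_scalar_subquery",
    "query38": "intersect",
    "query24a": "memory_limit",
    "query24b": "memory_limit",
    "query23b": "non_scalar_subquery",
    "query23a": "non_scalar_subquery",
    "query22": "grouping_sets",
    "query18": "grouping_sets",
    "query14a": "intersect",
    "query13": "too_slow",  # also reported a lost-node ExecutionSetupException
    "query08": "intersect",
    "query06": "non_scalar_subquery",
    "query05": "grouping_sets",
}

counts = Counter(failures.values())

# Print causes from most to least frequent.
for cause, n in counts.most_common():
    print(f"{cause}: {n}")
```

Most of the failures fall under unsupported SQL features rather than resource problems.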

@indigoblue8848 I see two classes of errors:

  • Out of memory - This looks like a capacity issue; if you think it is not, we need profiles to continue helping
  • Unsupported SQL functions - We are constantly extending our list of SQL functions, so please watch the release notes for newly introduced ones

Thanks. As for out of memory, the queries that hit this error work well with Presto on the same machines. I’ll share my profiles after checking that it’s not prohibited.

My main question at this point is about “context”, and any information about it would be very helpful :)
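
For anyone landing here with the same question, here is a minimal sketch of what the request body might look like. It assumes the REST SQL endpoint takes `context` as a JSON array with one element per path segment rather than a single dotted or slash-separated string, which would explain the 400 responses for “@abc.tpcds” and “/tpcds”. The helper name `build_sql_payload` and the table name `store_sales` are made up for illustration, and `["@abc", "tpcds"]` assumes `tpcds` is a folder in the home space of user `abc`; if `tpcds` is a top-level space, the context would presumably be just `["tpcds"]`.

```python
import json


def build_sql_payload(sql, context):
    """Build a JSON-serializable body for a SQL submission.

    `context` must be a list of path segments, e.g. ["@abc", "tpcds"],
    not a single string such as "@abc.tpcds" (assumed format, see above).
    """
    if isinstance(context, str):
        raise TypeError("context must be a list of path segments, not a string")
    return {"sql": sql, "context": list(context)}


# Hypothetical example: run a query relative to the tpcds folder
# in the home space of user "abc".
payload = build_sql_payload("SELECT COUNT(*) FROM store_sales", ["@abc", "tpcds"])
body = json.dumps(payload)
print(body)
```

The resulting string would be sent as the body of the query submission request. Please check the field names against the API documentation for your Dremio version before relying on this.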