There are commands which allows to query different types of metadata but It is not entirely obvious for me how to list all columns for a specific table.
There are commands to get tables, schemas, catalogs, primary and foreign keys but it is not straight forward how to get columns unless I missed something ?
JDBC specification includes corresponding getColumns method.
Is there any way to get all columns for a table using Flight Sql ?
I’ve discovered myself that it might be achieved by sending CommandGetTables command with includeSchema parameter set to true, unfortunately such request ends with an error, so I cannot check if it really works.
In the job profile I see the query recognized as [Get Columns] Catalog Filter: , Schema Filter: , Table Filter , Column Filter .
so it proves I am going right way but still as mentioned the request cannot be completed.
I found in the server.log file something which might help in finding where is the problem:
2022-06-21 14:39:10,820 [1d4e27f1-1618-8a1f-0788-004044c1e200/0:foreman-planning] WARN com.dremio.exec.store.ischema.Column - Type not handled explicitly (code needs review): ANY
2022-06-21 14:39:10,832 [out-of-band-observer] INFO query.logger - Query: 1d4e27f1-1618-8a1f-0788-004044c1e200; outcome: COMPLETED
Shall I open the ticket for this ? if so please direct me to the repository with the link. Thank you
Yes, the GetTables RPC with includeSchema is how you’d get column metadata similar to the COLUMNS view in INFORMATION_SCHEMA.
It seems likely that there’s a problem with the particular table you are testing. It’d be good to log a ticket with details about that tables (in particular its name and data types) and indicate if the problem occurs with other tables with simpler data types.
Thank you @jduong for this confirmation.
Regarding the table I am using for testing it seems irrelevant. I’ve tried almost every possible combination, With table, with no table (passing null -to collect all), with schema, with no schema (null), with catalog, and with no catalog (null) and always getting the exception below.
I’ve used Derby Flight SQL Service included in Apache Arrow Flight SQL source code and it works fine.
Exception in thread "main" org.apache.arrow.flight.FlightRuntimeException: INTERNAL: There was an error servicing your request.
at org.apache.arrow.flight.CallStatus.toRuntimeException(CallStatus.java:131)
at org.apache.arrow.flight.grpc.StatusUtils.fromGrpcRuntimeException(StatusUtils.java:164)
at org.apache.arrow.flight.grpc.StatusUtils.fromThrowable(StatusUtils.java:185)
at org.apache.arrow.flight.FlightStream$Observer.onError(FlightStream.java:454)
at org.apache.arrow.flight.FlightClient$1.onError(FlightClient.java:328)
at io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onClose(ClientCalls.java:479)
at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
at org.apache.arrow.flight.grpc.ClientInterceptorAdapter$FlightClientCallListener.onClose(ClientInterceptorAdapter.java:117)
at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:562)
at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:70)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:743)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:722)
at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Hi,
thank you for answering all the previous questions.
I am looking for one yet more answer. I would like to test one scenario where
I am getting data from various different endpoints using Flight SQL.
Here is example how to get data using FlightSql
FlightInfo flightInfo; // Use a FlightSqlClient method to get a FlightInfo
// 1. Fetch each partition sequentially (though this can be done in parallel)
for (FlightEndpoint endpoint : flightInfo.getEndpoints()) {
// 2. Get a stream of results as Arrow vectors
try (FlightStream stream = flightSqlClient.getStream(endpoint.getTicket())) {
// 3. Iterate through the stream until the end
while (stream.next()) {
// 4. Get a chunk of results (VectorSchemaRoot) and print it to the console
VectorSchemaRoot vectorSchemaRoot = stream.getRoot();
System.out.println(vectorSchemaRoot.contentToTSVString());
}
}
}
and I am talking specifically about this line:
// 1. Fetch each partition sequentially (though this can be done in parallel)
for (FlightEndpoint endpoint : flightInfo.getEndpoints()) {
I always get the same single endpoint and I would like to employ several partitions instead.
Could you please guide me what I should do in regards to cluster configuration and data source
to achieve getting many endpoints from flight info.
As part of the GetTables request you can specify the schema and table to indicate the exact table you want to get metadata for. You can set the include_schema flag to get column metadata.