Parquet Logical Type Support


#1

Running into this error:

java.lang.UnsupportedOperationException: SMALLINT is not supported

I understand that INT8 and INT16 are only logical types in Parquet and that the format stores them as INT32 anyway, but is there any way, or any plan, to have Dremio support these logical types?
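For context on the INT8/INT16 point: in the Parquet format, the 8-, 16- and 32-bit signed integer annotations sit on top of the INT32 physical type, and the 64-bit one on INT64. A minimal sketch of that mapping follows. The class and enum names are made up for illustration and are not part of any Parquet or Dremio API.

```java
/** Hypothetical illustration; these names are not part of any Parquet or Dremio API. */
final class ParquetIntAnnotationSketch {

    /** Physical integer types defined by the Parquet format. */
    enum Physical { INT32, INT64 }

    /** Signed integer annotations and the physical type each one is stored as. */
    enum Annotation {
        INT_8(8, Physical.INT32),
        INT_16(16, Physical.INT32),
        INT_32(32, Physical.INT32),
        INT_64(64, Physical.INT64);

        final int bitWidth;
        final Physical physical;

        Annotation(int bitWidth, Physical physical) {
            this.bitWidth = bitWidth;
            this.physical = physical;
        }
    }

    public static void main(String[] args) {
        // Print each annotation with the physical type that backs it.
        for (Annotation a : Annotation.values()) {
            System.out.println(a + " (" + a.bitWidth + "-bit) is stored as " + a.physical);
        }
    }
}
```

So a reader that only understands physical types sees plain INT32 values, while a reader that honors the annotation may surface the column as TINYINT or SMALLINT.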


#2

Hi @mitchell.davis,

Are you querying the Parquet file via Hive or directly on the file system? Any chance you can send me the parquet-tools meta output for this Parquet file?

Thanks,
@balaji.ramaswamy


#3

metadata.zip (13.0 KB)

My bad, @balaji.ramaswamy. I uploaded the wrong metadata; I’ve corrected it in this one.

We’re querying directly from AWS S3.


#4

@mitchell.davis

Thanks. Let me look into this and get back to you. Meanwhile, if one of the Parquet files is relatively small and does not contain sensitive data, kindly attach it.

Thanks
@balaji.ramaswamy


#5

It’s not small and it is kinda sensitive. Sorry @balaji.ramaswamy. Any insight you can give me would be great.


#6

Hi @mitchell.davis,

Can you share any information about how this parquet file that gives the exception was generated?

Thanks,
@ben


#7

It was built with Spark @ben. I’m currently working on generating a proof of concept that I’ll attach here.


#8

@ben @balaji.ramaswamy: I’ve created a test parquet file that is producing the error in our Dremio Implementation.

part-00000-77734830-7ea8-4e3d-9ea7-250d7b04661d.snappy.parquet.zip (687 Bytes)

Specifically, here is the error that is coming up. For some reason it came up under jobs this time. I usually have to navigate the server logs.

DATA_READ ERROR: Failure while attempting to retrieve metadata information for table uploads.<REDACTED>."part-00000-77734830-7ea8-4e3d-9ea7-250d7b04661d.snappy_parquet-2d705144-d72a-4631-b24b-7730db0b0f9e".

SQL Query select * from table("__home".<REDACTED>."part-00000-77734830-7ea8-4e3d-9ea7-250d7b04661d.snappy_parquet-2d705144-d72a-4631-b24b-7730db0b0f9e" (type => 'parquet')) limit 500


  (java.lang.UnsupportedOperationException) SMALLINT is not supported
    com.dremio.exec.store.dfs.MetadataUtils.toPartitionValue():183
    com.dremio.exec.store.parquet.ParquetFormatDatasetAccessor.getSplits():260
    com.dremio.exec.store.parquet.ParquetFormatDatasetAccessor.buildAll():194
    com.dremio.exec.store.parquet.ParquetFormatDatasetAccessor.buildDataset():175
    com.dremio.exec.store.dfs.FileSystemDatasetAccessor.getDataset():112
    com.dremio.service.namespace.QuietAccessor.getDataset():42
    com.dremio.exec.store.MaterializedDatasetTable.getRowType():73
    org.apache.calcite.sql.validate.ProcedureNamespace.validateImpl():69
    org.apache.calcite.sql.validate.AbstractNamespace.validate():84
    org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace():943
    org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery():924
    org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom():2971
    org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom():2956
    org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect():3197
    org.apache.calcite.sql.validate.SelectNamespace.validateImpl():60
    org.apache.calcite.sql.validate.AbstractNamespace.validate():84
    org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace():943
    org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery():924
    org.apache.calcite.sql.SqlSelect.validate():226
    org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression():899
    org.apache.calcite.sql.validate.SqlValidatorImpl.validate():609
    com.dremio.exec.planner.sql.SqlConverter.validate():180
    com.dremio.exec.planner.sql.handlers.PrelTransformer.validateNode():174
    com.dremio.exec.planner.sql.handlers.PrelTransformer.validateAndConvert():163
    com.dremio.exec.planner.sql.handlers.PrelTransformer.validateAndConvert():159
    com.dremio.exec.planner.sql.handlers.query.NormalHandler.getPlan():43
    com.dremio.exec.planner.sql.handlers.commands.HandlerToExec.plan():69
    com.dremio.exec.work.foreman.AttemptManager.run():292
    java.util.concurrent.ThreadPoolExecutor.runWorker():1149
    java.util.concurrent.ThreadPoolExecutor$Worker.run():624
    java.lang.Thread.run():748

#9

Hi @mitchell.davis

The attached Parquet file (full of nulls) queries fine from uploads, NAS and S3.

What version of Dremio are you using?
Can we have the job profile that got created last time?
If you copy the file to your computer, are you able to upload to Dremio via the upload button?

Thanks
@balaji.ramaswamy


#10

Here is the profile:

784df5d4-3f6c-40d6-904d-baee81e19836.zip (4.3 KB)

That was gathered from downloading it on my machine and uploading it via the upload button.

Our version of dremio is: 3.0.0


#11

Hi @mitchell.davis

I was able to reproduce the issue on an older version of Dremio. It does not happen on a later version.

Caused by: java.lang.UnsupportedOperationException: SMALLINT is not supported
at com.dremio.exec.store.dfs.MetadataUtils.toPartitionValue(MetadataUtils.java:183) ~[dremio-sabot-kernel-2.1.8-201810192316430922-9b6e669.jar:2.1.8-201810192316430922-9b6e669]

Kindly download 3.0.6 from our website and give it a shot.

Thanks
@balaji.ramaswamy


#12

Sounds good @balaji.ramaswamy. I’ll upgrade and report back.


#13

Well, @balaji.ramaswamy, I upgraded. I tested on the parquet file I attached and everything seems to work. However, when I point at the original data, it still fails. The worst part is that I’m not getting any errors in the logs now.

The screenshot above is from the error we get in the UI after I set up the dataset in the UI. I’m scouring the logs and nothing is coming through that hints anything is wrong. Are there settings I’m missing that would push those kinds of errors to the logs?

Also, the documentation says we can only restore from the same version of the software. Does that mean any version or only major releases? I lost all my metadata forgetting that I couldn’t update.


#14

Hi @mitchell.davis

Can you please click Help > About Dremio and see if the version shows 3.0.6?

Thanks
@balaji.ramaswamy


#15

@balaji.ramaswamy


#16

@balaji.ramaswamy, I was able to find the new error in the logs. (I cut off the bottom of the stack trace; do you need it?)

2019-01-15 03:42:31,695 [qtp160830059-109] ERROR c.d.d.server.GenericExceptionMapper - Unexpected exception when processing POST http://dremio.veriship.net/apiv2/datasets/new_untitled?parentDataset=<REDACTED>&newVersion=0003598160851519&limit=150 : java.lang.NullPointerException
java.lang.NullPointerException: null
at com.dremio.dac.explore.DatasetsResource.getDatasetSummary(DatasetsResource.java:268) ~[dremio-dac-backend-3.0.6-201812082352540436-1f684f9.jar:3.0.6-201812082352540436-1f684f9]
at com.dremio.dac.explore.DatasetsResource.newUntitled(DatasetsResource.java:141) ~[dremio-dac-backend-3.0.6-201812082352540436-1f684f9.jar:3.0.6-201812082352540436-1f684f9]
at com.dremio.dac.explore.DatasetsResource.newUntitledFromParent(DatasetsResource.java:208) ~[dremio-dac-backend-3.0.6-201812082352540436-1f684f9.jar:3.0.6-201812082352540436-1f684f9]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_191]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_191]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_191]
at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_191]
at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory$1.invoke(ResourceMethodInvocationHandlerFactory.java:81) ~[jersey-server-2.25.1.jar:na]
at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:144) ~[jersey-server-2.25.1.jar:na]
at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:161) ~[jersey-server-2.25.1.jar:na]
at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$TypeOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:205) ~[jersey-server-2.25.1.jar:na]
at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:99) ~[jersey-server-2.25.1.jar:na]
at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:389) ~[jersey-server-2.25.1.jar:na]
at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:347) ~[jersey-server-2.25.1.jar:na]
at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:102) ~[jersey-server-2.25.1.jar:na]
at org.glassfish.jersey.server.ServerRuntime$2.run(ServerRuntime.java:326) ~[jersey-server-2.25.1.jar:na]
at org.glassfish.jersey.internal.Errors$1.call(Errors.java:271) [jersey-common-2.25.1.jar:na]
at org.glassfish.jersey.internal.Errors$1.call(Errors.java:267) [jersey-common-2.25.1.jar:na]
at org.glassfish.jersey.internal.Errors.process(Errors.java:315) [jersey-common-2.25.1.jar:na]
at org.glassfish.jersey.internal.Errors.process(Errors.java:297) [jersey-common-2.25.1.jar:na]
at org.glassfish.jersey.internal.Errors.process(Errors.java:267) [jersey-common-2.25.1.jar:na]
at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:317) [jersey-common-2.25.1.jar:na]
at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:305) [jersey-server-2.25.1.jar:na]
at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:1154) [jersey-server-2.25.1.jar:na]
at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:473) [jersey-container-servlet-core-2.25.1.jar:na]
at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:427) [jersey-container-servlet-core-2.25.1.jar:na]
at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:388) [jersey-container-servlet-core-2.25.1.jar:na]

#17

Hi @mitchell.davis

This is a different error. Can you try adding it as a local NAS as opposed to S3 or direct upload?

Thanks
@balaji.ramaswamy


#18

@balaji.ramaswamy, I did just that and the same thing happened. Here is the error that popped up in the logs:

2019-01-15 16:36:29,220 [qtp160830059-150] WARN  com.dremio.exec.catalog.DatasetSaver - Failure while retrieving and saving dataset temp.problemfiles.
java.lang.UnsupportedOperationException: SMALLINT is not supported
	at com.dremio.exec.store.dfs.MetadataUtils.toPartitionValue(MetadataUtils.java:183) ~[dremio-sabot-kernel-3.0.6-201812082352540436-1f684f9.jar:3.0.6-201812082352540436-1f684f9]
	at com.dremio.exec.store.parquet.ParquetFormatDatasetAccessor.getSplits(ParquetFormatDatasetAccessor.java:267) ~[dremio-sabot-kernel-3.0.6-201812082352540436-1f684f9.jar:3.0.6-201812082352540436-1f684f9]
	at com.dremio.exec.store.parquet.ParquetFormatDatasetAccessor.buildAll(ParquetFormatDatasetAccessor.java:201) ~[dremio-sabot-kernel-3.0.6-201812082352540436-1f684f9.jar:3.0.6-201812082352540436-1f684f9]
	at com.dremio.exec.store.parquet.ParquetFormatDatasetAccessor.buildDataset(ParquetFormatDatasetAccessor.java:182) ~[dremio-sabot-kernel-3.0.6-201812082352540436-1f684f9.jar:3.0.6-201812082352540436-1f684f9]
	at com.dremio.exec.store.dfs.FileSystemDatasetAccessor.getDataset(FileSystemDatasetAccessor.java:112) ~[dremio-sabot-kernel-3.0.6-201812082352540436-1f684f9.jar:3.0.6-201812082352540436-1f684f9]
	at com.dremio.exec.catalog.DatasetSaver.completeSave(DatasetSaver.java:65) ~[dremio-sabot-kernel-3.0.6-201812082352540436-1f684f9.jar:3.0.6-201812082352540436-1f684f9]
	at com.dremio.exec.catalog.DatasetManager.createOrUpdateDataset(DatasetManager.java:380) [dremio-sabot-kernel-3.0.6-201812082352540436-1f684f9.jar:3.0.6-201812082352540436-1f684f9]
	at com.dremio.exec.catalog.CatalogImpl.createOrUpdateDataset(CatalogImpl.java:497) [dremio-sabot-kernel-3.0.6-201812082352540436-1f684f9.jar:3.0.6-201812082352540436-1f684f9]
	at com.dremio.exec.catalog.DelegatingCatalog.createOrUpdateDataset(DelegatingCatalog.java:179) [dremio-sabot-kernel-3.0.6-201812082352540436-1f684f9.jar:3.0.6-201812082352540436-1f684f9]
	at com.dremio.dac.service.source.SourceService.createPhysicalDataset(SourceService.java:535) [dremio-dac-backend-3.0.6-201812082352540436-1f684f9.jar:3.0.6-201812082352540436-1f684f9]
	at com.dremio.dac.resource.SourceResource.saveFolderFormat(SourceResource.java:375) [dremio-dac-backend-3.0.6-201812082352540436-1f684f9.jar:3.0.6-201812082352540436-1f684f9]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_191]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_191]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_191]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_191]
	at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory$1.invoke(ResourceMethodInvocationHandlerFactory.java:81) [jersey-server-2.25.1.jar:na]
	at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:144) [jersey-server-2.25.1.jar:na]
	at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:161) [jersey-server-2.25.1.jar:na]
	at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$TypeOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:205) [jersey-server-2.25.1.jar:na]
	at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:99) [jersey-server-2.25.1.jar:na]
	at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:389) [jersey-server-2.25.1.jar:na]
	at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:347) [jersey-server-2.25.1.jar:na]
	at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:102) [jersey-server-2.25.1.jar:na]
	at org.glassfish.jersey.server.ServerRuntime$2.run(ServerRuntime.java:326) [jersey-server-2.25.1.jar:na]
	at org.glassfish.jersey.internal.Errors$1.call(Errors.java:271) [jersey-common-2.25.1.jar:na]
	at org.glassfish.jersey.internal.Errors$1.call(Errors.java:267) [jersey-common-2.25.1.jar:na]
	at org.glassfish.jersey.internal.Errors.process(Errors.java:315) [jersey-common-2.25.1.jar:na]
	at org.glassfish.jersey.internal.Errors.process(Errors.java:297) [jersey-common-2.25.1.jar:na]
	at org.glassfish.jersey.internal.Errors.process(Errors.java:267) [jersey-common-2.25.1.jar:na]
	at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:317) [jersey-common-2.25.1.jar:na]
	at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:305) [jersey-server-2.25.1.jar:na]
	at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:1154) [jersey-server-2.25.1.jar:na]
	at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:473) [jersey-container-servlet-core-2.25.1.jar:na]
	at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:427) [jersey-container-servlet-core-2.25.1.jar:na]
	at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:388) [jersey-container-servlet-core-2.25.1.jar:na]
	at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:341) [jersey-container-servlet-core-2.25.1.jar:na]
	at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:228) [jersey-container-servlet-core-2.25.1.jar:na]
	at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:812) [jetty-servlet-9.2.22.v20170606.jar:9.2.22.v20170606]
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1669) [jetty-servlet-9.2.22.v20170606.jar:9.2.22.v20170606]
	at org.eclipse.jetty.servlets.UserAgentFilter.doFilter(UserAgentFilter.java:83) [jetty-servlets-9.2.22.v20170606.jar:9.2.22.v20170606]
	at org.eclipse.jetty.servlets.GzipFilter.doFilter(GzipFilter.java:301) [jetty-servlets-9.2.22.v20170606.jar:9.2.22.v20170606]
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652) [jetty-servlet-9.2.22.v20170606.jar:9.2.22.v20170606]
	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585) [jetty-servlet-9.2.22.v20170606.jar:9.2.22.v20170606]
	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127) [jetty-server-9.2.22.v20170606.jar:9.2.22.v20170606]
	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515) [jetty-servlet-9.2.22.v20170606.jar:9.2.22.v20170606]
	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061) [jetty-server-9.2.22.v20170606.jar:9.2.22.v20170606]
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) [jetty-server-9.2.22.v20170606.jar:9.2.22.v20170606]
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97) [jetty-server-9.2.22.v20170606.jar:9.2.22.v20170606]
	at org.eclipse.jetty.server.handler.RequestLogHandler.handle(RequestLogHandler.java:95) [jetty-server-9.2.22.v20170606.jar:9.2.22.v20170606]
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97) [jetty-server-9.2.22.v20170606.jar:9.2.22.v20170606]
	at org.eclipse.jetty.server.Server.handle(Server.java:499) [jetty-server-9.2.22.v20170606.jar:9.2.22.v20170606]
	at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:311) [jetty-server-9.2.22.v20170606.jar:9.2.22.v20170606]
	at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:258) [jetty-server-9.2.22.v20170606.jar:9.2.22.v20170606]
	at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:544) [jetty-io-9.2.22.v20170606.jar:9.2.22.v20170606]
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635) [jetty-util-9.2.22.v20170606.jar:9.2.22.v20170606]
	at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555) [jetty-util-9.2.22.v20170606.jar:9.2.22.v20170606]
	at java.lang.Thread.run(Thread.java:748) [na:1.8.0_191]
2019-01-15 16:36:29,680 [qtp160830059-20402] ERROR c.d.d.server.GenericExceptionMapper - Unexpected exception when processing POST http://dremio.veriship.net/apiv2/datasets/new_untitled?parentDataset=temp.problemfiles&newVersion=0009912186041139&limit=150 : java.lang.NullPointerException
java.lang.NullPointerException: null
	at com.dremio.dac.explore.DatasetsResource.getDatasetSummary(DatasetsResource.java:268) ~[dremio-dac-backend-3.0.6-201812082352540436-1f684f9.jar:3.0.6-201812082352540436-1f684f9]
	at com.dremio.dac.explore.DatasetsResource.newUntitled(DatasetsResource.java:141) ~[dremio-dac-backend-3.0.6-201812082352540436-1f684f9.jar:3.0.6-201812082352540436-1f684f9]
	at com.dremio.dac.explore.DatasetsResource.newUntitledFromParent(DatasetsResource.java:208) ~[dremio-dac-backend-3.0.6-201812082352540436-1f684f9.jar:3.0.6-201812082352540436-1f684f9]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_191]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_191]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_191]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_191]
	at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory$1.invoke(ResourceMethodInvocationHandlerFactory.java:81) ~[jersey-server-2.25.1.jar:na]
	at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:144) ~[jersey-server-2.25.1.jar:na]
	at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:161) ~[jersey-server-2.25.1.jar:na]
	at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$TypeOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:205) ~[jersey-server-2.25.1.jar:na]
	at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:99) ~[jersey-server-2.25.1.jar:na]
	at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:389) ~[jersey-server-2.25.1.jar:na]
	at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:347) ~[jersey-server-2.25.1.jar:na]
	at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:102) ~[jersey-server-2.25.1.jar:na]
	at org.glassfish.jersey.server.ServerRuntime$2.run(ServerRuntime.java:326) ~[jersey-server-2.25.1.jar:na]
	at org.glassfish.jersey.internal.Errors$1.call(Errors.java:271) [jersey-common-2.25.1.jar:na]
	at org.glassfish.jersey.internal.Errors$1.call(Errors.java:267) [jersey-common-2.25.1.jar:na]
	at org.glassfish.jersey.internal.Errors.process(Errors.java:315) [jersey-common-2.25.1.jar:na]
	at org.glassfish.jersey.internal.Errors.process(Errors.java:297) [jersey-common-2.25.1.jar:na]
	at org.glassfish.jersey.internal.Errors.process(Errors.java:267) [jersey-common-2.25.1.jar:na]
	at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:317) [jersey-common-2.25.1.jar:na]
	at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:305) [jersey-server-2.25.1.jar:na]
	at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:1154) [jersey-server-2.25.1.jar:na]
	at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:473) [jersey-container-servlet-core-2.25.1.jar:na]
	at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:427) [jersey-container-servlet-core-2.25.1.jar:na]
	at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:388) [jersey-container-servlet-core-2.25.1.jar:na]
	at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:341) [jersey-container-servlet-core-2.25.1.jar:na]
	at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:228) [jersey-container-servlet-core-2.25.1.jar:na]
	at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:812) [jetty-servlet-9.2.22.v20170606.jar:9.2.22.v20170606]
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1669) [jetty-servlet-9.2.22.v20170606.jar:9.2.22.v20170606]
	at org.eclipse.jetty.servlets.UserAgentFilter.doFilter(UserAgentFilter.java:83) [jetty-servlets-9.2.22.v20170606.jar:9.2.22.v20170606]
	at org.eclipse.jetty.servlets.GzipFilter.doFilter(GzipFilter.java:301) [jetty-servlets-9.2.22.v20170606.jar:9.2.22.v20170606]
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652) [jetty-servlet-9.2.22.v20170606.jar:9.2.22.v20170606]
	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585) [jetty-servlet-9.2.22.v20170606.jar:9.2.22.v20170606]
	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127) [jetty-server-9.2.22.v20170606.jar:9.2.22.v20170606]
	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515) [jetty-servlet-9.2.22.v20170606.jar:9.2.22.v20170606]
	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061) [jetty-server-9.2.22.v20170606.jar:9.2.22.v20170606]
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) [jetty-server-9.2.22.v20170606.jar:9.2.22.v20170606]
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97) [jetty-server-9.2.22.v20170606.jar:9.2.22.v20170606]
	at org.eclipse.jetty.server.handler.RequestLogHandler.handle(RequestLogHandler.java:95) [jetty-server-9.2.22.v20170606.jar:9.2.22.v20170606]
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97) [jetty-server-9.2.22.v20170606.jar:9.2.22.v20170606]
	at org.eclipse.jetty.server.Server.handle(Server.java:499) [jetty-server-9.2.22.v20170606.jar:9.2.22.v20170606]
	at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:311) [jetty-server-9.2.22.v20170606.jar:9.2.22.v20170606]
	at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:258) [jetty-server-9.2.22.v20170606.jar:9.2.22.v20170606]
	at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:544) [jetty-io-9.2.22.v20170606.jar:9.2.22.v20170606]
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635) [jetty-util-9.2.22.v20170606.jar:9.2.22.v20170606]
	at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555) [jetty-util-9.2.22.v20170606.jar:9.2.22.v20170606]
	at java.lang.Thread.run(Thread.java:748) [na:1.8.0_191]

@balaji.ramaswamy, according to the stack trace, it looks like the split logic is picking a SMALLINT column as a partition value, which it doesn’t support: https://github.com/dremio/dremio-oss/blob/16b26a201d241b496cb9b03f7d301ff2b139eea9/sabot/kernel/src/main/java/com/dremio/exec/store/dfs/MetadataUtils.java#L183

I can create a pull request to update that logic but I have NO idea what that will break.
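To make the suspected failure mode concrete, here is a minimal, self-contained sketch of a type switch that rejects SMALLINT, next to one possible widening variant. This is not Dremio's actual `MetadataUtils.toPartitionValue` code: the class `PartitionValueSketch`, the enum `MinorType`, and both methods are hypothetical names used only for illustration. The widening idea assumes that a 16-bit value can be carried as a 32-bit int without loss, which matches how Parquet stores INT16 physically.

```java
import java.util.Objects;

/** Hypothetical sketch; not Dremio's MetadataUtils. */
final class PartitionValueSketch {

    /** Illustrative subset of column types (assumed names). */
    enum MinorType { TINYINT, SMALLINT, INT, BIGINT, VARCHAR }

    /** Mimics the reported behavior: narrow integer types fall through and are rejected. */
    static Object strictValue(MinorType type, Object raw) {
        switch (type) {
            case INT:
                return ((Number) raw).intValue();
            case BIGINT:
                return ((Number) raw).longValue();
            case VARCHAR:
                return raw.toString();
            default:
                throw new UnsupportedOperationException(type + " is not supported");
        }
    }

    /** Possible alternative: widen TINYINT/SMALLINT to a 32-bit int, delegate everything else. */
    static Object wideningValue(MinorType type, Object raw) {
        Objects.requireNonNull(raw, "raw");
        switch (type) {
            case TINYINT:
            case SMALLINT:
                return ((Number) raw).intValue();
            default:
                return strictValue(type, raw);
        }
    }

    public static void main(String[] args) {
        short sample = 1234;
        // Widening path accepts the 16-bit value.
        System.out.println(wideningValue(MinorType.SMALLINT, sample));
        // Strict path rejects it with the same message seen in the logs.
        try {
            strictValue(MinorType.SMALLINT, sample);
        } catch (UnsupportedOperationException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Whether widening is safe inside Dremio depends on how partition values are compared and serialized elsewhere, which this sketch does not cover.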


#19

Hi @mitchell.davis

I just downloaded 3.0.6 Community Edition from the Dremio website, followed the exact same steps on your Parquet file, and it works fine. I have a screen recording and wanted to see if you can spot anything that either of us missed :)

I am not able to attach the screen recording because it is ~23 MB.

Do you have an email address I can send the recording to?

Thanks
@balaji.ramaswamy


#20

@balaji.ramaswamy, the parquet file you’re using works. However, that was cleansed data. I pulled out as much sensitive data as I could. The problem popped back up when I pointed to the original data that you don’t have. I’m trying to figure out whether I can cleanse it in a different way that still breaks 3.0.6.