504 - Gateway Time-out when formatting

david.lee · October 10, 2019, 8:12pm

Working on connecting our internal S3 StorageGRID to Dremio 4.0.2 installed on Kubernetes.

After I choose a directory containing Parquet files, everything looks like it is working fine, but on save it gives a 504 - Gateway Time-out.

Tried all kinds of settings:

fs.s3a.connection.maximum to 10000
fs.s3a.max.threads to 5000

Enable compatibility and asynchronous access are both checked.

Is there a log file that contains debugging info?

doron · October 10, 2019, 9:33pm

Server.log usually contains errors for REST API calls. If no errors are logged, one thing you can do is run SELECT * FROM source.folder, which will auto-promote the folder. If the query fails, the query profile will contain the error.

Note that you need to enable "Automatically format files into physical datasets when users issue queries" for the source in the Metadata section (the default behavior was changed, so it may already be turned on if the source was created a while back).

david.lee · October 10, 2019, 9:57pm

Ok. Got some error info back running a Select * from source.folder…

Waited for 15000ms, but tasks for ‘Fetch parquet metadata’ are not complete. Total runnable size 11, parallelism 11.

chafidz · October 11, 2019, 1:37am

Are you reading many Parquet files (e.g. the output of Spark streaming) from a single path, or just a few of them?

david.lee · October 11, 2019, 5:43pm

I’ve got different directories with different Parquet files.
Some directories contain 10 or fewer files. Others have subdirectories with hundreds of Parquet files.
They were all created using pyarrow and sized to be 128 MB or less.

I came in this morning and the 504 errors have almost disappeared. When they do show up, the directory appears in purple after I refresh the web page, so the format was still successful.

I’m basically migrating 3 terabytes of files from HDFS to S3.

Something is probably wrong with our S3 storage…

david.lee · February 10, 2020, 11:01pm

The root problem was just stability with our S3 system, but now I’m seeing the same problem with JSON source files.

select * limit 10000 works fine, but select count(*) or any query without a limit triggers garbage collection, and the server becomes unstable and needs to be recycled.

2020-02-10T22:56:51.583+0000: [GC (Allocation Failure) [PSYoungGen: 1395008K->384K(1396224K)] 2561866K->1167290K(2983936K), 0.0078312 secs] [Times: user=0.03 sys=0.00, real=0.01 secs]
2020-02-10T22:56:52.123+0000: [GC (Allocation Failure) [PSYoungGen: 1395072K->320K(1396224K)] 2561978K->1167298K(2983936K), 0.0088563 secs] [Times: user=0.03 sys=0.00, real=0.01 secs]
2020-02-10T22:56:52.669+0000: [GC (Allocation Failure) [PSYoungGen: 1395008K->460K(1395200K)] 2561986K->1167479K(2982912K), 0.0091217 secs] [Times: user=0.04 sys=0.00, real=0.01 secs]

balaji.ramaswamy · February 14, 2020, 8:14am

@david.lee

GC allocation failures are fine. Can you check whether you see “Full GC” during that time?
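
For anyone following along, the check suggested above can be sketched in a few lines of Python. This is only a rough sketch, assuming the JDK 8 style log format shown in the earlier post, where minor collections appear as "[GC (...)" and full collections as "[Full GC (...)". The function name summarize_gc and the sample list are made up for illustration; they are not part of Dremio or the JVM.

```python
import re

# Matches the event label in a JDK 8 style GC log line,
# e.g. "[GC (Allocation Failure)" or "[Full GC (Ergonomics)".
EVENT_RE = re.compile(r"\[(Full GC|GC) \(([^)]*)\)")


def summarize_gc(lines):
    """Count minor and full collections and collect the full GC lines."""
    counts = {"GC": 0, "Full GC": 0}
    full_lines = []
    for line in lines:
        match = EVENT_RE.search(line)
        if match is None:
            continue  # not a GC event line
        kind = match.group(1)
        counts[kind] += 1
        if kind == "Full GC":
            full_lines.append(line.rstrip("\n"))
    return counts, full_lines


# The three log lines posted earlier in this thread.
sample = [
    "2020-02-10T22:56:51.583+0000: [GC (Allocation Failure) [PSYoungGen: 1395008K->384K(1396224K)] 2561866K->1167290K(2983936K), 0.0078312 secs] [Times: user=0.03 sys=0.00, real=0.01 secs]",
    "2020-02-10T22:56:52.123+0000: [GC (Allocation Failure) [PSYoungGen: 1395072K->320K(1396224K)] 2561978K->1167298K(2983936K), 0.0088563 secs] [Times: user=0.03 sys=0.00, real=0.01 secs]",
    "2020-02-10T22:56:52.669+0000: [GC (Allocation Failure) [PSYoungGen: 1395008K->460K(1395200K)] 2561986K->1167479K(2982912K), 0.0091217 secs] [Times: user=0.04 sys=0.00, real=0.01 secs]",
]

counts, full_lines = summarize_gc(sample)
print(counts)      # {'GC': 3, 'Full GC': 0}
print(full_lines)  # []
```

On the posted lines this reports three minor collections and no full ones. To scan a whole log, pass an open file object to summarize_gc instead of the sample list; the log file location depends on how the JVM was started.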

david.lee · February 19, 2021, 7:25pm

I did find the root problem of the 504 gateway timeouts… I had to change the consistency setting for my S3 bucket…

http://docs.netapp.com/sgws-111/index.jsp?topic=%2Fcom.netapp.doc.sg-s3%2FGUID-B48E07AA-B1F5-41E6-964C-81B599517A45.html

read-after-new-write (Default): Provides read-after-write consistency for new objects and eventual consistency for object updates. Offers high availability and data protection guarantees. Matches AWS S3 consistency guarantees.

Note: If your application uses HEAD requests on objects that do not exist, you might receive a high number of 500 Internal Server errors if one or more Storage Nodes are unavailable. To prevent these errors, set the consistency control to “available” unless you require consistency guarantees similar to AWS S3.

available (eventual consistency for HEAD operations): Behaves the same as the “read-after-new-write” consistency level, but only provides eventual consistency for HEAD operations. Offers higher availability for HEAD operations than “read-after-new-write” if Storage Nodes are unavailable. Differs from AWS S3 consistency guarantees for HEAD operations only.

balaji.ramaswamy · February 20, 2021, 5:11am

Thanks for the update and the useful information, @david.lee. Glad it is working now.
