Summary Statistics of Job

Hi,
What does the Input Size (339.43 GB) in the picture below represent? Is it the total size of the data that was parsed, or is it the amount of memory used by the job? Is there a way I can get this programmatically through the query profile file?

[screenshot: job summary statistics showing Input Size 339.43 GB]

Thank you
VRB

@vrb queries.json should have those fields, and you can add it to Dremio and query it. If you are interested in how much memory the query used per node, that should be available in the profile.

Hi Balaji,
Looks like you answered a different question of mine on this thread. My question was: what does the Input size indicate, and how can I get it programmatically through the profile (which key should I be looking at)?

@vrb The profile does not show a single box called input bytes; you can get that from queries.json. I will get back to you on exactly what input bytes accounts for.
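To illustrate pulling this programmatically, here is a minimal sketch that reads queries.json (which Dremio writes as one JSON object per line) and collects the input size per query. The field names `queryId` and `inputBytes` are assumptions based on this thread; check the keys in your own queries.json before relying on them.

```python
import json

# Hypothetical sketch: read a queries.json file (one JSON object per
# line) and map each query id to its reported input size in bytes.
# Field names "queryId" and "inputBytes" are assumptions; verify them
# against your own queries.json.
def input_sizes(path):
    sizes = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            entry = json.loads(line)
            if "inputBytes" in entry:
                sizes[entry.get("queryId")] = entry["inputBytes"]
    return sizes
```

You could also load the file into Dremio itself, as suggested above, and query the same fields with SQL.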

@vrb

Input: Size/Records count

The Input counter tracks the size and count of the records read by summing across all the scan operators of the query, such as AVRO_SUB_SCAN, PARQUET_ROW_GROUP_SCAN, MONGO_SUB_SCAN, etc.

Output: Size/Records count

The Output counter tracks the size and count of the records written/output by summing across the writer operators. If there are multiple writer operators, we count the operators lower in the tree (the ones writing data) rather than the upper writer (the one writing metadata).

Internal queries (UI/REST): The ARROW_WRITER operators are counted as the results are written to a store.

External queries (ODBC/JDBC/FLIGHT): The SCREEN operator is counted as the results are streamed to the client directly.

CTAS queries: The output operators considered are PARQUET_WRITER, TEXT_WRITER, or JSON_WRITER, depending on the write type.
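The summation described above can be sketched against a downloaded profile JSON. The layout used here (a `fragmentProfile` list containing `minorFragmentProfile` entries, each with `operatorProfile` objects carrying an `operatorTypeName` and an `inputProfile` list with a `size` field) is an assumption for illustration; inspect your own profile file for the exact keys.

```python
# Hypothetical sketch: total input size = sum of input sizes across all
# scan operators in the profile. The profile structure and key names
# below are assumptions; verify them against a real downloaded profile.
SCAN_TYPES = {"AVRO_SUB_SCAN", "PARQUET_ROW_GROUP_SCAN", "MONGO_SUB_SCAN"}

def total_input_bytes(profile):
    total = 0
    for frag in profile.get("fragmentProfile", []):
        for minor in frag.get("minorFragmentProfile", []):
            for op in minor.get("operatorProfile", []):
                # Only scan operators count toward the Input counter.
                if op.get("operatorTypeName") in SCAN_TYPES:
                    for stream in op.get("inputProfile", []):
                        total += stream.get("size", 0)
    return total
```

The same walk with a set of writer operator types (PARQUET_WRITER, SCREEN, ARROW_WRITER, etc.) would give the Output counter.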
