How to insert data from a dataframe into an Iceberg table?

chrisRains · October 18, 2025, 10:09am

I set up MinIO and connected it via Nessie to Dremio (Community Edition). Now I want to insert rows from a dataframe into an Iceberg table via Dremio / pyarrow.
How would I do this?

I managed to read the data, but I haven’t found an example yet of how to insert rows from, let’s say, a pandas dataframe. An example would be super helpful here.

Thanks in advance

Chris

quangbilly79 · October 27, 2025, 7:58am

A Pandas DataFrame lives in memory. Pandas reads data from CSV files on disk or from database tables and holds it in memory as a DataFrame.

Dremio can read from and write to Iceberg tables (this needs a catalog such as Nessie). Iceberg tables store their data as Parquet files on storage (S3/HDFS/local disk), and besides the data files they also need metadata files (Avro and JSON format).

1. Insert into Dremio Iceberg table from another table (even a different DB/source, like from PostgreSQL to Dremio Nessie)

INSERT INTO arctic.sales_data
SELECT * FROM postgres.sales_info;

2. Insert into Dremio Iceberg table from CSV files

COPY INTO arctic.customer_data
FROM '@s3/bucket-name/customer-data-folder/'
FILES ('customers.csv', 'additional_customers.csv')
FILE_FORMAT 'csv'
(FIELD_DELIMITER ',');

3. Insert into Dremio Iceberg table from Pandas df

You have to use a client library and an ODBC connection to do this, like pyodbc. The flow is something like: use Pandas to read the data from a CSV file or table, then write it to the Dremio Iceberg table through that connection.

If you use Spark, you can write directly from a Spark df to Iceberg tables using the Spark-Iceberg extension. I don’t know if Pandas has this kind of extension.

chrisRains · October 29, 2025, 6:28pm

I am able to insert a Spark dataframe into an Iceberg table directly, without Dremio. But I want to insert data via Dremio, without materializing the data first. I know that single-row inserts are possible via pyarrow or pyodbc, but I want to do bulk uploads from a dataframe or similar. Surprisingly, Dremio does not support parameterized statements, so I guess executemany is not possible, and single-row inserts are certainly too slow.

So is there a way to do bulk uploads via pyarrow or pyodbc directly through Dremio?

quangbilly79 · October 30, 2025, 1:36am

Parameterized queries are supported in the Cloud version, but there is still no update for the Software Community version.
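
Without parameterized queries, one possible workaround is to pack many rows into each INSERT ... VALUES statement, so the number of round trips drops from one per row to one per batch. The sketch below only builds the SQL strings; it is not an official Dremio recipe. The helper names (sql_literal, batched_inserts), the table name nessie.sales_data, and the batch size are made up for illustration, and it assumes the target accepts multi-row VALUES lists and double-quoted column names. Dates and timestamps are sent as quoted strings here and may need an explicit cast.

```python
import math
from typing import Any, Iterable, Iterator, Sequence


def sql_literal(value: Any) -> str:
    """Render one Python value as a SQL literal (NULL, boolean, number, or string)."""
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return "NULL"
    if isinstance(value, bool):  # check bool before int: bool is a subclass of int
        return "TRUE" if value else "FALSE"
    if isinstance(value, (int, float)):
        return repr(value)
    # Everything else is sent as a quoted string; embedded single quotes are doubled.
    return "'" + str(value).replace("'", "''") + "'"


def batched_inserts(
    table: str,
    columns: Sequence[str],
    rows: Iterable[Sequence[Any]],
    batch_size: int = 1000,
) -> Iterator[str]:
    """Yield multi-row INSERT statements, each carrying at most batch_size rows."""
    col_list = ", ".join(f'"{c}"' for c in columns)
    prefix = f"INSERT INTO {table} ({col_list}) VALUES "
    batch = []
    for row in rows:
        batch.append("(" + ", ".join(sql_literal(v) for v in row) + ")")
        if len(batch) == batch_size:
            yield prefix + ", ".join(batch)
            batch = []
    if batch:  # flush the final, possibly smaller, batch
        yield prefix + ", ".join(batch)


# Hypothetical usage with pandas and an already-open cursor (e.g. from pyodbc):
# rows = df.itertuples(index=False, name=None)
# for stmt in batched_inserts("nessie.sales_data", list(df.columns), rows):
#     cursor.execute(stmt)
```

Because the values are inlined into the SQL text, this is only reasonable for data you trust. A suitable batch size depends on row width and on any statement size limits of the server, so 1000 is just a starting point.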
