JDBC writing data to S3

Hi @balaji.ramaswamy ,

I am trying to follow create jdbc connection with Dremio using spark . In which I am reading from RDBMS and writing to S3. For that I have already added our s3 as data source with below options.

However, When I am trying to write back to data. Below is my code

package com.example.dremio.spark;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.functions;

public class SparkExample {


static Dataset<Row> processDf(Dataset<Row> orderDetails, Dataset<Row> products,Dataset<Row> orders){
    orders = orders.withColumn("DAY_OF_WEEK",functions.date_format(functions.col("orderDate"),"E")).filter(functions.col("DAY_OF_WEEK").equalTo("Mon"));
    Dataset<Row> order_orderDetails = orders.join(orderDetails,orders.col("orderNumber").equalTo(orderDetails.col("orderNumber")),"inner");
    return order_orderDetails.join(products,order_orderDetails.col("productCode").equalTo(products.col("productCode")));
}

public static void main(String[] args) throws ClassNotFoundException {
    SparkSession spark = SparkSession.builder()
            .appName("Dremio-Spark example")
            .master("local[*]")
            .getOrCreate();

    Dataset<Row> orderDetails = spark.read().format("jdbc").option("driver","com.dremio.jdbc.Driver").option("url", "jdbc:dremio:direct=127.0.0.1:31857").option("dbtable", "mysqldev.classicmodels.orderdetails").option("user", "ayush.goyal").option("password", "mypassword").load();
    orderDetails.createOrReplaceGlobalTempView("orderDetails");

    Dataset<Row> products = spark.read()
            .format("jdbc")
            .option("driver","com.dremio.jdbc.Driver")
            .option("url", "jdbc:dremio:direct=127.0.0.1:31857")
            .option("dbtable", "mysqldev.classicmodels.products")
            .option("user", "ayush.goyal")
            .option("password", "mypassword")
            .load();

    Dataset<Row> orders = spark.read()
            .format("jdbc")
            .option("driver","com.dremio.jdbc.Driver")
            .option("url", "jdbc:dremio:direct=127.0.0.1:31857")
            .option("dbtable", "mysqldev.classicmodels.orders")
            .option("user", "ayush.goyal")
            .option("password", "mypassword")
            .load();

    Dataset<Row> df = processDf(orderDetails,products,orders);
    df.show();
    df.write().format("jdbc").option("driver","com.dremio.jdbc.Driver").option("url","jdbc:dremio:direct=127.0.0.1:31857").option("dbtable","\"minio-dev\".dremio.result").option("user","ayush.goyal").option("password","mypassword").save();

}
}

With this code I am to read the data but unable to write because Dremio interpreting this incorrectly. for that I have attached the profile.

ac3a7993-131c-4fad-bbee-5452ea7cc240.zip (5.7 KB)

Thanks

@Ayush.goyal Dremio only supports CTAS not just create table, here are some examples

https://docs.dremio.com/sql-reference/sql-commands/tables.html

Hi @balaji.ramaswamy ,

I am not running query directly. I am using spark to write the data to s3 source through dremio. Is it possible write data to s3 source? if yes then how?

Hi @balaji.ramaswamy ,

Is there a way to programatically write data to s3 source by enabling CTAS and Drop queries in advance systems?

@Ayush.goyal You can use REST API calls to

Create source

http://docs.dremio.com/rest-api/catalog/container-source/

Modify source

http://docs.dremio.com/rest-api/catalog/container-source-config/

SQL API

http://docs.dremio.com/rest-api/sql/post-sql/

@balaji.ramaswamy
I am more interested in creation of file programatically using jdbc/odbc drivers.

@Ayush.goyal Apologies, not able to fully understand your question, let me try to understand and help.

Are you trying to run a CTAS from JDBC program or tool? Not sure how you can create CTAS files from the driver? I assume when you say JDBC then it is via a JDBC program (That internally uses the driver)?

Yes, I am trying to create file from a dataset in Spark using JDBC. Generally this command from my code create file on destination with data in dataset. This is the way we use spark to write to jdbc source. Is it different with dremio. If yes, then can you please share the example to do that?

df.write().format("jdbc").option("driver","com.dremio.jdbc.Driver").option("url","jdbc:dremio:direct=127.0.0.1:31857").option("dbtable","\"minio-dev\".dremio.result").option("user","ayush.goyal").option("password","mypassword").save();

Hi @balaji.ramaswamy Any info on this? Is Dremio support writing to data lake using jdbc connector?

@Ayush.goyal Currently the only way to write to S3 is using the CTAS command, it is a Dremio command that can be called from the UI or a JDBC tool like DBeaver/DBvisualizer