How to add an Amazon S3 data source via REST API?

@monocongo,

You can create the S3 source in Dremio by sending a POST request to the catalog endpoint (/api/v3/catalog). For example, using cURL:

curl --request POST \
  --url 'http://localhost:9047/api/v3/catalog' \
  --header 'authorization: _dremio{authorization token}' \
  --header 'content-type: application/json' \
  --data '{
  "entityType": "source",
  "config": {
    "accessKey": "your S3 access key here",
    "accessSecret": "your S3 access secret here",
    "secure": false,
    "allowCreateDrop": true,
    "rootPath": "/",
    "credentialType": "ACCESS_KEY",
    "enableAsync": true,
    "compatibilityMode": false,
    "isCachingEnabled": true,
    "maxCacheSpacePct": 100,
    "requesterPays": false,
    "enableFileStatusCheck": true
  },
  "type": "S3",
  "name": "testing-S3",
  "metadataPolicy": {
    "authTTLMs": 86400000,
    "namesRefreshMs": 3600000,
    "datasetRefreshAfterMs": 3600000,
    "datasetExpireAfterMs": 10800000,
    "datasetUpdateMode": "PREFETCH_QUERIED",
    "deleteUnavailableDatasets": true,
    "autoPromoteDatasets": false
  },
  "accelerationGracePeriodMs": 10800000,
  "accelerationRefreshPeriodMs": 3600000,
  "accelerationNeverExpire": false,
  "accelerationNeverRefresh": false,
  "allowCrossSourceSelection": false,
  "accessControlList": {},
  "permissions": [],
  "checkTableAuthorizer": true
}'
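
The {authorization token} placeholder in the request above has to be filled in with a real token, appended directly after _dremio with no space. Here is a minimal sketch of one way to get it, assuming a Dremio software install on localhost:9047 that accepts username/password logins at /apiv2/login and returns a JSON body with a "token" field. The function names are hypothetical helpers, not part of Dremio, and the sed extraction assumes the token contains no double quotes:

```shell
# Assumption: same local Dremio instance as in the cURL example.
DREMIO_URL="${DREMIO_URL:-http://localhost:9047}"

# Build the JSON login body.
# Assumes the values contain no double quotes or backslashes.
login_payload() {
  printf '{"userName": "%s", "password": "%s"}' "$1" "$2"
}

# Read a login response on stdin and print the value of its "token" field.
extract_token() {
  sed -n 's/.*"token"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Format the authorization header: "_dremio" followed directly by the token.
auth_header() {
  printf 'authorization: _dremio%s' "$1"
}

# Log in and print the token. Makes a network call, so it only runs when invoked.
dremio_login() {
  curl --silent --request POST \
    --url "$DREMIO_URL/apiv2/login" \
    --header 'content-type: application/json' \
    --data "$(login_payload "$1" "$2")" | extract_token
}
```

With these in place, something like TOKEN=$(dremio_login myuser mypassword) followed by --header "$(auth_header "$TOKEN")" should slot into the catalog request. If your deployment uses personal access tokens or a different login flow, this part will look different.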

Note that rootPath is set to / here, so the source will show every bucket in this S3 account that the credentials can access. Then, as @balaji.ramaswamy noted, assuming part_0.csv and part_1.csv have the same schema, you can promote (format) the datadir folder to a physical dataset, which will contain the records from both files.
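
The promotion step can also be done over REST rather than in the UI. The sketch below is untested against your setup and rests on several assumptions: that promotion is a POST to /api/v3/catalog/{id}, where the id is the URL-encoded path prefixed with dremio:/; that the Text format accepts fieldDelimiter and extractHeader; and that the bucket is called my-bucket (a made-up name, substitute your own). The helper names are hypothetical:

```shell
# Assumption: same local Dremio instance as in the cURL example.
DREMIO_URL="${DREMIO_URL:-http://localhost:9047}"

# Percent-encode the ':' and '/' characters of an unpromoted entity id.
# Only these two characters are handled; other special characters are not.
encode_id() {
  printf '%s' "$1" | sed -e 's/:/%3A/g' -e 's|/|%2F|g'
}

# Build the promotion body for a folder of CSV files.
# $1 = source name, $2 = bucket, $3 = folder
promote_payload() {
  printf '{"entityType": "dataset", "id": "%s", "path": ["%s", "%s", "%s"], "type": "PHYSICAL_DATASET", "format": {"type": "Text", "fieldDelimiter": ",", "extractHeader": true}}' \
    "$(encode_id "dremio:/$1/$2/$3")" "$1" "$2" "$3"
}

# Send the promotion request. Makes a network call, so it only runs when invoked.
# $1 = token, $2 = source name, $3 = bucket, $4 = folder
promote_folder() {
  curl --silent --request POST \
    --url "$DREMIO_URL/api/v3/catalog/$(encode_id "dremio:/$2/$3/$4")" \
    --header "authorization: _dremio$1" \
    --header 'content-type: application/json' \
    --data "$(promote_payload "$2" "$3" "$4")"
}
```

For the source created above, the call would be along the lines of promote_folder "$TOKEN" testing-S3 my-bucket datadir. Set extractHeader to false if the CSV files have no header row.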
