Apache Spark: how to import data from a JDBC database using Python

 2016-10-27

Using Apache Spark 2.0 and Python, I’ll show how to import a table from a relational database (using its JDBC driver) into a Spark DataFrame and save it as a Parquet file. In this demo the database is Oracle 12.x.

File jdbc-to-parquet.py:

```
from pyspark.sql import SparkSession

spark = SparkSession \
    .builder \
    .appName("Python Spark SQL basic example") \
    .getOrCreate()

# Read the table through the Oracle JDBC driver
df = spark.read.format("jdbc").options(
    url="jdbc:oracle:thin:ro/ro@mydboracle.redaelli.org:1521:MYSID",
    dbtable="myuser.dim_country",
    driver="oracle.jdbc.OracleDriver").load()

# Write the DataFrame out in Parquet format
df.write.parquet("country.parquet")
```
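As a variation on the script above, here is a minimal sketch that keeps the credentials out of the JDBC URL and passes them through the `user` and `password` options of Spark's JDBC data source instead. The helper names `oracle_thin_url`, `jdbc_options` and `import_table` are hypothetical, introduced only for this illustration; the host, SID, table and account are the ones from the demo.

```python
def oracle_thin_url(host, port, sid):
    """Build an Oracle thin-driver JDBC URL without embedded credentials."""
    return "jdbc:oracle:thin:@{}:{}:{}".format(host, port, sid)


def jdbc_options(host, port, sid, table, user, password):
    """Collect the options for Spark's JDBC data source in one dict."""
    return {
        "url": oracle_thin_url(host, port, sid),
        "dbtable": table,
        "driver": "oracle.jdbc.OracleDriver",
        "user": user,
        "password": password,
    }


def import_table(spark, options, out_path):
    """Read one table over JDBC and write it as Parquet.

    `spark` is an existing SparkSession, such as the one built in the script.
    """
    df = spark.read.format("jdbc").options(**options).load()
    # "overwrite" replaces the output of a previous run; with the default
    # save mode the write fails if out_path already exists.
    df.write.mode("overwrite").parquet(out_path)
    return df


# Same connection details as in the demo (hypothetical helper usage)
opts = jdbc_options("mydboracle.redaelli.org", 1521, "MYSID",
                    "myuser.dim_country", "ro", "ro")

# With a live session and the Oracle JDBC driver on the classpath:
# import_table(spark, opts, "country.parquet")
```

Either way, Spark has to be able to find the Oracle JDBC driver jar, for example through the `--jars` option of `spark-submit`; the jar's name and location depend on your installation.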

Tags: #apache #database #jdbc #spark
