Oracle Big Data Connectors

Oracle Big Data Connectors facilitate access to data stored in a Hadoop cluster. They can be licensed for use on either Oracle Big Data Appliance or a Hadoop cluster running on commodity hardware.

  • Oracle SQL Connector for HDFS: Enables an Oracle external table to access data in HDFS files or a table in Apache Hive.
  • Oracle Loader for Hadoop: Provides a high-performance loader for fast movement of data from a Hadoop cluster into tables in Oracle Database. Oracle Loader for Hadoop prepartitions the data if necessary and transforms it into a database-ready format. It optionally sorts records by primary key or user-defined columns before loading the data or creating output files.
  • Oracle XQuery for Hadoop: Runs transformations expressed in the XQuery language by translating them into a series of MapReduce jobs, which are executed in parallel on the Hadoop cluster. The input data can be located in a file system accessible through the Hadoop File System API, such as HDFS, or in Oracle NoSQL Database. Oracle XQuery for Hadoop can write the transformation results to HDFS, Oracle NoSQL Database, Apache Solr, or Oracle Database. Additional XML processing is available through XML Extensions for Hive.
  • Oracle Shell for Hadoop Loaders: A helper shell that provides a simple-to-use command line interface to Oracle Loader for Hadoop, Oracle SQL Connector for HDFS, and Copy to Hadoop (a feature of Big Data SQL). It has basic shell features.
  • Oracle R Advanced Analytics for Hadoop: Provides a general computation framework in which you can use the R language to write your custom logic as mappers or reducers. Oracle R Advanced Analytics for Hadoop includes interfaces to work with Apache Hive tables, the Apache Hadoop compute infrastructure, the local R environment, and Oracle Database tables.
  • Oracle Data Integrator: Extracts, loads, and transforms data from sources such as files and databases into Hadoop and from Hadoop into Oracle or third-party databases.
  • Oracle Datasource for Hadoop: Provides direct, fast, parallel, secure, and consistent access to master data in Oracle Database using Hive SQL and Spark SQL, as well as Hadoop APIs that support SerDes, HCatalog, InputFormat, and StorageHandler.
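
As a rough illustration of the prepartition-and-sort step described for Oracle Loader for Hadoop above, the following plain-Python sketch groups records into partitions and sorts each partition by a key before a load. It is a conceptual model only and does not use any Oracle API. The function name, the field names, and the modulo bucketing are all hypothetical stand-ins for whatever partitioning scheme the target table actually uses.

```python
from collections import defaultdict


def prepartition_and_sort(records, partition_key, sort_key, num_partitions):
    """Group records into partitions, then sort each partition by sort_key.

    Hypothetical helper for illustration; not part of Oracle Loader for Hadoop.
    """
    partitions = defaultdict(list)
    for record in records:
        # Assumption: modulo bucketing on an integer column stands in for
        # the target table's real partitioning scheme.
        bucket = record[partition_key] % num_partitions
        partitions[bucket].append(record)

    # Sort the rows inside each partition, and return partitions in bucket order.
    return {
        bucket: sorted(rows, key=lambda row: row[sort_key])
        for bucket, rows in sorted(partitions.items())
    }


# Made-up sample rows; "id" plays the role of a primary key.
rows = [
    {"id": 5, "region": 1},
    {"id": 2, "region": 0},
    {"id": 9, "region": 1},
    {"id": 1, "region": 0},
    {"id": 4, "region": 3},
]

result = prepartition_and_sort(rows, partition_key="region", sort_key="id", num_partitions=2)
for bucket, bucket_rows in result.items():
    print(bucket, [row["id"] for row in bucket_rows])
```

Grouping before sorting keeps each partition's rows contiguous and ordered, which is the general idea behind preparing data in a database-ready layout before it is loaded.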

Individual connectors may require software components to be installed in Oracle Database and either the Hadoop cluster or an external system set up as a Hadoop client for the cluster. For details on integrating Oracle Database and Apache Hadoop, visit the Certification Matrix. Oracle Big Data Connectors 3.0 and later support Yet Another Resource Negotiator (YARN). For information about Oracle CDH visit