Hive in OSX El-Capitan

STEP 1: Install MySQL (prerequisite)

$ brew install mysql
$ mysql.server restart

If you get this error:

Error: The `brew link` step did not complete successfully
The formula built, but is not symlinked into /usr/local
Could not symlink include/mysql
/usr/local/include is not writable.

make /usr/local/include writable by your user (the error is about /usr/local, not /usr itself) and link again:

$ brew link mysql
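
Before re-running the link step, it may help to see which directories are actually blocked. The following is only a sketch: `check_writable` is a hypothetical helper name, not a Homebrew command, and the `chown` line in the comment is one commonly used fix that may not suit every setup.

```shell
# Report whether each given directory is writable by the current user.
# check_writable is a hypothetical helper, not a Homebrew command.
check_writable() {
  for dir in "$@"; do
    if [ -w "$dir" ]; then
      echo "ok: $dir"
    else
      echo "missing or not writable: $dir"
    fi
  done
}

# The directories named in the error message above.
check_writable /usr/local /usr/local/include

# If a directory is reported as not writable, one common fix (an assumption,
# adjust to your setup) is to take ownership of /usr/local:
#   sudo chown -R "$(whoami)" /usr/local
```

Once both directories report "ok", `brew link mysql` should be able to create its symlinks.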

STEP 2: Install Hive

$ brew install hive

STEP 3: Set HADOOP_HOME and HIVE_HOME by editing your ~/.bash_profile

$ vim ~/.bash_profile
export HADOOP_HOME=/usr/local/Cellar/hadoop/2.7.0
export HIVE_HOME=/usr/local/Cellar/hive/1.2.1/libexec
$ source ~/.bash_profile

STEP 4: Download the MySQL JDBC connector from http://dev.mysql.com/downloads/connector/j/ and copy the jar into Hive's lib directory

$ tar zxvf mysql-connector-java-5.1.36.tar.gz
$ sudo cp mysql-connector-java-5.1.36/mysql-connector-java-5.1.36-bin.jar /usr/local/Cellar/hive/1.2.1/libexec/lib/
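
A missing driver jar is an easy mistake to make at this step, so a quick check may be worthwhile. This is a minimal sketch: `find_connector_jar` is a hypothetical helper, and the fallback path simply repeats the Hive 1.2.1 location used in this guide.

```shell
# Print the first MySQL connector jar found in a directory,
# or return non-zero if there is none.
# find_connector_jar is a hypothetical helper name.
find_connector_jar() {
  # $1 = Hive lib directory
  for jar in "$1"/mysql-connector-java-*.jar; do
    if [ -e "$jar" ]; then
      echo "$jar"
      return 0
    fi
  done
  echo "no MySQL connector jar in $1" >&2
  return 1
}

# Falls back to the path used in this guide when HIVE_HOME is not set.
find_connector_jar "${HIVE_HOME:-/usr/local/Cellar/hive/1.2.1/libexec}/lib" || true
```

If nothing is printed to standard output, repeat the `cp` command above and check the version number in the file name.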

STEP 5: Set up the MySQL database

$ mysqladmin -u root password 'your root password'
$ mysql -u root -p
mysql> CREATE DATABASE metastore;
mysql> USE metastore;
mysql> ALTER DATABASE metastore CHARACTER SET latin1 COLLATE latin1_swedish_ci;
mysql> CREATE USER 'hiveuser'@'localhost' IDENTIFIED BY 'welcome1';
mysql> GRANT ALL PRIVILEGES ON *.* TO 'hiveuser'@'localhost' WITH GRANT OPTION;
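
The grant above gives hiveuser every privilege on every database. If you would rather limit it to the metastore database, a narrower statement could be generated as sketched below. `metastore_grant_sql` is a hypothetical helper that only prints SQL; whether the narrower grant is enough for your setup is an assumption you should verify.

```shell
# Print a GRANT statement limited to a single database.
# metastore_grant_sql is a hypothetical helper; it only prints SQL.
metastore_grant_sql() {
  # $1 = database name, $2 = MySQL user
  printf "GRANT ALL PRIVILEGES ON %s.* TO '%s'@'localhost';\n" "$1" "$2"
}

metastore_grant_sql metastore hiveuser

# To apply it, the output could be piped to the client:
#   metastore_grant_sql metastore hiveuser | mysql -u root -p
```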

STEP 6: Copy hive-default.xml.template to hive-site.xml

$ cd /usr/local/Cellar/hive/1.2.1/libexec/conf
$ cp hive-default.xml.template hive-site.xml

Edit the following properties in hive-site.xml:

<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost/metastore</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hiveuser</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>welcome1</value>
</property>
<property>
<name>datanucleus.fixedDatastore</name>
<value>false</value>
</property>
<property>
<name>hive.exec.local.scratchdir</name>
<value>/tmp/hive</value>
<description>Local scratch space for Hive jobs</description>
</property>
<property>
<name>hive.downloaded.resources.dir</name>
<value>/tmp/hive</value>
<description>Temporary local directory for added resources in the remote file system.</description>
</property>
<property>
<name>hive.querylog.location</name>
<value>/tmp/hive</value>
<description>Location of Hive run time structured log file</description>
</property>
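
hive-site.xml is a large file, so it can be useful to confirm the values that ended up in it. The sketch below assumes each `<value>` appears on a line after its `<name>`, as in the snippet above; `get_prop` is a hypothetical helper, not a Hive tool, and the fallback path repeats the one used in this guide.

```shell
# Print the value of one property from a hive-site.xml style file.
# Assumes <value> follows <name> on a later line, as in the snippet above.
# get_prop is a hypothetical helper name.
get_prop() {
  # $1 = path to hive-site.xml, $2 = property name
  awk -v n="<name>$2</name>" '
    index($0, n) { found = 1; next }
    found && /<value>/ {
      gsub(/.*<value>|<\/value>.*/, "")
      print
      exit
    }
  ' "$1"
}

site="${HIVE_HOME:-/usr/local/Cellar/hive/1.2.1/libexec}/conf/hive-site.xml"
if [ -f "$site" ]; then
  get_prop "$site" javax.jdo.option.ConnectionURL
  get_prop "$site" javax.jdo.option.ConnectionUserName
fi
```

An empty result means the property was not found, or its value is not written in the `<value>...</value>` form.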

Run hive

$ hive
hive> show tables;

STEP 7: Create the HDFS directories

$ hdfs dfs -mkdir -p /tmp
$ hdfs dfs -mkdir -p /user/hive/warehouse
$ hdfs dfs -chmod g+w /tmp
$ hdfs dfs -chmod g+w /user/hive/warehouse
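
The same directory setup can be wrapped in a small script that is safe to re-run. This is a sketch under assumptions: `run` and `setup_hive_dirs` are hypothetical helper names, and the script defaults to a dry run that only prints the `hdfs` commands, so it can be tried before HDFS is running.

```shell
# Set DRY_RUN=0 to execute the commands instead of printing them.
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = 1 ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Create each directory (with parents) and make it group-writable.
setup_hive_dirs() {
  for dir in /tmp /user/hive/warehouse; do
    run hdfs dfs -mkdir -p "$dir"
    run hdfs dfs -chmod g+w "$dir"
  done
}

setup_hive_dirs
```

Using `-mkdir -p` for both directories means the script does not fail when a directory already exists.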

Create hcat.out

$ mkdir -p hcatalog/var/log
$ touch hcatalog/var/log/hcat.out

Add bin to path

$ vim ~/.bash_profile
export PATH=/usr/local/Cellar/hive/1.2.1/bin:$PATH
$ source ~/.bash_profile
$ hive

To see errors in detail, run Hive with console logging:

$ hive -hiveconf hive.root.logger=INFO,console

STEP 8: Examples

hive> CREATE TABLE pokes (foo varchar(255), bar STRING);
hive> CREATE TABLE invites (foo INT, bar STRING) PARTITIONED BY (ds STRING);
hive> SHOW TABLES;
hive> SHOW TABLES '.*s';
hive> DESCRIBE invites;

Using a dataset: MovieLens user ratings. Download the dataset from MovieLens:

$ wget http://files.grouplens.org/datasets/movielens/ml-100k.zip
$ unzip ml-100k.zip

Run Hive and load the data:

$ ./hive
hive> CREATE TABLE u_data (
  userid INT,
  movieid INT,
  rating INT,
  unixtime STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;
hive> LOAD DATA LOCAL INPATH '../ml-100k/u.data' OVERWRITE INTO TABLE u_data;
hive> SELECT COUNT(*) FROM u_data;
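
Because the table is declared as tab-delimited with four columns, it can help to check the file's shape before loading it. A minimal sketch, assuming the archive was unzipped into the current directory; `check_tsv` is a hypothetical helper name.

```shell
# Print the row count of a tab-separated file, or fail if any row
# does not have the expected number of fields.
# check_tsv is a hypothetical helper name.
check_tsv() {
  # $1 = file, $2 = expected number of tab-separated fields
  awk -F '\t' -v want="$2" '
    NF != want { printf "line %d has %d fields\n", NR, NF; bad = 1; exit }
    END { if (bad) exit 1; print NR }
  ' "$1"
}

# Assumed location of the unzipped data; adjust to where you ran unzip.
data=ml-100k/u.data
if [ -f "$data" ]; then
  check_tsv "$data" 4
else
  echo "skipping: $data not found"
fi
```

MovieLens describes ml-100k as containing 100,000 ratings, so the row count printed here and the result of the COUNT(*) query should agree with that figure.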

2 thoughts on “Hive in OSX El-Capitan”

  1. I also needed to set hive.metastore.schema.verification to ‘false’:

    <property>
    <name>hive.metastore.schema.verification</name>
    <value>false</value>
    <description>
    Enforce metastore schema version consistency.
    True: Verify that version information stored in metastore is compatible with one from Hive jars. Also disable automatic
    schema migration attempt. Users are required to manually migrate schema after Hive upgrade which ensures
    proper metastore schema migration. (Default)
    False: Warn if the version information stored in metastore doesn’t match with one from Hive jars.
    </description>
    </property>

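  2. If you get the "database not initialized" error ‘MetaException(message:Hive metastore database is not initialized.’, try running the following command, and then run hive (-dbType expects a database type after it, presumably mysql for the metastore set up in this guide):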

    schematool -initSchema -dbType
    hive
