Wednesday, November 12, 2014

How to process XML data files using Hive in Big Data

Copy your XML data file into your local folder.
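
Hive's default text format treats every line of a file as one row, so an XML record that spans several lines would end up split across rows. A minimal shell sketch for flattening the file first, assuming each record is wrapped in a <record> element with no enclosing root element (the file names, the sample data and the tag names are made up for illustration):

```shell
#!/bin/sh
# Hypothetical working directory and file names; adjust to your setup.
workdir=$(mktemp -d)
input="$workdir/input.xml"
flat="$workdir/xmlfile"

# Sample multi-line XML; <record>, <id> and <name> are assumed tags.
cat > "$input" <<'EOF'
<record>
  <id>1</id>
  <name>alice</name>
</record>
<record>
  <id>2</id>
  <name>bob</name>
</record>
EOF

# 1. strip leading indentation, 2. join everything into one line,
# 3. start a new line after every closing </record> tag.
sed 's/^[[:space:]]*//' "$input" \
  | tr -d '\n\r' \
  | awk '{ gsub(/<\/record>/, "</record>\n"); printf "%s", $0 }' \
  > "$flat"

cat "$flat"
```

The flattened file is the one to load into the table.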

Now create a Hive table with a single string column to hold each XML record

hive> CREATE TABLE xmltable (mystr STRING);

Load data into the Hive table from the local XML file

hive> LOAD DATA LOCAL INPATH 'xmlfile' INTO TABLE xmltable;

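Use the built-in xpath() function to pull values out of each XML string. It takes the column and an XPath expression, and returns the matches as an array of strings

hive> SELECT xpath(mystr, '<xpath expression>') FROM xmltable;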
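
When a single value per row is wanted rather than an array, Hive also ships typed variants such as xpath_string() and xpath_int(). A sketch run from the shell with hive -e, assuming each row of xmltable holds one <record> element with <id> and <name> children (these element names are hypothetical):

```shell
#!/bin/sh
# Assumed record layout: <record><id>1</id><name>alice</name></record>
query="SELECT xpath_int(mystr, '/record/id') AS id,
       xpath_string(mystr, '/record/name') AS name
FROM xmltable;"

# Run the query only when the hive CLI is on the PATH.
if command -v hive >/dev/null 2>&1; then
  hive -e "$query"
else
  echo "hive CLI not found; query would be:"
  echo "$query"
fi
```

Each XML record then comes back as an ordinary row with an id and a name column.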

How to process JSON data files using Hive in Big Data

Use Hive-JSON SerDe

Copy hive-json-serde.jar to the Hive server

Copy the test.json file into your local folder, for example /tmp.
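
JSON SerDes for Hive text tables generally expect one complete JSON object per line. A small shell sketch that writes such a file and checks its shape, assuming made-up sample values for the field1, field2 and field3 columns used in the table:

```shell
#!/bin/sh
# Hypothetical sample data matching the columns field1, field2, field3.
json_file="/tmp/test.json"

cat > "$json_file" <<'EOF'
{"field1":"data1","field2":100,"field3":"more data1"}
{"field1":"data2","field2":200,"field3":"more data2"}
EOF

# Sanity check: count lines that do not start with { and end with }.
bad_lines=$(grep -c -v '^{.*}$' "$json_file")
echo "lines that are not a single JSON object: $bad_lines"
```

A pretty-printed JSON file would fail this check and should be collapsed to one object per line before loading.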

ADD JAR /path/to/hive-json-serde.jar;

Create a table

       CREATE TABLE test_json_table (
          field1 string, field2 int, field3 string
       )
       ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.JsonSerde';

Load the JSON file into the table

       LOAD DATA LOCAL INPATH '/tmp/test.json' INTO TABLE test_json_table;

Write a simple SELECT statement to fetch the data

SELECT field1,field2,field3 FROM test_json_table;

Saturday, November 8, 2014

Connecting to MySQL via Sqoop in Big Data


Syntax to import data from any relational database

sqoop import --connect <JDBC connection string> --table <tablename> --username <username> --password <password>

For MySQL, the command looks like this

sqoop import --connect jdbc:mysql://<ipaddress>/<db> --username <username> --password <password> --table <tablename>
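
Passing --password on the command line leaves the password in the shell history and the process list; Sqoop's -P flag prompts for it instead. A sketch of a MySQL import built that way, assuming hypothetical host, database, user and table names, and adding --target-dir and --num-mappers to control where and how the data lands in HDFS:

```shell
#!/bin/sh
# Hypothetical connection details; replace with real values.
db_host="dbhost.example.com"
db_name="salesdb"
db_user="etl_user"
table="customers"

connect_url="jdbc:mysql://${db_host}/${db_name}"
target_dir="/user/${db_user}/${table}"

# Build the argument list once so it can be printed or executed.
# -P prompts for the password; a single mapper avoids needing a split column.
set -- import \
  --connect "$connect_url" \
  --username "$db_user" \
  -P \
  --table "$table" \
  --target-dir "$target_dir" \
  --num-mappers 1

# Run the import only when sqoop is on the PATH.
if command -v sqoop >/dev/null 2>&1; then
  sqoop "$@"
else
  echo "sqoop not found; would run: sqoop $*"
fi
```

Raising --num-mappers parallelizes the import, but the table then needs a primary key or an explicit --split-by column.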