MySQL

Analyze social bookmarking sites to find insights Part 1

In this article, we will analyze social bookmarking sites to find insights using Big Data technology. The data comprises information gathered from bookmarking sites, which allow you to bookmark, review, rate, and search links on any topic. The data is in XML format and contains various categories along with the ratings linked to them. Problem Statement: Analyze the data in the Hadoop ecosystem to: fetch the data into the Hadoop Distributed File System and analyze it with the help of MapReduce, Pig, and Hive…
Sensex Log Data Processing (PDF File Processing in Map Reduce) Part 1

In this article, we will see how to process a Sensex log (share market data) in PDF format using Big Data technology, walking through the execution of the project step by step.

Problem Statement: Analyze the data in the Hadoop ecosystem to:
- Take the complete PDF input data on HDFS
- Develop a MapReduce use case to get the filtered results below from the HDFS input data (Excel data)

If TYPE OF TRADING is --> 'SIP'
    - OPEN_BALANCE > 25000 & FLTUATION_RATE > 10 --> store "HighDemandMarket"
    - CLOSING_BALANCE < 22000 & FLTUATION_RATE in between 20 - 30 --> store "OnGoingMarketStretegy"
If TYPE OF…
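The two 'SIP' rules visible in this excerpt can be sketched in plain Python as a rough illustration of the filtering logic. This is not the article's MapReduce code. The function name `sip_buckets` and its parameter names are hypothetical, the 20 - 30 range is assumed to be inclusive, and a record matching both rules is assumed to go to both outputs, since the excerpt does not say whether the rules are exclusive. The rules for other trading types are cut off in the excerpt and are left out.

```python
from typing import List


def sip_buckets(open_balance: int, closing_balance: int, fluctuation_rate: int) -> List[str]:
    """Return the output names that a 'SIP' record qualifies for.

    Only the two rules visible in the excerpt are covered.
    Assumption: the "20 - 30" range is inclusive at both ends.
    Assumption: a record may match both rules and is then listed under both.
    """
    buckets = []
    # Rule 1: OPEN_BALANCE > 25000 and FLTUATION_RATE > 10
    if open_balance > 25000 and fluctuation_rate > 10:
        buckets.append("HighDemandMarket")
    # Rule 2: CLOSING_BALANCE < 22000 and FLTUATION_RATE between 20 and 30
    if closing_balance < 22000 and 20 <= fluctuation_rate <= 30:
        buckets.append("OnGoingMarketStretegy")
    return buckets
```

In a MapReduce job, a check like this would typically sit inside the mapper or reducer, with the returned names used to pick the output file for each record.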
Sensex Log Data Processing (PDF File Processing in Map Reduce) Part 2

Apache Pig Script

SENSEX.pig

A = LOAD '/hdfs/bhavesh/SENSEX/OUTPUT/HighDemandMarket-r-00000' using PigStorage('\t') as (Sid:int,Sname:chararray,Ttrading:chararray,Sloc:chararray,OBal:int,CBal:int,Frate:int);
disHM = DISTINCT A;
orHM = ORDER disHM by Sid;
STORE orHM into '/hdfs/bhavesh/SENSEX/HM' using PigStorage(',');

A = LOAD '/hdfs/bhavesh/SENSEX/OUTPUT/ReliableProducts-r-00000' using PigStorage('\t') as (Sid:int,Sname:chararray,Ttrading:chararray,Sloc:chararray,OBal:int,CBal:int,Frate:int);
disRP = DISTINCT A;
orRP = ORDER disRP by Sid;
STORE orRP into '/hdfs/bhavesh/SENSEX/RP' using PigStorage(',');

A = LOAD '/hdfs/bhavesh/SENSEX/OUTPUT/OtherProducts-r-00000' using PigStorage('\t') as (Sid:int,Sname:chararray,Ttrading:chararray,Sloc:chararray,OBal:int,CBal:int,Frate:int);
disOP = DISTINCT A;
orOP = ORDER disOP by Sid;
STORE orOP into '/hdfs/bhavesh/SENSEX/OP' using PigStorage(',');

A = LOAD '/hdfs/bhavesh/SENSEX/OUTPUT/WealthyProducts-r-00000' using PigStorage('\t') as (Sid:int,Sname:chararray,Ttrading:chararray,Sloc:chararray,OBal:int,CBal:int,Frate:int);
disWP = DISTINCT A;
orWP = ORDER disWP by Sid;
STORE orWP into '/hdfs/bhavesh/SENSEX/WP' using PigStorage(',');

A…
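Each group of Pig statements above follows the same pattern: load a tab-separated MapReduce output file, drop duplicate rows, order by Sid, and store the result as comma-separated text. For readers less familiar with Pig, a rough single-machine Python analogue of one such group might look like the sketch below. The function name `dedupe_and_sort` is hypothetical, the input is passed as a string rather than read from HDFS, and Sid is assumed to be the first column, as in the Pig schema.

```python
import csv
import io


def dedupe_and_sort(tsv_text: str) -> str:
    """Mimic LOAD (tab) -> DISTINCT -> ORDER BY Sid -> STORE (comma) for one file.

    Assumption: the first column is Sid and always parses as an integer.
    """
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    # DISTINCT: a set of tuples keeps one copy of each identical row
    unique_rows = {tuple(row) for row in reader if row}
    # ORDER ... BY Sid: sort numerically on the first column
    ordered = sorted(unique_rows, key=lambda row: int(row[0]))
    # STORE ... PigStorage(','): write comma-separated lines
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(ordered)
    return out.getvalue()
```

One difference to keep in mind: Python's `csv.writer` quotes fields that contain commas, whereas `PigStorage(',')` writes them as-is, so the two outputs only match when no field contains a comma.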