Bhavesh

Analytics on India census using Apache Spark Part 2

Code for Spark SQL to get Population Density by District
Code for Spark SQL to get Scheduled Castes (SCs) Population per State
Code for Spark SQL to get the percentage of each state's population that is literate
Code for Spark SQL to get States with a Literacy Rate below 50%
Code for Spark SQL to get Male and Female Literacy Rates per State
Code for Spark SQL to get Literacy Rate by Type of Education for every State
Code for Spark SQL to get Male and Female Population Percentage per State
Code for Spark SQL to…
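As a sketch of what one of these queries might look like (the table name census and the columns district, population, and area_sq_km are assumed placeholders, not the actual schema used in the article), population density per district could be computed with:

```sql
-- Hypothetical schema: census(district, state, population, area_sq_km)
SELECT district,
       population / area_sq_km AS population_density
FROM census
ORDER BY population_density DESC
```

In a Databricks notebook this would run in a %sql cell against a table registered from the census CSV.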
Read More
Analytics on India census using Apache Spark Part 3

Code for Spark SQL to get the Status of Electricity Facilities by State
Code for Spark SQL to get Education Facilities in India per State
Code for Spark SQL to get Medical Facilities in India
Code for Spark SQL to get Bus Transportation per State
Code for Spark SQL to get Road Status in India
Code for Spark SQL to get Residence Status in India by State
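A hedged sketch of the electricity-status query (the table villages and the column has_electricity are assumptions about the schema, not taken from the article):

```sql
-- Hypothetical schema: villages(state, district, has_electricity)
SELECT state,
       SUM(CASE WHEN has_electricity = 'Yes' THEN 1 ELSE 0 END) AS electrified_villages,
       COUNT(*) AS total_villages
FROM villages
GROUP BY state
ORDER BY state
```

The other facility queries in the list follow the same GROUP BY state pattern over their respective columns.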
Read More
Basics about Databricks notebook

Click on Create a Blank Notebook as shown in the image below. Specify the file name and select the cluster we created earlier. A notebook is a collection of runnable cells (commands); when you use a notebook, you are primarily developing and running cells. The supported magic commands are %python, %r, %scala, and %sql. Additionally: %sh allows you to execute shell code in your notebook; %fs allows you to use dbutils filesystem commands; %md allows you to include various types of documentation, including text, images, and mathematical formulas and equations. For more details, please refer to the Databricks documentation.
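For example, a cell whose first line is the %sql magic runs Spark SQL regardless of the notebook's default language (the query itself is just a placeholder):

```sql
%sql
SELECT 'Hello from Spark SQL' AS greeting
```

The same pattern applies to %sh, %fs, and %md cells: the magic on the first line decides how the rest of the cell is interpreted.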
Read More
Free Account creation in Databricks Community Edition

What is the Databricks Community Edition? The Databricks Community Edition is the free version of the Databricks cloud-based big data platform. It allows users to access a micro-cluster as well as a cluster manager and notebook environment. All users can share their notebooks and host them free of charge with Databricks. Link for Databricks Community Edition: https://community.cloud.databricks.com/login.html Open the link in any up-to-date browser; we recommend Google Chrome for the best experience. Click on Sign Up as shown in the image. A new page will open as shown in the image below. Fill in all the required details as applicable…
Read More
Provisioning a Spark Cluster or Creating a Spark Cluster

Once you log in to Databricks Community Edition, you will see a Clusters button in the left tab as shown in the image; click on it. As soon as you click the Clusters button, a new web page opens as shown in the image below. As soon as you click Create Cluster, another web page opens as shown in the image below. The steps to launch a Spark cluster are as follows: specify the cluster name (you can specify any cluster name; for all of our projects we will use SparkCluster), then click Create Cluster. Please make a note: Free 15GB Memory:…
Read More
Loading Data into Databricks Environment

Loading data into Databricks: Click on Import and Explore Data. A new popup opens; select the file you want to upload into Databricks. Once you click on Drop files, a new popup opens, then a new web page opens and the file is uploaded into the Databricks environment. Make sure you see the tick mark, which indicates the file was uploaded successfully; then copy the file location and refer to this file in your notebook.
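Once uploaded, the copied file location (a path under /FileStore/tables/; the file name below is a placeholder) can be referenced from a notebook. For example, a Spark SQL cell can expose the CSV as a table:

```sql
-- The path below is a placeholder; use the location copied after your upload
CREATE TABLE IF NOT EXISTS my_data
USING csv
OPTIONS (path '/FileStore/tables/my_file.csv', header 'true', inferSchema 'true')
```

After this, the file can be queried by name (SELECT * FROM my_data) instead of by path.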
Read More
Bidding Auction Data Analytics in Apache Spark

In this article, we have explored bidding-auction data analysis. The dataset comes from eBay-like online auctions. Even with small data, I could still gain a lot of valuable insights. I have used Spark RDDs in Databricks. In this activity, we will load data into Apache Spark and inspect it. In this section, we use the SparkContext method textFile to load the data into a Resilient Distributed Dataset (RDD). Our dataset is a .csv file that consists of online auction data. Each auction has an auction id associated with it and…
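The article itself works through RDDs, but as an equivalent Spark SQL sketch over the same CSV once it is registered as a table (the table name auction and its columns are assumptions about an eBay-style bid schema, not the article's actual field names):

```sql
-- Hypothetical schema: auction(auctionid, bid, bidtime, bidder, openbid, price, item)
SELECT item,
       COUNT(DISTINCT auctionid) AS num_auctions,
       MAX(price) AS highest_final_price
FROM auction
GROUP BY item
```

This mirrors the kind of per-item inspection the RDD version performs with map and reduceByKey.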
Read More