Data Visualization

YouTube Spam Comment Prediction

Machine Learning Project: YouTube Spam Comment Prediction. This project studies the classification of YouTube comments as spam based on the available attributes.

Data Set Information: COMMENT_ID (String), AUTHOR (String), DATE (String), CONTENT (String), CLASS (Double).

Technology Used: Apache Spark, Spark SQL, Apache Spark MLlib, Scala, DataFrame-based API, Databricks Notebook.

Challenges: Process a comma-separated values file (i.e., a file with the .csv extension) using a user-defined schema for the data. Convert String data to a numeric format so that it can be processed by the Apache Spark ML library.

Introduction: Welcome to this project on creating a prediction model to identify spam comments with Apache Spark machine learning, using the Databricks Community Edition server, which allows…

Identify the Type of animal (7 Types) based on the available attributes

Machine Learning Project: Animal Classification. This project studies the classification of animals: identify the type of animal (7 types) based on the available attributes.

Data Set Information: A simple database containing 17 Boolean-valued attributes. The "type" attribute appears to be the class attribute. Here is a breakdown of which animals are in which type:

Class# -- Set of animals
1 -- (41) aardvark, antelope, bear, boar, buffalo, calf, cavy, cheetah, deer, dolphin, elephant, fruitbat, giraffe, girl, goat, gorilla, hamster, hare, leopard, lion, lynx, mink, mole, mongoose, opossum, oryx, platypus, polecat, pony, porpoise, puma, pussycat, raccoon, reindeer, seal, sealion,…

Glass Identification

Machine Learning Project for Glass Identification.

Problem Statement or Business Problem: From the USA Forensic Science Service; 6 types of glass, defined in terms of their oxide content (Na, Fe, K, etc.). The study of the classification of types of glass was motivated by criminological investigation. At the scene of a crime, the glass left behind can be used as evidence... if it is correctly identified!

Attribute Information or Dataset Details:
Id number: 1 to 214
RI: refractive index
Na: Sodium (unit of measurement: weight percent in corresponding oxide, as are attributes 4-10)
Mg: Magnesium
Al: Aluminum
Si: Silicon
K: Potassium
Ca: Calcium
Ba: Barium
Fe: Iron
Type of glass: (class attribute)
building_windows_float_processed…

Predicting the age of abalone from physical measurements Part 1

Machine Learning Project: Predicting the age of abalone from physical measurements. Abalone is a common name for any of a group of small to very large sea snails, marine gastropod molluscs in the family Haliotidae. Other common names are ear shells, sea ears, and muttonfish or muttonshells in Australia, ormer in the UK, perlemoen in South Africa, and paua in New Zealand.

Problem Statement or Business Problem: Predict the age of abalone from physical measurements. The age of abalone is determined by cutting the shell through the cone,…

Predicting the age of abalone from physical measurements Part 2

Histogram for Sex and Age
%sql select Sex, Age from AbaloneData;

Plot Option

Age Distribution
%sql select count(Sex), Sex from AbaloneData group by Sex;

Histogram for Length in mm in Abalone
%sql select Length_in_mm from AbaloneData;

Histogram for Height in mm in Abalone
%sql select Height_in_mm from AbaloneData;

Histogram for Rings in Abalone
%sql select Rings from AbaloneData;

Creating a Regression Model
In this tutorial, you will implement a regression model that uses physical measurements of abalone as features to predict their age.

Import Spark SQL and Spark ML Libraries
First, import the libraries you will…

Sentiment Analysis on Demonetization in India using Apache Spark

In this article, I explore the sentiments of people in India during demonetization. Even with a small data set, I could still gain a lot of valuable insights. I used Spark SQL and the built-in graphs provided by Databricks. India is the second-most populous country in the world, with over 1.271 billion people, more than a sixth of the world's population. Let us find out the views of different people on demonetization by analyzing tweets from Twitter.

Attribute Information or Dataset Details: Table Created in Databricks Environment

Technology Used: Apache Spark, Spark SQL, DataFrame-based API, Databricks Notebook.

Free Account creation in Databricks…

Analytics on India census using Apache Spark Part 1

In this article, I explore census data for India to understand changes in India's demographics, population growth, religion distribution, gender distribution, sex ratio, and more. Even with a small data set, I could still gain a lot of valuable insights about the country. I used Spark SQL and the built-in graphs provided by Databricks. India is the second-most populous country in the world, with over 1.271 billion people, more than a sixth of the world's population. Already containing 17.5% of the world's population, India is projected to be the world's most populous country by 2025, surpassing China, its population reaching…

Analytics on India census using Apache Spark Part 2

Code for Spark SQL to get Population Density in terms of Districts
Code for Spark SQL to get Scheduled Castes (SCs) Population per State
Code for Spark SQL to get What Percentage of the States Are Actually Literate in India
Code for Spark SQL to get States Which Have a Literacy Rate Less Than 50%
Code for Spark SQL to get Male and Female Literacy Rate per State
Code for Spark SQL to get Literacy Rate by Type of Education for Every State
Code for Spark SQL to get Male and Female Percentage per State
Code for Spark SQL to…

Analytics on India census using Apache Spark Part 3

Code for Spark SQL to get Electricity Facility Status by State
Code for Spark SQL to get Education Facility in India per State
Code for Spark SQL to get Medical Facility in India
Code for Spark SQL to get Bus Transportation per State
Code for Spark SQL to get Road Status in India
Code for Spark SQL to get Residence Status in India by State