Healthcare Analytics for Beginners Part 2

Patient's Age

Patient's Income

Patient's Occupation

All in One Scatter Plot

Loading Data into DataFrame


// File location and type
val file_location = "/FileStore/tables/First_Health_Camp_Attended.csv"
val file_type = "csv"

// CSV options
val infer_schema = "true"
val first_row_is_header = "true"
val delimiter = ","

// The applied options are for CSV files. For other file types, these will be ignored.
val First_Health_Camp_Attended = spark.read.format(file_type)
  .option("inferSchema", infer_schema)
  .option("header", first_row_is_header)
  .option("sep", delimiter)
  .load(file_location)


Count of Data (Total Records)
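The cell for this step is missing from the text. A minimal sketch, assuming it was a plain row count on the DataFrame loaded above; a bare expression like this is what makes the notebook print a `res…: Long` line such as the one shown below:

```scala
// Assumes First_Health_Camp_Attended was created by the load cell above.
// count() is an action: it scans the data and returns the number of rows as a Long.
First_Health_Camp_Attended.count()
```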



res12: Long = 6218

Displaying Statistics of Data
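The cell for this step is missing from the text. One way to get summary statistics, sketched here under the assumption that the two numeric measure columns are the ones of interest, is `describe`, which reports count, mean, stddev, min and max:

```scala
// Assumes First_Health_Camp_Attended was created by the load cell above.
// The choice of columns is an assumption; describe() with no arguments
// would cover every numeric and string column instead.
val stats = First_Health_Camp_Attended.describe("Donation", "Health_Score")

// show() prints the result as a text table; the "summary" column names each statistic.
stats.show()
```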



Print Schema of Data
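The cell for this step is missing from the text. A minimal sketch, assuming the schema tree shown below came from the standard `printSchema` call on the loaded DataFrame:

```scala
// Assumes First_Health_Camp_Attended was created by the load cell above.
// printSchema() prints each column with its inferred type and nullability.
First_Health_Camp_Attended.printSchema()
```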



root
 |-- Patient_ID: integer (nullable = true)
 |-- Health_Camp_ID: integer (nullable = true)
 |-- Donation: integer (nullable = true)
 |-- Health_Score: double (nullable = true)

Creating Temp View so we can run Spark SQL Queries on data
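The cell for this step is missing from the text. A minimal sketch of registering the DataFrame as a temporary view and querying it with Spark SQL; the view name `health_camp` is a hypothetical choice, since the original name is not shown:

```scala
// Assumes First_Health_Camp_Attended was created by the load cell above.
// "health_camp" is a hypothetical view name used only for illustration.
First_Health_Camp_Attended.createOrReplaceTempView("health_camp")

// The view is scoped to the current SparkSession and can now be queried with SQL.
val preview = spark.sql(
  "SELECT Patient_ID, Health_Camp_ID, Donation, Health_Score FROM health_camp LIMIT 5"
)
preview.show()
```

A temporary view stores no data of its own; it is a name bound to the DataFrame's query plan and disappears when the session ends.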



Exploratory Data Analysis

Donation Distribution
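The plot for this step is not preserved. As a rough sketch of the data behind such a chart, assuming the distribution means the number of records per donation amount:

```scala
// Assumes First_Health_Camp_Attended was created by the load cell above.
// One row per distinct Donation value, with the number of records at that value.
val donationDistribution = First_Health_Camp_Attended
  .groupBy("Donation")
  .count()
  .orderBy("Donation")

donationDistribution.show()
```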

Health Score Vs Donation
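The plot for this step is not preserved. One possible numeric companion to it, offered as an assumption rather than the author's method, is the average health score per donation amount together with the Pearson correlation between the two columns:

```scala
import org.apache.spark.sql.functions.{avg, count}

// Assumes First_Health_Camp_Attended was created by the load cell above.
// Average Health_Score and number of records for each Donation value.
val scoreByDonation = First_Health_Camp_Attended
  .groupBy("Donation")
  .agg(
    avg("Health_Score").as("avg_health_score"),
    count("*").as("records")
  )
  .orderBy("Donation")
scoreByDonation.show()

// Pearson correlation coefficient between the two numeric columns.
val correlation: Double =
  First_Health_Camp_Attended.stat.corr("Donation", "Health_Score")
println(s"Correlation between Donation and Health_Score: $correlation")
```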

Histogram of Health Score
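The plot for this step is not preserved. A histogram can be approximated in Spark by bucketing the score into fixed-width bins and counting rows per bin; the bin width of 0.1 below is a hypothetical value and should be chosen to suit the actual range of `Health_Score`:

```scala
import org.apache.spark.sql.functions.{col, floor, round}

// Assumes First_Health_Camp_Attended was created by the load cell above.
// Hypothetical bin width, chosen only for illustration.
val binWidth = 0.1

// Map each score to the lower edge of its bin, then count rows per bin.
// round(..., 2) tidies floating-point noise in the bin labels.
val healthScoreHistogram = First_Health_Camp_Attended
  .withColumn("bin_start", round(floor(col("Health_Score") / binWidth) * binWidth, 2))
  .groupBy("bin_start")
  .count()
  .orderBy("bin_start")

healthScoreHistogram.show()
```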

By Bhavesh