Healthcare Analytics for Beginners Part 2

Bhavesh, January 5, 2022

Patient's Age

Patient's Income

Patient's Occupation

All in One Scatter Plot

Loading Data into DataFrame

%scala

// File location and type
val file_location = "/FileStore/tables/First_Health_Camp_Attended.csv"
val file_type = "csv"

// CSV options
val infer_schema = "true"
val first_row_is_header = "true"
val delimiter = ","

// The applied options are for CSV files. For other file types, these will be ignored.
val First_Health_Camp_Attended = spark.read.format(file_type)
  .option("inferSchema", infer_schema)
  .option("header", first_row_is_header)
  .option("sep", delimiter)
  .load(file_location)

display(First_Health_Camp_Attended)

Count of Data (Total Records)

%scala

First_Health_Camp_Attended.count()

res12: Long = 6218

Displaying Statistics of Data

%scala

display(First_Health_Camp_Attended.describe())
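describe() reports count, mean, stddev, min and max. If quartiles are also of interest, Spark's summary() method accepts a list of statistics to compute. The snippet below is a small sketch, not part of the original notebook: it assumes the First_Health_Camp_Attended DataFrame loaded above, and the name healthSummary is made up for illustration. Percentiles returned by summary() are approximate.

```scala
// Assumes First_Health_Camp_Attended was loaded in the earlier cell.
// healthSummary is a hypothetical name used only in this sketch.
val healthSummary = First_Health_Camp_Attended
  .select("Donation", "Health_Score")            // skip the ID columns, whose statistics carry little meaning
  .summary("count", "mean", "stddev", "min", "25%", "50%", "75%", "max")

// One row per requested statistic, one column per selected field.
healthSummary.show()
```

In a Databricks notebook, display(healthSummary) could be used in place of show() to get the same table view as the cell above.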

Print Schema of Data

%scala

First_Health_Camp_Attended.printSchema()

root
 |-- Patient_ID: integer (nullable = true)
 |-- Health_Camp_ID: integer (nullable = true)
 |-- Donation: integer (nullable = true)
 |-- Health_Score: double (nullable = true)
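Once the column names and types are known from printSchema(), the same file could also be loaded with an explicit schema instead of inferSchema, which saves the extra pass Spark makes over the CSV to guess types. The following is a sketch under stated assumptions rather than a step from the original notebook: the names session, campSchema and campWithSchema are made up for illustration, and the file path is the one used in the loading cell.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.{DoubleType, IntegerType, StructField, StructType}

// On Databricks this returns the notebook's existing session (the same one exposed as `spark`).
val session = SparkSession.builder().getOrCreate()

// Column names and types copied from the printSchema() output above.
val campSchema = StructType(Seq(
  StructField("Patient_ID", IntegerType, nullable = true),
  StructField("Health_Camp_ID", IntegerType, nullable = true),
  StructField("Donation", IntegerType, nullable = true),
  StructField("Health_Score", DoubleType, nullable = true)
))

// Same CSV options as before, but the schema is supplied instead of inferred.
val campWithSchema = session.read.format("csv")
  .option("header", "true")
  .option("sep", ",")
  .schema(campSchema)
  .load("/FileStore/tables/First_Health_Camp_Attended.csv")

campWithSchema.printSchema()
```

With the default permissive CSV mode, a value that does not fit its declared type is read as null rather than changing the column type, so a quick null count after loading is a reasonable sanity check.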

Creating Temp View so we can run Spark SQL Queries on data

%scala

First_Health_Camp_Attended.createOrReplaceTempView("First_Health_Camp_Attended")
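With the temp view registered, the data can be queried by name through spark.sql. As a minimal sketch of what such a query might look like (the query itself, the column aliases and the names session and donationSummary are illustrative assumptions, not taken from the original notebook), the following groups attendees by donation amount and averages their health score:

```scala
import org.apache.spark.sql.SparkSession

// On Databricks this returns the notebook's existing session (the same one exposed as `spark`).
val session = SparkSession.builder().getOrCreate()

// Assumes the temp view First_Health_Camp_Attended was created in the cell above.
// donationSummary, attendees and avg_health_score are hypothetical names for this sketch.
val donationSummary = session.sql("""
  SELECT Donation,
         COUNT(*)                    AS attendees,
         ROUND(AVG(Health_Score), 3) AS avg_health_score
  FROM First_Health_Camp_Attended
  GROUP BY Donation
  ORDER BY Donation
""")

donationSummary.show()
```

In a Databricks notebook the same SELECT statement could also be run directly in a %sql cell, which makes the built-in chart options available for the result.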

Exploratory Data Analysis

Donation Distribution

Health Score Vs Donation

Histogram of Health Score

By Bhavesh