In this article, I explore Census data for India to understand the country’s demographics: population growth, religion distribution, gender distribution, sex ratio, and more. Even with a small dataset, I could gain a lot of valuable insights about the country. I used Spark SQL and the built-in charts provided by Databricks.
India is the second-most populous country in the world, with over 1.271 billion people, about 17.5% of the world’s population (more than a sixth). It is projected to become the world’s most populous country by 2025, surpassing China, and to reach a population of 1.6 billion by 2050. Its population growth rate is 1.2%.
Attribute Information or Dataset Details:
| col_name | data_type | comment |
|---|---|---|
| SerialNo | string | null |
| State | string | null |
| District | string | null |
| Persons | bigint | null |
| Males | bigint | null |
| Females | bigint | null |
| Growthin1991to2001 | float | null |
| Rural | bigint | null |
| Urban | bigint | null |
| ScheduledCastepopulation | bigint | null |
| PercentageSC_tototal | bigint | null |
| Numberofhouseholds | bigint | null |
| Householdsizeperhousehold | bigint | null |
| Sexratiofemales_per_1000_males_ | bigint | null |
| Sex_ratio_0_6_years_ | bigint | null |
| Scheduled_Tribe_population | bigint | null |
| Percentage_to_total_population_ST_ | float | null |
| Persons_literate | bigint | null |
| Males_Literate | bigint | null |
| Females_Literate | bigint | null |
| Persons_literacy_rate | float | null |
| Males_Literatacy_Rate | float | null |
| Females_Literacy_Rate | float | null |
| Total_Educated | bigint | null |
| Data_without_level | bigint | null |
| Below_Primary | bigint | null |
| Primary | bigint | null |
| Middle | bigint | null |
| Matric_Higher_Secondary_Diploma | bigint | null |
| Graduate_and_Above | bigint | null |
| X0__4_years | bigint | null |
| X5__14_years | bigint | null |
| X15__59_years | bigint | null |
| X60_years_and_above_Incl_A_N_S_ | bigint | null |
| Total_workers | bigint | null |
| Main_workers | bigint | null |
| Margi0l_workers | bigint | null |
| Non_workers | bigint | null |
| SC_1_0me | string | null |
| SC_1_Population | bigint | null |
| SC_2_0me | string | null |
| SC_2_Population | bigint | null |
| SC_3_0me | string | null |
| SC_3_Population | bigint | null |
| Religeon_1_0me | string | null |
| Religeon_1_Population | bigint | null |
| Religeon_2_0me | string | null |
| Religeon_2_Population | bigint | null |
| Religeon_3_0me | string | null |
| Religeon_3_Population | bigint | null |
| ST_1_0me | string | null |
| ST_1_Population | bigint | null |
| ST_2_0me | string | null |
| ST_2_Population | bigint | null |
| ST_3_0me | string | null |
| ST_3_Population | bigint | null |
| Imp_Town_1_0me | string | null |
| Imp_Town_1_Population | bigint | null |
| Imp_Town_2_0me | string | null |
| Imp_Town_2_Population | bigint | null |
| Imp_Town_3_0me | string | null |
| Imp_Town_3_Population | bigint | null |
| Total_Inhabited_Villages | bigint | null |
| Drinking_water_facilities | bigint | null |
| Safe_Drinking_water | bigint | null |
| Electricity_Power_Supply_ | bigint | null |
| Electricity_domestic_ | bigint | null |
| Electricity_Agriculture_ | bigint | null |
| Primary_school | bigint | null |
| Middle_schools | bigint | null |
| Secondary_Sr_Secondary_schools | bigint | null |
| College | bigint | null |
| Medical_facility | bigint | null |
| Primary_Health_Centre | bigint | null |
| Primary_Health_Sub_Centre | bigint | null |
| Post_telegraph_and_telephone_facility | bigint | null |
| Bus_services | bigint | null |
| Paved_approach_road | | null |
| Mud_approach_road | bigint | null |
| Permanent_House | float | null |
| Semi_permanent_House | float | null |
| Temporary_House | float | null |
Table Created in Databricks Environment
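The schema stores raw counts such as `Males` and `Females` for each district, so a figure like the one in `Sexratiofemales_per_1000_males_` can also be re-derived at state level. The following is a minimal sketch of that idea, not the article's own code. It assumes a hypothetical table name `india_census` with one row per district, and it uses Python's built-in `sqlite3` as a local stand-in for Spark so it runs anywhere. The sample rows are made-up placeholders, not census figures.

```python
import sqlite3

# Sex ratio = females per 1,000 males, aggregated per state.
# `india_census` is a hypothetical table name; columns follow the schema above.
SEX_RATIO_BY_STATE = """
    SELECT State,
           ROUND(SUM(Females) * 1000.0 / SUM(Males)) AS sex_ratio
    FROM india_census
    GROUP BY State
    ORDER BY State
"""

def sex_ratio_by_state(rows):
    """Run the query on an in-memory SQLite table (stand-in for Spark)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE india_census "
        "(State TEXT, District TEXT, Males INTEGER, Females INTEGER)"
    )
    conn.executemany("INSERT INTO india_census VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(SEX_RATIO_BY_STATE).fetchall()
    conn.close()
    return result

# Placeholder rows for illustration only (not taken from the dataset).
ratio_sample_rows = [
    ("State A", "District 1", 1000, 950),
    ("State A", "District 2", 1000, 930),
    ("State B", "District 3", 500, 510),
]
print(sex_ratio_by_state(ratio_sample_rows))
```

Summing the counts before dividing weights each district by its population, which is usually preferable to averaging the per-district ratios.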

Technology Used
- Apache Spark
- Spark SQL
- DataFrame-based API
- Databricks Notebook
Free Account creation in Databricks
Creating a Spark Cluster
Basics about Databricks notebook
Code for Spark SQL to get India's States with Number of Districts
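The original query is not reproduced here, so what follows is a minimal sketch of one way to write it, not the author's code. It assumes the table is registered under the hypothetical name `india_census` and that each row describes one district. To keep the example runnable without a cluster, it uses Python's built-in `sqlite3` as a stand-in for Spark; the sample rows are placeholders, not census data.

```python
import sqlite3

# Count districts per state, largest first.
# `india_census` is a hypothetical table name; one row per district is assumed.
DISTRICTS_PER_STATE = """
    SELECT State, COUNT(District) AS num_districts
    FROM india_census
    GROUP BY State
    ORDER BY num_districts DESC, State
"""

def districts_per_state(rows):
    """Run the query on an in-memory SQLite table (stand-in for Spark)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE india_census (State TEXT, District TEXT)")
    conn.executemany("INSERT INTO india_census VALUES (?, ?)", rows)
    result = conn.execute(DISTRICTS_PER_STATE).fetchall()
    conn.close()
    return result

# Placeholder rows for illustration only (not taken from the dataset).
sample_rows = [
    ("State A", "District 1"),
    ("State A", "District 2"),
    ("State B", "District 3"),
]
print(districts_per_state(sample_rows))
```

In a Databricks notebook, the same `SELECT` statement could go in a `%sql` cell, and its result could then be charted with the notebook's plot options.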

Plot Option for Chart


