In this article, I explore Census data for India to understand the country’s demographics: population growth, religion distribution, gender distribution, sex ratio, and more. Even with a small dataset, I could still gain a lot of valuable insights about the country. I used Spark SQL and the built-in graphs provided by Databricks.
India is the second-most populous country in the world, with over 1.271 billion people, more than a sixth of the world’s population. Already home to 17.5% of the world’s population, India is projected to surpass China as the world’s most populous country by 2025, and its population is projected to reach 1.6 billion by 2050. Its population growth rate is 1.2%.
Attribute Information or Dataset Details:
col_name | data_type | comment |
---|---|---|
SerialNo | string | null |
State | string | null |
District | string | null |
Persons | bigint | null |
Males | bigint | null |
Females | bigint | null |
Growthin1991to2001 | float | null |
Rural | bigint | null |
Urban | bigint | null |
ScheduledCastepopulation | bigint | null |
PercentageSC_tototal | bigint | null |
Numberofhouseholds | bigint | null |
Householdsizeperhousehold | bigint | null |
Sexratiofemales_per_1000_males_ | bigint | null |
Sex_ratio_0_6_years_ | bigint | null |
Scheduled_Tribe_population | bigint | null |
Percentage_to_total_population_ST_ | float | null |
Persons_literate | bigint | null |
Males_Literate | bigint | null |
Females_Literate | bigint | null |
Persons_literacy_rate | float | null |
Males_Literatacy_Rate | float | null |
Females_Literacy_Rate | float | null |
Total_Educated | bigint | null |
Data_without_level | bigint | null |
Below_Primary | bigint | null |
Primary | bigint | null |
Middle | bigint | null |
Matric_Higher_Secondary_Diploma | bigint | null |
Graduate_and_Above | bigint | null |
X0__4_years | bigint | null |
X5__14_years | bigint | null |
X15__59_years | bigint | null |
X60_years_and_above_Incl_A_N_S_ | bigint | null |
Total_workers | bigint | null |
Main_workers | bigint | null |
Margi0l_workers | bigint | null |
Non_workers | bigint | null |
SC_1_0me | string | null |
SC_1_Population | bigint | null |
SC_2_0me | string | null |
SC_2_Population | bigint | null |
SC_3_0me | string | null |
SC_3_Population | bigint | null |
Religeon_1_0me | string | null |
Religeon_1_Population | bigint | null |
Religeon_2_0me | string | null |
Religeon_2_Population | bigint | null |
Religeon_3_0me | string | null |
Religeon_3_Population | bigint | null |
ST_1_0me | string | null |
ST_1_Population | bigint | null |
ST_2_0me | string | null |
ST_2_Population | bigint | null |
ST_3_0me | string | null |
ST_3_Population | bigint | null |
Imp_Town_1_0me | string | null |
Imp_Town_1_Population | bigint | null |
Imp_Town_2_0me | string | null |
Imp_Town_2_Population | bigint | null |
Imp_Town_3_0me | string | null |
Imp_Town_3_Population | bigint | null |
Total_Inhabited_Villages | bigint | null |
Drinking_water_facilities | bigint | null |
Safe_Drinking_water | bigint | null |
Electricity_Power_Supply_ | bigint | null |
Electricity_domestic_ | bigint | null |
Electricity_Agriculture_ | bigint | null |
Primary_school | bigint | null |
Middle_schools | bigint | null |
Secondary_Sr_Secondary_schools | bigint | null |
College | bigint | null |
Medical_facility | bigint | null |
Primary_Health_Centre | bigint | null |
Primary_Health_Sub_Centre | bigint | null |
Post_telegraph_and_telephone_facility | bigint | null |
Bus_services | bigint | null |
Paved_approach_road | | null |
Mud_approach_road | bigint | null |
Permanent_House | float | null |
Semi_permanent_House | float | null |
Temporary_House | float | null |
Table Created in Databricks Environment
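To give a feel for how this schema can be queried, here is a minimal sketch of a state-level sex ratio aggregation. It uses only the `State`, `Males`, and `Females` columns from the schema above. The table name `census` is an assumption (the table name is not stated here), and the rows are placeholder values for illustration, not real census figures. To keep the example self-contained, the query runs against an in-memory SQLite table through Python's standard library instead of Spark; the `SELECT` statement itself uses plain SQL that should also be accepted by Spark SQL in a Databricks notebook.

```python
import sqlite3

# Hypothetical table name; the article does not state what the table is called.
TABLE = "census"

# Aggregate district rows to state level and derive females per 1000 males.
# Column names (State, Males, Females) come from the schema above.
SEX_RATIO_BY_STATE = f"""
SELECT State,
       SUM(Males)   AS total_males,
       SUM(Females) AS total_females,
       ROUND(SUM(Females) * 1000.0 / SUM(Males), 1) AS females_per_1000_males
FROM {TABLE}
GROUP BY State
ORDER BY females_per_1000_males DESC
"""


def run_demo():
    """Run the query against an in-memory SQLite table with placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        f"CREATE TABLE {TABLE} "
        "(State TEXT, District TEXT, Males INTEGER, Females INTEGER)"
    )
    # Placeholder values for illustration only; NOT real census figures.
    rows = [
        ("StateA", "District1", 1000, 950),
        ("StateA", "District2", 2000, 1850),
        ("StateB", "District3", 1500, 1530),
    ]
    conn.executemany(f"INSERT INTO {TABLE} VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(SEX_RATIO_BY_STATE).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for state, males, females, ratio in run_demo():
        print(state, males, females, ratio)
```

Summing `Males` and `Females` before dividing weights each district by its population, which is usually preferable to averaging the per-district `Sexratiofemales_per_1000_males_` column directly.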
Technology Used
- Apache Spark
- Spark SQL
- DataFrame-based API
- Databricks Notebook