All assignments are done in pairs (parejas). Each pair is assigned one to three comunas (mostly Región Metropolitana) of roughly similar total population; see the Groups page for the assignment.
The three datasets are introduced progressively: Tarea 0 is a shallow first contact with all three; Tarea 1 goes deep on Census; Tarea 2 goes deep on ENO + GRD; Tarea 3 merges everything at the comuna level. The Final project is on its own page.
Rules
- Submission. Canvas: PDF export of your notebook + link to your team’s GitHub repo.
- Late policy. A 2-day extension is available if you request it by email before the deadline.
- AI. AI tools are allowed in tareas; disclose their use in the README of your final repo.
- Both partners must understand all submitted code.
Tareas
| # | Title | Pts | Released | Due | Downloads |
|---|---|---|---|---|---|
| 0 | Setup and first contact with all three datasets | 5 | Thu Mar 5, 2026 | Thu Mar 12, 2026 (before class) | PDF · MD |
| 1 | Demographic profile and migration landscape | 10 | Thu Mar 12, 2026 | Thu Mar 26, 2026 (before class) | PDF · MD |
| 2 | Health landscape: ENO + GRD | 10 | Thu Mar 26, 2026 | Thu Apr 16, 2026 (before class) | PDF · MD |
| Quiz 1 | Share comuna-level summary tables (class-wide format) | 2 | Thu Apr 16, 2026 | Mon Apr 20, 2026, 23:59 | PDF · MD |
| 3 | Cross-dataset ecological modeling | 10 | Thu Apr 16, 2026 | Thu Apr 30, 2026 (before class) | PDF · MD |
What each tarea asks for (one paragraph each)
Tarea 0 (5 pts). Install tools (Jupyter-compatible IDE, GitHub account, Python environment), create a team GitHub repo, and load each of the three datasets for your assigned comunas. Report basic shape/info, top 5 diseases (ENO), and top 5 diagnoses (GRD, joined to the CIE-10 lookup). Shallow first contact, no analysis.
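The "top 5 diagnoses" step could be sketched roughly as below. This is a dependency-free illustration, not the course's reference solution: the records, the field names (`codigo_comuna`, `cie10`) and the small lookup dictionary are assumptions, not the real GRD schema. In a notebook the same two steps would likely be a pandas `value_counts` followed by a `merge`.

```python
from collections import Counter

# Toy GRD-style records (hypothetical field names, made-up rows).
discharges = [
    {"codigo_comuna": "13101", "cie10": "K80"},
    {"codigo_comuna": "13101", "cie10": "J18"},
    {"codigo_comuna": "13101", "cie10": "K80"},
    {"codigo_comuna": "13101", "cie10": "O80"},
    {"codigo_comuna": "13101", "cie10": "K80"},
    {"codigo_comuna": "13101", "cie10": "J18"},
    {"codigo_comuna": "13101", "cie10": "I10"},
]

# Tiny stand-in for the CIE-10 lookup table (assumption: code -> label).
cie10_lookup = {
    "K80": "Cholelithiasis",
    "J18": "Pneumonia, organism unspecified",
    "O80": "Single spontaneous delivery",
}

# Count discharges per diagnosis code and keep the five most frequent.
counts = Counter(row["cie10"] for row in discharges)

# "Join" each code to its label; codes missing from the lookup are flagged
# rather than dropped, which mirrors a left join.
top5 = [
    (code, cie10_lookup.get(code, "(not in lookup)"), n)
    for code, n in counts.most_common(5)
]

for code, label, n in top5:
    print(f"{code:4} {label:35} {n}")
```

Flagging unmatched codes instead of silently dropping them makes it easier to spot problems in the lookup join early.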
Tarea 1 (10 pts). Demographic and migration portrait of your assigned comunas using the Census. Join vivienda + hogar + persona; build age pyramids by sex (Chilean-born vs. foreign-born), % foreign-born by comuna, top nationalities, migration status; produce choropleths and bar charts; compare your comunas to national/regional averages. Key output: a comuna-level summary table that you will reuse in Tarea 3.
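As a rough sketch of the counts behind an age pyramid and the % foreign-born figure, assuming one row per person: the field names (`age`, `sex`, `born_abroad`), the 5-year bands and the open-ended top band at 80 are illustrative choices, not the Census variable names.

```python
from collections import Counter

def age_band(age, width=5, top=80):
    """Return a pyramid label such as '35-39'; ages >= top are pooled as '80+'."""
    if age >= top:
        return f"{top}+"
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

# Toy persona rows (hypothetical field names, made-up values).
personas = [
    {"codigo_comuna": "13101", "age": 37, "sex": "F", "born_abroad": True},
    {"codigo_comuna": "13101", "age": 35, "sex": "F", "born_abroad": True},
    {"codigo_comuna": "13101", "age": 62, "sex": "M", "born_abroad": False},
    {"codigo_comuna": "13101", "age": 93, "sex": "F", "born_abroad": False},
]

# One count per (age band, sex, origin) cell: the bars of the two pyramids.
pyramid = Counter(
    (age_band(p["age"]), p["sex"], p["born_abroad"]) for p in personas
)

# Share of foreign-born residents, in percent.
pct_foreign_born = 100 * sum(p["born_abroad"] for p in personas) / len(personas)
```

Here `pyramid[("35-39", "F", True)]` holds the number of foreign-born women aged 35 to 39; grouping additionally by `codigo_comuna` would give one pyramid per comuna.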
Tarea 2 (10 pts). Health portrait. Part A (ENO): clean the nationality coding, compute notification rates over time, build disease profiles by nationality, and map notification rates as a choropleth. Part B (GRD): load the 2022 to 2024 GRD files filtered to your comunas, join to CIE-10, and compute average length of stay, top diagnostic chapters, severity by nationality, and a hospitalization-rate map. Key output: two more comuna-level summary tables (ENO and GRD).
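Two of the quantities above reduce to simple arithmetic. The sketch below assumes rates are expressed per 100,000 inhabitants (a common convention, though the assignment may specify another denominator) and that length of stay is already available in days; all numbers are made up.

```python
from statistics import mean

def rate_per_100k(cases, population):
    """Crude rate: cases per 100,000 inhabitants."""
    return 100_000 * cases / population

# Toy ENO notification counts and population for one comuna (made-up values).
eno_cases_by_year = {2022: 30, 2023: 45}
population_by_year = {2022: 150_000, 2023: 150_000}

# Notification rate over time, one value per year.
rates = {
    year: rate_per_100k(eno_cases_by_year[year], population_by_year[year])
    for year in eno_cases_by_year
}

# Toy GRD lengths of stay in days (made-up values).
stays_in_days = [1, 3, 8, 4]
average_stay = mean(stays_in_days)
```

With these numbers the rate rises from 20 to 30 notifications per 100,000, and the average stay is 4 days. Dividing by population is what makes comunas of different sizes comparable on a map.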
Quiz 1 (2 pts, bonus). Publish your three comuna-level summary tables (Tarea 1 + Tarea 2) in the agreed class-wide format so that all 21 teams can concatenate them into a single master table of about 50 RM comunas for Tarea 3.
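A shared format only helps if every table actually follows it, so a small check before concatenating can save time. The sketch below is one possible approach: the column names in `REQUIRED_COLUMNS` are placeholders for whatever the class agrees on, and `concatenate_team_tables` is a hypothetical helper, not part of any course tooling.

```python
# Placeholder column set; the real class-wide format may differ.
REQUIRED_COLUMNS = {"codigo_comuna", "pct_foreign_born", "eno_rate_per_100k"}

def concatenate_team_tables(team_tables):
    """Stack per-team rows into one master list, rejecting format mismatches."""
    master = []
    seen = set()
    for team, rows in team_tables.items():
        for row in rows:
            if set(row) != REQUIRED_COLUMNS:
                raise ValueError(
                    f"{team}: columns {sorted(row)} do not match the agreed format"
                )
            code = row["codigo_comuna"]
            if code in seen:
                raise ValueError(f"{team}: comuna {code} was already submitted")
            seen.add(code)
            master.append(row)
    return master

# Made-up submissions. Codes are kept as strings so that codes from
# regions 01 to 09 keep their leading zero.
team_tables = {
    "team_a": [
        {"codigo_comuna": "13101", "pct_foreign_born": 28.0, "eno_rate_per_100k": 31.5},
        {"codigo_comuna": "13114", "pct_foreign_born": 12.0, "eno_rate_per_100k": 18.2},
    ],
    "team_b": [
        {"codigo_comuna": "13120", "pct_foreign_born": 9.5, "eno_rate_per_100k": 20.1},
    ],
}

master = concatenate_team_tables(team_tables)
```

Failing loudly on a mismatched column set or a repeated comuna is a deliberate choice here: a silent mismatch would surface much later as missing values in the Tarea 3 models.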
Tarea 3 (10 pts). Merge the three comuna-level summary tables on codigo_comuna; explore correlations; fit a Poisson or Negative Binomial regression for notification counts (with a population offset), or a linear or logistic regression for severity or length of stay; interpret the coefficients (incidence rate ratios, odds ratios) and discuss the ecological fallacy explicitly; produce coefficient plots, predicted-rate maps, and residual maps.
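To make the population offset and the incidence rate ratio concrete, here is a minimal hand-rolled sketch of a Poisson regression with one covariate, fitted by Newton-Raphson. The model is log E[y] = log(pop) + b0 + b1·x. The function name, the toy data and the binary covariate are assumptions for illustration; in practice a statistics library such as statsmodels would likely do the fitting and also report standard errors.

```python
import math

def fit_poisson_offset(counts, population, x, iterations=25):
    """Newton-Raphson fit of log E[y] = log(pop) + b0 + b1 * x."""
    # Start from the overall crude rate and no covariate effect.
    b0 = math.log(sum(counts) / sum(population))
    b1 = 0.0
    for _ in range(iterations):
        # Expected counts under the current coefficients.
        mu = [p * math.exp(b0 + b1 * xi) for p, xi in zip(population, x)]
        # Gradient of the log-likelihood.
        g0 = sum(y - m for y, m in zip(counts, mu))
        g1 = sum((y - m) * xi for y, m, xi in zip(counts, mu, x))
        # Information matrix (2x2), inverted in closed form.
        h00 = sum(mu)
        h01 = sum(m * xi for m, xi in zip(mu, x))
        h11 = sum(m * xi * xi for m, xi in zip(mu, x))
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Toy comuna-level data (made up): x = 1 marks a hypothetical
# "high % foreign-born" group of comunas.
counts = [10, 20, 30, 60]
population = [1000, 2000, 1000, 2000]
x = [0, 0, 1, 1]

b0, b1 = fit_poisson_offset(counts, population, x)
baseline_rate = math.exp(b0)  # rate per person in the x = 0 group
irr = math.exp(b1)            # incidence rate ratio, x = 1 vs x = 0
```

With a single binary covariate the fitted incidence rate ratio equals the ratio of the two groups' crude rates, here 0.03 / 0.01 = 3. That ratio compares comunas, not individuals: it does not say that a foreign-born person is three times as likely to be notified, which is exactly the ecological fallacy the write-up must address.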
Bibliography seed
The primary external reference is the INE Census 2024 results portal. Documentation, dictionaries, and methodology PDFs for each dataset are linked from the Data and Resources pages. A short bibliography file is here: bibliography.md.
