link to Regexper

afeld · afeld · commit e66ea404147d · 2026-03-25T13:23:30.000-04:00
diff --git a/lecture_2.ipynb b/lecture_2.ipynb
@@ -32,12 +32,12 @@
     "\n",
     "- Taking notes in the lecture notebooks\n",
     "- Using [another Python/pandas learning resource](https://python-public-policy.afeld.me/en/{{school_slug}}/resources.html)\n",
-    "   - Hear things explained another way\n",
-    "   - Ask in [Ed Discussion]({{discussions_url}}) if others have recommendations\n",
+    "  - Hear things explained another way\n",
+    "  - Ask in [Ed Discussion]({{discussions_url}}) if others have recommendations\n",
     "- [Comment-driven development](https://www.sitepoint.com/comment-driven-development/)\n",
-    "   - Otherwise, trying to do two steps in your head:\n",
-    "      1. Figuring out the logic\n",
-    "      1. Figuring out the syntax"
+    "  - Otherwise, trying to do two steps in your head:\n",
+    "    1. Figuring out the logic\n",
+    "    1. Figuring out the syntax\n"
    ]
   },
   {
@@ -57,7 +57,7 @@
     "```python\n",
     "# find valid ZIP codes\n",
     "# filter the DataFrame to only invalid ZIP codes\n",
-    "```"
+    "```\n"
    ]
   },
   {
@@ -70,7 +70,7 @@
     "tags": []
    },
    "source": [
-    "## [Boolean indexing](https://pandas.pydata.org/docs/user_guide/10min.html#boolean-indexing)"
+    "## [Boolean indexing](https://pandas.pydata.org/docs/user_guide/10min.html#boolean-indexing)\n"
    ]
   },
   {
@@ -225,7 +225,7 @@
     "tags": []
    },
    "source": [
-    "When we compare single values (like `x > 6`), we get a single boolean back. Here, we are checking a _bunch_ of values, so we're going to get multiple booleans, returned as a Series."
+    "When we compare single values (like `x > 6`), we get a single boolean back. Here, we are checking a _bunch_ of values, so we're going to get multiple booleans, returned as a Series.\n"
    ]
   },
   {
@@ -365,7 +365,7 @@
     "\n",
     "```python\n",
     "people[people[\"age\"] > 40]\n",
-    "```"
+    "```\n"
    ]
   },
   {
@@ -382,7 +382,7 @@
     "\n",
     "> Data Cleansing is a process of removing or fixing incorrect, malformed, incomplete, duplicate, or corrupted data\n",
     "\n",
-    "https://hevodata.com/learn/data-cleansing-a-simplified-guide/"
+    "https://hevodata.com/learn/data-cleansing-a-simplified-guide/\n"
    ]
   },
   {
@@ -395,7 +395,7 @@
     "tags": []
    },
    "source": [
-    "When have you needed to clean data?"
+    "When have you needed to clean data?\n"
    ]
   },
   {
@@ -408,7 +408,7 @@
     "tags": []
    },
    "source": [
-    "What are continuous values?"
+    "What are continuous values?\n"
    ]
   },
   {
@@ -421,7 +421,7 @@
     "tags": []
    },
    "source": [
-    "What are categorical values?"
+    "What are categorical values?\n"
    ]
   },
   {
@@ -439,16 +439,16 @@
     "From [my workshop on data cleaning](https://github.com/afeld/data-cleaning):\n",
     "\n",
     "- Missing data\n",
-    "   - Empty values\n",
+    "  - Empty values\n",
     "- Bad (junk) values\n",
-    "   - Duplicates\n",
-    "   - Mismatched types/formatting\n",
+    "  - Duplicates\n",
+    "  - Mismatched types/formatting\n",
     "- Categorical values\n",
-    "   - Uniqueness (cardinality)\n",
-    "   - Value counts\n",
+    "  - Uniqueness (cardinality)\n",
+    "  - Value counts\n",
     "- Continuous values\n",
-    "   - Ranges\n",
-    "   - Spread (distribution)"
+    "  - Ranges\n",
+    "  - Spread (distribution)\n"
    ]
   },
   {
@@ -464,7 +464,7 @@
     "Notes:\n",
     "\n",
     "- \"Values\" in this case can be a single cell (in the spreadsheet sense) or a whole row\n",
-    "- \"Missing\" or \"duplicates\" can be columns (Series), tables (DataFrames), rows, or cells"
+    "- \"Missing\" or \"duplicates\" can be columns (Series), tables (DataFrames), rows, or cells\n"
    ]
   },
   {
@@ -482,7 +482,7 @@
     "- Empty\n",
     "- Bad\n",
     "- Unique\n",
-    "- Spread"
+    "- Spread\n"
    ]
   },
   {
@@ -496,7 +496,7 @@
     "tags": []
    },
    "source": [
-    "## Setup"
+    "## Setup\n"
    ]
   },
   {
@@ -528,7 +528,7 @@
     "tags": []
    },
    "source": [
-    "### Read our cleaned 311 Service Requests dataset"
+    "### Read our cleaned 311 Service Requests dataset\n"
    ]
   },
   {
@@ -571,7 +571,7 @@
     "\n",
     "More data cleaning!\n",
     "\n",
-    "![Minion character vacuuming](https://impulsecreative.com/hs-fs/hubfs/cleaning-minion-gif.gif?width=490&name=cleaning-minion-gif.gif)"
+    "![Minion character vacuuming](https://impulsecreative.com/hs-fs/hubfs/cleaning-minion-gif.gif?width=490&name=cleaning-minion-gif.gif)\n"
    ]
   },
   {
@@ -586,7 +586,7 @@
    "source": [
     "```\n",
     "DtypeWarning: Columns (8,20,31,34) have mixed types.\n",
-    "```"
+    "```\n"
    ]
   },
   {
@@ -1273,7 +1273,7 @@
     "tags": []
    },
    "source": [
-    "ZIP codes _look_ numeric, but aren't really."
+    "ZIP codes _look_ numeric, but aren't really.\n"
    ]
   },
   {
@@ -1286,7 +1286,7 @@
     "tags": []
    },
    "source": [
-    "[Read the ZIP codes in as strings.](https://pandas.pydata.org/pandas-docs/stable/user_guide/text.html#text-data-types)"
+    "[Read the ZIP codes in as strings.](https://pandas.pydata.org/pandas-docs/stable/user_guide/text.html#text-data-types)\n"
    ]
   },
   {
@@ -1323,7 +1323,7 @@
     "tags": []
    },
    "source": [
-    "We fixed the dtype warning for column 8 (`Incident Zip`)."
+    "We fixed the dtype warning for column 8 (`Incident Zip`).\n"
    ]
   },
   {
@@ -1728,7 +1728,10 @@
     "└─ start of string\n",
     "```\n",
     "\n",
-    "[regex101](https://regex101.com/) is useful for testing them."
+    "Helpful tools:\n",
+    "\n",
+    "- [Regexper](https://regexper.com/#%5E%5Cd%7B5%7D%28%3F%3A-%5Cd%7B4%7D%29%3F%24)\n",
+    "- [regex101](https://regex101.com/)\n"
    ]
   },
   {
@@ -1911,7 +1914,7 @@
     "tags": []
    },
    "source": [
-    "[Clear](https://pandas.pydata.org/pandas-docs/stable/user_guide/missing_data.html#inserting-missing-data) any invalid ZIP codes:"
+    "[Clear](https://pandas.pydata.org/pandas-docs/stable/user_guide/missing_data.html#inserting-missing-data) any invalid ZIP codes:\n"
    ]
   },
   {
@@ -1939,7 +1942,7 @@
     "tags": []
    },
    "source": [
-    "[`.loc[]`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.loc.html) is used for overwriting a subset of values."
+    "[`.loc[]`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.loc.html) is used for overwriting a subset of values.\n"
    ]
   },
   {
@@ -1956,7 +1959,7 @@
     "\n",
     "- Hard part is finding what needs to be done\n",
     "- Will be specific to your use case\n",
-    "- Document what you did, since it will affect your results"
+    "- Document what you did, since it will affect your results\n"
    ]
   },
   {
@@ -1969,7 +1972,7 @@
     "tags": []
    },
    "source": [
-    "## [In-class exercise](https://python-public-policy.afeld.me/en/{{school_slug}}/lecture_2_exercise.html)"
+    "## [In-class exercise](https://python-public-policy.afeld.me/en/{{school_slug}}/lecture_2_exercise.html)\n"
    ]
   },
   {
@@ -1984,7 +1987,7 @@
     ]
    },
    "source": [
-    "## [Concatenation](https://pandas.pydata.org/docs/user_guide/merging.html#concat)"
+    "## [Concatenation](https://pandas.pydata.org/docs/user_guide/merging.html#concat)\n"
    ]
   },
   {
@@ -2250,7 +2253,7 @@
     "tags": []
    },
    "source": [
-    "## Simple [merge](https://pandas.pydata.org/docs/user_guide/merging.html#merge)"
+    "## Simple [merge](https://pandas.pydata.org/docs/user_guide/merging.html#merge)\n"
    ]
   },
   {
@@ -2263,7 +2266,7 @@
     "tags": []
    },
    "source": [
-    "_I had [Copilot](https://code.visualstudio.com/docs/copilot/overview) generate the DataFrames, so no idea if the numbers are real._"
+    "_I had [Copilot](https://code.visualstudio.com/docs/copilot/overview) generate the DataFrames, so no idea if the numbers are real._\n"
    ]
   },
   {
@@ -2445,7 +2448,7 @@
     "tags": []
    },
    "source": [
-    "How should we combine them?"
+    "How should we combine them?\n"
    ]
   },
   {
@@ -2617,7 +2620,7 @@
    "source": [
     "To join DataFrames together, we will use the [pandas `.merge()` function](https://pandas.pydata.org/pandas-docs/stable/getting_started/intro_tutorials/08_combine_dataframes.html#join-tables-using-a-common-identifier).\n",
     "\n",
-    "![merge diagram](https://pandas.pydata.org/pandas-docs/stable/_images/08_merge_left.svg)"
+    "![merge diagram](https://pandas.pydata.org/pandas-docs/stable/_images/08_merge_left.svg)\n"
    ]
   },
   {
@@ -2635,7 +2638,7 @@
     "- [SQL `JOIN`](https://pandas.pydata.org/pandas-docs/stable/getting_started/comparison/comparison_with_sql.html#join)\n",
     "- [Spreadsheet `VLOOKUP`](https://pandas.pydata.org/pandas-docs/stable/getting_started/comparison/comparison_with_spreadsheets.html#merging)\n",
     "\n",
-    "In general, called [\"record linkage\" or \"entity resolution\"](https://en.wikipedia.org/wiki/Record_linkage)."
+    "In general, called [\"record linkage\" or \"entity resolution\"](https://en.wikipedia.org/wiki/Record_linkage).\n"
    ]
   },
   {
@@ -2815,7 +2818,7 @@
     "tags": []
    },
    "source": [
-    "[Different types of merges](https://www.geeksforgeeks.org/different-types-of-joins-in-pandas/)"
+    "[Different types of merges](https://www.geeksforgeeks.org/different-types-of-joins-in-pandas/)\n"
    ]
   },
   {
@@ -2832,7 +2835,7 @@
    "source": [
     "## In-class exercise 2\n",
     "\n",
-    "Compute the migrant population as a percent of total by country using [UN data](https://data.un.org/). You're welcome to talk with your neighbors."
+    "Compute the migrant population as a percent of total by country using [UN data](https://data.un.org/). You're welcome to talk with your neighbors.\n"
    ]
   },
   {
@@ -2845,7 +2848,7 @@
     "tags": []
    },
    "source": [
-    "## [Homework 2](https://python-public-policy.afeld.me/en/{{school_slug}}/hw_2.html)"
+    "## [Homework 2](https://python-public-policy.afeld.me/en/{{school_slug}}/hw_2.html)\n"
    ]
   }
  ],
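
Note on the added Regexper link: its URL fragment percent-decodes to the pattern `^\d{5}(?:-\d{4})?$`, i.e. five digits optionally followed by a hyphen and four more digits. Below is a minimal sketch of how that pattern behaves, using only Python's standard `re` module rather than the notebook's own pandas code. The helper name `is_valid_zip` and the sample values are hypothetical, chosen only for illustration.

```python
import re

# Pattern decoded from the Regexper URL added in this commit.
ZIP_PATTERN = re.compile(r"^\d{5}(?:-\d{4})?$")


def is_valid_zip(value):
    """Hypothetical helper: True for a 5-digit ZIP or ZIP+4 string."""
    # fullmatch requires the whole string to match, so a trailing
    # newline or extra characters are rejected.
    return isinstance(value, str) and ZIP_PATTERN.fullmatch(value) is not None


# Made-up sample values, not taken from the 311 dataset.
samples = ["10001", "10001-1234", "1000", "100012", "N/A", "10001-12"]

valid = [s for s in samples if is_valid_zip(s)]
invalid = [s for s in samples if not is_valid_zip(s)]

print(valid)
print(invalid)
```

In pandas, the same pattern could presumably be passed to `Series.str.fullmatch` to get the boolean Series that the notebook's filtering and `.loc[]` steps rely on; that wiring is an assumption here, since the code cells are not part of this diff.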
