|
35 | 35 | "cell_type": "markdown", |
36 | 36 | "metadata": {}, |
37 | 37 | "source": [ |
38 | | - "We will start by testing some of our functions:\n", |
| 38 | + "## Exceptions \n", |
| 39 | + "Remember when you tried to run `02_visualize-wines.py`? It would not work unless you had created a figures directory beforehand.\n", |
| 40 | + "\n", |
| 41 | + "We can catch this kind of error by adding this piece of code:\n", |
| 42 | + "```python\n", |
| 43 | + "try:\n", |
| 44 | + " fig.savefig(fname, bbox_inches = 'tight')\n", |
| 45 | + "except OSError as e:\n", |
| 46 | + " os.makedirs('figures')\n", |
| 47 | + " print('Creating figures directory')\n", |
| 48 | + " fig.savefig(fname, bbox_inches='tight')\n", |
| 49 | + "```\n", |
| 50 | + "Now our `runall` should work!!! 🎉🎉" |
| 51 | + ] |
| 52 | + }, |
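An alternative to catching the error after the fact is to make sure the directory exists before saving. The sketch below is only an illustration: `save_with_directory` and `write_placeholder` are hypothetical helpers, not part of the wine analysis scripts, and the placeholder writer stands in for `fig.savefig` so the example runs without matplotlib.

```python
import os
import tempfile


def save_with_directory(save, fname):
    """Call save(fname), creating the parent directory first if it is missing.

    `save` is any callable that takes a path (hypothetical helper).
    """
    directory = os.path.dirname(fname)
    if directory:
        # exist_ok=True makes this a no-op when the directory already exists
        os.makedirs(directory, exist_ok=True)
    save(fname)


def write_placeholder(path):
    """Stand-in for fig.savefig: writes a small text file."""
    with open(path, "w") as handle:
        handle.write("placeholder")


# Demo in a temporary directory so nothing is left behind
with tempfile.TemporaryDirectory() as tmp_dir:
    target = os.path.join(tmp_dir, "figures", "plot.png")
    save_with_directory(write_placeholder, target)
    print(os.path.isfile(target))
```

In `02_visualize-wines.py` this could be called as `save_with_directory(lambda path: fig.savefig(path, bbox_inches='tight'), fname)`, assuming `fig` and `fname` are defined as in the snippet above.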
| 53 | + { |
| 54 | + "cell_type": "markdown", |
| 55 | + "metadata": {}, |
| 56 | + "source": [ |
| 57 | + "## Unit testing\n", |
39 | 58 | "Open `03_country-subset.py` and add the following function:\n", |
40 | 59 | " \n", |
41 | 60 | "```python \n", |
|
80 | 99 | "\n", |
81 | 100 | "Now we can create our tests:\n", |
82 | 101 | "```\n", |
83 | | - "$ touch tests/__init__.py\n", |
84 | | - "$ touch test_03_country_subset.py\n", |
| 102 | + "$ mkdir tests # Create tests directory\n", |
| 103 | + "$ touch tests/__init__.py # Help find the test\n", |
| 104 | + "$ touch tests/test_03_country_subset.py # Create our first test\n", |
85 | 105 | "```\n", |
86 | 106 | "⭐ Your test script's name must start with `test_`" |
87 | 107 | ] |
|
96 | 116 | "\n", |
97 | 117 | "country = importlib.import_module('.data.03_country-subset', 'src')\n", |
98 | 118 | "\n", |
99 | | - "interim_data = \"data/interim/2018-04-30-winemag_priceGBP.csv\"\n", |
100 | | - "processed_data = \"data/processed/2018-04-30-winemag_Chile.csv\"\n", |
| 119 | + "interim_data = \"data/interim/2018-05-09-winemag_priceGBP.csv\"\n", |
| 120 | + "processed_data = \"data/processed/2018-05-03-winemag_Chile.csv\"\n", |
101 | 121 | "\n", |
102 | 122 | "def test_get_mean_price():\n", |
103 | 123 | " mean_price = country.get_mean_price(processed_data)\n", |
|
106 | 126 | "\n", |
107 | 127 | "And you can run it from the shell using:\n", |
108 | 128 | "```\n", |
109 | | - "$ python -m pytest tests/test_03_country-subset.py\n", |
| 129 | + "$ pytest\n", |
110 | 130 | "```" |
111 | 131 | ] |
112 | 132 | }, |
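The test above reads a dated CSV file from disk, so it only works while that exact file exists. A possible way around this is to have the test write its own small input file. The sketch below assumes a hypothetical stand-in for `get_mean_price` that averages a `price` column (the real function in `03_country-subset.py` may differ) and uses only the standard library.

```python
import csv
import math
import os
import tempfile


def get_mean_price(filename):
    """Hypothetical stand-in: mean of the 'price' column of a CSV file."""
    with open(filename, newline="") as handle:
        prices = [float(row["price"]) for row in csv.DictReader(handle)]
    return sum(prices) / len(prices)


def write_sample_csv(path):
    """Write a three-row CSV whose mean price is known in advance."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["country", "price"])
        writer.writerows([["Chile", 10.0], ["Chile", 20.0], ["Chile", 30.0]])


def test_get_mean_price():
    # The temporary directory is removed when the block exits
    with tempfile.TemporaryDirectory() as tmp_dir:
        sample = os.path.join(tmp_dir, "sample.csv")
        write_sample_csv(sample)
        # isclose avoids failing on floating point rounding
        assert math.isclose(get_mean_price(sample), 20.0)
```

Because the input is created inside the test, the result does not depend on which date appears in the file names under `data/`.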
113 | 133 | { |
114 | 134 | "cell_type": "markdown", |
115 | 135 | "metadata": {}, |
116 | 136 | "source": [ |
117 | | - "## What if you want all the decimal numbers?\n", |
| 137 | + "### What if you want all the decimal numbers?\n", |
118 | 138 | "\n", |
119 | 139 | "``` python\n", |
120 | 140 | "import importlib\n", |
|
166 | 186 | "``` " |
167 | 187 | ] |
168 | 188 | }, |
| 189 | + { |
| 190 | + "cell_type": "markdown", |
| 191 | + "metadata": {}, |
| 192 | + "source": [ |
| 193 | + "Pytest tells us which tests passed and which did not:\n", |
| 194 | + "\n", |
| 195 | + "```python\n", |
| 196 | + " {message}\n", |
| 197 | + " [left]: {left}\n", |
| 198 | + " [right]: {right}\"\"\".format(obj=obj, message=message, left=left, right=right)\n", |
| 199 | + "\n", |
| 200 | + " if diff is not None:\n", |
| 201 | + " msg += \"\\n[diff]: {diff}\".format(diff=diff)\n", |
| 202 | + "\n", |
| 203 | + "> raise AssertionError(msg)\n", |
| 204 | + "E AssertionError: DataFrame are different\n", |
| 205 | + "E\n", |
| 206 | + "E DataFrame shape mismatch\n", |
| 207 | + "E [left]: (4472, 6)\n", |
| 208 | + "E [right]: (4472, 7)\n", |
| 209 | + "```" |
| 210 | + ] |
| 211 | + }, |
| 212 | + { |
| 213 | + "cell_type": "markdown", |
| 214 | + "metadata": {}, |
| 215 | + "source": [ |
| 216 | + "We now know what kind of bugs we can encounter.\n", |
| 217 | + "Let's fix this: open `03_country-subset.py` and add the following lines:\n", |
| 218 | + "\n", |
| 219 | + "```python\n", |
| 220 | + "def get_country(filename, country):\n", |
| 221 | + " # Load table\n", |
| 222 | + " wine = pd.read_csv(filename)\n", |
| 223 | + "\n", |
| 224 | + " # Use the country name to subset data\n", |
| 225 | + " subset_country = wine[wine['country'] == country ].copy()\n", |
| 226 | + " subset_country.reset_index(drop=True, inplace=True) \n", |
| 227 | + "\n", |
| 228 | + " # Constructing the fname\n", |
| 229 | + " today = datetime.datetime.today().strftime('%Y-%m-%d')\n", |
| 230 | + " fname = f'data/processed/{today}-winemag_{country}.csv'\n", |
| 231 | + "\n", |
| 232 | + " # Saving the csv\n", |
| 233 | + " subset_country.to_csv(fname, index =False)\n", |
| 234 | + " print(fname) # print the fname from here\n", |
| 235 | + "\n", |
| 236 | + " return(subset_country) #returns the data frame\n", |
| 237 | + "```" |
| 238 | + ] |
| 239 | + }, |
169 | 240 | { |
170 | 241 | "cell_type": "markdown", |
171 | 242 | "metadata": {}, |
|
189 | 260 | "cell_type": "markdown", |
190 | 261 | "metadata": {}, |
191 | 262 | "source": [ |
192 | | - "# Past as Truth\n", |
| 263 | + "## Past as Truth (regression tests)\n", |
193 | 264 | "\n", |
194 | 265 | "Regression tests assume that the past is “correct.” They are great for letting developers know when and how a code base has changed. They are not great for letting anyone know why the change occurred. The change between what a code produces now and what it computed before is called a regression.\n", |
195 | 266 | "\n", |
196 | 267 | "**How many times have you tried to run a script or a notebook you found online just to realize it is broken?**\n", |
197 | 268 | "\n", |
198 | | - "Let's do some regression testing on the Jupyter notebook using *nbval*" |
| 269 | + "Let's do some regression testing on the Jupyter notebook using [nbval](https://github.com/computationalmodelling/nbval)." |
199 | 270 | ] |
200 | 271 | }, |
201 | 272 | { |
|
244 | 315 | "cell_type": "markdown", |
245 | 316 | "metadata": {}, |
246 | 317 | "source": [ |
247 | | - "Make sure everything is commited to git before carrying on.\n" |
| 318 | + "<div class='warn'>Make sure everything is committed to git before carrying on.</div>\n", |
| 319 | + "<br>\n", |
| 320 | + "Add the following line to your `runall-wine-analysis` script:\n", |
| 321 | + "\n", |
| 322 | + "```python\n", |
| 323 | + "import recipy\n", |
| 324 | + "```\n", |
| 325 | + "Run the script again: `python -m src.runall-wine-analysis`" |
| 326 | + ] |
| 327 | + }, |
| 328 | + { |
| 329 | + "cell_type": "markdown", |
| 330 | + "metadata": {}, |
| 331 | + "source": [ |
| 332 | + "You can now track the provenance of your project. \n", |
| 333 | + "\n", |
| 334 | + "Try using `recipy latest` and `recipy gui`" |
248 | 335 | ] |
249 | 336 | }, |
250 | 337 | { |
|