chore(sinan DAGS): create DAG to fetch dengue data from SINAN #201
luabida wants to merge 12 commits into thegraphnetwork:main
Force-pushed from 89c1f65 to c0cee37
    except ProgrammingError as error:
        if str(error).startswith("(psycopg2.errors.UndefinedColumn)"):
            # Include new columns to table
            column_name = str(error).split('"')[1]
I don't think extracting the missing column name from the error message is a good approach: if psycopg2 changes the wording of its error messages, our code will break. Instead, we should compare the column names in the parquet files with the columns in the current table schema. The difference between the two lists, which can be obtained efficiently as list(set(cols1)-set(cols2)), gives us the columns to add, and from it we can build the ALTER TABLE query. With this approach we don't need to rely on an exception being raised at all, since the missing columns can be determined before the first insert.
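To make the suggestion concrete, here is a minimal sketch of the comparison step. The function names, the default `TEXT` column type, and the schema and table names in the usage note are assumptions for illustration, not code from this PR. How the two column lists are obtained is left out; reading the parquet schema and querying `information_schema.columns` would be one option.

```python
def missing_columns(parquet_columns, table_columns):
    """Return the parquet columns absent from the table, in parquet order."""
    existing = set(table_columns)
    return [col for col in parquet_columns if col not in existing]


def quote_ident(name):
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_alter_table(schema, table, columns, sql_type="TEXT"):
    """Build a single ALTER TABLE statement adding every missing column.

    Returns None when there is nothing to add. The TEXT default is an
    assumption; the real column types would come from the parquet schema.
    """
    if not columns:
        return None
    clauses = ", ".join(
        f"ADD COLUMN IF NOT EXISTS {quote_ident(col)} {sql_type}"
        for col in columns
    )
    return f"ALTER TABLE {quote_ident(schema)}.{quote_ident(table)} {clauses};"
```

For example, `build_alter_table("my_schema", "my_table", missing_columns(cols1, cols2))` yields a statement that can be executed once before the first insert. Unlike `list(set(cols1)-set(cols2))`, the list comprehension keeps the parquet column order, so the generated query is deterministic.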
    logging.debug(f"{file} inserted into db")
    try:
        insert_parquets(parquets.path, year)
    except ProgrammingError as error:
fccoelho left a comment
I am wondering whether it would make sense to merge these three DAGs into a single SINAN DAG that takes the disease name as a parameter, much like in PySUS, where a single function fetches all the "agravos" (notifiable conditions).
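A rough sketch of what parameterizing by disease could look like, independent of Airflow itself. Everything here is hypothetical: the `SinanDagConfig` class, the `make_sinan_config` function, the `SINAN_<DISEASE>` naming scheme, and the placeholder disease names other than dengue are assumptions, not names taken from this PR or from PySUS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SinanDagConfig:
    """Settings that would differ between the per-disease DAGs (hypothetical)."""
    disease: str
    dag_id: str


def make_sinan_config(disease, supported):
    """Validate the disease name and derive the per-disease settings."""
    key = disease.strip().lower()
    if key not in supported:
        raise ValueError(f"unsupported disease: {disease!r}")
    # The SINAN_<DISEASE> naming scheme is an assumption for illustration.
    return SinanDagConfig(disease=key, dag_id=f"SINAN_{key.upper()}")


# Placeholder list: only "dengue" comes from this PR's title.
SUPPORTED = ("dengue", "disease_b", "disease_c")
configs = [make_sinan_config(name, SUPPORTED) for name in SUPPORTED]
```

A single DAG-building function could then take one `SinanDagConfig` and share the fetch and insert logic across all diseases, so adding a disease would mean adding one entry to the list.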
Force-pushed from f6f0cd7 to c76c431
No description provided.