
refactor: unify CrossRef request and parsing methods (#26) #37

Merged: HzaCode merged 2 commits into main from fix/26-unify-crossref-methods on Apr 6, 2026

Conversation

@HzaCode (Owner) commented Feb 19, 2026

Closes #26

Unify the four overlapping CrossRef methods in pipeline.py into one shared parsing function plus two HTTP methods, eliminating redundant API calls.

See issue #26 for details.
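
For readers without the diff at hand, the shape of the refactor described above might look roughly like the sketch below: one parser that every code path shares, and two thin HTTP methods (lookup by DOI, search by free text) that both feed it. All function names and the output field names here are hypothetical and are not taken from pipeline.py; only the CrossRef REST endpoints and response fields are the public API's.

```python
import json
import urllib.parse
import urllib.request

CROSSREF_API = "https://api.crossref.org/works"


def parse_crossref_message(message):
    """Flatten one CrossRef work record. Output keys are hypothetical."""
    def first(key):
        # CrossRef returns title and container-title as lists.
        values = message.get(key) or []
        return values[0] if values else None

    date_parts = (message.get("issued") or {}).get("date-parts") or [[]]
    year = date_parts[0][0] if date_parts[0] else None
    authors = [
        " ".join(part for part in (a.get("given"), a.get("family")) if part)
        for a in message.get("author", [])
    ]
    return {
        "doi": message.get("DOI"),
        "title": first("title"),
        "journal": first("container-title"),
        "year": year,
        "authors": authors,
    }


def _get_json(url):
    # Single place where the network is touched.
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)


def fetch_by_doi(doi):
    """HTTP method 1 (hypothetical name): direct lookup of a known DOI."""
    url = f"{CROSSREF_API}/{urllib.parse.quote(doi)}"
    return parse_crossref_message(_get_json(url)["message"])


def search_by_query(query):
    """HTTP method 2 (hypothetical name): bibliographic search, best hit only."""
    params = urllib.parse.urlencode({"query.bibliographic": query, "rows": 1})
    items = _get_json(f"{CROSSREF_API}?{params}")["message"].get("items", [])
    return parse_crossref_message(items[0]) if items else None
```

Because both HTTP methods return the already-parsed record, a caller that needs several fields can make one request and read them all from the result, which is presumably where the saving in redundant API calls comes from.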

HzaCode and others added 2 commits February 20, 2026 02:12
- Add journal_article_with_abstract template with pubmed_api source
- Extract abstract from Crossref API response with JATS cleanup
- Add _get_pubmed_abstract method (DOI->PMID->efetch path)
- Add pubmed_api source support in _fetch_missing_field
- Fix _strip_html_tags: replace tags with space, double-unescape entities
- Fix parser to preserve bare PMID inputs as query_string
- Add 6 unit tests covering all new abstract-related code paths
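
The `_strip_html_tags` fix in the list above (replace tags with a space, then unescape entities twice) can be illustrated with a minimal sketch. This is not the code from the commit: the function name, the regex, and the final whitespace collapse are assumptions made for illustration.

```python
import html
import re

# Matches any tag-like span, including JATS tags such as <jats:p>.
_TAG_RE = re.compile(r"<[^>]+>")


def strip_html_tags_sketch(text):
    """Hypothetical stand-in for the _strip_html_tags behaviour described above."""
    # Replace each tag with a space so adjacent paragraphs do not fuse.
    without_tags = _TAG_RE.sub(" ", text)
    # Unescape twice to handle double-encoded entities such as &amp;amp;.
    unescaped = html.unescape(html.unescape(without_tags))
    # Assumption: collapse the whitespace left behind by the removed tags.
    return " ".join(unescaped.split())


print(strip_html_tags_sketch("<jats:p>Tumour&amp;amp;stroma</jats:p><jats:p>Second</jats:p>"))
# prints: Tumour&stroma Second
```

Stripping tags before unescaping matters in this sketch: doing it the other way round would turn an escaped `&lt;` into a literal `<` that the tag regex could then swallow.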
@HzaCode HzaCode merged commit 22299d0 into main Apr 6, 2026
7 checks passed

Development

Successfully merging this pull request may close these issues.

[pyos][design] Not one, not two, but four different methods to get and parse crossref metadata

1 participant