Not sure if this is the right community but seems close enough.
Ideally I want a URL that I can just put any paywalled news article into, and it will return the unpaywalled version.
I.e.: https://somedomain/https://somenewssite/somenewsarticle
I need it to work with https://pypi.org/project/newspaper4k/
Alternatively, if someone knows of another Python library that can extract article text and images automatically just from a link, that would also solve my problem.
It does not use headless Chrome, it just uses the Python requests library. Did you get got by an AI hallucination?
Source: I went digging in the source code.
No, just this example code from their site:
browser = p.chromium.launch(headless=True)
My mistake was not knowing where newspaper4k fits in the stack. They’re wrapping it with Playwright, which it seems you could do here.
Ahh, I see. I'm using newspaper4k to fetch articles directly. It seems the example you found is just using it as a parser, after using Playwright as the HTML fetcher. I might try that approach.
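For anyone landing here later, a rough sketch of that split might look like the following: Playwright fetches the rendered HTML, and newspaper4k only parses it. This is not the code from their site. The function names (fetch_html, parse_article, extract_article) and the result keys are made up for illustration, and it assumes both packages are installed (plus `playwright install chromium`) and that the page actually shows the article text in a normal browser session.

```python
from typing import Callable


def fetch_html(url: str) -> str:
    """Fetch rendered HTML with headless Chromium via Playwright."""
    # Imported here so the module still loads if Playwright is not installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            page.goto(url, wait_until="domcontentloaded")
            return page.content()
        finally:
            browser.close()


def parse_article(url: str, html: str) -> dict:
    """Parse already-fetched HTML with newspaper4k (no network request)."""
    from newspaper import Article

    article = Article(url)
    # Passing input_html makes download() use our HTML instead of fetching.
    article.download(input_html=html)
    article.parse()
    return {
        "title": article.title,
        "text": article.text,
        "top_image": article.top_image,
        "images": list(article.images),
    }


def extract_article(
    url: str,
    fetch: Callable[[str], str] = fetch_html,
    parse: Callable[[str, str], dict] = parse_article,
) -> dict:
    """Fetch with one tool, parse with another; both steps are swappable."""
    html = fetch(url)
    return parse(url, html)
```

Usage would be something like `extract_article("https://somenewssite/somenewsarticle")["text"]`. Keeping the fetcher as a parameter means you can swap Playwright for plain requests, or anything else, without touching the parsing side.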