oh, maybe you all would know this.
any tips for scraping a #wordpress-based site for the urls of all posts by a particular author? I tried a few combinations of lynx -dump, wget, & grep but don't know enough about any of them.
i.e. the author archive is paginated as https://site.tld/author/authorsname, https://site.tld/author/authorsname/page/2, /page/3, etc., and the posts themselves look like https://site.tld/1970/01/01/title-of-post
@nev What you need is a spider. There was a tool that allowed you to download all the data from a site, or at least list all the links...
This list might help you.
https://en.wikipedia.org/wiki/Web_crawler#Open-source_crawlers
@nev HTTrack! That was it! I used that to back up old websites of mine.
@nev Found a page with general instructions and comments:
https://wptavern.com/how-to-archive-a-site-you-dont-have-access-to
@nev Good luck! 😉 👍
@rick_777 this looks handy, thanks!
@nev Official httrack / winhttrack manual:
https://www.httrack.com/html/shelldoc.html
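@nev If all you need is the list of post URLs rather than a full mirror, a small script may be enough. Here is a rough Python sketch using only the standard library. It assumes the permalinks really do look like /YYYY/MM/DD/title-of-post, that the archive pages are /page/2, /page/3, ... as in your example, and that the site answers 404 once you run past the last page (typical for WordPress, but not guaranteed). The names extract_post_urls and crawl_author are made up for this example.

```python
import re
from html.parser import HTMLParser
from urllib.error import HTTPError
from urllib.parse import urljoin
from urllib.request import urlopen

# Assumed permalink shape from the question: /YYYY/MM/DD/title-of-post
POST_PATH = re.compile(r"/\d{4}/\d{2}/\d{2}/[^/?#]+/?$")


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_post_urls(html, base_url):
    """Return absolute post URLs found in one archive page, in order, no duplicates."""
    parser = LinkCollector()
    parser.feed(html)
    found = []
    for href in parser.hrefs:
        # Resolve relative links and drop fragments such as #comments.
        url = urljoin(base_url, href).split("#")[0]
        if POST_PATH.search(url) and url not in found:
            found.append(url)
    return found


def crawl_author(author_url, max_pages=50):
    """Walk author_url, author_url/page/2/, ... until a 404 or max_pages."""
    urls = []
    for page in range(1, max_pages + 1):
        if page == 1:
            page_url = author_url
        else:
            page_url = f"{author_url.rstrip('/')}/page/{page}/"
        try:
            with urlopen(page_url) as resp:
                html = resp.read().decode("utf-8", errors="replace")
        except HTTPError as err:
            if err.code == 404:  # assumed signal for "no more pages"
                break
            raise
        for url in extract_post_urls(html, page_url):
            if url not in urls:
                urls.append(url)
    return urls


if __name__ == "__main__":
    # Placeholder address from the question; swap in the real site.
    for url in crawl_author("https://site.tld/author/authorsname/"):
        print(url)
```

Save it as a .py file, run it with python3, and redirect the output to a text file. If the site uses a different permalink structure, POST_PATH is the only thing that should need changing.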