Rick Garcia is a user on cybre.space.

oh, maybe you all would know this.

any tips for scraping a #wordpress-based site for the urls of all posts by a particular author? I tried a few combinations of lynx -dump, wget, & grep but don't know enough about any of them.

i.e. https://site.tld/author/authorsname, https://site.tld/author/authorsname/page/2, page/3, etc., where the posts are like https://site.tld/1970/01/01/title-of-post

@nev What you need is a spider. There was a tool that allowed you to download all the data from a site, or at least list all the links...

This list might help you.

en.wikipedia.org/wiki/Web_craw

Rick Garcia @rick_777

@nev HTTRACK! That was it! I used that to back up old websites of mine.

@nev You have options to download external links, and to choose whether or not to download pages outside a given path. It needs a bit of trial and error, but it's excellent for backups.
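
If a full mirror is more than you need, a small script can walk just the author pages and list the post links. This is only a rough sketch in Python using the standard library, not something tested against a real site. It assumes the URL layout from the question (author pages at /author/authorsname and /author/authorsname/page/N, posts at /YYYY/MM/DD/title-of-post) and assumes the site returns an HTTP error once you go past the last page. The names `extract_post_urls` and `author_post_urls` are made up for this example.

```python
import re
import urllib.error
import urllib.request
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Assumed permalink shape from the question: /YYYY/MM/DD/title-of-post
POST_PATH = re.compile(r"^/\d{4}/\d{2}/\d{2}/[^/]+/?$")


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_post_urls(html, base_url):
    """Return same-host post URLs found in html, in page order, no duplicates."""
    collector = LinkCollector()
    collector.feed(html)
    host = urlparse(base_url).netloc
    found = []
    for href in collector.hrefs:
        url = urljoin(base_url, href)  # resolve relative links
        parts = urlparse(url)
        if parts.netloc == host and POST_PATH.match(parts.path):
            # Drop ?query and #fragment so /post/#comments equals /post/
            clean = parts._replace(query="", fragment="").geturl()
            if clean not in found:
                found.append(clean)
    return found


def author_post_urls(author_url, max_pages=50):
    """Walk author_url, author_url/page/2, ... until a page fails to load."""
    urls = []
    for page in range(1, max_pages + 1):
        if page == 1:
            page_url = author_url
        else:
            page_url = f"{author_url.rstrip('/')}/page/{page}/"
        try:
            with urllib.request.urlopen(page_url) as response:
                html = response.read().decode("utf-8", errors="replace")
        except urllib.error.HTTPError:
            break  # assumption: the site answers with an error past the last page
        for url in extract_post_urls(html, page_url):
            if url not in urls:
                urls.append(url)
    return urls
```

Calling `author_post_urls("https://site.tld/author/authorsname")` and printing each item should give one post URL per line. If the theme links posts in some other shape, the `POST_PATH` pattern is the part to adjust.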