Wikimedia requires a User-Agent header to be set when downloading in bulk. The requirement isn't enforced immediately, so the first few hundred downloads seem to work, but requests start returning 403s if no user agent is set.
Their policy can be seen here: https://meta.wikimedia.org/wiki/User-Agent_policy
When I ran into this issue, I ran commons-downloader to generate _URLS.txt, stopped it, and then ran wget manually:
wget -i ./_URLS.txt -U "commons-downloader/1.0" -nc
Downloads started working again and no more 403s were received.
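
In case it helps anyone else, here is a rough sketch of the same workaround wrapped in a small script. It builds a more descriptive user agent in the `name/version (contact)` shape that the policy page suggests. The contact address, the function names, and the one-second `--wait` are my own assumptions, not anything commons-downloader does itself.

```shell
#!/bin/sh
# Sketch only: the values below are placeholders, adjust them for your setup.
TOOL_NAME="commons-downloader"
TOOL_VERSION="1.0"
CONTACT="you@example.org"   # hypothetical contact address

# Build a user agent string of the form: name/version (contact)
build_user_agent() {
  printf '%s/%s (%s)' "$1" "$2" "$3"
}

# Download every URL listed in the given file.
# -U sets the user agent, -nc skips files that already exist,
# --wait=1 pauses one second between requests (an assumed, polite default).
download_list() {
  wget -i "$1" -U "$(build_user_agent "$TOOL_NAME" "$TOOL_VERSION" "$CONTACT")" -nc --wait=1
}
```

After sourcing the script, `download_list ./_URLS.txt` would run the download against the generated list.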
As a side note, would an option to only generate the list of URLs, without downloading any files right away, be valuable?
- Wdavery