Is there any way to mirror an entire Online Dictionary website?



craigevil


Tolkem

I want to mirror https://dictionary.cambridge.org/us/ to my local directory to read the definitions offline. Is there any way to do it with the wget command?
Yes, there is. Use this.
Code:
wget --mirror --convert-links --backup-converted \
     --html-extension -o /home/me/weeklog        \
     https://www.gnu.org/
or
Code:
wget -m -k -K -E https://www.gnu.org/ -o /home/me/weeklog

Swap in your own URL and log file path, of course. The options are all explained here: https://www.gnu.org/software/wget/manual/wget.html#Advanced-Usage
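
For the dictionary site in the question, the same options might be adapted along these lines. This is only a sketch: the `mirror_site` helper, the `DRY_RUN` variable, and the log path are made up for illustration, and `--no-parent` and `--wait=1` are additions (not in the command above) meant to keep the crawl under /us/ and go easier on the server. `--adjust-extension` is the long name for `-E`.

```shell
#!/bin/sh
# Hypothetical helper: prints the wget command by default,
# and only runs it when DRY_RUN=0 is set in the environment.
mirror_site() {
    url=$1
    log=$2
    # Same mirror options as above, plus --no-parent and --wait=1 (assumptions).
    set -- wget --mirror --convert-links --backup-converted \
        --adjust-extension --no-parent --wait=1 \
        -o "$log" "$url"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"   # dry run: show the command for review
    else
        "$@"                 # real run: start the crawl
    fi
}

# The log path is an example; pick any writable location.
mirror_site "https://dictionary.cambridge.org/us/" "$HOME/wget-dictionary.log"
```

Run it once as-is to check the printed command, then again with `DRY_RUN=0` in the environment to actually start the download.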
 

stan

You may not realize how much data you're looking at if you download https://dictionary.cambridge.org/us/. It was definitely more than I expected. I tried the wget command below, and it seemed to work fine, but I aborted it after 1 hour and 20 minutes. I have a pretty fast cable connection, though my VPN may slow it down somewhat. I have no idea whether the download would have finished in another minute, another hour, or another 10 hours. As you can see in my screenshot, I had almost 61,000 files and folders (over 10 GB) when I quit. For a dictionary... Wow!

This will at least give you some pause for thought. If you have a good connection and no data caps, maybe you'll go for it until finished. Or you may want to consider some other options. ;)

And to be clear... my partial download was not functional. There's no guarantee that a full download would be functional either. There could be critical resources that do not download. Or maybe my wget command is not quite right... it came from some very old notes I have.

Code:
wget --no-clobber --no-parent --convert-links --random-wait -r -p -E -e robots=off -U mozilla https://dictionary.cambridge.org/us/
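
One way to gauge the size before committing to a full crawl is a capped trial run. The sketch below is a variant of the command above under stated assumptions, not a tested recipe: `trial_cmd` is a made-up helper that only prints a command, and the 200m quota and depth of 2 are arbitrary defaults. It adds `--quota` (`-Q`) and `--level` (`-l`) to bound the run, and `--wait=1` because, per the wget manual, `--random-wait` varies the delay that `--wait` sets. It leaves out `-e robots=off` and `-U mozilla`; add them back if you want to match the original command.

```shell
#!/bin/sh
# Hypothetical helper: print a size-capped, depth-limited trial crawl command.
# Arguments: URL, optional quota (default 200m), optional depth (default 2).
trial_cmd() {
    url=$1
    quota=${2:-200m}   # wget stops fetching new files once this much is downloaded
    depth=${3:-2}      # how many link levels to follow from the start page
    printf 'wget -r -l %s -Q %s --no-parent -p -E --convert-links --wait=1 --random-wait %s\n' \
        "$depth" "$quota" "$url"
}

# Nothing is downloaded here; the command is only printed for review.
trial_cmd "https://dictionary.cambridge.org/us/"
```

If the printed line looks right, paste it into a terminal. Comparing the result at depth 2 and depth 3 should give a rough feel for how fast the mirror grows.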

dictionary.png


And inside the /us/ folder you'll find:
dictionary2.png
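
To put numbers on a partial mirror like this one from the terminal, something along these lines should work. It's a sketch: `mirror_stats` is a made-up helper, and it assumes GNU find (for `-mindepth`) and that wget saved into its default host-named directory, dictionary.cambridge.org, under the current directory.

```shell
#!/bin/sh
# Hypothetical helper: report file count, folder count, and disk usage
# for a mirror directory. Counts are line-based, so file names that
# contain newlines would be miscounted.
mirror_stats() {
    dir=$1
    files=$(find "$dir" -type f | wc -l)
    # -mindepth 1 leaves the top-level directory itself out of the count.
    dirs=$(find "$dir" -mindepth 1 -type d | wc -l)
    printf 'files=%d dirs=%d\n' "$files" "$dirs"
    du -sh "$dir"   # human-readable total size
}

# Only report if the mirror directory actually exists here.
if [ -d "dictionary.cambridge.org" ]; then
    mirror_stats "dictionary.cambridge.org"
fi
```

Running it a few minutes apart while wget is still going also shows how quickly the download is growing.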
 