Recently, I came across some e-books that are HTML only (sucks, yeah), but they are good books and I really want to have them locally. So I need to download ’em.
I know, there are GUI tools for it. But what if you are stuck on a terminal-only server? I am behind a very strict proxy, but I have a server that I can FTP into, and that server is not behind the proxy. The server is terminal only, though, hence the wget option.
wget can download the whole internet if you so wish, and it’s simple:
wget -r url
Now, before you go, there are a few caveats.
The site will be downloaded, but the pages will not really be suitable for offline viewing, because their links still point at the live site. To have wget rewrite the links so they work locally, do
wget -rk url
The above will convert the links in the downloaded files so they work offline. You might also want wget to keep the original, unconverted files; -K backs each one up with a .orig suffix before converting it.
wget -rkK url
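Another caveat: the commands so far do not guarantee that everything a page needs gets downloaded, so some pages can end up missing their images or stylesheets. To tell wget to download all files necessary to display each page properly (images, sounds, linked CSS, etc.), add -p: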
wget -rkp url
Again, don’t go yet. The default maximum depth of links to follow is 5. This might be too much (or too little, in case your plan is to download the whole internets). You can specify the depth with -l, replacing the 5 below with the level you want:
wget -rkpl 5 url
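Putting the options so far together, here is a minimal sketch of a small shell helper. The function name grab_book, the DRY_RUN switch and the example.com address are made up for illustration, and --no-parent (which keeps wget from climbing above the starting directory) is my addition rather than something covered above.

```shell
#!/bin/sh
# Sketch only: grab_book, DRY_RUN and the example.com URL are hypothetical.

grab_book() {
    url=$1
    depth=${2:-5}    # fall back to wget's default depth of 5

    # -r recurse, -k convert links for offline use, -p fetch page requisites,
    # -l limit the depth, --no-parent stay below the start directory
    # (--no-parent is an extra assumption, not part of the commands above).
    set -- wget -r -k -p -l "$depth" --no-parent "$url"

    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"    # only show the command that would run
    else
        "$@"         # actually run wget
    fi
}

# Print the command for a depth of 3 without touching the network.
DRY_RUN=1 grab_book "https://example.com/book/" 3
```

Drop the DRY_RUN=1 prefix once the printed command looks right, and the same call starts the real download.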
Finally, you might want wget to do all the hard work of downloading the internet and delete the files immediately after.
wget -r --delete-after url
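If you keep the files instead, they still have to travel from the server to your own machine. A recursive wget saves everything under a directory named after the host, so one way to go about it is to bundle that directory into a single archive and fetch that one file over FTP. This is only a sketch, assuming tar is available on the server; pack_site and example.com are made-up names.

```shell
#!/bin/sh
# Sketch only: pack_site and the example.com directory are hypothetical.

pack_site() {
    site_dir=${1%/}                  # strip a trailing slash, if any
    archive="$site_dir.tar.gz"

    # -c create, -z gzip, -f write to the named archive file
    tar -czf "$archive" "$site_dir" && echo "$archive"
}

# Only pack if a download directory is actually there.
if [ -d example.com ]; then
    pack_site example.com
fi
```

The function prints the name of the archive it wrote, so that is the single file to pull down afterwards.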
is also a good place to start learning more about the things that wget can do.
That’s it. Happy interwebs downloading.