
Download All Files From Website


Psionic Fungus


Hey guys.

Trying to create an offline copy of all the Security Now! podcasts, along with all the related show notes, transcripts, etc. (only mp3, pdf, htm/html, and txt files).

With 198 episodes at six files each...

I don't wanna spend all day downloading 1188 files by hand.

Any recommendations?

I've looked for torrents, but the ones I've found don't contain all the files.

Any help?



DownThemAll! plugin for Mozilla Firefox

https://addons.mozilla.org/en-US/firefox/addon/201


You can write a little script to have wget do it as well. Wget can follow links on the site, and you can specify which file types to download: mp3, pdf, etc. Here is a BAT script for Windows that I use when mirroring a site. Change the --accept settings to include only the file types you need.

DownloadSpider.bat

:123
@echo OFF
cls
echo Choose a site to download links from.
SET /P website="[example: www.google.com] : "

REM -r recurse, -H span hosts, -l 1000 limit depth, -e robots=off ignore robots.txt,
REM -P set the download directory, -D stay on this domain, -do log debug output to SpiderDebug.txt
wget -erobots=off --accept="html,htm,php,phps,phtml,jpg,jpeg,gif,png,bmp,pl,txt,asp,aspx,jsp,js,chm,shtml,css,mov,avi,mpg,mp3,mp4,pdf,flv,swf,bz2,tar,rar,zip,exe" -l 1000 -rH -P SpiderDownloadShit/ -D%website% --no-check-certificate --user-agent="Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" %website% -do SpiderDebug.txt

echo Links found in %website%. SpiderDownloadShit/ is to be ignored. > Spiderlinks.txt

REM Pull the "saved" lines for downloaded files out of the debug log
find "saved" SpiderDebug.txt >> Spiderlinks.txt

REM Optional cleanup -- uncomment to delete the debug log and download directory each run
::del SpiderDebug.txt
::rmdir SpiderDownloadShit /s /q
::pause
goto:123

This will only go to a depth of 1000. Change -l to 0 for infinite recursion (not recommended if you plan to run it and then walk away, as you could spend all day at one site filling your HDD; this script also follows foreign links. Use -L to follow relative links only).

You can do the same thing on Linux; you just have to rewrite it as a shell script.
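Something like this would be a rough starting point (an untested sketch of the same wget call; the filename, the while loop, and the trimmed --accept list of mp3,pdf,htm,html,txt for the Security Now! files are my own choices, not part of the original script):

DownloadSpider.sh

#!/bin/sh
# Rough shell equivalent of DownloadSpider.bat
while true; do
    clear
    printf 'Choose a site to download links from.\n[example: www.google.com] : '
    read website

    # Same flags as the BAT version: -r recurse, -H span hosts, -l 1000 limit
    # depth, -e robots=off ignore robots.txt, -do log debug output to a file
    wget -e robots=off --accept="mp3,pdf,htm,html,txt" \
        -l 1000 -rH -P SpiderDownloadShit/ -D "$website" \
        --no-check-certificate \
        --user-agent="Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" \
        "$website" -do SpiderDebug.txt

    echo "Links found in $website. SpiderDownloadShit/ is to be ignored." > Spiderlinks.txt
    # grep stands in for the Windows find command
    grep "saved" SpiderDebug.txt >> Spiderlinks.txt
done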

