Download All Files From Website

Psionic Fungus · May 31, 2009

Hey guys.

Trying to create an offline copy of all the Security Now! podcasts, along with all the related show notes, and transcripts, etc. (Only mp3, pdf, htm/html, and txt files)

With 198 episodes, with six files each....

I don't wanna spend all day downloading 1188 files.

Any recommendations?

I've looked for torrents, but they don't contain all the files (or at least the ones I've found)

Any help?

Psionic Fungus · May 31, 2009

DownThemAll! plugin for Mozilla Firefox

https://addons.mozilla.org/en-US/firefox/addon/201

ls · June 1, 2009

You can use HTTrack to create an offline copy from a website:

http://www.httrack.com/

digip · June 1, 2009

You can write a little script to have wget do it as well. Wget can follow links on the site, and you can specify which file types to download, like mp3, pdf, etc. Here is a BAT script for Windows that I use when mirroring a site. Change the accept settings to include only the file types you need.

DownloadSpider.bat

:123
@echo OFF
cls
echo Choose a site to download links from.
SET /P website="[example: www.google.com] : "

wget -erobots=off --accept="html,htm,php,phps,phtml,jpg,jpeg,gif,png,bmp,pl,txt,asp,aspx,jsp,js,chm,shtml,css,mov,avi,mpg,mp3,mp4,pdf,flv,swf,bz2,tar,rar,zip,exe" -l 1000 -rH -P SpiderDownloadShit/ -D%website% --no-check-certificate --user-agent="Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" %website% -do SpiderDebug.txt

echo Links found in %website%. SpiderDownloadShit/ is to be ignored. > Spiderlinks.txt

find "saved" SpiderDebug.txt >> Spiderlinks.txt

::del SpiderDebug.txt

::rmdir SpiderDownloadShit /s /q

::pause
goto:123

This will only go to a depth of 1000. Change it to 0 for infinite recursion (not recommended if you plan to run it and then walk away, as you could be at a site all day filling your HDD, and this one follows foreign links). Use -L to follow relative links only.

You can do the same thing on Linux; you just have to rewrite it as a shell script.
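For anyone who wants the Linux version, here is a rough sketch of how the same idea might look as a shell script. It is not a tested port of the BAT file. The accept list is narrowed to the types asked for in the first post, the depth of 5 and the SpiderDownload/ directory name are assumptions, the WGET override is a made-up convenience for swapping in another binary or a stub, and the Googlebot user agent and --no-check-certificate are left out.

```shell
#!/bin/sh
# Sketch of a Linux counterpart to DownloadSpider.bat.
# ACCEPT, DEPTH, the directory name and the WGET override are illustrative choices.

ACCEPT="mp3,pdf,htm,html,txt"   # file types from the original question
DEPTH=5                         # assumed; far lower than the 1000 in the BAT script

# spider_site DOMAIN
# Crawl DOMAIN recursively and keep only files whose suffix is in ACCEPT.
# -H allows other hosts, -D limits them to DOMAIN, -o writes the log file.
spider_site() {
    "${WGET:-wget}" -e robots=off \
        --accept="$ACCEPT" \
        -r -H -l "$DEPTH" \
        -P SpiderDownload/ \
        -D "$1" \
        -o SpiderDebug.txt \
        "$1"
}

# Only crawl when a domain is given on the command line.
if [ "$#" -gt 0 ]; then
    spider_site "$1"
    # Same idea as the find "saved" step: keep the log lines for saved files.
    grep "saved" SpiderDebug.txt > Spiderlinks.txt
fi
```

Save it as something like spider.sh and run it as `sh spider.sh www.example.com`. Raise DEPTH only if the crawl misses files, since a deep crawl can run for a long time.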

Brian Sierakowski · June 1, 2009

Sounds like you can probably get it done with the above suggestions, but FlashGet is basically the same idea. Put in the address and specify the file types, and it will crawl the site and download those file types into a folder you select.

-Brian
