Looking for a program that can gather entire URL Directory


VilleValoV


Looking for a program that can find the full branches of a submitted URL, i.e.

if I typed

Zoo.com

the desired result would be

Zoo.com/shops

zoo.com/shops/shirts

zoo.com/shops/shirts/girls

zoo.com/animals

zoo.com/animals/bats

zoo.com/animals/bats/naturalhabitat

zoo.com/blah/blah

zoo.com/blah/blah/blah

And so on.

Does a program like this exist?



If they are public and indexed by a search engine such as Google, then you can try "site:zoo.com" as a Google search to see what it finds.

ex: http://www.google.com/search?hl=en&q=s...G=Google+Search

You could also try a third-party app to follow links within a domain and return all the links found on each page of the site, excluding links that point outside the base domain.



That's what I'm looking for: something automated that can be used along with another program to simplify a certain task.



You could try writing a wget script to follow all relative links of a domain and output it to a text file.
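A minimal sketch of that idea as a single wget call (assuming wget is on your path; zoo.com is just a stand-in for the real site):

wget --spider -r -l inf -e robots=off -o spider.log http://zoo.com/

--spider checks each URL without saving anything, -r follows every link it finds (staying on the starting host by default), -l inf removes the recursion depth limit, -e robots=off ignores robots.txt, and -o spider.log writes the whole run to a log you can then search for the URLs it visited.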


Well, I am not sure this program will do exactly what you want, but I have used it before, and it does copy entire websites and the directories connected to said website.

I say take a look and see.

HTTrack Website Copier
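For reference, a command-line run looks roughly like this (the URL and output folder are placeholders; the Windows GUI walks you through the same options):

httrack "http://www.zoo.com/" -O ./zoo-mirror "+*.zoo.com/*" -v

That mirrors the site into ./zoo-mirror, and the folder tree it writes to disk maps directly onto the directory paths it found under the domain.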


If you have wget installed and in your path, save this in a bat script called spider.bat and run it. It will do the rest for you.

:123
@echo OFF
cls
echo Choose a site to spider links.
SET /P website="[example: www.google.com] : "

REM Spider the site recursively without keeping the pages; write wget's debug output to SpiderDebug.txt.
wget -erobots=off --accept="html,htm,php,asp,aspx,jsp,chm,shtml" -rH --spider -P SpiderDownloadShit/ -D%website% --user-agent="Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" %website% -do SpiderDebug.txt

echo Links found in %website%. SpiderDownloadShit/ is to be ignored. > Spiderlinks.txt

REM Keep only the lines from the debug log that mention saved URLs.
find "saved" SpiderDebug.txt >> Spiderlinks.txt

REM Clean up the temporary log and download directory.
del SpiderDebug.txt
rmdir /s /q SpiderDownloadShit

::pause
goto :123

Edit: Cleaned it up and fixed it so you don't have any folders or files to delete afterwards! Now it only spiders the site for available links.


Problem is, you don't know which files and folders exist, so trying every possible filename is going to be a wasted effort. AFAIK there isn't a 'get all the porn off this site' tool :(

This is only going to give you a quick list of the links it can find within the pages you feed it; it can't see hidden directories or ones with no links pointing to them. Without root or remote access to the machine itself, you won't be able to list all the directories under the site.


It would be simple to create an HTTP scanner that brute-forces the URL or uses a dictionary file.

I think I made that in another post somewhere, but yeah, it can be fed a list to go through. It was in a thread where someone asked about a URL brute forcer or something to that effect. I think I posted the code to do it too; I just have to find the thread.
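In the meantime, here is a rough sketch of the dictionary version as a batch file (not the script from that other thread). It assumes wget is on your path, and wordlist.txt and FoundPaths.txt are made-up names:

@echo off
SET /P website="Site to check [example: www.zoo.com] : "
del FoundPaths.txt 2>nul
REM Try each candidate path from the wordlist against the site.
for /F "usebackq delims=" %%p in ("wordlist.txt") do (
    wget --spider -q "http://%website%/%%p"
    REM wget exits 0 when the URL answers; anything else counts as a miss.
    if not errorlevel 1 >>FoundPaths.txt echo %%p
)
echo Done - hits are listed in FoundPaths.txt

Feed it a decent wordlist of common directory names and it will log whichever ones the server actually answers for.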


I felt it was relevant to this thread.

It is because it is very true. The fact that just about every script kiddie would download it is a given.

Here is a start:

1.1.1.1
1.1.1.2
1.1.1.3
1.1.1.4
1.1.1.5
1.1.1.6
1.1.1.7
1.1.1.8
1.1.1.9
1.1.2.0



You do know that's not how it'd go?



You had to ask? Or was that a rhetorical question :)

Anyway, @nicatronTg, if you look at the thread, he is not trying to scan every IP address (and what you posted wouldn't work for that anyway). He wants to find directories under a site, e.g. somesite.com, where it would search for

somesite.com/somevariable

somesite.com/somevariableetc

somesite.com/somevariableetcetc

somesite.com/etc/etc/etc

In doing so, once you generate enough misses, the web admin (if he is not asleep on the job) will see his error logs filling up with junk and ban your IP address anyway, so this method really is of little use for enumerating a site.

BUT, if you have fingerprinted your target system well and know certain flaws, common folder structures, etc., you can make separate scripts that go out and check the server for specific directories, or even for flaws in things like PHP, MySQL, etc., just by navigating to the site or adding commands, scripts, and parameters to the URL. Just use common sense when testing this out, preferably on your own site or network; we'll call that pentesting. Do it to another site and, well, now you're just asking for trouble.


Lol, an actual every-IP-address-in-the-world program... now we need one for IPv6.

lol, like anyone cares about or fully understands IPv6 (honestly, it's easier to remember MAC addresses).

Shouldn't it filter out private network addresses like:

10.x.x.x

172.16.x.x

192.168.x.x

and emphasize 127.0.0.1?

Not that I'm bored or anything.


Automated software can be configured to just ban you after too many attempts. Here is some stuff that works against people trying to brute force SSH, but I am sure it can be configured site-wide as well: http://isc.sans.org/diary.html?storyid=4408
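For the curious, one common way to do that at the firewall level (just an illustration of the general idea, not necessarily what the linked article describes) is the iptables 'recent' module, which drops a source IP that opens too many new connections in a short window:

iptables -A INPUT -p tcp --dport 22 -m state --state NEW -m recent --name ssh --set
iptables -A INPUT -p tcp --dport 22 -m state --state NEW -m recent --name ssh --update --seconds 60 --hitcount 4 -j DROP

The same principle applies to a web server: count requests per source address and start dropping or banning once the miss rate looks like a scanner.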

