Blue Dragon Posted February 3, 2008

Hi. I was searching for a tool to find out the folder structure of a web server. For example, if I have the server www.server.com, I'd like to find all the folders on it, like www.server.com/data, www.server.com/pictures, www.server.com/videos/action and so on. I tried Google with site:www.server.com, but it only showed some of the folders and subdomains. Then I tried Maltego, but still only got some subdomains and a few folders. So I decided to write a little brute-force script in PHP, because that's the only language I'm somewhat used to. Here is my script:

<?php
$from = "aaa";
$to = "zzz";
// Note: PHP's string increment wraps "zzz" around to "aaaa", and "aaaa" <= "zzz"
// is still true as a string comparison, so a plain $i <= $to would run far past
// the intended range. Checking the length instead stops right after "zzz".
for ($i = $from; strlen($i) <= strlen($to); $i++) {
    $url = "http://localhost/testcheck/$i";
    $test = @fopen($url, "r");
    if ($test) {
        echo $url . " is reachable\n";
        fclose($test); // close the handle so connections don't pile up
    }
}
?>

It's very simple: it just tries all possible combinations from aaa to zzz, connects to http://localhost/testcheck/$i, and tells me which URLs are reachable. The folder testcheck contains the files abg and hkg, which are found in a couple of seconds - it works! The problem is that it's really slow and takes a long time to try all the possible combinations. In my tests I only used aaa to zzz, but when I tried it on a real server with aaaaa to zzzzz, it took very long. And there was another problem: I think my IP address got banned after a few attempts. I watched the connection with Wireshark, and at some point (I think it was around aaaazgh or something) the connection just stopped.

Well, now I have a lot of questions, and I hope you guys can help me with some of them. This is a great forum and I really like Hak5!

1) Are there other programs out there that let me find out the folder structure of a web server, either by brute-forcing or by some other method?
2) How can I speed up my PHP script to try more than one combination every 1 or 2 seconds?
3) How can I avoid getting blocked after a few connection attempts? Is there a program that jumps from proxy to proxy to change my IP?
4) With my current script I can only test one folder level. How can I find things like www.server.com/data/pictures or www.server.com/data/10-home (numbers in the URL)?
5) I've heard of programs called "web crawlers", which search engines like Google use. Can I use those to find the folder structure of a web server? If yes, do you have suggestions for good ones? Are there any good free ones?
6) Do you have any other tips for me?

Hope my English isn't that bad :-) Greetings from Germany!
digip Posted February 3, 2008

Aside from giving the program a list of predefined things to search for, you really can't unveil all the folders on a server unless the server itself has a flaw that exposes all the files, or you have physical access to log on to the server and list them from the shell. Short of taking over the server or exploiting a flaw, there is no real way to do it other than trial and error.

Your script, I assume, is hammering the server you are trying to look through, and one of two things is happening: 1) they have IDS software that automatically blocked you for fear of a DoS attack, or 2) the admin checked the logs, saw too many errors coming from your IP, and blocked you.

One way to work around this is to create a list of predefined words to check for, and then just send a HEAD request for each folder you're searching for. Then have it wait so many seconds between requests. Still, if the admin is doing their job, they will see the requests and may ban you for fear of an attempted break-in. You do hammer the server with a lot of errors in the process, and just the sheer volume of requests from an automated script would probably get you banned. If I were going to try this, I would figure out how to do it with wget and a timer, and use something like a Googlebot or msnbot user agent so it just looks like a web crawler to the server and they might not ban it.
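Roughly what I have in mind, in PHP (untested sketch - "wordlist.txt" and the site URL are just placeholders, one folder name per line in the file):

<?php
// Sketch: HEAD requests with a crawler user agent and a delay between tries.
$context = stream_context_create(array(
    "http" => array(
        "method"     => "HEAD",
        "user_agent" => "Googlebot/2.1 (+http://www.google.com/bot.html)",
    ),
));

$words = file("wordlist.txt", FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
foreach ($words as $word) {
    $url = "http://www.server.com/$word/";
    $test = @fopen($url, "r", false, $context); // fails on 404, succeeds on 2xx
    if ($test) {
        echo $url . " is reachable\n";
        fclose($test);
    }
    sleep(2); // wait between requests so it doesn't hammer the server
}
?>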
Blue Dragon Posted February 3, 2008 Author

Hey digip. Thanks for your answer. I tried using wget, but AFAIK it only follows links on a website and downloads everything it finds on its way. The problem is: the server whose structure I'm trying to map doesn't link to some of the folders I'd like to find. Also, wget downloads all the content, but I only want to know the URLs and the folders that are on the server. Where can I get a "Googlebot or msnbot user agent", and what exactly does it do?
digip Posted February 3, 2008

Hey digip. Thanks for your answer. I tried using wget, but AFAIK it only follows links on a website and downloads everything it finds on its way. The problem is: the server whose structure I'm trying to map doesn't link to some of the folders I'd like to find. Also, wget downloads all the content, but I only want to know the URLs and the folders that are on the server. Where can I get a "Googlebot or msnbot user agent", and what exactly does it do?

You would have to write a script that feeds wget the URLs and loops through them until all the requests are done. It doesn't have to follow links unless you tell it to.
digip Posted February 3, 2008

Example bat file:

del mylog.txt
wget --append-output="mylog.txt" --input-file="URLS_TOCHECK.txt" --base="http://www.somesite.com/" --user-agent="Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)"

Then in another text file, such as "URLS_TOCHECK.txt", you place the search terms, one per line, so the output from your PHP script would go in there on a line by itself and get fed to wget. Example "URLS_TOCHECK.txt":

aaa
aaaa
aaaaa
aaa/aaa
aaa/aaaa

Place the bat file in a folder and run it, then go through the log to see what it finds. See the attached rar file as an example and edit the website's base URL and the URLs to search for. It's sloppy but it works.

http://www.twistedpairrecords.com/digip/wg...erStructure.rar

edit: The only thing this needs is a timer function, because it will run one request after another and may hammer the server. I think wget has an option to do this but I am too lazy to look for it.
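edit 2: wget does have it - --wait puts a delay (in seconds) between requests, and --random-wait varies that delay so the timing doesn't look automated:

del mylog.txt
wget --append-output="mylog.txt" --input-file="URLS_TOCHECK.txt" --base="http://www.somesite.com/" --wait=5 --random-wait --user-agent="Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)"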
Blue Dragon Posted February 3, 2008 Author

Thanks a lot, digip! This script works perfectly with my test server, and I'll try it with a "real" server soon. It's great that it works like a dictionary attack - this way I can save processing power by using a wordlist file. Thanks again for your help!
digip Posted February 3, 2008

There is also a setting to write the output to null (haven't looked at it yet) so it does not actually save any output other than the log. This way you don't end up with a bunch of index.htm(1), index.html(2), etc. files filling up the folder.
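edit: wget's --spider option looks like it does exactly that - it only checks whether each page is there and doesn't save anything to disk, so nothing but the log gets written.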
Sparda Posted February 3, 2008

Brute-forcing a web server's directory structure is very noisy (you're basically screaming "I'M PROBING YOUR SERVER!" down the phone to the IDS). Any good IDS would blacklist your IP after the first two or three attempts (and all you got were 404s). The method does work, however. You won't necessarily be able to see what is in the directories, but you'll know they are there (for whatever that's worth). This is demonstrated by the different errors returned by these pages:

http://sparda.hopto.org/hak5/ (this one exists)
http://sparda.hopto.org/a/ (this one does not)
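You can tell these cases apart from the status line alone, without fetching any content. A rough PHP sketch (get_headers() makes the request and returns the response headers; the first entry is the status line):

<?php
// Sketch: classify a URL by its HTTP status line, e.g. "HTTP/1.1 404 Not Found".
$headers = @get_headers("http://sparda.hopto.org/hak5/");
if ($headers) {
    $status = $headers[0];
    if (strpos($status, "200") !== false) {
        echo "exists and is readable\n";
    } elseif (strpos($status, "403") !== false) {
        echo "exists, but access is forbidden\n";
    } else {
        echo "probably does not exist: $status\n";
    }
}
?>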
digip Posted February 3, 2008

Brute-forcing a web server's directory structure is very noisy (you're basically screaming "I'M PROBING YOUR SERVER!" down the phone to the IDS). Any good IDS would blacklist your IP after the first two or three attempts (and all you got were 404s). The method does work, however. You won't necessarily be able to see what is in the directories, but you'll know they are there (for whatever that's worth).

Yeah, I mentioned in my first post that with enough error hits, someone (or the software itself) would probably restrict access from his IP. That's why he needs a timer to randomize the requests and spread them out over a long period of time. Even if you find a folder that returns 403 Forbidden, you can't see what's in it, just that it exists. About the only thing I think this would be useful for is finding porn (j/k). If anything, you will find yourself blacklisted from the sites you're trying to get into.
ls Posted February 10, 2008

I've written a little script to search a web server for folders:

#!/usr/bin/python
import urllib2

site = raw_input("site : ")  # must be in the form http://www.google.com/
wordlist = open(raw_input("list with folders : "))  # a text file, one folder per line

for folder in wordlist:
    folder = folder.strip()  # drop the trailing newline, or the URL breaks
    try:
        url = site + folder
        urllib2.urlopen(url).read()
        print "[-] folder " + folder + " exists"
    except urllib2.URLError:  # any HTTP error (404, 403, ...) lands here
        print "[-] folder " + folder + " does not exist"

wordlist.close()
print ""
print "[-] done"

You have to make a text file with the folders. I hope this helps.
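For example, with a folders.txt containing (names are just examples):

data
pictures
videos/action

you would run it like this (dirscan.py being whatever you saved the script as), entering the site with a trailing slash:

python dirscan.py
site : http://www.server.com/
list with folders : folders.txt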