
Build A Web Scanner... Grep Document Write?



Now this is just an example. I'm looking for help building a list or a dictionary.

I have built a few scanners and crawlers, and I have an idea. I'm sure it's already been done, but I like to do things my way.

Let's say you use wget to crawl and download an entire site. Now you have all the contents downloaded into a tmp directory, and you use a Linux command like grep (just for example) to find a string:

cat /tmp/site_crawl/* | grep exec

echo exec(var)

echo pcntl_exec(var)


Example of a list I would like to build:





`` (backtick operator)

I'm looking for a universal list for finding ALL possibilities, not just PHP. I guess the goal is to find a way to execute or write data on the server.

Maybe there are vulnerabilities for css, java, php etc. Any string that may need further investigation.

This is just an example. I'm sure you will never find any PHP exec etc. in plain text.
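A rough sketch of the "dictionary" idea in Python, in case it helps: it walks a crawl directory and flags lines matching a per-language list of patterns. The `SUSPICIOUS` dictionary and the `scan_tree` name are made up for illustration, and the pattern list is only a starter, nowhere near universal.

```python
import os
import re

# Hypothetical starter dictionary: label -> regexes worth a manual look.
SUSPICIOUS = {
    "php": [
        r"\bexec\s*\(",
        r"\bpcntl_exec\s*\(",
        r"\bshell_exec\s*\(",
        r"\bsystem\s*\(",
        r"\bpassthru\s*\(",
        r"\beval\s*\(",
        r"`[^`]+`",  # backtick operator
    ],
    "js": [
        r"document\.write\s*\(",
    ],
}


def scan_tree(root, patterns=SUSPICIOUS):
    """Yield (path, line_number, label, line) for every pattern hit under root."""
    compiled = [
        (label, re.compile(rx)) for label, rxs in patterns.items() for rx in rxs
    ]
    for dirpath, _dirs, files in os.walk(root):
        for name in sorted(files):
            path = os.path.join(dirpath, name)
            try:
                # errors="replace" keeps binary junk from aborting the scan
                with open(path, "r", errors="replace") as fh:
                    for lineno, line in enumerate(fh, 1):
                        for label, rx in compiled:
                            if rx.search(line):
                                yield path, lineno, label, line.strip()
            except OSError:
                continue  # unreadable file, skip it


if __name__ == "__main__":
    for hit in scan_tree("/tmp/site_crawl"):
        print(*hit, sep=":")
```

Every hit is only a lead for manual review, not a confirmed vulnerability.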

Edited by i8igmac

The things you will want to know are which types of files are executable and which ones are vulnerable to attack. In most cases it's a quick find on PHP files, but other file types can be used to attack servers too, mostly ones for uploading or retrieving server-side files. For example, a site might use an upload script, which would have a form in it with something like:


So grepping for that would be one to look for. Then test the site to see what types of files it allows you to upload, and whether it sends back the URL where they are uploaded. Not all scripts that allow uploads are vulnerable to attack, so found search results would need manual testing. Things you might try with an upload form: uploading a reverse shell, PHP files renamed to .jpg to see if they are still executable, dumping /etc/passwd or /etc/shadow, and so on.
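The form snippet itself did not survive in the post above, so the following is a guess at the idea: assuming it refers to an HTML file input, this sketch lists the `action` of every form in a crawled page that contains `<input type="file">`. `UploadFormFinder` and `find_upload_forms` are hypothetical names.

```python
from html.parser import HTMLParser


class UploadFormFinder(HTMLParser):
    """Collect forms and note which ones contain a file input."""

    def __init__(self):
        super().__init__()
        self.forms = []       # one dict per <form> seen
        self._current = None  # the form we are currently inside, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {
                "action": attrs.get("action") or "",
                "has_file_input": False,
            }
            self.forms.append(self._current)
        elif tag == "input" and (attrs.get("type") or "").lower() == "file":
            if self._current is not None:
                self._current["has_file_input"] = True

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None


def find_upload_forms(html):
    """Return the action URLs of forms that include a file upload field."""
    finder = UploadFormFinder()
    finder.feed(html)
    return [f["action"] for f in finder.forms if f["has_file_input"]]
```

Parsing the HTML instead of grepping for a fixed string catches variations in attribute order and quoting.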

Other things to look for are URLs that end in commands to pull results, either database driven or manual file retrieval. For example, some server-side scripts are used to send users stored PDF files and other documents. By manipulating the URL, you can tell the script to grab any file on the server in the user's home directory, including the server-side retrieval script itself. If the script returns itself, you get its source code, which you can read through for more vulns, with the potential to do directory traversal above the user's home directory and pull down root server files, do directory listings, etc.
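To pick such URLs out of a crawl, one simple heuristic (an assumption on my part, not a rule) is to flag query parameters whose value looks like a file name or path. `FILE_LIKE` and `file_like_params` are illustrative names.

```python
import re
from urllib.parse import urlsplit, parse_qsl

# Heuristic: the value ends in "name.ext" with a 2-5 character extension.
# It will also match things like version numbers, so expect false positives.
FILE_LIKE = re.compile(r"(^|/)[^/]+\.[A-Za-z0-9]{2,5}$")


def file_like_params(url):
    """Return (name, value) query pairs whose value looks like a file or path."""
    query = urlsplit(url).query
    return [(k, v) for k, v in parse_qsl(query) if FILE_LIKE.search(v)]
```

For example, `file_like_params("http://example.com/download.php?file=docs/report.pdf&id=7")` returns `[("file", "docs/report.pdf")]`, which marks that script as one to test by hand.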

Some server-side scripts also do compression, returning zipped-up files based on the user's selections in a form. Manually inserting files that the author had not offered as options can sometimes spit out files not intended to be publicly available.

On Windows machines, checking forms and forcing bad URLs to get error messages can reveal database errors, file paths, the existence of a database, and sometimes source files. For example, appending .vb to default.aspx (default.aspx.vb) can spit out the file if the server is not configured properly. PHP files have similar issues with .inc files, e.g. index.php to index.php.inc or index.inc.
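The renaming trick is easy to generate from a list of crawled paths. A small sketch, with `disclosure_candidates` as a made-up helper name; the `.cs` variant is my own addition (C# code-behind files), not something from the post.

```python
import posixpath


def disclosure_candidates(path):
    """Return source-file variants worth requesting for a given URL path."""
    stem, ext = posixpath.splitext(path)
    ext = ext.lower()
    if ext == ".aspx":
        # default.aspx -> default.aspx.vb (and .cs, an assumption added here)
        return [path + ".vb", path + ".cs"]
    if ext == ".php":
        # index.php -> index.php.inc and index.inc
        return [path + ".inc", stem + ".inc"]
    return []
```

So `disclosure_candidates("/index.php")` gives `["/index.php.inc", "/index.inc"]`, and anything else returns an empty list.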

Most of what you are looking to do has been done to some extent, though. I don't remember the exact tool names as I don't have BackTrack up right now, but under BackTrack's recon scanner tools there are a number of built-in web vuln scanners that do brute-force directory requests and show you the hits. These can be REALLY noisy and in most cases will just get your IP blocked, and the same goes for your own scanner. If you want better results, I find that treading lightly and learning more about the server first, then trying directories, works much better than a Hail Mary of sending all known requests.

Start by determining the type of web server you are on, then the OS itself. A quick nmap scan of common ports gets more info; nothing heavy, just a light scan. After determining the web server, confirm it with key known objects in case the server uses an IDS that fakes info. For example, look for /icons/ on Apache servers: some Apache servers will return 404 Not Found for /icons/, but if you request, say, /icons/a.gif and it shows 200 OK, then you know they are trying to hide the fact that they are running Apache. HTTP HEAD requests are also useful for getting the server banner, but that is another place where servers can fake results, or even block HEAD requests altogether.
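A minimal sketch of the banner and /icons/ checks described above, using only the standard library. `head_status_and_server` and `apache_hint` are hypothetical names, and the 404/200 rule is just the heuristic from this post, not a reliable fingerprint.

```python
import urllib.error
import urllib.request


def head_status_and_server(url, timeout=5):
    """Send one HEAD request; return (status_code, Server header or None)."""
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status, resp.headers.get("Server")
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses still carry a status code and headers
        return err.code, err.headers.get("Server")


def apache_hint(icons_dir_status, icon_file_status):
    """True when /icons/ answers 404 but a file inside it answers 200."""
    return icons_dir_status == 404 and icon_file_status == 200


if __name__ == "__main__":
    base = "http://example.com"  # placeholder target, replace with your own
    dir_status, banner = head_status_and_server(base + "/icons/")
    file_status, _ = head_status_and_server(base + "/icons/a.gif")
    print("banner:", banner, "hidden apache?", apache_hint(dir_status, file_status))
```

That is two requests in total, which fits the "light scan" approach. If the server blocks HEAD, the same checks can be repeated with GET.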

Read up on the OWASP site too. They have tools and examples for brute forcing known directories as well.


Those are great things to look for, but usually you look for them after you've found a way in, and 9 times out of 10 they aren't accessible in the clear.

I would be looking for things on the web-facing side, not the server side. Then, once you are in or have found a vulnerable script, get it to spit out the files from your pastebin list.

Instead of reinventing the wheel, check out https://www.owasp.org/index.php/DirBuster

You can download its lists of known vulnerable directories, directories that identify known software packages, etc.


BackTrack has a nice utility that brute forces directories. Here's a link to it.

By brute forcing, I mean it will try to determine which directories exist on a particular web server.
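For anyone who wants to see the idea rather than run a packaged tool, here is a minimal sketch and not how any particular utility does it. `candidate_urls` and `find_directories` are made-up names, and `probe` is any function you supply that takes a URL and returns an HTTP status code. The delay between requests is there because fast scans are noisy and tend to get your IP blocked.

```python
import time
from urllib.parse import urljoin


def candidate_urls(base_url, wordlist):
    """Turn directory names from a wordlist into full URLs under base_url."""
    base = base_url if base_url.endswith("/") else base_url + "/"
    return [
        urljoin(base, word.strip().strip("/") + "/")
        for word in wordlist
        if word.strip()
    ]


def find_directories(base_url, wordlist, probe, delay=1.0):
    """Return candidate URLs for which probe(url) is anything other than 404."""
    found = []
    for url in candidate_urls(base_url, wordlist):
        if probe(url) != 404:
            found.append(url)
        time.sleep(delay)  # pace the requests to stay quiet
    return found
```

Treating every non-404 status as a hit is a simplification; servers that answer 200 for everything would need a smarter comparison.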



  • 1 year later...

Hi there

I have been testing this recently, and I want to know whether there is a web scanner that can also scan documents at the same time.

I would like some useful references.

Thanks a lot

There are tools that can scan documents: tools like "strings" on the Linux command line, or the Windows version by Sysinternals, and also tools that scan the metadata inside PDFs, images, and other documents for file paths, usernames, etc.
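If you want this inside your own scanner, the core of what "strings" does is simple to approximate. A rough Python sketch (the `extract_strings` name is made up, and it only handles plain ASCII runs, not UTF-16 or compressed streams):

```python
import re


def extract_strings(data, min_len=4):
    """Return runs of printable ASCII at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.decode("ascii") for m in pattern.findall(data)]


if __name__ == "__main__":
    import sys

    # Usage: python extract_strings.py downloaded_file.pdf
    with open(sys.argv[1], "rb") as fh:
        for s in extract_strings(fh.read()):
            print(s)
```

You can then run the resulting strings through the same pattern list you use for the crawled pages to spot file paths and usernames.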



Edited by digip
