reftoken Posted November 25, 2017

Hi all,

I am trying to use CeWL to get a wordlist from a website I am developing locally. Since trying with the direct path leads to errors, I started a local development server and did the following:

    # In the local directory
    $ php -S localhost:8080
    $ cewl -m 5 -w output.txt http://localhost:8080/

However, CeWL aborts almost immediately, leaving me with a list of 30-40 words. It basically just spiders the homepage and doesn't go any further. Even with -d 5 or -o, it doesn't seem to proceed as expected.

Do you know of an alternative way to fetch words from local files?

digininja Posted November 25, 2017

If you run it with --debug, it will show you all the URLs it finds and will say either why it is following them or why it is ignoring them.

My guess would be that the links coming off the homepage go to a different URL and so are considered offsite and not touched.
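
To illustrate the "offsite" guess above, here is a minimal Python sketch of a same-site check. It is not CeWL's actual code (CeWL is written in Ruby and its real rule may differ); it assumes that "onsite" means the same scheme, host and port as the start URL. The helper names `origin` and `is_onsite` are made up for this example.

```python
from urllib.parse import urljoin, urlsplit

# Ports implied when a URL does not spell one out.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return (scheme, host, port), filling in the default port if none is given."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def is_onsite(start_url, link):
    """True when the link, resolved against start_url, has the same origin."""
    return origin(urljoin(start_url, link)) == origin(start_url)


start = "http://localhost:8080/blog/"
links = [
    "about.html",                        # relative link, stays on the dev server
    "http://localhost:8080/contact",     # absolute link with the same port
    "http://localhost/blog/index.html",  # no port given, so it means port 80
]
for link in links:
    print(link, "->", "onsite" if is_onsite(start, link) else "offsite")
```

Under that assumption, an absolute link that leaves out the port points at port 80 and so counts as a different site from a server started on localhost:8080.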

reftoken (Author) Posted November 25, 2017

I got the following error when trying with --debug:

    /usr/bin/cewl: unrecognized option `--debug'

So I tried this instead, allowing it to go offsite:

    $ cewl -v -o -m 5 -c -w output.txt http://localhost:8080/

...and almost immediately got:

    Starting at http://localhost:8080/blog/
    Visiting: http://localhost:8080/blog/, got response code 200
    Attribute text found:

    Unable to connect to the site (http://localhost:80/blog/index.html)
    The following error may help:
    Failed to open TCP connection to localhost:80 (Connection refused - connect(2) for "localhost" port 80)
    /usr/lib/ruby/2.3.0/net/http.rb:882:in `rescue in block in connect'
    /usr/lib/ruby/2.3.0/net/http.rb:879:in `block in connect'
    /usr/lib/ruby/2.3.0/timeout.rb:91:in `block in timeout'
    /usr/lib/ruby/2.3.0/timeout.rb:101:in `timeout'
    /usr/lib/ruby/2.3.0/net/http.rb:878:in `connect'
    /usr/lib/ruby/2.3.0/net/http.rb:863:in `do_start'
    /usr/lib/ruby/2.3.0/net/http.rb:852:in `start'
    /usr/lib/ruby/2.3.0/net/http.rb:1398:in `request'
    /usr/bin/cewl:281:in `get_page'
    /usr/bin/cewl:212:in `block (2 levels) in start!'
    /usr/bin/cewl:210:in `each'
    /usr/bin/cewl:210:in `block in start!'
    /usr/bin/cewl:198:in `each'
    /usr/bin/cewl:198:in `start!'
    /usr/bin/cewl:165:in `start_at'
    /usr/bin/cewl:744:in `block in <main>'
    /usr/bin/cewl:734:in `catch'
    /usr/bin/cewl:734:in `<main>'
    Caller
    /usr/bin/cewl:233:in `get_page'
    /usr/bin/cewl:212:in `block (2 levels) in start!'
    /usr/bin/cewl:210:in `each'
    /usr/bin/cewl:210:in `block in start!'
    /usr/bin/cewl:198:in `each'
    /usr/bin/cewl:198:in `start!'
    /usr/bin/cewl:165:in `start_at'
    /usr/bin/cewl:744:in `block in <main>'
    /usr/bin/cewl:734:in `catch'
    /usr/bin/cewl:734:in `<main>'
    Writing words to file

Any idea what's going on here? Also, it's strange that it tries to access localhost:80 when I specified localhost:8080.

digip Posted November 25, 2017

Is the web server started? Do you have Apache installed? If you open "http://localhost:8080/blog/" in a browser, does it load properly? If not, that is where I would start.

reftoken (Author) Posted November 25, 2017

So far I have only tried the built-in PHP web server, and with it I could access the complete site in the browser. I will now try setting up a virtual host to see if that works better.

digininja Posted November 25, 2017

So you were trying to spider a site that didn't really exist; that could be your problem.

If --debug isn't there, you aren't using the latest version. Get it from my git repo and use that instead.
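
On the original question about fetching words from local files without a spider, one option is to read the files directly. The following is a rough Python sketch, not a CeWL feature. It assumes the pages exist as static `.html` files on disk (PHP templates would first have to be rendered by a server) and mirrors the `-m 5` minimum word length from the commands above. The names `TextExtractor`, `words_in_html` and `build_wordlist` are hypothetical.

```python
import re
from collections import Counter
from html.parser import HTMLParser
from pathlib import Path


class TextExtractor(HTMLParser):
    """Collect text content, skipping anything inside <script> or <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0  # nesting depth of script/style elements

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)


def words_in_html(html, min_len=5):
    """Return the letter-only words of at least min_len characters, in order."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks)
    return [w for w in re.findall(r"[^\W\d_]+", text) if len(w) >= min_len]


def build_wordlist(root, pattern="*.html", min_len=5):
    """Count words across every file under root that matches pattern."""
    counts = Counter()
    for path in Path(root).rglob(pattern):
        html = path.read_text(encoding="utf-8", errors="replace")
        counts.update(words_in_html(html, min_len))
    return counts


sample = "<html><script>var hidden = 1;</script><p>Hello world, sample page</p></html>"
print(words_in_html(sample))
```

Calling `build_wordlist(".")` in the site directory returns a `Counter`, and its keys can be written one per line to a file such as output.txt.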