
python sniffing script high cpu consumption



Hi.

Based on this script: http://edwardkeeble.com/2014/02/passive-wifi-tracking/

I have created my own version which detects and filters the probe requests coming from nearby devices and reports this information to a central webservice in charge of storing it in a database.

My main intention is to replicate the RTLS features offered by Meraki's routers (https://meraki.cisco.com/lib/pdf/meraki_datasheet_location.pdf).

My main concern is that the script uses between 95% and 100% of the CPU. I have only tested it with a couple of nearby Wi-Fi devices broadcasting probes, but I'm afraid that in crowded places it will fail or not report the data correctly.

Any idea how I could improve the CPU usage?

This is the code:

#!/usr/bin/python
# Python 2.7 (the Pineapple ships python2.7, as the traceback below shows)
import sys
import time
import thread          # Python 2 low-level threading module
import requests
from scapy.all import sniff, Dot11

PROBE_REQUEST_TYPE = 0      # management frame
PROBE_REQUEST_SUBTYPE = 4   # probe request
buf = {'arrival': 0, 'source': 0, 'dest': 0, 'pwr': 0, 'probe': 0}
uuid = '1A2B3'

def PacketHandler(pkt):
    if pkt.haslayer(Dot11):
        if pkt.type == PROBE_REQUEST_TYPE and pkt.subtype == PROBE_REQUEST_SUBTYPE:
            PrintPacket(pkt)

def PrintPacket(pkt):
    arrival = int(time.time())
    print "Probe Request Captured:"
    extra = getattr(pkt, 'notdecoded', None)
    if extra is not None:
        # RSSI byte from the undecoded RadioTap tail; the offset is
        # driver-dependent, so this is fragile across adapters
        signal_strength = -(256 - ord(extra[-4:-3]))
    else:
        signal_strength = -100
        print "No signal strength found"
    ssid = pkt.getlayer(Dot11).info
    print arrival, pkt.addr2, pkt.addr3, signal_strength, ssid
    launcher(arrival, pkt.addr2, pkt.addr3, signal_strength, ssid)

def launcher(arrival, source, dest, pwr, probe):
    global buf
    if buf['source'] == source and buf['probe'] == probe:
        print 'do not report'   # duplicate of the last probe seen
    else:
        print 'do report'
        buf = {'arrival': arrival, 'source': source, 'dest': dest,
               'pwr': pwr, 'probe': probe}
        try:
            thread.start_new_thread(exporter, (arrival, source, dest, pwr, probe))
            print 'start the thread'
        except thread.error:
            print 'error launching the thread'

def exporter(arrival, source, dest, pwr, probe):
    print 'this is the thread %r' % source
    urlg = ('http://webservice.com/?arrival=%s&source=%s&dest=%s'
            '&pwr=%s&probe=%s&uuid=%s' % (arrival, source, dest, pwr, probe, uuid))
    try:
        r = requests.get(urlg)
        print r.status_code
        print r.content
    except requests.RequestException:
        print 'ERROR IN THREAD:::::: %r' % source
        print 'wait 2 secs'
        time.sleep(2)
        r = requests.get(urlg)   # single retry; still unhandled if it fails again
        print r.status_code
        print r.content

def main():
    from datetime import datetime
    print "[%s] Starting scan" % datetime.now()
    print "Scanning for:"
    sniff(iface=sys.argv[1], prn=PacketHandler, store=0)

if __name__ == "__main__":
    main()

Any guidance would be really appreciated.


Ok, so I'm a C# coder and haven't done Python in a while, so understand that I haven't run your code. I am essentially firing from the hip here.

So I notice that you seem to be creating a thread each time you intercept a packet. Does this happen very often? If it does, you should know that threads have comparatively high creation cost. You can think of a thread as essentially forking another runtime (the details might be a little different; investigate it), so the object is necessarily complex. You might want to check whether Python has a thread pool, so you could reuse already-created threads. This is in the Pineapple forum, so remember that the Pineapple is not really optimized for complex CPU tasks; try to think of it more as a "sensor", not a cracker or processing node.
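Python's standard library does in fact make a pool straightforward. Here is a minimal producer/consumer sketch (Python 3 syntax; the `results` list is a stand-in for the HTTP report the real script would send):

```python
import threading
import queue

work = queue.Queue()
results = []          # stand-in for the reports actually sent

def worker():
    # one long-lived thread handles many reports: no per-packet thread creation
    while True:
        item = work.get()
        if item is None:          # sentinel: shut this worker down
            work.task_done()
            break
        results.append(item)      # here you would do the HTTP request instead
        work.task_done()

# create the pool once, at startup
pool = [threading.Thread(target=worker) for _ in range(4)]
for t in pool:
    t.start()

# the packet handler only enqueues, which is cheap
for packet in [("aa:bb:cc:dd:ee:ff", -60), ("11:22:33:44:55:66", -72)]:
    work.put(packet)

work.join()                       # wait until every report is handled
for _ in pool:
    work.put(None)                # one sentinel per worker
for t in pool:
    t.join()

print(sorted(results))
```

The key point is that `work.put()` is the only thing the packet callback pays for; the slow network I/O happens in the long-lived workers.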

You might want to check into non-blocking IO (I learned about NIO in a Java book). I have only read about it in server software for Java, but essentially it is another way of doing asynchronous server operations, and it is supposed to be an alternative to threading on the server. NIO is very complex, however, and I am not sure whether Python supports it. Essentially it is a process whereby the program, through a clever API, can process many inbound/outbound connections via some very specific control structures/algorithms. Modern servers might have been very different today if people had had better APIs/algorithms for NIO when servers were first being built. Also, hardware today typically makes threading tremendously cheaper (so long as you don't stomp the connection). The overhead of threading is one of the reasons we have problems with DoS attacks: servers spawn a thread each time a connection is made, so if you stomp the connection enough you deplete CPU cycles etc.

It also looks like every time you receive a packet you are printing to the console, possibly more than once. This is great for debugging, but you should really avoid an overly chatty program; console output also slows the program down. Consider only one print operation per packet.

400 MHz MIPS processor, 16 MB ROM and 64 MB RAM

I am pretty sure this is only a single-core processor, so your threads will probably run serially (which suggests to me that some kind of non-blocking IO approach would help here).

Why don't you try running this program on a laptop? Python is cross-platform, so this should work. I would be interested to know how slow this program actually is on a modern CPU.

Edited by overwraith

Hi! Thank you for your help.

Before running the script on the Pineapple, I wrote the code on my laptop; in comparison, my PC executes the code much faster and seems to capture more packets than the Pineapple.

I am aware of the several prints to the screen; those were for debugging, but even if I remove them the consumption is still high.

I am creating a thread to send the info of each packet so that if the internet connection is slow, the script keeps working; the thread can take as long as required to successfully send the packet info to my server. (I am using Google App Engine for the webservice, so on the server side I am fully covered.)

Do you think that instead of creating a thread I should send the data serially? Could that create a bottleneck?

Thanks

Edited by kavastudios

It depends. If there is some form of NIO programming in Python, it would essentially process all of the received packets a little bit at a time. It is a hard concept to explain; you may have to do a little reading on it. It works just as well as threading does, with a little less overhead. Understand, I like threading, but I am wondering whether the object creation in this particular instance is a bit much for the Pineapple. Don't single-thread it if you aren't going to use non-blocking IO. Read a few articles on non-blocking IO; I think you will be able to understand it, and to find out whether it actually exists in Python. Also look for some example code.

...

I just saw the exporter code; there is a call to sleep if something goes wrong. I just realized that if something goes wrong a little too often, you will have a lot of threads waiting. That wait is essentially telling the thread to idle; is it really necessary?
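For what it's worth, a bounded retry with exponential backoff at least caps how many attempts a stuck thread makes instead of retrying once and then crashing. A sketch (the `flaky` sender here is a made-up stand-in for the real HTTP call):

```python
import time

def send_with_retry(send, max_tries=3, base_delay=0.5):
    """Try send() up to max_tries times, doubling the delay after each failure."""
    delay = base_delay
    for attempt in range(1, max_tries + 1):
        try:
            return send()
        except IOError:
            if attempt == max_tries:
                raise              # give up after the last attempt
            time.sleep(delay)
            delay *= 2             # exponential backoff

# demo with a fake sender that fails twice, then succeeds
calls = {'n': 0}
def flaky():
    calls['n'] += 1
    if calls['n'] < 3:
        raise IOError("connection lost")
    return 200                     # pretend HTTP status code

status = send_with_retry(flaky, max_tries=3, base_delay=0.01)
print(status)   # 200, after two failed attempts
```

In the real script the `send` callable would wrap the `requests.get(urlg)` line, and you would catch `requests.RequestException` rather than `IOError`.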

You might also be able to buffer the data you send, but I don't think it is really necessary, and I am unsure whether it would strain the Pineapple's memory.

Edited by overwraith

Place a sleep and a print message before every piece of code you think is stressful. While you monitor your CPU, launch your script.

Sleep 3

Puts("we are now here!")

If you can't identify the problem, then perform these same actions ON EVERY LINE...

If I need to find the problem for you, throw me a Benjamin

: -)


Meraki's location feature works via mesh: it basically triangulates devices on a defined floor plan where the access point locations are known and there is more than one AP. You can judge distance based on signal strength with a single AP, but you'd never really know exactly where a device is, just that it's so many feet from the AP.
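For the single-AP case, the usual rough RSSI-to-distance conversion is the log-distance path-loss model. The reference power and path-loss exponent below are illustrative guesses; in practice they have to be calibrated per environment:

```python
def rssi_to_distance(rssi_dbm, rssi_at_1m=-40.0, path_loss_exp=2.5):
    """Estimate distance in metres from RSSI via the log-distance model.

    d = 10 ** ((P_1m - RSSI) / (10 * n))
    where P_1m is the RSSI measured at 1 m and n is the path-loss exponent
    (about 2 in free space, higher indoors). Both constants are guesses here.
    """
    return 10 ** ((rssi_at_1m - rssi_dbm) / (10.0 * path_loss_exp))

print(rssi_to_distance(-40))   # 1.0 m at the reference power
print(rssi_to_distance(-65))   # 10.0 m with n = 2.5
```

With three or more APs reporting distances like this, the position can then be trilaterated on the floor plan, which is essentially what the mesh setup above does.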



Hi @schuchwun

That's precisely what I want to do: place at least 3 Pineapples in a closed space. I've also seen Meraki routers providing those heatmaps with just one router, even in an open space with no other Meraki router nearby (not even routers that don't belong to me). I tried to figure out how they do that, and the only answer I can think of is that maybe each antenna in the router is internally a separate radio, so they can check which antenna received the packet first (microseconds of difference, and also differences in milliwatts of perceived power).

I will modify my code to call fewer functions and remove the sleep (I added it so that in case of a momentary loss of connection, the info is kept in memory).

Also, how could I implement a buffer? Maybe an array, and when this array reaches X number of packets, call a function to send the info to the server.
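Exactly that idea, as a sketch: a size-bounded buffer that flushes a whole batch at once. `flush_fn` is a placeholder for whatever single request would send the batch to the server (here a plain list collects the batches so the behaviour is visible):

```python
class PacketBuffer(object):
    """Collect probe records and flush them in batches of `size`."""

    def __init__(self, size, flush_fn):
        self.size = size
        self.flush_fn = flush_fn   # e.g. one HTTP POST carrying the whole batch
        self.items = []

    def add(self, record):
        self.items.append(record)
        if len(self.items) >= self.size:
            batch, self.items = self.items, []
            self.flush_fn(batch)   # `size` records leave in a single request

batches = []
pbuf = PacketBuffer(size=3, flush_fn=batches.append)
for mac in ["aa", "bb", "cc", "dd"]:
    pbuf.add({"source": mac, "pwr": -60})

print(len(batches))        # 1 full batch flushed
print(len(pbuf.items))     # 1 record still waiting for the next batch
```

A time-based flush (send whatever is buffered every N seconds, even if not full) would complement this, so a lone device's probe is not stuck waiting for the batch to fill.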

I'll keep you posted; this could maybe be the beginning of an open-source RTLS project.

Thanks.


UPDATE:

I have found it is possible to filter directly from the sniff function:

def main():
    print "[%s] Starting scan" % datetime.now()
    # BPF filter: byte 26 of the frame == 0x40 (probe request), applied in the kernel
    sniff(iface=sys.argv[1], prn=PacketHandler, filter='link[26] = 0x40', store=0)

With that, the CPU consumption when I run it on my PC is between 1% and 3%, but when I run it on the Pineapple, the script crashes and throws this error:

Traceback (most recent call last):
  File "snrV2.py", line 66, in <module>
    main()
  File "snrV2.py", line 63, in main
    sniff(iface=sys.argv[1],prn=PacketHandler, filter='link[26] = 0x40', store=0)
  File "/usr/lib/python2.7/site-packages/scapy/sendrecv.py", line 550, in sniff
    s = L2socket(type=ETH_P_ALL, *arg, **karg)
  File "/usr/lib/python2.7/site-packages/scapy/arch/linux.py", line 460, in __init__
    attach_filter(self.ins, filter)
  File "/usr/lib/python2.7/site-packages/scapy/arch/linux.py", line 132, in attach_filter
    s.setsockopt(SOL_SOCKET, SO_ATTACH_FILTER, bpfh)
  File "/usr/lib/python2.7/socket.py", line 224, in meth
    return getattr(self._sock,name)(*args)
socket.error: [Errno 99] Protocol not available

I think I should update libpcap on the Pineapple, but how do I do it?

On my PC I have libpcap version 1.5.3,

and the Pineapple has libpcap version 1.1.1.
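If updating libpcap turns out to be impractical, scapy's `sniff` also accepts an `lfilter` argument: a plain Python callable applied to each packet, so no kernel BPF support is needed. The filtering cost moves back into the interpreter, so it won't match the 1%-3% of the in-kernel filter, but it avoids the `Protocol not available` error. A sketch, with stand-in packet objects since the real call needs root and a monitor-mode interface:

```python
def is_probe_request(pkt):
    # Dot11 management frame (type 0), probe request (subtype 4)
    return getattr(pkt, "type", None) == 0 and getattr(pkt, "subtype", None) == 4

# with scapy this would be:
#   sniff(iface=sys.argv[1], prn=PacketHandler, lfilter=is_probe_request, store=0)

# quick check with fake packet objects standing in for scapy's Dot11 packets
class Fake(object):
    def __init__(self, frame_type, frame_subtype):
        self.type, self.subtype = frame_type, frame_subtype

print(is_probe_request(Fake(0, 4)))   # True: probe request
print(is_probe_request(Fake(0, 8)))   # False: beacon frame
```

Since `lfilter` runs before `prn`, the `PacketHandler` type/subtype check becomes redundant and can be dropped.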
Edited by kavastudios
