
Best way to loop this script

So I'm working on a little mashup using Snarl and PHP. Snarl is basically a notification system for Windows, similar to Growl on the Mac. It displays notifications on screen (little boxes in the lower-right by default).

There is a spiffy little Snarl command-line tool called snarl_command.exe which accepts the arguments /T # for the number of seconds to display the notification, /M with "Subject" and "Body", and the path to a .PNG to use as an icon.

What I'm trying to do is fetch an RSS feed every few minutes and display the latest feed item as a notification. Ultimately I'll use this for a simple Twitter notification script based on an RSS feed from Summize (now search.twitter.com). The feed http://search.twitter.com/search.atom?q=hak5 displays the results whenever anyone twitters with the string hak5. Of course this could be used for any feed, not just Twitter, but hey, I like to see what people are saying about the show ;)

I'm using the MagpieRSS class to handle RSS fetching and processing. It works great at grabbing the feed and passing the latest item to the Snarl command-line utility, but I need help deciding how best to loop the program.

I thought about telling it to sleep for 5 minutes and then loop, but if nothing has changed I'll get a duplicate notification and eventually go stir crazy.

What do you guys think would be the best way to go about looping this so that it only displays unseen feed items every few minutes?

<?php
require_once('magpierss/rss_fetch.inc');

$url = "http://search.twitter.com/search.atom?q=hak5";
$num_items = 1;
$rss = fetch_rss($url);
$items = array_slice($rss->items, 0, $num_items);

foreach ($items as $item) {
	$href  = $item['link'];
	$title = $item['title'];		// Really we just care about the title
	$desc  = $item['description'];	// in case we need link and description later

	echo "Title: $title\n";  // comment out when not debugging
	//system("snarl_command.exe /T 15 /M \"New Tweet!\" \"$title\" >nul");
}
?>

It requires MagpieRSS's rss_fetch.inc, which can be downloaded from http://magpierss.sourceforge.net/

Thoughts?


It looks like we're working on something very similar here, at least as far as trying to grab results from Summize/Twitter and do stuff with them.

I spent a bit of time last week working on a script (also using Magpie) to essentially mimic the currently disabled Track feature on Twitter with a Summize RSS feed, by parsing the results and SMSing them to me through Teleflip. And of course, I'm willing to share with the community, so here's what I've got:

<?php

/*
 * TwitterTrack
 * A useless hack for supplementing Twitter's tracking until it's fixed
 *
 * Nick Tabick - nicktabick@gmail.com
 */

// Start by importing MagpieRSS
require_once('./rss_fetch.inc');
require_once('./rss_utils.inc');

// Get the last run time from disk (in epoch)
// I used intval() here because PHP seemed to assume the value was a string otherwise.
$lastrun = intval(file_get_contents('./lastrun.txt'));

// Define the RSS Url for Summize/Twitter Search we're using, and get it.
$rss = fetch_rss('http://search.twitter.com/search.atom?q=%40nicktabick+OR+techcentric');

// Iterate through each item returned in the feed
foreach ($rss->items as $item) {
   // Convert the time to a usable standard (epoch)
   $published = parse_w3cdtf($item['published']);
   
   // If the item was published at or after our last run...
   if ( $published >= $lastrun ) {
      // ...get all the fun stuff out of the feed and e-mail it to Teleflip for SMS
      mail('nonono@youcanthavethis', 'Tweet', ' ' . preg_replace('/\s\((\w+)\)/', '', $item['author_name']) . ': ' . $item['title'] . ' (' . date('m/d/y H:i',$published) . ')', "From: useyourown@ccount");
   }
}

// Dump the current time (in epoch) so we can use it as a base value for next run.
// The + 300 is to help sync the difference in clocks.  This can be edited as necessary.
file_put_contents('./lastrun.txt',(date('U') + 300));

?>

The time difference math is, like the comment says, to help with differences in timestamps and avoid duplicates if your clocks are off. However, I just came to the realization that while this works, perhaps a more effective option would be to store the last timestamp from the feed.
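A rough sketch of that alternative, assuming each item's published time has already been converted to epoch seconds (the function name and item shape here are made up for illustration):

```php
<?php
// Instead of saving time() + 300, keep only items newer than the last
// timestamp we actually saw in the feed, and persist that timestamp.
function filter_new_items($items, $lastSeen)
{
    $fresh  = array();
    $newest = $lastSeen;
    foreach ($items as $item) {
        if ($item['published'] > $lastSeen) {
            $fresh[] = $item;
            $newest  = max($newest, $item['published']);
        }
    }
    // Write $newest to lastrun.txt instead of the local clock time
    return array($fresh, $newest);
}
```

That removes the clock-skew fudge factor entirely, since both the comparison value and the stored value come from the feed itself.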

This script should be friendly to any difference in time zones, as the timestamps are based upon GMT anyway. (I say 'epoch time' in my comments; by 'epoch time', I mean the number of seconds since the Unix epoch.) My script uses e-mail to send SMS notifications to my phone via Teleflip, but it would be easily adapted with the appropriate system() command from your source, Darren.

Feel free to improve this script as necessary, or do whatever with it. Credit would be lovely, but considering there's always someone who's going to rip off work, I really don't care. This didn't take that long to write, anyway.

Happy hacking!

EDIT: I forgot to mention, but I also made a small edit to rss_utils.inc (one of the Magpie files) that may or may not be necessary. If the script doesn't work at first, try changing this line in rss_utils.inc (it should be line 28):

$pat = "/(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2})(:(\d{2}))?(?:([-+])(\d{2}):?(\d{2})|(Z))?/";

to

$pat = "/(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2})(Z)?/";

If it isn't noticeable already, this is the regular expression used to parse the time in the RSS feed.
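To sanity-check the simplified pattern, here's a quick standalone test of it against a Zulu timestamp, converted to epoch seconds with gmmktime() roughly the way parse_w3cdtf() does (minus the offset handling; the helper name is made up):

```php
<?php
// Summize timestamps are GMT, so the simplified pattern only needs the
// date, the time, and an optional trailing Z.
$pat = "/(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2})(Z)?/";

function w3cdtf_to_epoch($pat, $ts)
{
    if (!preg_match($pat, $ts, $m)) {
        return -1;  // signal a parse failure
    }
    // gmmktime(hour, minute, second, month, day, year)
    return gmmktime($m[4], $m[5], $m[6], $m[2], $m[3], $m[1]);
}
```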


I don't know any PHP anymore (forgot it long ago), but wouldn't a while loop work fine? With some kind of escape condition.


Maybe you need to store the downloaded file and compare it with the previously downloaded one. If they are the same, discard it and end the process; if they are different, e.g. by size (or you could even MD5 hash the files for comparison against the new ones), update the RSS feed; otherwise end the script and wait for x amount of time again. Just an idea.
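In sketch form, that comparison might look like this (the function name and cache file are placeholders; it hashes the raw feed text rather than individual items):

```php
<?php
// Compare the freshly fetched feed against a cached hash of the previous
// fetch; only reprocess when the content has actually changed.
function feed_changed($feedContent, $cacheFile)
{
    $newHash = md5($feedContent);
    $oldHash = file_exists($cacheFile) ? trim(file_get_contents($cacheFile)) : '';
    if ($newHash === $oldHash) {
        return false;           // identical feed, nothing to do
    }
    file_put_contents($cacheFile, $newHash);
    return true;                // feed changed, go process it
}
```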


10goto10's code below is much cleaner and works. I just changed the two lines from c:\\rsslogs.txt to ./rsslogs.txt and added some misc checking of the hashes. His hash check routine wasn't working for me, but his MD5 part was.

main.php

<?php
header( 'refresh: 1800; url=./redirect.php' );

require_once('magpierss/rss_fetch.inc');

$url = "http://search.twitter.com/search.atom?q=hak5";
$num_items = 1;
$rss = fetch_rss($url);
$items = array_slice($rss->items, 0, $num_items);

// Read the checksum of the last seen item so we can check for duplicates
$previous_itemsData = '';
$previous_items = @fopen("./rsslogs.txt", "r");
if ($previous_items) {
    while (!feof($previous_items)) {
        $buffer = fgets($previous_items, 4096);
        $previous_itemsData = $buffer;
    }
    fclose($previous_items);
}

$item_checksum = '';

foreach ($items as $item) {
    $href  = $item['link'];
    $title = $item['title'];         // Really we just care about the title
    $desc  = $item['description'];   // in case we need link and description later

    $item_checksum = md5($href);     // Use an MD5 hash of the link as a unique identifier

    // Check if we have seen this item before
    if ($item_checksum !== $previous_itemsData) {
        echo "<font color='green'>Title: $title</font><br><br>";  // comment out when not debugging
        //system("snarl_command.exe /T 15 /M \"New Tweet!\" \"$title\" >nul");
    } else {
        echo "<font color='red'>Nothing new yet!</font><br><br>";
    }
}

// Write logfile
sleep(5);
file_put_contents('./rsslogs.txt', $item_checksum);
?>

redirect.php

<?php
header( 'refresh: 5; url=./main.php' );
?>


<?php
require_once('magpierss/rss_fetch.inc');

$url = "http://search.twitter.com/search.atom?q=hak5";
$num_items = 1;
$rss = fetch_rss($url);
$items = array_slice($rss->items, 0, $num_items);

// Open last logfile as array so we can check for duplicates
$previous_items = file('c:\\rsslogs.txt');
$current_items = array();

foreach ($items as $item) {
    $href = $item['link'];
    $title = $item['title'];        // Really we just care about the title
    $desc = $item['description'];   // in case we need link and description later

    $item_checksum = md5($href);    // Use an MD5 hash of the link as unique identifier
    
    // Check if we have seen this item before
    if (!in_array($item_checksum, $previous_items))
    {
        echo "Title: $title\n";  //turn off when not debugging
        //system("snarl_command.exe /T 15 /M \"New Tweet!\" \"$title\" >nul");
    }
    
    // Add this item to the new log
    $current_items[] = $item_checksum."\n";    
}

// Write logfile
file_put_contents('c:\\rsslogs.txt', $current_items);
?>

Darren, try this. Like digip suggested, it makes an MD5 hash of the link you get back from the RSS class and saves it to a logfile. The MD5 is used as a unique identifier the script can check against. Next time the script runs and gets a new RSS message, it makes an MD5 hash of the link and checks whether that hash is already in the log. If it is, it's a duplicate and the script does nothing. If it's not in the log, it's a new message and you can send it to snarl_command.exe.

I haven't run this code myself, but I figure it'll run. I don't know how file/file_put_contents works on Windows machines, so look into that if it doesn't work! (I guess Windows maybe uses a different newline in the $current_items[] = $item_checksum."\n"; bit.) You could also expand this to keep logs of the last 100 messages instead of just the ones from the last run. Room to play :)
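For that "last 100 messages" idea, one minimal approach (the helper name is made up) is to append the new checksums and trim the log to its newest entries before writing it back:

```php
<?php
// Keep a rolling window of the most recent checksums instead of only
// the ones from the last run.
function append_and_trim($log, $newHashes, $keep = 100)
{
    $log = array_merge($log, $newHashes);
    return array_slice($log, -$keep);  // newest $keep entries
}
```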


@darren - do you have the code to make snarl work? I have downloaded Snarl_CMD.exe from source forge, but I don't quite get how you pipe it to Snarl_CMD.exe to make the alerts. I have Snarl_CMD.exe set in my path variables as well as sitting in the same directory as my php scripts.

edit: @10goto10 - great job with the code there! I added the redirect part to loop the process and changed where it puts the log so it isn't Windows-specific. Now it will work on any OS with PHP, since it puts the log in the same directory as the scripts. Also, the compare part of yours wasn't really working for me after I checked it, so I rewrote it a bit. Now it won't echo the title when the hash matches, whereas yours echoes it either way. This way we can avoid calling Snarl every time.


Thanks guys I'll test the code in the morning and report back.

@digip: Snarl_CMD.exe seems to be iffy depending on which version of Snarl you have installed. I'm using snarl_command.exe, which was included with RC1 of Snarl. It works fine with the latest versions. I've posted it here so you don't have to go digging: http://www.darrenkitchen.net/temp/snarl_command.rar

Snarl_CMD.exe works too, but instead of /M for message you specify snShowMessage. Try Snarl_CMD.exe snShowMessage 10 "subject" "body"
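From PHP, building that command line with escapeshellarg() keeps quotes in the tweet text from breaking the arguments (the helper below is just a sketch):

```php
<?php
// Build the Snarl_CMD.exe invocation; escapeshellarg() quotes the
// subject and body so embedded quotes can't break the command line.
function snarl_cmd($subject, $body, $secs = 10)
{
    return 'Snarl_CMD.exe snShowMessage ' . intval($secs) . ' '
         . escapeshellarg($subject) . ' ' . escapeshellarg($body);
}

// system(snarl_cmd('New Tweet!', $title));
```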


@Digip: I kinda assumed something would go wrong with the file() command (it reads the lines of a file into an array). I guess the problem was that it didn't return a proper array because of how the log file was written? (Changing "\n" to "\r\n" could fix that.)
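For what it's worth, trimming the line endings before the in_array() check sidesteps the newline mismatch entirely, whatever line endings the log was written with. A small standalone demo (the file here is a throwaway temp file):

```php
<?php
// file() leaves "\n" (or "\r\n") on the end of every element, so a bare
// in_array() against an md5 hex string never matches. rtrim() strips
// trailing \r, \n, spaces, and tabs from each line before comparing.
$tmp = tempnam(sys_get_temp_dir(), 'rss');
file_put_contents($tmp, "abc123\r\ndef456\n");

$previous_items = array_map('rtrim', file($tmp));
// $previous_items is now array('abc123', 'def456'), safe for in_array()

unlink($tmp);
```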

