Darren Kitchen Posted October 9, 2010
I'm constantly processing files. I'll run a tool on one file, wait until it finishes, and run another tool on the same file. I can't start the second task until the first ends, and the only way to know it has finished is when the file size stops increasing. This is what I've got; it checks that the date modified hasn't changed in the last 5 minutes:
    white [ $[ $(stat -f "%m" "/tmp/$FILE") + 300 ] -gt $(date +%s) ]; do sleep 2; done; /tool2.sh /tmp/$FILE
This works great on a single file. Now the problem is I have a batch of files all being processed by the first tool. Each has a unique variable appended to the string. For example:
    file0001--a
    file0001--b
    file0001--c
I don't want to start tool2.sh /tmp/file0001--* until the last file in the batch has finished processing. Again, the only way to know is to check the date modified. Unfortunately they don't complete sequentially. Sometimes file0001--a will finish last. Sometimes file0001--c will finish last.
So my question is, how would I go about adapting the code above to check that all files in this series have completed processing and haven't been touched for the last 5 minutes? I have tried wildcards with the stat command and it doesn't seem to work. Also, note that the stat command above uses "%m", which is the modification time in seconds since the epoch on the version of stat that I have. It's a BSD box.
Thanks. I appreciate any help you can offer.
digip Posted October 9, 2010 (edited)
Did you write these tools, or are these things you installed? Shouldn't the job be able to give you a return code when it's done: 0 if it completed, or some X if it failed, where X maps to a predefined error code? Most systems have this sort of thing, and it might be possible to modify the code to tell you when the jobs are done. Then check the value: if it finished OK, do step b, else alert with the error code.
edit: I might not be explaining myself very well, but something similar to this: http://stackoverflow.com/questions/393845/...r-code-strategy
I know when we had work on the mainframe at my last job, we checked each job for the return code, and anything over a 04 meant there was an abend or problem. How that works in Linux, I'm not sure though.
edit: Found this. To check the exit status in a script, you may use the following pattern:
    somecommand argument1 argument2
    RETVAL=$?
    [ $RETVAL -eq 0 ] && echo Success
    [ $RETVAL -ne 0 ] && echo Failure
http://linuxcommando.blogspot.com/2008/03/...tatus-code.html
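For comparison, the same exit-status pattern might look like this in Python. This is only a sketch: `run_and_check` is a made-up helper name, and the child command is a placeholder (a second Python process that exits with status 0) standing in for whatever tool is actually being run.

```python
import subprocess
import sys

def run_and_check(cmd):
    """Run cmd (a list of arguments) and report whether it exited with status 0."""
    result = subprocess.run(cmd)
    if result.returncode == 0:
        print("Success")
        return True
    print("Failure (exit status %d)" % result.returncode)
    return False

if __name__ == "__main__":
    # Placeholder command, not one of the tools from this thread.
    run_and_check([sys.executable, "-c", "raise SystemExit(0)"])
```

This only helps if the tool exits when it has finished and sets a meaningful status, which the thread has not confirmed at this point.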
SomeoneE1se Posted October 9, 2010
Are they the only files in that directory?
Mr-Protocol Posted October 9, 2010 (edited)
If you pass the script the number of files you will be producing, you can treat the batch as "complete" when a counter gets that high. So make the script take an integer into a variable, for example 5 if you have 5 files. Do your timestamp check, and if a file's timestamp hasn't changed in 5 minutes, mark "file[x]" as complete. So you will have an array file[x]. When all of file[1] through file[5] hold a 1 (meaning 1 = complete), continue on to your other script.
So to recap, because I suck at explaining: have a variable you set to make an array of that size (be careful, because arrays are 0-based, so the last index is the input number minus 1), check the files, and if one is complete set its entry in the array to 1. Make a function to check whether every entry in your array is a 1, then run the second script.
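The flag-array idea above could be sketched roughly as follows. The function names, the `now` parameter (there only to make the check repeatable), and the choice to treat a missing file as "not complete" are all assumptions for illustration, not something stated in the thread.

```python
import os
import time

def completion_flags(paths, quiet_seconds, now=None):
    """Return one flag per path: 1 if the file has gone quiet_seconds
    without being modified, otherwise 0."""
    now = time.time() if now is None else now
    flags = []
    for path in paths:
        try:
            idle = now - os.stat(path).st_mtime
        except OSError:
            # Assumption: a file that does not exist yet is not complete.
            flags.append(0)
            continue
        flags.append(1 if idle >= quiet_seconds else 0)
    return flags

def all_complete(flags):
    """True only when there is at least one flag and every flag is 1."""
    return len(flags) > 0 and all(flag == 1 for flag in flags)
```

A caller would build `paths` from the expected batch (for example the three `file0001--*` names), poll `completion_flags(paths, 300)` every few seconds, and start the second tool once `all_complete` returns True.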
Darren Kitchen Posted October 9, 2010
They are not the only files in the directory, but the batch of files all begin with the same name, i.e. file0001--a, file0001--b, file0001--c as well as file0002--a, file0002--b and file0002--c.
The job does not return any codes, sadly.
Mr-Protocol Posted October 9, 2010 (edited)
Use your timestamp checking code to make it change the variables in the array. I'm not exactly sure what your whole process is. If you want to discuss in a more live environment we can use Skype, IRC, or any other means of communication to work through possible solutions. I can explain better verbally than in text.
Edit: Just found a command you might be able to use instead of the timestamp. Combine lsof with grep and use the grep results to determine whether the files are open and in use: http://www.netadmintools.com/html/lsof.man.html
digip Posted October 9, 2010 (edited)
"The job does not return any codes sadly."
Maybe not by default, but have you tried modifying or creating a shell script that does the steps and checks the return codes at the end, or are you just typing big long strings of commands out at the terminal? I would say try using a script to automate the process and put checks in for each section or against each file. When all files are done and every return code = 0, move to the next step, else alert that the job borked and show the error level. If you could show us your process/code or what you are doing, we might be able to figure out a way to fix the problem without having to use timers to check for file activity.
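A wrapper along those lines might look like the sketch below, assuming the first tool runs as an ordinary child process that exits when it has finished with a file (the thread does not confirm this). `run_batch` and `batch_succeeded` are hypothetical names, and the demo commands are placeholder Python children rather than the real tools.

```python
import subprocess
import sys

def run_batch(commands):
    """Start every command at once, wait for all of them to exit,
    and return their exit statuses in the same order."""
    procs = [subprocess.Popen(cmd) for cmd in commands]
    return [proc.wait() for proc in procs]

def batch_succeeded(statuses):
    """True only if the batch was non-empty and every status is 0."""
    return bool(statuses) and all(status == 0 for status in statuses)

if __name__ == "__main__":
    # Placeholder jobs; in practice each entry would be the first tool
    # invoked on one file of the batch.
    jobs = [[sys.executable, "-c", "raise SystemExit(0)"] for _ in range(3)]
    statuses = run_batch(jobs)
    if batch_succeeded(statuses):
        print("All jobs returned 0, safe to start the next step")
    else:
        print("Job failed, exit statuses: %s" % statuses)
```

Because the wrapper waits on the processes themselves, it does not matter which file in the batch finishes last, and no timer on the modification date is needed.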
Netshroud Posted October 9, 2010
Would this work?
start.sh:
    for file in /tmp/file00??--?; do ./wait.sh "$file" & done
wait.sh (almost exactly what you had above; $1 is already the full path):
    white [ $[ $(stat -f "%m" "$1") + 300 ] -gt $(date +%s) ]; do sleep 2; done; /tool2.sh "$1"
digip Posted October 10, 2010
Shouldn't it also say while, and not white?
Netshroud Posted October 10, 2010
Probably, I just copy+pasted what Darren had in the OP though.
radarstorm Posted October 10, 2010
I find shell scripts arcane and difficult to understand, especially the one-liners! I would use a delightful Python script, for example:
    import glob, os, sys, time
    if len(sys.argv) != 3:
        print('Usage: %s wildcard time_to_wait' % sys.argv[0])
        raise SystemExit(1)
    stub = sys.argv[1]
    try:
        wait = int(sys.argv[2])
    except ValueError:
        print('Invalid wait time, use an integer number of seconds')
        raise SystemExit(1)
    done = False
    while not done:
        now = time.time()
        # Age in seconds of every file currently matching the wildcard
        modify_times = [(now - os.stat(filename).st_mtime) for filename in glob.glob(stub)]
        if not modify_times:
            print('No files match %s' % stub)
            raise SystemExit(1)
        modify_times.sort()  # ascending sort, so the first one is the most recent
        if modify_times[0] > wait:
            done = True
        else:
            # Don't kill the cpu by sitting in this loop forever. The absolute quickest that we
            # could exit this loop is wait - modify_times[0], so sleep for that long.
            time.sleep(wait - modify_times[0])
It waits until all of the files matched by a wildcard (e.g. file*) are older than a given number of seconds. You need to quote the wildcard so it doesn't get expanded by the shell, though:
    ./wait.py "file*" 300