Need Help With A Shell Script

Darren Kitchen · October 9, 2010

I'm constantly processing files. I'll run a tool on one file, wait until it finishes, and then run another tool on the same file. I can't start the second task until the first ends, and the only way to know it has finished is when the file size stops increasing. This is what I've got; it checks that the date modified hasn't changed in the last 5 minutes.

white [ $[ $(stat -f "%m" "/tmp/$FILE") + 300 ] -gt $(date +%s) ]; do sleep 2; done; /tool2.sh /tmp/$FILE

This works great on a single file. Now the problem is I have a batch of files all being processed by the first tool. Each has a unique variable appended to the string. For example:

file0001--a

file0001--b

file0001--c

I don't want to start tool2.sh /tmp/file0001--* until the last file in the batch has finished processing. Again the only way to know is to check the date modified. Unfortunately they don't complete sequentially. Sometimes file0001--a will finish last. Sometimes file0001--c will finish last.

So my question is, how would I go about adapting the code above to check that all files in this series have completed processing and haven't been touched for the last 5 minutes?

I have tried wildcards with the stat command and it doesn't seem to work.

Also, note that the stat command above uses "%m", which is the modification time (in seconds since the epoch) on the version of stat that I have. It's a BSD box.

Thanks. I appreciate any help you can offer.

digip · October 9, 2010

Did you write these tools, or are they things you installed? Shouldn't the job be able to give you a return code when it's done? Say, return code 0 if it completes, and return code X if it fails, where X maps to some predefined error code. Most systems have this sort of thing, and it might be possible to mod the code to tell you when the jobs are done. Then check the value: if it finished OK, do step b, else alert with the error code.

edit: I might not be explaining myself very well, but something similar to this:

http://stackoverflow.com/questions/393845/...r-code-strategy

I know that when we had work on the mainframe at my last job, we checked each job for the return code, and anything over a 04 meant there was an abend or problem. How that works in Linux, I'm not sure though.

edit: Found this

To check the exit status in a script, you may use the following pattern:
somecommand argument1 argument2

RETVAL=$?

[ $RETVAL -eq 0 ] && echo Success

[ $RETVAL -ne 0 ] && echo Failure

http://linuxcommando.blogspot.com/2008/03/...tatus-code.html

Edited October 9, 2010 by digip

SomeoneE1se · October 9, 2010

are they the only files in that directory?

Mr-Protocol · October 9, 2010

If you give the script the number of files you will be producing as input, you can treat the batch as "complete" when a counter gets that high.

So make the script read an integer into a variable, for example 5 if you have 5 files.

Do your timestamp check and if the timestamp hasn't changed in 5 minutes, mark that "file[x]" as complete. So you will have an array file[x].

When all file[1] - file[5] have a 1 as the data (meaning 1 = complete) then continue to do your other script.

So to recap, because I suck at explaining: have a variable you set to make an array of that size (be careful, because arrays are 0-based, so the last index is the input number minus 1), check the files, and if one is complete set its element of the array to 1. Make a function to check whether every element in your array is a 1, then run the second script.
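A minimal sketch of the idea above, assuming a counter is an acceptable stand-in for the array of flags: a file counts as complete when its modification time is at least the given number of seconds old, and the batch is ready when the counter equals the number of files. The helper names `mtime_of` and `batch_is_quiet` are hypothetical, not from the thread, and the GNU-then-BSD fallback for `stat` is an assumption added so the sketch runs on either kind of system.

```shell
#!/bin/sh
# Sketch only: mtime_of and batch_is_quiet are hypothetical helper names.

# Print a file's modification time in seconds since the epoch.
# Tries the GNU syntax first, then the BSD syntax used in the thread.
mtime_of() {
    stat -c %Y "$1" 2>/dev/null || stat -f %m "$1" 2>/dev/null
}

# Usage: batch_is_quiet SECONDS FILE...
# Succeeds only when every FILE was last modified at least SECONDS ago.
batch_is_quiet() (
    quiet=$1
    shift
    total=$#
    complete=0
    now=$(date +%s)
    for f in "$@"; do
        m=$(mtime_of "$f") || exit 1      # missing file: batch is not ready
        if [ $((now - m)) -ge "$quiet" ]; then
            complete=$((complete + 1))    # this file counts as complete
        fi
    done
    [ "$total" -gt 0 ] && [ "$complete" -eq "$total" ]
)

# Intended use, with the paths from the thread:
#   until batch_is_quiet 300 /tmp/file0001--*; do sleep 2; done
#   /tool2.sh /tmp/file0001--*
```

Letting the shell expand the wildcard and checking each file in a loop sidesteps the problem of `stat` printing one line per file. Note that if the wildcard matches nothing, the literal pattern is passed through, the lookup fails, and the loop keeps waiting.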

Edited October 9, 2010 by Mr-Protocol

Darren Kitchen · October 9, 2010

They are not the only files in the directory but the batch of files all begin with the same name. ie file0001--a, file0001--b, file0001--c as well as file0002--a, file0002--b and file0002--c.

The job does not return any codes sadly.

Mr-Protocol · October 9, 2010

Use your timestamp checking code to make it change the variables in the array.

I'm not exactly sure what your whole process is. If you want to discuss it in a more live environment, we can use Skype, IRC, or any other means of communication to talk through possible solutions. I can explain better verbally than in text.

Edit: Just found a command you might be able to use instead of timestamp.

lsof

Combine it with grep to determine from the results whether the files are open and in use...

http://www.netadmintools.com/html/lsof.man.html
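A rough sketch of the lsof idea, assuming lsof is installed and that the first tool keeps each output file open for the whole time it is writing. The helper name `any_open` is hypothetical. It tests for non-empty output rather than relying on lsof's exit status, because lsof also reports an error when only some of the named files are not open.

```shell
#!/bin/sh
# Sketch only: any_open is a hypothetical helper name.

# Usage: any_open FILE...
# Succeeds if lsof reports at least one process holding any FILE open.
# lsof -t prints bare PIDs with no header, so any output means "in use".
any_open() {
    [ -n "$(lsof -t -- "$@" 2>/dev/null)" ]
}

# Intended use, with the paths from the thread:
#   while any_open /tmp/file0001--*; do sleep 2; done
#   /tool2.sh /tmp/file0001--*
```

Two caveats: lsof may only show files opened by processes the current user is allowed to inspect, and a tool that closes and reopens its output between writes would look idle in between, so this may need to be combined with the timestamp check.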

Edited October 9, 2010 by Mr-Protocol

digip · October 9, 2010

"The job does not return any codes sadly."

Maybe not by default, but have you tried modifying or creating a shell script to do the steps, with a check at the end for the return codes? Or are you just typing big long strings of commands at the terminal? I would try using a script to automate the process, with checks for each section or against each file. When all files are done and every return code is 0, move to the next step; otherwise alert that the job is borked and show the error level.

If you could show us your process/code or what you are doing, we might be able to figure out a way to fix the problem without having to use timers to check for file activity.
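One way such a wrapper script could look, as a sketch under assumptions: it presumes the first tool runs in the foreground and exits when its file is finished, which the thread does not confirm. The function name `run_batch` is hypothetical. It starts the first tool on every file in the background, uses `wait` to collect each job's exit status, and only starts the second tool if every job succeeded.

```shell
#!/bin/sh
# Sketch only: run_batch is a hypothetical helper name.

# Usage: run_batch FIRST_TOOL SECOND_TOOL FILE...
# Runs FIRST_TOOL on each FILE in parallel, waits for all of them,
# then runs SECOND_TOOL once on the whole batch if nothing failed.
run_batch() (
    first=$1
    second=$2
    shift 2
    pids=
    for f in "$@"; do
        "$first" "$f" &
        pids="$pids $!"                   # remember each background job
    done
    failed=0
    for pid in $pids; do
        wait "$pid" || failed=$((failed + 1))
    done
    if [ "$failed" -ne 0 ]; then
        echo "$failed job(s) failed; not starting $second" >&2
        exit 1
    fi
    "$second" "$@"
)

# Intended use (the first tool's path is an assumption):
#   run_batch /tool1.sh /tool2.sh /tmp/file0001--*
```

Because `wait` returns as soon as each process exits, this needs no timers and finishes in whatever order the jobs complete. It does not help if the first tool hands its work to a daemon and returns early.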

Edited October 9, 2010 by digip

Netshroud · October 9, 2010

Would this work?

start.sh:

for file in /tmp/file00??--?; do ./wait.sh "${file#/tmp/}" & done

wait.sh: (almost exactly what you had above)

white [ $[ $(stat -f "%m" "/tmp/$1") + 300 ] -gt $(date +%s) ]; do sleep 2; done; /tool2.sh /tmp/$1

digip · October 10, 2010

Shouldn't it also say while, and not white?

Netshroud · October 10, 2010

Probably, I just copy+pasted what Darren had in the OP though.

radarstorm · October 10, 2010

I find shell scripts arcane and difficult to understand, especially the one liners! I would use a delightful python script, for example:

import os,sys,glob,time

if len(sys.argv) != 3:
    print('Usage: %s wildcard time_to_wait' % sys.argv[0])
    raise SystemExit(1)

stub = sys.argv[1]
wait = sys.argv[2]

try:
    wait = int(wait)
except ValueError:
    print('Invalid wait time, use an integer number of seconds')
    raise SystemExit(1)

done = False
while not done:
    now = time.time()
    #Age in seconds of every file matched by the wildcard
    modify_times = [(now-os.stat(filename).st_mtime) for filename in glob.glob(stub)]
    modify_times.sort() #ascending sort, so the first one is the most recent

    if not modify_times:
        print('No files match %s' % stub)
        raise SystemExit(1)

    if modify_times[0] > wait:
        done = True
    else:
        #Don't kill the cpu by sitting in this loop forever, the absolute quickest that we could
        #exit this loop is wait-modify_times[0], so sleep for that long
        time.sleep(wait-modify_times[0])

It waits until all of the files matched by a wildcard (e.g. file*) are older than a given number of seconds. You need to quote the wildcard so it doesn't get expanded by the shell, though: ./wait.py "file*" 300 for example.
