He forgot again to send the email to HELP.
Devdatta Majumder wrote:
>
> When you say that "Previously this problem never occurred", I am
> assuming that these jobs were running fine a couple of days ago but
> now they don't, i.e. all of a sudden the jobs can't find/open the
> input files? And you have been running these jobs via the condor
> batch system all this time, not interactively, right?
>
>
> Previously I could run jobs using *condor*... this problem started
> last week. Of course it is true that the datafiles then resided in my
> afs directory, but as I said, the files are large (some are 200MB), so I
> was forced to keep them in my /scratch area.
> Interactive running on a few events is fine; the problem is with the
> condor jobs.
>
> Interactive running on the login machine (i.e. cmsRun < input file)
> is different from running condor jobs. Are you using the FarmOut
> analysis scripts here to submit your jobs to condor?
>
>
> Yes, I am using the scripts from this webpage:
> http://www.hep.wisc.edu/cms/comp/index.html They are not the analysis
> scripts but farmoutRandomSeedJobs jobs.
>
>
> Assuming that you are familiar with condor submission mechanism
> (including the input file xfers option in condor), I see that your
> ascii file
> (/scratch/devdatta/BaurAsciiFiles/lhc_Wp_VeryLowCuts/baurWp_lhcVeryLowCuts.ascii)
> resides in the /scratch in login01 and that directory is not
> accessible from the worker nodes where your jobs end up when you
> submit via condor. So the file needs to be xfered to the worker
> nodes for the jobs to find it, and that can be configured in your
> condor command/JDL.
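>
> As a rough sketch of that mechanism (not your actual setup): a plain
> condor submit description file can ask condor to ship the ascii file to
> the worker node. The three transfer commands are standard condor submit
> commands; the wrapper script name run_cmsRun.sh and the output/log file
> names are hypothetical placeholders. How farmoutRandomSeedJobs builds its
> own submit file, and whether it has an option for extra input files, is
> not assumed here.
>
> ```
> # Sketch only: run_cmsRun.sh and the job.* file names are hypothetical.
> universe                = vanilla
> executable              = run_cmsRun.sh
>
> # Ask condor to copy files to the worker node instead of relying on a
> # shared filesystem such as AFS.
> should_transfer_files   = YES
> when_to_transfer_output = ON_EXIT
> transfer_input_files    = /scratch/devdatta/BaurAsciiFiles/lhc_Wp_VeryLowCuts/baurWp_lhcVeryLowCuts.ascii
>
> output                  = job.$(Cluster).$(Process).out
> error                   = job.$(Cluster).$(Process).err
> log                     = job.$(Cluster).log
> queue
> ```
>
> Transferred files land in the job's working directory on the worker
> node, so the cfg would have to refer to the file by its base name
> (baurWp_lhcVeryLowCuts.ascii) rather than by the /scratch path.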
>
>
> I am not really familiar with this mechanism of xfering files to worker
> nodes. I am using the following to submit my jobs from my afs directory:
>
> farmoutRandomSeedJobs <outputFileName> 100000 500 ~/CMSSW_2_0_9/ ~/CMSSW_2_0_9/src/GeneratorInterface/BaurWgamInterface/test/<cfgFileName>
>
>
> If you can provide some tips, it would be helpful. Or, could you manage
> to find some spare time to try one of my cfgs and certify that I am not
> doing something stupid?
>
>
> Just to make sure: these are ascii files of 200MB each?
>
>
> There are smaller files as well, but there are some with >100MB size. I
> am going to produce some more of these, but I require long-term storage.
>
>
> Both your AFS home area and the /scratch of the login machines are
> safe. We don't remove anything from /scratch that the users want to
> keep for a bit longer than the usual time. In that case it would be
> useful for us to know.
>
>
> I want to keep my datafile around for quite some time, at least until I
> have the whole gen+sim+digi+reco chain done.
>
>
> Condor jobs have the options to xfer input files directly to the
> worker nodes where the job can use them, eliminating the need to put
> big files in AFS for sharing. In that case you can have your big
> files in /scratch on any of the login machines.
>
> Let me know how I can be of further help in resolving this.
>
>
> Well, for now, I would like to resolve the problem of submitting jobs to
> condor where the input is in my scratch area and get the condor jobs
> running. Second, I would like to have access to some tape archive (I
> assume it would be /pnfs) where I can keep my datafiles for long-term
> storage.
>
> With regards,
>
> Devdatta.