Message14518

Author ajit
Recipients ajit, dan, dasu, rader, radtke, wcmaier
Date 2008.07.15 17:14
Content
He again forgot to send the email to HELP.


Devdatta Majumder wrote:
> 
>     When you say that "Previously this problem never occurred", I am
>     assuming that these jobs were running fine a couple of days ago, but
>     now they don't, i.e. all of a sudden you are seeing that the jobs
>     can't find/open the input files? And you have been running these
>     jobs via the condor batch system all this time, but not
>     interactively, right?
> 
> 
> Previously I could run jobs using *condor*... this problem started last
> week. Of course it is true that the datafiles then resided in my afs
> directory, but as I said, because the files are large (some 200 MB) I
> was forced to keep them in my /scratch area.
> Interactive running on a few events is fine; the problem is with the
> condor jobs.
> 
>     Interactive running on the login machine (i.e. cmsRun < input file)
>     is different than condor jobs. Are you using the FarmOut analysis
>     scripts here to submit your jobs to condor ?
> 
> 
> Yes, I am using the scripts from this webpage:
> http://www.hep.wisc.edu/cms/comp/index.html They are not the analysis
> scripts but farmoutRandomSeedJobs jobs.
> 
> 
>     Assuming that you are familiar with condor submission mechanism
>     (including the input file xfers option in condor), I see that your
>     ascii file
>     (/scratch/devdatta/BaurAsciiFiles/lhc_Wp_VeryLowCuts/baurWp_lhcVeryLowCuts.ascii)
>     resides in the /scratch in login01 and that directory is not
>     accessible from the worker nodes where your jobs end up when you
>     submit via condor. So the file needs to be xfered to the worker
>     nodes for the jobs to find it and that can be configured in your
>     condor command/JDL.
> 
> 
> I am not really familiar with this mechanism of xfering files to worker
> nodes. I am using the following to submit my jobs from my afs directory:
> 
> farmoutRandomSeedJobs <outputFileName> 100000 500 ~/CMSSW_2_0_9/ ~/CMSSW_2_0_9/src/GeneratorInterface/BaurWgamInterface/test/<cfgFileName>
> 
> 
> If you can provide some tips, it would be helpful. Or, could you manage 
> to find some spare time to try one of my cfgs and certify that I am not 
> doing something stupid?
>  
> 
>     Just to make sure: these are 200MB (each) ascii files?
> 
> 
> There are smaller files as well, but there are some with >100MB size. I
> am going to produce some more of these, but I require long-term storage.
>  
> 
>     Both your AFS home area and the /scratch of the login machines are
>     safe. We don't remove anything from /scratch that the users want to
>     keep for a bit longer than the usual time. In that case it would be
>     useful for us to know.
> 
> 
> I want to keep my datafile around for quite some time, at least until I 
> have the whole gen+sim+digi+reco chain done.
>  
> 
>     Condor jobs have the options to xfer input files directly to the
>     worker nodes where the job can use them, eliminating the need to put
>     big files in AFS for sharing. In that case you can have your big
>     files in /scratch on any of the login machines.
> 
>     Let me know how I can be of further help in resolving this.
> 
> 
> Well, for now, I would like to resolve the problem of submitting jobs to 
> condor where the input is in my scratch area and get the condor jobs 
> running. Second, I would like to have access to some tape archive (I 
> assume it would be /pnfs) where I can keep my datafiles for long-term 
> storage.
> 
> With regards,
> 
> Devdatta.
>
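
For reference, the input-file transfer mentioned in the quoted reply can be sketched as below. This is only an illustration, assuming a plain condor_submit description rather than whatever farmoutRandomSeedJobs generates internally; the wrapper name run_cmsRun.sh and the helper build_submit_description are hypothetical. The submit commands used (should_transfer_files, when_to_transfer_output, transfer_input_files) are standard Condor submit-file commands.

```python
def build_submit_description(executable, input_files, n_jobs=1):
    """Return the text of a Condor submit description that ships the given
    input files to the worker node along with the job.

    'executable' is a hypothetical wrapper script; 'input_files' are paths
    on the submit machine (for example under /scratch on the login node).
    """
    lines = [
        "universe = vanilla",
        f"executable = {executable}",
        # Ask Condor to copy files instead of relying on a shared filesystem.
        "should_transfer_files = YES",
        "when_to_transfer_output = ON_EXIT",
        # Comma-separated list of files copied into the job's working directory.
        "transfer_input_files = " + ", ".join(str(p) for p in input_files),
        "output = job_$(Cluster)_$(Process).out",
        "error = job_$(Cluster)_$(Process).err",
        "log = job_$(Cluster).log",
        f"queue {n_jobs}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    ascii_file = (
        "/scratch/devdatta/BaurAsciiFiles/lhc_Wp_VeryLowCuts/"
        "baurWp_lhcVeryLowCuts.ascii"
    )
    # Print a description that could be saved to a file and passed to condor_submit.
    print(build_submit_description("run_cmsRun.sh", [ascii_file]))
```

Transferred files land in the job's working directory on the worker node, so the cfg would presumably need to refer to the ascii file by its base name rather than by its /scratch path.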
History
Date                 User  Action  Args
2008-07-15 17:14:20  ajit  set     recipients: + ajit, wcmaier, rader, dan, dasu, radtke
2008-07-15 17:14:20  ajit  link    issue5355 messages
2008-07-15 17:14:20  ajit  create