Issue4164

Title: condor status graphs need to be updated
Priority: normal
Status: chatting
Superseder:
Nosy List: dan, rader
Assigned To:
Topic: CMS, Condor, Monitoring
Group: IT

Created 2007.01.11 11:00 by dan.
Last changed 2007.02.08 10:55 by rader.

Messages
msg902 From: rader To: dan, rader Date: 2007.01.31 08:07
 > >  > The condor status graphs should be fixed so that when the
 > >  > "GeneralPurposeVM" is unclaimed, it reports the status of the
 > >  > "SuspendVM" instead.  Without this, our reports of GLOW utilization are 
 > >  > not showing the full utilization that really exists.
 > >  > 
 > >  > Here's an example of a condor_status query that does the right thing.  
 > >  > It shows the state of each general purpose VM or, if unclaimed, the 
 > >  > state of the corresponding suspension VM.
 > >  > 
 > >  > condor_status -pool glow.cs.wisc.edu -constraint 'IsGeneralPurposeVM && OpSys == "LINUX"' \
 > >  >  -f "%s\n" 'ifThenElse(State=?="Unclaimed",ifThenElse(SuspendVMClaimed,"Claimed","Unclaimed"),State)'
 > >
 > > So change
 > >
 > >  -constraint 'IsGeneralPurposeVM =?= True'
 > >
 > > to that crazy one above, right?
 > 
 > Notice that the constraint isn't all that crazy in the command above.  
 > What is crazy is the -f (-format) option to print out the state of the 
 > VM.  It prints the state of the VM in a way that takes into account the 
 > state of the suspension VM.

Sorry, I can't parse that ifThenElse format thing...

Here's what the graphing backend does...

condor_remoteuser_status runs "condor_status -constraint 'IsGeneralPurposeVM =?= True' -long"
and pretty-prints the results.

extract_condor_status parses the pretty-printed output of condor_remoteuser_status.
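
A rough Python sketch of how a backend script could assemble the revised query as an argument list, so the nested quotes in the ifThenElse expression never pass through a shell. build_status_command is a hypothetical helper, not part of condor_remoteuser_status or extract_condor_status; the pool, constraint, and format expression are copied from the command quoted above, with -format spelled out in place of -f.

```python
import shlex

# State expression copied from the condor_status example in this thread.
STATE_EXPR = (
    'ifThenElse(State=?="Unclaimed",'
    'ifThenElse(SuspendVMClaimed,"Claimed","Unclaimed"),State)'
)


def build_status_command(pool="glow.cs.wisc.edu"):
    """Return the condor_status argv (hypothetical helper, for illustration)."""
    return [
        "condor_status",
        "-pool", pool,
        "-constraint", 'IsGeneralPurposeVM && OpSys == "LINUX"',
        # Literal backslash-n, as in the shell form "%s\n".
        "-format", "%s\\n", STATE_EXPR,
    ]


cmd = build_status_command()
# Print a shell-quoted version for copy and paste; the list itself could be
# handed to subprocess.run(cmd, capture_output=True, text=True).
print(shlex.join(cmd))
```

Passing a list rather than one long string avoids the line-wrapping and quoting problems that show up when the command is pasted through mail.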

So...

grepping on ^host|claimed on the -long output for some host I get...

 Name = "vm3@glow-c176.cs.wisc.edu"
 SuspendVMClaimed = (vm4_State == "Claimed")
 State = "Claimed"
 vm3_State = "Claimed"
 vm2_State = "Claimed"
 vm4_State = "Claimed"

...and I need help parsing that.  Let's do it in person sometime.
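
As a starting point, here is a minimal Python sketch that reads attribute lines like the ones above and works out the state the graph should show. It assumes SuspendVMClaimed always has the shape (some_attr == "literal"), as it does in this sample; parse_attrs, suspend_vm_claimed, and effective_state are hypothetical names, and the comparison is a plain Python string match rather than real ClassAd evaluation.

```python
import re

# The grepped -long output from the message above.
SAMPLE = '''\
Name = "vm3@glow-c176.cs.wisc.edu"
SuspendVMClaimed = (vm4_State == "Claimed")
State = "Claimed"
vm3_State = "Claimed"
vm2_State = "Claimed"
vm4_State = "Claimed"
'''


def parse_attrs(text):
    """Split 'Attr = value' lines into a dict of raw value strings."""
    attrs = {}
    for line in text.splitlines():
        name, sep, value = line.partition(" = ")
        if sep:
            attrs[name.strip()] = value.strip()
    return attrs


def unquote(value):
    """Drop surrounding double quotes from a string value, if present."""
    if len(value) >= 2 and value[0] == value[-1] == '"':
        return value[1:-1]
    return value


def suspend_vm_claimed(attrs):
    """Resolve SuspendVMClaimed, assuming the form (attr == "literal")."""
    expr = attrs.get("SuspendVMClaimed", "")
    match = re.fullmatch(r'\(\s*(\w+)\s*==\s*"([^"]*)"\s*\)', expr)
    if not match:
        return False  # assumption: treat anything unrecognized as not claimed
    ref, literal = match.groups()
    return unquote(attrs.get(ref, "")) == literal


def effective_state(attrs):
    """State of the VM, or of its suspension VM when the VM is Unclaimed."""
    state = unquote(attrs.get("State", ""))
    if state == "Unclaimed":
        return "Claimed" if suspend_vm_claimed(attrs) else "Unclaimed"
    return state


attrs = parse_attrs(SAMPLE)
print(unquote(attrs["Name"]), effective_state(attrs))
```

For the sample host, State is already "Claimed", so the suspension VM is never consulted; it only matters when State is "Unclaimed".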

steve
- - -
msg755 From: dan To: dan, rader Date: 2007.01.22 13:40
Steve Rader via UW-HEP Help System wrote:
> Steve Rader <rader@ginseng.hep.wisc.edu> added the comment:
>
>
>  > The condor status graphs should be fixed so that when the
>  > "GeneralPurposeVM" is unclaimed, it reports the status of the
>  > "SuspendVM" instead.  Without this, our reports of GLOW utilization are 
>  > not showing the full utilization that really exists.
>  > 
>  > Here's an example of a condor_status query that does the right thing.  
>  > It shows the state of each general purpose VM or, if unclaimed, the 
>  > state of the corresponding suspension VM.
>  > 
>  > condor_status -pool glow.cs.wisc.edu -constraint 'IsGeneralPurposeVM && 
>  > OpSys == "LINUX"' -f "%s\n" 
>  > 'ifThenElse(State=?="Unclaimed",ifThenElse(SuspendVMClaimed,"Claimed","Unclaimed"),State)'
>
> So change
>
>  -constraint 'IsGeneralPurposeVM =?= True'
>
> to that crazy one above, right?
>   

Notice that the constraint isn't all that crazy in the command above.  
What is crazy is the -f (-format) option to print out the state of the 
VM.  It prints the state of the VM in a way that takes into account the 
state of the suspension VM.
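
Read as ordinary code, the -format expression is just a nested if/else. The following is a minimal Python sketch of the same logic, assuming State is a plain string and SuspendVMClaimed has already been evaluated to a boolean; effective_state is a hypothetical name, and the sketch ignores the case where either attribute is undefined.

```python
def effective_state(state, suspend_vm_claimed):
    """Python rendering of:
    ifThenElse(State =?= "Unclaimed",
               ifThenElse(SuspendVMClaimed, "Claimed", "Unclaimed"),
               State)
    """
    if state == "Unclaimed":
        # The general purpose VM is idle, so report the suspension VM instead.
        return "Claimed" if suspend_vm_claimed else "Unclaimed"
    # Any other state is reported unchanged.
    return state


for state, suspended in [
    ("Unclaimed", True),
    ("Unclaimed", False),
    ("Claimed", False),
    ("Backfill", False),
]:
    print(state, suspended, "->", effective_state(state, suspended))
```

Only the first case changes what gets reported: an unclaimed general purpose VM whose suspension VM is claimed counts as "Claimed".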

>
>
>  > We should also change the graph to report the "Backfill" state.
>
> Tell me more.  I don't know about the backfill state!
>   

When condor doesn't have any other job to run, it can be configured to 
run "backfill" processes.  (This was a new feature added in the last 
year or so.)  This is being used on glow.  The state of the condor VM 
will show up as "Backfill".
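
For the graph, "Backfill" would then be one more bucket to count. A small Python sketch, assuming the input is one state name per line (the shape produced by the -f "%s\n" query quoted above); tally_states is a hypothetical name and the sample lines are made up for illustration.

```python
from collections import Counter


def tally_states(output):
    """Count how many VMs are in each state, one state name per input line."""
    return Counter(line.strip() for line in output.splitlines() if line.strip())


# Made-up sample, not real pool output.
sample = "Claimed\nBackfill\nUnclaimed\nBackfill\nClaimed\nClaimed\n"

counts = tally_states(sample)
for state in ("Claimed", "Unclaimed", "Backfill"):
    # Counter returns 0 for states that never appear.
    print(state, counts[state])
```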

--Dan
msg752 From: rader To: dan, rader Date: 2007.01.22 13:20
> The condor status graphs should be fixed so that when the 
 > "GeneralPurposeVM" is unclaimed, it reports the status of the 
 > "SuspendVM" instead.  Without this, our reports of GLOW utilization are 
 > not showing the full utilization that really exists.
 > 
 > Here's an example of a condor_status query that does the right thing.  
 > It shows the state of each general purpose VM or, if unclaimed, the 
 > state of the corresponding suspension VM.
 > 
 > condor_status -pool glow.cs.wisc.edu -constraint 'IsGeneralPurposeVM && 
 > OpSys == "LINUX"' -f "%s\n" 
 > 'ifThenElse(State=?="Unclaimed",ifThenElse(SuspendVMClaimed,"Claimed","Unclaimed"),State)'

So change

 -constraint 'IsGeneralPurposeVM =?= True'

to that crazy one above, right?



 > We should also change the graph to report the "Backfill" state.

Tell me more.  I don't know about the backfill state!



 > Also note that the description of the condor_status query in the 
 > following web page is slightly wrong (probably my fault).
 > 
 > http://noc.hep.wisc.edu/nrg/condor/pools/GLOW-condor-pool.cgi
 > 
 > It should have OpSys == "LINUX" rather than Arch == "LINUX".

Fixed.

steve
- - -
msg555 From: dan To: dan Date: 2007.01.11 11:00
Steve,

The condor status graphs should be fixed so that when the 
"GeneralPurposeVM" is unclaimed, it reports the status of the 
"SuspendVM" instead.  Without this, our reports of GLOW utilization are 
not showing the full utilization that really exists.

Here's an example of a condor_status query that does the right thing.  
It shows the state of each general purpose VM or, if unclaimed, the 
state of the corresponding suspension VM.

condor_status -pool glow.cs.wisc.edu -constraint 'IsGeneralPurposeVM && OpSys == "LINUX"' \
 -f "%s\n" 'ifThenElse(State=?="Unclaimed",ifThenElse(SuspendVMClaimed,"Claimed","Unclaimed"),State)'

We should also change the graph to report the "Backfill" state.

Also note that the description of the condor_status query in the 
following web page is slightly wrong (probably my fault).

http://noc.hep.wisc.edu/nrg/condor/pools/GLOW-condor-pool.cgi

It should have OpSys == "LINUX" rather than Arch == "LINUX".

Thanks,
--Dan
History
Date                 User     Action  Args
2007-02-08 10:55:25  rader    set     topic: + CMS
2007-01-31 08:07:48  rader    set     messages: + msg902
2007-01-22 13:40:00  dan      set     messages: + msg755
2007-01-22 13:20:14  rader    set     status: unread -> chatting; messages: + msg752
2007-01-11 11:19:19  wcmaier  set     priority: triage -> normal; assignedto: rader; topic: + Condor, Monitoring; nosy: + rader
2007-01-11 11:00:54  dan      create