Re: Request for additional work space for Hall D
Hi Elke,
Thanks for the response. I'll just throw in that in the discussions
at the time, we used a factor of 3 rather than 10 since the 10 came from
earlier, lower statistics experiments which had statistically driven
error bars. Clearly you're right that something will need to be saved,
but the question was whether saving the raw data was faster and/or
cheaper than regenerating and reconstructing on the farm. The logic was
the following: Reconstruction of GlueX data takes longer than the
simulation. Saving the simulated data to disk would be for the purpose
of re-reconstructing it, assuming whatever problem motivated this
was limited to the reconstruction and not the simulation itself. Saving
some kind of simulation DSTs was thought to be the way to go, but they
would have a tiny footprint on the disk by comparison. RHIC is probably
a whole other beast, especially Au-Au scattering, where the number of
tracks per event is orders of magnitude greater than in GlueX. Anyway, I'm
not really disagreeing (even though it may sound like it!); I'm just
trying to convey the earlier reasoning. I'll let others revisit the plan
and revise it for the future as they see fit.
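
The store-versus-regenerate reasoning above can be sketched as a small calculation. This is only an illustration: the class name, the field names, and every number a caller would plug in are hypothetical placeholders, not values from the GlueX spreadsheet. It assumes a single redo pass and a flat cost per CPU second and per byte.

```python
from dataclasses import dataclass


@dataclass
class McCostModel:
    """Hypothetical inputs; none of these come from the GlueX spreadsheet."""
    sim_cpu_s_per_event: float    # CPU seconds to simulate one event
    recon_cpu_s_per_event: float  # CPU seconds to reconstruct one event
    raw_bytes_per_event: float    # size of raw simulated output per event
    dst_bytes_per_event: float    # size of a simulation DST per event
    cpu_cost_per_s: float         # cost of one CPU second (arbitrary units)
    disk_cost_per_byte: float     # cost of storing one byte (arbitrary units)


def mc_events_needed(n_data_events, mc_factor=3):
    """MC sample size for a given MC-to-data factor (3 or 10 in this thread)."""
    return mc_factor * n_data_events


def cost_store_raw(m, n_events):
    """Keep raw simulated output on disk; a redo only reruns reconstruction."""
    storage = n_events * m.raw_bytes_per_event * m.disk_cost_per_byte
    redo_cpu = n_events * m.recon_cpu_s_per_event * m.cpu_cost_per_s
    return storage + redo_cpu


def cost_regenerate(m, n_events):
    """Keep only small DSTs; a redo reruns simulation and reconstruction."""
    storage = n_events * m.dst_bytes_per_event * m.disk_cost_per_byte
    cpu_s = m.sim_cpu_s_per_event + m.recon_cpu_s_per_event
    redo_cpu = n_events * cpu_s * m.cpu_cost_per_s
    return storage + redo_cpu
```

Under these assumptions the reconstruction term cancels, so regenerating wins whenever the CPU cost of simulating an event is less than the cost of the extra disk taken by the raw output compared with the DST.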
Regards,
-David
elke-caroline aschenauer wrote:
> On Tue, 26 May 2009, David Lawrence wrote:
>
> Dear David et al.,
>
> okay, I cannot keep myself from replying. For HERMES, a fully
> reconstructed event in Geant-3 takes between 0.5 and 2.5 s; at RHIC, an
> Au-on-Au event takes between 1 and 2 min of CPU time.
> So I'm not sure what you mean by saying it costs nothing to produce MC
> events. It costs time, and the rule of thumb is at least 10 times the MC
> statistics compared to the data. So with the data statistics GlueX is
> expecting, you will need to generate MC continuously to reach 10 times
> the data statistics. These uDSTs need to be stored somewhere, so you
> will need disk space or tape space.
>
> I would revise the xls sheets to include this in your estimate.
> I know the only valid comparison is CLAS. I have no idea what they
> do, but from my limited knowledge of CLAS I know they are normally not
> heavy on MC studies.
>
> Cheers elke
>
>
> Date: Tue, 26 May 2009 15:34:42 -0400
> From: David Lawrence <davidl@jlab.org>
> To: Mark M. Ito <marki@jlab.org>
> Cc: HallD Software Group <halld-offline@jlab.org>
> Subject: Re: Request for additional work space for Hall D
> Hi Mark,
> This may be a little late, but here is the latest spreadsheet (and its
> explanation) used to compute the predicted GlueX disk space. The main
> thing you'll notice is that there is virtually no disk space allocated
> for simulation. This was mainly due to a calculation that it costs more
> to store the simulated data than to reproduce it. As such, we wanted the
> IT division to focus their budget on more CPU power for the farm as
> opposed to work disk space. This is not to say that philosophy should
> still be followed, just that that was the motivation in the past. You are
> likely to get questions from Sandy and Chip referring back to the
> spreadsheet.
> Regards,
> -David
> Mark M. Ito wrote:
> > In recent months we have run out of space on our work disk from time
> > to time. Our activity generating and reconstructing simulated data is
> > ramping up. We discussed the situation in the Hall D Offline Meeting
> > and we would like to request more space. Our current allocation is
> > 2.7 TB. We request that this be increased to 5 TB as soon as possible
> > and that the total be increased to about 10 TB over the next year.
> >
> > Once in operation, the GlueX detector will generate a large volume of
> > data. To be ready to analyze real data once it comes in, we will have
> > to generate large amounts of simulated data to develop and test all of
> > the necessary tools well in advance of the arrival of real data. As we
> > approach operations we will likely need an amount of space equal to or
> > exceeding the amount used by Hall B (about 35 TB). Recall that raw data
> > is generally not stored on the work disks in any case; work space is
> > used for intermediate files needed to analyze raw or simulated data.
> > The fact that GlueX has not taken real data yet does not imply our
> > current disk needs are insignificant. We note that this request is
> > quite modest when compared to the total amount of work disk space
> > deployed at present.
> > > --
> ------------------------------------------------------------------------
> David Lawrence Ph.D.
> Staff Scientist                 Office: (757)269-5567
> Jefferson Lab                   Pager: (757)584-5567
> http://www.jlab.org/~davidl     davidl@jlab.org
>
> ------------------------------------------------------------------------
>
> Elke-Caroline Aschenauer
>
> Brookhaven National Lab, Physics Dept.
> Bldg. 510D / 2-195, 20 Pennsylvania Avenue, Upton, NY 11973
> Tel.: 001-631-344-4769   Fax.: 001-631-344-1334
>
> 8 Shore Road, East Patchogue, NY 11772-5963
> Tel.: 001-631-569-4290   Cell: 001-757-256-5224
>
> Mail: elke@jlab.org
--
------------------------------------------------------------------------
David Lawrence Ph.D.
Staff Scientist                 Office: (757)269-5567
Jefferson Lab                   Pager: (757)584-5567
http://www.jlab.org/~davidl     davidl@jlab.org
------------------------------------------------------------------------