[qudi-dev] Re: Managing saved measurement data

From: "Dr. Kay Jahnke" <kay.jahnke@xxxxxxxxxxxxxxxxx>
To: Dan Yudilevich <dan.yudilevich@xxxxxxxxxxxxxx>
Date: Mon, 4 May 2020 19:20:27 +0200

Hi Dan,

you are correct: the current save logic only stores data hierarchically,
first by date and then by the module that created the data.
It was designed that way because qudi is not dictating what modules
there are, which parameters they have or which and how the data is
saved. So going by date was the best thing to get any structure.
There are some ways to enhance this structure and to access your data
more conveniently: The current save_logic has
additional_parameters
(https://github.com/Ulm-IQO/qudi/blob/master/logic/save_logic.py#L633).
These are global and can be set from any module or even from a notebook
script. So here you could save the sample, the center or any other
parameters you like into the data files. It is best not to save
parameters in file names, as this can lead to massive problems afterwards.
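
To make the idea concrete, here is a minimal sketch of how global
parameters can end up in the file header instead of the file name. It is
a stand-in and not qudi's save_logic: the class GlobalParameterStore, its
methods, the header layout and the example values are all made up for
illustration, so please check the linked source for the real method names.

```python
import os
import tempfile
from collections import OrderedDict


class GlobalParameterStore:
    """Hypothetical stand-in for a save logic holding global parameters."""

    def __init__(self):
        self._additional_parameters = OrderedDict()

    def update(self, **kwargs):
        # Any module or notebook script can add or overwrite a global parameter.
        self._additional_parameters.update(kwargs)

    def write(self, path, rows, parameters=None):
        # Merge the global parameters with the measurement-specific ones and
        # write them as commented header lines above the data rows.
        merged = OrderedDict(self._additional_parameters)
        merged.update(parameters or {})
        with open(path, "w") as handle:
            for key, value in merged.items():
                handle.write("# {}: {}\n".format(key, value))
            for row in rows:
                handle.write("\t".join(str(v) for v in row) + "\n")
        return merged


if __name__ == "__main__":
    store = GlobalParameterStore()
    # Placeholder values; set these once when you change the sample.
    store.update(sample="sample_A", center="center_3")
    path = os.path.join(tempfile.mkdtemp(), "measurement.dat")
    store.write(path, [(1.0, 2.0)], parameters={"power": -20})
    with open(path) as handle:
        print(handle.read())
```

The file name stays generic, and the sample and center travel with the
data itself, so they can be searched for later.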

Some of my colleagues then query the whole data directory automatically
with a script and load ALL the data files. These data files can then be
pushed into, for example, a "pandas" data set, and this lets you query
the specific parameters per experiment.
The disadvantage is that the querying of all the data takes a while and
you might run out of RAM, as the data might get big (just imagine you
want to look at all the data from one PhD). Also in principle this has
nothing really to do with qudi, because at this point you are writing
your analysis scripts and should not need the qudi core functionality.
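
A rough sketch of such a script could look like the following. It is a
variation that only reads the file headers instead of loading all the
data, which sidesteps the RAM problem. It assumes text files whose header
lines start with '#' and contain 'key: value' pairs; the function names
and the '.dat' suffix are assumptions, so adapt them to what your files
actually look like.

```python
import os


def read_header_parameters(path):
    """Collect 'key: value' pairs from the leading '#' lines of one file."""
    parameters = {}
    with open(path, "r") as handle:
        for line in handle:
            if not line.startswith("#"):
                break  # header is over; the numeric data is never loaded
            key, sep, value = line.lstrip("#").partition(":")
            if sep and key.strip() and value.strip():
                parameters[key.strip()] = value.strip()
    return parameters


def index_directory(root, suffix=".dat"):
    """Walk the data directory and return one record per data file."""
    records = []
    for folder, _subfolders, files in os.walk(root):
        for name in sorted(files):
            if name.endswith(suffix):
                path = os.path.join(folder, name)
                record = {"path": path}
                record.update(read_header_parameters(path))
                records.append(record)
    return records


def select(records, **wanted):
    """Keep the records whose parameters match all given key/value pairs."""
    return [
        record for record in records
        if all(record.get(key) == str(value) for key, value in wanted.items())
    ]
```

The list returned by index_directory() can be handed to
pandas.DataFrame(records) if you prefer to query it there, and only the
files you select need to be loaded in full.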

Therefore the much more elegant way of solving your problem would be to
write your own save-logic. The current module is just the default qudi
logic module and can be replaced at any time. You will just need to
support the same functions, but you can freely change what happens in
the background.
For your case it would probably be best if you set up an (elastic)
database and write a new save-logic that connects to that database and
saves the data in a clever way. A word of caution: Be sure to put a lot
of thought into the design of the database beforehand and define
explicitly what you want to save and how (e.g. which parameters, which
modules, pictures or only data sets, dimensionality of the data).
I tried to write a save-logic for a database connection once, but the
users could not agree on any standardized parameters and structure. So
the database became maximally flexible and therefore was extremely
complicated to query. It never got used productively. Therefore, define
what you want beforehand, put thought into it and then keep to your
structure.
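
Just to illustrate the "fixed structure" point, below is a minimal sketch
of a storage backend with a schema that is decided up front. It uses
SQLite from the Python standard library in place of an elastic database,
and it is not a qudi module: the class name, the save_data() signature
and the required parameters "sample" and "module" are assumptions for the
example. A real replacement would have to offer the same functions as the
default save logic.

```python
import json
import sqlite3
import time


class SqliteSaveBackend:
    """Hypothetical save-logic replacement writing to a fixed SQLite schema."""

    # Parameters every measurement must provide; agreed on before first use.
    REQUIRED = ("sample", "module")

    def __init__(self, database=":memory:"):
        self._connection = sqlite3.connect(database)
        self._connection.execute(
            "CREATE TABLE IF NOT EXISTS measurement ("
            " id INTEGER PRIMARY KEY,"
            " timestamp REAL NOT NULL,"
            " sample TEXT NOT NULL,"
            " module TEXT NOT NULL,"
            " parameters TEXT NOT NULL,"
            " data TEXT NOT NULL)"
        )

    def save_data(self, data, parameters=None, timestamp=None):
        """Store one measurement; refuse it if a required parameter is missing."""
        parameters = dict(parameters or {})
        missing = [key for key in self.REQUIRED if key not in parameters]
        if missing:
            raise ValueError("missing required parameters: {}".format(missing))
        cursor = self._connection.execute(
            "INSERT INTO measurement"
            " (timestamp, sample, module, parameters, data)"
            " VALUES (?, ?, ?, ?, ?)",
            (
                time.time() if timestamp is None else timestamp,
                parameters.pop("sample"),
                parameters.pop("module"),
                # Remaining, optional parameters are kept as JSON text.
                json.dumps(parameters, sort_keys=True),
                json.dumps(data),
            ),
        )
        self._connection.commit()
        return cursor.lastrowid

    def find(self, sample):
        """Return (id, module, data) for every measurement on one sample."""
        rows = self._connection.execute(
            "SELECT id, module, data FROM measurement"
            " WHERE sample = ? ORDER BY id",
            (sample,),
        )
        return [(row[0], row[1], json.loads(row[2])) for row in rows]
```

Because "sample" and "module" are real columns and everything else is
rejected or tucked away, a query like find("my_sample") stays simple.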

And finally, if something good comes out, other people might also want
to use something similar. So please then open a Pull Request and
contribute back upstream. This also has the advantage that other people might
fix bugs for you or enhance the project.

An additional general remark: It is always very useful to keep some kind
of notebook, even if the data is saved automatically. We have had very
good experiences going fully digital on the lab notebooks and using a
wiki system for that (https://www.dokuwiki.org/dokuwiki). The most used
plug-in, it turns out, is the one that lets you paste screenshots
directly into the wiki page.

If there are more questions, please don't hesitate to ask.

Cheers,
Kay

On 04.05.2020 at 17:59, Dan Yudilevich wrote:

Hi everyone,

We are a new group out of the Weizmann Institute (Israel), and we are
slowly but surely getting to know qudi, with most of the important
features running smoothly. The software is impressive, so kudos to the
developers.

One thing I am struggling with is data management. Although we are
only beginning to acquire data, I already feel it is getting quite
cluttered. The apparent organization hierarchy of date/modules makes
it challenging to trace specific data later on. I would like, for
example, a convenient way to find data related to a specific sample
(or defect); data from specific types of pulse sequences, etc.

By managing a lab notebook I can, of course, refer to the specific
dates and files, but I feel it somewhat defeats the purpose.

So, I wanted to ask if someone has any recommendation –

Does anyone have a particularly elegant way of organizing the information?

Am I missing something in the save logic, so that I’m under-utilizing
this feature?

Thank you all, and stay healthy,

Dan Yudilevich

Finkler Group | Dept. of Chemical and Biological Physics

Weizmann Institute of Science

--
Dr. Kay Daniel Jahnke

Küfergasse 1
89073 Ulm
[T] +49 176 444 346 51
[@] kay.jahnke@xxxxxxxxxxxxxxxxx
