Case 1: Many a thesis writer has been there: after a break, you find it difficult to start again because you no longer remember all the details about the material.
► Metadata, or information that describes the data, helps you understand the nature of the data.
Case 2: It is difficult to share data with other members of a group if everyone has produced and processed the data independently, without a shared plan.
► Choosing file formats carefully facilitates the sharing and joint use of data in the long term.
► Naming files consistently and organising folders logically makes the data easier to find.
► An editable table template, including heading rows and designed for shared or individual use, helps collect data systematically. A structured table is handy when analysing data with statistical software, such as SPSS or R.
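As a rough illustration of such a template, the sketch below builds a table with a single heading row and appends one observation to it. The column names and the sample values are hypothetical placeholders, not a recommended schema; the resulting CSV text is one plain format that statistical software such as R or SPSS can import.

```python
import csv
import io

# Hypothetical column headings (an assumption); adapt them to your own study.
COLUMNS = ["sample_id", "collection_date", "collector", "measurement", "unit", "notes"]


def make_template(columns=COLUMNS):
    """Return CSV text that contains only the heading row."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerow(columns)
    return buffer.getvalue()


def append_row(table_text, row, columns=COLUMNS):
    """Append one observation; reject columns that are not in the template."""
    unknown = set(row) - set(columns)
    if unknown:
        raise ValueError(f"Unknown columns: {sorted(unknown)}")
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writerow(row)  # columns missing from `row` are left empty
    return table_text + buffer.getvalue()


template = make_template()
table = append_row(
    template,
    {
        "sample_id": "S001",
        "collection_date": "2024-03-15",
        "measurement": "7.2",
        "unit": "pH",
    },
)
print(table)
```

Rejecting unknown columns keeps every contributor's rows aligned with the shared heading row, which is the point of agreeing on a template before data collection starts.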
Case 3: You have made changes to your data that prove to be wrong – but there is no returning to the old version.
► Version control makes data processing safe.
The selection of the file format has an impact on both research work and data usability in the long run. While there is no clear-cut recommendation for formats, keep the following basic principles in mind when selecting one:
By naming your files and organising your folders appropriately, you make your data easier to find and the data content easier to comprehend. There are a few rules of thumb for naming:
Plan your folder structure based on your needs. For example, store raw data, edited data, methods, documentation, the manuscript and presentations in separate folders.
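The folder layout in the example above could be created with a short script, so that every new project starts from the same structure. This is only a sketch: the folder names and the project name `thesis_project` are assumptions, and the demonstration runs in a temporary directory so that nothing is written to your real files.

```python
import tempfile
from pathlib import Path

# Folder names follow the example above; the exact spelling is an assumption.
FOLDERS = [
    "raw_data",
    "edited_data",
    "methods",
    "documentation",
    "manuscript",
    "presentations",
]


def create_structure(project_root, folders=FOLDERS):
    """Create one subfolder per category and return the folder names found."""
    root = Path(project_root)
    for name in folders:
        # exist_ok=True makes the script safe to run again on an existing project
        (root / name).mkdir(parents=True, exist_ok=True)
    return sorted(p.name for p in root.iterdir() if p.is_dir())


# Demonstration in a throwaway directory
with tempfile.TemporaryDirectory() as tmp:
    created = create_structure(Path(tmp) / "thesis_project")

print(created)
```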
Version control is an important part of organising data. Data processing results in various versions, and you want to be able to return to an earlier version, if required. Version control can be automated (the recommended alternative) or done manually. Store the original file or raw data separately to avoid accidentally editing it.
In automated version control, the system creates and keeps track of the different versions.
► Tools such as Git are also available for more advanced version control (see also Instructions for using Git).
In manual version control, the user is responsible for creating and managing versions, so consistent file naming is essential.
► This is suitable for small amounts of data which the producer manages alone.
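Manual version control depends on a predictable naming pattern. The sketch below assumes a hypothetical pattern of `name_YYYYMMDD_vNN.ext` (this pattern is an illustration, not a rule from this guide) and works out the name the next version should get from the names that already exist.

```python
import re
from datetime import date


def next_version_name(existing_names, stem, suffix, today=None):
    """Return the file name for the next manual version.

    Assumed pattern (hypothetical): <stem>_<YYYYMMDD>_v<NN><suffix>
    """
    today = today or date.today()
    pattern = re.compile(
        rf"^{re.escape(stem)}_\d{{8}}_v(\d+){re.escape(suffix)}$"
    )
    # Collect the version numbers of files that already follow the pattern
    versions = [
        int(match.group(1))
        for name in existing_names
        if (match := pattern.match(name))
    ]
    next_number = max(versions, default=0) + 1
    return f"{stem}_{today:%Y%m%d}_v{next_number:02d}{suffix}"


existing = [
    "survey_20240301_v01.csv",
    "survey_20240310_v02.csv",
    "survey_raw.csv",  # the untouched original is ignored by the pattern
]
new_name = next_version_name(existing, "survey", ".csv", today=date(2024, 3, 15))
print(new_name)
```

Saving each edit under a new name, rather than overwriting the previous file, is what makes it possible to return to an earlier version later.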
Documenting data means describing data.
Metadata (‘data about data’), or descriptive data, makes data understandable, findable and usable. It indicates
The name of the dataset is the simplest kind of metadata. Descriptive data can also be related to the
Matters to consider when producing metadata