Organise your Data

Planning how you will organise your data from the start of your project can save time and prevent errors.

Naming files

Develop a file naming convention for your project. Effective file names are consistent, concise, meaningful and findable:

be consistent with your file and folder naming
number files and versions with enough leading zeros to cover the full range. For example, if you are going to collect 15 samples, number the first one 01 rather than 1
include a version control table for important documents that records version number, author, purpose/changes and date
if required, date documents at the beginning of the file/folder name using YYYY_MM_DD
some software can’t read file names with spaces, so it may be easier to avoid using them. Alternatives include underscores, hyphens and camel case

Examples

2016_10_31_Eye_Tracking_012
ActiveDataStorage_05
2016-10-31-Interview-audio-V1
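
The convention above can be applied automatically. The following is a minimal Python sketch, assuming a date prefix, underscores instead of spaces and a zero-padded sequence number; the function name build_file_name and its parameters are illustrative, not part of any University tool.

```python
import re
from datetime import date


def build_file_name(day: date, description: str, number: int, total: int) -> str:
    """Build a file name (without extension) from a date, a description and a number."""
    # Replace every run of characters that is not a letter or digit with one underscore.
    words = re.sub(r"[^A-Za-z0-9]+", "_", description).strip("_")
    # Pad to the width of the largest expected number, e.g. 15 samples gives 01 to 15.
    width = len(str(total))
    return f"{day:%Y_%m_%d}_{words}_{number:0{width}d}"


# Hypothetical example: item 12 of an expected 150.
print(build_file_name(date(2016, 10, 31), "Eye Tracking", 12, 150))
```

Deriving the padding from the expected total means files sort in the right order from the first item onwards.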

File structure

Accessing files, avoiding duplication and backing up files adequately all require a little planning. As with naming files, it is important to be consistent:

most operating systems default to a hierarchical file structure: files inside folders, which may be stored in other folders. Aim for a balance between breadth and depth so there isn’t endless clicking to find a folder. It might be easier to start with a limited number of broad topics and create folders within
make sure your folders are backed up: this is automatic when using the University filestore
name folders after the work rather than individuals
review your folders and files regularly so that nothing is kept needlessly, and separate current from completed activity. Moving completed files and folders into an archive folder within the hierarchy keeps the workspace uncluttered
depending on your work, it may be helpful to tag your files and folders to support their discoverability across overlapping folders
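
Moving completed work into an archive folder can be scripted. The sketch below is one possible approach, assuming an archive folder at the top of the project that mirrors the original folder layout; the function name archive_item and the folder names in the demo are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def archive_item(project_root: Path, relative_path: str) -> Path:
    """Move a completed file or folder into <project_root>/archive, keeping its relative path."""
    source = project_root / relative_path
    target = project_root / "archive" / relative_path
    # Refuse to overwrite something that has already been archived.
    if target.exists():
        raise FileExistsError(f"{target} is already in the archive")
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(source), str(target))
    return target


if __name__ == "__main__":
    # Demonstration in a throwaway directory so no real files are touched.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "eye_tracking" / "pilot").mkdir(parents=True)
        (root / "eye_tracking" / "pilot" / "notes.txt").write_text("done")
        print(archive_item(root, "eye_tracking/pilot").relative_to(root))
```

Keeping the same relative path inside the archive means an archived item can be found in the place a colleague would expect.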

File formats

How data are collected and analysed will determine the file formats used during a research project. For long-term storage, consider using formats that don't have restrictions on their use. This means they are more likely to be accessible in the future. The table below lists common types of data and the file formats suited to preserving them.

Data type	Great for preservation	Okay for preservation
Textual data	.rtf; .txt; .xml	.doc; .docx; .html
Tabular data	.por; .csv; .tab	.xls; .xlsx; .sav; .mdb; .txt; .dbf; .dta; .ods
Image	.tif (version 6)	.jpeg; .jpg; .pdf; .raw; .psd
Video	.mj2; .mp4
Audio	.flac; .wav	.mp3; .aif
Geospatial data	.shp; .shx; .dbf; .tiff; .tfw	.mdb; .mif; .kml; .dxf; .svg
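
The table can be turned into a simple lookup to check files before deposit. This is a minimal Python sketch: the extensions are copied from the table, while the names FORMATS and preservation_rating are illustrative. Because some extensions appear under more than one data type, the data type must be supplied, and an extension alone cannot confirm details such as the TIFF version.

```python
from pathlib import Path

# Extensions copied from the table: (great for preservation, okay for preservation).
FORMATS = {
    "textual": ({".rtf", ".txt", ".xml"}, {".doc", ".docx", ".html"}),
    "tabular": ({".por", ".csv", ".tab"},
                {".xls", ".xlsx", ".sav", ".mdb", ".txt", ".dbf", ".dta", ".ods"}),
    "image": ({".tif"}, {".jpeg", ".jpg", ".pdf", ".raw", ".psd"}),
    "video": ({".mj2", ".mp4"}, set()),
    "audio": ({".flac", ".wav"}, {".mp3", ".aif"}),
    "geospatial": ({".shp", ".shx", ".dbf", ".tiff", ".tfw"},
                   {".mdb", ".mif", ".kml", ".dxf", ".svg"}),
}


def preservation_rating(file_name: str, data_type: str) -> str:
    """Return 'great', 'okay' or 'not listed' for a file of the given data type."""
    great, okay = FORMATS[data_type]
    # Compare on the lower-cased extension so that .CSV and .csv are treated alike.
    extension = Path(file_name).suffix.lower()
    if extension in great:
        return "great"
    if extension in okay:
        return "okay"
    return "not listed"


print(preservation_rating("survey_results.xlsx", "tabular"))
```

A result of "okay" or "not listed" suggests exporting a copy in one of the preferred formats for long-term storage.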

Documentation and metadata

Metadata (data about data) and supporting documentation provide the context for research data. This context makes the data easier to retrieve and, importantly, easier to understand in the future.

Metadata will be required to describe the data when you deposit in a repository. Supporting documentation outlining the data collection method will be required too. This is often easier to collect during the research project.

Examples

Metadata provides an overview of the data: what they are, where they are held, how they can be accessed and how they can be reused. Metadata requirements differ, but the following standard information is required:

title – how the data are known
description – a brief methodological overview of the data. It is similar to an abstract for a paper and can contain information on what the data are, how and why they were collected and how they have been processed
keywords – related to the content of the data
creators – the main researchers involved in creating the data
funders – the source of financial support for the collection of the data
access conditions – how the data can be accessed and whether there are any restrictions in place
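
Recording these fields in a structured form during the project makes it easy to spot gaps before deposit. The sketch below is one way to do this in Python; the class name DatasetMetadata, the method missing_fields and all example values are hypothetical, and the fields are not tied to any particular repository's schema.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class DatasetMetadata:
    """The standard fields listed above; empty values count as missing."""
    title: str = ""
    description: str = ""
    keywords: list[str] = field(default_factory=list)
    creators: list[str] = field(default_factory=list)
    funders: list[str] = field(default_factory=list)
    access_conditions: str = ""

    def missing_fields(self) -> list[str]:
        # An empty string or empty list means the field still needs completing.
        return [name for name, value in asdict(self).items() if not value]


# Placeholder values for illustration only.
record = DatasetMetadata(
    title="Eye tracking pilot study",
    description="Gaze recordings collected to test the experimental protocol.",
    keywords=["eye tracking", "pilot"],
    creators=["A. Researcher"],
)
print(json.dumps(asdict(record), indent=2))
print("Still to complete:", record.missing_fields())
```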

Supporting documentation describes the data and includes:

code, field and label descriptions
software used
methodology
dates of collection
geographic location
some software will automatically create metadata during the data collection process
subject-specific metadata and documentation standards may exist within disciplines, data repositories and funding agencies; the Digital Curation Centre (DCC) provides an overview

When uploading to data.ncl, it is strongly recommended that you complete a README to provide context for the data record.
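
A README can be started from a template so that no section is forgotten. The following Python sketch builds a plain-text README with a section for each supporting documentation item listed above; the layout, the names SECTIONS and build_readme, and the example values are assumptions for illustration, not a format required by data.ncl.

```python
# Section headings taken from the supporting documentation items listed above.
SECTIONS = [
    "Code, field and label descriptions",
    "Software used",
    "Methodology",
    "Dates of collection",
    "Geographic location",
]


def build_readme(title: str, details: dict[str, str]) -> str:
    """Return README text with one section per item, flagging any that are empty."""
    lines = [title, "=" * len(title), ""]
    for section in SECTIONS:
        # Sections without content are marked so they stand out when reviewing.
        lines += [section, details.get(section, "TO BE COMPLETED"), ""]
    return "\n".join(lines)


# Placeholder values for illustration only.
print(build_readme("Eye tracking pilot study", {"Software used": "Python 3.12"}))
```

Filling in the template as the project progresses is usually easier than reconstructing the details at deposit time.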
