GUIDELINES
One-Page Quick Reference Cards
Starting a New Project
☑ NEW PROJECT CHECKLIST (15 minutes)
Each Llyod lab project must follow the standardized directory structure and include required documentation files. Projects should be stored in the Llyod lab RDS in the relevant users folder name.
Example path:
RDS/USERS/YourName/Project_YYYYMM_ShortName/001_EXPERIMENT_NAME/
- Create folder:
Project_YYYYMM_ShortName
when starting a new project - Copy template README.md from
/99_DOCUMENTATION/Templates/
(ADD LINK) - Fill in README.md (project title, PI, description, data types)
- Copy template metadata.yaml (ADD LINK)
- Fill in metadata.yaml (at minimum: project_id, pi, start_date)
- Create subdirectories:
- 01_raw_data/ (with data type folders)
- 02_processed/
- 03_analysis/
- 04_figures/
- Add project to lab inventory:
/99_DOCUMENTATION/project_inventory.xlsx
Time investment: 15 minutes
Prevents: Hours of future confusion and data loss
Saving New Data
☑ NEW DATA CHECKLIST (5 minutes per dataset)
- Save to correct location:
- Imaging →
raw_data/imaging/
- IMC →
raw_data/imc/
- scRNA-seq →
raw_data/scrnaseq/
-
Spatial →
raw_data/spatial/
-
Use naming convention:
YYYYMMDD_DataType_Sample_Condition.ext
-
Update metadata.yaml:
- Add new dataset entry
- Record date, sample info, experimental conditions
Time investment: 5 minutes
Prevents: Data corruption, lost samples, cannot reproduce results
Monthly Data Hygiene
☑ MONTHLY CHECKLIST (1 hour per month)
BALAZS: Is this realistic to expect lab members to do this every month? How can we adapt to make it easier? Automate at all?
Week 1:
- Run storage monitor:
python storage_monitor.py /RDS
- Review storage report
- If >80% full, proceed to emergency cleanup
Week 2:
- Compress imaging data >3 months old
- Expected recovery: 100-200GB
- Run:
python compress_old_imaging.py --days 90
Week 3:
- Move projects >6 months inactive to
01_ARCHIVED_PROJECTS/
- Delete temporary analysis files (cache/, *.tmp)
- Find:
find /RDS -name "cache" -type d -exec rm -rf {} \;
Week 4:
- Verify backup integrity (spot check 5 random files)
- Rotate external HDD (swap onsite/offsite)
- Update lab data inventory
Time investment: 1 hour per month
Prevents: Storage crises, data loss, backup failures
Mandatory Templates
README.md Template (Required for all projects)
# Project: [Title]
## Project Info
- **ID:** Project_YYYYMM_Name
- **PI:** [Name]
- **Lead:** [Name]
- **Start:** YYYY-MM-DD
- **Status:** Active/Archived/Published
## Description
[2-3 sentences describing research question]
## Data Types
- [ ] Imaging
- [ ] IMC
- [ ] scRNA-seq
- [ ] Spatial transcriptomics
## Key Files
- Raw data: `01_raw_data/`
- Processed: `02_processed/`
- Analysis: `03_analysis/`
- Figures: `04_figures/`
## Data storage location
- OneDrive Link: [Insert link]
- RDS Path: `/RDS/usrs/YourName/Project_YYYYMM_ShortName/`
- External HDD ID: [Insert ID]
## Notes
[Any important notes about this project]
## Publications
[List any papers from this project]
metadata.yaml Template (Required for all projects)
# Project Metadata
project_id: "Project_YYYYMM_Name"
pi_name: "Dr. Name"
pi_email: "pi@university.edu"
lead_researcher: "Researcher Name"
start_date: "YYYY-MM-DD"
status: "active" # active, archived, published
funding_source: "Grant XYZ"
# Data Summary
data_types:
- imaging
- imc
- scrnaseq
- spatial
total_size_gb: 0 # Update periodically
# Raw Data Inventory
raw_data:
- dataset_id: "20240115_Imaging_Experiment1"
date_acquired: "2024-01-15"
data_type: "imaging"
modality: "confocal"
sample_type: "mouse_brain"
size_gb: 50
location: "01_raw_data/imaging/20240115_Experiment1/"
notes: "Initial pilot experiment"
# Processing History
processing_log:
- date: "2024-01-20"
action: "Segmentation and quantification"
software: "CellProfiler 4.2"
output: "02_processed/imaging/20240120_segmented/"
# Publications
publications: []
# Last Updated
last_updated: "YYYY-MM-DD"
updated_by: "Name"