STANDARDIZED ORGANIZATION
Mandatory Directory Structure
/RDS_Lab_Storage/  # General lab RDS storage
│
├── USERS/
│   └── YourName/
│       └── Project_YYYYMM_ShortName/  # Active projects, use when relevant
│           ├── project_overview.md  # REQUIRED – summary of project goals, collaborators, dates
│           ├── 001_EXPERIMENT_NAME/
│           │   ├── README.md  # REQUIRED – overview of this specific experiment
│           │   ├── metadata.yaml  # REQUIRED – structured metadata for reproducibility
│           │   ├── 01_raw_data/
│           │   │   ├── ELISA/
│           │   │   ├── PCR/
│           │   │   └── Viability/
│           │   ├── 02_processed/
│           │   ├── 03_analysis/
│           │   ├── 04_figures/
│           │   └── 05_other/
│           │
│           └── 002_EXPERIMENT_NAME2/  # Additional experiments follow same structure
│
├── 01_SHARED_DATASETS/  # Shared or large datasets
│   ├── flow_cytometry/
│   ├── scRNAseq/
│   ├── spatial_transcriptomics/
│   └── IMC/
│
├── 02_ANALYSIS_PIPELINES/  # Cloned GitHub repo and validated analysis code for finalized pipelines
│   ├── imaging_pipeline/
│   ├── imc_pipeline/
│   └── scrna_pipeline/
│
└── 03_DOCUMENTATION/  # Templates, SOPs, and guides
    ├── Templates/
    └── SOP_Protocols/
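Creating the experiment skeleton by hand is error-prone, so a small helper can make it a one-line step. The following is a minimal sketch, not an existing lab tool: the function name new_experiment is hypothetical, and it assumes empty placeholder files are an acceptable starting point for the REQUIRED documents.

```shell
#!/bin/sh
# Sketch: create the standard experiment skeleton inside a project folder.
# new_experiment is a hypothetical helper name, not an existing lab tool.
new_experiment() {
    local project_dir="$1"   # e.g. .../USERS/YourName/Project_YYYYMM_ShortName
    local exp_name="$2"      # e.g. 001_EXPERIMENT_NAME
    local exp_dir="$project_dir/$exp_name"
    local sub f

    # Numbered subfolders from the mandatory structure
    for sub in 01_raw_data 02_processed 03_analysis 04_figures 05_other; do
        mkdir -p "$exp_dir/$sub" || return 1
    done

    # Empty placeholders for the REQUIRED files; existing files are left untouched
    for f in "$project_dir/project_overview.md" "$exp_dir/README.md" "$exp_dir/metadata.yaml"; do
        [ -e "$f" ] || : > "$f"
    done
}

# Usage (illustrative path):
# new_experiment "/RDS_Lab_Storage/USERS/YourName/Project_YYYYMM_ShortName" "001_EXPERIMENT_NAME"
```

Assay-specific folders under 01_raw_data/ (ELISA/, PCR/, Viability/ and so on) are left for the user to add, since they differ per experiment.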
File Naming Convention (Mandatory)
Consistent file names are crucial for easy identification, retrieval, and organization. Do not use spaces or special characters in file names, and keep capitalization consistent.
Follow this convention for all files:
Format:
YYYYMMDD_DataType_Sample_Condition.extension
Examples:
Imaging:
20240115_Confocal_MouseBrain_Section1_DAPI.tif
20240115_Confocal_MouseBrain_Section1_GFP.tif
IMC:
20240115_IMC_Tumor_Patient01_Core1.mcd
20240115_IMC_Panel_37markers.csv
scRNA-seq:
20240115_scRNAseq_PBMC_Donor01_counts.h5ad
20240115_scRNAseq_PBMC_Donor01_filtered.h5ad
Analysis outputs:
20240115_DiffExp_TumorVsNormal_DESeq2.csv
20240115_Clustering_Res08_UMAP.pdf
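The convention can be checked mechanically. The sketch below is one possible reading of the rule, not an official validator: the function name is_valid_name is hypothetical, and it assumes that "special characters" means anything other than letters, digits, and the underscore separator, and that at least three underscore-separated fields follow the date. It checks only that the date is eight digits, not that it is a real calendar date.

```shell
#!/bin/sh
# Sketch: test a file name against YYYYMMDD_DataType_Sample_Condition.extension.
# is_valid_name is a hypothetical helper; the allowed character set is an assumption.
is_valid_name() {
    # Strip any leading directory so only the file name is checked
    local name="${1##*/}"
    # 8 digits, then three or more _Field groups, then .extension
    printf '%s\n' "$name" | grep -Eq '^[0-9]{8}(_[A-Za-z0-9]+){3,}\.[A-Za-z0-9]+$'
}

# Usage:
# is_valid_name "20240115_IMC_Tumor_Patient01_Core1.mcd" && echo "ok"
# is_valid_name "2024-01-15 confocal brain.tif" || echo "rename needed"
```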
Decision Tree: "Where Does This File Go?"
START: I have a new file
❓ Is it RAW data from instrument?
→ YES: 01_raw_data/[data_type]/
→ NO: Continue
❓ Is it PROCESSED/QC'd data?
→ YES: 02_processed/
→ NO: Continue
❓ Is it ANALYSIS code or results?
→ YES: 03_analysis/
→ NO: Continue
❓ Is it a FIGURE for publication?
→ YES: 04_figures/
→ NO: Continue
❓ Is it SHARED reference data?
→ YES: 01_SHARED_DATASETS/
→ NO: Ask Data Steward
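The same decision tree can be written as a lookup, which may be handy inside a file-sorting script. This is an illustrative sketch only: the function name destination_for and the category keywords (raw, processed, analysis, figure, shared) are assumptions, and the shared-data folder uses the name 01_SHARED_DATASETS/ from the directory structure.

```shell
#!/bin/sh
# Sketch: map a file category to its destination folder, following the decision tree.
# destination_for and its category keywords are hypothetical.
destination_for() {
    local kind="$1"
    local data_type="${2:-[data_type]}"   # placeholder is printed when no data type is given
    case "$kind" in
        raw)       echo "01_raw_data/${data_type}/" ;;
        processed) echo "02_processed/" ;;
        analysis)  echo "03_analysis/" ;;
        figure)    echo "04_figures/" ;;
        shared)    echo "01_SHARED_DATASETS/" ;;
        *)         echo "Ask Data Steward" ;;
    esac
}

# Usage:
# destination_for raw ELISA
# destination_for figure
```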
Enforcement Mechanisms
BALAZS: Is this realistic to expect lab members to do this every month? How can we adapt to make it easier? Automate at all?
Automated weekly check (Mondays):
# Check that every project has its required files
LOG=/var/log/compliance_issues.txt
: > "$LOG"  # start each weekly report empty
# -mindepth 1 skips the top-level folder itself; read -r keeps unusual names intact
find /RDS/00_ACTIVE_PROJECTS -mindepth 1 -maxdepth 1 -type d | while IFS= read -r project; do
    missing=0
    if [ ! -f "$project/README.md" ]; then
        echo "MISSING README: $project" >> "$LOG"
        missing=1
    fi
    if [ ! -f "$project/metadata.yaml" ]; then
        echo "MISSING METADATA: $project" >> "$LOG"
        missing=1
    fi
    # Move once, after both checks, so the second check still sees the original path
    if [ "$missing" -eq 1 ]; then
        mv "$project" /RDS/NEEDS_ORGANIZATION/
    fi
done
# Email compliance report to Data Steward
mail -s "Weekly Organization Compliance Report" data-steward@uni.edu < "$LOG"
Manual enforcement:
- Projects in /NEEDS_ORGANIZATION/ cannot be worked on
- Must add missing files to restore access
- Data Steward reviews and approves restoration
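To make compliance easier before anything is moved, a report-only pass could list what is missing at the levels the directory structure actually requires: project_overview.md per project, and README.md plus metadata.yaml per experiment. The sketch below is a hypothetical helper (report_missing is not an existing tool) and assumes experiment folders start with a three-digit prefix such as 001_.

```shell
#!/bin/sh
# Sketch: list missing REQUIRED files for one project without moving anything.
# report_missing is a hypothetical helper; the 3-digit experiment prefix is an assumption.
report_missing() {
    local project="$1"
    local exp f

    [ -f "$project/project_overview.md" ] || echo "MISSING project_overview.md: $project"

    for exp in "$project"/[0-9][0-9][0-9]_*/; do
        [ -d "$exp" ] || continue   # the glob matched nothing
        exp="${exp%/}"              # drop the trailing slash for cleaner output
        for f in README.md metadata.yaml; do
            [ -f "$exp/$f" ] || echo "MISSING $f: $exp"
        done
    done
    return 0
}

# Usage (illustrative path):
# report_missing "/RDS_Lab_Storage/USERS/YourName/Project_YYYYMM_ShortName"
```

Lab members could run this themselves before the Monday check, which keeps the weekly move to /NEEDS_ORGANIZATION/ as a last resort.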