D-WS1-001 / D-WS2-001
SCF Multi-Omics Dataset
PROJECT HELIX-HTT | Huntington’s-Adjacent Neuro-Oncology Vaccine Program
Document Control
Program Code: SCF-ARGEN-HTT-NO-01
Workflow Stage: Discovery Phase
Deliverable Type: Core Multi-Omics Dataset
Purpose: Establish the integrated data foundation for antigen discovery, antigen ranking, vaccine candidate assembly, and SCF microenvironment scoring.
1. Dataset Purpose
The SCF Multi-Omics Dataset is the foundational evidence package for identifying vaccine-relevant tumor antigens. It integrates tumor genomics, transcriptomics, HLA typing, immunopeptidomics, and SCF microenvironment biomarkers into one structured dataset.
This dataset answers five core questions:
- What mutations are present in the tumor?
- Which mutated or tumor-associated genes are actually expressed?
- Which peptides can be presented by the patient’s HLA system?
- Which peptides are confirmed as HLA-presented on the tumor cell surface?
- Is the tumor microenvironment suitable for vaccine-mediated immune attack?
2. Required Dataset Components
Dataset Layer | Source Material | Output File | Primary Use |
Tumor Genomics | Tumor DNA | MCAT (Mutation Catalog) | Mutation discovery |
Matched Normal Genomics | Blood, buccal, or normal tissue | Germline filter | Remove inherited variants |
Tumor Transcriptomics | Tumor RNA | EXAT (Expression Atlas) | Expression validation |
HLA Typing | Blood or tumor-normal DNA | HCM (HLA Compatibility Matrix) | Antigen presentation prediction |
Immunopeptidomics | Tumor lysate | PAL (Presented Antigen Library) | Confirm presented peptides |
RHENOVA / SCF Microenvironment | Tumor, plasma, CSF, PBMC | MES Input File | Immune-terrain scoring |
3. Biospecimen Registry
Sample ID | Sample Type | Priority | Requirement Level | Intended Analysis |
TUM-001 | Fresh tumor tissue | 1 | Preferred | WES/WGS, RNA-seq, immunopeptidomics |
TUM-002 | Frozen tumor tissue | 2 | Acceptable | WES/WGS, RNA-seq, immunopeptidomics |
TUM-003 | FFPE tumor tissue | 3 | Backup | WES/WGS, limited RNA-seq |
NOR-001 | Peripheral blood | 1 | Required | Matched normal DNA, HLA typing |
NOR-002 | Buccal swab | 2 | Backup | Matched normal DNA |
PBMC-001 | PBMC fraction | Optional | Preferred | Immune profiling |
CSF-001 | CSF | Optional | Exploratory | CNS biomarkers |
PLA-001 | Plasma | Optional | Exploratory | ctDNA, inflammatory markers |
LYS-001 | Tumor lysate | Preferred | Required for PAL | Immunopeptidomics |
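The numeric priorities for tumor tissue can be read as a fallback order: use fresh tissue if collected, otherwise frozen, otherwise FFPE. The following is a minimal sketch of that reading. The function name `pick_tumor_sample` and the `available` argument are hypothetical and not part of the registry itself.

```python
from typing import Optional

# Tumor sample priorities copied from the registry table (1 = most preferred).
TUMOR_PRIORITY = {"TUM-001": 1, "TUM-002": 2, "TUM-003": 3}


def pick_tumor_sample(available: set[str]) -> Optional[str]:
    """Return the highest-priority tumor sample among those collected, or None."""
    candidates = [s for s in available if s in TUMOR_PRIORITY]
    if not candidates:
        return None
    # Lower number means higher priority.
    return min(candidates, key=lambda s: TUMOR_PRIORITY[s])
```

For example, a case with only `TUM-002`, `TUM-003`, and `NOR-001` collected would resolve to `TUM-002`.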
4. Data Layers and Minimum Specifications
4.1 Genomics Layer
Objective: Identify somatic mutations and tumor-specific sequence changes.
Parameter | Minimum Requirement |
Assay | WES or WGS |
Matched normal required | Yes |
Variant classes | SNVs, indels, structural variants, fusions |
Output | Mutation Catalog |
File format | FASTQ, BAM/CRAM, VCF, annotated TSV |
Primary Output: MCAT — Mutation Catalog
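The role of the matched normal can be illustrated with a simple set subtraction: a variant seen in both tumor and normal is treated as inherited and dropped. This is only a sketch of the idea, assuming variants are keyed by chromosome, position, reference allele, and alternate allele. Production somatic callers model tumor and normal jointly rather than subtracting call sets, and the coordinates below are made up.

```python
# A variant is keyed by (chromosome, position, reference allele, alternate allele).
Variant = tuple[str, int, str, str]


def subtract_germline(tumor_calls: list[Variant], normal_calls: list[Variant]) -> list[Variant]:
    """Keep tumor variants that are not also called in the matched normal."""
    germline = set(normal_calls)
    return [v for v in tumor_calls if v not in germline]


# Made-up coordinates, for illustration only.
tumor = [("chr1", 1000, "A", "T"), ("chr2", 2000, "G", "C")]
normal = [("chr2", 2000, "G", "C")]
somatic = subtract_germline(tumor, normal)
```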
4.2 Transcriptomics Layer
Objective: Confirm that candidate antigen source genes are actively expressed.
Parameter | Minimum Requirement |
Assay | Bulk RNA-seq |
Optional extension | Single-cell RNA-seq, spatial transcriptomics |
Output | Expression Atlas |
File format | FASTQ, BAM, count matrix, TPM/FPKM table |
Primary Output: EXAT — Expression Atlas
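Expression confirmation amounts to checking each candidate source gene against the TPM table. A minimal sketch follows. The cutoff of 1.0 TPM is an illustrative assumption, not a program-defined threshold, and the gene names are placeholders.

```python
def expressed_genes(tpm_by_gene: dict[str, float], min_tpm: float = 1.0) -> set[str]:
    """Return genes whose TPM meets or exceeds the (assumed) cutoff."""
    return {gene for gene, tpm in tpm_by_gene.items() if tpm >= min_tpm}


# Placeholder gene names and values, for illustration only.
tpm = {"GENE_A": 25.3, "GENE_B": 0.2, "GENE_C": 1.0}
confirmed = expressed_genes(tpm)
```

Keeping the cutoff as a parameter lets the program set its own value once one is agreed.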
4.3 HLA Typing Layer
Objective: Determine which peptides the patient can present to immune cells.
Parameter | Minimum Requirement |
HLA Class I | HLA-A, HLA-B, HLA-C |
HLA Class II | HLA-DR, HLA-DQ |
Output | HLA Compatibility Matrix |
File format | HLA allele table |
Primary Output: HCM — HLA Compatibility Matrix
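A completeness check on the HLA allele table might look like the sketch below. It assumes alleles are written in the standard `HLA-A*02:01` style and that any gene in the DR or DQ family (for example `HLA-DRB1`, `HLA-DQB1`) satisfies the corresponding Class II requirement. The function name `missing_loci` is hypothetical.

```python
CLASS_I_LOCI = ("HLA-A", "HLA-B", "HLA-C")
CLASS_II_FAMILIES = ("HLA-DR", "HLA-DQ")


def missing_loci(alleles: list[str]) -> list[str]:
    """List the required loci or families with no typed allele."""
    # The gene name is the part before '*', e.g. 'HLA-DRB1' in 'HLA-DRB1*15:01'.
    genes = {a.split("*", 1)[0] for a in alleles}
    missing = [locus for locus in CLASS_I_LOCI if locus not in genes]
    missing += [
        fam for fam in CLASS_II_FAMILIES
        if not any(g.startswith(fam) for g in genes)
    ]
    return missing


typed = ["HLA-A*02:01", "HLA-B*07:02", "HLA-C*07:02", "HLA-DRB1*15:01"]
gaps = missing_loci(typed)
```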
4.4 Immunopeptidomics Layer
Objective: Identify naturally presented HLA-bound peptides from tumor material.
Parameter | Minimum Requirement |
Assay | LC-MS/MS immunopeptidomics |
Input | Tumor lysate |
Output | Presented Antigen Library |
File format | Peptide table, spectral evidence file |
Primary Output: PAL — Presented Antigen Library
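The integrated schema records an immunopeptidomics status of confirmed, predicted only, or absent for each peptide. One plausible reading of those three values is sketched below, assuming "confirmed" means detected by LC-MS/MS, "predicted only" means predicted to bind but not detected, and "absent" means neither. That interpretation and the peptide labels are assumptions.

```python
def immunopeptidomics_status(peptide: str, predicted: set[str], detected: set[str]) -> str:
    """Classify a peptide against the predicted set and the MS-detected set."""
    if peptide in detected:
        return "Confirmed"
    if peptide in predicted:
        return "Predicted only"
    return "Absent"


# Placeholder peptide labels, for illustration only.
predicted = {"PEP_1", "PEP_2"}
detected = {"PEP_1"}
statuses = {p: immunopeptidomics_status(p, predicted, detected)
            for p in ("PEP_1", "PEP_2", "PEP_3")}
```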
4.5 SCF Microenvironment Layer
Objective: Determine whether the tumor environment supports vaccine response.
Domain | Marker Set |
Hypoxia | HIF-1α, CAIX, VEGF, pO₂ |
Redox stress | 8-OHdG, MDA, GSH:GSSG |
Immune access | CD8, IFNγ, CXCL9, CXCL10 |
Immune suppression | PD-L1, T-cell exhaustion markers |
Microglial state | IBA1, TREM2, TNFα, IL-1β |
Primary Output: MES Input File — Microenvironment Suitability Dataset
5. Integrated Dataset Schema
Field | Description |
Patient/Case ID | De-identified case identifier |
Sample ID | Linked biospecimen identifier |
Tumor type | Neuro-oncology diagnosis |
Tumor region | Anatomical or spatial location |
Variant ID | Mutation identifier |
Gene | Gene symbol |
Variant type | SNV, indel, fusion, splice, structural variant |
RNA expression | TPM / normalized expression |
HLA allele | Presenting allele |
Predicted peptide | Candidate antigen peptide |
Binding score | HLA binding prediction |
Immunopeptidomics status | Confirmed / predicted only / absent |
SCF-MES score | Microenvironment suitability score |
CNS risk flag | Low / moderate / high |
Immune escape flag | Low / moderate / high |
Antigen class | Neoantigen / TAA / SCF experimental antigen |
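One row of the integrated dataset could be represented as a typed record with the controlled vocabularies enforced at construction. The sketch below assumes Python field names, numeric types for scores, and capitalized vocabulary spellings. None of these are fixed by the schema, and the example values are placeholders.

```python
from dataclasses import dataclass

PEPTIDE_STATUS = {"Confirmed", "Predicted only", "Absent"}
RISK_LEVELS = {"Low", "Moderate", "High"}
ANTIGEN_CLASSES = {"Neoantigen", "TAA", "SCF experimental antigen"}


@dataclass(frozen=True)
class AntigenRecord:
    case_id: str
    sample_id: str
    tumor_type: str
    tumor_region: str
    variant_id: str
    gene: str
    variant_type: str
    rna_expression_tpm: float
    hla_allele: str
    predicted_peptide: str
    binding_score: float
    immunopeptidomics_status: str
    scf_mes_score: float
    cns_risk_flag: str
    immune_escape_flag: str
    antigen_class: str

    def __post_init__(self) -> None:
        # Reject values outside the vocabularies listed in the schema table.
        checks = [
            ("immunopeptidomics_status", self.immunopeptidomics_status, PEPTIDE_STATUS),
            ("cns_risk_flag", self.cns_risk_flag, RISK_LEVELS),
            ("immune_escape_flag", self.immune_escape_flag, RISK_LEVELS),
            ("antigen_class", self.antigen_class, ANTIGEN_CLASSES),
        ]
        for name, value, allowed in checks:
            if value not in allowed:
                raise ValueError(f"{name}={value!r} not in {sorted(allowed)}")


# Placeholder values, for illustration only.
example_fields = dict(
    case_id="CASE-0001", sample_id="TUM-001", tumor_type="placeholder",
    tumor_region="placeholder", variant_id="VAR-0001", gene="GENE_A",
    variant_type="SNV", rna_expression_tpm=25.3, hla_allele="HLA-A*02:01",
    predicted_peptide="PEP_1", binding_score=0.0,
    immunopeptidomics_status="Confirmed", scf_mes_score=0.0,
    cns_risk_flag="Low", immune_escape_flag="Low", antigen_class="Neoantigen",
)
example = AntigenRecord(**example_fields)
```

Validating at construction means a malformed row fails where it is created rather than later at antigen ranking.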
6. Quality Control Requirements
QC Domain | Acceptance Rule |
Tumor purity | Must be sufficient for variant calling |
Matched normal | Required for germline subtraction |
RNA integrity | Must support reliable expression calling |
HLA typing | Must resolve Class I and Class II alleles |
Immunopeptidomics | Spectral confidence must be documented |
Microenvironment markers | Must include hypoxia, redox, and immune-access domains |
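The microenvironment rule can be checked mechanically against the marker sets listed in Section 4.5. The sketch below assumes that a domain counts as included when at least one of its markers has been measured. The acceptance rule does not state how many markers per domain are needed, so that threshold is an assumption.

```python
# Marker sets copied from the SCF Microenvironment Layer table, required domains only.
REQUIRED_DOMAINS = {
    "hypoxia": {"HIF-1α", "CAIX", "VEGF", "pO₂"},
    "redox": {"8-OHdG", "MDA", "GSH:GSSG"},
    "immune_access": {"CD8", "IFNγ", "CXCL9", "CXCL10"},
}


def uncovered_domains(measured: set[str]) -> list[str]:
    """Return required domains with no measured marker (assumed rule: one is enough)."""
    return [
        domain for domain, markers in REQUIRED_DOMAINS.items()
        if not (markers & measured)
    ]


gaps = uncovered_domains({"CAIX", "MDA", "PD-L1"})
```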
7. Data Integration Logic
The dataset is integrated through the following sequence:
Tumor DNA
→ Somatic mutation discovery
→ RNA expression confirmation
→ HLA presentation prediction
→ Immunopeptidomics confirmation
→ SCF microenvironment scoring
→ CNS safety filtering
→ Antigen ranking
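The sequence above can be sketched as an ordered list of stages, each of which narrows the candidate list, with a count recorded after every step. Only two stages are shown. The field names (`tpm`, `cns_risk`), the 1.0 TPM cutoff, and the rule of dropping high-risk candidates are illustrative assumptions, not program rules.

```python
from typing import Any, Callable

Candidate = dict[str, Any]
Stage = Callable[[list[Candidate]], list[Candidate]]


def keep_if(predicate: Callable[[Candidate], bool]) -> Stage:
    """Build a stage that keeps only candidates satisfying the predicate."""
    def stage(candidates: list[Candidate]) -> list[Candidate]:
        return [c for c in candidates if predicate(c)]
    return stage


def run_stages(candidates: list[Candidate], stages: list[tuple[str, Stage]]):
    """Apply stages in order; return survivors plus a per-stage count trace."""
    trace = [("input", len(candidates))]
    for name, stage in stages:
        candidates = stage(candidates)
        trace.append((name, len(candidates)))
    return candidates, trace


# Placeholder rules: cutoffs and field names are assumptions.
STAGES = [
    ("RNA expression confirmation", keep_if(lambda c: c["tpm"] >= 1.0)),
    ("CNS safety filtering", keep_if(lambda c: c["cns_risk"] != "High")),
]

candidates = [
    {"peptide": "PEP_1", "tpm": 12.0, "cns_risk": "Low"},
    {"peptide": "PEP_2", "tpm": 0.1, "cns_risk": "Low"},
    {"peptide": "PEP_3", "tpm": 8.0, "cns_risk": "High"},
]
survivors, trace = run_stages(candidates, STAGES)
```

The trace shows how many candidates each stage removed, which may be useful for the QC summary report.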
8. Deliverable Output Package
The completed Multi-Omics Dataset must include:
- Biospecimen Registry
- Mutation Catalog input files
- Expression Atlas input files
- HLA Compatibility Matrix input files
- Presented Antigen Library input files
- SCF Microenvironment biomarker table
- Integrated antigen-ready master dataset
- QC summary report
9. Decision Readiness
This deliverable is considered complete when the dataset can support:
- Neoantigen discovery
- Tumor-associated antigen filtering
- HLA presentation prediction
- Immunopeptidomics confirmation
- SCF-MES calculation
- Ranked antigen candidate generation
Status
Deliverable 1 Status: Draft Complete
Next Core Deliverable: Mutation Catalog