Single-Cell Proteomics Analysis Service — Carrier-Multiplexed MS Quantification from PSM Tables to Annotated Cell-State Clusters

Single-cell proteomics quantifies proteins per cell to resolve the cell-state heterogeneity that bulk LC-MS/MS averages away. Pepkio delivers a version-pinned analysis from PSM or protein matrices to cell-state clusters and differential-expression (DE) tables, with bespoke support scoped at kickoff. The service is for academic, biotech, and pharma clients. As a depth benchmark, SCoPE2 reports ~1,000 proteins per cell (Specht et al., 2021). R scripts, figures, and a Methods draft are included.

Key facts

Key facts about Single-Cell Proteomics
Fact	Value
Supported platforms / instruments	Thermo Orbitrap (Exploris, Fusion Lumos, Eclipse); Bruker timsTOF (dia-PASEF, Ultra when scoped); TMT/TMTpro isobaric carrier multiplexing (SCoPE2-style; Specht et al., 2021); plexDIA and nPOP; label-free DIA single-cell; Proteome Discoverer, MaxQuant, and DIA-NN exports — scoped at kickoff
Input requirements	PSM or protein quantification tables plus colData with cell ID, MS batch/run, condition, and carrier channel; clustering typically needs hundreds of cells per condition, and a diluted bulk reference sample per batch is recommended (Deutsch et al., 2023)
Reference builds supported	UniProt Swiss-Prot reviewed proteomes (human UP000005640, mouse UP000000589); CPTAC-style contaminant sequences; organism-specific FASTA on request
Primary tools (with versions)	scp 1.22+; QFeatures; SingleCellExperiment; QuantQC; DIA-NN 2.0+; MaxQuant 2.8+; proDA 1.0; limma 3.60 — pinned per project
Typical turnaround time	4–8 weeks (standard SCP cohort, one primary contrast); longer for multi-batch integration, matched scRNA-seq, or atlas-scale cell counts — confirmed at kickoff
Deliverable formats	Protein matrices (.csv, .tsv); SingleCellExperiment .rds objects; PDF/SVG figures; HTML QC report; commented R scripts; Methods draft
Key cited best-practice reference	Deutsch et al. (2023), Nature Methods; Specht et al. (2021), Genome Biology (SCoPE2)
Custom / bespoke analysis	Non-standard matrices, cross-modality integration, client-specified QC thresholds, trajectory analysis, or phosphoproteomics extensions scoped at kickoff

What is single-cell proteomics?

Single-cell proteomics (SCP) converts peptide-spectrum match (PSM) tables into a cell-by-protein matrix, then applies per-cell QC, batch correction, clustering, and differential testing to measure which proteins define cell states rather than tissue averages (Deutsch et al., 2023). Unlike bulk DDA/DIA proteomics, SCP resolves subpopulations and continuous proteome gradients (Specht et al., 2021); unlike Olink PEA, it is discovery-scale MS without a fixed antibody panel. SCoPE2 quantified ~1,000 proteins per cell across 1,490 monocytes and macrophages (Specht et al., 2021). Pepkio starts from client PSM or protein matrices with documented QC; custom workflows are agreed at kickoff. See the single-cell proteomics glossary.

When should you use single-cell proteomics?

Single-cell proteomics fits when the research question requires protein-level cell-state heterogeneity—differentiation trajectories, rare subpopulations, or post-transcriptional regulation invisible to bulk or RNA measurements (Specht et al., 2021; Deutsch et al., 2023).

Comparison of single-cell MS proteomics, bulk DDA/DIA, scRNA-seq, and Olink PEA approaches
Approach	Best for	Limitations	Approximate cost range
Single-cell MS proteomics (SCoPE2 / DIA / nPOP)	Cell-state discovery, proteome gradients, low-input heterogeneity, matched multi-omics	Sparse matrices; carrier or library design affects depth; MS time per cell limits throughput	Higher per-cell MS and sample-prep cost than bulk; moderate bioinformatics cost
Bulk DDA/DIA proteomics	Cohort-scale differential abundance when tissue is homogeneous	Averages across cell types; rare states below detection	Lower per-sample MS cost than SCP for modest cohorts
scRNA-seq	Cell-type atlases, trajectory inference, large cell counts	mRNA proxy; post-transcriptional regulation and protein stability not measured directly (Specht et al., 2021: ~20-fold more protein than RNA copies per gene)	Library prep + sequencing; bioinformatics varies by cell count
Olink PEA	Targeted plasma or CSF panels, clinical-scale validation	Fixed panel size; not discovery-scale MS depth	Panel and plate cost; efficient for targeted biomarker testing

Macrophage differentiation: Specht et al. (2021) profiled 1,490 monocytes and macrophages by SCoPE2, revealing a continuous proteome gradient during differentiation—not discrete cytokine-polarization clusters.
ES cell subclusters: Rosenberger G et al. (2023) applied sensitivity-tailored DIA to mouse embryonic stem cells, identifying proteome subclusters with distinct metabolic enzyme expression.
Hepatocyte zonation: Rosenberger FA et al. (2023) used scDVP to map ~1,700 proteins per hepatocyte slice; half of the proteome was differentially regulated near the central vein.

How the analysis works — step by step

1. Validate inputs and experimental design
Pepkio confirms PSM or protein matrix integrity, MS batch structure, carrier channels, and contrast definitions in sample_manifest.csv. Confounded batch designs or very few cells per group are flagged before processing (Deutsch et al., 2023).
Tools and outputs
Tools used: Custom validation scripts; scp annotation schema check
Output: sample_manifest.csv with cell IDs, MS batch, condition labels, carrier design notes, and QC flags
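As an illustration of the design check described in this step, the sketch below flags a fully confounded batch/condition layout and undersized groups. It is written in Python for brevity and is not the delivered R code; the field names (`cell_id`, `ms_batch`, `condition`), the flag strings, and the `min_cells` cutoff are assumptions chosen for the example.

```python
from collections import defaultdict

def check_design(manifest, min_cells=3):
    """Return design flags for a list of manifest rows (dicts).

    Assumed (hypothetical) keys per row: 'cell_id', 'ms_batch', 'condition'.
    """
    conditions_per_batch = defaultdict(set)
    cells_per_condition = defaultdict(int)
    for row in manifest:
        conditions_per_batch[row["ms_batch"]].add(row["condition"])
        cells_per_condition[row["condition"]] += 1

    flags = []
    # Fully confounded: more than one condition overall, yet every batch
    # contains cells from a single condition only.
    single_condition_batches = all(
        len(conds) == 1 for conds in conditions_per_batch.values()
    )
    if single_condition_batches and len(cells_per_condition) > 1:
        flags.append("batch_condition_confounded")
    # min_cells is an illustrative cutoff, not a published recommendation.
    for condition, n in sorted(cells_per_condition.items()):
        if n < min_cells:
            flags.append(f"few_cells:{condition}")
    return flags
```

A partially confounded design (some batches mixed, others not) passes this check, so a real review would also look at the full batch-by-condition table.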
2. Import PSM tables into QFeatures structure
PSM quantification tables from MaxQuant, Proteome Discoverer, or DIA-NN are imported per MS batch using readSCP() into a QFeatures object with PSM → peptide → protein hierarchy (Vanderaa & Gatto, 2023; Cox & Mann, 2008; Demichev et al., 2020).
Tools and outputs
Tools used: scp 1.22+; QFeatures; SingleCellExperiment
Output: Per-batch QFeatures object; import log with row and column counts per set
3. Run MS-batch and cell-level QC
Run-level and per-cell metrics include reporter intensity (RI), CV, peptide counts, and sample-to-carrier ratio (SCR). QuantQC is applied when nPOP exports include it (Leduc et al., 2024; Deutsch et al., 2023).
Tools and outputs
Tools used: scp 1.22+; QuantQC; custom R diagnostics
Output: cell_qc_summary.csv; SCR distribution plots; RI and peptide-count histograms per batch
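The two QC quantities named in this step can be sketched as follows. This is a simplified Python illustration, not the scp implementation: SCR is taken as the single-cell reporter intensity divided by the carrier reporter intensity for the same PSM, and the per-cell CV is computed on raw peptide intensities rather than on normalized ratios. Channel labels and the `min_peptides` default are assumptions.

```python
from statistics import mean, median, stdev

def mean_scr(psm, carrier, cell_channels):
    """Mean sample-to-carrier ratio for one PSM.

    `psm` maps channel label -> reporter intensity. Returns None when the
    carrier signal or every single-cell signal is missing or zero.
    """
    carrier_ri = psm.get(carrier)
    if not carrier_ri:
        return None
    ratios = [psm[ch] / carrier_ri for ch in cell_channels if psm.get(ch)]
    return mean(ratios) if ratios else None

def median_cv(peptides_by_protein, min_peptides=3):
    """Median, across proteins, of the CV of peptide intensities in one cell.

    Proteins with fewer than `min_peptides` observed peptides are skipped.
    """
    cvs = []
    for intensities in peptides_by_protein.values():
        observed = [v for v in intensities if v]
        if len(observed) >= min_peptides:
            cvs.append(stdev(observed) / mean(observed))
    return median(cvs) if cvs else None
```

A cell whose peptides disagree strongly within proteins gets a high median CV, which is the signal used to flag it in the cell-filtering step.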
4. Filter low-confidence PSMs
PSMs are filtered at 1% peptide FDR. SCR-based filtering removes PSMs with unreliable single-cell quantification; carrier-only PSMs (missing SCR) are excluded (Deutsch et al., 2023; Vanderaa & Gatto, 2023).
Tools and outputs
Tools used: scp 1.22+
Output: Filtered PSM assay; psm_filter_log.csv documenting FDR and SCR thresholds
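A minimal sketch of this filter, assuming each PSM already carries a q-value and a precomputed mean SCR. It is Python for illustration only; the key names and the `max_mean_scr` default of 0.1 are assumptions made for the example, and the project thresholds are the ones recorded in psm_filter_log.csv.

```python
def filter_psms(psms, max_qvalue=0.01, max_mean_scr=0.1):
    """Split PSMs into kept rows and a count of removals per rule.

    Each PSM is a dict with hypothetical keys 'qvalue' and 'mean_scr'
    ('mean_scr' is None for carrier-only PSMs).
    """
    kept = []
    removed = {"fdr": 0, "scr_missing": 0, "scr_high": 0}
    for psm in psms:
        if psm["qvalue"] > max_qvalue:
            removed["fdr"] += 1          # fails the 1% FDR cutoff
        elif psm["mean_scr"] is None:
            removed["scr_missing"] += 1  # carrier-only, no single-cell signal
        elif psm["mean_scr"] > max_mean_scr:
            removed["scr_high"] += 1     # implausibly high single-cell signal
        else:
            kept.append(psm)
    return kept, removed
```

Rules are applied in a fixed order so that each removed PSM is counted once, which keeps the filter log additive.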
5. Aggregate PSM → peptide → protein
PSM intensities are aggregated to peptide and protein levels using median aggregation per scp conventions, preserving links between assay layers for audit and re-aggregation (Vanderaa & Gatto, 2023).
Tools and outputs
Tools used: scp 1.22+; QFeatures aggregation functions
Output: Peptide and protein assays within QFeatures; aggregation_log.csv
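Median aggregation can be illustrated with a small, dependency-free Python sketch. It is not the QFeatures implementation: it collapses rows that share a grouping key (for example a protein ID) to one row per group, taking the per-cell median and skipping missing values. The row layout and key names are assumptions.

```python
from collections import defaultdict
from statistics import median

def aggregate_median(rows, group_key, cell_ids):
    """Collapse rows sharing `group_key` into per-cell medians.

    `rows` are dicts holding the grouping key plus one entry per cell;
    missing intensities are None and are ignored in the median.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row)

    aggregated = {}
    for name, members in groups.items():
        aggregated[name] = {}
        for cell in cell_ids:
            observed = [m[cell] for m in members if m.get(cell) is not None]
            # A group with no observed value in a cell stays missing.
            aggregated[name][cell] = median(observed) if observed else None
    return aggregated
```

The same function can be applied twice, first with a peptide key on PSM rows and then with a protein key on the peptide rows, mirroring the PSM to peptide to protein hierarchy.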
6. Convert zeros to missing values and normalize
Zero intensities are converted to NA before normalization—treating zeros as missing rather than true absence (Deutsch et al., 2023; Vanderaa & Gatto, 2023). Protein matrices are log-transformed; MNAR versus MCAR missing-value patterns are profiled before any imputation (Lazar et al., 2016).
Tools and outputs
Tools used: scp 1.22+
Output: protein_matrix_raw.csv; missing_value_profile.csv; normalization parameters in analysis_parameters.yaml
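The order of operations in this step (zeros to missing, then log transform, then normalization) is sketched below in Python. The per-cell median centering shown here is an assumed normalization chosen for illustration; the method actually used in a project is the one recorded in analysis_parameters.yaml.

```python
import math
from statistics import median

def zero_to_na(matrix):
    """Replace zero (or already missing) intensities with None."""
    return {
        protein: {cell: (v if v else None) for cell, v in row.items()}
        for protein, row in matrix.items()
    }

def log2_median_center(matrix):
    """Log2-transform observed values, then subtract each cell's median."""
    logged = {
        protein: {
            cell: (math.log2(v) if v is not None else None)
            for cell, v in row.items()
        }
        for protein, row in matrix.items()
    }
    cells = {cell for row in logged.values() for cell in row}
    cell_median = {}
    for cell in cells:
        observed = [row[cell] for row in logged.values() if row.get(cell) is not None]
        cell_median[cell] = median(observed) if observed else 0.0
    return {
        protein: {
            cell: (v - cell_median[cell] if v is not None else None)
            for cell, v in row.items()
        }
        for protein, row in logged.items()
    }
```

Converting zeros first matters because log2(0) is undefined, and because a zero kept as a number would pull the cell median down and distort every other value in that cell.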
7. Filter low-quality cells
Cells with low RI, high CV, or very few quantified peptides are removed using batch-adaptive thresholds. Retained and excluded counts are documented per filter rule (Deutsch et al., 2023).
Tools and outputs
Tools used: scp 1.22+
Output: Filtered protein matrix; updated cell_qc_summary.csv; cell-exclusion plots
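One way to make a threshold batch-adaptive is to derive it from each batch's own distribution. The Python sketch below uses a lower bound of median minus k times the median absolute deviation (MAD) on a single metric; this rule, the `k=3.0` default, and the key names are assumptions for illustration rather than the thresholds applied in a given project.

```python
from collections import defaultdict
from statistics import median

def mad_lower_bound(values, k=3.0):
    """Lower cutoff: median minus k times the median absolute deviation."""
    center = median(values)
    mad = median(abs(v - center) for v in values)
    return center - k * mad

def filter_cells(cells, metric="n_peptides", k=3.0):
    """Keep cells whose metric is at or above their own batch's cutoff.

    Each cell is a dict with hypothetical keys 'cell_id', 'ms_batch',
    and the chosen metric.
    """
    values_by_batch = defaultdict(list)
    for cell in cells:
        values_by_batch[cell["ms_batch"]].append(cell[metric])
    bounds = {b: mad_lower_bound(v, k) for b, v in values_by_batch.items()}

    kept = [c for c in cells if c[metric] >= bounds[c["ms_batch"]]]
    dropped = [c["cell_id"] for c in cells if c[metric] < bounds[c["ms_batch"]]]
    return kept, dropped, bounds
```

With per-batch cutoffs, a run that is uniformly shallower than the others is not discarded wholesale; only cells that are outliers within their own run are removed.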
8. Correct batch effects across MS runs
SCP acquires few multiplexed cells per MS run, so batch structure is inherent. scplainer removes technical variation when shared cell states span batches; batch is modeled as a covariate when correction is not possible (Deutsch et al., 2023).
Tools and outputs
Tools used: scp 1.22+ (scplainer); limma 3.60 when batch is modeled
Output: Batch-corrected protein matrix; before/after PCA by batch and condition
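To make the idea of removing a per-batch offset concrete, the Python sketch below centers one protein's log2 values within each MS batch and then restores the overall level. This is a deliberately naive stand-in, not scplainer and not a limma model: it is only defensible when batches have comparable cell-state composition, and all names are assumptions.

```python
from collections import defaultdict
from statistics import mean

def center_by_batch(values, batches):
    """Remove per-batch mean offsets for a single protein.

    `values` maps cell -> log2 abundance (None when missing);
    `batches` maps cell -> MS batch label.
    """
    observed_by_batch = defaultdict(list)
    for cell, v in values.items():
        if v is not None:
            observed_by_batch[batches[cell]].append(v)
    offsets = {b: mean(v) for b, v in observed_by_batch.items()}
    # Re-add the mean of batch means so corrected values stay on the
    # original scale instead of being centered at zero.
    grand = mean(list(offsets.values()))
    return {
        cell: (v - offsets[batches[cell]] + grand if v is not None else None)
        for cell, v in values.items()
    }
```

If condition and batch are confounded, this kind of centering removes the biological difference along with the technical one, which is why such designs are flagged rather than corrected.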
9. Cluster, embed, and annotate cell states
Pepkio builds a neighbor graph on the corrected protein matrix, runs Leiden community detection at data-driven resolution, and computes PCA/UMAP embeddings. Cell states receive provisional labels from top marker proteins per cluster.
Tools and outputs
Tools used: scp 1.22+; leidenalg 0.10+; custom R visualization
Output: Cluster assignments in cell_metadata.csv; PCA/UMAP plots; marker_protein_table.csv
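The neighbor graph that community detection operates on can be sketched as a k-nearest-neighbor lookup. The Python example below is an illustration under stated assumptions: distances are root-mean-square differences over proteins observed in both cells, pairs sharing fewer than `min_shared` proteins are treated as unconnected, and Leiden itself is not implemented here.

```python
import math

def distance(a, b, min_shared=2):
    """RMS difference over proteins observed (not None) in both cells."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if len(shared) < min_shared:
        return math.inf
    return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

def knn_graph(profiles, k=2, min_shared=2):
    """Map each cell to its k nearest neighbors by `distance`.

    `profiles` maps cell ID -> list of log2 abundances in a fixed
    protein order, with None for missing values.
    """
    graph = {}
    for cell, vector in profiles.items():
        ranked = sorted(
            (distance(vector, other, min_shared), name)
            for name, other in profiles.items()
            if name != cell
        )
        graph[cell] = [name for d, name in ranked[:k] if d < math.inf]
    return graph
```

The all-pairs loop is quadratic in the number of cells, which is acceptable for a sketch but would be replaced by an approximate neighbor search at atlas scale.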
10. Test differential protein abundance and package deliverables
Cluster-wise or condition-wise differential testing uses proDA or limma with Benjamini–Hochberg FDR control, accounting for missing values without blind imputation (Korkmaz et al., 2020; Ritchie et al., 2015). Optional GO/KEGG enrichment is scoped at kickoff. Figures, scripts, README, and Methods draft are bundled.
Tools and outputs
Tools used: proDA 1.0; limma 3.60
Output: dep_by_cluster.csv or dep_<contrast>.csv; volcano plots; final .rds object; scripts; Methods draft
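The Benjamini–Hochberg adjustment that produces the adjusted p-value column can be written in a few lines. The Python sketch below follows the standard step-up procedure and is shown only to make the calculation explicit; in the R pipeline the adjustment comes from the testing packages themselves.

```python
def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    n = len(pvalues)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity so an
    # adjusted value never exceeds that of a larger raw p-value.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted
```

For the raw p-values 0.01, 0.04, 0.03, and 0.005, the adjusted values are approximately 0.02, 0.04, 0.04, and 0.02, so all four proteins would pass a 5% FDR cutoff.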

What Pepkio delivers

Processed data

Raw and filtered protein matrices (.csv, .tsv)
SingleCellExperiment .rds objects
cell_metadata.csv with QC metrics, cluster, and provisional cell-state labels
analysis_parameters.yaml documenting FDR, SCR, normalization, and batch-correction choices

Figures (PDF/SVG)

SCR and RI histograms
Per-cell peptide and protein count plots
PCA and UMAP (cluster, condition, MS batch)
Marker-protein heatmaps
Composition bar plots
DE volcano plots per contrast or cluster

Tables

sample_manifest.csv, cell_qc_summary.csv, psm_filter_log.csv, missing_value_profile.csv
marker_protein_table.csv
dep_by_cluster.csv or dep_<contrast>.csv with columns ProteinID, Gene, log2FC, SE, pvalue, adj.pvalue

Code

Commented R scripts per analysis stage
Environment lock files (sessionInfo(), renv lock, or conda export)
Delivery via private Git repository or agreed file transfer

Documentation

HTML QC report with thresholds and outlier flags
README with reproduction instructions
Methods draft citing pinned software versions and filtering parameters
Post-delivery reviewer support for clarification and minor revisions within agreed scope (typically ≤20% of project scope)

Technical decisions we make — and why

Data structure: QFeatures with one set per MS batch: Preserves SCR, carrier channel, and batch metadata that flat matrices discard (Vanderaa & Gatto, 2023). Alternative: single flat protein matrix (loses per-batch QC and SCR filtering).
PSM filtering: 1% FDR plus SCR-based filtering: SCR removes PSMs where single-cell reporter signal is unreliable relative to the isobaric carrier (Deutsch et al., 2023). Alternative: FDR-only filtering (retains peptides identified via carrier but poorly quantified in single cells).
Missing values: convert zeros to NA; proDA for DE: SCP missingness is predominantly MNAR, and blind imputation inflates false positives (Vanderaa & Gatto, 2023; Lazar et al., 2016). Alternative: KNN or MinDet imputation (scoped when design and missingness profile support it).
Batch correction: scplainer default; batch covariate in limma when not confounded: MS-run batch effects are unavoidable in SCP and can mimic biology if ignored (Deutsch et al., 2023). Alternative: no correction when batch and condition are confounded (documented in QC report).
Carrier-aware channel handling: exclude affected TMT channels at high carrier ratios when needed: Isotopic impurities from the carrier channel (126) affect channels 127N and 128C at high carrier ratios (Cheung et al., 2022; Leduc et al., 2020). Channel exclusion strategy is documented at kickoff.

Common questions

What is the minimum number of cells and proteins per cell for single-cell proteomics analysis?

Clustering usually needs hundreds of cells per condition. Deutsch et al. (2023) recommend reporting cell counts per group, and Specht et al. (2021) analyzed 1,490 cells. For depth, SCoPE2 reports ~1,000 proteins per cell (Specht et al., 2021), and nPOP with plexDIA reports ~3,000–3,700 proteins per human cell (Leduc et al., 2024). Smaller cohorts are accepted, with reduced-power notes in the QC report. Exact targets are confirmed at kickoff.

Can you analyze low-yield or failed single cells?

Yes, with caveats documented in the QC report. Cells with very low reporter intensity, high CV, or very few quantified peptides are flagged and typically excluded before clustering (Deutsch et al., 2023). Empty carrier channels and carrier-only PSMs are removed during SCR filtering. Re-acquisition or pooling strategies are discussed when cell yield threatens the study question.

Do you support Thermo Orbitrap, Bruker timsTOF, TMT multiplex, and plexDIA data?

Yes, for formats we can load after kickoff review. Thermo Orbitrap .raw files from Exploris, Fusion Lumos, and Eclipse instruments are accepted for TMT/TMTpro carrier-multiplexed and DIA workflows when scoped. Bruker timsTOF .d files are accepted for dia-PASEF and Ultra single-cell acquisitions when scoped. MaxQuant, Proteome Discoverer, and DIA-NN PSM or protein exports are the primary computational entry points. nPOP and plexDIA exports are supported when scoped at kickoff.

How long does single-cell proteomics analysis take at Pepkio?

A standard SCP project (roughly 100–500 cells, one primary contrast, one or two MS batch groups, and clustering plus DE) typically completes in 4–8 weeks from data receipt. Multi-batch integration, matched scRNA-seq correlation, or cohorts exceeding 1,000 cells may require longer timelines. Milestone check-ins occur during the project; exact timelines are confirmed at kickoff.

How do you handle batch effects across MS acquisition runs or prep dates?

When batch is known and not fully confounded with condition, Pepkio applies scplainer batch correction or includes batch as a covariate in limma/proDA models (Deutsch et al., 2023). PCA and RI/CV diagnostics are reviewed before modeling. Designs without batch randomization are flagged at kickoff; biological and technical factors may be inseparable when randomization was not performed.

Do I own the code — and in what format is it delivered?

Yes — you retain full ownership of all code, scripts, and results delivered under the project agreement. Pepkio provides commented R scripts with environment lock files so you can rerun analyses when the execution environment matches the pinned setup. Matrices use standard .csv and .tsv formats; SingleCellExperiment objects are delivered as .rds files readable in R/Bioconductor.

Can I be involved during analysis?

Yes. Checkpoint reviews occur after cell-level QC, after batch correction, and before final delivery when scoped at kickoff. You can review cluster annotations, SCR filtering thresholds, contrast definitions, and marker-protein tables within agreed scope. A dedicated scientific contact leads the project, coordinates milestone feedback, and documents decisions in the QC report and Methods draft.

What does post-delivery reviewer support include?

Support covers clarification of methods, QC thresholds, SCR filtering, and minor figure or table revisions within agreed scope (typically ≤20% of project scope), consistent with Pepkio's standard post-delivery policy. It does not include open-ended reanalysis or new biological contrasts. Substantial new work—additional contrasts, alternate imputation, or scRNA-seq integration—is scoped as separate milestones.

Is co-authorship required?

No. Pepkio is fee-for-service unless co-authorship is agreed in advance. You retain ownership of results and code; billing is separate from publication credit.

How should I design the isobaric carrier ratio for TMT-based single-cell proteomics?

Carrier amount should reflect project priorities: isobaric carriers ~200-fold larger than single-cell samples enhance peptide identification with limited adverse effect on quantification (Leduc et al., 2020). Larger carrier ratios increase identification depth but can compress single-cell dynamic range and worsen TMT channel impurity effects (Cheung et al., 2022). Pepkio documents carrier design assumptions and SCR thresholds in the QC report; channel exclusion is applied when impurity correction is insufficient.

Should I use TMT carrier multiplexing, label-free DIA, or plexDIA for single-cell proteomics?

TMT with isobaric carrier (SCoPE2-style) quantifies ~1,000 proteins per cell in published benchmarks (Specht et al., 2021). Label-free DIA and plexDIA can increase depth—nPOP reports ~3,000–3,700 proteins per cell with plexDIA (Leduc et al., 2024)—but require workflow-specific informatics (Wang et al., 2025). Pepkio documents trade-offs at kickoff based on your export format and biological question.

Can Pepkio run custom or non-standard single-cell proteomics analyses?

Yes, when a feasibility review confirms inputs, outputs, and timeline. Bespoke workflows (custom matrices, client QC thresholds, scRNA-seq integration, trajectory inference, phosphoproteomics, or non-standard deliverables) are scoped at kickoff. Reprocessing of vendor .raw files can be scoped via the DDA/DIA proteomics service when needed.

Related services

DDA/DIA proteomics — Bulk cohort quantification or spectral library building when single-cell depth is not required.
Olink proximity extension — Targeted plasma or CSF validation of MS-discovered protein signatures.
Single-cell RNA-seq — Matched transcriptome profiling when sample IDs align for proteogenomic correlation.
Multi-omics integration — Cross-layer modeling when proteomics, transcriptomics, and metabolomics datasets share sample identifiers.
Custom analysis — Non-standard SCP extensions, phosphoproteomics, or client-specified toolchains beyond the standard pipeline.

References

Deutsch EW, Orsburn BC, Aebersold R, et al. Initial recommendations for performing, benchmarking and reporting single-cell proteomics experiments. Nature Methods. 2023;20(3):375–386. https://doi.org/10.1038/s41592-023-01785-3 (PMID: 36864200)
Specht H, Emmott E, Petelski AA, et al. Single-cell proteomic and transcriptomic analysis of macrophage heterogeneity using SCoPE2. Genome Biology. 2021;22(1):50. https://doi.org/10.1186/s13059-021-02267-5 (PMID: 33504367)
Leduc A, Huffman RG, Specht H, et al. Optimizing accuracy and depth of protein quantification in experiments using isobaric carriers. Journal of Proteome Research. 2020;19(11):4356–4365. https://doi.org/10.1021/acs.jproteome.0c00675 (PMID: 33190502)
Cheung TK, Koller A, Slavov N. A deeper look at carrier proteome effects for single-cell proteomics. Communications Biology. 2022;5:1170. https://doi.org/10.1038/s42003-022-03095-4 (PMID: 35194133)
Vanderaa C, Gatto L. Revisiting the thorny issue of missing values in single-cell proteomics. Journal of Proteome Research. 2023;22(9):2775–2784. https://doi.org/10.1021/acs.jproteome.3c00227 (PMID: 37530557)
Leduc A, Derks J, Slavov N. Massively parallel sample preparation for multiplexed single-cell proteomics using nPOP. Nature Protocols. 2024;19(12):3750–3776. https://doi.org/10.1038/s41596-024-01033-8 (PMID: 39117766)
Rosenberger G, Yu F, Teo GC, et al. Exploration of cell state heterogeneity using single-cell proteomics through sensitivity-tailored data-independent acquisition. Nature Communications. 2023;14:5910. https://doi.org/10.1038/s41467-023-41602-1 (PMID: 37737208)
Rosenberger FA, Thielert M, Strauss MT, et al. Spatial single-cell mass spectrometry defines zonation of the hepatocyte proteome. Nature Methods. 2023;20(10):1530–1536. https://doi.org/10.1038/s41592-023-02007-6 (PMID: 37783884)
Wang J, Huang Y, Lu F, et al. Benchmarking informatics workflows for data-independent acquisition single-cell proteomics. Nature Communications. 2025;16:10276. https://doi.org/10.1038/s41467-025-65174-4 (PMID: 41271703)
Demichev V, Messner CB, Vernardis SI, Lilley KS, Ralser M. DIA-NN: neural networks and interference correction enable deep proteome coverage in high throughput. Nature Methods. 2020;17(1):41–44. https://doi.org/10.1038/s41592-019-0638-x (PMID: 31768060)
Cox J, Mann M. MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nature Biotechnology. 2008;26(12):1367–1372. https://doi.org/10.1038/nbt.1511 (PMID: 19029910)
Korkmaz S, Cox J, Grosse I, et al. Accurate and robust Bayesian inference of proteome-wide differential expression. Nature Methods. 2020;17(12):1215–1221. https://doi.org/10.1038/s41592-020-00949-9
Lazar C, Gatto L, Ferro M, Bruley C, Burger T. Accounting for the multiple natures of missing values in label-free quantitative proteomics data sets to compare imputation strategies. Journal of Proteome Research. 2016;15(4):1116–1125. https://doi.org/10.1021/acs.jproteome.5b00981 (PMID: 26906401)
Ritchie ME, Phipson B, Wu D, et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Research. 2015;43(7):e47. https://doi.org/10.1093/nar/gkv007 (PMID: 25605792)
Vanderaa C, Gatto L. The current state of single-cell proteomics data analysis. Current Protocols. 2023;3(1):e658. https://doi.org/10.1002/cpz1.658 (PMID: 36633424)

Let's Talk About Your Science

Tell us:

• Your biological question
• Data type and size
• Timeline constraints

We'll tell you:

• What's feasible
• How long it will take
• Exactly what it will cost
