Computational Analysis
The raw data obtained from the mass spectrometry experiments are processed following a well-defined procedure. First, our ligandomics processing pipeline MHCquant performs an initial peptide identification. The identified peptides then proceed to the HLA allele association pipeline.
MS data obtained from HLA ligand extracts were analyzed using the containerized nf-core pipeline MHCquant (revision 1.2.6) with default settings. The database search was performed against the human reference proteome (Swiss-Prot, UP000005640) without enzymatic restriction, with methionine oxidation as the only variable modification, and at a 1% FDR.
HLA Allele Association
MHC Class and Candidate Alleles
Whether a peptide was bound to an MHC-I or an MHC-II molecule is determined experimentally, by enrichment with an antibody that is specific for one of the two MHC classes. Furthermore, the HLA alleles of the tissue donors are known from NGS-based HLA typing. Therefore, prior to the computational analysis, every MS experiment can already be associated with at most three candidate HLA alleles.
MHC Binding Prediction
Based on the candidate HLA alleles, epitope binding predictions are computed. For HLA-I-presented peptides we use NetMHCpan 4.0 in both binding affinity (-BA) and ligand (default) mode. For MHC-II-presented peptides we use NetMHCIIpan 3.2 in binding affinity (default) mode. Additionally, for both MHC-I and MHC-II we compute binding predictions using SYFPEITHI.
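The choice of predictors per MHC class can be summarized as a small lookup table. The following is a minimal sketch only: the names `PredictorRun`, `PREDICTOR_RUNS`, and `plan_predictions` are hypothetical and not part of the pipeline, and the sketch only enumerates which predictions would be requested; it does not invoke any of the tools.

```python
from typing import List, NamedTuple, Tuple


class PredictorRun(NamedTuple):
    """One prediction to compute (hypothetical helper type)."""
    tool: str
    mode: str


# Predictors per MHC class, as described in the text above.
PREDICTOR_RUNS = {
    "I": [
        PredictorRun("NetMHCpan 4.0", "binding affinity (-BA)"),
        PredictorRun("NetMHCpan 4.0", "ligand (default)"),
        PredictorRun("SYFPEITHI", "score"),
    ],
    "II": [
        PredictorRun("NetMHCIIpan 3.2", "binding affinity (default)"),
        PredictorRun("SYFPEITHI", "score"),
    ],
}


def plan_predictions(
    mhc_class: str, candidate_alleles: List[str]
) -> List[Tuple[str, PredictorRun]]:
    """List every (allele, predictor run) pair for one MS experiment."""
    if mhc_class not in PREDICTOR_RUNS:
        raise ValueError(f"unknown MHC class: {mhc_class!r}")
    return [
        (allele, run)
        for allele in candidate_alleles
        for run in PREDICTOR_RUNS[mhc_class]
    ]
```

For example, `plan_predictions("I", ["HLA-A*02:01", "HLA-B*07:02"])` would list six predictions, three for each candidate allele.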
Selection and Quality Control
Peptide-allele associations are filtered using the following criteria. If NetMHC(II)pan reports a binding affinity %Rank score lower than 0.5, the peptide-allele association is stored as strong. Otherwise, if the %Rank score is lower than 2.0 or the SYFPEITHI score is at least 0.5, it is stored as weak. Peptide-allele associations that fulfill none of these criteria are discarded. Lastly, replicates in which fewer than 50% of the peptides were successfully associated with an MHC allele are discarded as a whole.
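The filtering rules above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the actual implementation: the function and constant names are hypothetical, `rank` is taken to be the binding affinity %Rank reported by NetMHC(II)pan, a missing score is passed as `None`, and a peptide is assumed to count as MHC associated if it received a strong or weak label for at least one candidate allele.

```python
from typing import Optional, Sequence

STRONG_RANK = 0.5              # %Rank below this value: strong association
WEAK_RANK = 2.0                # %Rank below this value: weak association
SYFPEITHI_MIN = 0.5            # SYFPEITHI score at or above this value: weak
MIN_ASSOCIATED_FRACTION = 0.5  # replicates below this fraction are discarded


def classify_association(
    rank: Optional[float], syfpeithi: Optional[float]
) -> Optional[str]:
    """Return 'strong', 'weak', or None (association is discarded)."""
    if rank is not None and rank < STRONG_RANK:
        return "strong"
    weak_by_rank = rank is not None and rank < WEAK_RANK
    weak_by_syfpeithi = syfpeithi is not None and syfpeithi >= SYFPEITHI_MIN
    if weak_by_rank or weak_by_syfpeithi:
        return "weak"
    return None


def replicate_passes_qc(peptide_labels: Sequence[Optional[str]]) -> bool:
    """Keep a replicate only if at least 50% of its peptides are associated.

    peptide_labels holds one entry per identified peptide: its best label
    over all candidate alleles, or None if no association was kept
    (assumption, see lead-in).
    """
    if not peptide_labels:
        return False  # assumption: an empty replicate carries no evidence
    associated = sum(label is not None for label in peptide_labels)
    return associated / len(peptide_labels) >= MIN_ASSOCIATED_FRACTION
```

Note that the thresholds are strict for %Rank, so a %Rank of exactly 0.5 is classified as weak rather than strong, while the SYFPEITHI threshold is inclusive.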
Finally, peptide-allele associations that pass the previously described quality criteria are complemented with the tissue association based on the sample origin and stored in the database.