| Title: | Glycan-Centric Functional Enrichment Analysis |
|---|---|
| Description: | Provides functional enrichment analysis for glycoproteomics data, including both protein-centric and glycan-centric approaches. The 'enrich_xxx()' functions answer "Which functions are enriched for proteins with dysregulated glycosylation?" (traditional protein-level enrichment), while 'enrich_gc_xxx()' functions answer "Which functions are enriched for dysregulated glycan traits?" (glycan-centric, linking glycan traits like core-fucosylation with functional annotations). Supports both Over Representation Analysis (ORA) and Gene Set Enrichment Analysis (GSEA) with common databases including GO, KEGG, Reactome, WikiPathways, DO (Disease Ontology), and NCG (Network of Cancer Genes). Integrates seamlessly with 'glydet' (for site-specific derived traits) and 'glystats' (for differential expression analysis). |
| Authors: | Bin Fu [aut, cre] (ORCID: <https://orcid.org/0000-0001-8567-2997>) |
| Maintainer: | Bin Fu <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.1.0 |
| Built: | 2026-05-20 03:29:13 UTC |
| Source: | https://github.com/glycoverse/glyfun |
universe parameter
This function extracts all detected proteins in a glyexp::experiment() or a glystats result.
It can be readily passed to the universe parameter of all glyfun functions.
detected_universe(x)
x |
A |
A character vector of protein UniProt IDs.
library(glyexp)
universe <- detected_universe(real_experiment)
length(universe)
universe[1:5]
Performs glycan-centric Disease Ontology (DO) Gene Set Enrichment Analysis (GSEA). It ranks proteins within each glycan trait separately, then compares enriched disease terms across traits by running GSEA for each trait.
enrich_gc_gsea_do(
  dea_res,
  rank_by = "signed_log10p",
  aggr = "median",
  ont = "HDO",
  organism = "hsa",
  p_adj_method = "BH",
  p_cutoff = 0.05,
  min_gs_size = 10,
  max_gs_size = 500,
  exponent = 1,
  eps = 1e-10,
  seed = FALSE
)
dea_res |
Differential analysis result. Can be one of:
|
rank_by |
Criteria for ranking proteins. One of the following:
|
aggr |
Aggregation method for combining multiple scores across different traits and sites for the same protein. One of "median", "mean", or "max". Defaults to "median". |
ont |
One of "HDO" (Human Disease Ontology), "MPO" (Mammalian Phenotype Ontology),
or "VDO" (Vector Disease Ontology). Passed to |
organism |
"hsa" (Homo sapiens) or "mmu" (Mus musculus).
Passed to |
p_adj_method |
P-value adjustment method.
One of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none".
Passed to |
p_cutoff |
P-value cutoff to filter significant terms.
Passed to |
min_gs_size |
Minimum gene set size to analyze.
Gene sets with fewer genes than this threshold will be excluded.
Passed to |
max_gs_size |
Maximum gene set size to analyze.
Gene sets with more genes than this threshold will be excluded.
Passed to |
exponent |
Weight of each step. Passed to |
eps |
Epsilon for calculating p-values. Passed to |
seed |
Logical indicating whether to set a random seed for reproducibility.
Passed to |
A clusterProfiler compareClusterResult object.
It can be readily converted to a tibble with tibble::as_tibble(),
or visualized with clusterProfiler functions like clusterProfiler::dotplot().
In traditional glycoproteomics data analysis,
we usually perform differential expression analysis (DEA) on glycoforms,
extract proteins that have dysregulated glycosylation,
then perform functional enrichment (e.g. GO) on these proteins.
This is what enrich_xxx() functions do (e.g. enrich_gsea_go()).
enrich_gc_xxx() functions differ in that they link specific glycan traits with functional annotations.
Instead of answering the question
"Which functions are enriched in dysregulated glycoproteins?",
enrich_gc_xxx() answers questions like
"Which functions are enriched in proteins ranked highly for core-fucosylation changes?"
Higher specificity, deeper insights. By focusing on distinct glycan motifs,
it helps you pinpoint the functional relevance of specific glycosylation changes.
GSEA requires a ranked list of proteins as input.
This function first splits the DEA result by glycan trait, then ranks proteins
within each trait separately. For each trait, it applies the same ranking and
aggregation logic as enrich_gsea_go(), producing one ranked protein list per trait.
Those trait-specific ranked lists are then compared with
clusterProfiler::compareCluster() using its formula interface.
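The split-then-rank step described above can be sketched in base R. This is an illustration under stated assumptions, not glyfun's implementation: the toy table, its column names (trait, protein, log2fc, p_val), the trait labels, and the reading of "signed_log10p" as the sign of the log2 fold change times -log10(p) are all hypothetical choices made for this example.

```r
# Toy DEA table: one row per (trait, protein, site). All values are made up,
# and the column names are assumptions for this sketch.
dea <- data.frame(
  trait   = c("TFc", "TFc", "TFc", "TS", "TS"),
  protein = c("P01857", "P01857", "P02763", "P01857", "P02763"),
  log2fc  = c(1.2, 0.8, -0.5, -1.0, 2.0),
  p_val   = c(0.001, 0.01, 0.1, 0.01, 0.0001)
)

# Assumed definition of "signed_log10p": direction of change times -log10(p).
dea$score <- sign(dea$log2fc) * -log10(dea$p_val)

# Collapse multiple scores per protein with `aggr`, then sort in decreasing
# order, which is the input shape GSEA expects.
rank_one_trait <- function(df, aggr = median) {
  scores <- vapply(split(df$score, df$protein), aggr, numeric(1))
  sort(scores, decreasing = TRUE)
}

# One ranked protein list per glycan trait.
ranked <- lapply(split(dea, dea$trait), rank_one_trait)
ranked
```

Swapping `aggr = median` for `mean` or `max` changes only how several sites on the same protein are collapsed; the per-trait split stays the same.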
A common pattern of using this function is:
# 1. Use `glydet` to calculate derived traits or motif quantification.
trait_exp <- derive_traits(exp)  # or `quantify_motifs()`
# 2. Perform differential analysis with `glystats`.
dea_res <- gly_ttest(trait_exp)
# 3. Use this function.
go_res <- enrich_gc_gsea_go(dea_res)  # or other `enrich_gc_gsea_xxx()` functions
clusterProfiler::compareCluster(), DOSE::gseDO()
Performs glycan-centric Gene Ontology (GO) Gene Set Enrichment Analysis (GSEA). Instead of traditional protein-centric enrichment, this function links specific glycan traits (e.g., core-fucosylation, sialylation) to functional annotations. It ranks proteins within each glycan trait separately, then compares enriched biological functions across traits by running GSEA for each trait.
enrich_gc_gsea_go(
  dea_res,
  rank_by = "signed_log10p",
  aggr = "median",
  orgdb = "org.Hs.eg.db",
  ont = "MF",
  p_adj_method = "BH",
  p_cutoff = 0.05,
  min_gs_size = 10,
  max_gs_size = 500,
  exponent = 1,
  eps = 1e-10,
  seed = FALSE
)
dea_res |
Differential analysis result. Can be one of:
|
rank_by |
Criteria for ranking proteins. One of the following:
|
aggr |
Aggregation method for combining multiple scores across different traits and sites for the same protein. One of "median", "mean", or "max". Defaults to "median". |
orgdb |
An OrgDb object. Passed to |
ont |
Ontology type. Passed to |
p_adj_method |
P-value adjustment method.
One of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none".
Passed to |
p_cutoff |
P-value cutoff to filter significant terms.
Passed to |
min_gs_size |
Minimum gene set size to analyze.
Gene sets with fewer genes than this threshold will be excluded.
Passed to |
max_gs_size |
Maximum gene set size to analyze.
Gene sets with more genes than this threshold will be excluded.
Passed to |
exponent |
Weight of each step. Passed to |
eps |
Epsilon for calculating p-values. Passed to |
seed |
Logical indicating whether to set a random seed for reproducibility.
Passed to |
A clusterProfiler compareClusterResult object.
It can be readily converted to a tibble with tibble::as_tibble(),
or visualized with clusterProfiler functions like clusterProfiler::dotplot().
In traditional glycoproteomics data analysis,
we usually perform differential expression analysis (DEA) on glycoforms,
extract proteins that have dysregulated glycosylation,
then perform functional enrichment (e.g. GO) on these proteins.
This is what enrich_xxx() functions do (e.g. enrich_gsea_go()).
enrich_gc_xxx() functions differ in that they link specific glycan traits with functional annotations.
Instead of answering the question
"Which functions are enriched in dysregulated glycoproteins?",
enrich_gc_xxx() answers questions like
"Which functions are enriched in proteins ranked highly for core-fucosylation changes?"
Higher specificity, deeper insights. By focusing on distinct glycan motifs,
it helps you pinpoint the functional relevance of specific glycosylation changes.
GSEA requires a ranked list of proteins as input.
This function first splits the DEA result by glycan trait, then ranks proteins
within each trait separately. For each trait, it applies the same ranking and
aggregation logic as enrich_gsea_go(), producing one ranked protein list per trait.
Those trait-specific ranked lists are then compared with
clusterProfiler::compareCluster() using its formula interface.
A common pattern of using this function is:
# 1. Use `glydet` to calculate derived traits or motif quantification.
trait_exp <- derive_traits(exp)  # or `quantify_motifs()`
# 2. Perform differential analysis with `glystats`.
dea_res <- gly_ttest(trait_exp)
# 3. Use this function.
go_res <- enrich_gc_gsea_go(dea_res)  # or other `enrich_gc_gsea_xxx()` functions
clusterProfiler::compareCluster(), clusterProfiler::gseGO()
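The values accepted by p_adj_method are the method names of base R's stats::p.adjust(). The sketch below, on made-up p-values, shows how the default "BH" adjustment interacts with a 0.05 cutoff; it only illustrates the adjustment itself and makes no claim about where in the pipeline glyfun applies it.

```r
# Hypothetical raw p-values for five enriched terms.
p_raw <- c(0.001, 0.008, 0.039, 0.041, 0.20)

# Benjamini-Hochberg (the default) versus the stricter Bonferroni correction.
p_bh   <- p.adjust(p_raw, method = "BH")
p_bonf <- p.adjust(p_raw, method = "bonferroni")

# Terms that would survive a 0.05 cutoff on the BH-adjusted values.
data.frame(p_raw, p_bh, p_bonf, keep = p_bh < 0.05)
```

Note that two terms with raw p-values below 0.05 no longer pass after adjustment, which is why the adjusted values, not the raw ones, are usually the ones to report.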
Performs glycan-centric KEGG pathway Gene Set Enrichment Analysis (GSEA). It ranks proteins within each glycan trait separately, then compares enriched pathways across traits by running GSEA for each trait.
enrich_gc_gsea_kegg(
  dea_res,
  rank_by = "signed_log10p",
  aggr = "median",
  organism = "hsa",
  p_adj_method = "BH",
  p_cutoff = 0.05,
  min_gs_size = 10,
  max_gs_size = 500,
  exponent = 1,
  eps = 1e-10,
  seed = FALSE
)
dea_res |
Differential analysis result. Can be one of:
|
rank_by |
Criteria for ranking proteins. One of the following:
|
aggr |
Aggregation method for combining multiple scores across different traits and sites for the same protein. One of "median", "mean", or "max". Defaults to "median". |
organism |
KEGG organism code. Defaults to "hsa" (Homo sapiens).
See |
p_adj_method |
P-value adjustment method.
One of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none".
Passed to |
p_cutoff |
P-value cutoff to filter significant terms.
Passed to |
min_gs_size |
Minimum gene set size to analyze.
Gene sets with fewer genes than this threshold will be excluded.
Passed to |
max_gs_size |
Maximum gene set size to analyze.
Gene sets with more genes than this threshold will be excluded.
Passed to |
exponent |
Weight of each step. Passed to |
eps |
Epsilon for calculating p-values. Passed to |
seed |
Logical indicating whether to set a random seed for reproducibility.
Passed to |
A clusterProfiler compareClusterResult object.
It can be readily converted to a tibble with tibble::as_tibble(),
or visualized with clusterProfiler functions like clusterProfiler::dotplot().
In traditional glycoproteomics data analysis,
we usually perform differential expression analysis (DEA) on glycoforms,
extract proteins that have dysregulated glycosylation,
then perform functional enrichment (e.g. GO) on these proteins.
This is what enrich_xxx() functions do (e.g. enrich_gsea_go()).
enrich_gc_xxx() functions differ in that they link specific glycan traits with functional annotations.
Instead of answering the question
"Which functions are enriched in dysregulated glycoproteins?",
enrich_gc_xxx() answers questions like
"Which functions are enriched in proteins ranked highly for core-fucosylation changes?"
Higher specificity, deeper insights. By focusing on distinct glycan motifs,
it helps you pinpoint the functional relevance of specific glycosylation changes.
GSEA requires a ranked list of proteins as input.
This function first splits the DEA result by glycan trait, then ranks proteins
within each trait separately. For each trait, it applies the same ranking and
aggregation logic as enrich_gsea_go(), producing one ranked protein list per trait.
Those trait-specific ranked lists are then compared with
clusterProfiler::compareCluster() using its formula interface.
A common pattern of using this function is:
# 1. Use `glydet` to calculate derived traits or motif quantification.
trait_exp <- derive_traits(exp)  # or `quantify_motifs()`
# 2. Perform differential analysis with `glystats`.
dea_res <- gly_ttest(trait_exp)
# 3. Use this function.
go_res <- enrich_gc_gsea_go(dea_res)  # or other `enrich_gc_gsea_xxx()` functions
clusterProfiler::compareCluster(), clusterProfiler::gseKEGG()
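The effect of min_gs_size and max_gs_size can be pictured as a simple size filter over the gene set collection. The following is a minimal sketch with invented gene sets. It assumes inclusive bounds and counts every member of a set, whereas the downstream enrichment function may count only genes that are present in the input.

```r
# Hypothetical gene set collection with very different sizes.
gene_sets <- list(
  small  = paste0("G", 1:3),
  medium = paste0("G", 1:40),
  large  = paste0("G", 1:800)
)

min_gs_size <- 10
max_gs_size <- 500

# Keep only the sets whose size falls inside the allowed window
# (inclusive bounds are an assumption of this sketch).
sizes <- lengths(gene_sets)
kept  <- gene_sets[sizes >= min_gs_size & sizes <= max_gs_size]
names(kept)
```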
Performs glycan-centric Network of Cancer Genes (NCG) Gene Set Enrichment Analysis (GSEA). It ranks proteins within each glycan trait separately, then compares enriched cancer gene sets across traits by running GSEA for each trait.
enrich_gc_gsea_ncg(
  dea_res,
  rank_by = "signed_log10p",
  aggr = "median",
  p_adj_method = "BH",
  p_cutoff = 0.05,
  min_gs_size = 10,
  max_gs_size = 500,
  exponent = 1,
  eps = 1e-10,
  seed = FALSE
)
dea_res |
Differential analysis result. Can be one of:
|
rank_by |
Criteria for ranking proteins. One of the following:
|
aggr |
Aggregation method for combining multiple scores across different traits and sites for the same protein. One of "median", "mean", or "max". Defaults to "median". |
p_adj_method |
P-value adjustment method.
One of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none".
Passed to |
p_cutoff |
P-value cutoff to filter significant terms.
Passed to |
min_gs_size |
Minimum gene set size to analyze.
Gene sets with fewer genes than this threshold will be excluded.
Passed to |
max_gs_size |
Maximum gene set size to analyze.
Gene sets with more genes than this threshold will be excluded.
Passed to |
exponent |
Weight of each step. Passed to |
eps |
Epsilon for calculating p-values. Passed to |
seed |
Logical indicating whether to set a random seed for reproducibility.
Passed to |
A clusterProfiler compareClusterResult object.
It can be readily converted to a tibble with tibble::as_tibble(),
or visualized with clusterProfiler functions like clusterProfiler::dotplot().
In traditional glycoproteomics data analysis,
we usually perform differential expression analysis (DEA) on glycoforms,
extract proteins that have dysregulated glycosylation,
then perform functional enrichment (e.g. GO) on these proteins.
This is what enrich_xxx() functions do (e.g. enrich_gsea_go()).
enrich_gc_xxx() functions differ in that they link specific glycan traits with functional annotations.
Instead of answering the question
"Which functions are enriched in dysregulated glycoproteins?",
enrich_gc_xxx() answers questions like
"Which functions are enriched in proteins ranked highly for core-fucosylation changes?"
Higher specificity, deeper insights. By focusing on distinct glycan motifs,
it helps you pinpoint the functional relevance of specific glycosylation changes.
GSEA requires a ranked list of proteins as input.
This function first splits the DEA result by glycan trait, then ranks proteins
within each trait separately. For each trait, it applies the same ranking and
aggregation logic as enrich_gsea_go(), producing one ranked protein list per trait.
Those trait-specific ranked lists are then compared with
clusterProfiler::compareCluster() using its formula interface.
A common pattern of using this function is:
# 1. Use `glydet` to calculate derived traits or motif quantification.
trait_exp <- derive_traits(exp)  # or `quantify_motifs()`
# 2. Perform differential analysis with `glystats`.
dea_res <- gly_ttest(trait_exp)
# 3. Use this function.
go_res <- enrich_gc_gsea_go(dea_res)  # or other `enrich_gc_gsea_xxx()` functions
clusterProfiler::compareCluster(), DOSE::gseNCG()
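The exponent argument ("weight of each step") plays the role of the weight in the classic GSEA running-sum statistic. The sketch below computes that textbook enrichment score on a made-up ranked list to show what the weight changes. The function name enrichment_score() is hypothetical, and the packages glyfun delegates to may differ in implementation details such as tie handling and p-value estimation.

```r
# Textbook weighted running-sum enrichment score (hypothetical helper).
# `ranked` is a named numeric vector of scores; `gene_set` is a character vector.
enrichment_score <- function(ranked, gene_set, exponent = 1) {
  ranked <- sort(ranked, decreasing = TRUE)
  in_set <- names(ranked) %in% gene_set
  # Hits step up in proportion to |score|^exponent; misses step down equally.
  hit_weight <- ifelse(in_set, abs(ranked)^exponent, 0)
  step <- ifelse(in_set, hit_weight / sum(hit_weight), -1 / sum(!in_set))
  running <- cumsum(step)
  # The score is the running sum's largest deviation from zero.
  running[which.max(abs(running))]
}

ranked <- c(A = 3, B = 2, C = 1, D = -1, E = -2)
es_weighted   <- enrichment_score(ranked, c("A", "D"), exponent = 1)
es_unweighted <- enrichment_score(ranked, c("A", "D"), exponent = 0)
c(weighted = es_weighted, unweighted = es_unweighted)
```

With exponent = 0 every hit counts equally, so the score depends only on positions in the ranking; with exponent = 1 a hit with a large score, such as A here, moves the running sum further.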
Performs glycan-centric Reactome pathway Gene Set Enrichment Analysis (GSEA). It ranks proteins within each glycan trait separately, then compares enriched pathways across traits by running GSEA for each trait.
enrich_gc_gsea_reactome(
  dea_res,
  rank_by = "signed_log10p",
  aggr = "median",
  organism = "human",
  p_adj_method = "BH",
  p_cutoff = 0.05,
  min_gs_size = 10,
  max_gs_size = 500,
  exponent = 1,
  eps = 1e-10,
  seed = FALSE
)
dea_res |
Differential analysis result. Can be one of:
|
rank_by |
Criteria for ranking proteins. One of the following:
|
aggr |
Aggregation method for combining multiple scores across different traits and sites for the same protein. One of "median", "mean", or "max". Defaults to "median". |
organism |
Reactome organism name. Passed to |
p_adj_method |
P-value adjustment method.
One of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none".
Passed to |
p_cutoff |
P-value cutoff to filter significant terms.
Passed to |
min_gs_size |
Minimum gene set size to analyze.
Gene sets with fewer genes than this threshold will be excluded.
Passed to |
max_gs_size |
Maximum gene set size to analyze.
Gene sets with more genes than this threshold will be excluded.
Passed to |
exponent |
Weight of each step. Passed to |
eps |
Epsilon for calculating p-values. Passed to |
seed |
Logical indicating whether to set a random seed for reproducibility.
Passed to |
A clusterProfiler compareClusterResult object.
It can be readily converted to a tibble with tibble::as_tibble(),
or visualized with clusterProfiler functions like clusterProfiler::dotplot().
In traditional glycoproteomics data analysis,
we usually perform differential expression analysis (DEA) on glycoforms,
extract proteins that have dysregulated glycosylation,
then perform functional enrichment (e.g. GO) on these proteins.
This is what enrich_xxx() functions do (e.g. enrich_gsea_go()).
enrich_gc_xxx() functions differ in that they link specific glycan traits with functional annotations.
Instead of answering the question
"Which functions are enriched in dysregulated glycoproteins?",
enrich_gc_xxx() answers questions like
"Which functions are enriched in proteins ranked highly for core-fucosylation changes?"
Higher specificity, deeper insights. By focusing on distinct glycan motifs,
it helps you pinpoint the functional relevance of specific glycosylation changes.
GSEA requires a ranked list of proteins as input.
This function first splits the DEA result by glycan trait, then ranks proteins
within each trait separately. For each trait, it applies the same ranking and
aggregation logic as enrich_gsea_go(), producing one ranked protein list per trait.
Those trait-specific ranked lists are then compared with
clusterProfiler::compareCluster() using its formula interface.
A common pattern of using this function is:
# 1. Use `glydet` to calculate derived traits or motif quantification.
trait_exp <- derive_traits(exp)  # or `quantify_motifs()`
# 2. Perform differential analysis with `glystats`.
dea_res <- gly_ttest(trait_exp)
# 3. Use this function.
go_res <- enrich_gc_gsea_go(dea_res)  # or other `enrich_gc_gsea_xxx()` functions
clusterProfiler::compareCluster(), ReactomePA::gsePathway()
Performs glycan-centric WikiPathways Gene Set Enrichment Analysis (GSEA). It ranks proteins within each glycan trait separately, then compares enriched pathways across traits by running GSEA for each trait.
enrich_gc_gsea_wp(
  dea_res,
  rank_by = "signed_log10p",
  aggr = "median",
  organism = "Homo sapiens",
  p_adj_method = "BH",
  p_cutoff = 0.05,
  min_gs_size = 10,
  max_gs_size = 500,
  exponent = 1,
  eps = 1e-10,
  seed = FALSE
)
dea_res |
Differential analysis result. Can be one of:
|
rank_by |
Criteria for ranking proteins. One of the following:
|
aggr |
Aggregation method for combining multiple scores across different traits and sites for the same protein. One of "median", "mean", or "max". Defaults to "median". |
organism |
WikiPathways organism name. Passed to |
p_adj_method |
P-value adjustment method.
One of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none".
Passed to |
p_cutoff |
P-value cutoff to filter significant terms.
Passed to |
min_gs_size |
Minimum gene set size to analyze.
Gene sets with fewer genes than this threshold will be excluded.
Passed to |
max_gs_size |
Maximum gene set size to analyze.
Gene sets with more genes than this threshold will be excluded.
Passed to |
exponent |
Weight of each step. Passed to |
eps |
Epsilon for calculating p-values. Passed to |
seed |
Logical indicating whether to set a random seed for reproducibility.
Passed to |
A clusterProfiler compareClusterResult object.
It can be readily converted to a tibble with tibble::as_tibble(),
or visualized with clusterProfiler functions like clusterProfiler::dotplot().
In traditional glycoproteomics data analysis,
we usually perform differential expression analysis (DEA) on glycoforms,
extract proteins that have dysregulated glycosylation,
then perform functional enrichment (e.g. GO) on these proteins.
This is what enrich_xxx() functions do (e.g. enrich_gsea_go()).
enrich_gc_xxx() functions differ in that they link specific glycan traits with functional annotations.
Instead of answering the question
"Which functions are enriched in dysregulated glycoproteins?",
enrich_gc_xxx() answers questions like
"Which functions are enriched in proteins ranked highly for core-fucosylation changes?"
Higher specificity, deeper insights. By focusing on distinct glycan motifs,
it helps you pinpoint the functional relevance of specific glycosylation changes.
GSEA requires a ranked list of proteins as input.
This function first splits the DEA result by glycan trait, then ranks proteins
within each trait separately. For each trait, it applies the same ranking and
aggregation logic as enrich_gsea_go(), producing one ranked protein list per trait.
Those trait-specific ranked lists are then compared with
clusterProfiler::compareCluster() using its formula interface.
A common pattern of using this function is:
# 1. Use `glydet` to calculate derived traits or motif quantification.
trait_exp <- derive_traits(exp)  # or `quantify_motifs()`
# 2. Perform differential analysis with `glystats`.
dea_res <- gly_ttest(trait_exp)
# 3. Use this function.
go_res <- enrich_gc_gsea_go(dea_res)  # or other `enrich_gc_gsea_xxx()` functions
clusterProfiler::compareCluster(), clusterProfiler::gseWP()
Performs glycan-centric Disease Ontology (DO) Over-Representation Analysis (ORA). Instead of traditional protein-centric enrichment, this function links specific glycan traits to disease associations. It helps answer questions like "Which diseases are enriched in proteins with a specific dysregulated glycan motif?", by grouping differential analysis results by glycan traits and computing disease enrichment for each trait.
enrich_gc_ora_do(
  dea_res,
  dea_p_cutoff = 0.05,
  dea_log2fc_cutoff = c(-1, 1),
  ont = "HDO",
  organism = "hsa",
  universe = NULL,
  p_adj_method = "BH",
  p_cutoff = 0.05,
  q_cutoff = 0.2,
  min_gs_size = 10,
  max_gs_size = 500
)
dea_res |
Differential analysis result. Can be one of:
|
dea_p_cutoff |
P-value cutoff for statistical significance. Defaults to 0.05.
For |
dea_log2fc_cutoff |
Log2 fold change cutoff for statistical significance.
A length-2 numeric vector giving the negative and positive boundaries, respectively.
For example, |
ont |
One of "HDO" (Human Disease Ontology), "MPO" (Mammalian Phenotype Ontology),
or "VDO" (Vector Disease Ontology). Passed to |
organism |
"hsa" (Homo sapiens) or "mmu" (Mus musculus).
Passed to |
universe |
UniProt IDs of the background genes, passed directly to |
p_adj_method |
P-value adjustment method.
One of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none".
Passed to |
p_cutoff |
P-value cutoff to filter significant terms.
Passed to |
q_cutoff |
Q-value (FDR) cutoff to filter significant terms.
Passed to |
min_gs_size |
Minimum gene set size to analyze.
Gene sets with fewer genes than this threshold will be excluded.
Passed to |
max_gs_size |
Maximum gene set size to analyze.
Gene sets with more genes than this threshold will be excluded.
Passed to |
A clusterProfiler compareClusterResult object with additional glyfun classes.
It can be readily converted to a tibble with tibble::as_tibble(),
or visualized with clusterProfiler functions like clusterProfiler::dotplot().
In traditional glycoproteomics data analysis,
we usually perform differential expression analysis (DEA) on glycoforms,
extract proteins that have dysregulated glycosylation,
then perform functional enrichment (e.g. GO) on these proteins.
This is what enrich_xxx() functions do (e.g. enrich_ora_go()).
enrich_gc_xxx() functions differ in that they link specific glycan traits with functional annotations.
Instead of answering the question
"Which functions are enriched in dysregulated glycoproteins?",
enrich_gc_xxx() answers questions like
“Which functions are enriched in proteins with dysregulated core-fucosylation?”
Higher specificity, deeper insights. By focusing on distinct glycan motifs,
it helps you pinpoint the functional relevance of specific glycosylation changes.
A common pattern of using this function is:
# 1. Use `glydet` to calculate derived traits or motif quantification.
trait_exp <- derive_traits(exp)  # or `quantify_motifs()`
# 2. Perform differential analysis with `glystats`.
dea_res <- gly_ttest(trait_exp)
# 3. Use this function.
go_res <- enrich_gc_ora_go(dea_res)  # or other `enrich_gc_xxx()` functions
clusterProfiler::compareCluster(), DOSE::enrichDO()
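For ORA, the dea_p_cutoff and dea_log2fc_cutoff arguments decide which proteins count as dysregulated for each trait before enrichment is computed. A minimal base R sketch of that selection follows. The toy table, its column names, the trait labels, and the use of strict inequalities are assumptions for illustration, not glyfun's actual code.

```r
# Toy DEA table (made-up values; column names are assumptions of this sketch).
dea <- data.frame(
  trait   = c("TFc", "TFc", "TFc", "TS", "TS", "TS"),
  protein = c("P1", "P2", "P3", "P1", "P2", "P4"),
  log2fc  = c(1.5, -0.4, -2.0, 0.9, 1.1, -1.3),
  p_val   = c(0.01, 0.001, 0.03, 0.02, 0.2, 0.04)
)

dea_p_cutoff      <- 0.05
dea_log2fc_cutoff <- c(-1, 1)  # negative and positive boundaries

# A row is significant if its p-value passes the cutoff and its fold change
# lies outside the (negative, positive) window.
significant <- dea$p_val < dea_p_cutoff &
  (dea$log2fc < dea_log2fc_cutoff[1] | dea$log2fc > dea_log2fc_cutoff[2])

# One protein list per glycan trait; each list is then tested for enrichment.
gene_clusters <- lapply(
  split(dea$protein[significant], dea$trait[significant]),
  unique
)
gene_clusters
```

A protein can be dysregulated for one trait and not another (P1 in this toy data), which is what lets the per-trait enrichment results differ.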
Performs glycan-centric Gene Ontology (GO) Over-Representation Analysis (ORA). Instead of traditional protein-centric enrichment, this function links specific glycan traits (e.g., core-fucosylation, sialylation) to functional annotations. It identifies which biological functions are significantly enriched in proteins exhibiting specific glycosylation changes, grouping the differential analysis results by trait before performing ORA.
enrich_gc_ora_go(
  dea_res,
  dea_p_cutoff = 0.05,
  dea_log2fc_cutoff = c(-1, 1),
  orgdb = "org.Hs.eg.db",
  ont = "MF",
  universe = NULL,
  p_adj_method = "BH",
  p_cutoff = 0.05,
  q_cutoff = 0.2,
  min_gs_size = 10,
  max_gs_size = 500
)
dea_res |
Differential analysis result. Can be one of:
|
dea_p_cutoff |
P-value cutoff for statistical significance. Defaults to 0.05.
For |
dea_log2fc_cutoff |
Log2 fold change cutoff for statistical significance.
A length-2 numeric vector giving the negative and positive boundaries, respectively.
For example, |
orgdb |
An OrgDb object. Passed to |
ont |
Ontology type. Passed to |
universe |
UniProt IDs of the background genes, passed directly to |
p_adj_method |
P-value adjustment method.
One of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none".
Passed to |
p_cutoff |
P-value cutoff to filter significant terms.
Passed to |
q_cutoff |
Q-value (FDR) cutoff to filter significant terms.
Passed to |
min_gs_size |
Minimum gene set size to analyze.
Gene sets with fewer genes than this threshold will be excluded.
Passed to |
max_gs_size |
Maximum gene set size to analyze.
Gene sets with more genes than this threshold will be excluded.
Passed to |
A clusterProfiler compareClusterResult object with additional glyfun classes.
It can be readily converted to a tibble with tibble::as_tibble(),
or visualized with clusterProfiler functions like clusterProfiler::dotplot().
In traditional glycoproteomics data analysis,
we usually perform differential expression analysis (DEA) on glycoforms,
extract proteins that have dysregulated glycosylation,
then perform functional enrichment (e.g. GO) on these proteins.
This is what enrich_xxx() functions do (e.g. enrich_ora_go()).
enrich_gc_xxx() functions differ in that they link specific glycan traits with functional annotations.
Instead of answering the question
"Which functions are enriched in dysregulated glycoproteins?",
enrich_gc_xxx() answers questions like
“Which functions are enriched in proteins with dysregulated core-fucosylation?”
Higher specificity, deeper insights. By focusing on distinct glycan motifs,
it helps you pinpoint the functional relevance of specific glycosylation changes.
A common pattern of using this function is:
# 1. Use `glydet` to calculate derived traits or motif quantification.
trait_exp <- derive_traits(exp)  # or `quantify_motifs()`
# 2. Perform differential analysis with `glystats`.
dea_res <- gly_ttest(trait_exp)
# 3. Use this function.
go_res <- enrich_gc_ora_go(dea_res)  # or other `enrich_gc_xxx()` functions
clusterProfiler::compareCluster(), clusterProfiler::enrichGO()
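The universe argument matters because ORA is, at its core, a one-sided hypergeometric test against a background. The sketch below implements that test in base R to show how the p-value changes when the background is restricted to detected proteins, as detected_universe() is meant to allow. The helper ora_p_value() and all protein IDs are hypothetical, and restricting both inputs to the universe is an assumption about typical ORA behaviour rather than a statement about glyfun's internals.

```r
# One-sided hypergeometric ORA p-value (hypothetical helper for illustration).
ora_p_value <- function(genes, gene_set, universe) {
  # Assumption: both lists are restricted to the background first.
  genes    <- intersect(genes, universe)
  gene_set <- intersect(gene_set, universe)
  overlap  <- length(intersect(genes, gene_set))
  # P(X >= overlap) when drawing length(genes) proteins from the universe.
  phyper(overlap - 1,
         m = length(gene_set),
         n = length(universe) - length(gene_set),
         k = length(genes),
         lower.tail = FALSE)
}

gene_set <- paste0("P", 1:10)             # annotated term members
genes    <- paste0("P", c(1:5, 41:45))    # dysregulated proteins for one trait
detected <- paste0("P", 1:50)             # proteins actually detected
whole    <- paste0("P", 1:2000)           # a much larger default background

p_detected <- ora_p_value(genes, gene_set, detected)
p_whole    <- ora_p_value(genes, gene_set, whole)
c(detected = p_detected, whole = p_whole)
```

The same overlap looks far more significant against the large background, which illustrates why a background of detected proteins tends to give more conservative results in glycoproteomics, where only a fraction of the proteome is observed.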
Performs glycan-centric KEGG pathway Over-Representation Analysis (ORA). Instead of traditional protein-centric enrichment, this function links specific glycan traits to biological pathways. It helps answer questions like "Which pathways are enriched in proteins with a specific dysregulated glycan motif?", by grouping differential analysis results by glycan traits and computing pathway enrichment for each trait.
enrich_gc_ora_kegg(
  dea_res,
  dea_p_cutoff = 0.05,
  dea_log2fc_cutoff = c(-1, 1),
  organism = "hsa",
  universe = NULL,
  p_adj_method = "BH",
  p_cutoff = 0.05,
  q_cutoff = 0.2,
  min_gs_size = 10,
  max_gs_size = 500
)
dea_res |
Differential analysis result. Can be one of:
|
dea_p_cutoff |
P-value cutoff for statistical significance. Defaults to 0.05.
For |
dea_log2fc_cutoff |
Log2 fold change cutoff for statistical significance.
A length-2 numeric vector giving the negative and positive boundaries, respectively.
For example, |
organism |
KEGG organism code. Passed to |
universe |
UniProt IDs of the background genes, passed directly to |
p_adj_method |
P-value adjustment method.
One of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none".
Passed to |
p_cutoff |
P-value cutoff to filter significant terms.
Passed to |
q_cutoff |
Q-value (FDR) cutoff to filter significant terms.
Passed to |
min_gs_size |
Minimum gene set size to analyze.
Gene sets with fewer genes than this threshold will be excluded.
Passed to |
max_gs_size |
Maximum gene set size to analyze.
Gene sets with more genes than this threshold will be excluded.
Passed to |
A clusterProfiler compareClusterResult object with additional glyfun classes.
It can be readily converted to a tibble with tibble::as_tibble(),
or visualized with clusterProfiler functions like clusterProfiler::dotplot().
In traditional glycoproteomics data analysis,
we usually perform differential expression analysis (DEA) on glycoforms,
extract proteins that have dysregulated glycosylation,
then perform functional enrichment (e.g. GO) on these proteins.
This is what enrich_xxx() functions do (e.g. enrich_ora_go()).
enrich_gc_xxx() functions differ in that they link specific glycan traits with functional annotations.
Instead of answering the question
"Which functions are enriched in dysregulated glycoproteins?",
enrich_gc_xxx() answers questions like
“Which functions are enriched in proteins with dysregulated core-fucosylation?”
Higher specificity, deeper insights. By focusing on distinct glycan motifs,
it helps you pinpoint the functional relevance of specific glycosylation changes.
A common pattern of using this function is:
# 1. Use `glydet` to calculate derived traits or motif quantification.
trait_exp <- derive_traits(exp)  # or `quantify_motifs()`
# 2. Perform differential analysis with `glystats`.
dea_res <- gly_ttest(trait_exp)
# 3. Use this function.
go_res <- enrich_gc_ora_go(dea_res)  # or other `enrich_gc_xxx()` functions
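To make the "one enrichment per glycan trait" idea concrete, the following base-R sketch splits a toy differential-analysis table by trait and collects the dysregulated proteins of each trait. The column names (`trait`, `protein`, `log2fc`, `p_adj`), the protein IDs, and the strict inequalities are assumptions made for illustration only; they do not describe the actual structure of a glystats result or the exact filtering done by glyfun.

```r
# Toy differential-analysis table (hypothetical columns and values).
dea <- data.frame(
  trait   = c("core_fuc", "core_fuc", "core_fuc", "sialylation", "sialylation"),
  protein = c("P00001", "P00002", "P00003", "P00001", "P00004"),
  log2fc  = c(1.8, -0.3, -2.1, 0.2, 1.4),
  p_adj   = c(0.001, 0.40, 0.02, 0.70, 0.03),
  stringsAsFactors = FALSE
)

# Same default cutoffs as in the usage above.
dea_p_cutoff <- 0.05
dea_log2fc_cutoff <- c(-1, 1)

# A row counts as dysregulated if it passes the p-value cutoff and lies
# outside the (negative, positive) log2 fold change boundaries.
is_sig <- dea$p_adj < dea_p_cutoff &
  (dea$log2fc < dea_log2fc_cutoff[1] | dea$log2fc > dea_log2fc_cutoff[2])
sig <- dea[is_sig, ]

# One protein set per glycan trait; each set would be enriched separately.
proteins_by_trait <- split(sig$protein, sig$trait)
print(proteins_by_trait)
```

A named list of this shape is the kind of per-cluster input that clusterProfiler::compareCluster() operates on, which is why results can be compared across traits.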
clusterProfiler::compareCluster(), clusterProfiler::enrichKEGG()
Performs glycan-centric Network of Cancer Genes (NCG) Over-Representation Analysis (ORA). Instead of traditional protein-centric enrichment, this function links specific glycan traits to cancer gene associations. It helps answer questions like "Which cancer gene sets are enriched in proteins with a specific dysregulated glycan motif?", by grouping differential analysis results by glycan traits and computing cancer gene enrichment for each trait.
enrich_gc_ora_ncg( dea_res, dea_p_cutoff = 0.05, dea_log2fc_cutoff = c(-1, 1), universe = NULL, p_adj_method = "BH", p_cutoff = 0.05, q_cutoff = 0.2, min_gs_size = 10, max_gs_size = 500 )
dea_res |
Differential analysis result. Can be one of:
|
dea_p_cutoff |
P-value cutoff for statistical significance. Defaults to 0.05.
For |
dea_log2fc_cutoff |
Log2 fold change cutoff for statistical significance.
A length-2 numeric vector, being negative and positive boundaries, respectively.
For example, |
universe |
Background genes Uniprot IDs, directly passed to |
p_adj_method |
P-value adjustment method.
One of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none".
Passed to |
p_cutoff |
P-value cutoff to filter significant terms.
Passed to |
q_cutoff |
Q-value (FDR) cutoff to filter significant terms.
Passed to |
min_gs_size |
Minimal size of each gene set for analyzing.
Gene sets with fewer genes than this threshold will be excluded.
Passed to |
max_gs_size |
Maximum size of each gene set for analyzing.
Gene sets with more genes than this threshold will be excluded.
Passed to |
A clusterProfiler compareClusterResult object with additional glyfun classes.
It can be readily converted to a tibble with tibble::as_tibble(),
or visualized with clusterProfiler functions like clusterProfiler::dotplot().
In traditional glycoproteomics data analysis,
we usually perform differential expression analysis (DEA) on glycoforms,
extract proteins that have dysregulated glycosylation,
then perform functional enrichment (e.g. GO) on these proteins.
This is what enrich_xxx() functions do (e.g. enrich_ora_go()).
enrich_gc_xxx() functions differ in that they link specific glycan traits with functional annotations.
Instead of answering the question
"Which functions are enriched in dysregulated glycoproteins?",
enrich_gc_xxx() answers questions like
"Which functions are enriched in proteins with dysregulated core-fucosylation?"
Higher specificity, deeper insights. By focusing on distinct glycan motifs,
it helps you pinpoint the functional relevance of specific glycosylation changes.
A common pattern of using this function is:
# 1. Use `glydet` to calculate derived traits or motif quantification.
trait_exp <- derive_traits(exp)  # or `quantify_motifs()`
# 2. Perform differential analysis with `glystats`.
dea_res <- gly_ttest(trait_exp)
# 3. Use this function.
go_res <- enrich_gc_ora_go(dea_res)  # or other `enrich_gc_xxx()` functions
clusterProfiler::compareCluster(), DOSE::enrichNCG()
Performs glycan-centric Reactome pathway Over-Representation Analysis (ORA). Instead of traditional protein-centric enrichment, this function links specific glycan traits to biological pathways. It helps answer questions like "Which Reactome pathways are enriched in proteins with a specific dysregulated glycan motif?", by grouping differential analysis results by glycan traits and computing pathway enrichment for each trait.
enrich_gc_ora_reactome( dea_res, dea_p_cutoff = 0.05, dea_log2fc_cutoff = c(-1, 1), organism = "human", universe = NULL, p_adj_method = "BH", p_cutoff = 0.05, q_cutoff = 0.2, min_gs_size = 10, max_gs_size = 500 )
dea_res |
Differential analysis result. Can be one of:
|
dea_p_cutoff |
P-value cutoff for statistical significance. Defaults to 0.05.
For |
dea_log2fc_cutoff |
Log2 fold change cutoff for statistical significance.
A length-2 numeric vector, being negative and positive boundaries, respectively.
For example, |
organism |
Reactome organism name. Passed to |
universe |
Background genes Uniprot IDs, directly passed to |
p_adj_method |
P-value adjustment method.
One of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none".
Passed to |
p_cutoff |
P-value cutoff to filter significant terms.
Passed to |
q_cutoff |
Q-value (FDR) cutoff to filter significant terms.
Passed to |
min_gs_size |
Minimal size of each gene set for analyzing.
Gene sets with fewer genes than this threshold will be excluded.
Passed to |
max_gs_size |
Maximum size of each gene set for analyzing.
Gene sets with more genes than this threshold will be excluded.
Passed to |
A clusterProfiler compareClusterResult object with additional glyfun classes.
It can be readily converted to a tibble with tibble::as_tibble(),
or visualized with clusterProfiler functions like clusterProfiler::dotplot().
In traditional glycoproteomics data analysis,
we usually perform differential expression analysis (DEA) on glycoforms,
extract proteins that have dysregulated glycosylation,
then perform functional enrichment (e.g. GO) on these proteins.
This is what enrich_xxx() functions do (e.g. enrich_ora_go()).
enrich_gc_xxx() functions differ in that they link specific glycan traits with functional annotations.
Instead of answering the question
"Which functions are enriched in dysregulated glycoproteins?",
enrich_gc_xxx() answers questions like
"Which functions are enriched in proteins with dysregulated core-fucosylation?"
Higher specificity, deeper insights. By focusing on distinct glycan motifs,
it helps you pinpoint the functional relevance of specific glycosylation changes.
A common pattern of using this function is:
# 1. Use `glydet` to calculate derived traits or motif quantification.
trait_exp <- derive_traits(exp)  # or `quantify_motifs()`
# 2. Perform differential analysis with `glystats`.
dea_res <- gly_ttest(trait_exp)
# 3. Use this function.
go_res <- enrich_gc_ora_go(dea_res)  # or other `enrich_gc_xxx()` functions
clusterProfiler::compareCluster(), ReactomePA::enrichPathway()
Performs glycan-centric WikiPathways Over-Representation Analysis (ORA). Instead of traditional protein-centric enrichment, this function links specific glycan traits to biological pathways. It helps answer questions like "Which WikiPathways are enriched in proteins with a specific dysregulated glycan motif?", by grouping differential analysis results by glycan traits and computing pathway enrichment for each trait.
enrich_gc_ora_wp( dea_res, dea_p_cutoff = 0.05, dea_log2fc_cutoff = c(-1, 1), organism = "Homo sapiens", universe = NULL, p_adj_method = "BH", p_cutoff = 0.05, q_cutoff = 0.2, min_gs_size = 10, max_gs_size = 500 )
dea_res |
Differential analysis result. Can be one of:
|
dea_p_cutoff |
P-value cutoff for statistical significance. Defaults to 0.05.
For |
dea_log2fc_cutoff |
Log2 fold change cutoff for statistical significance.
A length-2 numeric vector, being negative and positive boundaries, respectively.
For example, |
organism |
WikiPathways organism name. Passed to |
universe |
Background genes Uniprot IDs, directly passed to |
p_adj_method |
P-value adjustment method.
One of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none".
Passed to |
p_cutoff |
P-value cutoff to filter significant terms.
Passed to |
q_cutoff |
Q-value (FDR) cutoff to filter significant terms.
Passed to |
min_gs_size |
Minimal size of each gene set for analyzing.
Gene sets with fewer genes than this threshold will be excluded.
Passed to |
max_gs_size |
Maximum size of each gene set for analyzing.
Gene sets with more genes than this threshold will be excluded.
Passed to |
A clusterProfiler compareClusterResult object with additional glyfun classes.
It can be readily converted to a tibble with tibble::as_tibble(),
or visualized with clusterProfiler functions like clusterProfiler::dotplot().
In traditional glycoproteomics data analysis,
we usually perform differential expression analysis (DEA) on glycoforms,
extract proteins that have dysregulated glycosylation,
then perform functional enrichment (e.g. GO) on these proteins.
This is what enrich_xxx() functions do (e.g. enrich_ora_go()).
enrich_gc_xxx() functions differ in that they link specific glycan traits with functional annotations.
Instead of answering the question
"Which functions are enriched in dysregulated glycoproteins?",
enrich_gc_xxx() answers questions like
"Which functions are enriched in proteins with dysregulated core-fucosylation?"
Higher specificity, deeper insights. By focusing on distinct glycan motifs,
it helps you pinpoint the functional relevance of specific glycosylation changes.
A common pattern of using this function is:
# 1. Use `glydet` to calculate derived traits or motif quantification.
trait_exp <- derive_traits(exp)  # or `quantify_motifs()`
# 2. Perform differential analysis with `glystats`.
dea_res <- gly_ttest(trait_exp)
# 3. Use this function.
go_res <- enrich_gc_ora_go(dea_res)  # or other `enrich_gc_xxx()` functions
clusterProfiler::compareCluster(), clusterProfiler::enrichWP()
Performs Disease Ontology (DO) Gene Set Enrichment Analysis (GSEA) on glycoproteins with dysregulated glycosylation.
enrich_gsea_do( dea_res, rank_by = "signed_log10p", aggr = "median", ont = "HDO", organism = "hsa", p_adj_method = "BH", p_cutoff = 0.05, min_gs_size = 10, max_gs_size = 500, exponent = 1, eps = 1e-10, seed = FALSE )
dea_res |
Differential analysis result. Can be one of:
|
rank_by |
Criteria for ranking proteins. One of the following:
|
aggr |
Aggregation method for combining multiple scores across different traits and sites for the same protein. One of "median", "mean", or "max". Defaults to "median". |
ont |
One of "HDO" (Human Disease Ontology), "MPO" (Mammalian Phenotype Ontology),
or "VDO" (Vector Disease Ontology). Passed to |
organism |
"hsa" (Homo sapiens) or "mmu" (Mus musculus).
Passed to |
p_adj_method |
P-value adjustment method.
One of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none".
Passed to |
p_cutoff |
P-value cutoff to filter significant terms.
Passed to |
min_gs_size |
Minimal size of each gene set for analyzing.
Gene sets with fewer genes than this threshold will be excluded.
Passed to |
max_gs_size |
Maximum size of each gene set for analyzing.
Gene sets with more genes than this threshold will be excluded.
Passed to |
exponent |
Weight of each step. Passed to |
eps |
Epsilon for calculating p-values. Passed to |
seed |
Logical indicating whether to set a random seed for reproducibility.
Passed to |
A clusterProfiler gseaResult object.
It can be readily converted to a tibble with tibble::as_tibble(),
or visualized with clusterProfiler functions like clusterProfiler::ridgeplot().
GSEA requires a ranked list of proteins as input.
With the defaults shown in the usage (rank_by = "signed_log10p", aggr = "median"), this function ranks proteins by the median of that score across all traits and sites.
This reflects the overall glycosylation dysregulation degree of each glycoprotein.
You can use rank_by to specify other ranking criteria, such as p-values or signed log2 fold changes.
You can also use aggr to specify how to aggregate multiple scores for the same protein across different traits and sites.
A common pattern of using this function is:
# 1. Perform differential analysis with `glystats`.
dea_res <- gly_ttest(exp)
# 2. Use this function.
go_res <- enrich_gsea_go(dea_res)  # or `enrich_gsea_xxx()` functions
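The ranking step described above can be sketched in base R. This is only an illustration under stated assumptions: the toy table and its column names (`protein`, `log2fc`, `p`) are hypothetical, and the score is assumed to be the sign of the log2 fold change multiplied by -log10(p), which is the usual reading of a "signed log10 p" score but is not confirmed by this manual.

```r
# Toy per-site, per-trait results for three proteins (hypothetical values).
dea <- data.frame(
  protein = c("P00001", "P00001", "P00001", "P00002", "P00002", "P00003"),
  log2fc  = c(2.0, 1.0, -0.5, -1.5, -2.5, 0.1),
  p       = c(0.001, 0.01, 0.1, 0.01, 0.0001, 0.1),
  stringsAsFactors = FALSE
)

# Assumed score: direction of change times -log10(p).
dea$score <- sign(dea$log2fc) * -log10(dea$p)

# Collapse multiple rows per protein into one score (here: the median),
# then sort in decreasing order to obtain the ranked list GSEA needs.
ranked <- sort(tapply(dea$score, dea$protein, median), decreasing = TRUE)
print(ranked)
```

Swapping `median` for `mean` or `max` in the `tapply()` call mirrors the other aggregation choices listed for aggr.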
Performs Gene Ontology (GO) Gene Set Enrichment Analysis (GSEA) on glycoproteins with dysregulated glycosylation.
enrich_gsea_go( dea_res, rank_by = "signed_log10p", aggr = "median", orgdb = "org.Hs.eg.db", ont = "MF", p_adj_method = "BH", p_cutoff = 0.05, min_gs_size = 10, max_gs_size = 500, exponent = 1, eps = 1e-10, seed = FALSE )
dea_res |
Differential analysis result. Can be one of:
|
rank_by |
Criteria for ranking proteins. One of the following:
|
aggr |
Aggregation method for combining multiple scores across different traits and sites for the same protein. One of "median", "mean", or "max". Defaults to "median". |
orgdb |
An OrgDb object. Passed to |
ont |
Ontology type. Passed to |
p_adj_method |
P-value adjustment method.
One of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none".
Passed to |
p_cutoff |
P-value cutoff to filter significant terms.
Passed to |
min_gs_size |
Minimal size of each gene set for analyzing.
Gene sets with fewer genes than this threshold will be excluded.
Passed to |
max_gs_size |
Maximum size of each gene set for analyzing.
Gene sets with more genes than this threshold will be excluded.
Passed to |
exponent |
Weight of each step. Passed to |
eps |
Epsilon for calculating p-values. Passed to |
seed |
Logical indicating whether to set a random seed for reproducibility.
Passed to |
A clusterProfiler gseaResult object.
It can be readily converted to a tibble with tibble::as_tibble(),
or visualized with clusterProfiler functions like clusterProfiler::ridgeplot().
GSEA requires a ranked list of proteins as input.
With the defaults shown in the usage (rank_by = "signed_log10p", aggr = "median"), this function ranks proteins by the median of that score across all traits and sites.
This reflects the overall glycosylation dysregulation degree of each glycoprotein.
You can use rank_by to specify other ranking criteria, such as p-values or signed log2 fold changes.
You can also use aggr to specify how to aggregate multiple scores for the same protein across different traits and sites.
A common pattern of using this function is:
# 1. Perform differential analysis with `glystats`.
dea_res <- gly_ttest(exp)
# 2. Use this function.
go_res <- enrich_gsea_go(dea_res)  # or `enrich_gsea_xxx()` functions
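For readers unfamiliar with what exponent controls, here is a minimal base-R sketch of the classic GSEA running-sum statistic, in which each hit is weighted by the absolute score raised to exponent. The helper `gsea_es()` and the toy scores are hypothetical; the real computation is delegated to clusterProfiler and may differ in implementation details, so treat this as a conceptual aid only.

```r
# Hypothetical helper: enrichment score of one gene set on a ranked list.
gsea_es <- function(scores, gene_set, exponent = 1) {
  scores <- sort(scores, decreasing = TRUE)
  in_set <- names(scores) %in% gene_set
  # Hits step up by |score|^exponent; misses step down by a constant.
  hit_weight <- ifelse(in_set, abs(scores)^exponent, 0)
  p_hit <- cumsum(hit_weight) / sum(hit_weight)
  p_miss <- cumsum(!in_set) / sum(!in_set)
  running <- p_hit - p_miss
  # The enrichment score is the largest deviation from zero.
  unname(running[which.max(abs(running))])
}

# Toy ranked scores (hypothetical protein IDs and values).
scores <- c(P00001 = 3, P00002 = 2, P00003 = 1, P00004 = -1, P00005 = -2)
gene_set <- c("P00001", "P00002", "P00005")

es_weighted <- gsea_es(scores, gene_set, exponent = 1)
es_unweighted <- gsea_es(scores, gene_set, exponent = 0)
print(c(weighted = es_weighted, unweighted = es_unweighted))
```

With exponent = 0 every hit contributes equally, whereas exponent = 1 lets proteins with larger absolute scores move the running sum further.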
Performs KEGG pathway Gene Set Enrichment Analysis (GSEA) on glycoproteins with dysregulated glycosylation.
enrich_gsea_kegg( dea_res, rank_by = "signed_log10p", aggr = "median", organism = "hsa", p_adj_method = "BH", p_cutoff = 0.05, min_gs_size = 10, max_gs_size = 500, exponent = 1, eps = 1e-10, seed = FALSE )
dea_res |
Differential analysis result. Can be one of:
|
rank_by |
Criteria for ranking proteins. One of the following:
|
aggr |
Aggregation method for combining multiple scores across different traits and sites for the same protein. One of "median", "mean", or "max". Defaults to "median". |
organism |
KEGG organism code. Defaults to "hsa" (Homo sapiens).
See |
p_adj_method |
P-value adjustment method.
One of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none".
Passed to |
p_cutoff |
P-value cutoff to filter significant terms.
Passed to |
min_gs_size |
Minimal size of each gene set for analyzing.
Gene sets with fewer genes than this threshold will be excluded.
Passed to |
max_gs_size |
Maximum size of each gene set for analyzing.
Gene sets with more genes than this threshold will be excluded.
Passed to |
exponent |
Weight of each step. Passed to |
eps |
Epsilon for calculating p-values. Passed to |
seed |
Logical indicating whether to set a random seed for reproducibility.
Passed to |
A clusterProfiler gseaResult object.
It can be readily converted to a tibble with tibble::as_tibble(),
or visualized with clusterProfiler functions like clusterProfiler::ridgeplot().
GSEA requires a ranked list of proteins as input.
With the defaults shown in the usage (rank_by = "signed_log10p", aggr = "median"), this function ranks proteins by the median of that score across all traits and sites.
This reflects the overall glycosylation dysregulation degree of each glycoprotein.
You can use rank_by to specify other ranking criteria, such as p-values or signed log2 fold changes.
You can also use aggr to specify how to aggregate multiple scores for the same protein across different traits and sites.
A common pattern of using this function is:
# 1. Perform differential analysis with `glystats`.
dea_res <- gly_ttest(exp)
# 2. Use this function.
go_res <- enrich_gsea_go(dea_res)  # or `enrich_gsea_xxx()` functions
Performs Network of Cancer Genes (NCG) Gene Set Enrichment Analysis (GSEA) on glycoproteins with dysregulated glycosylation.
enrich_gsea_ncg( dea_res, rank_by = "signed_log10p", aggr = "median", p_adj_method = "BH", p_cutoff = 0.05, min_gs_size = 10, max_gs_size = 500, exponent = 1, eps = 1e-10, seed = FALSE )
dea_res |
Differential analysis result. Can be one of:
|
rank_by |
Criteria for ranking proteins. One of the following:
|
aggr |
Aggregation method for combining multiple scores across different traits and sites for the same protein. One of "median", "mean", or "max". Defaults to "median". |
p_adj_method |
P-value adjustment method.
One of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none".
Passed to |
p_cutoff |
P-value cutoff to filter significant terms.
Passed to |
min_gs_size |
Minimal size of each gene set for analyzing.
Gene sets with fewer genes than this threshold will be excluded.
Passed to |
max_gs_size |
Maximum size of each gene set for analyzing.
Gene sets with more genes than this threshold will be excluded.
Passed to |
exponent |
Weight of each step. Passed to |
eps |
Epsilon for calculating p-values. Passed to |
seed |
Logical indicating whether to set a random seed for reproducibility.
Passed to |
A clusterProfiler gseaResult object.
It can be readily converted to a tibble with tibble::as_tibble(),
or visualized with clusterProfiler functions like clusterProfiler::ridgeplot().
GSEA requires a ranked list of proteins as input.
With the defaults shown in the usage (rank_by = "signed_log10p", aggr = "median"), this function ranks proteins by the median of that score across all traits and sites.
This reflects the overall glycosylation dysregulation degree of each glycoprotein.
You can use rank_by to specify other ranking criteria, such as p-values or signed log2 fold changes.
You can also use aggr to specify how to aggregate multiple scores for the same protein across different traits and sites.
A common pattern of using this function is:
# 1. Perform differential analysis with `glystats`.
dea_res <- gly_ttest(exp)
# 2. Use this function.
go_res <- enrich_gsea_go(dea_res)  # or `enrich_gsea_xxx()` functions
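The effect of min_gs_size and max_gs_size can be pictured with a small base-R sketch. The gene sets below are made up, and how the underlying enrichment function counts set sizes (for example, before or after restricting to the analyzed proteins) is not stated in this manual, so the plain `lengths()` count used here is an assumption.

```r
# Toy gene sets of different sizes (hypothetical IDs).
gene_sets <- list(
  set_small  = paste0("P", 1:3),
  set_medium = paste0("P", 1:40),
  set_large  = paste0("P", 1:800)
)

# Same defaults as in the usage above.
min_gs_size <- 10
max_gs_size <- 500

# Sets smaller than the minimum or larger than the maximum are excluded.
sizes <- lengths(gene_sets)
kept <- gene_sets[sizes >= min_gs_size & sizes <= max_gs_size]
print(names(kept))
```

Very small sets tend to give unstable statistics and very large sets are rarely informative, which is the usual motivation for bounding the size from both sides.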
Performs Reactome pathway Gene Set Enrichment Analysis (GSEA) on glycoproteins with dysregulated glycosylation.
enrich_gsea_reactome( dea_res, rank_by = "signed_log10p", aggr = "median", organism = "human", p_adj_method = "BH", p_cutoff = 0.05, min_gs_size = 10, max_gs_size = 500, exponent = 1, eps = 1e-10, seed = FALSE )
dea_res |
Differential analysis result. Can be one of:
|
rank_by |
Criteria for ranking proteins. One of the following:
|
aggr |
Aggregation method for combining multiple scores across different traits and sites for the same protein. One of "median", "mean", or "max". Defaults to "median". |
organism |
Reactome organism name. Passed to |
p_adj_method |
P-value adjustment method.
One of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none".
Passed to |
p_cutoff |
P-value cutoff to filter significant terms.
Passed to |
min_gs_size |
Minimal size of each gene set for analyzing.
Gene sets with fewer genes than this threshold will be excluded.
Passed to |
max_gs_size |
Maximum size of each gene set for analyzing.
Gene sets with more genes than this threshold will be excluded.
Passed to |
exponent |
Weight of each step. Passed to |
eps |
Epsilon for calculating p-values. Passed to |
seed |
Logical indicating whether to set a random seed for reproducibility.
Passed to |
A clusterProfiler gseaResult object.
It can be readily converted to a tibble with tibble::as_tibble(),
or visualized with clusterProfiler functions like clusterProfiler::ridgeplot().
GSEA requires a ranked list of proteins as input.
With the defaults shown in the usage (rank_by = "signed_log10p", aggr = "median"), this function ranks proteins by the median of that score across all traits and sites.
This reflects the overall glycosylation dysregulation degree of each glycoprotein.
You can use rank_by to specify other ranking criteria, such as p-values or signed log2 fold changes.
You can also use aggr to specify how to aggregate multiple scores for the same protein across different traits and sites.
A common pattern of using this function is:
# 1. Perform differential analysis with `glystats`.
dea_res <- gly_ttest(exp)
# 2. Use this function.
go_res <- enrich_gsea_go(dea_res)  # or `enrich_gsea_xxx()` functions
Performs WikiPathways Gene Set Enrichment Analysis (GSEA) on glycoproteins with dysregulated glycosylation.
enrich_gsea_wp( dea_res, rank_by = "signed_log10p", aggr = "median", organism = "Homo sapiens", p_adj_method = "BH", p_cutoff = 0.05, min_gs_size = 10, max_gs_size = 500, exponent = 1, eps = 1e-10, seed = FALSE )
dea_res |
Differential analysis result. Can be one of:
|
rank_by |
Criteria for ranking proteins. One of the following:
|
aggr |
Aggregation method for combining multiple scores across different traits and sites for the same protein. One of "median", "mean", or "max". Defaults to "median". |
organism |
WikiPathways organism name. Passed to |
p_adj_method |
P-value adjustment method.
One of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none".
Passed to |
p_cutoff |
P-value cutoff to filter significant terms.
Passed to |
min_gs_size |
Minimal size of each gene set for analyzing.
Gene sets with fewer genes than this threshold will be excluded.
Passed to |
max_gs_size |
Maximum size of each gene set for analyzing.
Gene sets with more genes than this threshold will be excluded.
Passed to |
exponent |
Weight of each step. Passed to |
eps |
Epsilon for calculating p-values. Passed to |
seed |
Logical indicating whether to set a random seed for reproducibility.
Passed to |
A clusterProfiler gseaResult object.
It can be readily converted to a tibble with tibble::as_tibble(),
or visualized with clusterProfiler functions like clusterProfiler::ridgeplot().
GSEA requires a ranked list of proteins as input.
With the defaults shown in the usage (rank_by = "signed_log10p", aggr = "median"), this function ranks proteins by the median of that score across all traits and sites.
This reflects the overall glycosylation dysregulation degree of each glycoprotein.
You can use rank_by to specify other ranking criteria, such as p-values or signed log2 fold changes.
You can also use aggr to specify how to aggregate multiple scores for the same protein across different traits and sites.
A common pattern of using this function is:
# 1. Perform differential analysis with `glystats`.
dea_res <- gly_ttest(exp)
# 2. Use this function.
go_res <- enrich_gsea_go(dea_res)  # or `enrich_gsea_xxx()` functions
Performs Disease Ontology (DO) Over-Representation Analysis (ORA) on glycoproteins with dysregulated glycosylation.
enrich_ora_do( dea_res, dea_p_cutoff = 0.05, dea_log2fc_cutoff = c(-1, 1), ont = "HDO", organism = "hsa", universe = NULL, p_adj_method = "BH", p_cutoff = 0.05, q_cutoff = 0.2, min_gs_size = 10, max_gs_size = 500 )
dea_res |
Differential analysis result. Can be one of:
|
dea_p_cutoff |
P-value cutoff for statistical significance. Defaults to 0.05.
For |
dea_log2fc_cutoff |
Log2 fold change cutoff for statistical significance.
A length-2 numeric vector, being negative and positive boundaries, respectively.
For example, |
ont |
One of "HDO" (Human Disease Ontology), "MPO" (Mammalian Phenotype Ontology),
or "VDO" (Vector Disease Ontology). Passed to |
organism |
"hsa" (Homo sapiens) or "mmu" (Mus musculus).
Passed to |
universe |
Background genes Uniprot IDs, directly passed to |
p_adj_method |
P-value adjustment method.
One of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none".
Passed to |
p_cutoff |
P-value cutoff to filter significant terms.
Passed to |
q_cutoff |
Q-value (FDR) cutoff to filter significant terms.
Passed to |
min_gs_size |
Minimal size of each gene set for analyzing.
Gene sets with fewer genes than this threshold will be excluded.
Passed to |
max_gs_size |
Maximum size of each gene set for analyzing.
Gene sets with more genes than this threshold will be excluded.
Passed to |
A clusterProfiler enrichResult object.
It can be readily converted to a tibble with tibble::as_tibble(),
or visualized with clusterProfiler functions like clusterProfiler::dotplot().
A common pattern of using this function is:
# 1. Perform differential analysis with `glystats`.
dea_res <- gly_ttest(exp)
# 2. Use this function.
go_res <- enrich_gc_ora_go(dea_res)  # or other glyfun functions
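Why the universe argument matters can be seen from the hypergeometric test that underlies over-representation analysis. The sketch below uses base R's `phyper()` with invented counts; it illustrates the statistic only and is not the code path glyfun or clusterProfiler actually runs.

```r
# Invented counts for one term.
set_size   <- 50  # background proteins annotated to the term
n_selected <- 40  # dysregulated proteins submitted to the test
n_overlap  <- 8   # dysregulated proteins annotated to the term

# Hypothetical helper: P(overlap >= n_overlap) for a given universe size.
ora_p <- function(universe_size) {
  phyper(n_overlap - 1, set_size, universe_size - set_size,
         n_selected, lower.tail = FALSE)
}

p_large_universe <- ora_p(1000)  # e.g. a broad, proteome-like background
p_small_universe <- ora_p(200)   # e.g. only the detected glycoproteins
print(c(large = p_large_universe, small = p_small_universe))
```

The same overlap looks strongly enriched against the large background and unremarkable against the small one, which is why supplying the proteins actually detected in the experiment (for example via detected_universe()) can change conclusions.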
Performs Gene Ontology (GO) Over-Representation Analysis (ORA) on glycoproteins with dysregulated glycosylation.
enrich_ora_go( dea_res, dea_p_cutoff = 0.05, dea_log2fc_cutoff = c(-1, 1), orgdb = "org.Hs.eg.db", ont = "MF", universe = NULL, p_adj_method = "BH", p_cutoff = 0.05, q_cutoff = 0.2, min_gs_size = 10, max_gs_size = 500 )
dea_res |
Differential analysis result. Can be one of:
|
dea_p_cutoff |
P-value cutoff for statistical significance. Defaults to 0.05.
For |
dea_log2fc_cutoff |
Log2 fold change cutoff for statistical significance.
A length-2 numeric vector, being negative and positive boundaries, respectively.
For example, |
orgdb |
An OrgDb object. Passed to |
ont |
Ontology type. Passed to |
universe |
Background genes Uniprot IDs, directly passed to |
p_adj_method |
P-value adjustment method.
One of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none".
Passed to |
p_cutoff |
P-value cutoff to filter significant terms.
Passed to |
q_cutoff |
Q-value (FDR) cutoff to filter significant terms.
Passed to |
min_gs_size |
Minimal size of each gene set for analyzing.
Gene sets with fewer genes than this threshold will be excluded.
Passed to |
max_gs_size |
Maximum size of each gene set for analyzing.
Gene sets with more genes than this threshold will be excluded.
Passed to |
A clusterProfiler enrichResult object.
It can be readily converted to a tibble with tibble::as_tibble(),
or visualized with clusterProfiler functions like clusterProfiler::dotplot().
A common pattern of using this function is:
# 1. Perform differential analysis with `glystats`.
dea_res <- gly_ttest(exp)
# 2. Use this function.
go_res <- enrich_gc_ora_go(dea_res)  # or other glyfun functions
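The p_adj_method values are the method names accepted by base R's `p.adjust()`. As a rough illustration of how the choice interacts with p_cutoff, the sketch below adjusts a handful of invented term p-values with two methods; the `<=` comparison is an assumption about how the cutoff is applied downstream.

```r
# Invented raw p-values for five terms.
p_values <- c(term_a = 0.001, term_b = 0.012, term_c = 0.04,
              term_d = 0.20, term_e = 0.60)
p_cutoff <- 0.05

# Benjamini-Hochberg controls the false discovery rate.
p_bh <- p.adjust(p_values, method = "BH")
significant_bh <- names(p_bh)[p_bh <= p_cutoff]

# Bonferroni controls the family-wise error rate and is stricter.
p_bonf <- p.adjust(p_values, method = "bonferroni")
significant_bonf <- names(p_bonf)[p_bonf <= p_cutoff]

print(list(BH = significant_bh, bonferroni = significant_bonf))
```

Note that term_c has a raw p-value below 0.05 but does not survive either adjustment, so filtering on adjusted values can remove terms that look significant in isolation.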
Performs KEGG pathway Over-Representation Analysis (ORA) on glycoproteins with dysregulated glycosylation.
enrich_ora_kegg( dea_res, dea_p_cutoff = 0.05, dea_log2fc_cutoff = c(-1, 1), organism = "hsa", universe = NULL, p_adj_method = "BH", p_cutoff = 0.05, q_cutoff = 0.2, min_gs_size = 10, max_gs_size = 500 )
dea_res |
Differential analysis result. Can be one of:
|
dea_p_cutoff |
P-value cutoff for statistical significance. Defaults to 0.05.
For |
dea_log2fc_cutoff |
Log2 fold change cutoff for statistical significance.
A length-2 numeric vector, being negative and positive boundaries, respectively.
For example, |
organism |
KEGG organism code. Passed to |
universe |
Background genes Uniprot IDs, directly passed to |
p_adj_method |
P-value adjustment method.
One of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none".
Passed to |
p_cutoff |
P-value cutoff to filter significant terms.
Passed to |
q_cutoff |
Q-value (FDR) cutoff to filter significant terms.
Passed to |
min_gs_size |
Minimal size of each gene set for analyzing.
Gene sets with fewer genes than this threshold will be excluded.
Passed to |
max_gs_size |
Maximum size of each gene set for analyzing.
Gene sets with more genes than this threshold will be excluded.
Passed to |
A clusterProfiler enrichResult object.
It can be readily converted to a tibble with tibble::as_tibble(),
or visualized with clusterProfiler functions like clusterProfiler::dotplot().
A common pattern of using this function is:
# 1. Perform differential analysis with `glystats`.
dea_res <- gly_ttest(exp)
# 2. Use this function.
go_res <- enrich_gc_ora_go(dea_res)  # or other glyfun functions
Performs Network of Cancer Genes (NCG) Over-Representation Analysis (ORA) on glycoproteins with dysregulated glycosylation.
enrich_ora_ncg( dea_res, dea_p_cutoff = 0.05, dea_log2fc_cutoff = c(-1, 1), universe = NULL, p_adj_method = "BH", p_cutoff = 0.05, q_cutoff = 0.2, min_gs_size = 10, max_gs_size = 500 )
dea_res |
Differential analysis result. Can be one of:
|
dea_p_cutoff |
P-value cutoff for statistical significance. Defaults to 0.05.
For |
dea_log2fc_cutoff |
Log2 fold change cutoff for statistical significance.
A length-2 numeric vector, being negative and positive boundaries, respectively.
For example, |
universe |
Background genes Uniprot IDs, directly passed to |
p_adj_method |
P-value adjustment method.
One of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none".
Passed to |
p_cutoff |
P-value cutoff to filter significant terms.
Passed to |
q_cutoff |
Q-value (FDR) cutoff to filter significant terms.
Passed to |
min_gs_size |
Minimal size of each gene set for analyzing.
Gene sets with fewer genes than this threshold will be excluded.
Passed to |
max_gs_size |
Maximum size of each gene set for analyzing.
Gene sets with more genes than this threshold will be excluded.
Passed to |
A clusterProfiler enrichResult object.
It can be readily converted to a tibble with tibble::as_tibble(),
or visualized with clusterProfiler functions like clusterProfiler::dotplot().
A common pattern of using this function is:
# 1. Perform differential analysis with `glystats`.
dea_res <- gly_ttest(exp)
# 2. Use this function.
go_res <- enrich_gc_ora_go(dea_res)  # or other glyfun functions
Performs Reactome pathway Over-Representation Analysis (ORA) on glycoproteins with dysregulated glycosylation.
enrich_ora_reactome(
  dea_res,
  dea_p_cutoff = 0.05,
  dea_log2fc_cutoff = c(-1, 1),
  organism = "human",
  universe = NULL,
  p_adj_method = "BH",
  p_cutoff = 0.05,
  q_cutoff = 0.2,
  min_gs_size = 10,
  max_gs_size = 500
)
dea_res |
Differential analysis result. Can be one of:
|
dea_p_cutoff |
P-value cutoff for statistical significance. Defaults to 0.05.
For |
dea_log2fc_cutoff |
Log2 fold change cutoff for statistical significance.
A length-2 numeric vector, being negative and positive boundaries, respectively.
For example, |
organism |
Reactome organism name. Passed to |
universe |
UniProt IDs of the background genes, passed directly to |
p_adj_method |
P-value adjustment method.
One of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none".
Passed to |
p_cutoff |
P-value cutoff to filter significant terms.
Passed to |
q_cutoff |
Q-value (FDR) cutoff to filter significant terms.
Passed to |
min_gs_size |
Minimum size of gene sets to analyze.
Gene sets with fewer genes than this threshold will be excluded.
Passed to |
max_gs_size |
Maximum size of gene sets to analyze.
Gene sets with more genes than this threshold will be excluded.
Passed to |
A clusterProfiler enrichResult object.
It can be readily converted to a tibble with tibble::as_tibble(),
or visualized with clusterProfiler functions like clusterProfiler::dotplot().
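The `universe` argument matters because over-representation analysis is, at its core, a one-sided hypergeometric test against a background. The base-R sketch below uses invented counts to show how the same overlap yields a different p-value when the background shrinks from a whole proteome to only the detected glycoproteins. The helper `ora_p()` is hypothetical, and the sketch illustrates the statistics rather than the actual code path of glyfun or clusterProfiler.

```r
# One-sided hypergeometric p-value for over-representation (hypothetical helper).
#   k: foreground proteins annotated to the term
#   n: foreground size
#   M: universe proteins annotated to the term
#   N: universe size
ora_p <- function(k, n, M, N) {
  # P(X >= k) when drawing n proteins from a universe with M annotated ones.
  phyper(k - 1, M, N - M, n, lower.tail = FALSE)
}

# Invented counts: 8 of 40 dysregulated proteins hit the pathway.
p_whole    <- ora_p(k = 8, n = 40, M = 150, N = 20000)  # whole-proteome background
p_detected <- ora_p(k = 8, n = 40, M = 60,  N = 400)    # detected-only background

print(c(whole = p_whole, detected = p_detected))
```

With the detected-only background the term is far less surprising, which is why passing the output of `detected_universe()` as `universe` usually gives more conservative results than the default background.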
A common pattern of using this function is:
# 1. Perform differential analysis with `glystats`.
dea_res <- gly_ttest(exp)
# 2. Use this function.
go_res <- enrich_gc_ora_go(dea_res)  # or other glyfun functions
Performs WikiPathways Over-Representation Analysis (ORA) on glycoproteins with dysregulated glycosylation.
enrich_ora_wp(
  dea_res,
  dea_p_cutoff = 0.05,
  dea_log2fc_cutoff = c(-1, 1),
  organism = "Homo sapiens",
  universe = NULL,
  p_adj_method = "BH",
  p_cutoff = 0.05,
  q_cutoff = 0.2,
  min_gs_size = 10,
  max_gs_size = 500
)
dea_res |
Differential analysis result. Can be one of:
|
dea_p_cutoff |
P-value cutoff for statistical significance. Defaults to 0.05.
For |
dea_log2fc_cutoff |
Log2 fold change cutoff for statistical significance.
A length-2 numeric vector, being negative and positive boundaries, respectively.
For example, |
organism |
WikiPathways organism name. Passed to |
universe |
UniProt IDs of the background genes, passed directly to |
p_adj_method |
P-value adjustment method.
One of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none".
Passed to |
p_cutoff |
P-value cutoff to filter significant terms.
Passed to |
q_cutoff |
Q-value (FDR) cutoff to filter significant terms.
Passed to |
min_gs_size |
Minimum size of gene sets to analyze.
Gene sets with fewer genes than this threshold will be excluded.
Passed to |
max_gs_size |
Maximum size of gene sets to analyze.
Gene sets with more genes than this threshold will be excluded.
Passed to |
A clusterProfiler enrichResult object.
It can be readily converted to a tibble with tibble::as_tibble(),
or visualized with clusterProfiler functions like clusterProfiler::dotplot().
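The `p_adj_method` choices listed above are the method names accepted by base R's `stats::p.adjust()`. As a rough illustration with invented term-level p-values, the sketch below compares two adjustment methods and applies a 0.05 cutoff to the BH-adjusted values. Applying the cutoff to adjusted p-values is an assumption here, not a statement about how glyfun filters terms internally.

```r
# Invented raw p-values for five terms.
p_raw <- c(termA = 0.001, termB = 0.012, termC = 0.020,
           termD = 0.041, termE = 0.300)

# Adjust with two of the supported methods; names are preserved.
p_bh   <- p.adjust(p_raw, method = "BH")
p_bonf <- p.adjust(p_raw, method = "bonferroni")
print(round(cbind(raw = p_raw, BH = p_bh, bonferroni = p_bonf), 3))

# Terms surviving a 0.05 cutoff on the BH-adjusted values (assumed filtering rule).
p_cutoff <- 0.05
kept <- names(p_raw)[p_bh <= p_cutoff]
print(kept)
```

Note that `termD` passes the cutoff on its raw p-value but not after BH adjustment, so the choice of `p_adj_method` directly affects how many terms are reported.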
A common pattern of using this function is:
# 1. Perform differential analysis with `glystats`.
dea_res <- gly_ttest(exp)
# 2. Use this function.
go_res <- enrich_gc_ora_go(dea_res)  # or other glyfun functions