Project: NEGEDIA
This report summarises differential gene expression analysis as performed by the negedia/degsanalysis pipeline.
A summary of the sample sequencing metrics is given below:
Comparisons defined by the user are listed in the following table:
Results
Counts
The results are derived from the global gene expression analysis of the experiment and include only genes that are expressed in at least one sample (genes with 0 counts in all samples are excluded).
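The filtering rule described above can be illustrated with a minimal sketch. The function name, the dictionary layout and the gene identifiers are hypothetical and are not taken from the pipeline itself.

```python
def filter_expressed_genes(counts):
    """Keep genes with a non-zero count in at least one sample.

    counts: dict mapping a gene ID to a list of raw counts, one per sample
    (hypothetical layout, chosen only for illustration).
    """
    return {
        gene: values
        for gene, values in counts.items()
        if any(value > 0 for value in values)
    }


# Toy count table with three samples per gene.
counts = {
    "GENE_A": [0, 0, 0],   # zero in every sample, so it is dropped
    "GENE_B": [0, 12, 3],
    "GENE_C": [5, 0, 0],
}
expressed = filter_expressed_genes(counts)
print(sorted(expressed))
```

Only GENE_A is removed here, because it is the only gene with 0 counts in all samples.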
Preliminary analysis
Expression value distributions
The following plots show the log2 expression value distributions of the experiment samples.
The distribution of normalised counts per gene, for each sample, is represented in the form of a box plot.
The density plot represents the distribution of the log2 normalised counts.
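As a rough illustration of what the box plot summarises, the sketch below log2-transforms the normalised counts of one sample and computes the five values a box plot is usually drawn from. The pseudocount of 1 and the quartile method are assumptions; the report does not state which ones the pipeline uses.

```python
import math
import statistics


def log2_summary(normalised_counts, pseudocount=1.0):
    """Five-number summary of log2(count + pseudocount) for one sample.

    The pseudocount avoids log2(0); its value is an assumption.
    """
    logged = [math.log2(count + pseudocount) for count in normalised_counts]
    q1, median, q3 = statistics.quantiles(logged, n=4, method="inclusive")
    return {
        "min": min(logged),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(logged),
    }


# Toy sample: the counts map to log2 values 0, 1, 2, 3 and 4.
summary = log2_summary([0, 1, 3, 7, 15])
print(summary)
```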
Sample correlation
Principal component analysis was conducted on the 500 most variable genes. Each component is annotated with its percent contribution to the total variance.
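The two bookkeeping steps around the PCA, selecting the most variable genes and expressing each component as a percentage of the total variance, can be sketched as follows. This is not the pipeline's code: the function names are hypothetical, ranking by sample variance is an assumption, and the eigenvalues in the example are made-up numbers standing in for the output of a real decomposition.

```python
import statistics


def most_variable_genes(expression, n_top=500):
    """Return the n_top gene IDs with the highest variance across samples.

    expression: dict mapping a gene ID to a list of expression values,
    one per sample (hypothetical layout).
    """
    variances = {
        gene: statistics.variance(values) for gene, values in expression.items()
    }
    ranked = sorted(variances, key=variances.get, reverse=True)
    return ranked[:n_top]


def percent_variance(eigenvalues):
    """Percent contribution of each principal component to total variance."""
    total = sum(eigenvalues)
    return [100.0 * value / total for value in eigenvalues]


expression = {
    "G1": [1, 1, 1, 1],   # constant, variance 0
    "G2": [0, 4, 0, 4],   # most variable
    "G3": [2, 3, 2, 3],
}
top = most_variable_genes(expression, n_top=2)
shares = percent_variance([4.0, 3.0, 1.0])
print(top, shares)
```

In a real analysis the eigenvalues would come from the decomposition of the selected genes' expression matrix; the second function only shows how the percentage labels on the axes are derived from them.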
Normalised (Condition)
An ANOVA test was used to determine associations between the continuous principal components and the categorical covariates (including the variable of interest).
The resulting p values are illustrated below.
The variable ‘Condition’ shows an association with PC1 (49.7%) (p < 0.01).
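The test behind these p values can be sketched as a one-way ANOVA of the sample scores on one principal component against a categorical covariate. The code below computes only the F statistic, and the scores and group labels are invented for illustration. Turning F into a p value requires the F distribution, which a library routine such as `scipy.stats.f_oneway` provides.

```python
def one_way_anova_f(values, groups):
    """F statistic of a continuous variable across categorical groups.

    values: continuous scores, e.g. sample coordinates on PC1.
    groups: group label of each sample, in the same order.
    Assumes at least two groups and some within-group variation.
    """
    by_group = {}
    for value, group in zip(values, groups):
        by_group.setdefault(group, []).append(value)

    n_total = len(values)
    n_groups = len(by_group)
    grand_mean = sum(values) / n_total

    ss_between = 0.0
    ss_within = 0.0
    for members in by_group.values():
        group_mean = sum(members) / len(members)
        ss_between += len(members) * (group_mean - grand_mean) ** 2
        ss_within += sum((value - group_mean) ** 2 for value in members)

    ms_between = ss_between / (n_groups - 1)
    ms_within = ss_within / (n_total - n_groups)
    return ms_between / ms_within


# Invented PC1 scores for three control and three treated samples.
pc1_scores = [-2.0, -1.0, -3.0, 2.0, 3.0, 1.0]
condition = ["control", "control", "control", "treated", "treated", "treated"]
f_stat = one_way_anova_f(pc1_scores, condition)
print(f_stat)
```

A large F statistic means the group means differ by much more than the scatter within each group, which is what an association between a covariate and a component looks like.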
A hierarchical clustering of genes was performed on the 500 most variable genes. Distances between genes were estimated from the Spearman correlation.
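A correlation-based distance of this kind can be sketched as below. The conversion `1 - rho` is an assumption, since the report does not say how the Spearman correlation is turned into a distance, and all function names are hypothetical.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda index: values[index])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


def spearman_distance(x, y):
    """Assumed distance: 1 minus the Spearman rank correlation."""
    return 1.0 - pearson(average_ranks(x), average_ranks(y))


profile_a = [1, 5, 9, 20]
profile_b = [2, 3, 50, 51]   # same ordering as profile_a
profile_c = [9, 7, 3, 1]     # reversed ordering
print(spearman_distance(profile_a, profile_b))
print(spearman_distance(profile_a, profile_c))
```

Because only ranks matter, two profiles with the same ordering are at distance 0 regardless of scale, and two with opposite orderings are at the maximum distance of 2.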
Normalised (Condition)
Differential analysis
Differentially expressed genes
Differential expression details
The red dots represent differentially expressed genes (DEGs). Genes with a log2 fold change greater than or equal to the specified threshold are upregulated relative to the reference (dots on the right). Genes with a log2 fold change less than or equal to the negative of that threshold are downregulated relative to the reference (dots on the left). Genes with an FDR (padj) value less than or equal to 0.05 are considered statistically significant.
Heatmap of normalized scaled expression per replicate of DEGs in the selected comparison.
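The selection rule for DEGs can be written as a small classifier. The default thresholds are the ones listed in the DEGs ANALYSIS table of this report (log2 fold change of 1.5, padj of 0.05). The function name and the handling of missing padj values are illustrative assumptions.

```python
def classify_gene(log2_fold_change, padj, lfc_threshold=1.5, padj_threshold=0.05):
    """Label a gene as 'up', 'down' or 'not significant'.

    A gene is a DEG only if it passes both the padj and the fold-change
    threshold. A missing padj (None) is treated as not significant,
    which is an assumption of this sketch.
    """
    if padj is None or padj > padj_threshold:
        return "not significant"
    if log2_fold_change >= lfc_threshold:
        return "up"
    if log2_fold_change <= -lfc_threshold:
        return "down"
    return "not significant"


# Hypothetical results: (log2 fold change, padj) per gene.
results = {
    "GENE_A": (2.3, 0.001),
    "GENE_B": (-1.8, 0.04),
    "GENE_C": (3.0, 0.20),    # large change, but not significant
    "GENE_D": (0.4, 0.0001),  # significant, but change too small
}
labels = {gene: classify_gene(lfc, padj) for gene, (lfc, padj) in results.items()}
print(labels)
```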
Gene set analysis
Gene Set Enrichment Analysis (GSEA) is a computational method that determines whether an a priori defined set of genes shows statistically significant, concordant differences between two biological states (e.g. phenotypes).
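The core of the method is a running-sum statistic over a ranked gene list. The sketch below follows the general scheme published for GSEA: walking down the list, the sum increases at genes in the set (weighted by their ranking score) and decreases at genes outside it, and the enrichment score is the largest deviation from zero. It is a simplified illustration with hypothetical names, not the implementation used by the pipeline, and it does not cover the permutation step needed for NES and FDR.

```python
def enrichment_score(ranked_genes, gene_set, weight=1.0):
    """Running-sum enrichment score of gene_set in a ranked gene list.

    ranked_genes: list of (gene, score) pairs sorted by score, descending.
    gene_set: set of gene IDs forming the pathway.
    weight: exponent applied to |score| for genes in the set.
    """
    hit_weights = [
        abs(score) ** weight if gene in gene_set else 0.0
        for gene, score in ranked_genes
    ]
    n_hits = sum(1 for gene, _ in ranked_genes if gene in gene_set)
    n_misses = len(ranked_genes) - n_hits
    hit_total = sum(hit_weights)
    if n_hits == 0 or n_misses == 0 or hit_total == 0:
        raise ValueError("gene set must partially overlap the ranked list")

    running = 0.0
    extreme = 0.0
    for (gene, _), hit_weight in zip(ranked_genes, hit_weights):
        if gene in gene_set:
            running += hit_weight / hit_total
        else:
            running -= 1.0 / n_misses
        if abs(running) > abs(extreme):
            extreme = running
    return extreme


# Toy ranking of six genes by a made-up score.
ranked = [("A", 3.0), ("B", 2.0), ("C", 1.0), ("D", -1.0), ("E", -2.0), ("F", -3.0)]
es_top = enrichment_score(ranked, {"A", "B"})      # set at the top of the list
es_bottom = enrichment_score(ranked, {"E", "F"})   # set at the bottom
print(es_top, es_bottom)
```

A set concentrated at the top of the ranking yields a positive score and a set concentrated at the bottom a negative one, which is how the sign of ES and NES in the tables should be read.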
Analysed databases:
c2.cp.kegg: Canonical Pathways gene sets derived from the KEGG pathway database.
c2.cp.reactome: Canonical Pathways gene sets derived from the Reactome pathway database.
c2.cp.wikipathways: Canonical Pathways gene sets derived from the WikiPathways pathway database.
c5.go.bp: Gene sets derived from the GO Biological Process ontology.
c5.go.cc: Gene sets derived from the GO Cellular Component ontology.
c5.go.mf: Gene sets derived from the GO Molecular Function ontology.
c8.all: Gene sets that contain curated cluster markers for cell types identified in single-cell sequencing studies of human tissue.
GSEA
The tables report the pathway name, the pathway size, the enrichment score (ES), the normalized enrichment score (NES), and the FDR value.
Materials and Methods
Library Preparation
Total RNA was quantified using the Qubit 4.0 fluorimetric assay (Thermo Fisher Scientific). Libraries were prepared from 125 ng of total RNA using the NEGEDIA Digital mRNA-seq research grade sequencing service v2.0 (Next Generation Diagnostic srl) [1], which included library preparation, quality assessment and sequencing on a NovaSeq 6000 sequencing system (Illumina Inc.) using a single-end, 100-cycle strategy.
Bioinformatics workflow
The raw data were analyzed with the proprietary NEGEDIA Digital mRNA-seq pipeline (v2.0) of Next Generation Diagnostic srl, which involves a cleaning step (quality filtering and trimming), alignment to the reference genome, and counting by gene [2, 3, 4]. The raw expression data were normalized and analyzed with the NEGEDIA degsanalysis pipeline (v1.2.0) [5, 6] and visualized in a proprietary report (v1.0).
QUICK GEODATASET REFERENCE
PROTOCOLS | Description |
---|---|
Growth Protocol | Defined by the user |
Treatment Protocol | Defined by the user |
Extract Protocol | Defined by the user |
Library Construction Protocol | NEGEDIA Digital mRNA-seq research grade sequencing service v2.0 (Next Generation Diagnostic srl) |
Library Strategy | NEGEDIA Digital mRNA-seq v2.0 |
DATA PROCESSING PIPELINE | Description |
---|---|
Data Processing Step | Illumina NovaSeq 6000 base call (BCL) files were converted to fastq files using bcl2fastq |
Data Processing Step | Trimming and cleaning with bbduk |
Data Processing Step | Alignment was performed with STAR 2.6.0a |
Data Processing Step | Gene expression levels were determined with htseq-count (HTSeq 0.9.1) |
Genome Build | hg38 |
Processed Data Files Format and Content | Tab-delimited text files including raw counts |
DEGs ANALYSIS | Description |
---|---|
Fold Change | log2(Fold Change) <= -1.5 or log2(Fold Change) >= 1.5 |
pAdj | <= 0.05 |
GSEA ANALYSIS | Description |
---|---|
FDR q-val | <= 25% |
Permutation number | 1000 |
1. Xiong Y, Soumillon M, Wu J et al. A Comparison of mRNA Sequencing with Random Primed and 3′-Directed Libraries. Sci Rep 7, 14626 (2017). https://doi.org/10.1038/s41598-017-14892-x
2. Bushnell B. BBMap: A Fast, Accurate, Splice-Aware Aligner. United States (2014). https://www.osti.gov/servlets/purl/1241166
3. Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, Batut P, Chaisson M, Gingeras TR. STAR: ultrafast universal RNA-seq aligner. Bioinformatics. 2013 Jan 1;29(1):15-21. doi: 10.1093/bioinformatics/bts635. PMID: 23104886; PMCID: PMC3530905.
4. Anders S, Pyl PT, Huber W. HTSeq - a Python framework to work with high-throughput sequencing data. Bioinformatics. 2015 Jan 15;31(2):166-9. doi: 10.1093/bioinformatics/btu638. PMID: 25260700; PMCID: PMC4287950.
5. Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol.
6. Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, Paulovich A, Pomeroy SL, Golub TR, Lander ES, Mesirov JP. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A. 2005 Oct 25;102(43):15545-50. doi: 10.1073/pnas.0506580102. PMID: 16199517; PMCID: PMC1239896.