Last updated: 2025-06-26

Checks: 5 passed, 2 warnings

Knit directory: locust-comparative-genomics/

This reproducible R Markdown analysis was created with workflowr (version 1.7.1). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


The R Markdown file has unstaged changes. To know which version of the R Markdown file created these results, you’ll want to first commit it to the Git repo. If you’re still working on the analysis, you can ignore this warning. When you’re finished, you can run wflow_publish to commit the R Markdown file and build the HTML.

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility it’s best to always run the code in an empty environment.
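As a minimal sketch of why leftover objects matter, the example below uses a function that depends on a variable it never defines. The names `count_above`, `threshold`, `scores`, and `cluttered` are hypothetical and do not come from this analysis; the `cluttered` environment simply stands in for a global environment that still holds objects from an earlier session.

```r
# A function with a free variable: `threshold` is not an argument,
# so R looks it up in the function's enclosing environment.
count_above <- function(x) sum(x > threshold)

scores <- c(0.2, 0.6, 0.9)

# Hypothetical stand-in for a global environment with leftover objects.
cluttered <- new.env()
environment(count_above) <- cluttered

# Leftover value from one session
assign("threshold", 0.5, envir = cluttered)
n_first <- count_above(scores)

# Same code, different leftover value, different result
assign("threshold", 0.8, envir = cluttered)
n_second <- count_above(scores)

print(c(n_first, n_second))
```

The code itself never changes between the two calls, yet the counts differ, which is the kind of silent dependency an empty environment rules out.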

The command set.seed(20221025) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
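The effect of the seed can be illustrated with a short base R sketch. Only the seed value is taken from this report; the draws from `sample()` are an arbitrary example, not a step of the analysis.

```r
# Resetting the same seed before each draw reproduces the same random numbers.
set.seed(20221025)
first_draw <- sample(1:100, 5)

set.seed(20221025)
second_draw <- sample(1:100, 5)

# Both draws are identical because the generator started from the same state.
print(identical(first_draw, second_draw))
```

Any step that relies on random numbers, such as the subsampling or permutations mentioned above, behaves the same way as long as the seed and the order of random calls stay the same.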

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

Using absolute paths to the files within your workflowr project makes it difficult for you and others to run your code on a different machine. Change the absolute path(s) below to the suggested relative path(s) to make your code more reproducible.

| absolute | relative |
|---|---|
| /Users/maevatecher/Documents/GitHub/locust-comparative-genomics/data | data |
| /Users/maevatecher/Documents/GitHub/locust-comparative-genomics/data/orthofinder/Schistocerca | data/orthofinder/Schistocerca |
| /Users/maevatecher/Documents/GitHub/locust-comparative-genomics/data/orthofinder/Polyneoptera/Results_I2_iqtree/ | data/orthofinder/Polyneoptera/Results_I2_iqtree |
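One way to apply the suggested change is sketched below in base R. The helper `to_relative()` is hypothetical (it is not part of workflowr), and the sketch assumes the code is run with the project root as the working directory, as the knit directory reported above suggests.

```r
# Hypothetical helper: strip the project root from an absolute path.
to_relative <- function(path, root) {
  path <- sub("/+$", "", path)   # drop trailing slashes
  root <- sub("/+$", "", root)
  prefix <- paste0(root, "/")
  if (startsWith(path, prefix)) {
    substring(path, nchar(prefix) + 1)
  } else {
    path                         # not under the project root: leave unchanged
  }
}

project_root <- "/Users/maevatecher/Documents/GitHub/locust-comparative-genomics"
abs_path <- file.path(project_root, "data/orthofinder/Polyneoptera/Results_I2_iqtree/")

rel_path <- to_relative(abs_path, project_root)
print(rel_path)

# Building the path from its components avoids hard-coding the root at all.
schisto_dir <- file.path("data", "orthofinder", "Schistocerca")
print(schisto_dir)
```

Relative paths written this way resolve on any machine where the repository is cloned, because they no longer depend on the location of a particular user's home directory.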

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.

The results in this page were generated with repository version 510c1f0. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .DS_Store
    Ignored:    analysis/.DS_Store
    Ignored:    analysis/.Rhistory
    Ignored:    analysis/figure/
    Ignored:    code/.DS_Store
    Ignored:    code/scripts/.DS_Store
    Ignored:    code/scripts/pal2nal.v14/.DS_Store
    Ignored:    data/.DS_Store
    Ignored:    data/DEG_results/.DS_Store
    Ignored:    data/DEG_results/Bulk_RNAseq/.DS_Store
    Ignored:    data/DEG_results/Bulk_RNAseq/americana/.DS_Store
    Ignored:    data/DEG_results/Bulk_RNAseq/cancellata/.DS_Store
    Ignored:    data/DEG_results/Bulk_RNAseq/cubense/.DS_Store
    Ignored:    data/DEG_results/Bulk_RNAseq/gregaria/.DS_Store
    Ignored:    data/DEG_results/Bulk_RNAseq/nitens/.DS_Store
    Ignored:    data/WGCNA/.DS_Store
    Ignored:    data/WGCNA/input/.DS_Store
    Ignored:    data/WGCNA/input/Bulk_RNAseq/.DS_Store
    Ignored:    data/WGCNA/output/.DS_Store
    Ignored:    data/WGCNA/output/Bulk_RNAseq/.DS_Store
    Ignored:    data/behavioral_data/.DS_Store
    Ignored:    data/behavioral_data/Raw_data/.DS_Store
    Ignored:    data/list/.DS_Store
    Ignored:    data/list/Bulk_RNAseq/.DS_Store
    Ignored:    data/list/GO_Annotations/.DS_Store
    Ignored:    data/list/excluded_loci/.DS_Store
    Ignored:    data/orthofinder/.DS_Store
    Ignored:    data/orthofinder/Polyneoptera/.DS_Store
    Ignored:    data/orthofinder/Polyneoptera/Results_I2_iqtree/.DS_Store
    Ignored:    data/orthofinder/Polyneoptera/Results_I2_iqtree/Orthogroups/.DS_Store
    Ignored:    data/orthofinder/Polyneoptera/Results_I2_withDaust/.DS_Store
    Ignored:    data/orthofinder/Polyneoptera/Results_I2_withDaust/Orthogroups/.DS_Store
    Ignored:    data/orthofinder/Schistocerca/.DS_Store
    Ignored:    data/orthofinder/Schistocerca/Results_I2/.DS_Store
    Ignored:    data/orthofinder/Schistocerca/Results_I2/Orthogroups/.DS_Store
    Ignored:    data/overlap/.DS_Store
    Ignored:    data/overlap/Locusts/
    Ignored:    data/pathway_enrichment/.DS_Store
    Ignored:    data/readcounts/.DS_Store
    Ignored:    data/readcounts/Bulk_RNAseq/.DS_Store
    Ignored:    data/readcounts/RNAi/.DS_Store

Untracked files:
    Untracked:  VennDiagram.2025-06-26_12-29-36.059755.log
    Untracked:  VennDiagram.2025-06-26_12-29-36.658769.log
    Untracked:  VennDiagram.2025-06-26_12-29-37.055532.log
    Untracked:  VennDiagram.2025-06-26_12-29-37.460107.log
    Untracked:  VennDiagram.2025-06-26_12-29-37.904047.log
    Untracked:  VennDiagram.2025-06-26_12-29-38.353885.log
    Untracked:  VennDiagram.2025-06-26_12-29-38.471071.log
    Untracked:  VennDiagram.2025-06-26_12-29-38.5981.log
    Untracked:  VennDiagram.2025-06-26_12-29-39.436414.log
    Untracked:  VennDiagram.2025-06-26_12-29-39.495595.log
    Untracked:  VennDiagram.2025-06-26_12-29-39.613007.log
    Untracked:  VennDiagram.2025-06-26_12-29-40.471068.log
    Untracked:  VennDiagram.2025-06-26_12-29-40.504205.log
    Untracked:  VennDiagram.2025-06-26_12-29-40.565478.log
    Untracked:  VennDiagram.2025-06-26_12-29-41.08025.log
    Untracked:  VennDiagram.2025-06-26_12-29-41.112978.log
    Untracked:  VennDiagram.2025-06-26_12-29-41.174449.log
    Untracked:  VennDiagram.2025-06-26_12-29-41.747758.log
    Untracked:  VennDiagram.2025-06-26_12-29-41.835115.log
    Untracked:  VennDiagram.2025-06-26_12-29-41.978458.log
    Untracked:  VennDiagram.2025-06-26_12-29-42.606851.log
    Untracked:  VennDiagram.2025-06-26_12-29-42.758257.log
    Untracked:  VennDiagram.2025-06-26_12-29-42.857291.log
    Untracked:  VennDiagram.2025-06-26_12-29-43.561813.log
    Untracked:  VennDiagram.2025-06-26_12-29-43.680914.log
    Untracked:  VennDiagram.2025-06-26_12-29-43.812004.log
    Untracked:  VennDiagram.2025-06-26_12-29-43.926406.log
    Untracked:  VennDiagram.2025-06-26_12-29-44.056153.log
    Untracked:  VennDiagram.2025-06-26_12-29-44.185889.log
    Untracked:  VennDiagram.2025-06-26_12-29-51.527365.log
    Untracked:  VennDiagram.2025-06-26_12-29-51.93553.log
    Untracked:  VennDiagram.2025-06-26_12-29-52.382761.log
    Untracked:  VennDiagram.2025-06-26_12-29-52.83256.log
    Untracked:  VennDiagram.2025-06-26_12-29-53.28041.log
    Untracked:  VennDiagram.2025-06-26_12-29-54.79983.log
    Untracked:  VennDiagram.2025-06-26_12-29-54.895253.log
    Untracked:  VennDiagram.2025-06-26_12-29-54.958389.log
    Untracked:  VennDiagram.2025-06-26_12-29-56.725858.log
    Untracked:  VennDiagram.2025-06-26_12-29-56.811487.log
    Untracked:  VennDiagram.2025-06-26_12-29-58.794506.log
    Untracked:  VennDiagram.2025-06-26_12-29-58.843431.log
    Untracked:  VennDiagram.2025-06-26_12-29-58.91168.log
    Untracked:  VennDiagram.2025-06-26_12-30-00.664344.log
    Untracked:  VennDiagram.2025-06-26_12-30-00.706722.log
    Untracked:  VennDiagram.2025-06-26_12-30-02.592388.log
    Untracked:  VennDiagram.2025-06-26_12-30-02.652934.log
    Untracked:  VennDiagram.2025-06-26_12-30-04.458741.log
    Untracked:  VennDiagram.2025-06-26_12-30-04.497499.log
    Untracked:  VennDiagram.2025-06-26_12-30-04.551857.log
    Untracked:  VennDiagram.2025-06-26_12-30-06.643635.log
    Untracked:  VennDiagram.2025-06-26_12-30-06.710079.log
    Untracked:  VennDiagram.2025-06-26_12-30-06.837773.log
    Untracked:  VennDiagram.2025-06-26_12-30-08.904242.log
    Untracked:  VennDiagram.2025-06-26_12-30-08.974097.log
    Untracked:  VennDiagram.2025-06-26_12-30-09.104634.log
    Untracked:  VennDiagram.2025-06-26_12-30-11.82923.log
    Untracked:  VennDiagram.2025-06-26_12-30-11.948473.log
    Untracked:  VennDiagram.2025-06-26_12-30-12.031984.log
    Untracked:  VennDiagram.2025-06-26_12-30-12.162686.log
    Untracked:  VennDiagram.2025-06-26_12-30-12.293596.log
    Untracked:  VennDiagram.2025-06-26_12-30-12.377415.log
    Untracked:  VennDiagram.2025-06-26_12-30-15.656753.log
    Untracked:  VennDiagram.2025-06-26_12-30-15.729036.log
    Untracked:  VennDiagram.2025-06-26_12-30-15.85983.log
    Untracked:  VennDiagram.2025-06-26_12-30-18.26259.log
    Untracked:  VennDiagram.2025-06-26_12-30-18.331564.log
    Untracked:  VennDiagram.2025-06-26_12-30-18.46018.log
    Untracked:  VennDiagram.2025-06-26_12-36-21.039378.log
    Untracked:  VennDiagram.2025-06-26_12-39-06.714741.log
    Untracked:  analysis/VennDiagram.2025-06-26_12-32-15.339949.log
    Untracked:  analysis/VennDiagram.2025-06-26_12-32-15.423798.log
    Untracked:  analysis/VennDiagram.2025-06-26_12-32-15.532911.log
    Untracked:  analysis/VennDiagram.2025-06-26_12-39-31.618684.log
    Untracked:  data/HYPHY_selection/
    Untracked:  data/RefSeq/
    Untracked:  data/WGCNA/input/Bulk_RNAseq/SE_HeadThorax_americana.rds
    Untracked:  data/WGCNA/input/Bulk_RNAseq/SE_HeadThorax_cancellata.rds
    Untracked:  data/WGCNA/input/Bulk_RNAseq/SE_HeadThorax_cubense.rds
    Untracked:  data/WGCNA/input/Bulk_RNAseq/SE_HeadThorax_gregaria.rds
    Untracked:  data/WGCNA/input/Bulk_RNAseq/SE_HeadThorax_nitens.rds
    Untracked:  data/WGCNA/input/Bulk_RNAseq/SE_HeadThorax_piceifrons.rds
    Untracked:  data/WGCNA/input/Bulk_RNAseq/SE_Head_americana.rds
    Untracked:  data/WGCNA/input/Bulk_RNAseq/SE_Head_cancellata.rds
    Untracked:  data/WGCNA/input/Bulk_RNAseq/SE_Head_cubense.rds
    Untracked:  data/WGCNA/input/Bulk_RNAseq/SE_Head_gregaria.rds
    Untracked:  data/WGCNA/input/Bulk_RNAseq/SE_Head_nitens.rds
    Untracked:  data/WGCNA/input/Bulk_RNAseq/SE_Head_piceifrons.rds
    Untracked:  data/WGCNA/input/Bulk_RNAseq/SE_Thorax_americana.rds
    Untracked:  data/WGCNA/input/Bulk_RNAseq/SE_Thorax_cancellata.rds
    Untracked:  data/WGCNA/input/Bulk_RNAseq/SE_Thorax_cubense.rds
    Untracked:  data/WGCNA/input/Bulk_RNAseq/SE_Thorax_gregaria.rds
    Untracked:  data/WGCNA/input/Bulk_RNAseq/SE_Thorax_nitens.rds
    Untracked:  data/WGCNA/input/Bulk_RNAseq/SE_Thorax_piceifrons.rds
    Untracked:  data/WGCNA/output/Bulk_RNAseq/gregaria/
    Untracked:  data/cafe5_results/
    Untracked:  data/list/GO_Annotations/DesertLocustR/
    Untracked:  data/list/GO_Annotations/DesertLocustR_0.1.0.tar.gz
    Untracked:  data/list/GO_Annotations/EggNog_Arthropoda_one2one.emapper.annotations
    Untracked:  data/list/GO_Annotations/GCF_021461385.2_iqSchPice1.1_Arthopoda_one2one.emapper.annotations
    Untracked:  data/list/GO_Annotations/GCF_021461395.2_iqSchAmer2.1_Arthopoda_one2one.emapper.annotations
    Untracked:  data/list/GO_Annotations/GCF_023864275.1_iqSchCanc2.1_Arthopoda_one2one.emapper.annotations
    Untracked:  data/list/GO_Annotations/GCF_023864345.2_iqSchSeri2.2_Arthopoda_one2one.emapper.annotations
    Untracked:  data/list/GO_Annotations/GCF_023897955.1_iqSchGreg1.2_Arthopoda_one2one.emapper.annotations
    Untracked:  data/list/GO_Annotations/GCF_023898315.1_iqSchNite1.1_Arthopoda_one2one.emapper.annotations
    Untracked:  data/list/GO_Annotations/install_depedencies.R
    Untracked:  data/orthofinder/Polyneoptera/Results_I2_iqtree/Orthogroups/Orthogroups_CladeAssignment_WithCopyStatus.tsv
    Untracked:  data/orthofinder/Polyneoptera/Results_I2_iqtree/Orthogroups/Orthogroups_CladeAssignment_WithCopyStatus_cleaned.csv
    Untracked:  data/orthofinder/Polyneoptera/Results_I2_iqtree/Orthogroups/Orthogroups_CladeAssignment_WithCopyStatus_cleaned.xlsx
    Untracked:  data/orthofinder/Polyneoptera/Results_I2_iqtree/Orthogroups/Orthogroups_SingleCopyOrthologues_selanalysis.txt
    Untracked:  data/orthofinder/Polyneoptera/Results_I2_iqtree/Orthogroups/Orthogroups_SingleCopyOrthologues_selanalysiswide.txt
    Untracked:  data/orthofinder/Polyneoptera/Results_I2_iqtree/Orthogroups/Orthogroups_SingleCopyOrthologues_strict.txt
    Untracked:  data/orthofinder/Schistocerca/Results_I2/Orthogroups/Orthogroups_CladeAssignment_WithCopyStatus.tsv
    Untracked:  data/orthofinder/Schistocerca/Results_I2/Orthogroups/Orthogroups_CladeAssignment_WithCopyStatus_cleaned.txt
    Untracked:  data/orthofinder/Schistocerca/Results_I2/Orthogroups/Orthogroups_CladeAssignment_WithCopyStatus_cleaned.xlsx
    Untracked:  data/orthofinder/Schistocerca/Results_I2/Orthogroups_Schistocerca_May2025.txt
    Untracked:  data/pathway_enrichment/GO30_enrichment_Head_americana_custom.csv
    Untracked:  data/pathway_enrichment/GO30_enrichment_Head_cancellata_custom.csv
    Untracked:  data/pathway_enrichment/GO30_enrichment_Head_cubense_custom.csv
    Untracked:  data/pathway_enrichment/GO30_enrichment_Head_gregaria_custom.csv
    Untracked:  data/pathway_enrichment/GO30_enrichment_Head_nitens_custom.csv
    Untracked:  data/pathway_enrichment/GO30_enrichment_Head_piceifrons_custom.csv
    Untracked:  data/pathway_enrichment/GO30_enrichment_Thorax_americana_custom.csv
    Untracked:  data/pathway_enrichment/GO30_enrichment_Thorax_cancellata_custom.csv
    Untracked:  data/pathway_enrichment/GO30_enrichment_Thorax_cubense_custom.csv
    Untracked:  data/pathway_enrichment/GO30_enrichment_Thorax_gregaria_custom.csv
    Untracked:  data/pathway_enrichment/GO30_enrichment_Thorax_nitens_custom.csv
    Untracked:  data/pathway_enrichment/GO30_enrichment_Thorax_piceifrons_custom.csv
    Untracked:  data/pathway_enrichment/OLD/
    Untracked:  data/pathway_enrichment/Overlapping_Genes_Head_Locusts.csv
    Untracked:  data/pathway_enrichment/Overlapping_Genes_Thorax_Locusts.csv
    Untracked:  data/pathway_enrichment/REVIGO_results/
    Untracked:  data/pathway_enrichment/americana/
    Untracked:  data/pathway_enrichment/cancellata/
    Untracked:  data/pathway_enrichment/cross_species_GO_terms_BP_ALL.csv
    Untracked:  data/pathway_enrichment/cross_species_GO_terms_CC_ALL.csv
    Untracked:  data/pathway_enrichment/cross_species_GO_terms_MF_ALL.csv
    Untracked:  data/pathway_enrichment/cross_species_GO_terms_matrix_BP.csv
    Untracked:  data/pathway_enrichment/cross_species_GO_terms_matrix_CC.csv
    Untracked:  data/pathway_enrichment/cross_species_GO_terms_matrix_MF.csv
    Untracked:  data/pathway_enrichment/cross_species_top30_GO_terms_BP_ALL.csv
    Untracked:  data/pathway_enrichment/cross_species_top30_GO_terms_CC_ALL.csv
    Untracked:  data/pathway_enrichment/cross_species_top30_GO_terms_MF_ALL.csv
    Untracked:  data/pathway_enrichment/cross_species_top30_GO_terms_matrix_BP.csv
    Untracked:  data/pathway_enrichment/cross_species_top30_GO_terms_matrix_CC.csv
    Untracked:  data/pathway_enrichment/cross_species_top30_GO_terms_matrix_MF.csv
    Untracked:  data/pathway_enrichment/cross_species_top30_heatmap_BP.pdf
    Untracked:  data/pathway_enrichment/cross_species_top30_heatmap_CC.pdf
    Untracked:  data/pathway_enrichment/cross_species_top30_heatmap_MF.pdf
    Untracked:  data/pathway_enrichment/cubense/
    Untracked:  data/pathway_enrichment/gregaria/
    Untracked:  data/pathway_enrichment/nitens/
    Untracked:  data/pathway_enrichment/piceifrons/
    Untracked:  permutedStats-actualModules.RData

Unstaged changes:
    Modified:   analysis/2_hic-snps-phylogeny.Rmd
    Modified:   analysis/2_orthologs-prediction.Rmd
    Modified:   analysis/2_signatures-selection.Rmd
    Modified:   analysis/3_compiling_tables.Rmd
    Modified:   analysis/3_deseq2-results.Rmd
    Modified:   analysis/3_go-enrichment.Rmd
    Modified:   analysis/3_overlap-venn.Rmd
    Modified:   analysis/3_wgcna-network.Rmd
    Modified:   data/DEG_results/Bulk_RNAseq/All_species/PCA_VST.png
    Modified:   data/DEG_results/Bulk_RNAseq/All_species/PCA_labelled_VST.png
    Modified:   data/DEG_results/Bulk_RNAseq/GO10_enrichment_Head_americana_custom.csv
    Modified:   data/DEG_results/Bulk_RNAseq/GO10_enrichment_Head_cancellata_custom.csv
    Modified:   data/DEG_results/Bulk_RNAseq/GO10_enrichment_Head_cubense_custom.csv
    Modified:   data/DEG_results/Bulk_RNAseq/GO10_enrichment_Head_gregaria_custom.csv
    Modified:   data/DEG_results/Bulk_RNAseq/GO10_enrichment_Head_nitens_custom.csv
    Modified:   data/DEG_results/Bulk_RNAseq/GO10_enrichment_Head_piceifrons_custom.csv
    Modified:   data/DEG_results/Bulk_RNAseq/GO10_enrichment_Thorax_americana_custom.csv
    Modified:   data/DEG_results/Bulk_RNAseq/GO10_enrichment_Thorax_cancellata_custom.csv
    Modified:   data/DEG_results/Bulk_RNAseq/GO10_enrichment_Thorax_cubense_custom.csv
    Modified:   data/DEG_results/Bulk_RNAseq/GO10_enrichment_Thorax_gregaria_custom.csv
    Modified:   data/DEG_results/Bulk_RNAseq/GO10_enrichment_Thorax_piceifrons_custom.csv
    Deleted:    data/DEG_results/Bulk_RNAseq/americana/DESeq2_sigresults_sva_HeadLeftJoinThorax_americana.csv
    Deleted:    data/DEG_results/Bulk_RNAseq/americana/DESeq2_sigresults_sva_HeadLeftJoinThorax_americana_togregaria.csv
    Deleted:    data/DEG_results/Bulk_RNAseq/americana/Head/DESeq2_results_Head_americana.csv
    Deleted:    data/DEG_results/Bulk_RNAseq/americana/Head/DESeq2_results_Head_americana_togregaria.csv
    Deleted:    data/DEG_results/Bulk_RNAseq/americana/Head/DESeq2_sigresults_Head_americana.csv
    Deleted:    data/DEG_results/Bulk_RNAseq/americana/Head/DESeq2_sigresults_Head_americana_togregaria.csv
    Modified:   data/DEG_results/Bulk_RNAseq/americana/Head/heatmap_VST_Head.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/americana/Head/heatmap_VST_Head_togregaria.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/americana/Head/heatmap_normTransform_Head.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/americana/Head/heatmap_normTransform_Head_togregaria.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/americana/Head/heatmap_rlog_Head.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/americana/Head/heatmap_rlog_Head_togregaria.pdf
    Deleted:    data/DEG_results/Bulk_RNAseq/americana/Thorax/DESeq2_results_Thorax_americana.csv
    Deleted:    data/DEG_results/Bulk_RNAseq/americana/Thorax/DESeq2_results_Thorax_americana_togregaria.csv
    Deleted:    data/DEG_results/Bulk_RNAseq/americana/Thorax/DESeq2_sigresults_Thorax_americana.csv
    Deleted:    data/DEG_results/Bulk_RNAseq/americana/Thorax/DESeq2_sigresults_Thorax_americana_togregaria.csv
    Modified:   data/DEG_results/Bulk_RNAseq/americana/Thorax/heatmap_VST_Thorax.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/americana/Thorax/heatmap_VST_Thorax_togregaria.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/americana/Thorax/heatmap_normTransform_Thorax.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/americana/Thorax/heatmap_normTransform_Thorax_togregaria.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/americana/Thorax/heatmap_rlog_Thorax.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/americana/Thorax/heatmap_rlog_Thorax_togregaria.pdf
    Deleted:    data/DEG_results/Bulk_RNAseq/cancellata/DESeq2_sigresults_sva_HeadLeftJoinThorax_cancellata.csv
    Deleted:    data/DEG_results/Bulk_RNAseq/cancellata/DESeq2_sigresults_sva_HeadLeftJoinThorax_cancellata_togregaria.csv
    Deleted:    data/DEG_results/Bulk_RNAseq/cancellata/Head/DESeq2_results_Head_cancellata.csv
    Deleted:    data/DEG_results/Bulk_RNAseq/cancellata/Head/DESeq2_results_Head_cancellata_togregaria.csv
    Deleted:    data/DEG_results/Bulk_RNAseq/cancellata/Head/DESeq2_sigresults_Head_cancellata.csv
    Deleted:    data/DEG_results/Bulk_RNAseq/cancellata/Head/DESeq2_sigresults_Head_cancellata_togregaria.csv
    Modified:   data/DEG_results/Bulk_RNAseq/cancellata/Head/heatmap_VST_Head.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/cancellata/Head/heatmap_VST_Head_togregaria.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/cancellata/Head/heatmap_normTransform_Head.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/cancellata/Head/heatmap_normTransform_Head_togregaria.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/cancellata/Head/heatmap_rlog_Head.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/cancellata/Head/heatmap_rlog_Head_togregaria.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/cancellata/Thorax/heatmap_VST_Thorax.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/cancellata/Thorax/heatmap_VST_Thorax_togregaria.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/cancellata/Thorax/heatmap_normTransform_Thorax.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/cancellata/Thorax/heatmap_normTransform_Thorax_togregaria.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/cancellata/Thorax/heatmap_rlog_Thorax.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/cancellata/Thorax/heatmap_rlog_Thorax_togregaria.pdf
    Deleted:    data/DEG_results/Bulk_RNAseq/cubense/DESeq2_sigresults_sva_HeadLeftJoinThorax_cubense.csv
    Deleted:    data/DEG_results/Bulk_RNAseq/cubense/DESeq2_sigresults_sva_HeadLeftJoinThorax_cubense_togregaria.csv
    Deleted:    data/DEG_results/Bulk_RNAseq/cubense/Head/DESeq2_results_Head_cubense.csv
    Deleted:    data/DEG_results/Bulk_RNAseq/cubense/Head/DESeq2_results_Head_cubense_togregaria.csv
    Deleted:    data/DEG_results/Bulk_RNAseq/cubense/Head/DESeq2_sigresults_Head_cubense.csv
    Deleted:    data/DEG_results/Bulk_RNAseq/cubense/Head/DESeq2_sigresults_Head_cubense_togregaria.csv
    Modified:   data/DEG_results/Bulk_RNAseq/cubense/Head/heatmap_VST_Head.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/cubense/Head/heatmap_VST_Head_togregaria.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/cubense/Head/heatmap_normTransform_Head.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/cubense/Head/heatmap_normTransform_Head_togregaria.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/cubense/Head/heatmap_rlog_Head.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/cubense/Head/heatmap_rlog_Head_togregaria.pdf
    Deleted:    data/DEG_results/Bulk_RNAseq/cubense/Thorax/DESeq2_results_Thorax_cubense.csv
    Deleted:    data/DEG_results/Bulk_RNAseq/cubense/Thorax/DESeq2_results_Thorax_cubense_togregaria.csv
    Deleted:    data/DEG_results/Bulk_RNAseq/cubense/Thorax/DESeq2_sigresults_Thorax_cubense.csv
    Deleted:    data/DEG_results/Bulk_RNAseq/cubense/Thorax/DESeq2_sigresults_Thorax_cubense_togregaria.csv
    Modified:   data/DEG_results/Bulk_RNAseq/cubense/Thorax/heatmap_VST_Thorax.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/cubense/Thorax/heatmap_VST_Thorax_togregaria.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/cubense/Thorax/heatmap_normTransform_Thorax.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/cubense/Thorax/heatmap_normTransform_Thorax_togregaria.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/cubense/Thorax/heatmap_rlog_Thorax.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/cubense/Thorax/heatmap_rlog_Thorax_togregaria.pdf
    Deleted:    data/DEG_results/Bulk_RNAseq/davidO/Head/DESeq2_results_Head_davidB_togregaria.csv
    Deleted:    data/DEG_results/Bulk_RNAseq/davidO/Head/DESeq2_results_Head_davidO_togregaria.csv
    Deleted:    data/DEG_results/Bulk_RNAseq/davidO/Head/singlecell_brain_phase_DGE_markers.csv
    Deleted:    data/DEG_results/Bulk_RNAseq/davidO/Head/singlecell_opticlobe_phase_DGE_markers.csv
    Deleted:    data/DEG_results/Bulk_RNAseq/gregaria/DESeq2_sigresults_sva_HeadLeftJoinThorax_gregaria_togregaria.csv
    Deleted:    data/DEG_results/Bulk_RNAseq/gregaria/Head/DESeq2_results_Head_gregaria_togregaria.csv
    Deleted:    data/DEG_results/Bulk_RNAseq/gregaria/Head/DESeq2_sigresults_Head_gregaria_togregaria.csv
    Modified:   data/DEG_results/Bulk_RNAseq/gregaria/Head/heatmap_VST_Head.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/gregaria/Head/heatmap_VST_Head_togregaria.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/gregaria/Head/heatmap_normTransform_Head.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/gregaria/Head/heatmap_normTransform_Head_togregaria.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/gregaria/Head/heatmap_rlog_Head.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/gregaria/Head/heatmap_rlog_Head_togregaria.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/gregaria/Thorax/heatmap_VST_Thorax.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/gregaria/Thorax/heatmap_VST_Thorax_togregaria.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/gregaria/Thorax/heatmap_normTransform_Thorax.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/gregaria/Thorax/heatmap_normTransform_Thorax_togregaria.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/gregaria/Thorax/heatmap_rlog_Thorax.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/gregaria/Thorax/heatmap_rlog_Thorax_togregaria.pdf
    Deleted:    data/DEG_results/Bulk_RNAseq/nitens/DESeq2_sigresults_sva_HeadLeftJoinThorax_nitens.csv
    Deleted:    data/DEG_results/Bulk_RNAseq/nitens/DESeq2_sigresults_sva_HeadLeftJoinThorax_nitens_togregaria.csv
    Deleted:    data/DEG_results/Bulk_RNAseq/nitens/Head/DESeq2_results_Head_nitens.csv
    Deleted:    data/DEG_results/Bulk_RNAseq/nitens/Head/DESeq2_results_Head_nitens_togregaria.csv
    Deleted:    data/DEG_results/Bulk_RNAseq/nitens/Head/DESeq2_sigresults_Head_nitens.csv
    Deleted:    data/DEG_results/Bulk_RNAseq/nitens/Head/DESeq2_sigresults_Head_nitens_togregaria.csv
    Modified:   data/DEG_results/Bulk_RNAseq/nitens/Head/heatmap_VST_Head.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/nitens/Head/heatmap_VST_Head_togregaria.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/nitens/Head/heatmap_normTransform_Head.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/nitens/Head/heatmap_normTransform_Head_togregaria.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/nitens/Head/heatmap_rlog_Head.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/nitens/Head/heatmap_rlog_Head_togregaria.pdf
    Deleted:    data/DEG_results/Bulk_RNAseq/nitens/Thorax/DESeq2_results_Thorax_nitens.csv
    Deleted:    data/DEG_results/Bulk_RNAseq/nitens/Thorax/DESeq2_results_Thorax_nitens_togregaria.csv
    Deleted:    data/DEG_results/Bulk_RNAseq/nitens/Thorax/DESeq2_sigresults_Thorax_nitens.csv
    Deleted:    data/DEG_results/Bulk_RNAseq/nitens/Thorax/DESeq2_sigresults_Thorax_nitens_togregaria.csv
    Modified:   data/DEG_results/Bulk_RNAseq/nitens/Thorax/heatmap_VST_Thorax.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/nitens/Thorax/heatmap_VST_Thorax_togregaria.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/nitens/Thorax/heatmap_normTransform_Thorax.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/nitens/Thorax/heatmap_normTransform_Thorax_togregaria.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/nitens/Thorax/heatmap_rlog_Thorax.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/nitens/Thorax/heatmap_rlog_Thorax_togregaria.pdf
    Deleted:    data/DEG_results/Bulk_RNAseq/piceifrons/DESeq2_sigresults_sva_HeadLeftJoinThorax_piceifrons.csv
    Deleted:    data/DEG_results/Bulk_RNAseq/piceifrons/DESeq2_sigresults_sva_HeadLeftJoinThorax_piceifrons_togregaria.csv
    Deleted:    data/DEG_results/Bulk_RNAseq/piceifrons/Head/DESeq2_results_Head_piceifrons.csv
    Deleted:    data/DEG_results/Bulk_RNAseq/piceifrons/Head/DESeq2_results_Head_piceifrons_togregaria.csv
    Deleted:    data/DEG_results/Bulk_RNAseq/piceifrons/Head/DESeq2_sigresults_Head_piceifrons.csv
    Deleted:    data/DEG_results/Bulk_RNAseq/piceifrons/Head/DESeq2_sigresults_Head_piceifrons_togregaria.csv
    Modified:   data/DEG_results/Bulk_RNAseq/piceifrons/Head/heatmap_VST_Head.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/piceifrons/Head/heatmap_VST_Head_togregaria.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/piceifrons/Head/heatmap_normTransform_Head.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/piceifrons/Head/heatmap_normTransform_Head_togregaria.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/piceifrons/Head/heatmap_rlog_Head.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/piceifrons/Head/heatmap_rlog_Head_togregaria.pdf
    Deleted:    data/DEG_results/Bulk_RNAseq/piceifrons/Thorax/DESeq2_results_Thorax_piceifrons.csv
    Deleted:    data/DEG_results/Bulk_RNAseq/piceifrons/Thorax/DESeq2_results_Thorax_piceifrons_togregaria.csv
    Deleted:    data/DEG_results/Bulk_RNAseq/piceifrons/Thorax/DESeq2_sigresults_Thorax_piceifrons.csv
    Deleted:    data/DEG_results/Bulk_RNAseq/piceifrons/Thorax/DESeq2_sigresults_Thorax_piceifrons_togregaria.csv
    Modified:   data/DEG_results/Bulk_RNAseq/piceifrons/Thorax/heatmap_VST_Thorax.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/piceifrons/Thorax/heatmap_VST_Thorax_togregaria.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/piceifrons/Thorax/heatmap_normTransform_Thorax.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/piceifrons/Thorax/heatmap_normTransform_Thorax_togregaria.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/piceifrons/Thorax/heatmap_rlog_Thorax.pdf
    Modified:   data/DEG_results/Bulk_RNAseq/piceifrons/Thorax/heatmap_rlog_Thorax_togregaria.pdf
    Deleted:    data/DEG_results/RNAi/All/HEX1_vs_GFP/DEG_sigresults_HEX1_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/All/HEX2_vs_GFP/DEG_sigresults_HEX2_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/All/Hex1_vs_GFP/heatmap_plot_Hex1_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/All/Hex1_vs_GFP/volcano_plot_Hex1_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/All/Hex2_vs_GFP/heatmap_plot_Hex2_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/All/Hex2_vs_GFP/volcano_plot_Hex2_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/All/JHMT_vs_GFP/DEG_sigresults_JHMT_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/All/JHMT_vs_GFP/heatmap_plot_JHMT_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/All/JHMT_vs_GFP/volcano_plot_JHMT_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/All/MIOX_vs_GFP/DEG_sigresults_MIOX_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/All/MIOX_vs_GFP/heatmap_plot_MIOX_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/All/MIOX_vs_GFP/volcano_plot_MIOX_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/All/PCA_Tissue_Gene_Label.png
    Deleted:    data/DEG_results/RNAi/All/PCA_Tissue_Gene_NoLabel.png
    Deleted:    data/DEG_results/RNAi/All/UNCH_vs_GFP/DEG_sigresults_UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/All/UNCH_vs_GFP/heatmap_plot_UNCH_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/All/UNCH_vs_GFP/volcano_plot_UNCH_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/All/sva_scatter_SV1_SV2.png
    Deleted:    data/DEG_results/RNAi/All/sva_scatter_SV1_SV3.png
    Deleted:    data/DEG_results/RNAi/All/sva_scatter_SV2_SV3.png
    Deleted:    data/DEG_results/RNAi/All/sva_stripchart_SV1.png
    Deleted:    data/DEG_results/RNAi/All/sva_stripchart_SV2.png
    Deleted:    data/DEG_results/RNAi/All/sva_stripchart_SV3.png
    Deleted:    data/DEG_results/RNAi/All_GFP/GFP_vs_CONTROL/DEG_sigresults_GFP_vs_CONTROL.csv
    Deleted:    data/DEG_results/RNAi/All_GFP/PCA_Tissue_Gene_Label.png
    Deleted:    data/DEG_results/RNAi/All_GFP/PCA_Tissue_Gene_NoLabel.png
    Deleted:    data/DEG_results/RNAi/All_GFP/sva_scatter_SV1_SV2.png
    Deleted:    data/DEG_results/RNAi/All_GFP/sva_stripchart_SV1.png
    Deleted:    data/DEG_results/RNAi/All_GFP/sva_stripchart_SV2.png
    Deleted:    data/DEG_results/RNAi/All_GFP_no_rRNA/GFP_vs_CONTROL/DEG_sigresults_GFP_vs_CONTROL.csv
    Deleted:    data/DEG_results/RNAi/All_GFP_no_rRNA/PCA_Tissue_Gene_Label.png
    Deleted:    data/DEG_results/RNAi/All_GFP_no_rRNA/PCA_Tissue_Gene_NoLabel.png
    Deleted:    data/DEG_results/RNAi/All_GFP_no_rRNA/sva_scatter_SV1_SV2.png
    Deleted:    data/DEG_results/RNAi/All_GFP_no_rRNA/sva_stripchart_SV1.png
    Deleted:    data/DEG_results/RNAi/All_GFP_no_rRNA/sva_stripchart_SV2.png
    Deleted:    data/DEG_results/RNAi/All_control/GFP_vs_CONTROL/DEG_sigresults_GFP_vs_CONTROL.csv
    Deleted:    data/DEG_results/RNAi/All_control/HEX1_vs_CONTROL/DEG_sigresults_HEX1_vs_CONTROL.csv
    Deleted:    data/DEG_results/RNAi/All_control/HEX2_vs_CONTROL/DEG_sigresults_HEX2_vs_CONTROL.csv
    Deleted:    data/DEG_results/RNAi/All_control/JHMT_vs_CONTROL/DEG_sigresults_JHMT_vs_CONTROL.csv
    Deleted:    data/DEG_results/RNAi/All_control/MIOX_vs_CONTROL/DEG_sigresults_MIOX_vs_CONTROL.csv
    Deleted:    data/DEG_results/RNAi/All_control/PCA_Tissue_Gene_Label.png
    Deleted:    data/DEG_results/RNAi/All_control/PCA_Tissue_Gene_NoLabel.png
    Deleted:    data/DEG_results/RNAi/All_control/UNCH_vs_CONTROL/DEG_sigresults_UNCH_vs_CONTROL.csv
    Deleted:    data/DEG_results/RNAi/All_control/sva_scatter_SV1_SV2.png
    Deleted:    data/DEG_results/RNAi/All_control/sva_scatter_SV1_SV3.png
    Deleted:    data/DEG_results/RNAi/All_control/sva_scatter_SV1_SV4.png
    Deleted:    data/DEG_results/RNAi/All_control/sva_scatter_SV2_SV3.png
    Deleted:    data/DEG_results/RNAi/All_control/sva_scatter_SV2_SV4.png
    Deleted:    data/DEG_results/RNAi/All_control/sva_scatter_SV3_SV4.png
    Deleted:    data/DEG_results/RNAi/All_control/sva_stripchart_SV1.png
    Deleted:    data/DEG_results/RNAi/All_control/sva_stripchart_SV2.png
    Deleted:    data/DEG_results/RNAi/All_control/sva_stripchart_SV3.png
    Deleted:    data/DEG_results/RNAi/All_control/sva_stripchart_SV4.png
    Deleted:    data/DEG_results/RNAi/All_control_no_rRNA/GFP_vs_CONTROL/DEG_sigresults_GFP_vs_CONTROL.csv
    Deleted:    data/DEG_results/RNAi/All_control_no_rRNA/HEX1_vs_CONTROL/DEG_sigresults_HEX1_vs_CONTROL.csv
    Deleted:    data/DEG_results/RNAi/All_control_no_rRNA/HEX2_vs_CONTROL/DEG_sigresults_HEX2_vs_CONTROL.csv
    Deleted:    data/DEG_results/RNAi/All_control_no_rRNA/JHMT_vs_CONTROL/DEG_sigresults_JHMT_vs_CONTROL.csv
    Deleted:    data/DEG_results/RNAi/All_control_no_rRNA/MIOX_vs_CONTROL/DEG_sigresults_MIOX_vs_CONTROL.csv
    Deleted:    data/DEG_results/RNAi/All_control_no_rRNA/PCA_Tissue_Gene_Label.png
    Deleted:    data/DEG_results/RNAi/All_control_no_rRNA/PCA_Tissue_Gene_NoLabel.png
    Deleted:    data/DEG_results/RNAi/All_control_no_rRNA/UNCH_vs_CONTROL/DEG_sigresults_UNCH_vs_CONTROL.csv
    Deleted:    data/DEG_results/RNAi/All_control_no_rRNA/sva_scatter_SV1_SV2.png
    Deleted:    data/DEG_results/RNAi/All_control_no_rRNA/sva_scatter_SV1_SV3.png
    Deleted:    data/DEG_results/RNAi/All_control_no_rRNA/sva_scatter_SV2_SV3.png
    Deleted:    data/DEG_results/RNAi/All_control_no_rRNA/sva_stripchart_SV1.png
    Deleted:    data/DEG_results/RNAi/All_control_no_rRNA/sva_stripchart_SV2.png
    Deleted:    data/DEG_results/RNAi/All_control_no_rRNA/sva_stripchart_SV3.png
    Deleted:    data/DEG_results/RNAi/All_no_rRNA/HEX1_vs_GFP/DEG_sigresults_HEX1_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/All_no_rRNA/HEX2_vs_GFP/DEG_sigresults_HEX2_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/All_no_rRNA/Hex1_vs_GFP/heatmap_plot_Hex1_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/All_no_rRNA/Hex1_vs_GFP/volcano_plot_Hex1_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/All_no_rRNA/Hex2_vs_GFP/heatmap_plot_Hex2_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/All_no_rRNA/Hex2_vs_GFP/volcano_plot_Hex2_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/All_no_rRNA/JHMT_vs_GFP/DEG_sigresults_JHMT_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/All_no_rRNA/JHMT_vs_GFP/heatmap_plot_JHMT_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/All_no_rRNA/JHMT_vs_GFP/volcano_plot_JHMT_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/All_no_rRNA/MIOX_vs_GFP/DEG_sigresults_MIOX_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/All_no_rRNA/MIOX_vs_GFP/heatmap_plot_MIOX_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/All_no_rRNA/MIOX_vs_GFP/volcano_plot_MIOX_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/All_no_rRNA/PCA_Tissue_Gene_Label.png
    Deleted:    data/DEG_results/RNAi/All_no_rRNA/PCA_Tissue_Gene_NoLabel.png
    Deleted:    data/DEG_results/RNAi/All_no_rRNA/UNCH_vs_GFP/DEG_sigresults_UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/All_no_rRNA/UNCH_vs_GFP/heatmap_plot_UNCH_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/All_no_rRNA/UNCH_vs_GFP/volcano_plot_UNCH_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/All_no_rRNA/sva_scatter_SV1_SV2.png
    Deleted:    data/DEG_results/RNAi/All_no_rRNA/sva_stripchart_SV1.png
    Deleted:    data/DEG_results/RNAi/All_no_rRNA/sva_stripchart_SV2.png
    Deleted:    data/DEG_results/RNAi/Head/HEX1_vs_GFP/DEG_sigresults_HEX1_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head/HEX2_vs_GFP/DEG_sigresults_HEX2_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head/Hex1_vs_GFP/heatmap_plot_Hex1_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/Head/Hex1_vs_GFP/volcano_plot_Hex1_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/Head/Hex2_vs_GFP/heatmap_plot_Hex2_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/Head/Hex2_vs_GFP/volcano_plot_Hex2_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/Head/JHMT_vs_GFP/DEG_sigresults_JHMT_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head/JHMT_vs_GFP/heatmap_plot_JHMT_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/Head/JHMT_vs_GFP/volcano_plot_JHMT_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/Head/MIOX_vs_GFP/DEG_sigresults_MIOX_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head/MIOX_vs_GFP/heatmap_plot_MIOX_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/Head/MIOX_vs_GFP/volcano_plot_MIOX_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/Head/PCA_vst_Gene_hull.png
    Deleted:    data/DEG_results/RNAi/Head/PCA_vst_Gene_labelled.png
    Deleted:    data/DEG_results/RNAi/Head/UNCH_vs_GFP/DEG_sigresults_UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head/UNCH_vs_GFP/heatmap_plot_UNCH_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/Head/UNCH_vs_GFP/volcano_plot_UNCH_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/Head/UpSetR_all_intersections/HEX1_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head/UpSetR_all_intersections/HEX1_vs_GFP_&_HEX2_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head/UpSetR_all_intersections/HEX1_vs_GFP_&_HEX2_vs_GFP_&_JHMT_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head/UpSetR_all_intersections/HEX1_vs_GFP_&_HEX2_vs_GFP_&_JHMT_vs_GFP_&_MIOX_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head/UpSetR_all_intersections/HEX1_vs_GFP_&_HEX2_vs_GFP_&_JHMT_vs_GFP_&_MIOX_vs_GFP_&_UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head/UpSetR_all_intersections/HEX1_vs_GFP_&_HEX2_vs_GFP_&_JHMT_vs_GFP_&_UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head/UpSetR_all_intersections/HEX1_vs_GFP_&_HEX2_vs_GFP_&_MIOX_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head/UpSetR_all_intersections/HEX1_vs_GFP_&_HEX2_vs_GFP_&_MIOX_vs_GFP_&_UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head/UpSetR_all_intersections/HEX1_vs_GFP_&_HEX2_vs_GFP_&_UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head/UpSetR_all_intersections/HEX1_vs_GFP_&_JHMT_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head/UpSetR_all_intersections/HEX1_vs_GFP_&_JHMT_vs_GFP_&_MIOX_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head/UpSetR_all_intersections/HEX1_vs_GFP_&_JHMT_vs_GFP_&_MIOX_vs_GFP_&_UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head/UpSetR_all_intersections/HEX1_vs_GFP_&_JHMT_vs_GFP_&_UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head/UpSetR_all_intersections/HEX1_vs_GFP_&_MIOX_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head/UpSetR_all_intersections/HEX1_vs_GFP_&_MIOX_vs_GFP_&_UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head/UpSetR_all_intersections/HEX1_vs_GFP_&_UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head/UpSetR_all_intersections/HEX2_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head/UpSetR_all_intersections/HEX2_vs_GFP_&_JHMT_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head/UpSetR_all_intersections/HEX2_vs_GFP_&_JHMT_vs_GFP_&_MIOX_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head/UpSetR_all_intersections/HEX2_vs_GFP_&_JHMT_vs_GFP_&_MIOX_vs_GFP_&_UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head/UpSetR_all_intersections/HEX2_vs_GFP_&_JHMT_vs_GFP_&_UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head/UpSetR_all_intersections/HEX2_vs_GFP_&_MIOX_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head/UpSetR_all_intersections/HEX2_vs_GFP_&_MIOX_vs_GFP_&_UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head/UpSetR_all_intersections/HEX2_vs_GFP_&_UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head/UpSetR_all_intersections/JHMT_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head/UpSetR_all_intersections/JHMT_vs_GFP_&_MIOX_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head/UpSetR_all_intersections/JHMT_vs_GFP_&_MIOX_vs_GFP_&_UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head/UpSetR_all_intersections/JHMT_vs_GFP_&_UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head/UpSetR_all_intersections/MIOX_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head/UpSetR_all_intersections/MIOX_vs_GFP_&_UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head/UpSetR_all_intersections/UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head/sva_scatter_SV1_SV2.png
    Deleted:    data/DEG_results/RNAi/Head/sva_scatter_SV1_SV3.png
    Deleted:    data/DEG_results/RNAi/Head/sva_scatter_SV2_SV3.png
    Deleted:    data/DEG_results/RNAi/Head/sva_stripchart_SV1.png
    Deleted:    data/DEG_results/RNAi/Head/sva_stripchart_SV2.png
    Deleted:    data/DEG_results/RNAi/Head/sva_stripchart_SV3.png
    Deleted:    data/DEG_results/RNAi/Head_GFP/GFP_vs_CONTROL/DEG_sigresults_GFP_vs_CONTROL.csv
    Deleted:    data/DEG_results/RNAi/Head_GFP/PCA_Tissue_Gene_Label.png
    Deleted:    data/DEG_results/RNAi/Head_GFP/PCA_Tissue_Gene_NoLabel.png
    Deleted:    data/DEG_results/RNAi/Head_GFP/sva_scatter_SV1_SV2.png
    Deleted:    data/DEG_results/RNAi/Head_GFP/sva_scatter_SV1_SV3.png
    Deleted:    data/DEG_results/RNAi/Head_GFP/sva_scatter_SV1_SV4.png
    Deleted:    data/DEG_results/RNAi/Head_GFP/sva_scatter_SV2_SV3.png
    Deleted:    data/DEG_results/RNAi/Head_GFP/sva_scatter_SV2_SV4.png
    Deleted:    data/DEG_results/RNAi/Head_GFP/sva_scatter_SV3_SV4.png
    Deleted:    data/DEG_results/RNAi/Head_GFP/sva_stripchart_SV1.png
    Deleted:    data/DEG_results/RNAi/Head_GFP/sva_stripchart_SV2.png
    Deleted:    data/DEG_results/RNAi/Head_GFP/sva_stripchart_SV3.png
    Deleted:    data/DEG_results/RNAi/Head_GFP/sva_stripchart_SV4.png
    Deleted:    data/DEG_results/RNAi/Head_GFP_no_rRNA/GFP_vs_CONTROL/DEG_sigresults_GFP_vs_CONTROL.csv
    Deleted:    data/DEG_results/RNAi/Head_GFP_no_rRNA/PCA_Tissue_Gene_Label.png
    Deleted:    data/DEG_results/RNAi/Head_GFP_no_rRNA/PCA_Tissue_Gene_NoLabel.png
    Deleted:    data/DEG_results/RNAi/Head_GFP_no_rRNA/sva_scatter_SV1_SV2.png
    Deleted:    data/DEG_results/RNAi/Head_GFP_no_rRNA/sva_scatter_SV1_SV3.png
    Deleted:    data/DEG_results/RNAi/Head_GFP_no_rRNA/sva_scatter_SV1_SV4.png
    Deleted:    data/DEG_results/RNAi/Head_GFP_no_rRNA/sva_scatter_SV2_SV3.png
    Deleted:    data/DEG_results/RNAi/Head_GFP_no_rRNA/sva_scatter_SV2_SV4.png
    Deleted:    data/DEG_results/RNAi/Head_GFP_no_rRNA/sva_scatter_SV3_SV4.png
    Deleted:    data/DEG_results/RNAi/Head_GFP_no_rRNA/sva_stripchart_SV1.png
    Deleted:    data/DEG_results/RNAi/Head_GFP_no_rRNA/sva_stripchart_SV2.png
    Deleted:    data/DEG_results/RNAi/Head_GFP_no_rRNA/sva_stripchart_SV3.png
    Deleted:    data/DEG_results/RNAi/Head_GFP_no_rRNA/sva_stripchart_SV4.png
    Deleted:    data/DEG_results/RNAi/Head_control/GFP_vs_CONTROL/DEG_sigresults_GFP_vs_CONTROL.csv
    Deleted:    data/DEG_results/RNAi/Head_control/HEX1_vs_CONTROL/DEG_sigresults_HEX1_vs_CONTROL.csv
    Deleted:    data/DEG_results/RNAi/Head_control/HEX2_vs_CONTROL/DEG_sigresults_HEX2_vs_CONTROL.csv
    Deleted:    data/DEG_results/RNAi/Head_control/JHMT_vs_CONTROL/DEG_sigresults_JHMT_vs_CONTROL.csv
    Deleted:    data/DEG_results/RNAi/Head_control/MIOX_vs_CONTROL/DEG_sigresults_MIOX_vs_CONTROL.csv
    Deleted:    data/DEG_results/RNAi/Head_control/PCA_Tissue_Gene_Label.png
    Deleted:    data/DEG_results/RNAi/Head_control/PCA_Tissue_Gene_NoLabel.png
    Deleted:    data/DEG_results/RNAi/Head_control/UNCH_vs_CONTROL/DEG_sigresults_UNCH_vs_CONTROL.csv
    Deleted:    data/DEG_results/RNAi/Head_control/sva_scatter_SV1_SV2.png
    Deleted:    data/DEG_results/RNAi/Head_control/sva_scatter_SV1_SV3.png
    Deleted:    data/DEG_results/RNAi/Head_control/sva_scatter_SV1_SV4.png
    Deleted:    data/DEG_results/RNAi/Head_control/sva_scatter_SV1_SV5.png
    Deleted:    data/DEG_results/RNAi/Head_control/sva_scatter_SV1_SV6.png
    Deleted:    data/DEG_results/RNAi/Head_control/sva_scatter_SV1_SV7.png
    Deleted:    data/DEG_results/RNAi/Head_control/sva_scatter_SV2_SV3.png
    Deleted:    data/DEG_results/RNAi/Head_control/sva_scatter_SV2_SV4.png
    Deleted:    data/DEG_results/RNAi/Head_control/sva_scatter_SV2_SV5.png
    Deleted:    data/DEG_results/RNAi/Head_control/sva_scatter_SV2_SV6.png
    Deleted:    data/DEG_results/RNAi/Head_control/sva_scatter_SV2_SV7.png
    Deleted:    data/DEG_results/RNAi/Head_control/sva_scatter_SV3_SV4.png
    Deleted:    data/DEG_results/RNAi/Head_control/sva_scatter_SV3_SV5.png
    Deleted:    data/DEG_results/RNAi/Head_control/sva_scatter_SV3_SV6.png
    Deleted:    data/DEG_results/RNAi/Head_control/sva_scatter_SV3_SV7.png
    Deleted:    data/DEG_results/RNAi/Head_control/sva_scatter_SV4_SV5.png
    Deleted:    data/DEG_results/RNAi/Head_control/sva_scatter_SV4_SV6.png
    Deleted:    data/DEG_results/RNAi/Head_control/sva_scatter_SV4_SV7.png
    Deleted:    data/DEG_results/RNAi/Head_control/sva_scatter_SV5_SV6.png
    Deleted:    data/DEG_results/RNAi/Head_control/sva_scatter_SV5_SV7.png
    Deleted:    data/DEG_results/RNAi/Head_control/sva_scatter_SV6_SV7.png
    Deleted:    data/DEG_results/RNAi/Head_control/sva_stripchart_SV1.png
    Deleted:    data/DEG_results/RNAi/Head_control/sva_stripchart_SV2.png
    Deleted:    data/DEG_results/RNAi/Head_control/sva_stripchart_SV3.png
    Deleted:    data/DEG_results/RNAi/Head_control/sva_stripchart_SV4.png
    Deleted:    data/DEG_results/RNAi/Head_control/sva_stripchart_SV5.png
    Deleted:    data/DEG_results/RNAi/Head_control/sva_stripchart_SV6.png
    Deleted:    data/DEG_results/RNAi/Head_control/sva_stripchart_SV7.png
    Deleted:    data/DEG_results/RNAi/Head_control_no_rRNA/GFP_vs_CONTROL/DEG_sigresults_GFP_vs_CONTROL.csv
    Deleted:    data/DEG_results/RNAi/Head_control_no_rRNA/HEX1_vs_CONTROL/DEG_sigresults_HEX1_vs_CONTROL.csv
    Deleted:    data/DEG_results/RNAi/Head_control_no_rRNA/HEX2_vs_CONTROL/DEG_sigresults_HEX2_vs_CONTROL.csv
    Deleted:    data/DEG_results/RNAi/Head_control_no_rRNA/JHMT_vs_CONTROL/DEG_sigresults_JHMT_vs_CONTROL.csv
    Deleted:    data/DEG_results/RNAi/Head_control_no_rRNA/MIOX_vs_CONTROL/DEG_sigresults_MIOX_vs_CONTROL.csv
    Deleted:    data/DEG_results/RNAi/Head_control_no_rRNA/PCA_Tissue_Gene_Label.png
    Deleted:    data/DEG_results/RNAi/Head_control_no_rRNA/PCA_Tissue_Gene_NoLabel.png
    Deleted:    data/DEG_results/RNAi/Head_control_no_rRNA/UNCH_vs_CONTROL/DEG_sigresults_UNCH_vs_CONTROL.csv
    Deleted:    data/DEG_results/RNAi/Head_control_no_rRNA/sva_scatter_SV1_SV2.png
    Deleted:    data/DEG_results/RNAi/Head_control_no_rRNA/sva_scatter_SV1_SV3.png
    Deleted:    data/DEG_results/RNAi/Head_control_no_rRNA/sva_scatter_SV1_SV4.png
    Deleted:    data/DEG_results/RNAi/Head_control_no_rRNA/sva_scatter_SV1_SV5.png
    Deleted:    data/DEG_results/RNAi/Head_control_no_rRNA/sva_scatter_SV1_SV6.png
    Deleted:    data/DEG_results/RNAi/Head_control_no_rRNA/sva_scatter_SV1_SV7.png
    Deleted:    data/DEG_results/RNAi/Head_control_no_rRNA/sva_scatter_SV2_SV3.png
    Deleted:    data/DEG_results/RNAi/Head_control_no_rRNA/sva_scatter_SV2_SV4.png
    Deleted:    data/DEG_results/RNAi/Head_control_no_rRNA/sva_scatter_SV2_SV5.png
    Deleted:    data/DEG_results/RNAi/Head_control_no_rRNA/sva_scatter_SV2_SV6.png
    Deleted:    data/DEG_results/RNAi/Head_control_no_rRNA/sva_scatter_SV2_SV7.png
    Deleted:    data/DEG_results/RNAi/Head_control_no_rRNA/sva_scatter_SV3_SV4.png
    Deleted:    data/DEG_results/RNAi/Head_control_no_rRNA/sva_scatter_SV3_SV5.png
    Deleted:    data/DEG_results/RNAi/Head_control_no_rRNA/sva_scatter_SV3_SV6.png
    Deleted:    data/DEG_results/RNAi/Head_control_no_rRNA/sva_scatter_SV3_SV7.png
    Deleted:    data/DEG_results/RNAi/Head_control_no_rRNA/sva_scatter_SV4_SV5.png
    Deleted:    data/DEG_results/RNAi/Head_control_no_rRNA/sva_scatter_SV4_SV6.png
    Deleted:    data/DEG_results/RNAi/Head_control_no_rRNA/sva_scatter_SV4_SV7.png
    Deleted:    data/DEG_results/RNAi/Head_control_no_rRNA/sva_scatter_SV5_SV6.png
    Deleted:    data/DEG_results/RNAi/Head_control_no_rRNA/sva_scatter_SV5_SV7.png
    Deleted:    data/DEG_results/RNAi/Head_control_no_rRNA/sva_scatter_SV6_SV7.png
    Deleted:    data/DEG_results/RNAi/Head_control_no_rRNA/sva_stripchart_SV1.png
    Deleted:    data/DEG_results/RNAi/Head_control_no_rRNA/sva_stripchart_SV2.png
    Deleted:    data/DEG_results/RNAi/Head_control_no_rRNA/sva_stripchart_SV3.png
    Deleted:    data/DEG_results/RNAi/Head_control_no_rRNA/sva_stripchart_SV4.png
    Deleted:    data/DEG_results/RNAi/Head_control_no_rRNA/sva_stripchart_SV5.png
    Deleted:    data/DEG_results/RNAi/Head_control_no_rRNA/sva_stripchart_SV6.png
    Deleted:    data/DEG_results/RNAi/Head_control_no_rRNA/sva_stripchart_SV7.png
    Deleted:    data/DEG_results/RNAi/Head_no_rRNA/HEX1_vs_GFP/DEG_sigresults_HEX1_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head_no_rRNA/HEX2_vs_GFP/DEG_sigresults_HEX2_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head_no_rRNA/Hex1_vs_GFP/heatmap_plot_Hex1_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/Head_no_rRNA/Hex1_vs_GFP/volcano_plot_Hex1_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/Head_no_rRNA/Hex2_vs_GFP/heatmap_plot_Hex2_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/Head_no_rRNA/Hex2_vs_GFP/volcano_plot_Hex2_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/Head_no_rRNA/JHMT_vs_GFP/DEG_sigresults_JHMT_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head_no_rRNA/JHMT_vs_GFP/heatmap_plot_JHMT_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/Head_no_rRNA/JHMT_vs_GFP/volcano_plot_JHMT_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/Head_no_rRNA/MIOX_vs_GFP/DEG_sigresults_MIOX_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head_no_rRNA/MIOX_vs_GFP/heatmap_plot_MIOX_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/Head_no_rRNA/MIOX_vs_GFP/volcano_plot_MIOX_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/Head_no_rRNA/PCA_vst_Gene_hull.png
    Deleted:    data/DEG_results/RNAi/Head_no_rRNA/PCA_vst_Gene_labelled.png
    Deleted:    data/DEG_results/RNAi/Head_no_rRNA/UNCH_vs_GFP/DEG_sigresults_UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head_no_rRNA/UNCH_vs_GFP/heatmap_plot_UNCH_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/Head_no_rRNA/UNCH_vs_GFP/volcano_plot_UNCH_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/Head_no_rRNA/UpSetR_all_intersections/HEX1_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head_no_rRNA/UpSetR_all_intersections/HEX1_vs_GFP_&_HEX2_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head_no_rRNA/UpSetR_all_intersections/HEX1_vs_GFP_&_HEX2_vs_GFP_&_UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head_no_rRNA/UpSetR_all_intersections/HEX1_vs_GFP_&_JHMT_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head_no_rRNA/UpSetR_all_intersections/HEX1_vs_GFP_&_JHMT_vs_GFP_&_UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head_no_rRNA/UpSetR_all_intersections/HEX1_vs_GFP_&_MIOX_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head_no_rRNA/UpSetR_all_intersections/HEX1_vs_GFP_&_MIOX_vs_GFP_&_UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head_no_rRNA/UpSetR_all_intersections/HEX1_vs_GFP_&_UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head_no_rRNA/UpSetR_all_intersections/HEX2_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head_no_rRNA/UpSetR_all_intersections/HEX2_vs_GFP_&_MIOX_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head_no_rRNA/UpSetR_all_intersections/HEX2_vs_GFP_&_MIOX_vs_GFP_&_UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head_no_rRNA/UpSetR_all_intersections/HEX2_vs_GFP_&_UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head_no_rRNA/UpSetR_all_intersections/JHMT_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head_no_rRNA/UpSetR_all_intersections/JHMT_vs_GFP_&_MIOX_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head_no_rRNA/UpSetR_all_intersections/JHMT_vs_GFP_&_MIOX_vs_GFP_&_UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head_no_rRNA/UpSetR_all_intersections/JHMT_vs_GFP_&_UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head_no_rRNA/UpSetR_all_intersections/MIOX_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head_no_rRNA/UpSetR_all_intersections/MIOX_vs_GFP_&_UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head_no_rRNA/UpSetR_all_intersections/UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Head_no_rRNA/sva_scatter_SV1_SV2.png
    Deleted:    data/DEG_results/RNAi/Head_no_rRNA/sva_scatter_SV1_SV3.png
    Deleted:    data/DEG_results/RNAi/Head_no_rRNA/sva_scatter_SV2_SV3.png
    Deleted:    data/DEG_results/RNAi/Head_no_rRNA/sva_stripchart_SV1.png
    Deleted:    data/DEG_results/RNAi/Head_no_rRNA/sva_stripchart_SV2.png
    Deleted:    data/DEG_results/RNAi/Head_no_rRNA/sva_stripchart_SV3.png
    Deleted:    data/DEG_results/RNAi/Thorax/HEX1_vs_GFP/DEG_sigresults_HEX1_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax/HEX2_vs_GFP/DEG_sigresults_HEX2_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax/Hex1_vs_GFP/heatmap_plot_Hex1_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/Thorax/Hex1_vs_GFP/volcano_plot_Hex1_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/Thorax/Hex2_vs_GFP/heatmap_plot_Hex2_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/Thorax/Hex2_vs_GFP/volcano_plot_Hex2_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/Thorax/JHMT_vs_GFP/DEG_sigresults_JHMT_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax/JHMT_vs_GFP/heatmap_plot_JHMT_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/Thorax/JHMT_vs_GFP/volcano_plot_JHMT_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/Thorax/MIOX_vs_GFP/DEG_sigresults_MIOX_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax/MIOX_vs_GFP/heatmap_plot_MIOX_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/Thorax/MIOX_vs_GFP/volcano_plot_MIOX_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/Thorax/PCA_vst_Gene_hull.png
    Deleted:    data/DEG_results/RNAi/Thorax/PCA_vst_Gene_labelled.png
    Deleted:    data/DEG_results/RNAi/Thorax/Thorax_raw_counts.csv
    Deleted:    data/DEG_results/RNAi/Thorax/UNCH_vs_GFP/DEG_sigresults_UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax/UNCH_vs_GFP/heatmap_plot_UNCH_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/Thorax/UNCH_vs_GFP/volcano_plot_UNCH_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/Thorax/UpSetR_all_intersections/HEX1_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax/UpSetR_all_intersections/HEX1_vs_GFP_&_HEX2_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax/UpSetR_all_intersections/HEX1_vs_GFP_&_HEX2_vs_GFP_&_JHMT_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax/UpSetR_all_intersections/HEX1_vs_GFP_&_HEX2_vs_GFP_&_MIOX_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax/UpSetR_all_intersections/HEX1_vs_GFP_&_HEX2_vs_GFP_&_MIOX_vs_GFP_&_UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax/UpSetR_all_intersections/HEX1_vs_GFP_&_HEX2_vs_GFP_&_UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax/UpSetR_all_intersections/HEX1_vs_GFP_&_JHMT_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax/UpSetR_all_intersections/HEX1_vs_GFP_&_JHMT_vs_GFP_&_MIOX_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax/UpSetR_all_intersections/HEX1_vs_GFP_&_MIOX_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax/UpSetR_all_intersections/HEX1_vs_GFP_&_MIOX_vs_GFP_&_UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax/UpSetR_all_intersections/HEX1_vs_GFP_&_UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax/UpSetR_all_intersections/HEX2_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax/UpSetR_all_intersections/HEX2_vs_GFP_&_JHMT_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax/UpSetR_all_intersections/HEX2_vs_GFP_&_JHMT_vs_GFP_&_MIOX_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax/UpSetR_all_intersections/HEX2_vs_GFP_&_MIOX_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax/UpSetR_all_intersections/HEX2_vs_GFP_&_MIOX_vs_GFP_&_UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax/UpSetR_all_intersections/HEX2_vs_GFP_&_UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax/UpSetR_all_intersections/JHMT_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax/UpSetR_all_intersections/JHMT_vs_GFP_&_MIOX_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax/UpSetR_all_intersections/JHMT_vs_GFP_&_UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax/UpSetR_all_intersections/MIOX_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax/UpSetR_all_intersections/MIOX_vs_GFP_&_UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax/UpSetR_all_intersections/UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax/sva_scatter_SV1_SV2.png
    Deleted:    data/DEG_results/RNAi/Thorax/sva_scatter_SV1_SV3.png
    Deleted:    data/DEG_results/RNAi/Thorax/sva_scatter_SV2_SV3.png
    Deleted:    data/DEG_results/RNAi/Thorax/sva_stripchart_SV1.png
    Deleted:    data/DEG_results/RNAi/Thorax/sva_stripchart_SV2.png
    Deleted:    data/DEG_results/RNAi/Thorax/sva_stripchart_SV3.png
    Deleted:    data/DEG_results/RNAi/Thorax_GFP/GFP_vs_CONTROL/DEG_sigresults_GFP_vs_CONTROL.csv
    Deleted:    data/DEG_results/RNAi/Thorax_GFP/PCA_Tissue_Gene_Label.png
    Deleted:    data/DEG_results/RNAi/Thorax_GFP/PCA_Tissue_Gene_NoLabel.png
    Deleted:    data/DEG_results/RNAi/Thorax_GFP/sva_scatter_SV1_SV2.png
    Deleted:    data/DEG_results/RNAi/Thorax_GFP/sva_stripchart_SV1.png
    Deleted:    data/DEG_results/RNAi/Thorax_GFP/sva_stripchart_SV2.png
    Deleted:    data/DEG_results/RNAi/Thorax_GFP_no_rRNA/GFP_vs_CONTROL/DEG_sigresults_GFP_vs_CONTROL.csv
    Deleted:    data/DEG_results/RNAi/Thorax_GFP_no_rRNA/PCA_Tissue_Gene_Label.png
    Deleted:    data/DEG_results/RNAi/Thorax_GFP_no_rRNA/PCA_Tissue_Gene_NoLabel.png
    Deleted:    data/DEG_results/RNAi/Thorax_GFP_no_rRNA/sva_scatter_SV1_SV2.png
    Deleted:    data/DEG_results/RNAi/Thorax_GFP_no_rRNA/sva_stripchart_SV1.png
    Deleted:    data/DEG_results/RNAi/Thorax_GFP_no_rRNA/sva_stripchart_SV2.png
    Deleted:    data/DEG_results/RNAi/Thorax_control/GFP_vs_CONTROL/DEG_sigresults_GFP_vs_CONTROL.csv
    Deleted:    data/DEG_results/RNAi/Thorax_control/HEX1_vs_CONTROL/DEG_sigresults_HEX1_vs_CONTROL.csv
    Deleted:    data/DEG_results/RNAi/Thorax_control/HEX2_vs_CONTROL/DEG_sigresults_HEX2_vs_CONTROL.csv
    Deleted:    data/DEG_results/RNAi/Thorax_control/JHMT_vs_CONTROL/DEG_sigresults_JHMT_vs_CONTROL.csv
    Deleted:    data/DEG_results/RNAi/Thorax_control/MIOX_vs_CONTROL/DEG_sigresults_MIOX_vs_CONTROL.csv
    Deleted:    data/DEG_results/RNAi/Thorax_control/PCA_Tissue_Gene_Label.png
    Deleted:    data/DEG_results/RNAi/Thorax_control/PCA_Tissue_Gene_NoLabel.png
    Deleted:    data/DEG_results/RNAi/Thorax_control/UNCH_vs_CONTROL/DEG_sigresults_UNCH_vs_CONTROL.csv
    Deleted:    data/DEG_results/RNAi/Thorax_control/sva_scatter_SV1_SV2.png
    Deleted:    data/DEG_results/RNAi/Thorax_control/sva_scatter_SV1_SV3.png
    Deleted:    data/DEG_results/RNAi/Thorax_control/sva_scatter_SV1_SV4.png
    Deleted:    data/DEG_results/RNAi/Thorax_control/sva_scatter_SV1_SV5.png
    Deleted:    data/DEG_results/RNAi/Thorax_control/sva_scatter_SV1_SV6.png
    Deleted:    data/DEG_results/RNAi/Thorax_control/sva_scatter_SV2_SV3.png
    Deleted:    data/DEG_results/RNAi/Thorax_control/sva_scatter_SV2_SV4.png
    Deleted:    data/DEG_results/RNAi/Thorax_control/sva_scatter_SV2_SV5.png
    Deleted:    data/DEG_results/RNAi/Thorax_control/sva_scatter_SV2_SV6.png
    Deleted:    data/DEG_results/RNAi/Thorax_control/sva_scatter_SV3_SV4.png
    Deleted:    data/DEG_results/RNAi/Thorax_control/sva_scatter_SV3_SV5.png
    Deleted:    data/DEG_results/RNAi/Thorax_control/sva_scatter_SV3_SV6.png
    Deleted:    data/DEG_results/RNAi/Thorax_control/sva_scatter_SV4_SV5.png
    Deleted:    data/DEG_results/RNAi/Thorax_control/sva_scatter_SV4_SV6.png
    Deleted:    data/DEG_results/RNAi/Thorax_control/sva_scatter_SV5_SV6.png
    Deleted:    data/DEG_results/RNAi/Thorax_control/sva_stripchart_SV1.png
    Deleted:    data/DEG_results/RNAi/Thorax_control/sva_stripchart_SV2.png
    Deleted:    data/DEG_results/RNAi/Thorax_control/sva_stripchart_SV3.png
    Deleted:    data/DEG_results/RNAi/Thorax_control/sva_stripchart_SV4.png
    Deleted:    data/DEG_results/RNAi/Thorax_control/sva_stripchart_SV5.png
    Deleted:    data/DEG_results/RNAi/Thorax_control/sva_stripchart_SV6.png
    Deleted:    data/DEG_results/RNAi/Thorax_control_no_rRNA/GFP_vs_CONTROL/DEG_sigresults_GFP_vs_CONTROL.csv
    Deleted:    data/DEG_results/RNAi/Thorax_control_no_rRNA/HEX1_vs_CONTROL/DEG_sigresults_HEX1_vs_CONTROL.csv
    Deleted:    data/DEG_results/RNAi/Thorax_control_no_rRNA/HEX2_vs_CONTROL/DEG_sigresults_HEX2_vs_CONTROL.csv
    Deleted:    data/DEG_results/RNAi/Thorax_control_no_rRNA/JHMT_vs_CONTROL/DEG_sigresults_JHMT_vs_CONTROL.csv
    Deleted:    data/DEG_results/RNAi/Thorax_control_no_rRNA/MIOX_vs_CONTROL/DEG_sigresults_MIOX_vs_CONTROL.csv
    Deleted:    data/DEG_results/RNAi/Thorax_control_no_rRNA/PCA_Tissue_Gene_Label.png
    Deleted:    data/DEG_results/RNAi/Thorax_control_no_rRNA/PCA_Tissue_Gene_NoLabel.png
    Deleted:    data/DEG_results/RNAi/Thorax_control_no_rRNA/UNCH_vs_CONTROL/DEG_sigresults_UNCH_vs_CONTROL.csv
    Deleted:    data/DEG_results/RNAi/Thorax_control_no_rRNA/sva_scatter_SV1_SV2.png
    Deleted:    data/DEG_results/RNAi/Thorax_control_no_rRNA/sva_scatter_SV1_SV3.png
    Deleted:    data/DEG_results/RNAi/Thorax_control_no_rRNA/sva_scatter_SV1_SV4.png
    Deleted:    data/DEG_results/RNAi/Thorax_control_no_rRNA/sva_scatter_SV1_SV5.png
    Deleted:    data/DEG_results/RNAi/Thorax_control_no_rRNA/sva_scatter_SV1_SV6.png
    Deleted:    data/DEG_results/RNAi/Thorax_control_no_rRNA/sva_scatter_SV2_SV3.png
    Deleted:    data/DEG_results/RNAi/Thorax_control_no_rRNA/sva_scatter_SV2_SV4.png
    Deleted:    data/DEG_results/RNAi/Thorax_control_no_rRNA/sva_scatter_SV2_SV5.png
    Deleted:    data/DEG_results/RNAi/Thorax_control_no_rRNA/sva_scatter_SV2_SV6.png
    Deleted:    data/DEG_results/RNAi/Thorax_control_no_rRNA/sva_scatter_SV3_SV4.png
    Deleted:    data/DEG_results/RNAi/Thorax_control_no_rRNA/sva_scatter_SV3_SV5.png
    Deleted:    data/DEG_results/RNAi/Thorax_control_no_rRNA/sva_scatter_SV3_SV6.png
    Deleted:    data/DEG_results/RNAi/Thorax_control_no_rRNA/sva_scatter_SV4_SV5.png
    Deleted:    data/DEG_results/RNAi/Thorax_control_no_rRNA/sva_scatter_SV4_SV6.png
    Deleted:    data/DEG_results/RNAi/Thorax_control_no_rRNA/sva_scatter_SV5_SV6.png
    Deleted:    data/DEG_results/RNAi/Thorax_control_no_rRNA/sva_stripchart_SV1.png
    Deleted:    data/DEG_results/RNAi/Thorax_control_no_rRNA/sva_stripchart_SV2.png
    Deleted:    data/DEG_results/RNAi/Thorax_control_no_rRNA/sva_stripchart_SV3.png
    Deleted:    data/DEG_results/RNAi/Thorax_control_no_rRNA/sva_stripchart_SV4.png
    Deleted:    data/DEG_results/RNAi/Thorax_control_no_rRNA/sva_stripchart_SV5.png
    Deleted:    data/DEG_results/RNAi/Thorax_control_no_rRNA/sva_stripchart_SV6.png
    Deleted:    data/DEG_results/RNAi/Thorax_no_rRNA/HEX1_vs_GFP/DEG_sigresults_HEX1_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax_no_rRNA/HEX2_vs_GFP/DEG_sigresults_HEX2_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax_no_rRNA/Hex1_vs_GFP/heatmap_plot_Hex1_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/Thorax_no_rRNA/Hex1_vs_GFP/volcano_plot_Hex1_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/Thorax_no_rRNA/Hex2_vs_GFP/heatmap_plot_Hex2_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/Thorax_no_rRNA/Hex2_vs_GFP/volcano_plot_Hex2_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/Thorax_no_rRNA/JHMT_vs_GFP/DEG_sigresults_JHMT_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax_no_rRNA/JHMT_vs_GFP/heatmap_plot_JHMT_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/Thorax_no_rRNA/JHMT_vs_GFP/volcano_plot_JHMT_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/Thorax_no_rRNA/MIOX_vs_GFP/DEG_sigresults_MIOX_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax_no_rRNA/MIOX_vs_GFP/heatmap_plot_MIOX_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/Thorax_no_rRNA/MIOX_vs_GFP/volcano_plot_MIOX_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/Thorax_no_rRNA/PCA_vst_Gene_hull.png
    Deleted:    data/DEG_results/RNAi/Thorax_no_rRNA/PCA_vst_Gene_labelled.png
    Deleted:    data/DEG_results/RNAi/Thorax_no_rRNA/UNCH_vs_GFP/DEG_sigresults_UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax_no_rRNA/UNCH_vs_GFP/heatmap_plot_UNCH_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/Thorax_no_rRNA/UNCH_vs_GFP/volcano_plot_UNCH_vs_GFP.tiff
    Deleted:    data/DEG_results/RNAi/Thorax_no_rRNA/UpSetR_all_intersections/HEX1_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax_no_rRNA/UpSetR_all_intersections/HEX1_vs_GFP_&_HEX2_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax_no_rRNA/UpSetR_all_intersections/HEX1_vs_GFP_&_HEX2_vs_GFP_&_UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax_no_rRNA/UpSetR_all_intersections/HEX1_vs_GFP_&_JHMT_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax_no_rRNA/UpSetR_all_intersections/HEX1_vs_GFP_&_JHMT_vs_GFP_&_UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax_no_rRNA/UpSetR_all_intersections/HEX1_vs_GFP_&_MIOX_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax_no_rRNA/UpSetR_all_intersections/HEX1_vs_GFP_&_UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax_no_rRNA/UpSetR_all_intersections/HEX2_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax_no_rRNA/UpSetR_all_intersections/HEX2_vs_GFP_&_JHMT_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax_no_rRNA/UpSetR_all_intersections/HEX2_vs_GFP_&_JHMT_vs_GFP_&_UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax_no_rRNA/UpSetR_all_intersections/HEX2_vs_GFP_&_MIOX_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax_no_rRNA/UpSetR_all_intersections/HEX2_vs_GFP_&_UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax_no_rRNA/UpSetR_all_intersections/JHMT_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax_no_rRNA/UpSetR_all_intersections/JHMT_vs_GFP_&_MIOX_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax_no_rRNA/UpSetR_all_intersections/JHMT_vs_GFP_&_MIOX_vs_GFP_&_UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax_no_rRNA/UpSetR_all_intersections/JHMT_vs_GFP_&_UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax_no_rRNA/UpSetR_all_intersections/MIOX_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax_no_rRNA/UpSetR_all_intersections/MIOX_vs_GFP_&_UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax_no_rRNA/UpSetR_all_intersections/UNCH_vs_GFP.csv
    Deleted:    data/DEG_results/RNAi/Thorax_no_rRNA/sva_scatter_SV1_SV2.png
    Deleted:    data/DEG_results/RNAi/Thorax_no_rRNA/sva_scatter_SV1_SV3.png
    Deleted:    data/DEG_results/RNAi/Thorax_no_rRNA/sva_scatter_SV2_SV3.png
    Deleted:    data/DEG_results/RNAi/Thorax_no_rRNA/sva_stripchart_SV1.png
    Deleted:    data/DEG_results/RNAi/Thorax_no_rRNA/sva_stripchart_SV2.png
    Deleted:    data/DEG_results/RNAi/Thorax_no_rRNA/sva_stripchart_SV3.png
    Deleted:    data/DEG_results/single_cell/singlecell_brain_phase_DGE_markers.csv
    Deleted:    data/DEG_results/single_cell/singlecell_opticlobe_phase_DGE_markers.csv
    Deleted:    data/WGCNA/output/Bulk_RNAseq/SoftThreshold_Head_gregaria.pdf
    Modified:   data/behavioral_data/Sample_phenotypes_WGCNA.csv
    Modified:   data/list/Bulk_RNAseq/All_BulkRNAseq_samples.csv
    Deleted:    data/orthofinder/Polyneoptera/Results_I2_iqtree/Orthogroups/Orthogroups_CladeAssignment_cleaned.txt
    Deleted:    data/orthofinder/Polyneoptera/Results_I2_iqtree/Orthogroups/Orthogroups_CladeAssignment_cleaned.xlsx
    Modified:   data/orthofinder/Polyneoptera/Results_I2_iqtree/Orthogroups/Orthogroups_SingleCopyOrthologues.txt
    Modified:   data/orthofinder/Polyneoptera/Results_I2_iqtree/Orthogroups/Orthogroups_UnassignedGenes_reprocessed.tsv
    Modified:   data/orthofinder/Polyneoptera/Results_I2_iqtree/Orthogroups/Orthogroups_reprocessed.tsv
    Modified:   data/orthofinder/Polyneoptera/Results_I2_iqtree/Orthogroups_13species_May2025.txt
    Modified:   data/orthofinder/Polyneoptera/Results_I2_iqtree/Orthogroups_genesproteinbiotype_13species_annotated_May2025.csv
    Modified:   data/orthofinder/Polyneoptera/Results_I2_iqtree/Plots_Polyneoptera/VerticalStackedBar_A. simplex.pdf
    Modified:   data/orthofinder/Polyneoptera/Results_I2_iqtree/Plots_Polyneoptera/VerticalStackedBar_B. rossius.pdf
    Modified:   data/orthofinder/Polyneoptera/Results_I2_iqtree/Plots_Polyneoptera/VerticalStackedBar_C. secundus.pdf
    Modified:   data/orthofinder/Polyneoptera/Results_I2_iqtree/Plots_Polyneoptera/VerticalStackedBar_G. bimaculatus.pdf
    Modified:   data/orthofinder/Polyneoptera/Results_I2_iqtree/Plots_Polyneoptera/VerticalStackedBar_G. longicornis.pdf
    Modified:   data/orthofinder/Polyneoptera/Results_I2_iqtree/Plots_Polyneoptera/VerticalStackedBar_L. migratoria.pdf
    Modified:   data/orthofinder/Polyneoptera/Results_I2_iqtree/Plots_Polyneoptera/VerticalStackedBar_P. americana.pdf
    Modified:   data/orthofinder/Polyneoptera/Results_I2_iqtree/Plots_Polyneoptera/VerticalStackedBar_americana.pdf
    Modified:   data/orthofinder/Polyneoptera/Results_I2_iqtree/Plots_Polyneoptera/VerticalStackedBar_cancellata.pdf
    Modified:   data/orthofinder/Polyneoptera/Results_I2_iqtree/Plots_Polyneoptera/VerticalStackedBar_cubense.pdf
    Modified:   data/orthofinder/Polyneoptera/Results_I2_iqtree/Plots_Polyneoptera/VerticalStackedBar_gregaria.pdf
    Modified:   data/orthofinder/Polyneoptera/Results_I2_iqtree/Plots_Polyneoptera/VerticalStackedBar_nitens.pdf
    Modified:   data/orthofinder/Polyneoptera/Results_I2_iqtree/Plots_Polyneoptera/VerticalStackedBar_piceifrons.pdf
    Deleted:    data/orthofinder/Schistocerca/Results_I2/Orthogroups_Schistocerca_Jan2025.txt
    Modified:   data/orthofinder/Schistocerca/Results_I2/Orthogroups_genesproteinbiotype_Schistocerca_annotated_May2025.csv
    Modified:   data/orthofinder/Schistocerca/Results_I2/Plots_Schistocerca/VerticalStackedBar_americana.pdf
    Modified:   data/orthofinder/Schistocerca/Results_I2/Plots_Schistocerca/VerticalStackedBar_cancellata.pdf
    Modified:   data/orthofinder/Schistocerca/Results_I2/Plots_Schistocerca/VerticalStackedBar_cubense.pdf
    Modified:   data/orthofinder/Schistocerca/Results_I2/Plots_Schistocerca/VerticalStackedBar_gregaria.pdf
    Modified:   data/orthofinder/Schistocerca/Results_I2/Plots_Schistocerca/VerticalStackedBar_nitens.pdf
    Modified:   data/orthofinder/Schistocerca/Results_I2/Plots_Schistocerca/VerticalStackedBar_piceifrons.pdf
    Deleted:    data/overlap/Bulk_RNAseq/americana/scatter_plot_americana_togregaria.png
    Deleted:    data/overlap/Bulk_RNAseq/americana/venn_diagram_americana_togregaria.png
    Deleted:    data/overlap/Bulk_RNAseq/cubense/scatter_plot_cubense_togregaria.png
    Deleted:    data/overlap/Bulk_RNAseq/cubense/venn_diagram_cubense_togregaria.png
    Deleted:    data/overlap/Bulk_RNAseq/gregaria/scatter_plot_gregaria_togregaria.png
    Deleted:    data/overlap/Bulk_RNAseq/gregaria/venn_diagram_gregaria_togregaria.png
    Deleted:    data/overlap/Bulk_RNAseq/nitens/venn_diagram_nitens_togregaria.png
    Deleted:    data/overlap/Bulk_RNAseq/overlapping_genes_head_thorax_americana.csv
    Deleted:    data/overlap/Bulk_RNAseq/overlapping_genes_head_thorax_cancellata.csv
    Deleted:    data/overlap/Bulk_RNAseq/overlapping_genes_head_thorax_cubense.csv
    Deleted:    data/overlap/Bulk_RNAseq/overlapping_genes_head_thorax_gregaria.csv
    Deleted:    data/overlap/Bulk_RNAseq/overlapping_genes_head_thorax_piceifrons.csv
    Deleted:    data/overlap/Bulk_RNAseq/piceifrons/scatter_plot_piceifrons_togregaria.png
    Deleted:    data/overlap/Bulk_RNAseq/piceifrons/venn_diagram_piceifrons_togregaria.png
    Deleted:    data/overlap/Bulk_RNAseq/scatter_plot_overlapping_genes_americana.png
    Deleted:    data/overlap/Bulk_RNAseq/scatter_plot_overlapping_genes_cancellata.png
    Deleted:    data/overlap/Bulk_RNAseq/scatter_plot_overlapping_genes_cubense.png
    Deleted:    data/overlap/Bulk_RNAseq/scatter_plot_overlapping_genes_gregaria.png
    Deleted:    data/overlap/Bulk_RNAseq/scatter_plot_overlapping_genes_piceifrons.png
    Deleted:    data/overlap/summaryPolyneoptera_I2_DEGs_Orthogroups_May2025.csv
    Deleted:    data/overlap/summaryPolyneoptera_I2_DEGs_Orthogroups_togregaria_May2025.csv
    Deleted:    data/overlap/summarySchistocerca_I2_DEGs_Orthogroups_May2025.csv
    Deleted:    data/overlap/summarySchistocerca_I2_DEGs_Orthogroups_togregaria_May2025.csv
    Deleted:    data/pathway_enrichment/DESeq2_sigresults_sva_Thorax_cancellata_togregaria.csv
    Deleted:    data/pathway_enrichment/DGE_results_UNCH_vs_GFP.csv
    Deleted:    data/pathway_enrichment/EggNog_Arthropoda_one2one.emapper.annotations
    Deleted:    data/pathway_enrichment/Functional_Enrichment.Rmd
    Deleted:    data/pathway_enrichment/custom_sgregaria_orgdb/org.Sgregaria.eg.db/DESCRIPTION
    Deleted:    data/pathway_enrichment/custom_sgregaria_orgdb/org.Sgregaria.eg.db/NAMESPACE
    Deleted:    data/pathway_enrichment/custom_sgregaria_orgdb/org.Sgregaria.eg.db/R/zzz.R
    Deleted:    data/pathway_enrichment/custom_sgregaria_orgdb/org.Sgregaria.eg.db/inst/extdata/org.Sgregaria.eg.sqlite
    Deleted:    data/pathway_enrichment/custom_sgregaria_orgdb/org.Sgregaria.eg.db/man/org.Sgregaria.egBASE.Rd
    Deleted:    data/pathway_enrichment/custom_sgregaria_orgdb/org.Sgregaria.eg.db/man/org.Sgregaria.egORGANISM.Rd
    Deleted:    data/pathway_enrichment/custom_sgregaria_orgdb/org.Sgregaria.eg.db/man/org.Sgregaria.eg_dbconn.Rd
    Deleted:    data/pathway_enrichment/custom_sgregaria_orgdb/org.Sgregaria.eg.db_v1.tar.gz
    Deleted:    data/pathway_enrichment/sessionInfo.txt

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.


These are the previous versions of the repository in which changes were made to the R Markdown (analysis/3_overlap-venn.Rmd) and HTML (docs/3_overlap-venn.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.

File Version Author Date Message
Rmd 4e391c3 Maeva TECHER 2025-05-30 add new analysis orthology, synteny
html 4e391c3 Maeva TECHER 2025-05-30 add new analysis orthology, synteny
Rmd 9451c02 Maeva TECHER 2025-03-03 adding GO enrich
html 9451c02 Maeva TECHER 2025-03-03 adding GO enrich
html 5fe5034 Maeva TECHER 2025-02-27 Build site.
Rmd b540a1e Maeva TECHER 2025-02-27 Updating overlap and RNAi
html b540a1e Maeva TECHER 2025-02-27 Updating overlap and RNAi
Rmd 89984c0 Maeva TECHER 2025-02-19 Add overlap update
html 89984c0 Maeva TECHER 2025-02-19 Add overlap update
Rmd d7fa779 Maeva TECHER 2025-02-14 Update RNAi and overlap
html d7fa779 Maeva TECHER 2025-02-14 Update RNAi and overlap
Rmd 3746422 Maeva TECHER 2025-02-12 Add RNAi
html 3746422 Maeva TECHER 2025-02-12 Add RNAi
Rmd 34c299a Maeva TECHER 2025-02-06 Overlap confirmed
html 34c299a Maeva TECHER 2025-02-06 Overlap confirmed
Rmd db8b525 Maeva TECHER 2025-02-06 update overlap
Rmd aab712a Maeva TECHER 2025-02-04 change overlap
html aab712a Maeva TECHER 2025-02-04 change overlap
Rmd faf2db3 Maeva TECHER 2025-01-13 update markdown
Rmd fe6dae9 Maeva TECHER 2024-11-19 changes ESA
html fe6dae9 Maeva TECHER 2024-11-19 changes ESA
Rmd 3fa8e62 Maeva TECHER 2024-11-09 updated analysis
html 3fa8e62 Maeva TECHER 2024-11-09 updated analysis
Rmd edb70fe Maeva TECHER 2024-11-08 overlap and deg results created
html edb70fe Maeva TECHER 2024-11-08 overlap and deg results created
html ba35b82 Maeva A. TECHER 2024-06-20 Build site.
html 45d0b6b Maeva A. TECHER 2024-05-16 Build site.
Rmd 5dff93d Maeva A. TECHER 2024-05-16 wflow_publish("analysis/3_overlap-venn.Rmd")

Load libraries

We start by loading all the required R packages.

# Install these packages first from CRAN or Bioconductor
library("knitr")
library("dplyr") 
library("ggplot2")
library("plotly")
library("htmlwidgets")  # For saving interactive plots
library("ggVennDiagram")
library("pheatmap")
library("tidyr")
library("RColorBrewer")
library("viridis")
library("kableExtra")
library("tibble")
library("VennDiagram")
library("gridExtra")
library("grid")
library("DT")
library("readr")
library("tidyverse")
library("data.table")
library("UpSetR")
library("ComplexUpset")

# Path for all species
workDir <- "/Users/maevatecher/Documents/GitHub/locust-comparative-genomics/data"
ortho_dir <- "/Users/maevatecher/Documents/GitHub/locust-comparative-genomics/data/orthofinder/Schistocerca"
allspecies_path <- file.path(workDir, "/list/13polyneoptera_geneid_ncbi.csv")
allspecies_df <- read.table(allspecies_path, sep = ",", header = TRUE, quote = "", fill = TRUE, stringsAsFactors = FALSE)
species_list <- c("gregaria", "piceifrons", "cancellata", "americana", "cubense", "nitens")
species_order <- c( "nitens", "cubense", "americana",  "piceifrons", "cancellata", "gregaria")
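
The workflowr check at the top of this report flags the absolute paths used above. A minimal sketch of the relative equivalents it suggests, assuming the file is knitted from the project root (the knit directory reported above). The existence check at the end is an illustrative addition, not part of the original analysis.

```r
# Relative paths suggested by the workflowr check; they resolve against the
# knit directory (assumed here to be the project root).
workDir <- "data"
ortho_dir <- file.path(workDir, "orthofinder", "Schistocerca")
allspecies_path <- file.path(workDir, "list", "13polyneoptera_geneid_ncbi.csv")

# Illustrative guard: report a missing input before read.table() fails on it
if (!file.exists(allspecies_path)) {
  message("Input not found (check the knit directory): ", allspecies_path)
}
```

Passing each path component separately to file.path() also avoids the doubled slash produced by a component that starts with "/".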

Our objective here is to compare the abundance, composition, and overlap of the DEGs found in the head and thorax tissues of each species between isolated and crowded last-instar females. The differentially expressed genes detected by DESeq2 varied across species and tissues, but we need some perspective: do locusts up-regulate and down-regulate the same genes? In the later GO enrichment section, we will investigate the functions of these genes, as each species seems to show a different gene expression profile in response to density changes.
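
The question of whether species regulate the same genes comes down to set operations on gene identifiers. A minimal sketch with invented gene IDs (the species names are real, but the gene lists are hypothetical and not taken from the DESeq2 results):

```r
# Hypothetical up-regulated gene IDs per species, invented for illustration
up_genes <- list(
  gregaria   = c("gene1", "gene2", "gene3", "gene4"),
  piceifrons = c("gene2", "gene3", "gene5"),
  cancellata = c("gene3", "gene6")
)

# Genes up-regulated in every species
shared_all <- Reduce(intersect, up_genes)

# Pairwise number of shared genes (the diagonal is each species' list size)
pairwise_shared <- sapply(up_genes, function(a) {
  sapply(up_genes, function(b) length(intersect(a, b)))
})

shared_all
pairwise_shared
```

This only works when identifiers are comparable across species, which is presumably why every species is analysed against the S. gregaria reference in Strategy 1.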

STRATEGY 1: One genome S. gregaria

1. DEGs comparison among species

We summarized the number of differentially expressed genes between density conditions for each species and each tissue.

# Initialize empty lists to store results
summary_list_head <- list()
summary_list_thorax <- list()

# Loop through each species to process their data
for (species in species_list) {
    # Read the DESeq2 results
    head_results_file <- file.path(workDir, "DEG_results/Bulk_RNAseq/", paste0(species, "/Head/DESeq2_sigresults_sva_Head_", species ,"_togregaria.csv"))
    thorax_results_file <- file.path(workDir, "DEG_results/Bulk_RNAseq/", paste0(species, "/Thorax/DESeq2_sigresults_sva_Thorax_", species ,"_togregaria.csv"))

    head_sigresults <- fread(head_results_file)  # fread is faster and uses less memory
    thorax_sigresults <- fread(thorax_results_file)

    # Count upregulated and downregulated genes for head
    head_upregulated <- sum(head_sigresults$log2FoldChange > 0)
    head_downregulated <- sum(head_sigresults$log2FoldChange < 0)
    head_upregulated_strict <- sum(head_sigresults$log2FoldChange > 1)
    head_downregulated_strict <- sum(head_sigresults$log2FoldChange < -1)

    # Count upregulated and downregulated genes for thorax
    thorax_upregulated <- sum(thorax_sigresults$log2FoldChange > 0)
    thorax_downregulated <- sum(thorax_sigresults$log2FoldChange < 0)
    thorax_upregulated_strict <- sum(thorax_sigresults$log2FoldChange > 1)
    thorax_downregulated_strict <- sum(thorax_sigresults$log2FoldChange < -1)

    # Store results in the list
    summary_list_head[[species]] <- data.frame(
        Species = species,
        Head_Upregulated = head_upregulated,
        Head_Downregulated = head_downregulated,
        Head_Upregulated_Strict = head_upregulated_strict,
        Head_Downregulated_Strict = head_downregulated_strict
    )

    summary_list_thorax[[species]] <- data.frame(
        Species = species,
        Thorax_Upregulated = thorax_upregulated,
        Thorax_Downregulated = thorax_downregulated,
        Thorax_Upregulated_Strict = thorax_upregulated_strict,
        Thorax_Downregulated_Strict = thorax_downregulated_strict
    )
}

# Combine lists into final data frames
summary_table_head <- bind_rows(summary_list_head)
summary_table_thorax <- bind_rows(summary_list_thorax)

# Print the summary table in a markdown-friendly format
knitr::kable(summary_table_head, format = "markdown", caption = "Summary of differentially expressed genes in head per species")
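
Summary of differentially expressed genes in head per species

| Species    | Head_Upregulated | Head_Downregulated | Head_Upregulated_Strict | Head_Downregulated_Strict |
|:-----------|-----------------:|-------------------:|------------------------:|--------------------------:|
| gregaria   |              397 |                327 |                     397 |                       327 |
| piceifrons |              161 |                160 |                     161 |                       160 |
| cancellata |              249 |                325 |                     249 |                       325 |
| americana  |              283 |                225 |                     283 |                       225 |
| cubense    |               19 |                 17 |                      19 |                        17 |
| nitens     |               89 |                206 |                      89 |                       206 |
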
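
The counting step in the loop above is repeated for each tissue and each threshold. A minimal sketch of a helper that condenses it; `count_degs` and its `lfc_cutoff` argument are hypothetical names introduced here, and the toy data frame is invented. In the table above the strict and non-strict columns are identical, which suggests (as an assumption, not something the report states) that the input files already contain only genes with an absolute log2 fold change above 1.

```r
# Count up- and down-regulated genes in a DESeq2-style results table.
# `lfc_cutoff` is a hypothetical parameter: 0 reproduces the plain counts,
# 1 reproduces the "strict" counts used above.
count_degs <- function(sigresults, lfc_cutoff = 0) {
  lfc <- sigresults$log2FoldChange
  c(
    Upregulated   = sum(lfc >  lfc_cutoff, na.rm = TRUE),
    Downregulated = sum(lfc < -lfc_cutoff, na.rm = TRUE)
  )
}

# Toy input, invented for illustration
toy <- data.frame(log2FoldChange = c(2.3, 0.4, -1.7, -0.2, 1.1))

count_degs(toy)                  # all significant genes, split by sign
count_degs(toy, lfc_cutoff = 1)  # only genes with |log2FC| > 1
```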
# Convert the summary table to a long format for easier plotting
summary_long_head <- summary_table_head %>%
  pivot_longer(cols = c(Head_Upregulated_Strict, Head_Downregulated_Strict),
               names_to = "Tissue", values_to = "Count")

# Adjust the values for downregulated genes to be negative
summary_long_head <- summary_long_head %>%
  mutate(Count = ifelse(Tissue == "Head_Downregulated_Strict", -Count, Count))

summary_long_head$Species <- factor(summary_long_head$Species, levels = species_order)

# Plot barplot for head
ggplot(summary_long_head, aes(x = Species, y = Count, fill = Tissue)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(title = "Upregulated and Downregulated Genes in Head (absolute lfc >1)",
       x = "Species", y = "Number of Genes") +
  scale_fill_manual(values = c("Head_Upregulated_Strict" = "red2", "Head_Downregulated_Strict" = "blue")) +
  scale_y_continuous(labels = function(x) ifelse(x < 0, -x, x), limits = c(-1200, 1200)) +
  theme_minimal(base_size = 12) +
  theme(legend.position = "top", 
        plot.title = element_text(hjust = 0.5, size = 14, face = "bold"), 
        axis.text.x = element_text(size = 12, angle = 45, hjust = 1)) +
  coord_flip()
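
The diverging bar plot relies on two small transformations: down-regulated counts are mirrored below zero, and the axis labels are converted back to absolute values. A minimal sketch on invented counts (the species names and numbers are hypothetical):

```r
library(dplyr)
library(tidyr)

# Toy counts, invented for illustration
toy_summary <- data.frame(
  Species = c("speciesA", "speciesB"),
  Up = c(10, 4),
  Down = c(7, 9)
)

toy_long <- toy_summary %>%
  pivot_longer(cols = c(Up, Down), names_to = "Direction", values_to = "Count") %>%
  # Mirror down-regulated counts below zero so the bars diverge from the axis
  mutate(Count = ifelse(Direction == "Down", -Count, Count))

toy_long

# Axis labels are displayed as absolute values, as in the plot above
abs_label <- function(x) ifelse(x < 0, -x, x)
abs_label(c(-10, 0, 10))
```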

# Print the summary table for thorax
knitr::kable(summary_table_thorax, format = "markdown", caption = "Summary of differentially expressed genes in thorax per species")
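
Summary of differentially expressed genes in thorax per species

| Species    | Thorax_Upregulated | Thorax_Downregulated | Thorax_Upregulated_Strict | Thorax_Downregulated_Strict |
|:-----------|-------------------:|---------------------:|--------------------------:|----------------------------:|
| gregaria   |                463 |                  620 |                       463 |                         620 |
| piceifrons |                497 |                  182 |                       497 |                         182 |
| cancellata |                256 |                  261 |                       256 |                         261 |
| americana  |                127 |                  275 |                       127 |                         275 |
| cubense    |                 50 |                  137 |                        50 |                         137 |
| nitens     |                  0 |                    0 |                         0 |                           0 |
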
# Convert the summary table to a long format for thorax
summary_long_thorax <- summary_table_thorax %>%
  pivot_longer(cols = c(Thorax_Upregulated_Strict, Thorax_Downregulated_Strict),
               names_to = "Tissue", values_to = "Count")

# Adjust the values for downregulated genes to be negative
summary_long_thorax <- summary_long_thorax %>%
  mutate(Count = ifelse(Tissue == "Thorax_Downregulated_Strict", -Count, Count))

summary_long_thorax$Species <- factor(summary_long_thorax$Species, levels = species_order)

# Plot barplot for thorax
ggplot(summary_long_thorax, aes(x = Species, y = Count, fill = Tissue)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(title = "Upregulated and Downregulated Genes in Thorax (absolute lfc >1)",
       x = "Species", y = "Number of Genes") +
  scale_fill_manual(values = c("Thorax_Upregulated_Strict" = "red2", "Thorax_Downregulated_Strict" = "blue")) +
  scale_y_continuous(labels = function(x) ifelse(x < 0, -x, x), limits = c(-1200, 1200)) +
  theme_minimal(base_size = 12) +
  theme(legend.position = "top", 
        plot.title = element_text(hjust = 0.5, size = 14, face = "bold"), 
        axis.text.x = element_text(size = 12, angle = 45, hjust = 1)) +
  coord_flip()

# Define custom colors for each GeneType
custom_colors <- c(
  "transcribed_pseudogene" = "#F4F1BB",  # Example color for transcribed_pseudogene
  "protein-coding" = "#9B57D3",         # Example color for protein-coding
  "lncRNA" = "#A5300F",                 # Example color for lncRNA
  "tRNA" = "#74D055FF",                   # Example color for tRNA
  "misc_RNA" = "#3B6978",               # Example color for misc_RNA
  "ncRNA" = "#29AF7FFF",                  # Example color for ncRNA
  "pseudogene" = "#81B29A",             # Example color for pseudogene
  "rRNA" = "#5982DB",                   # Example color for rRNA
  "snoRNA" = "#DCE318FF",                 # Example color for snoRNA
  "snRNA" = "#665EB8"                   # Example color for snRNA
)

# Use scale_fill_manual to map the custom colors to the GeneTypes
custom_color_scale <- scale_fill_manual(
  values = custom_colors
)
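
Because the palette is a named vector, any biotype present in the data but absent from its names has no color assigned. A minimal sketch of a check for that case; `palette_colors` is a subset of the palette above, and `observed_types` is a hypothetical stand-in for a GeneType column:

```r
# Subset of the palette defined above
palette_colors <- c(
  "protein-coding" = "#9B57D3",
  "lncRNA" = "#A5300F",
  "tRNA" = "#74D055FF"
)

# Hypothetical biotypes as they might appear in a GeneType column
observed_types <- c("protein-coding", "lncRNA", "lncRNA", "snoRNA")

# Biotypes present in the data but missing from the palette
unmapped <- setdiff(unique(observed_types), names(palette_colors))
if (length(unmapped) > 0) {
  message("No color defined for: ", paste(unmapped, collapse = ", "))
}
```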
# Create an empty list to store the data for all species
all_species_data <- list()

# Loop through each species to process their data
for (species in species_list) {
  # Read the DESeq2 results for head and thorax
  head_results_file <- file.path(workDir, "DEG_results/Bulk_RNAseq/", paste0(species, "/Head/DESeq2_sigresults_sva_Head_", species ,"_togregaria.csv"))
  thorax_results_file <- file.path(workDir, "DEG_results/Bulk_RNAseq/", paste0(species, "/Thorax/DESeq2_sigresults_sva_Thorax_", species ,"_togregaria.csv"))
  
  head_sigresults <- read.csv(head_results_file, stringsAsFactors = FALSE)
  thorax_sigresults <- read.csv(thorax_results_file, stringsAsFactors = FALSE)
  
  # Add GeneType and Species columns (from `allspecies_df`)
  head_results_merged <- merge(head_sigresults, allspecies_df[, c("GeneID", "GeneType", "Species")], by = "GeneID")
  thorax_results_merged <- merge(thorax_sigresults, allspecies_df[, c("GeneID", "GeneType", "Species")], by = "GeneID")
  
  # Count for upregulated and downregulated genes in head
  head_upregulated <- head_results_merged %>%
    filter(log2FoldChange > 1) %>%
    mutate(Regulation = "Upregulated", Tissue = "Head", Count = 1)
  
  head_downregulated <- head_results_merged %>%
    filter(log2FoldChange < -1) %>%
    mutate(Regulation = "Downregulated", Tissue = "Head", Count = -1)  # Mutate downregulated genes to negative
  
  # Combine upregulated and downregulated genes for head
  head_combined <- rbind(head_upregulated, head_downregulated)
  
  # Ensure all GeneTypes are represented for this species, even if they have no DEGs
  head_combined <- head_combined %>%
    complete(GeneType = unique(allspecies_df$GeneType), 
             fill = list(Count = 0, Tissue = "Head"))  # Zero-count rows keep their tissue label so the later Tissue filter retains them
  
  # Count for upregulated and downregulated genes in thorax
  thorax_upregulated <- thorax_results_merged %>%
    filter(log2FoldChange > 1) %>%
    mutate(Regulation = "Upregulated", Tissue = "Thorax", Count = 1)
  
  thorax_downregulated <- thorax_results_merged %>%
    filter(log2FoldChange < -1) %>%
    mutate(Regulation = "Downregulated", Tissue = "Thorax", Count = -1)  # Mutate downregulated genes to negative
  
  # Combine upregulated and downregulated genes for thorax
  thorax_combined <- rbind(thorax_upregulated, thorax_downregulated)
  
  # Ensure all GeneTypes are represented for this species in thorax, even if they have no DEGs
  thorax_combined <- thorax_combined %>%
    complete(GeneType = unique(allspecies_df$GeneType), 
             fill = list(Count = 0, Tissue = "Thorax"))  # Zero-count rows keep their tissue label so the later Tissue filter retains them
  
  # Combine data for head and thorax into one
  combined_data <- rbind(head_combined, thorax_combined)
  
  # Add species column to the data
  combined_data$Species <- species
  
  # Append the data to the list for all species
  all_species_data[[species]] <- combined_data
}

# Combine all species data into one data frame
final_data <- bind_rows(all_species_data)

# Reorder species according to the desired order
final_data$Species <- factor(final_data$Species, levels = species_order)

# Split the data by tissue
final_data_head <- final_data %>% filter(Tissue == "Head")
final_data_thorax <- final_data %>% filter(Tissue == "Thorax")

# Create the barplot for all species and only head tissue
ggplot(final_data_head, aes(x = Species, y = Count, fill = GeneType)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(title = "DEGs by Gene Biotype for Head (absolute lfc >1)",
       x = "Species",
       y = "Number of Genes") +
  custom_color_scale +
  scale_y_continuous(labels = function(x) ifelse(x < 0, -x, x), limits = c(-1200, 1200)) +
  theme_minimal(base_size = 12) + 
  theme(legend.position = "top", 
        plot.title = element_text(hjust = 0.5, size = 14, face = "bold"), 
        axis.title.x = element_text(size = 14, face = "bold"), 
        axis.title.y = element_text(size = 14, face = "bold"), 
        axis.text.x = element_text(size = 12, angle = 45, hjust = 1), 
        axis.text.y = element_text(size = 12), 
        panel.grid.major.y = element_line(color = "grey90", linetype = "dashed"),
        panel.grid.minor = element_blank()) +
  coord_flip()  # Flip coordinates to make the plot horizontal
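
tidyr::complete() adds one row per missing GeneType, but only the columns named in `fill` receive a value; every other column is NA in the added rows. A minimal sketch on invented data showing why that matters when the result is later filtered by Tissue:

```r
library(tidyr)

# Toy DEG table, invented for illustration
toy_degs <- data.frame(
  GeneType = c("protein-coding", "lncRNA"),
  Tissue = c("Head", "Head"),
  Count = c(1, -1),
  stringsAsFactors = FALSE
)
all_types <- c("protein-coding", "lncRNA", "tRNA")

# Only Count is filled: Tissue is NA in the added tRNA row
filled_count_only <- complete(toy_degs, GeneType = all_types,
                              fill = list(Count = 0))

# Filling Tissue as well keeps the added row when filtering on Tissue
filled_both <- complete(toy_degs, GeneType = all_types,
                        fill = list(Count = 0, Tissue = "Head"))

# Number of rows that survive a Tissue == "Head" filter in each case
sum(filled_count_only$Tissue == "Head", na.rm = TRUE)
sum(filled_both$Tissue == "Head", na.rm = TRUE)
```

Rows whose Tissue is NA are dropped by dplyr::filter(Tissue == "Head"), so zero-count biotypes only reach the plot data when the tissue label is filled too.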

# Create the barplot for all species and only thorax tissue
ggplot(final_data_thorax, aes(x = Species, y = Count, fill = GeneType)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(title = "DEGs by Gene Biotype for Thorax (absolute lfc >1)",
       x = "Species",
       y = "Number of Genes") +
  custom_color_scale +
  scale_y_continuous(labels = function(x) ifelse(x < 0, -x, x), limits = c(-1200, 1200)) +
  theme_minimal(base_size = 12) + 
  theme(legend.position = "top", 
        plot.title = element_text(hjust = 0.5, size = 14, face = "bold"), 
        axis.title.x = element_text(size = 14, face = "bold"), 
        axis.title.y = element_text(size = 14, face = "bold"), 
        axis.text.x = element_text(size = 12, angle = 45, hjust = 1), 
        axis.text.y = element_text(size = 12), 
        panel.grid.major.y = element_line(color = "grey90", linetype = "dashed"),
        panel.grid.minor = element_blank()) +
  coord_flip()  # Flip coordinates to make the plot horizontal


2. Overlap of DEGs between tissues

gregaria


species <- "gregaria"  # Specify the species for which to generate plots

# Load DESeq2 results for head and thorax
head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq/", paste0(species, "/Head/DESeq2_sigresults_sva_Head_", species,"_togregaria.csv"))
thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq/", paste0(species, "/Thorax/DESeq2_sigresults_sva_Thorax_", species,"_togregaria.csv"))


head_data <- read.csv(head_file, stringsAsFactors = FALSE)
thorax_data <- read.csv(thorax_file, stringsAsFactors = FALSE)

# Check if data is empty and handle accordingly
if (nrow(head_data) == 0 || nrow(thorax_data) == 0) {
    message(paste("No data for species:", species))
} else {
    # Filter for significant DEGs and select upregulated and downregulated genes
    head_up <- head_data %>%
        filter(padj < 0.05 & log2FoldChange > 1) %>%
        select(GeneID = X)

    head_down <- head_data %>%
        filter(padj < 0.05 & log2FoldChange < -1) %>%
        select(GeneID = X)

    thorax_up <- thorax_data %>%
        filter(padj < 0.05 & log2FoldChange > 1) %>%
        select(GeneID = X)

    thorax_down <- thorax_data %>%
        filter(padj < 0.05 & log2FoldChange < -1) %>%
        select(GeneID = X)

    # Prepare data for Venn diagram
    venn_data <- list(
        Head_Upregulated = head_up$GeneID,
        Head_Downregulated = head_down$GeneID,
        Thorax_Upregulated = thorax_up$GeneID,
        Thorax_Downregulated = thorax_down$GeneID
    )

    # Generate the four-way Venn diagram with specified colors and legend outside
    venn_plot <- venn.diagram(
        x = venn_data, 
        category.names = c("Head Upregulated", "Head Downregulated", "Thorax Upregulated", "Thorax Downregulated"), 
        filename = NULL, 
        output = TRUE, 
        fill = c("red", "skyblue", "orange", "blue"),  # Set colors for upregulated and downregulated
        alpha = 0.5, 
        cex = 2,  # Text size for numbers
        cat.cex = 0,  # Size 0 hides the category labels; a custom legend is drawn below instead
        cat.pos = c(0, 0, 0, 0),  # Position to center labels
        cat.dist = c(0.1, 0.1, 0.1, 0.1),  # Distance between category labels and circles
        main = paste("Venn Diagram of DEGs for S.", species),
        main.cex = 1.2,  # Size of the main title
        cat.col = c("red", "skyblue", "orange", "blue")  # Color the category labels
    )

    # Clear the current plotting area before drawing the next Venn diagram
    grid.newpage()

    # Display the Venn diagram
    grid.draw(venn_plot)

    # Manually create a custom legend
    legend_labels <- c("Head Up", "Head Down", "Thorax Up", "Thorax Down")
    legend_colors <- c("red", "skyblue", "orange", "blue")

    # Positioning the legend lower on the right side of the plot
    legend_x <- unit(0.85, "npc")  # Adjust x position
    legend_y <- unit(0.2, "npc")    # Lower the legend vertically

    # Draw the legend
    for (i in seq_along(legend_labels)) {
        grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  width = unit(0.02, "npc"), height = unit(0.02, "npc"), 
                  gp = gpar(fill = legend_colors[i], col = NA))
        grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"), 
                  y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  just = "left", gp = gpar(cex = 0.8))
    }
    
    # Scatter plot for overlapping genes
    # Filter significant DEGs for both head and thorax
    head_sig_genes <- head_data %>%
        filter(padj < 0.05 & abs(log2FoldChange) > 1) %>%
        select(GeneID = X, log2FoldChange, padj) 

    thorax_sig_genes <- thorax_data %>%
        filter(padj < 0.05 & abs(log2FoldChange) > 1) %>%
        select(GeneID = X, log2FoldChange, padj)

    # Find overlapping genes based on GeneID
    overlapping_genes <- inner_join(head_sig_genes, thorax_sig_genes, by = "GeneID", suffix = c("_head", "_thorax"))

    # Save the overlapping genes to a CSV file
    output_file <- file.path(workDir, "overlap/Bulk_RNAseq", paste0("overlapping_genes_head_thorax_", species, ".csv"))
    write.csv(overlapping_genes, output_file, row.names = FALSE)

    # Plot overlapping genes with scatter plot
    p <- ggplot(overlapping_genes, aes(x = log2FoldChange_head, y = log2FoldChange_thorax)) +
        geom_point(aes(color = case_when(
            log2FoldChange_head > 0 & log2FoldChange_thorax > 0 ~ "Upregulated in Both",
            log2FoldChange_head < 0 & log2FoldChange_thorax < 0 ~ "Downregulated in Both",
            log2FoldChange_head > 0 & log2FoldChange_thorax < 0 ~ "Up in Head, Down in Thorax",
            log2FoldChange_head < 0 & log2FoldChange_thorax > 0 ~ "Down in Head, Up in Thorax"
        )), size = 3, alpha = 0.7) +
        geom_abline(slope = 1, intercept = 0, linetype = "dashed", color = "gray") +
        labs(
            x = "Log2 Fold Change (Head)", 
            y = "Log2 Fold Change (Thorax)", 
            color = "Regulation Pattern", 
            title = "Comparison of Log2 Fold Changes in Overlapping Genes", 
            subtitle = paste("Head vs. Thorax in", species)
        ) +
        theme_minimal() + 
        theme(
            plot.title = element_text(size = 16, face = "bold"), 
            plot.subtitle = element_text(size = 12, face = "italic"), 
            legend.position = "top"
        ) +
        scale_color_manual(values = c(
            "Upregulated in Both" = "red", 
            "Downregulated in Both" = "blue", 
            "Up in Head, Down in Thorax" = "purple", 
            "Down in Head, Up in Thorax" = "green"
        ))

    # Save the scatter plot
    ggsave(filename = file.path(workDir, "overlap/Bulk_RNAseq", paste0("scatter_plot_overlapping_genes_", species, ".png")), plot = p)

    # Display the scatter plot
    print(p)
}
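
The overlap step above can be illustrated on a toy table. This is a minimal sketch, assuming two small made-up data frames in place of the DESeq2 result files. `classify_pattern()` is a hypothetical helper that is not part of the original analysis, and base R `merge()` stands in for `inner_join()` so the snippet runs without dplyr.

```r
# Toy stand-ins for the filtered head and thorax DEG tables (values are made up)
head_sig <- data.frame(GeneID = c("g1", "g2", "g3"),
                       log2FoldChange = c(2.0, -1.5, 3.1),
                       stringsAsFactors = FALSE)
thorax_sig <- data.frame(GeneID = c("g2", "g3", "g4"),
                         log2FoldChange = c(1.2, 2.4, -2.0),
                         stringsAsFactors = FALSE)

# Keep only genes present in both tables; suffixes mirror the inner_join() call
overlap <- merge(head_sig, thorax_sig, by = "GeneID",
                 suffixes = c("_head", "_thorax"))

# Hypothetical helper: label each gene by the sign of its two fold changes.
# The final branch assumes no fold change is exactly 0, which holds after
# the abs(log2FoldChange) > 1 filter.
classify_pattern <- function(lfc_head, lfc_thorax) {
  ifelse(lfc_head > 0 & lfc_thorax > 0, "Upregulated in Both",
  ifelse(lfc_head < 0 & lfc_thorax < 0, "Downregulated in Both",
  ifelse(lfc_head > 0 & lfc_thorax < 0, "Up in Head, Down in Thorax",
         "Down in Head, Up in Thorax")))
}

overlap$pattern <- classify_pattern(overlap$log2FoldChange_head,
                                    overlap$log2FoldChange_thorax)
print(overlap)
```

Only g2 and g3 occur in both toy tables, so the result has two rows: g2 is labelled "Down in Head, Up in Thorax" and g3 "Upregulated in Both".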

Version Author Date
4e391c3 Maeva TECHER 2025-05-30
b540a1e Maeva TECHER 2025-02-27
89984c0 Maeva TECHER 2025-02-19
d7fa779 Maeva TECHER 2025-02-14
aab712a Maeva TECHER 2025-02-04
faf2db3 Maeva TECHER 2025-01-13
fe6dae9 Maeva TECHER 2024-11-19

Version Author Date
4e391c3 Maeva TECHER 2025-05-30
b540a1e Maeva TECHER 2025-02-27
d7fa779 Maeva TECHER 2025-02-14
aab712a Maeva TECHER 2025-02-04
faf2db3 Maeva TECHER 2025-01-13
fe6dae9 Maeva TECHER 2024-11-19

piceifrons


species <- "piceifrons"  # Specify the species for which to generate plots

# Load DESeq2 results for head and thorax
head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", species, "Head", paste0("DESeq2_sigresults_sva_Head_", species, "_togregaria.csv"))
thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", species, "Thorax", paste0("DESeq2_sigresults_sva_Thorax_", species, "_togregaria.csv"))

head_data <- read.csv(head_file, stringsAsFactors = FALSE)
thorax_data <- read.csv(thorax_file, stringsAsFactors = FALSE)

# Check if data is empty and handle accordingly
if (nrow(head_data) == 0 || nrow(thorax_data) == 0) {
    message(paste("No data for species:", species))
} else {
    # Filter for significant DEGs and select upregulated and downregulated genes
    head_up <- head_data %>%
        filter(padj < 0.05 & log2FoldChange > 1) %>%
        select(GeneID = X)

    head_down <- head_data %>%
        filter(padj < 0.05 & log2FoldChange < -1) %>%
        select(GeneID = X)

    thorax_up <- thorax_data %>%
        filter(padj < 0.05 & log2FoldChange > 1) %>%
        select(GeneID = X)

    thorax_down <- thorax_data %>%
        filter(padj < 0.05 & log2FoldChange < -1) %>%
        select(GeneID = X)

    # Prepare data for Venn diagram
    venn_data <- list(
        Head_Upregulated = head_up$GeneID,
        Head_Downregulated = head_down$GeneID,
        Thorax_Upregulated = thorax_up$GeneID,
        Thorax_Downregulated = thorax_down$GeneID
    )

    # Generate the four-way Venn diagram with specified colors and legend outside
    venn_plot <- venn.diagram(
        x = venn_data, 
        category.names = c("Head Upregulated", "Head Downregulated", "Thorax Upregulated", "Thorax Downregulated"), 
        filename = NULL, 
        output = TRUE, 
        fill = c("red", "skyblue", "orange", "blue"),  # Set colors for upregulated and downregulated
        alpha = 0.5, 
        cex = 2,  # Text size for numbers
        cat.cex = 0,  # Size 0 hides the category labels; a custom legend is drawn below
        cat.pos = c(0, 0, 0, 0),  # Position to center labels
        cat.dist = c(0.1, 0.1, 0.1, 0.1),  # Distance between category labels and circles
        main = paste("Venn Diagram of DEGs for S.", species),
        main.cex = 1.2,  # Size of the main title
        cat.col = c("red", "skyblue", "orange", "blue")  # Color the category labels
    )

    # Clear the current plotting area before drawing the next Venn diagram
    grid.newpage()

    # Display the Venn diagram
    grid.draw(venn_plot)

    # Manually create a custom legend
    legend_labels <- c("Head Up", "Head Down", "Thorax Up", "Thorax Down")
    legend_colors <- c("red", "skyblue", "orange", "blue")

    # Positioning the legend lower on the right side of the plot
    legend_x <- unit(0.85, "npc")  # Adjust x position
    legend_y <- unit(0.2, "npc")    # Lower the legend vertically

    # Draw the legend
    for (i in seq_along(legend_labels)) {
        grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  width = unit(0.02, "npc"), height = unit(0.02, "npc"), 
                  gp = gpar(fill = legend_colors[i], col = NA))
        grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"), 
                  y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  just = "left", gp = gpar(cex = 0.8))
    }

    # Scatter plot for overlapping genes
    # Filter significant DEGs for both head and thorax
    head_sig_genes <- head_data %>%
        filter(padj < 0.05 & abs(log2FoldChange) > 1) %>%
        select(GeneID = X, log2FoldChange, padj) 

    thorax_sig_genes <- thorax_data %>%
        filter(padj < 0.05 & abs(log2FoldChange) > 1) %>%
        select(GeneID = X, log2FoldChange, padj)

    # Find overlapping genes based on GeneID
    overlapping_genes <- inner_join(head_sig_genes, thorax_sig_genes, by = "GeneID", suffix = c("_head", "_thorax"))

    # Save the overlapping genes to a CSV file
    output_file <- file.path(workDir, "overlap/Bulk_RNAseq", paste0("overlapping_genes_head_thorax_", species, ".csv"))
    write.csv(overlapping_genes, output_file, row.names = FALSE)

    # Plot overlapping genes with scatter plot
    p <- ggplot(overlapping_genes, aes(x = log2FoldChange_head, y = log2FoldChange_thorax)) +
        geom_point(aes(color = case_when(
            log2FoldChange_head > 0 & log2FoldChange_thorax > 0 ~ "Upregulated in Both",
            log2FoldChange_head < 0 & log2FoldChange_thorax < 0 ~ "Downregulated in Both",
            log2FoldChange_head > 0 & log2FoldChange_thorax < 0 ~ "Up in Head, Down in Thorax",
            log2FoldChange_head < 0 & log2FoldChange_thorax > 0 ~ "Down in Head, Up in Thorax"
        )), size = 3, alpha = 0.7) +
        geom_abline(slope = 1, intercept = 0, linetype = "dashed", color = "gray") +
        labs(
            x = "Log2 Fold Change (Head)", 
            y = "Log2 Fold Change (Thorax)", 
            color = "Regulation Pattern", 
            title = "Comparison of Log2 Fold Changes in Overlapping Genes", 
            subtitle = paste("Head vs. Thorax in", species)
        ) +
        theme_minimal() + 
        theme(
            plot.title = element_text(size = 16, face = "bold"), 
            plot.subtitle = element_text(size = 12, face = "italic"), 
            legend.position = "top"
        ) +
        scale_color_manual(values = c(
            "Upregulated in Both" = "red", 
            "Downregulated in Both" = "blue", 
            "Up in Head, Down in Thorax" = "purple", 
            "Down in Head, Up in Thorax" = "green"
        ))

    # Save the scatter plot
    ggsave(filename = file.path(workDir, "overlap/Bulk_RNAseq", paste0("scatter_plot_overlapping_genes_", species, ".png")), plot = p)

    # Display the scatter plot
    print(p)
}

cancellata


species <- "cancellata"  # Specify the species for which to generate plots

# Load DESeq2 results for head and thorax
head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", species, "Head", paste0("DESeq2_sigresults_sva_Head_", species, "_togregaria.csv"))
thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", species, "Thorax", paste0("DESeq2_sigresults_sva_Thorax_", species, "_togregaria.csv"))

head_data <- read.csv(head_file, stringsAsFactors = FALSE)
thorax_data <- read.csv(thorax_file, stringsAsFactors = FALSE)

# Check if data is empty and handle accordingly
if (nrow(head_data) == 0 || nrow(thorax_data) == 0) {
    message(paste("No data for species:", species))
} else {
    # Filter for significant DEGs and select upregulated and downregulated genes
    head_up <- head_data %>%
        filter(padj < 0.05 & log2FoldChange > 1) %>%
        select(GeneID = X)

    head_down <- head_data %>%
        filter(padj < 0.05 & log2FoldChange < -1) %>%
        select(GeneID = X)

    thorax_up <- thorax_data %>%
        filter(padj < 0.05 & log2FoldChange > 1) %>%
        select(GeneID = X)

    thorax_down <- thorax_data %>%
        filter(padj < 0.05 & log2FoldChange < -1) %>%
        select(GeneID = X)

    # Prepare data for Venn diagram
    venn_data <- list(
        Head_Upregulated = head_up$GeneID,
        Head_Downregulated = head_down$GeneID,
        Thorax_Upregulated = thorax_up$GeneID,
        Thorax_Downregulated = thorax_down$GeneID
    )

    # Generate the four-way Venn diagram with specified colors and legend outside
    venn_plot <- venn.diagram(
        x = venn_data, 
        category.names = c("Head Upregulated", "Head Downregulated", "Thorax Upregulated", "Thorax Downregulated"), 
        filename = NULL, 
        output = TRUE, 
        fill = c("red", "skyblue", "orange", "blue"),  # Set colors for upregulated and downregulated
        alpha = 0.5, 
        cex = 2,  # Text size for numbers
        cat.cex = 0,  # Size 0 hides the category labels; a custom legend is drawn below
        cat.pos = c(0, 0, 0, 0),  # Position to center labels
        cat.dist = c(0.1, 0.1, 0.1, 0.1),  # Distance between category labels and circles
        main = paste("Venn Diagram of DEGs for S.", species),
        main.cex = 1.2,  # Size of the main title
        cat.col = c("red", "skyblue", "orange", "blue")  # Color the category labels
    )

    # Clear the current plotting area before drawing the next Venn diagram
    grid.newpage()

    # Display the Venn diagram
    grid.draw(venn_plot)

    # Manually create a custom legend
    legend_labels <- c("Head Up", "Head Down", "Thorax Up", "Thorax Down")
    legend_colors <- c("red", "skyblue", "orange", "blue")

    # Positioning the legend lower on the right side of the plot
    legend_x <- unit(0.85, "npc")  # Adjust x position
    legend_y <- unit(0.2, "npc")    # Lower the legend vertically

    # Draw the legend
    for (i in seq_along(legend_labels)) {
        grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  width = unit(0.02, "npc"), height = unit(0.02, "npc"), 
                  gp = gpar(fill = legend_colors[i], col = NA))
        grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"), 
                  y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  just = "left", gp = gpar(cex = 0.8))
    }

    # Scatter plot for overlapping genes
    # Filter significant DEGs for both head and thorax
    head_sig_genes <- head_data %>%
        filter(padj < 0.05 & abs(log2FoldChange) > 1) %>%
        select(GeneID = X, log2FoldChange, padj) 

    thorax_sig_genes <- thorax_data %>%
        filter(padj < 0.05 & abs(log2FoldChange) > 1) %>%
        select(GeneID = X, log2FoldChange, padj)

    # Find overlapping genes based on GeneID
    overlapping_genes <- inner_join(head_sig_genes, thorax_sig_genes, by = "GeneID", suffix = c("_head", "_thorax"))

    # Save the overlapping genes to a CSV file
    output_file <- file.path(workDir, "overlap/Bulk_RNAseq", paste0("overlapping_genes_head_thorax_", species, ".csv"))
    write.csv(overlapping_genes, output_file, row.names = FALSE)

    # Plot overlapping genes with scatter plot
    p <- ggplot(overlapping_genes, aes(x = log2FoldChange_head, y = log2FoldChange_thorax)) +
        geom_point(aes(color = case_when(
            log2FoldChange_head > 0 & log2FoldChange_thorax > 0 ~ "Upregulated in Both",
            log2FoldChange_head < 0 & log2FoldChange_thorax < 0 ~ "Downregulated in Both",
            log2FoldChange_head > 0 & log2FoldChange_thorax < 0 ~ "Up in Head, Down in Thorax",
            log2FoldChange_head < 0 & log2FoldChange_thorax > 0 ~ "Down in Head, Up in Thorax"
        )), size = 3, alpha = 0.7) +
        geom_abline(slope = 1, intercept = 0, linetype = "dashed", color = "gray") +
        labs(
            x = "Log2 Fold Change (Head)", 
            y = "Log2 Fold Change (Thorax)", 
            color = "Regulation Pattern", 
            title = "Comparison of Log2 Fold Changes in Overlapping Genes", 
            subtitle = paste("Head vs. Thorax in", species)
        ) +
        theme_minimal() + 
        theme(
            plot.title = element_text(size = 16, face = "bold"), 
            plot.subtitle = element_text(size = 12, face = "italic"), 
            legend.position = "top"
        ) +
        scale_color_manual(values = c(
            "Upregulated in Both" = "red", 
            "Downregulated in Both" = "blue", 
            "Up in Head, Down in Thorax" = "purple", 
            "Down in Head, Up in Thorax" = "green"
        ))

    # Save the scatter plot
    ggsave(filename = file.path(workDir, "overlap/Bulk_RNAseq", paste0("scatter_plot_overlapping_genes_", species, ".png")), plot = p)

    # Display the scatter plot
    print(p)
}

americana


species <- "americana"  # Specify the species for which to generate plots

# Load DESeq2 results for head and thorax
head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", species, "Head", paste0("DESeq2_sigresults_sva_Head_", species, "_togregaria.csv"))
thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", species, "Thorax", paste0("DESeq2_sigresults_sva_Thorax_", species, "_togregaria.csv"))

head_data <- read.csv(head_file, stringsAsFactors = FALSE)
thorax_data <- read.csv(thorax_file, stringsAsFactors = FALSE)

# Check if data is empty and handle accordingly
if (nrow(head_data) == 0 || nrow(thorax_data) == 0) {
    message(paste("No data for species:", species))
} else {
    # Filter for significant DEGs and select upregulated and downregulated genes
    head_up <- head_data %>%
        filter(padj < 0.05 & log2FoldChange > 1) %>%
        select(GeneID = X)

    head_down <- head_data %>%
        filter(padj < 0.05 & log2FoldChange < -1) %>%
        select(GeneID = X)

    thorax_up <- thorax_data %>%
        filter(padj < 0.05 & log2FoldChange > 1) %>%
        select(GeneID = X)

    thorax_down <- thorax_data %>%
        filter(padj < 0.05 & log2FoldChange < -1) %>%
        select(GeneID = X)

    # Prepare data for Venn diagram
    venn_data <- list(
        Head_Upregulated = head_up$GeneID,
        Head_Downregulated = head_down$GeneID,
        Thorax_Upregulated = thorax_up$GeneID,
        Thorax_Downregulated = thorax_down$GeneID
    )

    # Generate the four-way Venn diagram with specified colors and legend outside
    venn_plot <- venn.diagram(
        x = venn_data, 
        category.names = c("Head Upregulated", "Head Downregulated", "Thorax Upregulated", "Thorax Downregulated"), 
        filename = NULL, 
        output = TRUE, 
        fill = c("red", "skyblue", "orange", "blue"),  # Set colors for upregulated and downregulated
        alpha = 0.5, 
        cex = 2,  # Text size for numbers
        cat.cex = 0,  # Size 0 hides the category labels; a custom legend is drawn below
        cat.pos = c(0, 0, 0, 0),  # Position to center labels
        cat.dist = c(0.1, 0.1, 0.1, 0.1),  # Distance between category labels and circles
        main = paste("Venn Diagram of DEGs for S.", species),
        main.cex = 1.2,  # Size of the main title
        cat.col = c("red", "skyblue", "orange", "blue")  # Color the category labels
    )

    # Clear the current plotting area before drawing the next Venn diagram
    grid.newpage()

    # Display the Venn diagram
    grid.draw(venn_plot)

    # Manually create a custom legend
    legend_labels <- c("Head Up", "Head Down", "Thorax Up", "Thorax Down")
    legend_colors <- c("red", "skyblue", "orange", "blue")

    # Positioning the legend lower on the right side of the plot
    legend_x <- unit(0.85, "npc")  # Adjust x position
    legend_y <- unit(0.2, "npc")    # Lower the legend vertically

    # Draw the legend
    for (i in seq_along(legend_labels)) {
        grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  width = unit(0.02, "npc"), height = unit(0.02, "npc"), 
                  gp = gpar(fill = legend_colors[i], col = NA))
        grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"), 
                  y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  just = "left", gp = gpar(cex = 0.8))
    }

    # Scatter plot for overlapping genes
    # Filter significant DEGs for both head and thorax
    head_sig_genes <- head_data %>%
        filter(padj < 0.05 & abs(log2FoldChange) > 1) %>%
        select(GeneID = X, log2FoldChange, padj) 

    thorax_sig_genes <- thorax_data %>%
        filter(padj < 0.05 & abs(log2FoldChange) > 1) %>%
        select(GeneID = X, log2FoldChange, padj)

    # Find overlapping genes based on GeneID
    overlapping_genes <- inner_join(head_sig_genes, thorax_sig_genes, by = "GeneID", suffix = c("_head", "_thorax"))

    # Save the overlapping genes to a CSV file
    output_file <- file.path(workDir, "overlap/Bulk_RNAseq", paste0("overlapping_genes_head_thorax_", species, ".csv"))
    write.csv(overlapping_genes, output_file, row.names = FALSE)

    # Plot overlapping genes with scatter plot
    p <- ggplot(overlapping_genes, aes(x = log2FoldChange_head, y = log2FoldChange_thorax)) +
        geom_point(aes(color = case_when(
            log2FoldChange_head > 0 & log2FoldChange_thorax > 0 ~ "Upregulated in Both",
            log2FoldChange_head < 0 & log2FoldChange_thorax < 0 ~ "Downregulated in Both",
            log2FoldChange_head > 0 & log2FoldChange_thorax < 0 ~ "Up in Head, Down in Thorax",
            log2FoldChange_head < 0 & log2FoldChange_thorax > 0 ~ "Down in Head, Up in Thorax"
        )), size = 3, alpha = 0.7) +
        geom_abline(slope = 1, intercept = 0, linetype = "dashed", color = "gray") +
        labs(
            x = "Log2 Fold Change (Head)", 
            y = "Log2 Fold Change (Thorax)", 
            color = "Regulation Pattern", 
            title = "Comparison of Log2 Fold Changes in Overlapping Genes", 
            subtitle = paste("Head vs. Thorax in", species)
        ) +
        theme_minimal() + 
        theme(
            plot.title = element_text(size = 16, face = "bold"), 
            plot.subtitle = element_text(size = 12, face = "italic"), 
            legend.position = "top"
        ) +
        scale_color_manual(values = c(
            "Upregulated in Both" = "red", 
            "Downregulated in Both" = "blue", 
            "Up in Head, Down in Thorax" = "purple", 
            "Down in Head, Up in Thorax" = "green"
        ))

    # Save the scatter plot
    ggsave(filename = file.path(workDir, "overlap/Bulk_RNAseq", paste0("scatter_plot_overlapping_genes_", species, ".png")), plot = p)

    # Display the scatter plot
    print(p)
}

Version Author Date
4e391c3 Maeva TECHER 2025-05-30
b540a1e Maeva TECHER 2025-02-27
89984c0 Maeva TECHER 2025-02-19
d7fa779 Maeva TECHER 2025-02-14
aab712a Maeva TECHER 2025-02-04
8df3d7c Maeva TECHER 2025-01-13
fe6dae9 Maeva TECHER 2024-11-19

Version Author Date
4e391c3 Maeva TECHER 2025-05-30
b540a1e Maeva TECHER 2025-02-27
d7fa779 Maeva TECHER 2025-02-14
aab712a Maeva TECHER 2025-02-04
faf2db3 Maeva TECHER 2025-01-13
fe6dae9 Maeva TECHER 2024-11-19

cubense


species <- "cubense"  # Specify the species for which to generate plots

# Load DESeq2 results for head and thorax
head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", species, "Head", paste0("DESeq2_sigresults_sva_Head_", species, "_togregaria.csv"))
thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", species, "Thorax", paste0("DESeq2_sigresults_sva_Thorax_", species, "_togregaria.csv"))

head_data <- read.csv(head_file, stringsAsFactors = FALSE)
thorax_data <- read.csv(thorax_file, stringsAsFactors = FALSE)

# Check if data is empty and handle accordingly
if (nrow(head_data) == 0 || nrow(thorax_data) == 0) {
    message(paste("No data for species:", species))
} else {
    # Filter for significant DEGs and select upregulated and downregulated genes
    head_up <- head_data %>%
        filter(padj < 0.05 & log2FoldChange > 1) %>%
        select(GeneID = X)

    head_down <- head_data %>%
        filter(padj < 0.05 & log2FoldChange < -1) %>%
        select(GeneID = X)

    thorax_up <- thorax_data %>%
        filter(padj < 0.05 & log2FoldChange > 1) %>%
        select(GeneID = X)

    thorax_down <- thorax_data %>%
        filter(padj < 0.05 & log2FoldChange < -1) %>%
        select(GeneID = X)

    # Prepare data for Venn diagram
    venn_data <- list(
        Head_Upregulated = head_up$GeneID,
        Head_Downregulated = head_down$GeneID,
        Thorax_Upregulated = thorax_up$GeneID,
        Thorax_Downregulated = thorax_down$GeneID
    )

    # Generate the four-way Venn diagram with specified colors and legend outside
    venn_plot <- venn.diagram(
        x = venn_data, 
        category.names = c("Head Upregulated", "Head Downregulated", "Thorax Upregulated", "Thorax Downregulated"), 
        filename = NULL, 
        output = TRUE, 
        fill = c("red", "skyblue", "orange", "blue"),  # Set colors for upregulated and downregulated
        alpha = 0.5, 
        cex = 2,  # Text size for numbers
        cat.cex = 0,  # Size 0 hides the category labels; a custom legend is drawn below
        cat.pos = c(0, 0, 0, 0),  # Position to center labels
        cat.dist = c(0.1, 0.1, 0.1, 0.1),  # Distance between category labels and circles
        main = paste("Venn Diagram of DEGs for S.", species),
        main.cex = 1.2,  # Size of the main title
        cat.col = c("red", "skyblue", "orange", "blue")  # Color the category labels
    )

    # Clear the current plotting area before drawing the next Venn diagram
    grid.newpage()

    # Display the Venn diagram
    grid.draw(venn_plot)

    # Manually create a custom legend
    legend_labels <- c("Head Up", "Head Down", "Thorax Up", "Thorax Down")
    legend_colors <- c("red", "skyblue", "orange", "blue")

    # Positioning the legend lower on the right side of the plot
    legend_x <- unit(0.85, "npc")  # Adjust x position
    legend_y <- unit(0.2, "npc")    # Lower the legend vertically

    # Draw the legend
    for (i in seq_along(legend_labels)) {
        grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  width = unit(0.02, "npc"), height = unit(0.02, "npc"), 
                  gp = gpar(fill = legend_colors[i], col = NA))
        grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"), 
                  y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  just = "left", gp = gpar(cex = 0.8))
    }

    # Scatter plot for overlapping genes
    # Filter significant DEGs for both head and thorax
    head_sig_genes <- head_data %>%
        filter(padj < 0.05 & abs(log2FoldChange) > 1) %>%
        select(GeneID = X, log2FoldChange, padj) 

    thorax_sig_genes <- thorax_data %>%
        filter(padj < 0.05 & abs(log2FoldChange) > 1) %>%
        select(GeneID = X, log2FoldChange, padj)

    # Find overlapping genes based on GeneID
    overlapping_genes <- inner_join(head_sig_genes, thorax_sig_genes, by = "GeneID", suffix = c("_head", "_thorax"))

    # Save the overlapping genes to a CSV file
    output_file <- file.path(workDir, "overlap/Bulk_RNAseq", paste0("overlapping_genes_head_thorax_", species, ".csv"))
    write.csv(overlapping_genes, output_file, row.names = FALSE)

    # Plot overlapping genes with scatter plot
    p <- ggplot(overlapping_genes, aes(x = log2FoldChange_head, y = log2FoldChange_thorax)) +
        geom_point(aes(color = case_when(
            log2FoldChange_head > 0 & log2FoldChange_thorax > 0 ~ "Upregulated in Both",
            log2FoldChange_head < 0 & log2FoldChange_thorax < 0 ~ "Downregulated in Both",
            log2FoldChange_head > 0 & log2FoldChange_thorax < 0 ~ "Up in Head, Down in Thorax",
            log2FoldChange_head < 0 & log2FoldChange_thorax > 0 ~ "Down in Head, Up in Thorax"
        )), size = 3, alpha = 0.7) +
        geom_abline(slope = 1, intercept = 0, linetype = "dashed", color = "gray") +
        labs(
            x = "Log2 Fold Change (Head)", 
            y = "Log2 Fold Change (Thorax)", 
            color = "Regulation Pattern", 
            title = "Comparison of Log2 Fold Changes in Overlapping Genes", 
            subtitle = paste("Head vs. Thorax in", species)
        ) +
        theme_minimal() + 
        theme(
            plot.title = element_text(size = 16, face = "bold"), 
            plot.subtitle = element_text(size = 12, face = "italic"), 
            legend.position = "top"
        ) +
        scale_color_manual(values = c(
            "Upregulated in Both" = "red", 
            "Downregulated in Both" = "blue", 
            "Up in Head, Down in Thorax" = "purple", 
            "Down in Head, Up in Thorax" = "green"
        ))

    # Save the scatter plot
    ggsave(filename = file.path(workDir, "overlap/Bulk_RNAseq", paste0("scatter_plot_overlapping_genes_", species, ".png")), plot = p)

    # Display the scatter plot
    print(p)
}

nitens


species <- "nitens"  # Specify the species for which to generate plots

# Load DESeq2 results for head and thorax
head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", species, "Head", paste0("DESeq2_sigresults_sva_Head_", species, "_togregaria.csv"))
thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", species, "Thorax", paste0("DESeq2_sigresults_sva_Thorax_", species, "_togregaria.csv"))

head_data <- read.csv(head_file, stringsAsFactors = FALSE)
thorax_data <- read.csv(thorax_file, stringsAsFactors = FALSE)

# Check if data is empty and handle accordingly
if (nrow(head_data) == 0 || nrow(thorax_data) == 0) {
    message(paste("No data for species:", species))
} else {
    # Filter for significant DEGs and select upregulated and downregulated genes
    head_up <- head_data %>%
        filter(padj < 0.05 & log2FoldChange > 1) %>%
        select(GeneID = X)

    head_down <- head_data %>%
        filter(padj < 0.05 & log2FoldChange < -1) %>%
        select(GeneID = X)

    thorax_up <- thorax_data %>%
        filter(padj < 0.05 & log2FoldChange > 1) %>%
        select(GeneID = X)

    thorax_down <- thorax_data %>%
        filter(padj < 0.05 & log2FoldChange < -1) %>%
        select(GeneID = X)

    # Prepare data for Venn diagram
    venn_data <- list(
        Head_Upregulated = head_up$GeneID,
        Head_Downregulated = head_down$GeneID,
        Thorax_Upregulated = thorax_up$GeneID,
        Thorax_Downregulated = thorax_down$GeneID
    )

    # Generate the four-way Venn diagram with specified colors and legend outside
    venn_plot <- venn.diagram(
        x = venn_data, 
        category.names = c("Head Upregulated", "Head Downregulated", "Thorax Upregulated", "Thorax Downregulated"), 
        filename = NULL, 
        output = TRUE, 
        fill = c("red", "skyblue", "orange", "blue"),  # Set colors for upregulated and downregulated
        alpha = 0.5, 
        cex = 2,  # Text size for numbers
        cat.cex = 0,  # Size 0 hides the category labels; a custom legend is drawn below
        cat.pos = c(0, 0, 0, 0),  # Position to center labels
        cat.dist = c(0.1, 0.1, 0.1, 0.1),  # Distance between category labels and circles
        main = paste("Venn Diagram of DEGs for S.", species),
        main.cex = 1.2,  # Size of the main title
        cat.col = c("red", "skyblue", "orange", "blue")  # Color the category labels
    )

    # Clear the current plotting area before drawing the next Venn diagram
    grid.newpage()

    # Display the Venn diagram
    grid.draw(venn_plot)

    # Manually create a custom legend
    legend_labels <- c("Head Up", "Head Down", "Thorax Up", "Thorax Down")
    legend_colors <- c("red", "skyblue", "orange", "blue")

    # Positioning the legend lower on the right side of the plot
    legend_x <- unit(0.85, "npc")  # Adjust x position
    legend_y <- unit(0.2, "npc")    # Lower the legend vertically

    # Draw the legend
    for (i in seq_along(legend_labels)) {
        grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  width = unit(0.02, "npc"), height = unit(0.02, "npc"), 
                  gp = gpar(fill = legend_colors[i], col = NA))
        grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"), 
                  y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  just = "left", gp = gpar(cex = 0.8))
    }

    # Scatter plot for overlapping genes
    # Filter significant DEGs for both head and thorax
    head_sig_genes <- head_data %>%
        filter(padj < 0.05 & abs(log2FoldChange) > 1) %>%
        select(GeneID = X, log2FoldChange, padj) 

    thorax_sig_genes <- thorax_data %>%
        filter(padj < 0.05 & abs(log2FoldChange) > 1) %>%
        select(GeneID = X, log2FoldChange, padj)

    # Find overlapping genes based on GeneID
    overlapping_genes <- inner_join(head_sig_genes, thorax_sig_genes, by = "GeneID", suffix = c("_head", "_thorax"))

    # Save the overlapping genes to a CSV file
    output_file <- file.path(workDir, "overlap/Bulk_RNAseq", paste0("overlapping_genes_head_thorax_", species, ".csv"))
    write.csv(overlapping_genes, output_file, row.names = FALSE)

    # Plot overlapping genes with scatter plot
    p <- ggplot(overlapping_genes, aes(x = log2FoldChange_head, y = log2FoldChange_thorax)) +
        geom_point(aes(color = case_when(
            log2FoldChange_head > 0 & log2FoldChange_thorax > 0 ~ "Upregulated in Both",
            log2FoldChange_head < 0 & log2FoldChange_thorax < 0 ~ "Downregulated in Both",
            log2FoldChange_head > 0 & log2FoldChange_thorax < 0 ~ "Up in Head, Down in Thorax",
            log2FoldChange_head < 0 & log2FoldChange_thorax > 0 ~ "Down in Head, Up in Thorax"
        )), size = 3, alpha = 0.7) +
        geom_abline(slope = 1, intercept = 0, linetype = "dashed", color = "gray") +
        labs(
            x = "Log2 Fold Change (Head)", 
            y = "Log2 Fold Change (Thorax)", 
            color = "Regulation Pattern", 
            title = "Comparison of Log2 Fold Changes in Overlapping Genes", 
            subtitle = paste("Head vs. Thorax in", species)
        ) +
        theme_minimal() + 
        theme(
            plot.title = element_text(size = 16, face = "bold"), 
            plot.subtitle = element_text(size = 12, face = "italic"), 
            legend.position = "top"
        ) +
        scale_color_manual(values = c(
            "Upregulated in Both" = "red", 
            "Downregulated in Both" = "blue", 
            "Up in Head, Down in Thorax" = "purple", 
            "Down in Head, Up in Thorax" = "green"
        ))

    # Save the scatter plot
    ggsave(filename = file.path(workDir, "overlap/Bulk_RNAseq", paste0("scatter_plot_overlapping_genes_", species, ".png")), plot = p)

    # Display the scatter plot
    print(p)
}
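
The join and the four-way labelling used for the scatter plot can be checked on a small toy table. The sketch below uses base R only; the gene IDs and fold changes are made up, and `classify_pattern()` is a hypothetical helper that is not part of the original analysis.

```r
# Toy data standing in for the filtered head and thorax DEG tables (values are invented)
head_sig <- data.frame(GeneID = c("g1", "g2", "g3", "g4"),
                       log2FoldChange = c(2.1, -1.5, 3.0, -2.2),
                       stringsAsFactors = FALSE)
thorax_sig <- data.frame(GeneID = c("g2", "g3", "g4", "g5"),
                         log2FoldChange = c(-1.8, -2.4, 1.3, 2.0),
                         stringsAsFactors = FALSE)

# Inner join on GeneID; the suffixes mirror the inner_join() call above
overlap <- merge(head_sig, thorax_sig, by = "GeneID",
                 suffixes = c("_head", "_thorax"))

# Hypothetical helper: label each gene by the signs of its two fold changes.
# It assumes neither value is zero, which holds after the |log2FC| > 1 filter.
classify_pattern <- function(lfc_head, lfc_thorax) {
  ifelse(lfc_head > 0 & lfc_thorax > 0, "Upregulated in Both",
  ifelse(lfc_head < 0 & lfc_thorax < 0, "Downregulated in Both",
  ifelse(lfc_head > 0 & lfc_thorax < 0, "Up in Head, Down in Thorax",
         "Down in Head, Up in Thorax")))
}

overlap$pattern <- classify_pattern(overlap$log2FoldChange_head,
                                    overlap$log2FoldChange_thorax)
print(overlap)
```

Computing the label as a column before plotting, rather than inside `aes()`, also makes it possible to tabulate how many genes fall in each quadrant.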

3. Overlap DEGs among species

Locusts

Head tissues

# Define the species for Group 1
locusts <- c("gregaria", "piceifrons", "cancellata")

# Initialize an empty list to store DEG data
venn_data_locusts_up <- list()
venn_data_locusts_down <- list()
venn_data_locusts_all <- list()

# Function to load DEGs for a given group of species for head
load_deg_data <- function(species_list) {
  degs_up <- list()
  degs_down <- list()
  degs_all <- list()
  
  for (species in species_list) {  # Iterate over the argument, not the global vector
    head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq/", paste0(species, "/Head/DESeq2_sigresults_sva_Head_", species,"_togregaria.csv"))
    
    head_data <- read.csv(head_file, stringsAsFactors = FALSE)
    
    # Check if data is empty and handle accordingly
    if (nrow(head_data) == 0) {
      message(paste("No data for species:", species))
      next  # Skip to the next species if there's no data
    }
    
    # Filter for significant DEGs (both upregulated and downregulated)
    head_up <- head_data %>%
      filter(padj < 0.05 & log2FoldChange >= 1) %>%
      select(GeneID = X)
    
    head_down <- head_data %>%
      filter(padj < 0.05 & log2FoldChange <= -1) %>%
      select(GeneID = X)
    
    all_deg <- head_data %>%
      filter(padj < 0.05 & abs(log2FoldChange) >= 1) %>%
      select(GeneID = X)

    # Store the DEGs in the list
    degs_up[[species]] <- head_up$GeneID
    degs_down[[species]] <- head_down$GeneID
    degs_all[[species]] <- all_deg$GeneID
  }
  
  return(list(up = degs_up, down = degs_down, all = degs_all))
}
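
Because the same loader is redefined for each tissue and species group, the filtering step could be factored out. The following is a minimal sketch, assuming each DESeq2 table has the columns `X`, `log2FoldChange` and `padj` as in the code above. `split_degs()` and the toy tables are hypothetical and only illustrate the idea.

```r
# Hypothetical helper: split one DESeq2 result table into up / down / all gene ID vectors
split_degs <- function(deg_table, id_col = "X", padj_cut = 0.05, lfc_cut = 1) {
  lfc <- deg_table$log2FoldChange
  # Drop rows with missing statistics so they never reach the logical index
  keep <- !is.na(deg_table$padj) & !is.na(lfc) & deg_table$padj < padj_cut
  ids <- deg_table[[id_col]]
  list(
    up   = ids[keep & lfc >=  lfc_cut],
    down = ids[keep & lfc <= -lfc_cut],
    all  = ids[keep & abs(lfc) >= lfc_cut]
  )
}

# Toy tables standing in for the per-species CSV files (values are invented)
toy_tables <- list(
  gregaria = data.frame(X = c("g1", "g2", "g3"),
                        log2FoldChange = c(1.5, -2.0, 0.4),
                        padj = c(0.01, 0.001, 0.02),
                        stringsAsFactors = FALSE),
  piceifrons = data.frame(X = c("g1", "g2", "g4"),
                          log2FoldChange = c(2.2, -1.1, -3.0),
                          padj = c(0.03, 0.20, 0.01),
                          stringsAsFactors = FALSE)
)

per_species <- lapply(toy_tables, split_degs)

# Regroup so each direction holds one vector per species, like load_deg_data() returns
degs <- list(
  up   = lapply(per_species, `[[`, "up"),
  down = lapply(per_species, `[[`, "down"),
  all  = lapply(per_species, `[[`, "all")
)
str(degs)
```

With a helper like this, the tissue name would only be needed when building the file path, so a single loader could serve both head and thorax.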

# Load DEG data for Group 1 for head
venn_data_locusts <- load_deg_data(locusts)

# Prepare the data for the Venn diagrams
venn_data_up <- list(
  gregaria = venn_data_locusts$up[["gregaria"]],
  piceifrons = venn_data_locusts$up[["piceifrons"]],
  cancellata = venn_data_locusts$up[["cancellata"]]
)

venn_data_down <- list(
  gregaria = venn_data_locusts$down[["gregaria"]],
  piceifrons = venn_data_locusts$down[["piceifrons"]],
  cancellata = venn_data_locusts$down[["cancellata"]]
)

venn_data_all <- list(
  gregaria = venn_data_locusts$all[["gregaria"]],
  piceifrons = venn_data_locusts$all[["piceifrons"]],
  cancellata = venn_data_locusts$all[["cancellata"]]
)

# Function to display Venn diagram and corresponding datatable
display_venn_with_datatable <- function(venn_data, title, allspecies_df) {
  # Calculate the overlapping genes
  overlap_genes <- Reduce(intersect, venn_data)
  
  # Create a data frame for the overlapping genes
  overlap_df <- data.frame(GeneID = overlap_genes)

  # Merge to get species information
  meta_brock_df <- merge(overlap_df, allspecies_df, by = "GeneID", all.x = TRUE)

  # Generate the Venn diagram
  venn_plot <- venn.diagram(
    x = venn_data, 
    category.names = c("gregaria", "piceifrons", "cancellata"), 
    filename = NULL, 
    output = TRUE, 
    fill = c("orange", "red", "orchid"),  # Set colors for the groups
    alpha = 0.5, 
    cex = 2, 
    cat.cex = 0, 
    main = title,
    main.cex = 1.2
  )

  # Clear the current plotting area before drawing the Venn diagram
  grid.newpage()
  
  # Display the Venn diagram
  grid.draw(venn_plot)

  # Manually create a custom legend
  legend_labels <- c("gregaria", "piceifrons", "cancellata")
  legend_colors <- c("orange", "red", "orchid")

  # Positioning the legend lower on the right side of the plot
  legend_x <- unit(0.85, "npc")  # Adjust x position
  legend_y <- unit(0.2, "npc")   # Lower the legend vertically

  # Draw the legend
  for (i in seq_along(legend_labels)) {
    grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"),
              width = unit(0.02, "npc"), height = unit(0.02, "npc"),
              gp = gpar(fill = legend_colors[i], col = NA))
    grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"),
              y = legend_y - unit((i - 1) * 0.05, "npc"),
              just = "left", gp = gpar(cex = 0.8))
  }

  # Display the merged overlapping genes table with datatable
  datatable(meta_brock_df, options = list(
      pageLength = 10,
      scrollX = TRUE,
      autoWidth = TRUE,
      searchHighlight = TRUE
  ),
  rownames = FALSE,
  escape = FALSE
  ) %>%
  formatStyle(
      'Species', target = 'cell',
      fontStyle = 'italic'
  ) %>%
  formatStyle(
      columns = names(meta_brock_df), 
      target = 'row',
      color = styleEqual(c("red", "blue", "black"), c("red", "blue", "black")),
      fontWeight = styleEqual(c("bold", "normal"), c("bold", "normal")),
      backgroundColor = styleEqual(c("red", "blue", "black"), c("white", "white", "white"))
  )
}
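
The table shown under each Venn diagram lists only the genes shared by all three species, which is what `Reduce(intersect, venn_data)` computes. A small base R illustration with invented gene IDs follows; the `unique_to` list is an extra, hypothetical summary that the original function does not produce.

```r
# Toy DEG sets, one character vector of gene IDs per species (IDs are invented)
deg_sets <- list(
  gregaria   = c("g1", "g2", "g3", "g5"),
  piceifrons = c("g2", "g3", "g4", "g5"),
  cancellata = c("g3", "g5", "g6")
)

# Genes present in every set: intersect() is applied pairwise from left to right
shared_all <- Reduce(intersect, deg_sets)

# Genes found in exactly one species (the outer, non-overlapping Venn regions)
unique_to <- lapply(names(deg_sets), function(sp) {
  others <- unlist(deg_sets[names(deg_sets) != sp], use.names = FALSE)
  setdiff(deg_sets[[sp]], others)
})
names(unique_to) <- names(deg_sets)

print(shared_all)
print(unique_to)
```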

# Display the Venn diagram and datatable for head upregulated DEGs
display_venn_with_datatable(venn_data_up, "Venn Diagram of Head Upregulated DEGs - Locusts", allspecies_df)

Version Author Date
4e391c3 Maeva TECHER 2025-05-30
b540a1e Maeva TECHER 2025-02-27
89984c0 Maeva TECHER 2025-02-19
d7fa779 Maeva TECHER 2025-02-14
aab712a Maeva TECHER 2025-02-04
faf2db3 Maeva TECHER 2025-01-13
fe6dae9 Maeva TECHER 2024-11-19
# Display the Venn diagram and datatable for head downregulated DEGs
display_venn_with_datatable(venn_data_down, "Venn Diagram of Head Downregulated DEGs - Locusts", allspecies_df)

# Display the Venn diagram and datatable for all significant DEGs
display_venn_with_datatable(venn_data_all, "Venn Diagram of All Significant Head DEGs - Locusts", allspecies_df)

# Define the species for Group 1
locusts <- c("gregaria", "piceifrons", "cancellata")

# Initialize an empty list to store heatmap data for each species
heatmap_list <- list()

# Loop through each species to process their data
for (species in locusts) {
  # Load DESeq2 results for head
  head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq/", paste0(species, "/Head/DESeq2_sigresults_sva_Head_", species,"_togregaria.csv"))
  
  # Load the data using fread() for memory efficiency
  head_data <- fread(head_file, data.table = FALSE)
  
  # Check if data is empty and handle accordingly
  if (nrow(head_data) == 0) {
    message(paste("No data for species:", species))
    next  # Skip to the next species if there's no data
  }
  
  # Filter significant DEGs first (reduces memory use in sorting)
  head_data_filtered <- head_data %>%
    filter(padj < 0.05, abs(log2FoldChange) > 1)  # Keep only strong up/downregulated DEGs
  
  # Select up to 500 of the most upregulated and 500 of the most downregulated genes
  head_up <- head_data_filtered %>%
    filter(log2FoldChange > 1) %>%
    arrange(desc(log2FoldChange)) %>%
    slice_head(n = 500)   # Returns fewer rows if fewer than 500 are available
  
  head_down <- head_data_filtered %>%
    filter(log2FoldChange < -1) %>%
    arrange(log2FoldChange) %>%
    slice_head(n = 500)   # Returns fewer rows if fewer than 500 are available
  
  # Combine data for heatmap, adding the species column
  heatmap_data <- bind_rows(
    head_up %>% mutate(Tissue = "Head", Regulation = "Upregulated", Species = species),
    head_down %>% mutate(Tissue = "Head", Regulation = "Downregulated", Species = species)
  ) %>%
    select(GeneID, log2FoldChange, Tissue, Regulation, Species)
  
  # Append the heatmap data to the list
  heatmap_list[[species]] <- heatmap_data
}

# Combine all species data into a single dataframe for heatmap matrix preparation
final_heatmap_data <- bind_rows(heatmap_list)

# Check if final_heatmap_data is empty before proceeding
if (nrow(final_heatmap_data) == 0) {
  stop("No valid data available for heatmap generation.")
}

# Fix duplicate GeneIDs: aggregate log2FoldChange by taking the mean
final_heatmap_data <- final_heatmap_data %>%
  group_by(GeneID, Species) %>%
  summarise(log2FoldChange = mean(log2FoldChange, na.rm = TRUE), .groups = "drop")

# Create heatmap matrix without duplicates
heatmap_matrix <- final_heatmap_data %>%
  pivot_wider(names_from = Species, values_from = log2FoldChange, values_fill = 0) %>%
  column_to_rownames("GeneID") %>%
  as.matrix()
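
Note that `values_fill = 0` in the pivot above gives a zero to every gene that is absent from a species' top-500 lists, so a zero in the matrix means "not selected for this species" rather than a measured fold change of 0. The base R sketch below, using invented values, shows one way to keep track of which cells were filled. The `was_missing` mask is a hypothetical addition, not part of the original workflow.

```r
# Toy long-format table: one row per gene and species (values are invented)
long <- data.frame(
  GeneID = c("g1", "g1", "g2", "g3"),
  Species = c("spA", "spB", "spA", "spB"),
  log2FoldChange = c(2.0, 1.5, -3.0, -1.2),
  stringsAsFactors = FALSE
)

# Genes x species matrix; combinations that do not occur come out as NA
wide <- tapply(long$log2FoldChange, list(long$GeneID, long$Species), mean)

# Remember which cells had no data before filling them with 0
was_missing <- is.na(wide)
filled <- wide
filled[was_missing] <- 0

print(filled)
print(sum(was_missing))  # number of gene/species cells that were filled in
```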

# Define color palettes
# Define a custom color gradient where 0 is black
custom_color_palette1 <- colorRampPalette(c("cyan", "cyan3", "black", "orange3", "orange"))(100)

# Define a custom color gradient where 0 is white
custom_color_palette2 <- colorRampPalette(c("blue3", "blue", "white", "red", "red3"))(100)

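# Define color breaks so that the palette midpoint (white or black) sits at 0
max_abs_lfc <- max(abs(heatmap_matrix), na.rm = TRUE)  # Get max absolute log2FoldChange
# Symmetric scale; pheatmap expects one more break than there are colors
color_breaks <- seq(-max_abs_lfc, max_abs_lfc, length.out = length(custom_color_palette2) + 1)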

# Heatmap with the blue-white-red palette (genes clustered, species order fixed)
pheatmap(
  heatmap_matrix,
  color = custom_color_palette2,
  breaks = color_breaks,
  cluster_rows = TRUE,  # Cluster genes
  cluster_cols = FALSE,  # Do not cluster species
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Gene Expression in Head Tissue - STRATEGY 1"
)
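
For a diverging palette to be centred on zero, the breaks need to be symmetric around 0, and `pheatmap` documents `breaks` as being one element longer than the color vector. A minimal sketch of that relationship follows, using a hypothetical helper `symmetric_breaks()` and an invented toy matrix.

```r
# Hypothetical helper: n_colors + 1 evenly spaced breaks, symmetric around zero
symmetric_breaks <- function(mat, n_colors) {
  limit <- max(abs(mat), na.rm = TRUE)
  seq(-limit, limit, length.out = n_colors + 1)
}

# Diverging palette with white in the middle
pal <- colorRampPalette(c("blue", "white", "red"))(100)

# Toy log2 fold change matrix (values are invented)
toy_matrix <- matrix(c(-3, 0, 1.5, 2, -0.5, 4), nrow = 3,
                     dimnames = list(c("g1", "g2", "g3"), c("spA", "spB")))

brks <- symmetric_breaks(toy_matrix, length(pal))

length(brks)                  # one more than length(pal)
range(brks)                   # spans -max|x| to +max|x|
brks[length(pal) / 2 + 1]     # the middle break, which sits at (numerically) zero
```

With an even number of colors, zero falls on the boundary between the two central colors. An odd palette length would instead place zero inside a single central color bin.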

# Same heatmap with the cyan-black-orange palette
pheatmap(
  heatmap_matrix,
  color = custom_color_palette1,
  breaks = color_breaks,
  cluster_rows = TRUE,  
  cluster_cols = FALSE,  
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Gene Expression in Head Tissue - STRATEGY 1"
)


Thorax tissues

# Define the species for Group 1
locusts <- c("gregaria", "piceifrons", "cancellata")

# Initialize an empty list to store DEG data
venn_data_locusts_up <- list()
venn_data_locusts_down <- list()
venn_data_locusts_all <- list()

# Function to load DEGs for a given group of species for thorax
load_deg_data <- function(species_list) {
  degs_up <- list()
  degs_down <- list()
  degs_all <- list()
  
  for (species in species_list) {  # Iterate over the argument, not the global vector
    thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq/", paste0(species, "/Thorax/DESeq2_sigresults_sva_Thorax_", species,"_togregaria.csv"))
    
    thorax_data <- read.csv(thorax_file, stringsAsFactors = FALSE)
    
    # Check if data is empty and handle accordingly
    if (nrow(thorax_data) == 0) {
      message(paste("No data for species:", species))
      next  # Skip to the next species if there's no data
    }
    
    # Filter for significant DEGs (both upregulated and downregulated)
    thorax_up <- thorax_data %>%
      filter(padj < 0.05 & log2FoldChange >= 1) %>%
      select(GeneID = X)
    
    thorax_down <- thorax_data %>%
      filter(padj < 0.05 & log2FoldChange <= -1) %>%
      select(GeneID = X)
    
    all_deg <- thorax_data %>%
      filter(padj < 0.05 & abs(log2FoldChange) >= 1) %>%
      select(GeneID = X)

    # Store the DEGs in the list
    degs_up[[species]] <- thorax_up$GeneID
    degs_down[[species]] <- thorax_down$GeneID
    degs_all[[species]] <- all_deg$GeneID
  }
  
  return(list(up = degs_up, down = degs_down, all = degs_all))
}

# Load DEG data for Group 1 for thorax
venn_data_locusts <- load_deg_data(locusts)

# Prepare the data for the Venn diagrams
venn_data_up <- list(
  gregaria = venn_data_locusts$up[["gregaria"]],
  piceifrons = venn_data_locusts$up[["piceifrons"]],
  cancellata = venn_data_locusts$up[["cancellata"]]
)

venn_data_down <- list(
  gregaria = venn_data_locusts$down[["gregaria"]],
  piceifrons = venn_data_locusts$down[["piceifrons"]],
  cancellata = venn_data_locusts$down[["cancellata"]]
)

venn_data_all <- list(
  gregaria = venn_data_locusts$all[["gregaria"]],
  piceifrons = venn_data_locusts$all[["piceifrons"]],
  cancellata = venn_data_locusts$all[["cancellata"]]
)

# Function to display Venn diagram and corresponding datatable
display_venn_with_datatable <- function(venn_data, title, allspecies_df) {
  # Calculate the overlapping genes
  overlap_genes <- Reduce(intersect, venn_data)
  
  # Create a data frame for the overlapping genes
  overlap_df <- data.frame(GeneID = overlap_genes)

  # Merge to get species information
  meta_brock_df <- merge(overlap_df, allspecies_df, by = "GeneID", all.x = TRUE)

  # Generate the Venn diagram
  venn_plot <- venn.diagram(
    x = venn_data, 
    category.names = c("gregaria", "piceifrons", "cancellata"), 
    filename = NULL, 
    output = TRUE, 
    fill = c("orange", "red", "orchid"),  # Set colors for the groups
    alpha = 0.5, 
    cex = 2, 
    cat.cex = 0, 
    main = title,
    main.cex = 1.2
  )

  # Clear the current plotting area before drawing the Venn diagram
  grid.newpage()
  
  # Display the Venn diagram
  grid.draw(venn_plot)

  # Manually create a custom legend
  legend_labels <- c("gregaria", "piceifrons", "cancellata")
  legend_colors <- c("orange", "red", "orchid")

  # Positioning the legend lower on the right side of the plot
  legend_x <- unit(0.85, "npc")  # Adjust x position
  legend_y <- unit(0.2, "npc")   # Lower the legend vertically

  # Draw the legend
  for (i in seq_along(legend_labels)) {
    grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"),
              width = unit(0.02, "npc"), height = unit(0.02, "npc"),
              gp = gpar(fill = legend_colors[i], col = NA))
    grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"),
              y = legend_y - unit((i - 1) * 0.05, "npc"),
              just = "left", gp = gpar(cex = 0.8))
  }

  # Display the merged overlapping genes table with datatable
  datatable(meta_brock_df, options = list(
      pageLength = 10,
      scrollX = TRUE,
      autoWidth = TRUE,
      searchHighlight = TRUE
  ),
  rownames = FALSE,
  escape = FALSE
  ) %>%
  formatStyle(
      'Species', target = 'cell',
      fontStyle = 'italic'
  ) %>%
  formatStyle(
      columns = names(meta_brock_df), 
      target = 'row',
      color = styleEqual(c("red", "blue", "black"), c("red", "blue", "black")),
      fontWeight = styleEqual(c("bold", "normal"), c("bold", "normal")),
      backgroundColor = styleEqual(c("red", "blue", "black"), c("white", "white", "white"))
  )
}

# Display the Venn diagram and datatable for thorax upregulated DEGs
display_venn_with_datatable(venn_data_up, "Venn Diagram of Thorax Upregulated DEGs - Locusts", allspecies_df)

# Display the Venn diagram and datatable for thorax downregulated DEGs
display_venn_with_datatable(venn_data_down, "Venn Diagram of Thorax Downregulated DEGs - Locusts", allspecies_df)

# Display the Venn diagram and datatable for all significant DEGs
display_venn_with_datatable(venn_data_all, "Venn Diagram of All Significant Thorax DEGs - Locusts", allspecies_df)

# Initialize an empty list to store heatmap data for each species
heatmap_list <- list()

# Loop through each species to process their data
for (species in locusts) {
  # Load DESeq2 results for thorax
  thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq/", paste0(species, "/Thorax/DESeq2_sigresults_sva_Thorax_", species,"_togregaria.csv"))
  
  # Load the data using fread() for memory efficiency
  thorax_data <- fread(thorax_file, data.table = FALSE)
  
  # Check if data is empty and handle accordingly
  if (nrow(thorax_data) == 0) {
    message(paste("No data for species:", species))
    next  # Skip to the next species if there's no data
  }
  
  # Filter significant DEGs first (reduces memory use in sorting)
  thorax_data_filtered <- thorax_data %>%
    filter(padj < 0.05, abs(log2FoldChange) > 1)  # Keep only strong up/downregulated DEGs
  
  # Select up to 500 of the most upregulated and 500 of the most downregulated genes
  thorax_up <- thorax_data_filtered %>%
    filter(log2FoldChange > 1) %>%
    arrange(desc(log2FoldChange)) %>%
    slice_head(n = 500)   # Returns fewer rows if fewer than 500 are available
  
  thorax_down <- thorax_data_filtered %>%
    filter(log2FoldChange < -1) %>%
    arrange(log2FoldChange) %>%
    slice_head(n = 500)   # Returns fewer rows if fewer than 500 are available
  
  # Combine data for heatmap, adding the species column
  heatmap_data <- bind_rows(
    thorax_up %>% mutate(Tissue = "Thorax", Regulation = "Upregulated", Species = species),
    thorax_down %>% mutate(Tissue = "Thorax", Regulation = "Downregulated", Species = species)
  ) %>%
    select(GeneID, log2FoldChange, Tissue, Regulation, Species)
  
  # Append the heatmap data to the list
  heatmap_list[[species]] <- heatmap_data
}

# Combine all species data into a single dataframe for heatmap matrix preparation
final_heatmap_data <- bind_rows(heatmap_list)

# Check if final_heatmap_data is empty before proceeding
if (nrow(final_heatmap_data) == 0) {
  stop("No valid data available for heatmap generation.")
}

# Fix duplicate GeneIDs: aggregate log2FoldChange by taking the mean
final_heatmap_data <- final_heatmap_data %>%
  group_by(GeneID, Species) %>%
  summarise(log2FoldChange = mean(log2FoldChange, na.rm = TRUE), .groups = "drop")

# Create heatmap matrix without duplicates
heatmap_matrix <- final_heatmap_data %>%
  pivot_wider(names_from = Species, values_from = log2FoldChange, values_fill = 0) %>%
  column_to_rownames("GeneID") %>%
  as.matrix()

# Define color palettes
# Define a custom color gradient where 0 is black
custom_color_palette1 <- colorRampPalette(c("cyan", "cyan3", "black", "orange3", "orange"))(100)

# Define a custom color gradient where 0 is white
custom_color_palette2 <- colorRampPalette(c("blue3", "blue", "white", "red", "red3"))(100)

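# Define color breaks so that the palette midpoint (white or black) sits at 0
max_abs_lfc <- max(abs(heatmap_matrix), na.rm = TRUE)  # Get max absolute log2FoldChange
# Symmetric scale; pheatmap expects one more break than there are colors
color_breaks <- seq(-max_abs_lfc, max_abs_lfc, length.out = length(custom_color_palette2) + 1)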

# Heatmap with the blue-white-red palette (genes clustered, species order fixed)
pheatmap(
  heatmap_matrix,
  color = custom_color_palette2,
  breaks = color_breaks,
  cluster_rows = TRUE,  # Cluster genes
  cluster_cols = FALSE,  # Do not cluster species
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Gene Expression in Thorax Tissue - STRATEGY 1"
)

# Same heatmap with the cyan-black-orange palette
pheatmap(
  heatmap_matrix,
  color = custom_color_palette1,
  breaks = color_breaks,
  cluster_rows = TRUE,  
  cluster_cols = FALSE,  
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Gene Expression in Thorax Tissue - STRATEGY 1"
)


piceifrons-americana-cubense

Head tissues

PACclade <- c("piceifrons", "americana", "cubense")

# Initialize an empty list to store DEG data
venn_data_PACclade_up <- list()
venn_data_PACclade_down <- list()
venn_data_PACclade_all <- list()

# Function to load DEGs for a given group of species for head
load_deg_data <- function(species_list) {
  degs_up <- list()
  degs_down <- list()
  degs_all <- list()
  
  for (species in species_list) {  # Iterate over the argument, not the global vector
    head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq/", paste0(species, "/Head/DESeq2_sigresults_sva_Head_", species,"_togregaria.csv"))
    
    head_data <- read.csv(head_file, stringsAsFactors = FALSE)
    
    # Check if data is empty and handle accordingly
    if (nrow(head_data) == 0) {
      message(paste("No data for species:", species))
      next  # Skip to the next species if there's no data
    }
    
    # Filter for significant DEGs (both upregulated and downregulated)
    head_up <- head_data %>%
      filter(padj < 0.05 & log2FoldChange >= 1) %>%
      select(GeneID = X)
    
    head_down <- head_data %>%
      filter(padj < 0.05 & log2FoldChange <= -1) %>%
      select(GeneID = X)
    
    all_deg <- head_data %>%
      filter(padj < 0.05 & abs(log2FoldChange) >= 1) %>%
      select(GeneID = X)

    # Store the DEGs in the list
    degs_up[[species]] <- head_up$GeneID
    degs_down[[species]] <- head_down$GeneID
    degs_all[[species]] <- all_deg$GeneID
  }
  
  return(list(up = degs_up, down = degs_down, all = degs_all))
}

# Load DEG data for the PAC clade for head
venn_data_PACclade <- load_deg_data(PACclade)

# Prepare the data for the Venn diagrams
venn_data_up <- list(
  piceifrons = venn_data_PACclade$up[["piceifrons"]],
  americana = venn_data_PACclade$up[["americana"]],
  cubense = venn_data_PACclade$up[["cubense"]]
)

venn_data_down <- list(
  piceifrons = venn_data_PACclade$down[["piceifrons"]],
  americana = venn_data_PACclade$down[["americana"]],
  cubense = venn_data_PACclade$down[["cubense"]]
)

venn_data_all <- list(
  piceifrons = venn_data_PACclade$all[["piceifrons"]],
  americana = venn_data_PACclade$all[["americana"]],
  cubense = venn_data_PACclade$all[["cubense"]]
)

# Function to display Venn diagram and corresponding datatable
display_venn_with_datatable <- function(venn_data, title, allspecies_df) {
  # Calculate the overlapping genes
  overlap_genes <- Reduce(intersect, venn_data)
  
  # Create a data frame for the overlapping genes
  overlap_df <- data.frame(GeneID = overlap_genes)

  # Merge to get species information
  meta_brock_df <- merge(overlap_df, allspecies_df, by = "GeneID", all.x = TRUE)

  # Generate the Venn diagram
  venn_plot <- venn.diagram(
    x = venn_data, 
    category.names = c("piceifrons", "americana", "cubense"), 
    filename = NULL, 
    output = TRUE, 
    fill = c("red", "green", "yellow"),  # Set colors for the groups
    alpha = 0.5, 
    cex = 2, 
    cat.cex = 0, 
    main = title,
    main.cex = 1.2
  )

  # Clear the current plotting area before drawing the Venn diagram
  grid.newpage()
  
  # Display the Venn diagram
  grid.draw(venn_plot)

  # Manually create a custom legend
  legend_labels <- c("piceifrons", "americana", "cubense")
  legend_colors <- c("red", "green", "yellow")

  # Positioning the legend lower on the right side of the plot
  legend_x <- unit(0.85, "npc")  # Adjust x position
  legend_y <- unit(0.2, "npc")   # Lower the legend vertically

  # Draw the legend
  for (i in seq_along(legend_labels)) {
    grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"),
              width = unit(0.02, "npc"), height = unit(0.02, "npc"),
              gp = gpar(fill = legend_colors[i], col = NA))
    grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"),
              y = legend_y - unit((i - 1) * 0.05, "npc"),
              just = "left", gp = gpar(cex = 0.8))
  }

  # Display the merged overlapping genes table with datatable
  datatable(meta_brock_df, options = list(
      pageLength = 10,
      scrollX = TRUE,
      autoWidth = TRUE,
      searchHighlight = TRUE
  ),
  rownames = FALSE,
  escape = FALSE
  ) %>%
  formatStyle(
      'Species', target = 'cell',
      fontStyle = 'italic'
  ) %>%
  formatStyle(
      columns = names(meta_brock_df), 
      target = 'row',
      color = styleEqual(c("red", "blue", "black"), c("red", "blue", "black")),
      fontWeight = styleEqual(c("bold", "normal"), c("bold", "normal")),
      backgroundColor = styleEqual(c("red", "blue", "black"), c("white", "white", "white"))
  )
}

# Display the Venn diagram and datatable for head upregulated DEGs
display_venn_with_datatable(venn_data_up, "Venn Diagram of Head Upregulated DEGs - PACclade", allspecies_df)

# Display the Venn diagram and datatable for head downregulated DEGs
display_venn_with_datatable(venn_data_down, "Venn Diagram of Head Downregulated DEGs - PACclade", allspecies_df)

# Display the Venn diagram and datatable for all significant DEGs
display_venn_with_datatable(venn_data_all, "Venn Diagram of All Significant Head DEGs - PACclade", allspecies_df)

# Define the species for PACclade
PACclade <- c("piceifrons", "americana", "cubense")

# Initialize an empty list to store heatmap data for each species
heatmap_list <- list()

# Loop through each species to process their data
for (species in PACclade) {
  # Load DESeq2 results for head
  head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq/", paste0(species, "/Head/DESeq2_sigresults_sva_Head_", species,"_togregaria.csv"))
  
  # Load the data using fread() for memory efficiency
  head_data <- fread(head_file, data.table = FALSE)
  
  # Check if data is empty and handle accordingly
  if (nrow(head_data) == 0) {
    message(paste("No data for species:", species))
    next  # Skip to the next species if there's no data
  }
  
  # Filter significant DEGs first (reduces memory use in sorting)
  head_data_filtered <- head_data %>%
    filter(padj < 0.05, abs(log2FoldChange) > 1)  # Keep only strong up/downregulated DEGs
  
  # Select up to 500 of the most upregulated and 500 of the most downregulated genes
  head_up <- head_data_filtered %>%
    filter(log2FoldChange > 1) %>%
    arrange(desc(log2FoldChange)) %>%
    slice_head(n = 500)   # Returns fewer rows if fewer than 500 are available
  
  head_down <- head_data_filtered %>%
    filter(log2FoldChange < -1) %>%
    arrange(log2FoldChange) %>%
    slice_head(n = 500)   # Returns fewer rows if fewer than 500 are available
  
  # Combine data for heatmap, adding the species column
  heatmap_data <- bind_rows(
    head_up %>% mutate(Tissue = "Head", Regulation = "Upregulated", Species = species),
    head_down %>% mutate(Tissue = "Head", Regulation = "Downregulated", Species = species)
  ) %>%
    select(GeneID, log2FoldChange, Tissue, Regulation, Species)
  
  # Append the heatmap data to the list
  heatmap_list[[species]] <- heatmap_data
}

# Combine all species data into a single dataframe for heatmap matrix preparation
final_heatmap_data <- bind_rows(heatmap_list)

# Check if final_heatmap_data is empty before proceeding
if (nrow(final_heatmap_data) == 0) {
  stop("No valid data available for heatmap generation.")
}

# Fix duplicate GeneIDs: aggregate log2FoldChange by taking the mean
final_heatmap_data <- final_heatmap_data %>%
  group_by(GeneID, Species) %>%
  summarise(log2FoldChange = mean(log2FoldChange, na.rm = TRUE), .groups = "drop")

# Create heatmap matrix without duplicates
heatmap_matrix <- final_heatmap_data %>%
  pivot_wider(names_from = Species, values_from = log2FoldChange, values_fill = 0) %>%
  column_to_rownames("GeneID") %>%
  as.matrix()

# Define color palettes
# Define a custom color gradient where 0 is black
custom_color_palette1 <- colorRampPalette(c("cyan", "cyan3", "black", "orange3", "orange"))(100)

# Define a custom color gradient where 0 is white
custom_color_palette2 <- colorRampPalette(c("blue3", "blue", "white", "red", "red3"))(100)

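# Define color breaks so that the palette midpoint (white or black) sits at 0
max_abs_lfc <- max(abs(heatmap_matrix), na.rm = TRUE)  # Get max absolute log2FoldChange
# Symmetric scale; pheatmap expects one more break than there are colors
color_breaks <- seq(-max_abs_lfc, max_abs_lfc, length.out = length(custom_color_palette2) + 1)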

# Heatmap with the blue-white-red palette (genes clustered, species order fixed)
pheatmap(
  heatmap_matrix,
  color = custom_color_palette2,
  breaks = color_breaks,
  cluster_rows = TRUE,  # Cluster genes
  cluster_cols = FALSE,  # Do not cluster species
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Gene Expression in Head Tissue - STRATEGY 1"
)

# Same heatmap with the cyan-black-orange palette
pheatmap(
  heatmap_matrix,
  color = custom_color_palette1,
  breaks = color_breaks,
  cluster_rows = TRUE,  
  cluster_cols = FALSE,  
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Gene Expression in Head Tissue - STRATEGY 1"
)


Thorax tissues

# Define the species for PACclade
PACclade <- c("piceifrons", "americana", "cubense")

# Initialize an empty list to store DEG data
venn_data_PACclade_up <- list()
venn_data_PACclade_down <- list()
venn_data_PACclade_all <- list()

# Function to load DEGs for a given group of species for thorax
load_deg_data <- function(species_list) {
  degs_up <- list()
  degs_down <- list()
  degs_all <- list()
  
  for (species in species_list) {  # Iterate over the argument, not the global vector
    thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq/", paste0(species, "/Thorax/DESeq2_sigresults_sva_Thorax_", species,"_togregaria.csv"))
    
    thorax_data <- read.csv(thorax_file, stringsAsFactors = FALSE)
    
    # Check if data is empty and handle accordingly
    if (nrow(thorax_data) == 0) {
      message(paste("No data for species:", species))
      next  # Skip to the next species if there's no data
    }
    
    # Filter for significant DEGs (both upregulated and downregulated)
    thorax_up <- thorax_data %>%
      filter(padj < 0.05 & log2FoldChange >= 1) %>%
      select(GeneID = X)
    
    thorax_down <- thorax_data %>%
      filter(padj < 0.05 & log2FoldChange <= -1) %>%
      select(GeneID = X)
    
    all_deg <- thorax_data %>%
      filter(padj < 0.05 & abs(log2FoldChange) >= 1) %>%
      select(GeneID = X)

    # Store the DEGs in the list
    degs_up[[species]] <- thorax_up$GeneID
    degs_down[[species]] <- thorax_down$GeneID
    degs_all[[species]] <- all_deg$GeneID
  }
  
  return(list(up = degs_up, down = degs_down, all = degs_all))
}

# Load thorax DEG data for the PAC clade
venn_data_PACclade <- load_deg_data(PACclade)

# Prepare the data for the Venn diagrams
venn_data_up <- list(
  piceifrons = venn_data_PACclade$up[["piceifrons"]],
  americana = venn_data_PACclade$up[["americana"]],
  cubense = venn_data_PACclade$up[["cubense"]]
)

venn_data_down <- list(
  piceifrons = venn_data_PACclade$down[["piceifrons"]],
  americana = venn_data_PACclade$down[["americana"]],
  cubense = venn_data_PACclade$down[["cubense"]]
)

venn_data_all <- list(
  piceifrons = venn_data_PACclade$all[["piceifrons"]],
  americana = venn_data_PACclade$all[["americana"]],
  cubense = venn_data_PACclade$all[["cubense"]]
)

# Function to display Venn diagram and corresponding datatable
display_venn_with_datatable <- function(venn_data, title, allspecies_df) {
  # Calculate the overlapping genes
  overlap_genes <- Reduce(intersect, venn_data)
  
  # Create a data frame for the overlapping genes
  overlap_df <- data.frame(GeneID = overlap_genes)

  # Merge to get species information
  meta_brock_df <- merge(overlap_df, allspecies_df, by = "GeneID", all.x = TRUE)

  # Generate the Venn diagram
  venn_plot <- venn.diagram(
    x = venn_data, 
    category.names = c("piceifrons", "americana", "cubense"), 
    filename = NULL, 
    output = TRUE, 
    fill = c("red", "green", "yellow"),   # Set colors for the groups
    alpha = 0.5, 
    cex = 2, 
    cat.cex = 0, 
    main = title,
    main.cex = 1.2
  )

  # Clear the current plotting area before drawing the Venn diagram
  grid.newpage()
  
  # Display the Venn diagram
  grid.draw(venn_plot)

  # Manually create a custom legend
  legend_labels <- c("piceifrons", "americana", "cubense")
  legend_colors <- c("red", "green", "yellow")

  # Position the legend low on the right side of the plot
  legend_x <- unit(0.85, "npc")  # Horizontal position
  legend_y <- unit(0.2, "npc")   # Vertical position of the first entry

  # Draw one colored square and one label per species
  for (i in seq_along(legend_labels)) {
    grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"),
              width = unit(0.02, "npc"), height = unit(0.02, "npc"),
              gp = gpar(fill = legend_colors[i], col = NA))
    grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"),
              y = legend_y - unit((i - 1) * 0.05, "npc"),
              just = "left", gp = gpar(cex = 0.8))
  }

  # Display the merged overlapping genes table with datatable
  datatable(meta_brock_df, options = list(
      pageLength = 10,
      scrollX = TRUE,
      autoWidth = TRUE,
      searchHighlight = TRUE
  ),
  rownames = FALSE,
  escape = FALSE
  ) %>%
  formatStyle(
      'Species', target = 'cell',
      fontStyle = 'italic'
  ) %>%
  formatStyle(
      columns = names(meta_brock_df), 
      target = 'row',
      color = styleEqual(c("red", "blue", "black"), c("red", "blue", "black")),
      fontWeight = styleEqual(c("bold", "normal"), c("bold", "normal")),
      backgroundColor = styleEqual(c("red", "blue", "black"), c("white", "white", "white"))
  )
}

# Display the Venn diagram and datatable for thorax upregulated DEGs
display_venn_with_datatable(venn_data_up, "Venn Diagram of Thorax Upregulated DEGs - PACclade", allspecies_df)
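
The loader in this chunk applies the same thresholds (padj < 0.05, |log2FoldChange| >= 1) to every species. As a hedged sketch only, and not the code used to produce these results, the split into up, down and all DEGs could be isolated in a small helper that takes a results table as its argument. The name `split_degs()` and the toy table `toy_res` are hypothetical; the column names `X`, `padj` and `log2FoldChange` are taken from the chunk above.

```r
# Hypothetical helper: split one DESeq2 results table into up, down and all DEGs.
# Thresholds mirror the chunk above (padj < 0.05, |log2FoldChange| >= 1).
split_degs <- function(res, padj_cutoff = 0.05, lfc_cutoff = 1) {
  # Drop rows with missing padj or log2FoldChange before thresholding
  keep <- !is.na(res$padj) & !is.na(res$log2FoldChange) & res$padj < padj_cutoff
  sig <- res[keep, , drop = FALSE]
  list(
    up   = sig$X[sig$log2FoldChange >=  lfc_cutoff],
    down = sig$X[sig$log2FoldChange <= -lfc_cutoff],
    all  = sig$X[abs(sig$log2FoldChange) >= lfc_cutoff]
  )
}

# Toy table standing in for one DESeq2_sigresults_*.csv file (made-up values)
toy_res <- data.frame(
  X = c("geneA", "geneB", "geneC", "geneD"),
  log2FoldChange = c(2.5, -1.8, 0.4, 3.1),
  padj = c(0.001, 0.02, 0.01, 0.20),
  stringsAsFactors = FALSE
)

toy_degs <- split_degs(toy_res)
toy_degs$up    # "geneA"
toy_degs$down  # "geneB"
toy_degs$all   # "geneA" "geneB"
```

Keeping the thresholds as arguments would let the head and thorax sections share one definition instead of redefining the loader in each chunk.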

# Display the Venn diagram and datatable for thorax downregulated DEGs
display_venn_with_datatable(venn_data_down, "Venn Diagram of Thorax Downregulated DEGs - PACclade", allspecies_df)

# Display the Venn diagram and datatable for all significant DEGs
display_venn_with_datatable(venn_data_all, "Venn Diagram of All Significant DEGs - PACclade", allspecies_df)

PACclade <- c("piceifrons", "americana", "cubense")

# Initialize an empty list to store heatmap data for each species
heatmap_list <- list()

# Loop through each species to process their data
for (species in PACclade) {
  # Load DESeq2 results for thorax
  thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq/", paste0(species, "/Thorax/DESeq2_sigresults_sva_Thorax_", species,"_togregaria.csv"))
  
  # Load the data using fread() for memory efficiency
  thorax_data <- fread(thorax_file, data.table = FALSE)
  
  # Check if data is empty and handle accordingly
  if (nrow(thorax_data) == 0) {
    message(paste("No data for species:", species))
    next  # Skip to the next species if there's no data
  }
  
  # Filter significant DEGs first (reduces memory use in sorting)
  thorax_data_filtered <- thorax_data %>%
    filter(padj < 0.05, abs(log2FoldChange) > 1)  # Keep only strong up/downregulated DEGs
  
  # Select top 500 upregulated and top 500 downregulated genes
  thorax_up <- thorax_data_filtered %>%
    filter(log2FoldChange > 1) %>%
    arrange(desc(log2FoldChange)) %>%
    slice_head(n = 500)   # More memory-efficient than slice(1:500)
  
  thorax_down <- thorax_data_filtered %>%
    filter(log2FoldChange < -1) %>%
    arrange(log2FoldChange) %>%
    slice_head(n = 500)   # More memory-efficient than slice(1:500)
  
  # Combine data for heatmap, adding the species column
  heatmap_data <- bind_rows(
    thorax_up %>% mutate(Tissue = "Thorax", Regulation = "Upregulated", Species = species),
    thorax_down %>% mutate(Tissue = "Thorax", Regulation = "Downregulated", Species = species)
  ) %>%
    select(GeneID, log2FoldChange, Tissue, Regulation, Species)
  
  # Append the heatmap data to the list
  heatmap_list[[species]] <- heatmap_data
}

# Combine all species data into a single dataframe for heatmap matrix preparation
final_heatmap_data <- bind_rows(heatmap_list)

# Check if final_heatmap_data is empty before proceeding
if (nrow(final_heatmap_data) == 0) {
  stop("No valid data available for heatmap generation.")
}

# Fix duplicate GeneIDs: aggregate log2FoldChange by taking the mean
final_heatmap_data <- final_heatmap_data %>%
  group_by(GeneID, Species) %>%
  summarise(log2FoldChange = mean(log2FoldChange, na.rm = TRUE), .groups = "drop")

# Create heatmap matrix without duplicates
heatmap_matrix <- final_heatmap_data %>%
  pivot_wider(names_from = Species, values_from = log2FoldChange, values_fill = 0) %>%
  column_to_rownames("GeneID") %>%
  as.matrix()

# Define color palettes
# Define a custom color gradient where 0 is black
custom_color_palette1 <- colorRampPalette(c("cyan", "cyan3", "black", "orange3", "orange"))(100)

# Define a custom color gradient where 0 is white
custom_color_palette2 <- colorRampPalette(c("blue3", "blue", "white", "red", "red3"))(100)

# Define symmetric color breaks so that the palette midpoint (black or white) sits at 0
max_abs_lfc <- max(abs(heatmap_matrix), na.rm = TRUE)  # Get max absolute log2FoldChange
color_breaks <- seq(-max_abs_lfc, max_abs_lfc, length.out = 101)  # One more break than colors, as pheatmap expects

# Create heatmap with the blue-white-red palette (genes clustered, species order fixed)
pheatmap(
  heatmap_matrix,
  color = custom_color_palette2,
  breaks = color_breaks,
  cluster_rows = TRUE,  # Cluster genes
  cluster_cols = FALSE,  # Do not cluster species
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Gene Expression in Thorax Tissue - STRATEGY 1"
)
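
Each species contributes only its own top DEGs, so many genes are absent from at least one species, and `values_fill = 0` gives those cells the midpoint color. The base-R sketch below, on made-up values, shows how many cells are filled this way; `toy_long`, `toy_matrix` and `n_missing` are hypothetical names and are not part of the analysis.

```r
# Toy long-format table: one row per (gene, species) pair that passed the DEG filter
toy_long <- data.frame(
  GeneID = c("geneA", "geneA", "geneB", "geneC"),
  Species = c("piceifrons", "americana", "piceifrons", "cubense"),
  log2FoldChange = c(2.0, 1.5, -3.0, 4.0),
  stringsAsFactors = FALSE
)

# Wide gene x species matrix; pairs that were never observed come back as NA
toy_matrix <- tapply(toy_long$log2FoldChange,
                     list(toy_long$GeneID, toy_long$Species),
                     mean)

# Count the gaps before filling, so the number of imputed zeros is known
n_missing <- sum(is.na(toy_matrix))

# Same effect as values_fill = 0 in pivot_wider()
toy_matrix[is.na(toy_matrix)] <- 0

n_missing   # 5 of the 9 cells were not observed
toy_matrix
```

A zero in the resulting matrix therefore means "not among the selected DEGs for this species", which is not the same as a measured fold change of zero.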

# Create the same heatmap with the cyan-black-orange palette
pheatmap(
  heatmap_matrix,
  color = custom_color_palette1,
  breaks = color_breaks,
  cluster_rows = TRUE,  
  cluster_cols = FALSE,  
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Gene Expression in Thorax Tissue - STRATEGY 1"
)


Plastic species

Head tissues

# Define the plastic species
plastic_species <- c("gregaria", "piceifrons", "cancellata","americana")

# Initialize an empty list to store DEG data
venn_data_plastic_species_up <- list()
venn_data_plastic_species_down <- list()
venn_data_plastic_species_all <- list()

# Function to load DEGs for a given group of species for head
load_deg_data <- function(species_list) {
  degs_up <- list()
  degs_down <- list()
  degs_all <- list()
  
  for (species in species_list) {
    head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq/", paste0(species, "/Head/DESeq2_sigresults_sva_Head_", species,"_togregaria.csv"))
    
    head_data <- read.csv(head_file, stringsAsFactors = FALSE)
    
    # Check if data is empty and handle accordingly
    if (nrow(head_data) == 0) {
      message(paste("No data for species:", species))
      next  # Skip to the next species if there's no data
    }
    
    # Filter for significant DEGs (both upregulated and downregulated)
    head_up <- head_data %>%
      filter(padj < 0.05 & log2FoldChange >= 1) %>%
      select(GeneID = X)
    
    head_down <- head_data %>%
      filter(padj < 0.05 & log2FoldChange <= -1) %>%
      select(GeneID = X)
    
    all_deg <- head_data %>%
      filter(padj < 0.05 & abs(log2FoldChange) >= 1) %>%
      select(GeneID = X)

    # Store the DEGs in the list
    degs_up[[species]] <- head_up$GeneID
    degs_down[[species]] <- head_down$GeneID
    degs_all[[species]] <- all_deg$GeneID
  }
  
  return(list(up = degs_up, down = degs_down, all = degs_all))
}

# Load head DEG data for the plastic species
venn_data_plastic_species <- load_deg_data(plastic_species)

# Prepare the data for the Venn diagrams
venn_data_up <- list(
  gregaria = venn_data_plastic_species$up[["gregaria"]],
  piceifrons = venn_data_plastic_species$up[["piceifrons"]],
  cancellata = venn_data_plastic_species$up[["cancellata"]],
  americana = venn_data_plastic_species$up[["americana"]]
)

venn_data_down <- list(
  gregaria = venn_data_plastic_species$down[["gregaria"]],
  piceifrons = venn_data_plastic_species$down[["piceifrons"]],
  cancellata = venn_data_plastic_species$down[["cancellata"]],
  americana = venn_data_plastic_species$down[["americana"]]
)

venn_data_all <- list(
  gregaria = venn_data_plastic_species$all[["gregaria"]],
  piceifrons = venn_data_plastic_species$all[["piceifrons"]],
  cancellata = venn_data_plastic_species$all[["cancellata"]],
  americana = venn_data_plastic_species$all[["americana"]]
)

# Function to display Venn diagram and corresponding datatable
display_venn_with_datatable <- function(venn_data, title, allspecies_df) {
  # Calculate the overlapping genes
  overlap_genes <- Reduce(intersect, venn_data)
  
  # Create a data frame for the overlapping genes
  overlap_df <- data.frame(GeneID = overlap_genes)

  # Merge to get species information
  meta_brock_df <- merge(overlap_df, allspecies_df, by = "GeneID", all.x = TRUE)

  # Generate the Venn diagram
  venn_plot <- venn.diagram(
    x = venn_data, 
    category.names = c("gregaria", "piceifrons", "cancellata","americana"),
    filename = NULL, 
    output = TRUE, 
    fill = c("orange", "red", "orchid", "green"),  # Set colors for the groups
    alpha = 0.5, 
    cex = 2, 
    cat.cex = 0, 
    main = title,
    main.cex = 1.2
  )

  # Clear the current plotting area before drawing the Venn diagram
  grid.newpage()
  
  # Display the Venn diagram
  grid.draw(venn_plot)

  # Manually create a custom legend
  legend_labels <- c("gregaria", "piceifrons", "cancellata", "americana")
  legend_colors <- c("orange", "red", "orchid", "green")

  # Position the legend low on the right side of the plot
  legend_x <- unit(0.85, "npc")  # Horizontal position
  legend_y <- unit(0.2, "npc")   # Vertical position of the first entry

  # Draw one colored square and one label per species
  for (i in seq_along(legend_labels)) {
    grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"),
              width = unit(0.02, "npc"), height = unit(0.02, "npc"),
              gp = gpar(fill = legend_colors[i], col = NA))
    grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"),
              y = legend_y - unit((i - 1) * 0.05, "npc"),
              just = "left", gp = gpar(cex = 0.8))
  }

  # Display the merged overlapping genes table with datatable
  datatable(meta_brock_df, options = list(
      pageLength = 10,
      scrollX = TRUE,
      autoWidth = TRUE,
      searchHighlight = TRUE
  ),
  rownames = FALSE,
  escape = FALSE
  ) %>%
  formatStyle(
      'Species', target = 'cell',
      fontStyle = 'italic'
  ) %>%
  formatStyle(
      columns = names(meta_brock_df), 
      target = 'row',
      color = styleEqual(c("red", "blue", "black"), c("red", "blue", "black")),
      fontWeight = styleEqual(c("bold", "normal"), c("bold", "normal")),
      backgroundColor = styleEqual(c("red", "blue", "black"), c("white", "white", "white"))
  )
}

# Display the Venn diagram and datatable for head upregulated DEGs
display_venn_with_datatable(venn_data_up, "Venn Diagram of Head Upregulated DEGs - plastic_species", allspecies_df)
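
The table printed with each Venn diagram lists only the genes shared by every species, computed with `Reduce(intersect, venn_data)`. The sketch below uses made-up gene sets to show that core set alongside the genes found in a single species only; `toy_sets` and `unique_to()` are hypothetical and do not appear in the analysis.

```r
# Made-up DEG sets, one per species
toy_sets <- list(
  gregaria   = c("g1", "g2", "g3", "g4"),
  piceifrons = c("g2", "g3", "g5"),
  cancellata = c("g2", "g3", "g6"),
  americana  = c("g2", "g7")
)

# Genes differentially expressed in every species (the centre of the Venn diagram)
core_genes <- Reduce(intersect, toy_sets)

# Hypothetical helper: genes present in one set and absent from all the others
unique_to <- function(sets, name) {
  others <- unlist(sets[names(sets) != name], use.names = FALSE)
  setdiff(sets[[name]], others)
}

species_specific <- lapply(setNames(names(toy_sets), names(toy_sets)),
                           function(nm) unique_to(toy_sets, nm))

core_genes                 # "g2"
lengths(species_specific)  # gregaria 2, piceifrons 1, cancellata 1, americana 1
```

Because the intersection shrinks with every added species, an empty table for four or five species does not by itself imply that the pairwise overlaps are empty.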

# Display the Venn diagram and datatable for head downregulated DEGs
display_venn_with_datatable(venn_data_down, "Venn Diagram of Head Downregulated DEGs - plastic_species", allspecies_df)

# Display the Venn diagram and datatable for all significant DEGs
display_venn_with_datatable(venn_data_all, "Venn Diagram of All Significant DEGs - plastic_species", allspecies_df)

# Define the plastic species
plastic_species <- c("gregaria", "piceifrons", "cancellata","americana")

# Initialize an empty list to store heatmap data for each species
heatmap_list <- list()

# Loop through each species to process their data
for (species in plastic_species) {
  # Load DESeq2 results for head
  head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq/", paste0(species, "/Head/DESeq2_sigresults_sva_Head_", species,"_togregaria.csv"))
  
  # Load the data using fread() for memory efficiency
  head_data <- fread(head_file, data.table = FALSE)
  
  # Check if data is empty and handle accordingly
  if (nrow(head_data) == 0) {
    message(paste("No data for species:", species))
    next  # Skip to the next species if there's no data
  }
  
  # Filter significant DEGs first (reduces memory use in sorting)
  head_data_filtered <- head_data %>%
    filter(padj < 0.05, abs(log2FoldChange) > 1)  # Keep only strong up/downregulated DEGs
  
  # Select top 500 upregulated and top 500 downregulated genes
  head_up <- head_data_filtered %>%
    filter(log2FoldChange > 1) %>%
    arrange(desc(log2FoldChange)) %>%
    slice_head(n = 500)   # More memory-efficient than slice(1:500)
  
  head_down <- head_data_filtered %>%
    filter(log2FoldChange < -1) %>%
    arrange(log2FoldChange) %>%
    slice_head(n = 500)   # More memory-efficient than slice(1:500)
  
  # Combine data for heatmap, adding the species column
  heatmap_data <- bind_rows(
    head_up %>% mutate(Tissue = "Head", Regulation = "Upregulated", Species = species),
    head_down %>% mutate(Tissue = "Head", Regulation = "Downregulated", Species = species)
  ) %>%
    select(GeneID, log2FoldChange, Tissue, Regulation, Species)
  
  # Append the heatmap data to the list
  heatmap_list[[species]] <- heatmap_data
}

# Combine all species data into a single dataframe for heatmap matrix preparation
final_heatmap_data <- bind_rows(heatmap_list)

# Check if final_heatmap_data is empty before proceeding
if (nrow(final_heatmap_data) == 0) {
  stop("No valid data available for heatmap generation.")
}

# Fix duplicate GeneIDs: aggregate log2FoldChange by taking the mean
final_heatmap_data <- final_heatmap_data %>%
  group_by(GeneID, Species) %>%
  summarise(log2FoldChange = mean(log2FoldChange, na.rm = TRUE), .groups = "drop")

# Create heatmap matrix without duplicates
heatmap_matrix <- final_heatmap_data %>%
  pivot_wider(names_from = Species, values_from = log2FoldChange, values_fill = 0) %>%
  column_to_rownames("GeneID") %>%
  as.matrix()

# Define color palettes
# Define a custom color gradient where 0 is black
custom_color_palette1 <- colorRampPalette(c("cyan", "cyan3", "black", "orange3", "orange"))(100)

# Define a custom color gradient where 0 is white
custom_color_palette2 <- colorRampPalette(c("blue3", "blue", "white", "red", "red3"))(100)

# Define symmetric color breaks so that the palette midpoint (black or white) sits at 0
max_abs_lfc <- max(abs(heatmap_matrix), na.rm = TRUE)  # Get max absolute log2FoldChange
color_breaks <- seq(-max_abs_lfc, max_abs_lfc, length.out = 101)  # One more break than colors, as pheatmap expects

# Create heatmap with the blue-white-red palette (genes clustered, species order fixed)
pheatmap(
  heatmap_matrix,
  color = custom_color_palette2,
  breaks = color_breaks,
  cluster_rows = TRUE,  # Cluster genes
  cluster_cols = FALSE,  # Do not cluster species
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Gene Expression in Head Tissue - STRATEGY 1"
)
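
pheatmap documents `breaks` as a sequence one element longer than the `color` vector. The sketch below builds a symmetric scale centred on zero under that rule, using a made-up maximum of 6; `n_colors`, `toy_palette`, `toy_max_abs_lfc` and `toy_breaks` are illustrative names only.

```r
n_colors <- 100
toy_palette <- colorRampPalette(c("blue3", "blue", "white", "red", "red3"))(n_colors)

# Made-up value standing in for max(abs(heatmap_matrix), na.rm = TRUE)
toy_max_abs_lfc <- 6

# n_colors + 1 breaks delimit n_colors bins; the middle break sits at zero
toy_breaks <- seq(-toy_max_abs_lfc, toy_max_abs_lfc, length.out = n_colors + 1)

length(toy_breaks) - length(toy_palette)  # 1
toy_breaks[n_colors / 2 + 1]              # 0, up to floating-point error
```

With an even number of colors, zero falls on the boundary between the two central bins, so negative and positive fold changes each get half of the palette.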

# Create the same heatmap with the cyan-black-orange palette
pheatmap(
  heatmap_matrix,
  color = custom_color_palette1,
  breaks = color_breaks,
  cluster_rows = TRUE,  
  cluster_cols = FALSE,  
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Gene Expression in Head Tissue - STRATEGY 1"
)


Thorax tissues

plastic_species <- c("gregaria", "piceifrons", "cancellata","americana")

# Initialize an empty list to store DEG data
venn_data_plastic_species_up <- list()
venn_data_plastic_species_down <- list()
venn_data_plastic_species_all <- list()

# Function to load DEGs for a given group of species for thorax
load_deg_data <- function(species_list) {
  degs_up <- list()
  degs_down <- list()
  degs_all <- list()
  
  for (species in species_list) {
    thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq/", paste0(species, "/Thorax/DESeq2_sigresults_sva_Thorax_", species,"_togregaria.csv"))
    
    thorax_data <- read.csv(thorax_file, stringsAsFactors = FALSE)
    
    # Check if data is empty and handle accordingly
    if (nrow(thorax_data) == 0) {
      message(paste("No data for species:", species))
      next  # Skip to the next species if there's no data
    }
    
    # Filter for significant DEGs (both upregulated and downregulated)
    thorax_up <- thorax_data %>%
      filter(padj < 0.05 & log2FoldChange >= 1) %>%
      select(GeneID = X)
    
    thorax_down <- thorax_data %>%
      filter(padj < 0.05 & log2FoldChange <= -1) %>%
      select(GeneID = X)
    
    all_deg <- thorax_data %>%
      filter(padj < 0.05 & abs(log2FoldChange) >= 1) %>%
      select(GeneID = X)

    # Store the DEGs in the list
    degs_up[[species]] <- thorax_up$GeneID
    degs_down[[species]] <- thorax_down$GeneID
    degs_all[[species]] <- all_deg$GeneID
  }
  
  return(list(up = degs_up, down = degs_down, all = degs_all))
}

# Load thorax DEG data for the plastic species
venn_data_plastic_species <- load_deg_data(plastic_species)

# Prepare the data for the Venn diagrams
venn_data_up <- list(
  gregaria = venn_data_plastic_species$up[["gregaria"]],
  piceifrons = venn_data_plastic_species$up[["piceifrons"]],
  cancellata = venn_data_plastic_species$up[["cancellata"]],
  americana = venn_data_plastic_species$up[["americana"]]
)

venn_data_down <- list(
  gregaria = venn_data_plastic_species$down[["gregaria"]],
  piceifrons = venn_data_plastic_species$down[["piceifrons"]],
  cancellata = venn_data_plastic_species$down[["cancellata"]],
  americana = venn_data_plastic_species$down[["americana"]]
)

venn_data_all <- list(
  gregaria = venn_data_plastic_species$all[["gregaria"]],
  piceifrons = venn_data_plastic_species$all[["piceifrons"]],
  cancellata = venn_data_plastic_species$all[["cancellata"]],
  americana = venn_data_plastic_species$all[["americana"]]
)

# Function to display Venn diagram and corresponding datatable
display_venn_with_datatable <- function(venn_data, title, allspecies_df) {
  # Calculate the overlapping genes
  overlap_genes <- Reduce(intersect, venn_data)
  
  # Create a data frame for the overlapping genes
  overlap_df <- data.frame(GeneID = overlap_genes)

  # Merge to get species information
  meta_brock_df <- merge(overlap_df, allspecies_df, by = "GeneID", all.x = TRUE)

  # Generate the Venn diagram
  venn_plot <- venn.diagram(
    x = venn_data, 
    category.names = c("gregaria", "piceifrons", "cancellata","americana"),
    filename = NULL, 
    output = TRUE, 
    fill = c("orange", "red", "orchid", "green"),  # Set colors for the groups
    alpha = 0.5, 
    cex = 2, 
    cat.cex = 0, 
    main = title,
    main.cex = 1.2
  )

  # Clear the current plotting area before drawing the Venn diagram
  grid.newpage()
  
  # Display the Venn diagram
  grid.draw(venn_plot)

  # Manually create a custom legend
  legend_labels <- c("gregaria", "piceifrons", "cancellata", "americana")
  legend_colors <- c("orange", "red", "orchid", "green")

  # Position the legend low on the right side of the plot
  legend_x <- unit(0.85, "npc")  # Horizontal position
  legend_y <- unit(0.2, "npc")   # Vertical position of the first entry

  # Draw one colored square and one label per species
  for (i in seq_along(legend_labels)) {
    grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"),
              width = unit(0.02, "npc"), height = unit(0.02, "npc"),
              gp = gpar(fill = legend_colors[i], col = NA))
    grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"),
              y = legend_y - unit((i - 1) * 0.05, "npc"),
              just = "left", gp = gpar(cex = 0.8))
  }

  # Display the merged overlapping genes table with datatable
  datatable(meta_brock_df, options = list(
      pageLength = 10,
      scrollX = TRUE,
      autoWidth = TRUE,
      searchHighlight = TRUE
  ),
  rownames = FALSE,
  escape = FALSE
  ) %>%
  formatStyle(
      'Species', target = 'cell',
      fontStyle = 'italic'
  ) %>%
  formatStyle(
      columns = names(meta_brock_df), 
      target = 'row',
      color = styleEqual(c("red", "blue", "black"), c("red", "blue", "black")),
      fontWeight = styleEqual(c("bold", "normal"), c("bold", "normal")),
      backgroundColor = styleEqual(c("red", "blue", "black"), c("white", "white", "white"))
  )
}

# Display the Venn diagram and datatable for thorax upregulated DEGs
display_venn_with_datatable(venn_data_up, "Venn Diagram of Thorax Upregulated DEGs - plastic_species", allspecies_df)

# Display the Venn diagram and datatable for thorax downregulated DEGs
display_venn_with_datatable(venn_data_down, "Venn Diagram of Thorax Downregulated DEGs - plastic_species", allspecies_df)

# Display the Venn diagram and datatable for all significant DEGs
display_venn_with_datatable(venn_data_all, "Venn Diagram of All Significant DEGs - plastic_species", allspecies_df)

plastic_species <- c("gregaria", "piceifrons", "cancellata","americana")

# Initialize an empty list to store heatmap data for each species
heatmap_list <- list()

# Loop through each species to process their data
for (species in plastic_species) {
  # Load DESeq2 results for thorax
  thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq/", paste0(species, "/Thorax/DESeq2_sigresults_sva_Thorax_", species,"_togregaria.csv"))
  
  # Load the data using fread() for memory efficiency
  thorax_data <- fread(thorax_file, data.table = FALSE)
  
  # Check if data is empty and handle accordingly
  if (nrow(thorax_data) == 0) {
    message(paste("No data for species:", species))
    next  # Skip to the next species if there's no data
  }
  
  # Filter significant DEGs first (reduces memory use in sorting)
  thorax_data_filtered <- thorax_data %>%
    filter(padj < 0.05, abs(log2FoldChange) > 1)  # Keep only strong up/downregulated DEGs
  
  # Select top 500 upregulated and top 500 downregulated genes
  thorax_up <- thorax_data_filtered %>%
    filter(log2FoldChange > 1) %>%
    arrange(desc(log2FoldChange)) %>%
    slice_head(n = 500)   # More memory-efficient than slice(1:500)
  
  thorax_down <- thorax_data_filtered %>%
    filter(log2FoldChange < -1) %>%
    arrange(log2FoldChange) %>%
    slice_head(n = 500)   # More memory-efficient than slice(1:500)
  
  # Combine data for heatmap, adding the species column
  heatmap_data <- bind_rows(
    thorax_up %>% mutate(Tissue = "Thorax", Regulation = "Upregulated", Species = species),
    thorax_down %>% mutate(Tissue = "Thorax", Regulation = "Downregulated", Species = species)
  ) %>%
    select(GeneID, log2FoldChange, Tissue, Regulation, Species)
  
  # Append the heatmap data to the list
  heatmap_list[[species]] <- heatmap_data
}

# Combine all species data into a single dataframe for heatmap matrix preparation
final_heatmap_data <- bind_rows(heatmap_list)

# Check if final_heatmap_data is empty before proceeding
if (nrow(final_heatmap_data) == 0) {
  stop("No valid data available for heatmap generation.")
}

# Fix duplicate GeneIDs: aggregate log2FoldChange by taking the mean
final_heatmap_data <- final_heatmap_data %>%
  group_by(GeneID, Species) %>%
  summarise(log2FoldChange = mean(log2FoldChange, na.rm = TRUE), .groups = "drop")

# Create heatmap matrix without duplicates
heatmap_matrix <- final_heatmap_data %>%
  pivot_wider(names_from = Species, values_from = log2FoldChange, values_fill = 0) %>%
  column_to_rownames("GeneID") %>%
  as.matrix()

# Define color palettes
# Define a custom color gradient where 0 is black
custom_color_palette1 <- colorRampPalette(c("cyan", "cyan3", "black", "orange3", "orange"))(100)

# Define a custom color gradient where 0 is white
custom_color_palette2 <- colorRampPalette(c("blue3", "blue", "white", "red", "red3"))(100)

# Define symmetric color breaks so that the palette midpoint (black or white) sits at 0
max_abs_lfc <- max(abs(heatmap_matrix), na.rm = TRUE)  # Get max absolute log2FoldChange
color_breaks <- seq(-max_abs_lfc, max_abs_lfc, length.out = 101)  # One more break than colors, as pheatmap expects

# Create heatmap with the blue-white-red palette (genes clustered, species order fixed)
pheatmap(
  heatmap_matrix,
  color = custom_color_palette2,
  breaks = color_breaks,
  cluster_rows = TRUE,  # Cluster genes
  cluster_cols = FALSE,  # Do not cluster species
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Gene Expression in Thorax Tissue - STRATEGY 1"
)
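
The heatmap keeps at most 500 genes per direction and per species, ranked by log2FoldChange. A base-R sketch of that selection on made-up values follows, with `n = 2` instead of 500; `top_n_by_lfc()` and `toy_lfc` are hypothetical names, and the cutoff of 1 mirrors the filter in the chunk above.

```r
# Hypothetical helper: keep the n strongest genes in one direction
top_n_by_lfc <- function(res, n, direction = c("up", "down")) {
  direction <- match.arg(direction)
  if (direction == "up") {
    hits <- res[res$log2FoldChange > 1, , drop = FALSE]
    hits <- hits[order(hits$log2FoldChange, decreasing = TRUE), , drop = FALSE]
  } else {
    hits <- res[res$log2FoldChange < -1, , drop = FALSE]
    hits <- hits[order(hits$log2FoldChange), , drop = FALSE]
  }
  head(hits, n)  # returns fewer than n rows when fewer genes qualify
}

# Made-up fold changes for six genes
toy_lfc <- data.frame(
  GeneID = paste0("gene", 1:6),
  log2FoldChange = c(4.2, -2.5, 1.3, -6.0, 0.5, 2.8),
  stringsAsFactors = FALSE
)

top_up <- top_n_by_lfc(toy_lfc, n = 2, direction = "up")
top_down <- top_n_by_lfc(toy_lfc, n = 2, direction = "down")
top_up$GeneID    # "gene1" "gene6"
top_down$GeneID  # "gene4" "gene2"
```

Since the cap is applied per species, a gene ranked just outside the top 500 in one species is dropped there even if it is a strong DEG elsewhere.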

# Create the same heatmap with the cyan-black-orange palette
pheatmap(
  heatmap_matrix,
  color = custom_color_palette1,
  breaks = color_breaks,
  cluster_rows = TRUE,  
  cluster_cols = FALSE,  
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Gene Expression in Thorax Tissue - STRATEGY 1"
)


Five species

Combined tissues

# Define all five species
allspecies <- c("gregaria", "piceifrons", "cancellata","americana", "cubense")

# Initialize an empty list to store DEG data
venn_data_allspecies_up <- list()
venn_data_allspecies_down <- list()
venn_data_allspecies_all <- list()

# Function to load DEGs for a given group of species for head
load_deg_data <- function(species_list) {
  degs_up <- list()
  degs_down <- list()
  degs_all <- list()
  
  for (species in species_list) {
    head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq/", paste0(species, "/Head/DESeq2_sigresults_sva_Head_", species,"_togregaria.csv"))
    
    head_data <- read.csv(head_file, stringsAsFactors = FALSE)
    
    # Check if data is empty and handle accordingly
    if (nrow(head_data) == 0) {
      message(paste("No data for species:", species))
      next  # Skip to the next species if there's no data
    }
    
    # Filter for significant DEGs (both upregulated and downregulated)
    head_up <- head_data %>%
      filter(padj < 0.05 & log2FoldChange >= 1) %>%
      select(GeneID = X)
    
    head_down <- head_data %>%
      filter(padj < 0.05 & log2FoldChange <= -1) %>%
      select(GeneID = X)
    
    all_deg <- head_data %>%
      filter(padj < 0.05 & abs(log2FoldChange) >= 1) %>%
      select(GeneID = X)

    # Store the DEGs in the list
    degs_up[[species]] <- head_up$GeneID
    degs_down[[species]] <- head_down$GeneID
    degs_all[[species]] <- all_deg$GeneID
  }
  
  return(list(up = degs_up, down = degs_down, all = degs_all))
}

# Load head DEG data for all five species
venn_data_allspecies <- load_deg_data(allspecies)

# Prepare the data for the Venn diagrams
venn_data_up <- list(
  gregaria = venn_data_allspecies$up[["gregaria"]],
  piceifrons = venn_data_allspecies$up[["piceifrons"]],
  cancellata = venn_data_allspecies$up[["cancellata"]],
  americana = venn_data_allspecies$up[["americana"]],
  cubense = venn_data_allspecies$up[["cubense"]]
)

venn_data_down <- list(
  gregaria = venn_data_allspecies$down[["gregaria"]],
  piceifrons = venn_data_allspecies$down[["piceifrons"]],
  cancellata = venn_data_allspecies$down[["cancellata"]],
  americana = venn_data_allspecies$down[["americana"]],
  cubense = venn_data_allspecies$down[["cubense"]]
)

venn_data_all <- list(
  gregaria = venn_data_allspecies$all[["gregaria"]],
  piceifrons = venn_data_allspecies$all[["piceifrons"]],
  cancellata = venn_data_allspecies$all[["cancellata"]],
  americana = venn_data_allspecies$all[["americana"]],
  cubense = venn_data_allspecies$all[["cubense"]]
)

# Function to display Venn diagram and corresponding datatable
display_venn_with_datatable <- function(venn_data, title, allspecies_df) {
  # Calculate the overlapping genes
  overlap_genes <- Reduce(intersect, venn_data)
  
  # Create a data frame for the overlapping genes
  overlap_df <- data.frame(GeneID = overlap_genes)

  # Merge to get species information
  meta_brock_df <- merge(overlap_df, allspecies_df, by = "GeneID", all.x = TRUE)

  # Generate the Venn diagram
  venn_plot <- venn.diagram(
    x = venn_data, 
    category.names = c("gregaria", "piceifrons", "cancellata","americana", "cubense"),
    filename = NULL, 
    output = TRUE, 
    fill = c("orange", "red", "orchid", "green", "yellow"),  # Set colors for the groups
    alpha = 0.5, 
    cex = 2, 
    cat.cex = 0, 
    main = title,
    main.cex = 1.2
  )

  # Clear the current plotting area before drawing the Venn diagram
  grid.newpage()
  
  # Display the Venn diagram
  grid.draw(venn_plot)

  # Manually create a custom legend
  legend_labels <- c("gregaria", "piceifrons", "cancellata", "americana", "cubense")
  legend_colors <- c("orange", "red", "orchid", "green", "yellow")

  # Position the legend in the lower right of the plot
  legend_x <- unit(0.85, "npc")  # Horizontal position
  legend_y <- unit(0.2, "npc")   # Vertical position of the first entry

  # Draw one colored square and label per species
  for (i in seq_along(legend_labels)) {
    grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"),
              width = unit(0.02, "npc"), height = unit(0.02, "npc"),
              gp = gpar(fill = legend_colors[i], col = NA))
    grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"),
              y = legend_y - unit((i - 1) * 0.05, "npc"),
              just = "left", gp = gpar(cex = 0.8))
  }
  # Display the merged overlapping genes table with datatable
  datatable(meta_brock_df, options = list(
      pageLength = 10,
      scrollX = TRUE,
      autoWidth = TRUE,
      searchHighlight = TRUE
  ),
  rownames = FALSE,
  escape = FALSE
  ) %>%
  formatStyle(
      'Species', target = 'cell',
      fontStyle = 'italic'
  ) %>%
  formatStyle(
      columns = names(meta_brock_df), 
      target = 'row',
      color = styleEqual(c("red", "blue", "black"), c("red", "blue", "black")),
      fontWeight = styleEqual(c("bold", "normal"), c("bold", "normal")),
      backgroundColor = styleEqual(c("red", "blue", "black"), c("white", "white", "white"))
  )
}

# Display the Venn diagram and datatable for head upregulated DEGs
display_venn_with_datatable(venn_data_up, "Venn Diagram of Head Upregulated DEGs - all species", allspecies_df)

Version Author Date
4e391c3 Maeva TECHER 2025-05-30
b540a1e Maeva TECHER 2025-02-27
# Display the Venn diagram and datatable for head downregulated DEGs
display_venn_with_datatable(venn_data_down, "Venn Diagram of Head Downregulated DEGs - all species", allspecies_df)
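The up, down, and combined gene sets drawn in these Venn diagrams come from the same two cutoffs (`padj < 0.05` and an absolute `log2FoldChange` of at least 1). The following is a minimal base R sketch of that split on a made-up table; the gene IDs and values are hypothetical and only the column names `X`, `padj`, and `log2FoldChange` are taken from the code above.

```r
# Toy DESeq2-like table; gene IDs and values are invented for illustration
toy <- data.frame(
  X = c("geneA", "geneB", "geneC", "geneD", "geneE"),
  padj = c(0.001, 0.20, 0.03, 0.04, NA),
  log2FoldChange = c(2.5, 3.0, -1.0, 0.4, -2.0),
  stringsAsFactors = FALSE
)

# which() drops rows where the condition evaluates to NA (e.g. padj = NA),
# which mirrors how dplyr::filter() treats NA conditions
up      <- toy$X[which(toy$padj < 0.05 & toy$log2FoldChange >= 1)]
down    <- toy$X[which(toy$padj < 0.05 & toy$log2FoldChange <= -1)]
all_deg <- toy$X[which(toy$padj < 0.05 & abs(toy$log2FoldChange) >= 1)]

print(list(up = up, down = down, all = all_deg))
```

Because the cutoffs are inclusive (`>=` and `<=`), a gene with a fold change of exactly -1 counts as downregulated here, and the combined set is the union of the up and down sets.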

# Display the Venn diagram and datatable for all significant head DEGs
display_venn_with_datatable(venn_data_all, "Venn Diagram of All Significant DEGs (Head) - all species", allspecies_df)
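The table shown under each Venn diagram lists the genes returned by `Reduce(intersect, venn_data)`, that is, genes present in every species' set. A small sketch with hypothetical gene sets shows how that fold behaves, including the case where one species contributes no set at all (for example because its file was skipped with `next`).

```r
# Hypothetical gene sets for three species
toy_sets <- list(
  gregaria   = c("g1", "g2", "g3", "g4"),
  piceifrons = c("g2", "g3", "g5"),
  cancellata = c("g3", "g2", "g6")
)

# Fold intersect() over the list: ((gregaria ∩ piceifrons) ∩ cancellata)
core <- Reduce(intersect, toy_sets)
print(core)

# If one species has no entry (NULL), the overall overlap is empty
toy_sets_missing <- c(toy_sets, list(cubense = NULL))
core_missing <- Reduce(intersect, toy_sets_missing)
print(length(core_missing))
```

An empty overlap table can therefore mean either that no gene is shared or that one species was skipped during loading; checking `lengths(venn_data)` before plotting is one way to tell the two apart.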

# Thorax
# Initialize an empty list to store DEG data
venn_data_allspecies_up <- list()
venn_data_allspecies_down <- list()
venn_data_allspecies_all <- list()

# Function to load DEGs for a given group of species for thorax
load_deg_data <- function(species_list) {
  degs_up <- list()
  degs_down <- list()
  degs_all <- list()
  
  for (species in species_list) {
    thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq/", paste0(species, "/Thorax/DESeq2_sigresults_sva_Thorax_", species,"_togregaria.csv"))
    
    thorax_data <- read.csv(thorax_file, stringsAsFactors = FALSE)
    
    # Check if data is empty and handle accordingly
    if (nrow(thorax_data) == 0) {
      message(paste("No data for species:", species))
      next  # Skip to the next species if there's no data
    }
    
    # Filter for significant DEGs (both upregulated and downregulated)
    thorax_up <- thorax_data %>%
      filter(padj < 0.05 & log2FoldChange >= 1) %>%
      select(GeneID = X)
    
    thorax_down <- thorax_data %>%
      filter(padj < 0.05 & log2FoldChange <= -1) %>%
      select(GeneID = X)
    
    all_deg <- thorax_data %>%
      filter(padj < 0.05 & abs(log2FoldChange) >= 1) %>%
      select(GeneID = X)

    # Store the DEGs in the list
    degs_up[[species]] <- thorax_up$GeneID
    degs_down[[species]] <- thorax_down$GeneID
    degs_all[[species]] <- all_deg$GeneID
  }
  
  return(list(up = degs_up, down = degs_down, all = degs_all))
}

# Load thorax DEG data for the five species
venn_data_allspecies <- load_deg_data(allspecies)

# Prepare the data for the Venn diagrams
venn_data_up <- list(
  gregaria = venn_data_allspecies$up[["gregaria"]],
  piceifrons = venn_data_allspecies$up[["piceifrons"]],
  cancellata = venn_data_allspecies$up[["cancellata"]],
  americana = venn_data_allspecies$up[["americana"]],
  cubense = venn_data_allspecies$up[["cubense"]]
)

venn_data_down <- list(
  gregaria = venn_data_allspecies$down[["gregaria"]],
  piceifrons = venn_data_allspecies$down[["piceifrons"]],
  cancellata = venn_data_allspecies$down[["cancellata"]],
  americana = venn_data_allspecies$down[["americana"]],
  cubense = venn_data_allspecies$down[["cubense"]]
)

venn_data_all <- list(
  gregaria = venn_data_allspecies$all[["gregaria"]],
  piceifrons = venn_data_allspecies$all[["piceifrons"]],
  cancellata = venn_data_allspecies$all[["cancellata"]],
  americana = venn_data_allspecies$all[["americana"]],
  cubense = venn_data_allspecies$all[["cubense"]]
)

# Function to display Venn diagram and corresponding datatable
display_venn_with_datatable <- function(venn_data, title, allspecies_df) {
  # Calculate the overlapping genes
  overlap_genes <- Reduce(intersect, venn_data)
  
  # Create a data frame for the overlapping genes
  overlap_df <- data.frame(GeneID = overlap_genes)

  # Merge to get species information
  meta_brock_df <- merge(overlap_df, allspecies_df, by = "GeneID", all.x = TRUE)

  # Generate the Venn diagram
  venn_plot <- venn.diagram(
    x = venn_data, 
    category.names = c("gregaria", "piceifrons", "cancellata","americana", "cubense"),
    filename = NULL, 
    output = TRUE, 
    fill = c("orange", "red", "orchid", "green", "yellow"),  # Set colors for the groups
    alpha = 0.5, 
    cex = 2, 
    cat.cex = 0, 
    main = title,
    main.cex = 1.2
  )

  # Clear the current plotting area before drawing the Venn diagram
  grid.newpage()
  
  # Display the Venn diagram
  grid.draw(venn_plot)

  # Manually create a custom legend
  legend_labels <- c("gregaria", "piceifrons", "cancellata", "americana", "cubense")
  legend_colors <- c("orange", "red", "orchid", "green", "yellow")

  # Position the legend in the lower right of the plot
  legend_x <- unit(0.85, "npc")  # Horizontal position
  legend_y <- unit(0.2, "npc")   # Vertical position of the first entry

  # Draw one colored square and label per species
  for (i in seq_along(legend_labels)) {
    grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"),
              width = unit(0.02, "npc"), height = unit(0.02, "npc"),
              gp = gpar(fill = legend_colors[i], col = NA))
    grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"),
              y = legend_y - unit((i - 1) * 0.05, "npc"),
              just = "left", gp = gpar(cex = 0.8))
  }
  # Display the merged overlapping genes table with datatable
  datatable(meta_brock_df, options = list(
      pageLength = 10,
      scrollX = TRUE,
      autoWidth = TRUE,
      searchHighlight = TRUE
  ),
  rownames = FALSE,
  escape = FALSE
  ) %>%
  formatStyle(
      'Species', target = 'cell',
      fontStyle = 'italic'
  ) %>%
  formatStyle(
      columns = names(meta_brock_df), 
      target = 'row',
      color = styleEqual(c("red", "blue", "black"), c("red", "blue", "black")),
      fontWeight = styleEqual(c("bold", "normal"), c("bold", "normal")),
      backgroundColor = styleEqual(c("red", "blue", "black"), c("white", "white", "white"))
  )
}

# Display the Venn diagram and datatable for thorax upregulated DEGs
display_venn_with_datatable(venn_data_up, "Venn Diagram of Thorax Upregulated DEGs - all species", allspecies_df)

# Display the Venn diagram and datatable for thorax downregulated DEGs
display_venn_with_datatable(venn_data_down, "Venn Diagram of Thorax Downregulated DEGs - all species", allspecies_df)

# Display the Venn diagram and datatable for all significant thorax DEGs
display_venn_with_datatable(venn_data_all, "Venn Diagram of All Significant DEGs (Thorax) - all species", allspecies_df)

# Initialize an empty list to store heatmap data for each species
heatmap_list <- list()

# Loop through each species to process their data
for (species in species_list) {
  # Load DESeq2 results for head and thorax
  head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq/", paste0(species, "/Head/DESeq2_sigresults_sva_Head_", species,"_togregaria.csv"))
  thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq/", paste0(species, "/Thorax/DESeq2_sigresults_sva_Thorax_", species,"_togregaria.csv"))
  
  # Load the data
  head_data <- read.csv(head_file, stringsAsFactors = FALSE)
  thorax_data <- read.csv(thorax_file, stringsAsFactors = FALSE)
  
  # Check if data is empty and handle accordingly
  if (nrow(head_data) == 0 || nrow(thorax_data) == 0) {
    message(paste("No data for species:", species))
    next  # Skip to the next species if there's no data
  }
  
  # Filter for significant DEGs and select the top 500 upregulated and downregulated genes for each tissue
  head_up <- head_data %>%
    filter(padj < 0.05 & log2FoldChange > 1) %>%
    arrange(desc(log2FoldChange)) %>%
    slice(1:500)
  
  head_down <- head_data %>%
    filter(padj < 0.05 & log2FoldChange < -1) %>%
    arrange(log2FoldChange) %>%
    slice(1:500)
  
  thorax_up <- thorax_data %>%
    filter(padj < 0.05 & log2FoldChange > 1) %>%
    arrange(desc(log2FoldChange)) %>%
    slice(1:500)
  
  thorax_down <- thorax_data %>%
    filter(padj < 0.05 & log2FoldChange < -1) %>%
    arrange(log2FoldChange) %>%
    slice(1:500)
  
  # Combine data and prepare for heatmap, adding the species column
  heatmap_data <- bind_rows(
    head_up %>% mutate(Tissue = "Head", Regulation = "Upregulated", Species = species),
    head_down %>% mutate(Tissue = "Head", Regulation = "Downregulated", Species = species),
    thorax_up %>% mutate(Tissue = "Thorax", Regulation = "Upregulated", Species = species),
    thorax_down %>% mutate(Tissue = "Thorax", Regulation = "Downregulated", Species = species)
  ) %>%
    select(GeneID, log2FoldChange, Tissue, Regulation, Species)
  
  
  # Append the heatmap data to the list
  heatmap_list[[species]] <- heatmap_data
}

# Combine all species data into a single dataframe for heatmap matrix preparation
final_heatmap_data <- bind_rows(heatmap_list)

# Check if final_heatmap_data is empty before proceeding
if (nrow(final_heatmap_data) == 0) {
  stop("No valid data available for heatmap generation.")
}

# Create heatmap matrix
# Aggregate log2FoldChange values correctly
heatmap_matrix <- final_heatmap_data %>%
  group_by(GeneID, Species, Tissue) %>%
  summarize(log2FoldChange = mean(log2FoldChange, na.rm = TRUE), .groups = "drop") %>%

  # Convert to wide format (no need for redundant grouping)
  pivot_wider(names_from = c(Species, Tissue), values_from = log2FoldChange, values_fill = 0) %>%

  # Ensure unique row names
  column_to_rownames("GeneID") %>%
  as.matrix()


custom_cyan_orange_palette <- colorRampPalette(c("cyan", "cyan2", "cyan3", "black", "orange3", "orange2", "orange"))(100)
custom_blue_red_palette <- colorRampPalette(c("blue3", "blue2", "blue1", "white", "red", "red2", "red3"))(100)

# Define symmetric color breaks so the palette midpoint (white or black) maps to 0
max_abs_lfc <- max(abs(heatmap_matrix), na.rm = TRUE)  # Get max absolute log2FoldChange
color_breaks <- seq(-max_abs_lfc, max_abs_lfc, length.out = 101)  # One more break than the 100 colors

# Create first heatmap with blue-red gradient
pheatmap(
  heatmap_matrix,
  color = custom_blue_red_palette,  
  breaks = color_breaks,  
  cluster_rows = TRUE,  
  cluster_cols = FALSE,  
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Gene Expression in Head and Thorax Tissue - STRATEGY 1"
)
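The heatmap above relies on a symmetric break vector so that a fold change of 0 sits at the middle of the palette. The pheatmap documentation describes `breaks` as being one element longer than `color`, so a sketch of the intended mapping, using a hypothetical maximum of 3.7 and a simplified three-color ramp, could look like this. The helper `bin_of()` is introduced here only for illustration and is not part of pheatmap.

```r
# Simplified diverging palette with 100 colors (grDevices is attached by default)
pal <- colorRampPalette(c("blue3", "white", "red3"))(100)

# Hypothetical maximum absolute log2 fold change
max_abs <- 3.7

# 101 breaks define 100 bins, one per color, symmetric around 0
breaks <- seq(-max_abs, max_abs, length.out = length(pal) + 1)

# Illustrative helper: index of the color bin that a value falls into
bin_of <- function(v) {
  findInterval(v, breaks, rightmost.closed = TRUE, all.inside = TRUE)
}

print(c(low = bin_of(-max_abs), below_zero = bin_of(-0.01),
        above_zero = bin_of(0.01), high = bin_of(max_abs)))
```

With this layout, values just below and just above 0 land in the two bins on either side of the palette midpoint, and the extremes use the first and last colors.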

# Create second heatmap with cyan-black-orange gradient
pheatmap(
  heatmap_matrix,
  color = custom_cyan_orange_palette,  
  breaks = color_breaks,  
  cluster_rows = TRUE,  
  cluster_cols = FALSE,  
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Gene Expression in Head and Thorax Tissue - STRATEGY 1"
)
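The matrix behind these heatmaps is built by averaging `log2FoldChange` per gene and column, pivoting to wide format, and filling missing combinations with 0. A rough base R equivalent on invented values is sketched below; it uses `tapply()` instead of the `pivot_wider()` call from the analysis, so it is an illustration of the reshaping rather than the author's code.

```r
# Hypothetical long-format table: one row per gene and species
long <- data.frame(
  GeneID = c("g1", "g1", "g2", "g3"),
  Species = c("gregaria", "piceifrons", "gregaria", "piceifrons"),
  log2FoldChange = c(2.0, 1.5, -3.0, 4.0),
  stringsAsFactors = FALSE
)

# Mean fold change per gene (rows) and species (columns); absent pairs become NA
wide <- tapply(long$log2FoldChange, list(long$GeneID, long$Species), mean)

# Fill absent gene/species pairs with 0, as values_fill = 0 does
wide[is.na(wide)] <- 0
print(wide)
```

A 0 in such a matrix means the gene was not among the selected DEGs for that species, not that its fold change was measured as zero, which is worth keeping in mind when reading the middle color of the heatmap.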


Head tissues

# Initialize an empty list to store heatmap data for each species
heatmap_list <- list()

# Loop through each species to process their data
for (species in species_list) {
  # Load DESeq2 results for head
  head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq/", paste0(species, "/Head/DESeq2_sigresults_sva_Head_", species,"_togregaria.csv"))
  
  # Load the data
  head_data <- read.csv(head_file, stringsAsFactors = FALSE)
  
  # Check if data is empty and handle accordingly
  if (nrow(head_data) == 0) {
    message(paste("No data for species:", species))
    next  # Skip to the next species if there's no data
  }
  
  # Filter for significant DEGs and select the top 500 upregulated and downregulated head genes
  head_up <- head_data %>%
    filter(padj < 0.05 & log2FoldChange > 1) %>%
    arrange(desc(log2FoldChange)) %>%
    slice(1:500)
  
  head_down <- head_data %>%
    filter(padj < 0.05 & log2FoldChange < -1) %>%
    arrange(log2FoldChange) %>%
    slice(1:500)
  
  # Combine data and prepare for heatmap, adding the species column
  heatmap_data <- bind_rows(
    head_up %>% mutate(Tissue = "Head", Regulation = "Upregulated", Species = species),
    head_down %>% mutate(Tissue = "Head", Regulation = "Downregulated", Species = species)
  ) %>%
    select(GeneID, log2FoldChange, Tissue, Regulation, Species)
  
  
  # Append the heatmap data to the list
  heatmap_list[[species]] <- heatmap_data
}

# Combine all species data
final_heatmap_data <- bind_rows(heatmap_list)

# Ensure all species are represented, even if they have no significant DEGs
for (species in species_order) {
    if (!species %in% unique(final_heatmap_data$Species)) {
        message(paste("Adding placeholder for missing species:", species))
        final_heatmap_data <- bind_rows(
            final_heatmap_data,
            data.frame(
                GeneID = "Unassigned",  # Placeholder GeneID
                log2FoldChange = 0,
                Tissue = "Head",
                Regulation = "None",
                Species = species
            )
        )
    }
}

# Ensure species order in the data
final_heatmap_data$Species <- factor(final_heatmap_data$Species, levels = species_order)

# Create heatmap matrix (Head only)
heatmap_matrix <- final_heatmap_data %>%
  group_by(GeneID, Species) %>% 
  summarize(log2FoldChange = mean(log2FoldChange, na.rm = TRUE), .groups = "drop") %>%
  pivot_wider(names_from = Species, values_from = log2FoldChange, values_fill = 0) %>%
  column_to_rownames("GeneID") %>%
  as.matrix()

# Explicitly reorder the columns in heatmap_matrix
heatmap_matrix <- heatmap_matrix[, species_order, drop = FALSE]

# Define color palettes
custom_cyan_orange_palette <- colorRampPalette(c("cyan", "cyan2", "cyan3", "black", "orange3", "orange2", "orange"))(100)
custom_blue_red_palette <- colorRampPalette(c("blue3", "blue2", "blue1", "white", "red", "red2", "red3"))(100)

# Define symmetric color breaks so the palette midpoint (white or black) maps to 0
max_abs_lfc <- max(abs(heatmap_matrix), na.rm = TRUE)  # Get max absolute log2FoldChange
color_breaks <- seq(-max_abs_lfc, max_abs_lfc, length.out = 101)  # One more break than the 100 colors

# Create first heatmap with blue-red gradient
pheatmap(
  heatmap_matrix,
  color = custom_blue_red_palette,  
  breaks = color_breaks,  
  cluster_rows = TRUE,  
  cluster_cols = FALSE,  
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Gene Expression in Head Tissue - STRATEGY 1"
)

# Create second heatmap with cyan-black-orange gradient
pheatmap(
  heatmap_matrix,
  color = custom_cyan_orange_palette,  
  breaks = color_breaks,  
  cluster_rows = TRUE,  
  cluster_cols = FALSE,  
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Gene Expression in Head Tissue - STRATEGY 1"
)


Thorax tissues

# Define species order explicitly to ensure consistency
species_order <- c("nitens", "cubense", "americana", "piceifrons", "cancellata", "gregaria")

# Initialize an empty list to store heatmap data
heatmap_list <- list()

# Loop through each species to process their Thorax data
for (species in species_order) {
  message(paste("Processing species:", species))

  # Define file path for Thorax
  thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq/", paste0(species, "/Thorax/DESeq2_sigresults_sva_Thorax_", species,"_togregaria.csv"))

  # Check if file exists before loading
  if (!file.exists(thorax_file)) {
    message(paste("Missing Thorax file for:", species, "- Assigning empty dataset"))
    thorax_data <- data.frame(GeneID = character(), padj = numeric(), log2FoldChange = numeric(), stringsAsFactors = FALSE)
  } else {
    thorax_data <- tryCatch(read.csv(thorax_file, stringsAsFactors = FALSE), error = function(e) data.frame())
  }

  # Ensure GeneID column exists
  if (!"GeneID" %in% colnames(thorax_data) && "X" %in% colnames(thorax_data)) {
    colnames(thorax_data)[colnames(thorax_data) == "X"] <- "GeneID"
  }

  # Convert GeneID to character
  thorax_data$GeneID <- as.character(thorax_data$GeneID)

  # If no significant DEGs are found, ensure the structure is correct
  if (nrow(thorax_data) == 0) {
    message(paste("No significant Thorax DEGs for:", species, "- Assigning placeholder values"))
    thorax_data <- data.frame(
      GeneID = "Unassigned",
      log2FoldChange = 0,
      Tissue = "Thorax",
      Regulation = "None",
      Species = species
    )
  } else {
    # Filter for significant DEGs and select top 500 upregulated and downregulated genes
    thorax_up <- thorax_data %>%
      filter(padj < 0.05 & log2FoldChange > 1) %>%
      arrange(desc(log2FoldChange)) %>%
      slice(1:500)

    thorax_down <- thorax_data %>%
      filter(padj < 0.05 & log2FoldChange < -1) %>%
      arrange(log2FoldChange) %>%
      slice(1:500)

    # Combine data and prepare for heatmap
    thorax_data <- bind_rows(
      thorax_up %>% mutate(Tissue = "Thorax", Regulation = "Upregulated", Species = species),
      thorax_down %>% mutate(Tissue = "Thorax", Regulation = "Downregulated", Species = species)
    ) %>%
      select(GeneID, log2FoldChange, Tissue, Regulation, Species)
  }

  # Append to heatmap list, ensuring species is represented
  heatmap_list[[species]] <- thorax_data
}

# Combine all species data
final_heatmap_data <- bind_rows(heatmap_list)

# Ensure all species are represented, even if they have no significant DEGs
for (species in species_order) {
    if (!species %in% unique(final_heatmap_data$Species)) {
        message(paste("Adding placeholder for missing species:", species))
        final_heatmap_data <- bind_rows(
            final_heatmap_data,
            data.frame(
                GeneID = "Unassigned",  # Placeholder GeneID
                log2FoldChange = 0,
                Tissue = "Thorax",
                Regulation = "None",
                Species = species
            )
        )
    }
}

# Ensure species order in the data
final_heatmap_data$Species <- factor(final_heatmap_data$Species, levels = species_order)

# Create heatmap matrix (Thorax only)
heatmap_matrix <- final_heatmap_data %>%
  group_by(GeneID, Species) %>% 
  summarize(log2FoldChange = mean(log2FoldChange, na.rm = TRUE), .groups = "drop") %>%
  pivot_wider(names_from = Species, values_from = log2FoldChange, values_fill = 0) %>%
  column_to_rownames("GeneID") %>%
  as.matrix()

# Explicitly reorder the columns in heatmap_matrix
heatmap_matrix <- heatmap_matrix[, species_order, drop = FALSE]

# Define color palettes
custom_cyan_orange_palette <- colorRampPalette(c("cyan", "cyan2", "cyan3", "black", "orange3", "orange2", "orange"))(100)
custom_blue_red_palette <- colorRampPalette(c("blue3", "blue2", "blue1", "white", "red", "red2", "red3"))(100)

# Define symmetric color breaks so the palette midpoint (white or black) maps to 0
max_abs_lfc <- max(abs(heatmap_matrix), na.rm = TRUE)
color_breaks <- seq(-max_abs_lfc, max_abs_lfc, length.out = 101)  # One more break than the 100 colors

# Generate heatmaps (Only thorax)
pheatmap(
  heatmap_matrix,
  color = custom_blue_red_palette,
  breaks = color_breaks,
  cluster_rows = TRUE,
  cluster_cols = FALSE,
  show_rownames = FALSE,
  show_colnames = TRUE,
  fontsize_row = 6,
  fontsize_col = 10,
  main = "Heatmap of Gene Expression in Thorax Tissue - STRATEGY 1"
)

pheatmap(
  heatmap_matrix,
  color = custom_cyan_orange_palette,
  breaks = color_breaks,
  cluster_rows = TRUE,
  cluster_cols = FALSE,
  show_rownames = FALSE,
  show_colnames = TRUE,
  fontsize_row = 6,
  fontsize_col = 10,
  main = "Heatmap of Gene Expression in Thorax Tissue - STRATEGY 1"
)
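The thorax loop above guards against missing files, unreadable files, and gene IDs stored under `X` rather than `GeneID`. Those checks could be collected in one helper, sketched here under the assumption that the CSV files were written with gene IDs as row names. The function name `read_deg_table()` is hypothetical and does not appear in the analysis.

```r
# Hypothetical helper: always returns a data frame with a character GeneID column
read_deg_table <- function(path) {
  empty <- data.frame(GeneID = character(), padj = numeric(),
                      log2FoldChange = numeric(), stringsAsFactors = FALSE)

  # Missing file: return an empty table with the expected columns
  if (!file.exists(path)) return(empty)

  # Unreadable file: fall back to the same empty table
  deg <- tryCatch(read.csv(path, stringsAsFactors = FALSE),
                  error = function(e) empty)

  # read.csv() names an unnamed row-name column "X"; rename it to GeneID
  if (!"GeneID" %in% colnames(deg) && "X" %in% colnames(deg)) {
    colnames(deg)[colnames(deg) == "X"] <- "GeneID"
  }
  if (!"GeneID" %in% colnames(deg)) return(empty)

  deg$GeneID <- as.character(deg$GeneID)
  deg
}

# Usage on a temporary file written with row names, plus a path that does not exist
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(padj = 0.01, log2FoldChange = 2, row.names = "geneA"), tmp)
loaded <- read_deg_table(tmp)
missing <- read_deg_table(file.path(tempdir(), "does_not_exist.csv"))
print(loaded)
```

Returning a zero-row table with the expected columns lets the downstream `filter()` and `bind_rows()` steps run unchanged for species without results.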


All species

Combined tissues

# Define the species list
allspecies <- c("nitens", "cubense", "americana", "piceifrons", "cancellata", "gregaria")

# Function to load DEGs for all species
load_species_deg_data <- function(tissue) {
  degs_up <- list()
  degs_down <- list()
  degs_all <- list()
  
  for (species in allspecies) {
    deg_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0(species, "/", tissue, "/DESeq2_sigresults_sva_", tissue, "_", species, "_togregaria.csv"))
    
    if (!file.exists(deg_file)) {
      message(paste("File missing for species:", species))
      next  # Skip if the file doesn't exist
    }
    
    deg_data <- read.csv(deg_file, stringsAsFactors = FALSE)
    
    # Check if data is empty
    if (nrow(deg_data) == 0) {
      message(paste("No data for species:", species))
      next
    }
    
    # Select significant DEGs (Up, Down, All)
    degs_up[[species]] <- deg_data %>%
      filter(padj < 0.05 & log2FoldChange >= 1) %>%
      pull(X)  # Ensure GeneID is extracted properly
    
    degs_down[[species]] <- deg_data %>%
      filter(padj < 0.05 & log2FoldChange <= -1) %>%
      pull(X)
    
    degs_all[[species]] <- deg_data %>%
      filter(padj < 0.05 & abs(log2FoldChange) >= 1) %>%
      pull(X)
  }
  
  return(list(up = degs_up, down = degs_down, all = degs_all))
}

# Load DEG data for head and thorax
venn_data_head <- load_species_deg_data("Head")
venn_data_thorax <- load_species_deg_data("Thorax")

# Function to visualize Venn diagram using ggVennDiagram
display_ggvenn_plot <- function(venn_data, title) {
  gg_venn <- ggVennDiagram(venn_data, label_alpha = 0, edge_lty = "dashed") +
    scale_fill_gradient(low = "lightblue", high = "darkblue") +
    labs(title = title) +
    theme_minimal(base_size = 14)
  
  return(gg_venn)
}

# **Generate Venn diagrams**
ggvenn_head_up <- display_ggvenn_plot(venn_data_head$up, "Venn Diagram of Head Upregulated DEGs - All Species")
ggvenn_head_down <- display_ggvenn_plot(venn_data_head$down, "Venn Diagram of Head Downregulated DEGs - All Species")
ggvenn_head_all <- display_ggvenn_plot(venn_data_head$all, "Venn Diagram of All Significant DEGs (Head) - All Species")

ggvenn_thorax_up <- display_ggvenn_plot(venn_data_thorax$up, "Venn Diagram of Thorax Upregulated DEGs - All Species")
ggvenn_thorax_down <- display_ggvenn_plot(venn_data_thorax$down, "Venn Diagram of Thorax Downregulated DEGs - All Species")
ggvenn_thorax_all <- display_ggvenn_plot(venn_data_thorax$all, "Venn Diagram of All Significant DEGs (Thorax) - All Species") 


####### Upset plots

load_deseq2_upset_data <- function(tissue) {
  # Initialize an empty list to store gene sets per species
  species_deg_list <- list()

  for (species in allspecies) {
    # Construct the correct file path
    deg_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", species, tissue, 
                          paste0("DESeq2_sigresults_sva_", tissue, "_", species, "_togregaria.csv"))
    
    # Skip if file does not exist
    if (!file.exists(deg_file)) {
      message(paste("File missing for species:", species))
      next
    }
    
    # Load the DESeq2 results file
    deseq_data <- read.csv(deg_file, stringsAsFactors = FALSE)

    # Check for the correct column name
    if (!"GeneID" %in% colnames(deseq_data)) {
      if ("X" %in% colnames(deseq_data)) {
        colnames(deseq_data)[colnames(deseq_data) == "X"] <- "GeneID"
      } else {
        stop(paste("Error: No 'GeneID' or 'X' column found in", deg_file))
      }
    }

    # Filter for significant DEGs
    significant_genes <- deseq_data %>%
      filter(padj < 0.05 & abs(log2FoldChange) >= 1) %>%
      pull(GeneID)

    # Store the gene list for the species
    species_deg_list[[species]] <- significant_genes
  }

  # Create a binary matrix for UpSet plot
  all_genes <- unique(unlist(species_deg_list))  # Collect all unique DEGs
  upset_data <- data.frame(GeneID = all_genes)

  for (species in allspecies) {
    upset_data[[species]] <- as.integer(all_genes %in% species_deg_list[[species]])
  }

  return(upset_data)
}

upset_data_head <- load_deseq2_upset_data("Head")
upset_data_thorax <- load_deseq2_upset_data("Thorax")


###############

display_upset_plot <- function(upset_data, title) {
    upset_plot <- upset(
        upset_data,
        allspecies,
        queries = list(
            upset_query(
                intersect = c('gregaria', 'cancellata'),
                color = 'orange',
                fill = 'orange',
                only_components = c('intersections_matrix', 'Intersection size')
            ),
            upset_query(
                intersect = c('gregaria', 'piceifrons'),
                color = 'orange',
                fill = 'orange',
                only_components = c('intersections_matrix', 'Intersection size')
            ),
            upset_query(
                intersect = c('cancellata', 'piceifrons'),
                color = 'orange',
                fill = 'orange',
                only_components = c('intersections_matrix', 'Intersection size')
            ),
            upset_query(
                intersect = c('gregaria', 'piceifrons', 'cancellata'),
                color = 'darkred',
                fill = 'darkred',
                only_components = c('intersections_matrix', 'Intersection size')
            ),
            upset_query(
                intersect = c('gregaria', 'piceifrons', 'cancellata', 'americana'),
                color = 'purple',
                fill = 'purple',
                only_components = c('intersections_matrix', 'Intersection size')
            ),
            upset_query(set = 'gregaria', fill = 'darkred'),
            upset_query(set = 'piceifrons', fill = 'darkred'),
            upset_query(set = 'cancellata', fill = 'darkred'),
            upset_query(set = 'americana', fill = 'black'),
            upset_query(set = 'cubense', fill = 'black'),
            upset_query(set = 'nitens', fill = 'black')
        ),
        sort_sets = FALSE,
        base_annotations = list(
            'Intersection size' = intersection_size(counts = FALSE) + 
                ylab('# DEGs in intersection') + 
                scale_y_continuous(expand = expansion(mult = c(0, 0.05)))
        ),
        matrix = intersection_matrix(
            geom = geom_point(
                shape = 'circle',
                size = 4
            ),
            segment = geom_segment(
                linetype = 'solid',
                size = 3
            ),
            outline_color = list(
                active = 'black',
                inactive = 'grey80'
            )
        ) +
        theme(
            axis.text.x = element_text(face = "bold", size = 12, angle = 45, hjust = 1, color = "darkblue"),
            axis.text.y = element_text(face = "italic", size = 12, color = "darkred"),
            axis.ticks.length = unit(0.25, "cm")
        ),
        set_sizes = upset_set_size(
            geom = geom_bar(width = 0.8),
            position = 'right'
        ) + 
        ylab('# DEGs per species') + 
        theme(
            axis.line.x = element_line(colour = 'black'),
            axis.ticks.x = element_line()
        ),
        stripes = upset_stripes(
            geom = geom_segment(size = 12),
            colors = c('grey95', 'white')
        )
    ) +
    theme_minimal() +
    theme(
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        axis.line = element_line(colour = 'black'),
        text = element_text(size = 14),
        axis.text.x = element_text(face = "italic"),
        plot.title = element_text(hjust = 0.5, face = "bold", size = 16)
    ) +
    ggtitle(title)

    return(upset_plot)
}


upset_head <- display_upset_plot(upset_data_head, "Intersection from Head")
ggvenn_head_all; print(upset_head)

upset_thorax <- display_upset_plot(upset_data_thorax, "Intersection from Thorax")
ggvenn_thorax_all; print(upset_thorax)
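The UpSet plots are drawn from a binary membership table with one row per gene and one 0/1 column per species. A minimal sketch of that construction on hypothetical gene sets follows; it also shows that a species with no entry in the list (here `nitens`) still gets a column, filled with zeros.

```r
# Hypothetical DEG sets; nitens deliberately has no entry
toy_sets <- list(
  gregaria   = c("g1", "g2"),
  piceifrons = c("g2", "g3")
)
species <- c("gregaria", "piceifrons", "nitens")

# One row per unique gene across all sets
all_genes <- unique(unlist(toy_sets))
membership <- data.frame(GeneID = all_genes, stringsAsFactors = FALSE)

# 1 if the gene is a DEG in that species, 0 otherwise;
# toy_sets[["nitens"]] is NULL, and x %in% NULL is FALSE for every x
for (sp in species) {
  membership[[sp]] <- as.integer(all_genes %in% toy_sets[[sp]])
}
print(membership)
```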

# Function to extract GeneIDs for a specific intersection
extract_geneids_from_intersection <- function(upset_data, selected_species) {
    # Ensure the input species exist in the dataset
    selected_species <- intersect(selected_species, colnames(upset_data))
    
    # Select rows where all selected species have '1' (present in the intersection)
    intersecting_genes <- upset_data[rowSums(upset_data[selected_species]) == length(selected_species), ]
    
    # Return only the GeneIDs as a DataFrame
    return(data.frame(GeneID = intersecting_genes$GeneID))
}

# **Shared GeneIDs among Gregaria, Cancellata, and Piceifrons (Head)**
shared_geneids_head <- extract_geneids_from_intersection(upset_data_head, c("gregaria", "cancellata", "piceifrons"))

kable(shared_geneids_head, col.names = c("Head: shared genes among all locusts")) %>%
    kable_styling(bootstrap_options = c("striped", "hover"))
Head: shared genes among all locusts
LOC126334877
LOC126335646
LOC126282005
LOC126335148
LOC126353995
LOC126269449
LOC126268224
LOC126272473
LOC126271867
LOC126282249
LOC126282270
LOC126284250
LOC126293182
LOC126274577
LOC126336183
LOC126353962
LOC126354941
LOC126266785
LOC126268096
LOC126273126
LOC126272290
LOC126272573
LOC126272572
LOC126272125
LOC126281584
LOC126281827
LOC126284303
LOC126291853
LOC126292013
# **Shared GeneIDs among Gregaria, Cancellata, Piceifrons, and Americana (Head)**
shared_geneids_head_americana <- extract_geneids_from_intersection(upset_data_head, c("gregaria", "cancellata", "piceifrons", "americana"))

kable(shared_geneids_head_americana, col.names = c("Head: shared genes among all locusts + americana")) %>%
    kable_styling(bootstrap_options = c("striped", "hover"))
Head: shared genes among all locusts + americana
LOC126353995
LOC126269449
LOC126268224
LOC126272473
LOC126271867
LOC126282249
LOC126282270
LOC126284250
LOC126293182
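
The two head tables above are inclusive intersections: extract_geneids_from_intersection() keeps every gene flagged in all selected species, whatever its status in the remaining species, which is why the nine genes shared with americana also appear in the 29-gene locust list. UpSet bars usually count exclusive intersections (we believe this is the ComplexUpset default), so table lengths need not match bar heights. Below is a minimal sketch of an exclusive counterpart on a toy table; `extract_exclusive_geneids()` and the toy data are hypothetical and not part of the original analysis.

```r
# Toy presence/absence table (invented data; same layout assumed as upset_data_head)
toy_upset <- data.frame(
  GeneID     = c("g1", "g2", "g3", "g4"),
  gregaria   = c(1, 1, 1, 0),
  cancellata = c(1, 1, 0, 1),
  piceifrons = c(1, 1, 1, 1),
  americana  = c(1, 0, 0, 0),
  stringsAsFactors = FALSE
)

# Hypothetical helper: genes present in all `selected_species`
# and absent from every other species column
extract_exclusive_geneids <- function(upset_data, selected_species, id_col = "GeneID") {
  missing_species <- setdiff(selected_species, colnames(upset_data))
  if (length(missing_species) > 0) {
    stop("Species not found in table: ", paste(missing_species, collapse = ", "))
  }
  other_species <- setdiff(colnames(upset_data), c(id_col, selected_species))
  in_all   <- rowSums(upset_data[selected_species]) == length(selected_species)
  in_other <- rowSums(upset_data[other_species]) > 0
  data.frame(GeneID = upset_data[[id_col]][in_all & !in_other], stringsAsFactors = FALSE)
}

locusts <- c("gregaria", "cancellata", "piceifrons")
exclusive_locusts <- extract_exclusive_geneids(toy_upset, locusts)
exclusive_locusts$GeneID  # "g2" only: g1 is also present in americana
```

With the inclusive function used in this section, the same toy call would return both g1 and g2.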
# **Shared GeneIDs among Gregaria, Cancellata, and Piceifrons (Thorax)**
shared_geneids_thorax <- extract_geneids_from_intersection(upset_data_thorax, c("gregaria", "cancellata", "piceifrons"))

kable(shared_geneids_thorax, col.names = c("Thorax: shared genes among all locusts")) %>%
    kable_styling(bootstrap_options = c("striped", "hover"))
Thorax: shared genes among all locusts
LOC126284484
LOC126284981
LOC126334921
LOC126345116
LOC126336408
LOC126336415
LOC126353822
LOC126355556
LOC126355925
LOC126267550
LOC126272787
LOC126272949
LOC126273038
LOC126282270
LOC126284671
LOC126284704
LOC126291753
LOC126298650
LOC126280525
LOC126334877
LOC126335646
LOC126337060
LOC126334545
LOC126335148
LOC126335450
LOC126353962
LOC126267274
LOC126273126
LOC126273132
LOC126273129
LOC126271867
LOC126272290
LOC126272573
LOC126277894
LOC126282005
LOC126281827
LOC126284240
LOC126291616
LOC126293406
LOC126293648
LOC126294994
# **Shared GeneIDs among Gregaria, Cancellata, Piceifrons, and Americana (Thorax)**
shared_geneids_thorax_americana <- extract_geneids_from_intersection(upset_data_thorax, c("gregaria", "cancellata", "piceifrons", "americana"))

kable(shared_geneids_thorax_americana, col.names = c("Thorax: shared genes among all locusts + americana")) %>%
    kable_styling(bootstrap_options = c("striped", "hover"))
Thorax: shared genes among all locusts + americana
LOC126284981
LOC126334921
LOC126345116
LOC126336408
LOC126336415
LOC126353822
LOC126355556
LOC126355925
LOC126267550
LOC126272787
LOC126272949
LOC126273038
LOC126282270
LOC126284671
LOC126284704
LOC126291753
LOC126298650
# **Shared GeneIDs among Gregaria and Piceifrons (Head)**
shared_geneids_head_piceifrons_gregaria <- extract_geneids_from_intersection(upset_data_head, c("gregaria", "piceifrons"))

kable(shared_geneids_head_piceifrons_gregaria, col.names = c("Head: shared genes between gregaria & piceifrons")) %>%
    kable_styling(bootstrap_options = c("striped", "hover"))
Head: shared genes between gregaria & piceifrons
LOC126312183
LOC126334877
LOC126335646
LOC126356328
LOC126282005
LOC126281697
LOC126291991
LOC126298049
LOC126346425
LOC126335148
LOC126353995
LOC126269449
LOC126336415
LOC126334614
LOC126335450
LOC126335543
LOC126354368
LOC126268224
LOC126272473
LOC126271867
LOC126271906
LOC126278164
LOC126278162
LOC126282249
LOC126282250
LOC126282270
LOC126284250
LOC126293182
LOC126325647
LOC126360279
LOC126274561
LOC126274577
LOC126274545
LOC126281377
LOC126335605
LOC126336183
LOC126332220
LOC126335083
LOC126353961
LOC126353962
LOC126354707
LOC126354240
LOC126355817
LOC126354934
LOC126354935
LOC126354941
LOC126365802
LOC126266785
LOC126268096
LOC126267730
LOC126268046
LOC126273126
LOC126272290
LOC126272573
LOC126272572
LOC126271905
LOC126272598
LOC126272125
LOC126281584
LOC126282247
LOC126281222
LOC126281273
LOC126281827
LOC126282035
LOC126284303
LOC126284500
LOC126291853
LOC126292013
LOC126295132
LOC126295147
# **Shared GeneIDs among Piceifrons and Cancellata (Head)**
shared_geneids_head_piceifrons_cancellata <- extract_geneids_from_intersection(upset_data_head, c("piceifrons", "cancellata"))

kable(shared_geneids_head_piceifrons_cancellata, col.names = c("Head: shared genes between piceifrons & cancellata")) %>%
    kable_styling(bootstrap_options = c("striped", "hover"))
Head: shared genes between piceifrons & cancellata
LOC126334877
LOC126335646
LOC126335513
LOC126354343
LOC126267269
LOC126272571
LOC126272901
LOC126282005
LOC126297489
LOC126336417
LOC126334880
LOC126335148
LOC126272684
LOC126334921
LOC126349427
LOC126353995
LOC126269449
LOC126283252
LOC126334992
LOC126354154
LOC126355490
LOC126355258
LOC126355925
LOC126268224
LOC126266629
LOC126267952
LOC126272473
LOC126272886
LOC126273222
LOC126271867
LOC126272311
LOC126272289
LOC126282249
LOC126282270
LOC126285392
LOC126284250
LOC126284858
LOC126293182
LOC126293510
LOC126295330
LOC126298855
LOC126268233
LOC126274577
LOC126336491
LOC126336183
LOC126354441
LOC126354172
LOC126353959
LOC126353962
LOC126354941
LOC126355774
LOC126268219
LOC126267280
LOC126365811
LOC126266785
LOC126267420
LOC126365781
LOC126268096
LOC126365968
LOC126273126
LOC126272290
LOC126272573
LOC126272572
LOC126272125
LOC126278646
LOC126278158
LOC126282076
LOC126281584
LOC126281687
LOC126281827
LOC126282264
LOC126284446
LOC126284704
LOC126284303
LOC126284428
LOC126284300
LOC126291853
LOC126292013
LOC126294994
LOC126295256
LOC126299329
LOC126302631
# **Shared GeneIDs among Cancellata and Gregaria (Head)**
shared_geneids_head_cancellata_gregaria <- extract_geneids_from_intersection(upset_data_head, c("cancellata", "gregaria"))

kable(shared_geneids_head_cancellata_gregaria, col.names = c("Head: shared genes between cancellata & gregaria")) %>%
    kable_styling(bootstrap_options = c("striped", "hover"))
Head: shared genes between cancellata & gregaria
LOC126334876
LOC126334878
LOC126334877
LOC126335646
LOC126332740
LOC126282005
LOC126284695
LOC126295127
LOC126294990
LOC126297738
LOC126335148
LOC126353995
LOC126269449
LOC126279173
LOC126336408
LOC126336602
LOC126336884
LOC126325085
LOC126336469
LOC126353822
LOC126356046
LOC126356355
LOC126356365
LOC126355429
LOC126355428
LOC126355446
LOC126355498
LOC126355525
LOC126355499
LOC126355503
LOC126355507
LOC126355515
LOC126355693
LOC126355555
LOC126354383
LOC126355587
LOC126268224
LOC126267148
LOC126365815
LOC126267904
LOC126272473
LOC126272787
LOC126271867
LOC126271903
LOC126273012
LOC126271954
LOC126282249
LOC126282270
LOC126281236
LOC126284276
LOC126284565
LOC126284785
LOC126284250
LOC126284312
LOC126291970
LOC126291614
LOC126293182
LOC126298817
LOC126298478
LOC126299078
LOC126274577
LOC126336183
LOC126353962
LOC126354941
LOC126266785
LOC126268096
LOC126273126
LOC126272290
LOC126272573
LOC126272572
LOC126272125
LOC126281584
LOC126281827
LOC126284303
LOC126291853
LOC126292013
LOC126320407
LOC126343123
LOC126324532
LOC126336760
LOC126352291
LOC126357728
LOC126362843
LOC126280573
LOC126283162
LOC126335599
LOC126336443
LOC126336492
LOC126336724
LOC126336739
LOC126336117
LOC126336874
LOC126337054
LOC126335945
LOC126334845
LOC126334987
LOC126356330
LOC126356349
LOC126356348
LOC126354549
LOC126355134
LOC126354736
LOC126354973
LOC126355686
LOC126355447
LOC126355527
LOC126355700
LOC126355841
LOC126355869
LOC126267971
LOC126267274
LOC126267143
LOC126267358
LOC126365834
LOC126267838
LOC126271849
LOC126272860
LOC126272341
LOC126272541
LOC126273131
LOC126273129
LOC126272397
LOC126278589
LOC126277894
LOC126282251
LOC126284968
LOC126285109
LOC126284086
LOC126284087
LOC126284577
LOC126284694
LOC126285154
LOC126284936
LOC126291806
LOC126292117
LOC126291956
LOC126291615
LOC126292016
LOC126293406
LOC126293175
LOC126295317
LOC126298645
# **Shared GeneIDs among Gregaria and Piceifrons (Thorax)**
shared_geneids_thorax_piceifrons_gregaria <- extract_geneids_from_intersection(upset_data_thorax, c("gregaria", "piceifrons"))

kable(shared_geneids_thorax_piceifrons_gregaria, col.names = c("Thorax: shared genes between gregaria & piceifrons")) %>%
    kable_styling(bootstrap_options = c("striped", "hover"))
Thorax: shared genes between gregaria & piceifrons
LOC126276951
LOC126334681
LOC126354552
LOC126284530
LOC126284484
LOC126284822
LOC126292150
LOC126298400
LOC126298731
LOC126284981
LOC126334921
LOC126345116
LOC126269449
LOC126336408
LOC126336415
LOC126334760
LOC126334899
LOC126335140
LOC126353822
LOC126355645
LOC126355556
LOC126355925
LOC126267550
LOC126272143
LOC126272895
LOC126272787
LOC126272949
LOC126273038
LOC126272301
LOC126278623
LOC126282249
LOC126282270
LOC126284671
LOC126284704
LOC126291753
LOC126299485
LOC126298650
LOC126299510
LOC126338779
LOC126339755
LOC126344463
LOC126354620
LOC126356143
LOC126365808
LOC126362949
LOC126280525
LOC126281377
LOC126334877
LOC126335646
LOC126336425
LOC126335664
LOC126336534
LOC126335750
LOC126337060
LOC126337064
LOC126330717
LOC126332617
LOC126334545
LOC126334648
LOC126334817
LOC126335117
LOC126335147
LOC126335148
LOC126335335
LOC126335365
LOC126335450
LOC126335483
LOC126354632
LOC126354459
LOC126354547
LOC126354550
LOC126353960
LOC126353959
LOC126353961
LOC126353962
LOC126355943
LOC126355134
LOC126354934
LOC126354924
LOC126354936
LOC126354383
LOC126355818
LOC126267274
LOC126267355
LOC126272026
LOC126272338
LOC126273222
LOC126273028
LOC126273056
LOC126273126
LOC126273131
LOC126273132
LOC126273129
LOC126273136
LOC126271867
LOC126272290
LOC126272573
LOC126272572
LOC126273368
LOC126278538
LOC126277894
LOC126277903
LOC126278164
LOC126278162
LOC126282005
LOC126282181
LOC126282250
LOC126281273
LOC126281288
LOC126281827
LOC126281236
LOC126282035
LOC126284626
LOC126284656
LOC126284729
LOC126284418
LOC126284240
LOC126284295
LOC126285233
LOC126291424
LOC126292127
LOC126291853
LOC126291924
LOC126291955
LOC126291949
LOC126291616
LOC126292013
LOC126293406
LOC126293423
LOC126293648
LOC126295242
LOC126294962
LOC126295382
LOC126294994
LOC126295494
LOC126298698
LOC126298883
LOC126299133
LOC126299274
LOC126302901
# **Shared GeneIDs among Piceifrons and Cancellata (Thorax)**
shared_geneids_thorax_piceifrons_cancellata <- extract_geneids_from_intersection(upset_data_thorax, c("piceifrons", "cancellata"))

kable(shared_geneids_thorax_piceifrons_cancellata, col.names = c("Thorax: shared genes between piceifrons & cancellata")) %>%
    kable_styling(bootstrap_options = c("striped", "hover"))
Thorax: shared genes between piceifrons & cancellata
LOC126272147
LOC126284484
LOC126333298
LOC126284981
LOC126334921
LOC126341450
LOC126345116
LOC126336408
LOC126336415
LOC126336417
LOC126334873
LOC126334880
LOC126334879
LOC126335724
LOC126336938
LOC126334985
LOC126335202
LOC126336094
LOC126355128
LOC126353822
LOC126354941
LOC126355556
LOC126355798
LOC126355925
LOC126267550
LOC126272787
LOC126272571
LOC126272949
LOC126273038
LOC126278310
LOC126282270
LOC126284671
LOC126284704
LOC126284585
LOC126284335
LOC126285253
LOC126285254
LOC126284105
LOC126284316
LOC126291753
LOC126298650
LOC126299195
LOC126349798
LOC126350814
LOC126354677
LOC126354626
LOC126355911
LOC126358175
LOC126361126
LOC126269384
LOC126273067
LOC126280525
LOC126280638
LOC126281350
LOC126336229
LOC126334877
LOC126335646
LOC126334874
LOC126336703
LOC126336144
LOC126337060
LOC126337069
LOC126328190
LOC126334543
LOC126334545
LOC126336332
LOC126334806
LOC126334951
LOC126335148
LOC126335450
LOC126355277
LOC126356064
LOC126356230
LOC126354789
LOC126353962
LOC126354818
LOC126355499
LOC126355880
LOC126268224
LOC126267280
LOC126267259
LOC126267274
LOC126365781
LOC126267269
LOC126266988
LOC126266725
LOC126365969
LOC126365968
LOC126267486
LOC126267904
LOC126272831
LOC126272474
LOC126273126
LOC126273132
LOC126273129
LOC126273152
LOC126271867
LOC126272290
LOC126272573
LOC126272884
LOC126278644
LOC126277894
LOC126278026
LOC126278122
LOC126282005
LOC126281827
LOC126281298
LOC126282264
LOC126284428
LOC126284235
LOC126284240
LOC126284300
LOC126284552
LOC126284306
LOC126284858
LOC126291788
LOC126291616
LOC126293173
LOC126293406
LOC126293648
LOC126294994
LOC126295256
LOC126299011
LOC126301328
LOC126302695
LOC126306484
LOC126313207
# **Shared GeneIDs among Cancellata and Gregaria (Thorax)**
shared_geneids_thorax_cancellata_gregaria <- extract_geneids_from_intersection(upset_data_thorax, c("cancellata", "gregaria"))

kable(shared_geneids_thorax_cancellata_gregaria, col.names = c("Thorax: shared genes between cancellata & gregaria")) %>%
    kable_styling(bootstrap_options = c("striped", "hover"))
Thorax: shared genes between cancellata & gregaria
LOC126268104
LOC126324148
LOC126284484
LOC126284981
LOC126334921
LOC126335260
LOC126345116
LOC126349427
LOC126352043
LOC126353995
LOC126361237
LOC126279939
LOC126336408
LOC126336415
LOC126336533
LOC126335770
LOC126323429
LOC126325085
LOC126336250
LOC126334708
LOC126334803
LOC126334801
LOC126334853
LOC126335513
LOC126354154
LOC126353822
LOC126354916
LOC126356343
LOC126355490
LOC126353831
LOC126354413
LOC126355446
LOC126355503
LOC126355515
LOC126355694
LOC126355556
LOC126355587
LOC126355925
LOC126267550
LOC126272787
LOC126273138
LOC126272949
LOC126272395
LOC126273038
LOC126272905
LOC126282270
LOC126283952
LOC126283983
LOC126284671
LOC126284704
LOC126284089
LOC126284421
LOC126284312
LOC126291753
LOC126291970
LOC126295317
LOC126295253
LOC126299147
LOC126299067
LOC126298817
LOC126298478
LOC126297427
LOC126298810
LOC126298645
LOC126298650
LOC126298738
LOC126297876
LOC126280525
LOC126334877
LOC126335646
LOC126337060
LOC126334545
LOC126335148
LOC126335450
LOC126353962
LOC126267274
LOC126273126
LOC126273132
LOC126273129
LOC126271867
LOC126272290
LOC126272573
LOC126277894
LOC126282005
LOC126281827
LOC126284240
LOC126291616
LOC126293406
LOC126293648
LOC126294994
LOC126335304
LOC126339830
LOC126350536
LOC126350552
LOC126356526
LOC126267256
LOC126274561
LOC126274577
LOC126274545
LOC126316749
LOC126284647
LOC126334878
LOC126336602
LOC126335616
LOC126336724
LOC126337054
LOC126320451
LOC126334795
LOC126334802
LOC126334805
LOC126335943
LOC126335945
LOC126334988
LOC126335014
LOC126335190
LOC126335449
LOC126355251
LOC126356355
LOC126354474
LOC126354147
LOC126355447
LOC126355507
LOC126355555
LOC126355700
LOC126355890
LOC126267553
LOC126266935
LOC126267473
LOC126365834
LOC126365944
LOC126271849
LOC126271851
LOC126271850
LOC126272524
LOC126272823
LOC126273066
LOC126271949
LOC126278180
LOC126278068
LOC126277975
LOC126282077
LOC126282251
LOC126281687
LOC126281237
LOC126282091
LOC126281300
LOC126284968
LOC126284728
LOC126284977
LOC126284438
LOC126284087
LOC126284565
LOC126285309
LOC126291956
LOC126292027
LOC126293603
LOC126293564
LOC126295127
LOC126294990
LOC126298832
LOC126299078
LOC126298614

STRATEGY 2: Own RefSeq genome

The difference from STRATEGY 1 is that each species is now analysed against its own RefSeq genome, so gene IDs are not directly comparable. To match genes across species we have to use orthologs (see section Orthofinder).

We load the annotated orthogroup table produced by our previous conversion.

# Path for all species
ortho_dir <- "data/orthofinder/Polyneoptera/Results_I2_iqtree"
input_file <- file.path(ortho_dir, "Orthogroups_genesproteinbiotype_13species_annotated_May2025.csv")
filtered_final_orthotable <- read.csv(input_file, header = TRUE, stringsAsFactors = FALSE)
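
Because gene IDs differ between the species' own annotations, DEG lists can only be compared once each gene has been translated to its orthogroup. Below is a minimal sketch of that translation on toy data. The column names `Orthogroup`, `Species` and `GeneID`, and all values, are assumptions for illustration and may not match the layout of filtered_final_orthotable.

```r
# Toy long-format ortholog table (hypothetical columns and IDs)
toy_ortho <- data.frame(
  Orthogroup = c("OG1", "OG1", "OG2", "OG2", "OG3"),
  Species    = c("gregaria", "piceifrons", "gregaria", "piceifrons", "gregaria"),
  GeneID     = c("gA1", "pA1", "gB1", "pB1", "gC1"),
  stringsAsFactors = FALSE
)

# Toy DEG lists: one vector of species-specific gene IDs per species
toy_degs <- list(
  gregaria   = c("gA1", "gC1"),
  piceifrons = c("pA1", "pB1")
)

# Translate each species' DEGs into the set of orthogroups they belong to
deg_orthogroups <- lapply(names(toy_degs), function(sp) {
  hits <- toy_ortho$Species == sp & toy_ortho$GeneID %in% toy_degs[[sp]]
  unique(toy_ortho$Orthogroup[hits])
})
names(deg_orthogroups) <- names(toy_degs)

# Orthogroups that contain a DEG in every species
shared_orthogroups <- Reduce(intersect, deg_orthogroups)
shared_orthogroups  # "OG1"
```

Orthogroups that hold several paralogs per species make this a many-to-many mapping, so the number of shared orthogroups and the number of shared genes can differ.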

1. DEGs comparison among species

We summarized the number of genes differentially expressed between density conditions for each species and each tissue.

library(data.table)
library(dplyr)

# Initialize empty lists
summary_list_head <- list()
summary_list_thorax <- list()
gene_ids_list <- list()

# Loop through each species
for (species in species_list) {
  message("Processing: ", species)
  
  # Read files
  head_results_file <- file.path(workDir, "DEG_results/Bulk_RNAseq/", paste0(species, "/Head/DESeq2_sigresults_sva_Head_", species ,".csv"))
  thorax_results_file <- file.path(workDir, "DEG_results/Bulk_RNAseq/", paste0(species, "/Thorax/DESeq2_sigresults_sva_Thorax_", species,".csv"))
  
  head_sigresults <- fread(head_results_file)
  thorax_sigresults <- fread(thorax_results_file)
  
  # If a results file has no rows, use an empty table with the expected columns
  if (nrow(head_sigresults) == 0) head_sigresults <- data.table(log2FoldChange = numeric(0), padj = numeric(0), GeneID = character(0))
  if (nrow(thorax_sigresults) == 0) thorax_sigresults <- data.table(log2FoldChange = numeric(0), padj = numeric(0), GeneID = character(0))
  
  # Count summary (strict LFC)
  summary_list_head[[species]] <- data.frame(
    Species = species,
    Head_Upregulated_Strict = sum(head_sigresults$log2FoldChange > 1, na.rm = TRUE),
    Head_Downregulated_Strict = sum(head_sigresults$log2FoldChange < -1, na.rm = TRUE)
  )
  
  summary_list_thorax[[species]] <- data.frame(
    Species = species,
    Thorax_Upregulated_Strict = sum(thorax_sigresults$log2FoldChange > 1, na.rm = TRUE),
    Thorax_Downregulated_Strict = sum(thorax_sigresults$log2FoldChange < -1, na.rm = TRUE)
  )
  
  # Extract GeneIDs with filtering (padj + lfc)
  head_up_strict_ids <- if (all(c("log2FoldChange", "padj", "GeneID") %in% colnames(head_sigresults))) {
    head_sigresults %>% filter(log2FoldChange > 1, padj < 0.05) %>% pull(GeneID)
  } else character(0)
  
  head_down_strict_ids <- if (all(c("log2FoldChange", "padj", "GeneID") %in% colnames(head_sigresults))) {
    head_sigresults %>% filter(log2FoldChange < -1, padj < 0.05) %>% pull(GeneID)
  } else character(0)
  
  thorax_up_strict_ids <- if (all(c("log2FoldChange", "padj", "GeneID") %in% colnames(thorax_sigresults))) {
    thorax_sigresults %>% filter(log2FoldChange > 1, padj < 0.05) %>% pull(GeneID)
  } else character(0)
  
  thorax_down_strict_ids <- if (all(c("log2FoldChange", "padj", "GeneID") %in% colnames(thorax_sigresults))) {
    thorax_sigresults %>% filter(log2FoldChange < -1, padj < 0.05) %>% pull(GeneID)
  } else character(0)
  
  # Store gene IDs
  gene_ids_list[[species]] <- list(
    Head_Upregulated_Strict = head_up_strict_ids,
    Head_Downregulated_Strict = head_down_strict_ids,
    Thorax_Upregulated_Strict = thorax_up_strict_ids,
    Thorax_Downregulated_Strict = thorax_down_strict_ids
  )
}

# Combine tables
summary_table_head <- bind_rows(summary_list_head)
summary_table_thorax <- bind_rows(summary_list_thorax)


# Print the summary table in a markdown-friendly format
knitr::kable(summary_table_head, format = "markdown", caption = "Summary of differentially expressed genes in head per species")
Summary of differentially expressed genes in head per species
Species Head_Upregulated_Strict Head_Downregulated_Strict
gregaria 397 327
piceifrons 245 210
cancellata 292 387
americana 322 313
cubense 24 31
nitens 104 234
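
The four filtering blocks in the loop above differ only in tissue and direction, so they could be folded into a single helper. Below is a minimal sketch, assuming the same `log2FoldChange`, `padj` and `GeneID` columns. `get_strict_ids()` is a hypothetical name and the toy table is invented for illustration. Counting the length of its output would also apply the `padj` filter to the summary counts, which the loop currently applies only to the gene lists.

```r
# Hypothetical helper: strict DEG IDs in one direction ("up" or "down")
get_strict_ids <- function(res, direction = c("up", "down"),
                           lfc_cutoff = 1, padj_cutoff = 0.05) {
  direction <- match.arg(direction)
  needed <- c("log2FoldChange", "padj", "GeneID")
  if (!all(needed %in% colnames(res))) return(character(0))
  res <- as.data.frame(res)

  # Drop rows with missing statistics, then apply both cutoffs
  complete_rows <- !is.na(res$padj) & !is.na(res$log2FoldChange)
  passes_lfc <- if (direction == "up") {
    res$log2FoldChange > lfc_cutoff
  } else {
    res$log2FoldChange < -lfc_cutoff
  }
  keep <- complete_rows & passes_lfc & res$padj < padj_cutoff
  as.character(res$GeneID[which(keep)])
}

# Toy DESeq2-like results (invented values)
toy_res <- data.frame(
  GeneID         = c("a", "b", "c", "d"),
  log2FoldChange = c(2.5, -1.8, 0.4, 3.0),
  padj           = c(0.01, 0.001, 0.02, 0.20),
  stringsAsFactors = FALSE
)

up_ids   <- get_strict_ids(toy_res, "up")    # "a" (d fails the padj cutoff)
down_ids <- get_strict_ids(toy_res, "down")  # "b"
n_empty  <- length(get_strict_ids(toy_res[0, ], "up"))  # 0 for an empty table
```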
# Convert the summary table to a long format for easier plotting
summary_long_head <- summary_table_head %>%
  pivot_longer(cols = c(Head_Upregulated_Strict, Head_Downregulated_Strict),
               names_to = "Tissue", values_to = "Count")

# Adjust the values for downregulated genes to be negative
summary_long_head <- summary_long_head %>%
  mutate(Count = ifelse(Tissue == "Head_Downregulated_Strict", -Count, Count))

summary_long_head$Species <- factor(summary_long_head$Species, levels = species_order)

# Plot barplot for head
ggplot(summary_long_head, aes(x = Species, y = Count, fill = Tissue)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(title = "Upregulated and Downregulated Genes in Head (absolute lfc >1)",
       x = "Species", y = "Number of Genes") +
  scale_fill_manual(values = c("Head_Upregulated_Strict" = "red2", "Head_Downregulated_Strict" = "blue")) +
  scale_y_continuous(labels = function(x) ifelse(x < 0, -x, x), limits = c(-1200, 1200)) +
  theme_minimal(base_size = 12) +
  theme(legend.position = "top", 
        plot.title = element_text(hjust = 0.5, size = 14, face = "bold"), 
        axis.text.x = element_text(size = 12, angle = 45, hjust = 1)) +
  coord_flip()

Version Author Date
4e391c3 Maeva TECHER 2025-05-30
b540a1e Maeva TECHER 2025-02-27
d7fa779 Maeva TECHER 2025-02-14
aab712a Maeva TECHER 2025-02-04
faf2db3 Maeva TECHER 2025-01-13
fe6dae9 Maeva TECHER 2024-11-19
# Print the summary table for thorax
knitr::kable(summary_table_thorax, format = "markdown", caption = "Summary of differentially expressed genes in thorax per species")
Summary of differentially expressed genes in thorax per species
Species Thorax_Upregulated_Strict Thorax_Downregulated_Strict
gregaria 463 620
piceifrons 556 256
cancellata 270 321
americana 155 357
cubense 44 148
nitens 0 0
# Convert the summary table to a long format for thorax
summary_long_thorax <- summary_table_thorax %>%
  pivot_longer(cols = c(Thorax_Upregulated_Strict, Thorax_Downregulated_Strict),
               names_to = "Tissue", values_to = "Count")

# Adjust the values for downregulated genes to be negative
summary_long_thorax <- summary_long_thorax %>%
  mutate(Count = ifelse(Tissue == "Thorax_Downregulated_Strict", -Count, Count))

summary_long_thorax$Species <- factor(summary_long_thorax$Species, levels = species_order)

# Plot barplot for thorax
ggplot(summary_long_thorax, aes(x = Species, y = Count, fill = Tissue)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(title = "Upregulated and Downregulated Genes in Thorax (absolute lfc >1)",
       x = "Species", y = "Number of Genes") +
  scale_fill_manual(values = c("Thorax_Upregulated_Strict" = "red2", "Thorax_Downregulated_Strict" = "blue")) +
  scale_y_continuous(labels = function(x) ifelse(x < 0, -x, x), limits = c(-1200, 1200)) +
  theme_minimal(base_size = 12) +
  theme(legend.position = "top", 
        plot.title = element_text(hjust = 0.5, size = 14, face = "bold"), 
        axis.text.x = element_text(size = 12, angle = 45, hjust = 1)) +
  coord_flip()

Version Author Date
4e391c3 Maeva TECHER 2025-05-30
b540a1e Maeva TECHER 2025-02-27
89984c0 Maeva TECHER 2025-02-19
d7fa779 Maeva TECHER 2025-02-14
aab712a Maeva TECHER 2025-02-04
faf2db3 Maeva TECHER 2025-01-13
fe6dae9 Maeva TECHER 2024-11-19
library(tidyverse)

# Initialize list for summary data
summary_overlap_detailed <- list()

# Loop through species to compute overlap categories
for (species in names(gene_ids_list)) {
    degs <- gene_ids_list[[species]]
    
    # Use `intersect` and `setdiff` to split into exclusive/shared categories
    up_shared <- intersect(degs$Head_Upregulated_Strict, degs$Thorax_Upregulated_Strict)
    down_shared <- intersect(degs$Head_Downregulated_Strict, degs$Thorax_Downregulated_Strict)
    
    up_head_only <- setdiff(degs$Head_Upregulated_Strict, degs$Thorax_Upregulated_Strict)
    up_thorax_only <- setdiff(degs$Thorax_Upregulated_Strict, degs$Head_Upregulated_Strict)
    
    down_head_only <- setdiff(degs$Head_Downregulated_Strict, degs$Thorax_Downregulated_Strict)
    down_thorax_only <- setdiff(degs$Thorax_Downregulated_Strict, degs$Head_Downregulated_Strict)
    
    # Store results in a data frame
    df <- tibble(
        Species = species,
        Group = c("Up_HeadOnly", "Up_ThoraxOnly", "Up_Shared", 
                  "Down_HeadOnly", "Down_ThoraxOnly", "Down_Shared"),
        Count = c(
            length(up_head_only),
            length(up_thorax_only),
            length(up_shared),
            -length(down_head_only),
            -length(down_thorax_only),
            -length(down_shared)
        )
    )
    
    summary_overlap_detailed[[species]] <- df
}

# Combine all into one data frame
summary_overlap_df <- bind_rows(summary_overlap_detailed)

# Add factors for plotting
summary_overlap_df <- summary_overlap_df %>%
    mutate(
        Direction = ifelse(str_detect(Group, "^Down"), "Downregulated", "Upregulated"),
        Tissue = case_when(
            str_detect(Group, "HeadOnly") ~ "Head only",
            str_detect(Group, "ThoraxOnly") ~ "Thorax only",
            str_detect(Group, "Shared") ~ "Shared"
        ),
        Tissue = factor(Tissue, levels = c("Head only", "Thorax only", "Shared")),
        Species = factor(Species, levels = species_order)  # set species_order beforehand
    )

# Define custom fill colors
custom_fill <- c("Head only" = "black", "Thorax only" = "gray", "Shared" = "purple")

# Plot
ggplot(summary_overlap_df, aes(x = Species, y = Count, fill = Tissue)) +
    geom_bar(stat = "identity", position = "stack", width = 0.7) +
    scale_fill_manual(values = custom_fill) +
    scale_y_continuous(labels = abs, limits = c(-1200, 1200)) +
    labs(title = "Strict DEGs (|LFC| > 1) by Tissue-Specificity per Species",
         y = "Number of Genes", x = "Species", fill = "Tissue origin") +
    theme_minimal(base_size = 12) +
    theme(
        legend.position = "top",
        plot.title = element_text(hjust = 0.5, face = "bold", size = 14),
        axis.text.x = element_text(size = 12, angle = 45, hjust = 1)
    ) +
    coord_flip()

Version Author Date
4e391c3 Maeva TECHER 2025-05-30
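
The intersect()/setdiff() split above partitions each tissue's DEG list into disjoint pieces, so the pieces should add back up to the per-tissue totals (provided the ID vectors contain no duplicates). Below is a quick check on toy vectors; the gene IDs are invented for illustration.

```r
# Toy upregulated gene sets for one species (invented IDs)
head_up   <- c("g1", "g2", "g3")
thorax_up <- c("g2", "g3", "g4", "g5")

up_shared      <- intersect(head_up, thorax_up)
up_head_only   <- setdiff(head_up, thorax_up)
up_thorax_only <- setdiff(thorax_up, head_up)

# Each tissue total is recovered from its exclusive part plus the shared part
stopifnot(
  length(up_head_only) + length(up_shared) == length(unique(head_up)),
  length(up_thorax_only) + length(up_shared) == length(unique(thorax_up))
)
```

Because the two directions are split independently, a gene that is up in head but down in thorax is counted as "Head only" among the upregulated genes and as "Thorax only" among the downregulated ones.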
# Assign new fill colors based on both Direction and Tissue
summary_overlap_df <- summary_overlap_df %>%
  mutate(
    FillGroup = paste(Direction, Tissue, sep = "_"),
    FillGroup = factor(FillGroup, levels = c(
      "Upregulated_Head only", "Upregulated_Thorax only", "Upregulated_Shared",
      "Downregulated_Head only", "Downregulated_Thorax only", "Downregulated_Shared"
    ))
  )

# Define color palette: reds for up, blues for down
custom_fill_colors <- c(
  "Upregulated_Head only" = "#fcbba1",     # light red
  "Upregulated_Thorax only" = "#fb6a4a",   # medium red
  "Upregulated_Shared" = "red3",           # deep red
  "Downregulated_Head only" = "#c6dbef",   # light blue
  "Downregulated_Thorax only" = "#6baed6", # medium blue
  "Downregulated_Shared" = "blue2"         # deep blue
)

# Plot
ggplot(summary_overlap_df, aes(x = Species, y = Count, fill = FillGroup)) +
  geom_bar(stat = "identity", position = "stack", width = 0.8, color = "black", linewidth = 0.2) +  # thin black outline around each segment
  scale_fill_manual(values = custom_fill_colors, name = "Tissue + Regulation") +
  scale_y_continuous(labels = abs, limits = c(-1200, 1200)) +
  geom_hline(yintercept = 0, color = "black", linewidth = 1.2) +
  labs(title = "Strict DEGs (|LFC| > 1) by Tissue and Regulation Direction",
       y = "Number of Genes", x = "Species") +
  coord_flip() +
  theme_minimal(base_size = 12) +
  theme(
    legend.position = "top",
    plot.title = element_text(hjust = 0.5, face = "bold", size = 14),
    axis.text.x = element_text(size = 12, angle = 45, hjust = 1)
  )

Version Author Date
4e391c3 Maeva TECHER 2025-05-30
# Define custom colors for each GeneType
custom_colors <- c(
  "transcribed_pseudogene" = "#F4F1BB",  # Example color for transcribed_pseudogene
  "protein-coding" = "#9B57D3",         # Example color for protein-coding
  "lncRNA" = "#A5300F",                 # Example color for lncRNA
  "tRNA" = "#74D055FF",                   # Example color for tRNA
  "misc_RNA" = "#3B6978",               # Example color for misc_RNA
  "ncRNA" = "#29AF7FFF",                  # Example color for ncRNA
  "pseudogene" = "#81B29A",             # Example color for pseudogene
  "rRNA" = "#5982DB",                   # Example color for rRNA
  "snoRNA" = "#DCE318FF",                 # Example color for snoRNA
  "snRNA" = "#665EB8"                   # Example color for snRNA
)

# Use scale_fill_manual to map the custom colors to the GeneTypes
custom_color_scale <- scale_fill_manual(
  values = custom_colors
)
# Create an empty list to store the data for all species
all_species_data <- list()

# Loop through each species to process their data
for (species in species_list) {
  # Read the DESeq2 results for head and thorax
  head_results_file <- file.path(workDir, "DEG_results/Bulk_RNAseq/", paste0(species, "/Head/DESeq2_sigresults_sva_Head_", species, ".csv"))
  thorax_results_file <- file.path(workDir, "DEG_results/Bulk_RNAseq/", paste0(species, "/Thorax/DESeq2_sigresults_sva_Thorax_", species, ".csv"))
  
  head_sigresults <- read.csv(head_results_file, stringsAsFactors = FALSE)
  thorax_sigresults <- read.csv(thorax_results_file, stringsAsFactors = FALSE)
  
  # Add GeneType and Species columns (from `allspecies_df`)
  head_results_merged <- merge(head_sigresults, allspecies_df[, c("GeneID", "GeneType", "Species")], by = "GeneID")
  thorax_results_merged <- merge(thorax_sigresults, allspecies_df[, c("GeneID", "GeneType", "Species")], by = "GeneID")
  
  # Count for upregulated and downregulated genes in head
  head_upregulated <- head_results_merged %>%
    filter(log2FoldChange > 1) %>%
    mutate(Regulation = "Upregulated", Tissue = "Head", Count = 1)
  
  head_downregulated <- head_results_merged %>%
    filter(log2FoldChange < -1) %>%
    mutate(Regulation = "Downregulated", Tissue = "Head", Count = -1)  # Mutate downregulated genes to negative
  
  # Combine upregulated and downregulated genes for head
  head_combined <- rbind(head_upregulated, head_downregulated)
  
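  # Ensure all GeneTypes are represented for this species, even if they have no DEGs.
  # Tissue is filled too; otherwise the padded rows carry NA and are dropped by the Tissue filter later on
  head_combined <- head_combined %>%
    complete(GeneType = unique(allspecies_df$GeneType), 
             fill = list(Count = 0, Tissue = "Head"))  # Fill missing GeneTypes with Count = 0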
  
  # Count for upregulated and downregulated genes in thorax
  thorax_upregulated <- thorax_results_merged %>%
    filter(log2FoldChange > 1) %>%
    mutate(Regulation = "Upregulated", Tissue = "Thorax", Count = 1)
  
  thorax_downregulated <- thorax_results_merged %>%
    filter(log2FoldChange < -1) %>%
    mutate(Regulation = "Downregulated", Tissue = "Thorax", Count = -1)  # Mutate downregulated genes to negative
  
  # Combine upregulated and downregulated genes for thorax
  thorax_combined <- rbind(thorax_upregulated, thorax_downregulated)
  
  # Ensure all GeneTypes are represented for this species in thorax, even if they have no DEGs.
  # Tissue is filled as well; otherwise the added rows carry Tissue = NA and are
  # dropped by the per-tissue filter() applied after the loop.
  thorax_combined <- thorax_combined %>%
    complete(GeneType = unique(allspecies_df$GeneType),
             fill = list(Count = 0, Tissue = "Thorax"))
  
  # Combine data for head and thorax into one
  combined_data <- rbind(head_combined, thorax_combined)
  
  # Add species column to the data
  combined_data$Species <- species
  
  # Append the data to the list for all species
  all_species_data[[species]] <- combined_data
}

# Combine all species data into one data frame
final_data <- bind_rows(all_species_data)

# Reorder species according to the desired order
final_data$Species <- factor(final_data$Species, levels = species_order)

# Split the combined table by tissue
final_data_head <- final_data %>% filter(Tissue == "Head")
final_data_thorax <- final_data %>% filter(Tissue == "Thorax")

# Create the barplot for all species and only head tissue
ggplot(final_data_head, aes(x = Species, y = Count, fill = GeneType)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(title = "DEGs by Gene Biotype for Head (|log2FC| > 1)",
       x = "Species",
       y = "Number of Genes") +
  custom_color_scale +
  scale_y_continuous(labels = abs, limits = c(-1200, 1200)) +  # Show downregulated counts as positive numbers
  theme_minimal(base_size = 12) +
  theme(legend.position = "top", 
        plot.title = element_text(hjust = 0.5, size = 14, face = "bold"), 
        axis.title.x = element_text(size = 14, face = "bold"), 
        axis.title.y = element_text(size = 14, face = "bold"), 
        axis.text.x = element_text(size = 12, angle = 45, hjust = 1), 
        axis.text.y = element_text(size = 12), 
        panel.grid.major.y = element_line(color = "grey90", linetype = "dashed"),
        panel.grid.minor = element_blank()) +
  coord_flip()  # Flip coordinates to make the plot horizontal

Version Author Date
4e391c3 Maeva TECHER 2025-05-30
b540a1e Maeva TECHER 2025-02-27
d7fa779 Maeva TECHER 2025-02-14
34c299a Maeva TECHER 2025-02-06
aab712a Maeva TECHER 2025-02-04
faf2db3 Maeva TECHER 2025-01-13
fe6dae9 Maeva TECHER 2024-11-19

# Create the barplot for all species and only thorax tissue
ggplot(final_data_thorax, aes(x = Species, y = Count, fill = GeneType)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(title = "DEGs by Gene Biotype for Thorax (|log2FC| > 1)",
       x = "Species",
       y = "Number of Genes") +
  custom_color_scale +
  scale_y_continuous(labels = abs, limits = c(-1200, 1200)) +  # Show downregulated counts as positive numbers
  theme_minimal(base_size = 12) +
  theme(legend.position = "top", 
        plot.title = element_text(hjust = 0.5, size = 14, face = "bold"), 
        axis.title.x = element_text(size = 14, face = "bold"), 
        axis.title.y = element_text(size = 14, face = "bold"), 
        axis.text.x = element_text(size = 12, angle = 45, hjust = 1), 
        axis.text.y = element_text(size = 12), 
        panel.grid.major.y = element_line(color = "grey90", linetype = "dashed"),
        panel.grid.minor = element_blank()) +
  coord_flip()  # Flip coordinates to make the plot horizontal

Version Author Date
4e391c3 Maeva TECHER 2025-05-30
b540a1e Maeva TECHER 2025-02-27
89984c0 Maeva TECHER 2025-02-19
d7fa779 Maeva TECHER 2025-02-14
34c299a Maeva TECHER 2025-02-06
aab712a Maeva TECHER 2025-02-04
faf2db3 Maeva TECHER 2025-01-13
fe6dae9 Maeva TECHER 2024-11-19
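
The two bar plots above rely on one row per DEG, coded as Count = 1 (upregulated) or Count = -1 (downregulated), and let geom_bar() stack those rows. The sketch below shows the same bookkeeping on a toy table in base R, summing the signed counts before plotting so the totals can be inspected directly. The toy gene IDs and the biotype labels ("protein_coding", "lncRNA") are assumptions made for illustration, not values taken from allspecies_df.

```r
# Toy stand-in for final_data_head: one row per DEG, signed by direction.
# Species names follow the analysis; biotype labels are hypothetical.
toy <- data.frame(
  Species  = c("gregaria", "gregaria", "gregaria", "piceifrons", "piceifrons"),
  GeneType = c("protein_coding", "protein_coding", "lncRNA", "protein_coding", "lncRNA"),
  Count    = c(1, 1, -1, -1, -1),
  stringsAsFactors = FALSE
)
toy$Regulation <- ifelse(toy$Count > 0, "Upregulated", "Downregulated")

# Sum the signed counts per species, biotype and direction
deg_totals <- aggregate(Count ~ Species + GeneType + Regulation, data = toy, FUN = sum)
deg_totals <- deg_totals[order(deg_totals$Species, deg_totals$GeneType), ]
print(deg_totals)

# Total extent of each stacked bar (all biotypes together), per direction
bar_extent <- aggregate(Count ~ Species + Regulation, data = deg_totals, FUN = sum)
max_extent <- max(abs(bar_extent$Count))
print(max_extent)
```

Checking max_extent against the hard-coded axis limits of c(-1200, 1200) may be worthwhile: by default ggplot2 censors values outside a continuous scale's limits, so a stack that exceeds them would likely be dropped with a warning rather than clipped.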

2. Overlap DEGs between tissues

gregaria


species <- "gregaria"  # Specify the species for which to generate plots

# Load DESeq2 results for head and thorax
head_file <- file.path(workDir, "DEG_results", "Bulk_RNAseq", species, "Head",
                       paste0("DESeq2_sigresults_sva_Head_", species, ".csv"))
thorax_file <- file.path(workDir, "DEG_results", "Bulk_RNAseq", species, "Thorax",
                         paste0("DESeq2_sigresults_sva_Thorax_", species, ".csv"))

head_data <- read.csv(head_file, stringsAsFactors = FALSE)
thorax_data <- read.csv(thorax_file, stringsAsFactors = FALSE)

# Check if data is empty and handle accordingly
if (nrow(head_data) == 0 || nrow(thorax_data) == 0) {
    message(paste("No data for species:", species))
} else {
    # Filter for significant DEGs and select upregulated and downregulated genes
    head_up <- head_data %>%
        filter(padj < 0.05 & log2FoldChange > 1) %>%
        select(GeneID = X)

    head_down <- head_data %>%
        filter(padj < 0.05 & log2FoldChange < -1) %>%
        select(GeneID = X)

    thorax_up <- thorax_data %>%
        filter(padj < 0.05 & log2FoldChange > 1) %>%
        select(GeneID = X)

    thorax_down <- thorax_data %>%
        filter(padj < 0.05 & log2FoldChange < -1) %>%
        select(GeneID = X)

    # Prepare data for Venn diagram
    venn_data <- list(
        Head_Upregulated = head_up$GeneID,
        Head_Downregulated = head_down$GeneID,
        Thorax_Upregulated = thorax_up$GeneID,
        Thorax_Downregulated = thorax_down$GeneID
    )

    # Generate the four-way Venn diagram with specified colors and legend outside
    venn_plot <- venn.diagram(
        x = venn_data, 
        category.names = c("Head Upregulated", "Head Downregulated", "Thorax Upregulated", "Thorax Downregulated"), 
        filename = NULL, 
        output = TRUE, 
        fill = c("red", "skyblue", "orange", "blue"),  # Set colors for upregulated and downregulated
        alpha = 0.5, 
        cex = 2,  # Text size for numbers
        cat.cex = 0,  # Size 0 hides the built-in category labels; a custom legend is drawn instead
        cat.pos = c(0, 0, 0, 0),  # Position to center labels
        cat.dist = c(0.1, 0.1, 0.1, 0.1),  # Distance between category labels and circles
        main = paste("Venn Diagram of DEGs for S.", species),
        main.cex = 1.2,  # Size of the main title
        cat.col = c("red", "skyblue", "orange", "blue")  # Color the category labels
    )

    # Clear the current plotting area before drawing the next Venn diagram
    grid.newpage()

    # Display the Venn diagram
    grid.draw(venn_plot)

    # Manually create a custom legend
    legend_labels <- c("Head Up", "Head Down", "Thorax Up", "Thorax Down")
    legend_colors <- c("red", "skyblue", "orange", "blue")

    # Positioning the legend lower on the right side of the plot
    legend_x <- unit(0.85, "npc")  # Adjust x position
    legend_y <- unit(0.2, "npc")    # Lower the legend vertically

    # Draw the legend
    for (i in seq_along(legend_labels)) {
        grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  width = unit(0.02, "npc"), height = unit(0.02, "npc"), 
                  gp = gpar(fill = legend_colors[i], col = NA))
        grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"), 
                  y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  just = "left", gp = gpar(cex = 0.8))
    }
    
    # Scatter plot for overlapping genes
    # Filter significant DEGs for both head and thorax
    head_sig_genes <- head_data %>%
        filter(padj < 0.05 & abs(log2FoldChange) > 1) %>%
        select(GeneID = X, log2FoldChange, padj) 

    thorax_sig_genes <- thorax_data %>%
        filter(padj < 0.05 & abs(log2FoldChange) > 1) %>%
        select(GeneID = X, log2FoldChange, padj)

    # Find overlapping genes based on GeneID
    overlapping_genes <- inner_join(head_sig_genes, thorax_sig_genes, by = "GeneID", suffix = c("_head", "_thorax"))

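    # Save the overlapping genes to a CSV file (create the output folder if it does not exist)
    output_file <- file.path(workDir, "overlap", "Bulk_RNAseq", paste0("overlapping_genes_head_thorax_", species, ".csv"))
    dir.create(dirname(output_file), recursive = TRUE, showWarnings = FALSE)
    write.csv(overlapping_genes, output_file, row.names = FALSE)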

    # Plot overlapping genes with scatter plot
    p <- ggplot(overlapping_genes, aes(x = log2FoldChange_head, y = log2FoldChange_thorax)) +
        geom_point(aes(color = case_when(
            log2FoldChange_head > 0 & log2FoldChange_thorax > 0 ~ "Upregulated in Both",
            log2FoldChange_head < 0 & log2FoldChange_thorax < 0 ~ "Downregulated in Both",
            log2FoldChange_head > 0 & log2FoldChange_thorax < 0 ~ "Up in Head, Down in Thorax",
            log2FoldChange_head < 0 & log2FoldChange_thorax > 0 ~ "Down in Head, Up in Thorax"
        )), size = 3, alpha = 0.7) +
        geom_abline(slope = 1, intercept = 0, linetype = "dashed", color = "gray") +
        labs(
            x = "Log2 Fold Change (Head)", 
            y = "Log2 Fold Change (Thorax)", 
            color = "Regulation Pattern", 
            title = "Comparison of Log2 Fold Changes in Overlapping Genes", 
            subtitle = paste("Head vs. Thorax in", species)
        ) +
        theme_minimal() + 
        theme(
            plot.title = element_text(size = 16, face = "bold"), 
            plot.subtitle = element_text(size = 12, face = "italic"), 
            legend.position = "top"
        ) +
        scale_color_manual(values = c(
            "Upregulated in Both" = "red", 
            "Downregulated in Both" = "blue", 
            "Up in Head, Down in Thorax" = "purple", 
            "Down in Head, Up in Thorax" = "green"
        ))

    # Save the scatter plot
    ggsave(filename = file.path(workDir, "overlap/Bulk_RNAseq", paste0("scatter_plot_overlapping_genes_", species, ".png")), plot = p)

    # Display the scatter plot
    print(p)
}

Version Author Date
4e391c3 Maeva TECHER 2025-05-30
b540a1e Maeva TECHER 2025-02-27
89984c0 Maeva TECHER 2025-02-19
d7fa779 Maeva TECHER 2025-02-14
aab712a Maeva TECHER 2025-02-04
faf2db3 Maeva TECHER 2025-01-13
fe6dae9 Maeva TECHER 2024-11-19

Version Author Date
4e391c3 Maeva TECHER 2025-05-30
b540a1e Maeva TECHER 2025-02-27
d7fa779 Maeva TECHER 2025-02-14
aab712a Maeva TECHER 2025-02-04
8df3d7c Maeva TECHER 2025-01-13
fe6dae9 Maeva TECHER 2024-11-19
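
The gregaria chunk above is rerun for each species with only `species` changed. As a minimal sketch, the split into up- and downregulated gene sets could be factored into a small helper. `split_degs()` is a hypothetical name, the toy tables are invented, and the column names (X, padj, log2FoldChange) are assumed to match the DESeq2 CSV files read above.

```r
# Hypothetical helper: split a DESeq2 result table into up- and downregulated IDs.
# Defaults mirror the thresholds used in this analysis (padj < 0.05, |log2FC| > 1).
split_degs <- function(deg, lfc = 1, alpha = 0.05) {
  sig <- deg[!is.na(deg$padj) & deg$padj < alpha, ]
  list(
    up   = sig$X[sig$log2FoldChange >  lfc],
    down = sig$X[sig$log2FoldChange < -lfc]
  )
}

# Toy tables standing in for head_data and thorax_data
toy_head <- data.frame(
  X = c("g1", "g2", "g3", "g4"),
  log2FoldChange = c(2.5, -1.8, 0.4, 3.0),
  padj = c(0.01, 0.001, 0.02, 0.20),
  stringsAsFactors = FALSE
)
toy_thorax <- data.frame(
  X = c("g1", "g2", "g5"),
  log2FoldChange = c(1.7, 2.2, -3.1),
  padj = c(0.03, 0.004, 0.01),
  stringsAsFactors = FALSE
)

head_sets   <- split_degs(toy_head)
thorax_sets <- split_degs(toy_thorax)

# Genes up in both tissues, and genes down in head but up in thorax
shared_up  <- intersect(head_sets$up, thorax_sets$up)
discordant <- intersect(head_sets$down, thorax_sets$up)
print(shared_up)
print(discordant)
```

The four vectors returned for head and thorax correspond to the four sets passed to venn.diagram(), so a helper of this kind could feed the Venn diagram for every species from one loop.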

piceifrons


species <- "piceifrons"  # Specify the species for which to generate plots

# Load DESeq2 results for head and thorax
head_file <- file.path(workDir, "DEG_results", "Bulk_RNAseq", species, "Head",
                       paste0("DESeq2_sigresults_sva_Head_", species, ".csv"))
thorax_file <- file.path(workDir, "DEG_results", "Bulk_RNAseq", species, "Thorax",
                         paste0("DESeq2_sigresults_sva_Thorax_", species, ".csv"))

head_data <- read.csv(head_file, stringsAsFactors = FALSE)
thorax_data <- read.csv(thorax_file, stringsAsFactors = FALSE)

# Check if data is empty and handle accordingly
if (nrow(head_data) == 0 || nrow(thorax_data) == 0) {
    message(paste("No data for species:", species))
} else {
    # Filter for significant DEGs and select upregulated and downregulated genes
    head_up <- head_data %>%
        filter(padj < 0.05 & log2FoldChange > 1) %>%
        select(GeneID = X)

    head_down <- head_data %>%
        filter(padj < 0.05 & log2FoldChange < -1) %>%
        select(GeneID = X)

    thorax_up <- thorax_data %>%
        filter(padj < 0.05 & log2FoldChange > 1) %>%
        select(GeneID = X)

    thorax_down <- thorax_data %>%
        filter(padj < 0.05 & log2FoldChange < -1) %>%
        select(GeneID = X)

    # Prepare data for Venn diagram
    venn_data <- list(
        Head_Upregulated = head_up$GeneID,
        Head_Downregulated = head_down$GeneID,
        Thorax_Upregulated = thorax_up$GeneID,
        Thorax_Downregulated = thorax_down$GeneID
    )

    # Generate the four-way Venn diagram with specified colors and legend outside
    venn_plot <- venn.diagram(
        x = venn_data, 
        category.names = c("Head Upregulated", "Head Downregulated", "Thorax Upregulated", "Thorax Downregulated"), 
        filename = NULL, 
        output = TRUE, 
        fill = c("red", "skyblue", "orange", "blue"),  # Set colors for upregulated and downregulated
        alpha = 0.5, 
        cex = 2,  # Text size for numbers
        cat.cex = 0,  # Size 0 hides the built-in category labels; a custom legend is drawn instead
        cat.pos = c(0, 0, 0, 0),  # Position to center labels
        cat.dist = c(0.1, 0.1, 0.1, 0.1),  # Distance between category labels and circles
        main = paste("Venn Diagram of DEGs for S.", species),
        main.cex = 1.2,  # Size of the main title
        cat.col = c("red", "skyblue", "orange", "blue")  # Color the category labels
    )

    # Clear the current plotting area before drawing the next Venn diagram
    grid.newpage()

    # Display the Venn diagram
    grid.draw(venn_plot)

    # Manually create a custom legend
    legend_labels <- c("Head Up", "Head Down", "Thorax Up", "Thorax Down")
    legend_colors <- c("red", "skyblue", "orange", "blue")

    # Positioning the legend lower on the right side of the plot
    legend_x <- unit(0.85, "npc")  # Adjust x position
    legend_y <- unit(0.2, "npc")    # Lower the legend vertically

    # Draw the legend
    for (i in seq_along(legend_labels)) {
        grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  width = unit(0.02, "npc"), height = unit(0.02, "npc"), 
                  gp = gpar(fill = legend_colors[i], col = NA))
        grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"), 
                  y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  just = "left", gp = gpar(cex = 0.8))
    }

    # Scatter plot for overlapping genes
    # Filter significant DEGs for both head and thorax
    head_sig_genes <- head_data %>%
        filter(padj < 0.05 & abs(log2FoldChange) > 1) %>%
        select(GeneID = X, log2FoldChange, padj) 

    thorax_sig_genes <- thorax_data %>%
        filter(padj < 0.05 & abs(log2FoldChange) > 1) %>%
        select(GeneID = X, log2FoldChange, padj)

    # Find overlapping genes based on GeneID
    overlapping_genes <- inner_join(head_sig_genes, thorax_sig_genes, by = "GeneID", suffix = c("_head", "_thorax"))

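    # Save the overlapping genes to a CSV file (create the output folder if it does not exist)
    output_file <- file.path(workDir, "overlap", "Bulk_RNAseq", paste0("overlapping_genes_head_thorax_", species, ".csv"))
    dir.create(dirname(output_file), recursive = TRUE, showWarnings = FALSE)
    write.csv(overlapping_genes, output_file, row.names = FALSE)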

    # Plot overlapping genes with scatter plot
    p <- ggplot(overlapping_genes, aes(x = log2FoldChange_head, y = log2FoldChange_thorax)) +
        geom_point(aes(color = case_when(
            log2FoldChange_head > 0 & log2FoldChange_thorax > 0 ~ "Upregulated in Both",
            log2FoldChange_head < 0 & log2FoldChange_thorax < 0 ~ "Downregulated in Both",
            log2FoldChange_head > 0 & log2FoldChange_thorax < 0 ~ "Up in Head, Down in Thorax",
            log2FoldChange_head < 0 & log2FoldChange_thorax > 0 ~ "Down in Head, Up in Thorax"
        )), size = 3, alpha = 0.7) +
        geom_abline(slope = 1, intercept = 0, linetype = "dashed", color = "gray") +
        labs(
            x = "Log2 Fold Change (Head)", 
            y = "Log2 Fold Change (Thorax)", 
            color = "Regulation Pattern", 
            title = "Comparison of Log2 Fold Changes in Overlapping Genes", 
            subtitle = paste("Head vs. Thorax in", species)
        ) +
        theme_minimal() + 
        theme(
            plot.title = element_text(size = 16, face = "bold"), 
            plot.subtitle = element_text(size = 12, face = "italic"), 
            legend.position = "top"
        ) +
        scale_color_manual(values = c(
            "Upregulated in Both" = "red", 
            "Downregulated in Both" = "blue", 
            "Up in Head, Down in Thorax" = "purple", 
            "Down in Head, Up in Thorax" = "green"
        ))

    # Save the scatter plot
    ggsave(filename = file.path(workDir, "overlap/Bulk_RNAseq", paste0("scatter_plot_overlapping_genes_", species, ".png")), plot = p)

    # Display the scatter plot
    print(p)
}

Version Author Date
4e391c3 Maeva TECHER 2025-05-30
b540a1e Maeva TECHER 2025-02-27
89984c0 Maeva TECHER 2025-02-19
aab712a Maeva TECHER 2025-02-04
faf2db3 Maeva TECHER 2025-01-13
fe6dae9 Maeva TECHER 2024-11-19

Version Author Date
4e391c3 Maeva TECHER 2025-05-30
b540a1e Maeva TECHER 2025-02-27
aab712a Maeva TECHER 2025-02-04
faf2db3 Maeva TECHER 2025-01-13
fe6dae9 Maeva TECHER 2024-11-19

cancellata


species <- "cancellata"  # Specify the species for which to generate plots

# Load DESeq2 results for head and thorax
head_file <- file.path(workDir, "DEG_results", "Bulk_RNAseq", species, "Head",
                       paste0("DESeq2_sigresults_sva_Head_", species, ".csv"))
thorax_file <- file.path(workDir, "DEG_results", "Bulk_RNAseq", species, "Thorax",
                         paste0("DESeq2_sigresults_sva_Thorax_", species, ".csv"))

head_data <- read.csv(head_file, stringsAsFactors = FALSE)
thorax_data <- read.csv(thorax_file, stringsAsFactors = FALSE)

# Check if data is empty and handle accordingly
if (nrow(head_data) == 0 || nrow(thorax_data) == 0) {
    message(paste("No data for species:", species))
} else {
    # Filter for significant DEGs and select upregulated and downregulated genes
    head_up <- head_data %>%
        filter(padj < 0.05 & log2FoldChange > 1) %>%
        select(GeneID = X)

    head_down <- head_data %>%
        filter(padj < 0.05 & log2FoldChange < -1) %>%
        select(GeneID = X)

    thorax_up <- thorax_data %>%
        filter(padj < 0.05 & log2FoldChange > 1) %>%
        select(GeneID = X)

    thorax_down <- thorax_data %>%
        filter(padj < 0.05 & log2FoldChange < -1) %>%
        select(GeneID = X)

    # Prepare data for Venn diagram
    venn_data <- list(
        Head_Upregulated = head_up$GeneID,
        Head_Downregulated = head_down$GeneID,
        Thorax_Upregulated = thorax_up$GeneID,
        Thorax_Downregulated = thorax_down$GeneID
    )

    # Generate the four-way Venn diagram with specified colors and legend outside
    venn_plot <- venn.diagram(
        x = venn_data, 
        category.names = c("Head Upregulated", "Head Downregulated", "Thorax Upregulated", "Thorax Downregulated"), 
        filename = NULL, 
        output = TRUE, 
        fill = c("red", "skyblue", "orange", "blue"),  # Set colors for upregulated and downregulated
        alpha = 0.5, 
        cex = 2,  # Text size for numbers
        cat.cex = 0,  # Size 0 hides the built-in category labels; a custom legend is drawn instead
        cat.pos = c(0, 0, 0, 0),  # Position to center labels
        cat.dist = c(0.1, 0.1, 0.1, 0.1),  # Distance between category labels and circles
        main = paste("Venn Diagram of DEGs for S.", species),
        main.cex = 1.2,  # Size of the main title
        cat.col = c("red", "skyblue", "orange", "blue")  # Color the category labels
    )

    # Clear the current plotting area before drawing the next Venn diagram
    grid.newpage()

    # Display the Venn diagram
    grid.draw(venn_plot)

    # Manually create a custom legend
    legend_labels <- c("Head Up", "Head Down", "Thorax Up", "Thorax Down")
    legend_colors <- c("red", "skyblue", "orange", "blue")

    # Positioning the legend lower on the right side of the plot
    legend_x <- unit(0.85, "npc")  # Adjust x position
    legend_y <- unit(0.2, "npc")    # Lower the legend vertically

    # Draw the legend
    for (i in seq_along(legend_labels)) {
        grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  width = unit(0.02, "npc"), height = unit(0.02, "npc"), 
                  gp = gpar(fill = legend_colors[i], col = NA))
        grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"), 
                  y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  just = "left", gp = gpar(cex = 0.8))
    }

    # Scatter plot for overlapping genes
    # Filter significant DEGs for both head and thorax
    head_sig_genes <- head_data %>%
        filter(padj < 0.05 & abs(log2FoldChange) > 1) %>%
        select(GeneID = X, log2FoldChange, padj) 

    thorax_sig_genes <- thorax_data %>%
        filter(padj < 0.05 & abs(log2FoldChange) > 1) %>%
        select(GeneID = X, log2FoldChange, padj)

    # Find overlapping genes based on GeneID
    overlapping_genes <- inner_join(head_sig_genes, thorax_sig_genes, by = "GeneID", suffix = c("_head", "_thorax"))

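    # Save the overlapping genes to a CSV file (create the output folder if it does not exist)
    output_file <- file.path(workDir, "overlap", "Bulk_RNAseq", paste0("overlapping_genes_head_thorax_", species, ".csv"))
    dir.create(dirname(output_file), recursive = TRUE, showWarnings = FALSE)
    write.csv(overlapping_genes, output_file, row.names = FALSE)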

    # Plot overlapping genes with scatter plot
    p <- ggplot(overlapping_genes, aes(x = log2FoldChange_head, y = log2FoldChange_thorax)) +
        geom_point(aes(color = case_when(
            log2FoldChange_head > 0 & log2FoldChange_thorax > 0 ~ "Upregulated in Both",
            log2FoldChange_head < 0 & log2FoldChange_thorax < 0 ~ "Downregulated in Both",
            log2FoldChange_head > 0 & log2FoldChange_thorax < 0 ~ "Up in Head, Down in Thorax",
            log2FoldChange_head < 0 & log2FoldChange_thorax > 0 ~ "Down in Head, Up in Thorax"
        )), size = 3, alpha = 0.7) +
        geom_abline(slope = 1, intercept = 0, linetype = "dashed", color = "gray") +
        labs(
            x = "Log2 Fold Change (Head)", 
            y = "Log2 Fold Change (Thorax)", 
            color = "Regulation Pattern", 
            title = "Comparison of Log2 Fold Changes in Overlapping Genes", 
            subtitle = paste("Head vs. Thorax in", species)
        ) +
        theme_minimal() + 
        theme(
            plot.title = element_text(size = 16, face = "bold"), 
            plot.subtitle = element_text(size = 12, face = "italic"), 
            legend.position = "top"
        ) +
        scale_color_manual(values = c(
            "Upregulated in Both" = "red", 
            "Downregulated in Both" = "blue", 
            "Up in Head, Down in Thorax" = "purple", 
            "Down in Head, Up in Thorax" = "green"
        ))

    # Save the scatter plot
    ggsave(filename = file.path(workDir, "overlap/Bulk_RNAseq", paste0("scatter_plot_overlapping_genes_", species, ".png")), plot = p)

    # Display the scatter plot
    print(p)
}

Version Author Date
4e391c3 Maeva TECHER 2025-05-30
b540a1e Maeva TECHER 2025-02-27
89984c0 Maeva TECHER 2025-02-19
aab712a Maeva TECHER 2025-02-04
faf2db3 Maeva TECHER 2025-01-13
fe6dae9 Maeva TECHER 2024-11-19

Version Author Date
4e391c3 Maeva TECHER 2025-05-30
b540a1e Maeva TECHER 2025-02-27
aab712a Maeva TECHER 2025-02-04
faf2db3 Maeva TECHER 2025-01-13
fe6dae9 Maeva TECHER 2024-11-19
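
The scatter plots above join the head and thorax DEG tables on GeneID and colour each shared gene by the sign of its fold change in the two tissues. A minimal base R sketch of that join and classification follows, using invented gene IDs and fold changes; it uses merge() in place of dplyr's inner_join() and collapses the four colour classes into concordant versus discordant, which is a simplification for illustration only.

```r
# Toy stand-ins for head_sig_genes and thorax_sig_genes (values are hypothetical)
head_sig <- data.frame(
  GeneID = c("g1", "g2", "g3"),
  log2FoldChange = c(2.1, -1.5, 1.2),
  stringsAsFactors = FALSE
)
thorax_sig <- data.frame(
  GeneID = c("g1", "g2", "g4"),
  log2FoldChange = c(1.4, 2.0, -2.2),
  stringsAsFactors = FALSE
)

# Inner join on GeneID; shared column names get the tissue suffixes
shared <- merge(head_sig, thorax_sig, by = "GeneID", suffixes = c("_head", "_thorax"))

# Same direction in both tissues = concordant, opposite direction = discordant
same_sign <- sign(shared$log2FoldChange_head) == sign(shared$log2FoldChange_thorax)
shared$pattern <- ifelse(same_sign, "concordant", "discordant")

print(shared)
print(table(shared$pattern))
```

Tabulating the pattern column gives the counts behind the scatter plot quadrants, which can be easier to compare across species than the plots themselves.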

americana


species <- "americana"  # Specify the species for which to generate plots

# Load DESeq2 results for head and thorax
head_file <- file.path(workDir, "DEG_results", "Bulk_RNAseq", species, "Head",
                       paste0("DESeq2_sigresults_sva_Head_", species, ".csv"))
thorax_file <- file.path(workDir, "DEG_results", "Bulk_RNAseq", species, "Thorax",
                         paste0("DESeq2_sigresults_sva_Thorax_", species, ".csv"))

head_data <- read.csv(head_file, stringsAsFactors = FALSE)
thorax_data <- read.csv(thorax_file, stringsAsFactors = FALSE)

# Check if data is empty and handle accordingly
if (nrow(head_data) == 0 || nrow(thorax_data) == 0) {
    message(paste("No data for species:", species))
} else {
    # Filter for significant DEGs and select upregulated and downregulated genes
    head_up <- head_data %>%
        filter(padj < 0.05 & log2FoldChange > 1) %>%
        select(GeneID = X)

    head_down <- head_data %>%
        filter(padj < 0.05 & log2FoldChange < -1) %>%
        select(GeneID = X)

    thorax_up <- thorax_data %>%
        filter(padj < 0.05 & log2FoldChange > 1) %>%
        select(GeneID = X)

    thorax_down <- thorax_data %>%
        filter(padj < 0.05 & log2FoldChange < -1) %>%
        select(GeneID = X)

    # Prepare data for Venn diagram
    venn_data <- list(
        Head_Upregulated = head_up$GeneID,
        Head_Downregulated = head_down$GeneID,
        Thorax_Upregulated = thorax_up$GeneID,
        Thorax_Downregulated = thorax_down$GeneID
    )

    # Generate the four-way Venn diagram with specified colors and legend outside
    venn_plot <- venn.diagram(
        x = venn_data, 
        category.names = c("Head Upregulated", "Head Downregulated", "Thorax Upregulated", "Thorax Downregulated"), 
        filename = NULL, 
        output = TRUE, 
        fill = c("red", "skyblue", "orange", "blue"),  # Set colors for upregulated and downregulated
        alpha = 0.5, 
        cex = 2,  # Text size for numbers
        cat.cex = 0,  # Size 0 hides the built-in category labels; a custom legend is drawn instead
        cat.pos = c(0, 0, 0, 0),  # Position to center labels
        cat.dist = c(0.1, 0.1, 0.1, 0.1),  # Distance between category labels and circles
        main = paste("Venn Diagram of DEGs for S.", species),
        main.cex = 1.2,  # Size of the main title
        cat.col = c("red", "skyblue", "orange", "blue")  # Color the category labels
    )

    # Clear the current plotting area before drawing the next Venn diagram
    grid.newpage()

    # Display the Venn diagram
    grid.draw(venn_plot)

    # Manually create a custom legend
    legend_labels <- c("Head Up", "Head Down", "Thorax Up", "Thorax Down")
    legend_colors <- c("red", "skyblue", "orange", "blue")

    # Positioning the legend lower on the right side of the plot
    legend_x <- unit(0.85, "npc")  # Adjust x position
    legend_y <- unit(0.2, "npc")    # Lower the legend vertically

    # Draw the legend
    for (i in seq_along(legend_labels)) {
        grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  width = unit(0.02, "npc"), height = unit(0.02, "npc"), 
                  gp = gpar(fill = legend_colors[i], col = NA))
        grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"), 
                  y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  just = "left", gp = gpar(cex = 0.8))
    }

    # Scatter plot for overlapping genes
    # Filter significant DEGs for both head and thorax
    head_sig_genes <- head_data %>%
        filter(padj < 0.05 & abs(log2FoldChange) > 1) %>%
        select(GeneID = X, log2FoldChange, padj) 

    thorax_sig_genes <- thorax_data %>%
        filter(padj < 0.05 & abs(log2FoldChange) > 1) %>%
        select(GeneID = X, log2FoldChange, padj)

    # Find overlapping genes based on GeneID
    overlapping_genes <- inner_join(head_sig_genes, thorax_sig_genes, by = "GeneID", suffix = c("_head", "_thorax"))

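    # Save the overlapping genes to a CSV file (create the output folder if it does not exist)
    output_file <- file.path(workDir, "overlap", "Bulk_RNAseq", paste0("overlapping_genes_head_thorax_", species, ".csv"))
    dir.create(dirname(output_file), recursive = TRUE, showWarnings = FALSE)
    write.csv(overlapping_genes, output_file, row.names = FALSE)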

    # Plot overlapping genes with scatter plot
    p <- ggplot(overlapping_genes, aes(x = log2FoldChange_head, y = log2FoldChange_thorax)) +
        geom_point(aes(color = case_when(
            log2FoldChange_head > 0 & log2FoldChange_thorax > 0 ~ "Upregulated in Both",
            log2FoldChange_head < 0 & log2FoldChange_thorax < 0 ~ "Downregulated in Both",
            log2FoldChange_head > 0 & log2FoldChange_thorax < 0 ~ "Up in Head, Down in Thorax",
            log2FoldChange_head < 0 & log2FoldChange_thorax > 0 ~ "Down in Head, Up in Thorax"
        )), size = 3, alpha = 0.7) +
        geom_abline(slope = 1, intercept = 0, linetype = "dashed", color = "gray") +
        labs(
            x = "Log2 Fold Change (Head)", 
            y = "Log2 Fold Change (Thorax)", 
            color = "Regulation Pattern", 
            title = "Comparison of Log2 Fold Changes in Overlapping Genes", 
            subtitle = paste("Head vs. Thorax in", species)
        ) +
        theme_minimal() + 
        theme(
            plot.title = element_text(size = 16, face = "bold"), 
            plot.subtitle = element_text(size = 12, face = "italic"), 
            legend.position = "top"
        ) +
        scale_color_manual(values = c(
            "Upregulated in Both" = "red", 
            "Downregulated in Both" = "blue", 
            "Up in Head, Down in Thorax" = "purple", 
            "Down in Head, Up in Thorax" = "green"
        ))

    # Save the scatter plot
    ggsave(filename = file.path(workDir, "overlap/Bulk_RNAseq", paste0("scatter_plot_overlapping_genes_", species, ".png")), plot = p)

    # Display the scatter plot
    print(p)
}

Version Author Date
4e391c3 Maeva TECHER 2025-05-30
b540a1e Maeva TECHER 2025-02-27
89984c0 Maeva TECHER 2025-02-19
aab712a Maeva TECHER 2025-02-04
faf2db3 Maeva TECHER 2025-01-13
fe6dae9 Maeva TECHER 2024-11-19

Version Author Date
4e391c3 Maeva TECHER 2025-05-30
b540a1e Maeva TECHER 2025-02-27
aab712a Maeva TECHER 2025-02-04
faf2db3 Maeva TECHER 2025-01-13
fe6dae9 Maeva TECHER 2024-11-19

cubense


species <- "cubense"  # Specify the species for which to generate plots

# Load DESeq2 results for head and thorax
head_file <- file.path(workDir, "DEG_results", "Bulk_RNAseq", species, "Head",
                       paste0("DESeq2_sigresults_sva_Head_", species, ".csv"))
thorax_file <- file.path(workDir, "DEG_results", "Bulk_RNAseq", species, "Thorax",
                         paste0("DESeq2_sigresults_sva_Thorax_", species, ".csv"))

head_data <- read.csv(head_file, stringsAsFactors = FALSE)
thorax_data <- read.csv(thorax_file, stringsAsFactors = FALSE)

# Check if data is empty and handle accordingly
if (nrow(head_data) == 0 || nrow(thorax_data) == 0) {
    message(paste("No data for species:", species))
} else {
    # Filter for significant DEGs and select upregulated and downregulated genes
    head_up <- head_data %>%
        filter(padj < 0.05 & log2FoldChange > 1) %>%
        select(GeneID = X)

    head_down <- head_data %>%
        filter(padj < 0.05 & log2FoldChange < -1) %>%
        select(GeneID = X)

    thorax_up <- thorax_data %>%
        filter(padj < 0.05 & log2FoldChange > 1) %>%
        select(GeneID = X)

    thorax_down <- thorax_data %>%
        filter(padj < 0.05 & log2FoldChange < -1) %>%
        select(GeneID = X)

    # Prepare data for Venn diagram
    venn_data <- list(
        Head_Upregulated = head_up$GeneID,
        Head_Downregulated = head_down$GeneID,
        Thorax_Upregulated = thorax_up$GeneID,
        Thorax_Downregulated = thorax_down$GeneID
    )

    # Generate the four-way Venn diagram with specified colors and legend outside
    venn_plot <- venn.diagram(
        x = venn_data, 
        category.names = c("Head Upregulated", "Head Downregulated", "Thorax Upregulated", "Thorax Downregulated"), 
        filename = NULL, 
        output = TRUE, 
        fill = c("red", "skyblue", "orange", "blue"),  # Set colors for upregulated and downregulated
        alpha = 0.5, 
        cex = 2,  # Text size for numbers
        cat.cex = 0,  # Size 0 hides the built-in category labels; a custom legend is drawn instead
        cat.pos = c(0, 0, 0, 0),  # Position to center labels
        cat.dist = c(0.1, 0.1, 0.1, 0.1),  # Distance between category labels and circles
        main = paste("Venn Diagram of DEGs for S.", species),
        main.cex = 1.2,  # Size of the main title
        cat.col = c("red", "skyblue", "orange", "blue")  # Color the category labels
    )

    # Clear the current plotting area before drawing the next Venn diagram
    grid.newpage()

    # Display the Venn diagram
    grid.draw(venn_plot)

    # Manually create a custom legend
    legend_labels <- c("Head Up", "Head Down", "Thorax Up", "Thorax Down")
    legend_colors <- c("red", "skyblue", "orange", "blue")

    # Positioning the legend lower on the right side of the plot
    legend_x <- unit(0.85, "npc")  # Adjust x position
    legend_y <- unit(0.2, "npc")    # Lower the legend vertically

    # Draw the legend
    for (i in seq_along(legend_labels)) {
        grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  width = unit(0.02, "npc"), height = unit(0.02, "npc"), 
                  gp = gpar(fill = legend_colors[i], col = NA))
        grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"), 
                  y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  just = "left", gp = gpar(cex = 0.8))
    }

    # Scatter plot for overlapping genes
    # Filter significant DEGs for both head and thorax
    head_sig_genes <- head_data %>%
        filter(padj < 0.05 & abs(log2FoldChange) > 1) %>%
        select(GeneID = X, log2FoldChange, padj) 

    thorax_sig_genes <- thorax_data %>%
        filter(padj < 0.05 & abs(log2FoldChange) > 1) %>%
        select(GeneID = X, log2FoldChange, padj)

    # Find overlapping genes based on GeneID
    overlapping_genes <- inner_join(head_sig_genes, thorax_sig_genes, by = "GeneID", suffix = c("_head", "_thorax"))

    # Save the overlapping genes to a CSV file
    output_file <- file.path(workDir, "overlap/Bulk_RNAseq", paste0("overlapping_genes_head_thorax_", species, ".csv"))
    write.csv(overlapping_genes, output_file, row.names = FALSE)

    # Plot overlapping genes with scatter plot
    p <- ggplot(overlapping_genes, aes(x = log2FoldChange_head, y = log2FoldChange_thorax)) +
        geom_point(aes(color = case_when(
            log2FoldChange_head > 0 & log2FoldChange_thorax > 0 ~ "Upregulated in Both",
            log2FoldChange_head < 0 & log2FoldChange_thorax < 0 ~ "Downregulated in Both",
            log2FoldChange_head > 0 & log2FoldChange_thorax < 0 ~ "Up in Head, Down in Thorax",
            log2FoldChange_head < 0 & log2FoldChange_thorax > 0 ~ "Down in Head, Up in Thorax"
        )), size = 3, alpha = 0.7) +
        geom_abline(slope = 1, intercept = 0, linetype = "dashed", color = "gray") +
        labs(
            x = "Log2 Fold Change (Head)", 
            y = "Log2 Fold Change (Thorax)", 
            color = "Regulation Pattern", 
            title = "Comparison of Log2 Fold Changes in Overlapping Genes", 
            subtitle = paste("Head vs. Thorax in", species)
        ) +
        theme_minimal() + 
        theme(
            plot.title = element_text(size = 16, face = "bold"), 
            plot.subtitle = element_text(size = 12, face = "italic"), 
            legend.position = "top"
        ) +
        scale_color_manual(values = c(
            "Upregulated in Both" = "red", 
            "Downregulated in Both" = "blue", 
            "Up in Head, Down in Thorax" = "purple", 
            "Down in Head, Up in Thorax" = "green"
        ))

    # Save the scatter plot
    ggsave(filename = file.path(workDir, "overlap/Bulk_RNAseq", paste0("scatter_plot_overlapping_genes_", species, ".png")), plot = p)

    # Display the scatter plot
    print(p)
}

Version Author Date
4e391c3 Maeva TECHER 2025-05-30
b540a1e Maeva TECHER 2025-02-27
89984c0 Maeva TECHER 2025-02-19
aab712a Maeva TECHER 2025-02-04
faf2db3 Maeva TECHER 2025-01-13
fe6dae9 Maeva TECHER 2024-11-19


nitens


species <- "nitens"  # Specify the species for which to generate plots

# Load DESeq2 results for head and thorax
head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq/", paste0(species, "/Head/DESeq2_sigresults_sva_Head_", species,".csv"))
thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq/", paste0(species, "/Thorax/DESeq2_sigresults_sva_Thorax_", species,".csv"))

head_data <- read.csv(head_file, stringsAsFactors = FALSE)
thorax_data <- read.csv(thorax_file, stringsAsFactors = FALSE)

# Check if data is empty and handle accordingly
if (nrow(head_data) == 0 || nrow(thorax_data) == 0) {
    message(paste("No data for species:", species))
} else {
    # Filter for significant DEGs and select upregulated and downregulated genes
    head_up <- head_data %>%
        filter(padj < 0.05 & log2FoldChange > 1) %>%
        select(GeneID = X)

    head_down <- head_data %>%
        filter(padj < 0.05 & log2FoldChange < -1) %>%
        select(GeneID = X)

    thorax_up <- thorax_data %>%
        filter(padj < 0.05 & log2FoldChange > 1) %>%
        select(GeneID = X)

    thorax_down <- thorax_data %>%
        filter(padj < 0.05 & log2FoldChange < -1) %>%
        select(GeneID = X)

    # Prepare data for Venn diagram
    venn_data <- list(
        Head_Upregulated = head_up$GeneID,
        Head_Downregulated = head_down$GeneID,
        Thorax_Upregulated = thorax_up$GeneID,
        Thorax_Downregulated = thorax_down$GeneID
    )

    # Generate the four-way Venn diagram with specified colors and legend outside
    venn_plot <- venn.diagram(
        x = venn_data, 
        category.names = c("Head Upregulated", "Head Downregulated", "Thorax Upregulated", "Thorax Downregulated"), 
        filename = NULL, 
        output = TRUE, 
        fill = c("red", "skyblue", "orange", "blue"),  # Set colors for upregulated and downregulated
        alpha = 0.5, 
        cex = 2,  # Text size for numbers
        cat.cex = 0,  # Text size for category labels
        cat.pos = c(0, 0, 0, 0),  # Position to center labels
        cat.dist = c(0.1, 0.1, 0.1, 0.1),  # Distance between category labels and circles
        main = paste("Venn Diagram of DEGs for S.", species),
        main.cex = 1.2,  # Size of the main title
        cat.col = c("red", "skyblue", "orange", "blue")  # Color the category labels
    )

    # Clear the current plotting area before drawing the next Venn diagram
    grid.newpage()

    # Display the Venn diagram
    grid.draw(venn_plot)

    # Manually create a custom legend
    legend_labels <- c("Head Up", "Head Down", "Thorax Up", "Thorax Down")
    legend_colors <- c("red", "skyblue", "orange", "blue")

    # Positioning the legend lower on the right side of the plot
    legend_x <- unit(0.85, "npc")  # Adjust x position
    legend_y <- unit(0.2, "npc")    # Lower the legend vertically

    # Draw the legend
    for (i in seq_along(legend_labels)) {
        grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  width = unit(0.02, "npc"), height = unit(0.02, "npc"), 
                  gp = gpar(fill = legend_colors[i], col = NA))
        grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"), 
                  y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  just = "left", gp = gpar(cex = 0.8))
    }

    # Scatter plot for overlapping genes
    # Filter significant DEGs for both head and thorax
    head_sig_genes <- head_data %>%
        filter(padj < 0.05 & abs(log2FoldChange) > 1) %>%
        select(GeneID = X, log2FoldChange, padj) 

    thorax_sig_genes <- thorax_data %>%
        filter(padj < 0.05 & abs(log2FoldChange) > 1) %>%
        select(GeneID = X, log2FoldChange, padj)

    # Find overlapping genes based on GeneID
    overlapping_genes <- inner_join(head_sig_genes, thorax_sig_genes, by = "GeneID", suffix = c("_head", "_thorax"))

    # Save the overlapping genes to a CSV file
    output_file <- file.path(workDir, "overlap/Bulk_RNAseq", paste0("overlapping_genes_head_thorax_", species, ".csv"))
    write.csv(overlapping_genes, output_file, row.names = FALSE)

    # Plot overlapping genes with scatter plot
    p <- ggplot(overlapping_genes, aes(x = log2FoldChange_head, y = log2FoldChange_thorax)) +
        geom_point(aes(color = case_when(
            log2FoldChange_head > 0 & log2FoldChange_thorax > 0 ~ "Upregulated in Both",
            log2FoldChange_head < 0 & log2FoldChange_thorax < 0 ~ "Downregulated in Both",
            log2FoldChange_head > 0 & log2FoldChange_thorax < 0 ~ "Up in Head, Down in Thorax",
            log2FoldChange_head < 0 & log2FoldChange_thorax > 0 ~ "Down in Head, Up in Thorax"
        )), size = 3, alpha = 0.7) +
        geom_abline(slope = 1, intercept = 0, linetype = "dashed", color = "gray") +
        labs(
            x = "Log2 Fold Change (Head)", 
            y = "Log2 Fold Change (Thorax)", 
            color = "Regulation Pattern", 
            title = "Comparison of Log2 Fold Changes in Overlapping Genes", 
            subtitle = paste("Head vs. Thorax in", species)
        ) +
        theme_minimal() + 
        theme(
            plot.title = element_text(size = 16, face = "bold"), 
            plot.subtitle = element_text(size = 12, face = "italic"), 
            legend.position = "top"
        ) +
        scale_color_manual(values = c(
            "Upregulated in Both" = "red", 
            "Downregulated in Both" = "blue", 
            "Up in Head, Down in Thorax" = "purple", 
            "Down in Head, Up in Thorax" = "green"
        ))

    # Save the scatter plot
    ggsave(filename = file.path(workDir, "overlap/Bulk_RNAseq", paste0("scatter_plot_overlapping_genes_", species, ".png")), plot = p)

    # Display the scatter plot
    print(p)
}
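
The per-species chunk above filters each tissue at padj < 0.05 and |log2FoldChange| > 1, joins the two tables on gene ID, and labels each shared gene by the sign of its fold change in each tissue. A minimal base-R sketch of that logic on made-up data may help when checking the output CSV by hand. The toy tables and the helper `sig_degs()` are hypothetical and not part of the analysis; the chunk itself uses dplyr (`filter`, `inner_join`, `case_when`).

```r
# Toy DESeq2-style tables (invented values); column names mirror the chunk above.
head_toy <- data.frame(
  X = c("g1", "g2", "g3", "g4"),
  log2FoldChange = c(2.5, -1.8, 0.4, 3.1),
  padj = c(0.01, 0.03, 0.001, 0.20)
)
thorax_toy <- data.frame(
  X = c("g1", "g2", "g5"),
  log2FoldChange = c(-2.0, -1.2, 1.9),
  padj = c(0.02, 0.04, 0.01)
)

# Hypothetical helper: keep genes passing both cutoffs, dropping NA values
sig_degs <- function(df, padj_cutoff = 0.05, lfc_cutoff = 1) {
  keep <- !is.na(df$padj) & !is.na(df$log2FoldChange) &
    df$padj < padj_cutoff & abs(df$log2FoldChange) > lfc_cutoff
  data.frame(
    GeneID = as.character(df$X[keep]),
    log2FoldChange = df$log2FoldChange[keep],
    stringsAsFactors = FALSE
  )
}

# Base-R counterpart of inner_join(..., suffix = c("_head", "_thorax"))
overlap_toy <- merge(sig_degs(head_toy), sig_degs(thorax_toy),
                     by = "GeneID", suffixes = c("_head", "_thorax"))

# Label each shared gene by the sign of its fold change in each tissue.
# Zeros cannot occur here because of the |log2FoldChange| > 1 filter.
lfc_h <- overlap_toy$log2FoldChange_head
lfc_t <- overlap_toy$log2FoldChange_thorax
overlap_toy$pattern <- ifelse(
  lfc_h > 0 & lfc_t > 0, "Upregulated in Both",
  ifelse(lfc_h < 0 & lfc_t < 0, "Downregulated in Both",
         ifelse(lfc_h > 0 & lfc_t < 0, "Up in Head, Down in Thorax",
                "Down in Head, Up in Thorax"))
)

print(overlap_toy)
```

Only genes significant in both tissues survive the join, so the scatter plot says nothing about genes that are differentially expressed in one tissue only; those are visible in the Venn diagram instead.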

3. Overlap DEGs among species

Locusts

Head tissues

# Define the species for Group 1
locusts <- c("gregaria", "piceifrons", "cancellata")

ortho_dir <- "data/orthofinder/Polyneoptera/Results_I2_iqtree"  # Relative to the project root (the knit directory)
input_file <- file.path(ortho_dir, "Orthogroups_genesproteinbiotype_13species_annotated_May2025.csv")
filtered_final_orthotable <- read.csv(input_file, header = TRUE, stringsAsFactors = FALSE)

# Function to load DEGs for a given group of species
load_deg_data <- function(locusts, allspecies_df, filtered_final_orthotable) {
    degs_up <- list()
    degs_down <- list()
    degs_all <- list()
    
    # Rename the "gene_id" column in filtered_final_orthotable for consistency
    #colnames(filtered_final_orthotable)[colnames(filtered_final_orthotable) == "gene_id"] <- "GeneID"
    
    for (species in locusts) {
        head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq/", paste0(species, "/Head/DESeq2_sigresults_sva_Head_", species,".csv"))
        
        # Check if the file exists
        if (!file.exists(head_file)) {
            message(paste("File not found for species:", species))
            next  # Skip this iteration if the file is missing
        }
        
        # Read the data
        head_data <- read.csv(head_file, stringsAsFactors = FALSE)
        
        # Rename the "X" column to "GeneID"
        #colnames(head_data)[colnames(head_data) == "X"] <- "GeneID"
        
        # Merge DEG data with GeneType and Orthogroup information
        head_data_merged <- merge(head_data, allspecies_df[, c("GeneID", "GeneType", "Species")], by = "GeneID")
        head_data_merged <- merge(head_data_merged, filtered_final_orthotable[, c("GeneID", "Orthogroup")], by = "GeneID")
        
        # Label genes whose Orthogroup value is NA (the inner merges above already drop genes absent from the orthogroup table)
        head_data_merged$Orthogroup[is.na(head_data_merged$Orthogroup)] <- "Unknown"
        
        # Filter for significant DEGs (both upregulated and downregulated)
        head_up <- head_data_merged %>%
            filter(padj < 0.05 & log2FoldChange >= 1) %>%
            select(Orthogroup) %>%
            distinct()
        
        head_down <- head_data_merged %>%
            filter(padj < 0.05 & log2FoldChange <= -1) %>%
            select(Orthogroup) %>%
            distinct()
        
        all_deg <- head_data_merged %>%
            filter(padj < 0.05 & abs(log2FoldChange) >= 1) %>%
            select(Orthogroup) %>%
            distinct()
        
        # Store the DEGs in the list
        degs_up[[species]] <- head_up$Orthogroup
        degs_down[[species]] <- head_down$Orthogroup
        degs_all[[species]] <- all_deg$Orthogroup
    }
    
    return(list(up = degs_up, down = degs_down, all = degs_all))
}

# Function to display Venn diagram and corresponding datatable based on Orthogroups
display_venn_with_datatable <- function(venn_data, title, allspecies_df, filtered_final_orthotable, output_prefix = NULL) {
    
    # Calculate overlapping Orthogroups
    overlap_orthogroups <- Reduce(intersect, venn_data)
    
    # Print overlap info
    cat("Overlapping Orthogroups: \n")
    print(overlap_orthogroups)
    
    # If no overlaps exist, display a message and an empty plot
    if (length(overlap_orthogroups) == 0) {
        message("⚠️ No overlapping Orthogroups found. Displaying an empty Venn diagram.")
        
        # Create an empty Venn diagram placeholder
        plot.new()
        text(0.5, 0.5, "No overlapping Orthogroups found", cex = 1.5, col = "red")
        
        return(NULL)  # Exit the function gracefully
    }
    
    # Create a data frame for the overlapping Orthogroups
    overlap_df <- data.frame(Orthogroup = overlap_orthogroups)
    
    # Merge to get species and other information from filtered_final_orthotable
    meta_brock_df <- merge(overlap_df, filtered_final_orthotable, by = "Orthogroup", all.x = TRUE)
    
    # Ensure merged data exists
    if (nrow(meta_brock_df) == 0) {
        message("⚠️ Merge failed: No matching rows after merging Orthogroups.")
        return(NULL)
    }
    
    # === ✨ Save the results to CSV
    if (!is.null(output_prefix)) {
        write_csv(
            meta_brock_df,
            file.path(workDir, "overlap/Locusts", paste0(output_prefix, "_overlap_table.csv"))
        )
    }
    
    # Generate the Venn diagram
    venn_plot <- venn.diagram(
        x = venn_data, 
        category.names = c("gregaria", "piceifrons", "cancellata"), 
        filename = NULL, 
        output = TRUE,
        fill = c("orange", "red", "orchid"),
        alpha = 0.5,
        cex = 3,
        fontface = "bold",
        cat.cex = 0,
        main = title,
        main.cex = 1.2
    )
    
    # Clear the current plotting area before drawing the Venn diagram
    grid.newpage()
    
    # Display the Venn diagram
    grid.draw(venn_plot)
    
    # Display the datatable for overlapping Orthogroups
    datatable(meta_brock_df, options = list(
        pageLength = 10,
        scrollX = TRUE,
        autoWidth = TRUE,
        searchHighlight = TRUE
    ),
    rownames = FALSE,
    escape = FALSE
    ) %>%
        formatStyle(
            'Species', target = 'cell',
            fontStyle = 'italic'
        )
}

# Load DEGs for locusts
venn_data_locusts <- load_deg_data(locusts, allspecies_df, filtered_final_orthotable)

# Prepare the data for Venn diagrams
venn_data_up <- list(
  gregaria = venn_data_locusts$up[["gregaria"]],
  piceifrons = venn_data_locusts$up[["piceifrons"]],
  cancellata = venn_data_locusts$up[["cancellata"]]
)

venn_data_down <- list(
  gregaria = venn_data_locusts$down[["gregaria"]],
  piceifrons = venn_data_locusts$down[["piceifrons"]],
  cancellata = venn_data_locusts$down[["cancellata"]]
)

venn_data_all <- list(
  gregaria = venn_data_locusts$all[["gregaria"]],
  piceifrons = venn_data_locusts$all[["piceifrons"]],
  cancellata = venn_data_locusts$all[["cancellata"]]
)

# Display the Venn diagrams with fallback for missing overlaps
message("Processing Venn diagram for head upregulated DEGs...")
display_venn_with_datatable(venn_data_up, "Venn Diagram of Head Upregulated DEGs - Locusts", allspecies_df, filtered_final_orthotable, output_prefix = "Venn_Up_HEAD_Locusts")
Overlapping Orthogroups: 
 [1] "OG0000104" "OG0000105" "OG0000371" "OG0003126" "OG0004306" "OG0009321"
 [7] "OG0002734" "OG0000391" "OG0009808" "OG0000015" "OG0000212" "OG0002726"
[13] "OG0000027" "OG0011877" "OG0009113"

message("Processing Venn diagram for head downregulated DEGs...")
display_venn_with_datatable(venn_data_down, "Venn Diagram of Head Downregulated DEGs - Locusts", allspecies_df, filtered_final_orthotable, output_prefix = "Venn_Down_HEAD_Locusts")
Overlapping Orthogroups: 
 [1] "OG0012979" "OG0015157" "OG0000796" "OG0000296" "OG0010128" "OG0000788"
 [7] "OG0000922" "OG0000212" "OG0007990" "OG0002737" "OG0000073" "OG0000334"
[13] "OG0000823" "OG0002738" "OG0000093" "OG0000553"
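
Because DEGs are collapsed to orthogroups before the species sets are intersected, one orthogroup can appear in both the upregulated and the downregulated overlap when different member genes change in opposite directions; OG0000212 is present in both lists printed above. The base-R sketch below reproduces that situation on invented data. All gene and orthogroup names in it are hypothetical, and it illustrates the effect rather than the pipeline's own code path.

```r
# Toy merged table for one species (invented values).
# OG1 has two member genes that change in opposite directions.
merged_toy <- data.frame(
  GeneID = c("a1", "a2", "b1", "c1"),
  Orthogroup = c("OG1", "OG1", "OG2", "OG3"),
  log2FoldChange = c(2.1, -1.6, 1.4, -3.0),
  padj = c(0.01, 0.02, 0.03, 0.04),
  stringsAsFactors = FALSE
)

sig <- merged_toy$padj < 0.05

# Same thresholds as load_deg_data(): >= 1 for up, <= -1 for down
up_og <- unique(merged_toy$Orthogroup[sig & merged_toy$log2FoldChange >= 1])
down_og <- unique(merged_toy$Orthogroup[sig & merged_toy$log2FoldChange <= -1])

# Orthogroups counted in both directions
both_directions <- intersect(up_og, down_og)
print(both_directions)
```

Running a check of this kind on the real per-species sets is one way to flag orthogroups whose direction of change is ambiguous before interpreting the up and down Venn diagrams separately.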

message("Processing Venn diagram for all significant DEGs...")
display_venn_with_datatable(venn_data_all, "Venn Diagram of All Head DEGs - Locusts", allspecies_df, filtered_final_orthotable, output_prefix = "Venn_ALL_HEAD_Locusts")
Overlapping Orthogroups: 
 [1] "OG0000104" "OG0000015" "OG0000105" "OG0004296" "OG0000371" "OG0003126"
 [7] "OG0004306" "OG0012979" "OG0009321" "OG0000073" "OG0002734" "OG0015157"
[13] "OG0000796" "OG0000151" "OG0000222" "OG0000296" "OG0000391" "OG0000196"
[19] "OG0010128" "OG0009808" "OG0000788" "OG0000922" "OG0000120" "OG0000212"
[25] "OG0002726" "OG0007990" "OG0003478" "OG0000027" "OG0002737" "OG0011877"
[31] "OG0007957" "OG0009113" "OG0000334" "OG0000823" "OG0002738" "OG0000093"
[37] "OG0000552" "OG0000553"
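
The overlap lists printed by `display_venn_with_datatable()` come from `Reduce(intersect, venn_data)`, which folds `intersect()` over the per-species vectors to keep only orthogroups present in every species. A small sketch with invented orthogroup IDs shows the fold, and also what happens if one species contributes no set at all (for example when its file is missing and `load_deg_data()` skips it, so the list element is `NULL`). The object names below are hypothetical.

```r
# Invented orthogroup sets for three species
og_sets <- list(
  gregaria = c("OG1", "OG2", "OG3"),
  piceifrons = c("OG2", "OG3", "OG4"),
  cancellata = c("OG3", "OG2", "OG5")
)

# Three-way intersection: orthogroups found in every set
shared <- Reduce(intersect, og_sets)
print(shared)

# Same sets, but one species has no data (NULL element)
og_sets_missing <- list(
  gregaria = c("OG1", "OG2", "OG3"),
  piceifrons = NULL,
  cancellata = c("OG3", "OG2", "OG5")
)
shared_missing <- Reduce(intersect, og_sets_missing)

# The overlap is silently empty, so it can be worth flagging empty inputs first
empty_species <- names(og_sets_missing)[lengths(og_sets_missing) == 0]
print(empty_species)
```

An empty overlap caused by a missing input looks the same as a genuine absence of shared orthogroups, which is why a check on `lengths()` before intersecting may be useful.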

# Define the species for Group 1
locusts <- c("gregaria", "piceifrons", "cancellata")

ortho_dir <- "data/orthofinder/Polyneoptera/Results_I2_iqtree"  # Relative to the project root (the knit directory)
input_file <- file.path(ortho_dir, "Orthogroups_genesproteinbiotype_13species_annotated_May2025.csv")
filtered_final_orthotable <- read.csv(input_file, header = TRUE, stringsAsFactors = FALSE)

load_deg_data <- function(locusts, allspecies_df, filtered_final_orthotable) {
  degs_by_type <- list(
    SingleCopy = list(up = list(), down = list(), all = list()),
    MultiCopy = list(up = list(), down = list(), all = list())
  )
  
  for (species in locusts) {
    head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq/", paste0(species, "/Head/DESeq2_sigresults_sva_Head_", species,".csv"))
    if (!file.exists(head_file)) {
      message(paste("File not found for species:", species))
      next
    }
    
    head_data <- read.csv(head_file, stringsAsFactors = FALSE)
    
    head_data_merged <- head_data %>%
      inner_join(allspecies_df[, c("GeneID", "GeneType", "Species")], by = "GeneID") %>%
      inner_join(filtered_final_orthotable[, c("GeneID", "Orthogroup", "Orthogroup_Type")], by = "GeneID")
    
    head_data_merged$Orthogroup[is.na(head_data_merged$Orthogroup)] <- "Unknown"
    
    for (type in c("SingleCopy", "MultiCopy")) {
      tmp <- head_data_merged %>% filter(Orthogroup_Type == type)
      
      degs_by_type[[type]]$up[[species]] <- tmp %>%
        filter(padj < 0.05 & log2FoldChange >= 1) %>%
        pull(Orthogroup) %>%
        unique()
      
      degs_by_type[[type]]$down[[species]] <- tmp %>%
        filter(padj < 0.05 & log2FoldChange <= -1) %>%
        pull(Orthogroup) %>%
        unique()
      
      degs_by_type[[type]]$all[[species]] <- tmp %>%
        filter(padj < 0.05 & abs(log2FoldChange) >= 1) %>%
        pull(Orthogroup) %>%
        unique()
    }
  }
  
  return(degs_by_type)
}

venn_data_locusts <- load_deg_data(locusts, allspecies_df, filtered_final_orthotable)

# Accessing specific types
venn_data_all_MC <- venn_data_locusts$MultiCopy$all
venn_data_all_SC <- venn_data_locusts$SingleCopy$all

# Plot
message("🧬 Head All SingleCopy Orthogroups")
display_venn_with_datatable(venn_data_all_SC, "SingleCopy All DEGs", allspecies_df, filtered_final_orthotable)
Overlapping Orthogroups: 
[1] "OG0011877" "OG0015157" "OG0012979" "OG0010128" "OG0009808" "OG0007990"

message("🧬 Head All MultiCopy Orthogroups")
display_venn_with_datatable(venn_data_all_MC, "MultiCopy All DEGs", allspecies_df, filtered_final_orthotable)
Overlapping Orthogroups: 
 [1] "OG0000823" "OG0000552" "OG0000553" "OG0003126" "OG0000796" "OG0000296"
 [7] "OG0002737" "OG0007957" "OG0009113" "OG0003478" "OG0000027" "OG0000073"
[13] "OG0000093" "OG0000334" "OG0002738" "OG0004296" "OG0000371" "OG0000105"
[19] "OG0000104" "OG0000015" "OG0004306" "OG0009321" "OG0002734" "OG0000151"
[25] "OG0000222" "OG0000196" "OG0000391" "OG0000922" "OG0000788" "OG0000120"
[31] "OG0002726" "OG0000212"
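
The single-copy and multi-copy lists above hold 6 and 32 orthogroups, which is consistent with the 38 orthogroups in the combined head overlap printed earlier. A quick way to verify that kind of partition is to look up each shared orthogroup's type and split on it. The sketch below does this in base R on invented data; it assumes the type is a property of the orthogroup itself (as the `Orthogroup_Type` column suggests), and the toy table and vector names are hypothetical.

```r
# Invented orthogroup annotation with a type column
og_table_toy <- data.frame(
  Orthogroup = c("OG1", "OG2", "OG3", "OG4"),
  Orthogroup_Type = c("SingleCopy", "MultiCopy", "MultiCopy", "SingleCopy"),
  stringsAsFactors = FALSE
)

# Invented overlap across species
shared_all <- c("OG1", "OG2", "OG3")

# Look up the type of each shared orthogroup, then split by type
type_of <- og_table_toy$Orthogroup_Type[match(shared_all, og_table_toy$Orthogroup)]
by_type <- split(shared_all, type_of)
print(by_type)

# The per-type counts should add up to the size of the combined overlap
stopifnot(sum(lengths(by_type)) == length(shared_all))
```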

# Define the species for Group 1
locusts <- c("gregaria", "piceifrons", "cancellata")
ortho_dir <- "data/orthofinder/Polyneoptera/Results_I2_iqtree"  # Relative to the project root (the knit directory)
input_file <- file.path(ortho_dir, "Orthogroups_genesproteinbiotype_13species_annotated_May2025.csv")
filtered_final_orthotable <- read.csv(input_file, header = TRUE, stringsAsFactors = FALSE)
# Initialize an empty list to store heatmap data for each species
heatmap_list <- list()

# Loop through each species to process their data
for (species in locusts) {
  # Load DESeq2 results for head
  head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq/", paste0(species, "/Head/DESeq2_sigresults_sva_Head_", species,".csv"))
  
  # Load the DESeq2 results
  head_data <- read.csv(head_file, stringsAsFactors = FALSE)
  
  # Check if data is empty and handle accordingly
  if (nrow(head_data) == 0) {
    message(paste("No data for species:", species))
    next  # Skip to the next species if there's no data
  }
  
  # Rename the "gene_id" column in filtered_final_orthotable for consistency
  #colnames(filtered_final_orthotable)[colnames(filtered_final_orthotable) == "gene_id"] <- "GeneID"
  
  # Merge with filtered_final_orthotable to include Orthogroup
  merged_data <- merge(head_data, filtered_final_orthotable, by = "GeneID", all.x = TRUE)
  
  # Check if merge was successful
  if (nrow(merged_data) == 0) {
    message(paste("No matching data for species:", species))
    next  # Skip if no matching data after merging
  }

  # Filter for significant DEGs and keep up to 500 of the most strongly upregulated and downregulated head genes
  head_up <- merged_data %>%
    filter(padj < 0.05 & log2FoldChange > 1) %>%
    arrange(desc(log2FoldChange)) %>%
    slice(1:500)
  
  head_down <- merged_data %>%
    filter(padj < 0.05 & log2FoldChange < -1) %>%
    arrange(log2FoldChange) %>%
    slice(1:500)
  
  # Combine data and prepare for heatmap, adding the species column
  heatmap_data <- bind_rows(
    head_up %>% mutate(Tissue = "Head", Regulation = "Upregulated", Species = species),
    head_down %>% mutate(Tissue = "Head", Regulation = "Downregulated", Species = species)
  ) %>%
    select(Orthogroup, log2FoldChange, Tissue, Regulation, Species)
  
  # Append the heatmap data to the list
  heatmap_list[[species]] <- heatmap_data
}

# Combine all species data into a single dataframe for heatmap matrix preparation
final_heatmap_data <- bind_rows(heatmap_list)

# Check if final_heatmap_data is empty before proceeding
if (nrow(final_heatmap_data) == 0) {
    stop("No valid data available for heatmap generation.")
}

# Filter out rows with missing Orthogroup values
final_heatmap_data <- final_heatmap_data %>%
    filter(!is.na(Orthogroup))

# Drop rows with missing log2FoldChange values (optional safeguard)
final_heatmap_data <- final_heatmap_data %>%
    filter(!is.na(log2FoldChange))

# Create heatmap matrix using Orthogroup instead of GeneID
heatmap_matrix <- final_heatmap_data %>%
    group_by(Orthogroup, Species) %>%
    summarize(
        Head_Combined = sum(log2FoldChange[Tissue == "Head"], na.rm = TRUE),
        .groups = 'drop'
    ) %>%
    pivot_wider(names_from = Species, 
                values_from = Head_Combined, 
                values_fill = list(Head_Combined = 0)) %>%
    column_to_rownames("Orthogroup") %>%
    as.matrix()

# Check if heatmap_matrix is empty
if (nrow(heatmap_matrix) == 0) {
    stop("No valid data available for heatmap matrix.")
}

# Define color palettes
# Define a custom color gradient where 0 is black
custom_color_palette1 <- colorRampPalette(c("cyan", "cyan3", "black", "orange3", "orange"))(100)

# Define a custom color gradient where 0 is white
custom_color_palette2 <- colorRampPalette(c("blue3", "blue", "white", "red", "red3"))(100)

# Define color breaks so that the palette midpoint (white or black) sits exactly at 0
max_abs_lfc <- max(abs(heatmap_matrix), na.rm = TRUE)  # Get max absolute summed log2FoldChange
color_breaks <- seq(-max_abs_lfc, max_abs_lfc, length.out = 101)  # Symmetric scale; pheatmap expects one more break than colors (100)

# Create heatmap with clustering
pheatmap(
  heatmap_matrix,
  color = custom_color_palette2,
  breaks = color_breaks,
  cluster_rows = TRUE,  # Cluster orthogroups (rows)
  cluster_cols = FALSE,  # Do not cluster species
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Orthologs Expression in Head Tissue - STRATEGY 2"
)
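
For a diverging palette to be centered on zero, the break vector has to be symmetric around 0 and, as described in the pheatmap documentation for the `breaks` argument, one element longer than the color vector. The sketch below builds such a pair for a small invented matrix using only base R and grDevices; `n_colors`, `palette_toy`, `toy_matrix`, and `breaks_toy` are hypothetical names.

```r
n_colors <- 100

# Diverging palette with white in the middle
palette_toy <- colorRampPalette(c("blue3", "blue", "white", "red", "red3"))(n_colors)

# Invented matrix of summed log2 fold changes
toy_matrix <- matrix(c(-3.2, 0.5, 1.1, -0.4, 2.0, 0), nrow = 3)

# Symmetric range around zero
max_abs <- max(abs(toy_matrix), na.rm = TRUE)

# n_colors + 1 break points define n_colors bins, with 0 as the middle break
breaks_toy <- seq(-max_abs, max_abs, length.out = n_colors + 1)

length(breaks_toy) - length(palette_toy)  # one more break than colors
breaks_toy[n_colors / 2 + 1]              # middle break, numerically zero
```

With an even number of colors, zero falls on the boundary between the two central bins, so small negative and small positive values receive the two near-white (or near-black) shades on either side of the midpoint.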

# Create heatmap without clustering columns
pheatmap(
  heatmap_matrix,
  color = custom_color_palette1,
  breaks = color_breaks,
  cluster_rows = TRUE,  
  cluster_cols = FALSE,  
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Orthologs Expression in Head Tissue - STRATEGY 2"
)
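
Both heatmaps draw on the same matrix, which is built by summing log2 fold changes per orthogroup and species and filling absent combinations with 0. The base-R sketch below mirrors that construction with `tapply()` on invented data to make two consequences visible: member genes with opposite signs partly cancel, and a 0 cell can mean either "no DEG in this orthogroup" or "changes that cancel out". The data and object names are hypothetical; the analysis itself uses `group_by()`, `summarize()`, and `pivot_wider()`.

```r
# Invented long-format table: one row per DEG
toy <- data.frame(
  Orthogroup = c("OG1", "OG1", "OG2", "OG3"),
  Species = c("gregaria", "gregaria", "piceifrons", "gregaria"),
  log2FoldChange = c(2.0, -1.5, 1.2, -3.0),
  stringsAsFactors = FALSE
)

# Sum log2 fold changes per Orthogroup x Species; empty cells come back as NA
mat <- tapply(toy$log2FoldChange, list(toy$Orthogroup, toy$Species), sum)

# Counterpart of values_fill = 0 in pivot_wider()
mat[is.na(mat)] <- 0

print(mat)
```

In this toy case OG1 in gregaria ends up at 0.5 even though both member genes pass the |log2FoldChange| > 1 threshold individually, which is worth keeping in mind when reading pale rows in the heatmap.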


Thorax tissues

# Define the species for Group 1
locusts <- c("gregaria", "piceifrons", "cancellata")
ortho_dir <- "data/orthofinder/Polyneoptera/Results_I2_iqtree"  # Relative to the project root (the knit directory)
input_file <- file.path(ortho_dir, "Orthogroups_genesproteinbiotype_13species_annotated_May2025.csv")
filtered_final_orthotable <- read.csv(input_file, header = TRUE, stringsAsFactors = FALSE)

# Function to load DEGs for a given group of species
load_deg_data <- function(locusts, allspecies_df, filtered_final_orthotable) {
    degs_up <- list()
    degs_down <- list()
    degs_all <- list()
    
    # Rename the "gene_id" column in filtered_final_orthotable for consistency
    #colnames(filtered_final_orthotable)[colnames(filtered_final_orthotable) == "gene_id"] <- "GeneID"
    
    for (species in locusts) {
        thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq/", paste0(species, "/Thorax/DESeq2_sigresults_sva_Thorax_", species,".csv"))
        
        # Check if the file exists
        if (!file.exists(thorax_file)) {
            message(paste("File not found for species:", species))
            next  # Skip this iteration if the file is missing
        }
        
        # Read the data
        thorax_data <- read.csv(thorax_file, stringsAsFactors = FALSE)
        
        # Rename the "X" column to "GeneID"
        #colnames(thorax_data)[colnames(thorax_data) == "X"] <- "GeneID"
        
        # Merge DEG data with GeneType and Orthogroup information
        thorax_data_merged <- merge(thorax_data, allspecies_df[, c("GeneID", "GeneType", "Species")], by = "GeneID")
        thorax_data_merged <- merge(thorax_data_merged, filtered_final_orthotable[, c("GeneID", "Orthogroup")], by = "GeneID")
        
        # Label genes whose Orthogroup value is NA (the inner merges above already drop genes absent from the orthogroup table)
        thorax_data_merged$Orthogroup[is.na(thorax_data_merged$Orthogroup)] <- "Unknown"
        
        # Filter for significant DEGs (both upregulated and downregulated)
        thorax_up <- thorax_data_merged %>%
            filter(padj < 0.05 & log2FoldChange >= 1) %>%
            select(Orthogroup) %>%
            distinct()
        
        thorax_down <- thorax_data_merged %>%
            filter(padj < 0.05 & log2FoldChange <= -1) %>%
            select(Orthogroup) %>%
            distinct()
        
        all_deg <- thorax_data_merged %>%
            filter(padj < 0.05 & abs(log2FoldChange) >= 1) %>%
            select(Orthogroup) %>%
            distinct()
        
        # Store the DEGs in the list
        degs_up[[species]] <- thorax_up$Orthogroup
        degs_down[[species]] <- thorax_down$Orthogroup
        degs_all[[species]] <- all_deg$Orthogroup
    }
    
    return(list(up = degs_up, down = degs_down, all = degs_all))
}

# Function to display Venn diagram and corresponding datatable based on Orthogroups
display_venn_with_datatable <- function(venn_data, title, allspecies_df, filtered_final_orthotable, output_prefix = NULL) {
    
    # Calculate overlapping Orthogroups
    overlap_orthogroups <- Reduce(intersect, venn_data)
    
    # Print overlap info
    cat("Overlapping Orthogroups: \n")
    print(overlap_orthogroups)
    
    # If no overlaps exist, display a message and an empty plot
    if (length(overlap_orthogroups) == 0) {
        message("⚠️ No overlapping Orthogroups found. Displaying an empty Venn diagram.")
        
        # Create an empty Venn diagram placeholder
        plot.new()
        text(0.5, 0.5, "No overlapping Orthogroups found", cex = 1.5, col = "red")
        
        return(NULL)  # Exit the function gracefully
    }
    
    # Create a data frame for the overlapping Orthogroups
    overlap_df <- data.frame(Orthogroup = overlap_orthogroups)
    
    # Merge to get species and other information from filtered_final_orthotable
    meta_brock_df <- merge(overlap_df, filtered_final_orthotable, by = "Orthogroup", all.x = TRUE)
    
    # Ensure merged data exists
    if (nrow(meta_brock_df) == 0) {
        message("⚠️ Merge failed: No matching rows after merging Orthogroups.")
        return(NULL)
    }
    
    # === ✨ Save the results to CSV
    if (!is.null(output_prefix)) {
        write_csv(
            meta_brock_df,
            file.path(workDir, "overlap/Locusts", paste0(output_prefix, "_overlap_table.csv"))
        )
    }
    
    # Generate the Venn diagram
    venn_plot <- venn.diagram(
        x = venn_data, 
        category.names = c("gregaria", "piceifrons", "cancellata"), 
        filename = NULL, 
        output = TRUE,
        fill = c("orange", "red", "orchid"),
        alpha = 0.5,
        cex = 3,
        fontface = "bold",
        cat.cex = 0,
        main = title,
        main.cex = 1.2
    )
    
    # Clear the current plotting area before drawing the Venn diagram
    grid.newpage()
    
    # Display the Venn diagram
    grid.draw(venn_plot)
    
    # Display the datatable for overlapping Orthogroups
    datatable(meta_brock_df, options = list(
        pageLength = 10,
        scrollX = TRUE,
        autoWidth = TRUE,
        searchHighlight = TRUE
    ),
    rownames = FALSE,
    escape = FALSE
    ) %>%
        formatStyle(
            'Species', target = 'cell',
            fontStyle = 'italic'
        )
}

# Load DEGs for locusts
venn_data_locusts <- load_deg_data(locusts, allspecies_df, filtered_final_orthotable)

# Prepare the data for Venn diagrams
venn_data_up <- list(
  gregaria = venn_data_locusts$up[["gregaria"]],
  piceifrons = venn_data_locusts$up[["piceifrons"]],
  cancellata = venn_data_locusts$up[["cancellata"]]
)

venn_data_down <- list(
  gregaria = venn_data_locusts$down[["gregaria"]],
  piceifrons = venn_data_locusts$down[["piceifrons"]],
  cancellata = venn_data_locusts$down[["cancellata"]]
)

venn_data_all <- list(
  gregaria = venn_data_locusts$all[["gregaria"]],
  piceifrons = venn_data_locusts$all[["piceifrons"]],
  cancellata = venn_data_locusts$all[["cancellata"]]
)

# Display the Venn diagrams with fallback for missing overlaps
message("Processing Venn diagram for thorax upregulated DEGs...")
display_venn_with_datatable(venn_data_up, "Venn Diagram of Thorax Upregulated DEGs - Locusts", allspecies_df, filtered_final_orthotable, output_prefix = "Venn_Up_THORAX_Locusts")
Overlapping Orthogroups: 
 [1] "OG0000679" "OG0004144" "OG0000104" "OG0014033" "OG0003126" "OG0004306"
 [7] "OG0000001" "OG0009321" "OG0012103" "OG0000346" "OG0000014" "OG0000151"
[13] "OG0013138" "OG0000003" "OG0000391" "OG0001029" "OG0000015" "OG0010429"
[19] "OG0010105" "OG0000090" "OG0005229" "OG0011877" "OG0009700" "OG0011824"
[25] "OG0005741" "OG0003890" "OG0000130" "OG0000327"

Version Author Date
4e391c3 Maeva TECHER 2025-05-30
b540a1e Maeva TECHER 2025-02-27
89984c0 Maeva TECHER 2025-02-19
d7fa779 Maeva TECHER 2025-02-14
3746422 Maeva TECHER 2025-02-12
34c299a Maeva TECHER 2025-02-06
aab712a Maeva TECHER 2025-02-04
faf2db3 Maeva TECHER 2025-01-13
fe6dae9 Maeva TECHER 2024-11-19
message("Processing Venn diagram for thorax downregulated DEGs...")
display_venn_with_datatable(venn_data_down, "Venn Diagram of Thorax Downregulated DEGs - Locusts", allspecies_df, filtered_final_orthotable, output_prefix = "Venn_Down_THORAX_Locusts")
Overlapping Orthogroups: 
 [1] "OG0012058" "OG0015157" "OG0004309" "OG0000014" "OG0000796" "OG0011174"
 [7] "OG0010128" "OG0000922" "OG0012312" "OG0000017" "OG0002737" "OG0002903"
[13] "OG0003752" "OG0000043" "OG0000334" "OG0002738"

Version Author Date
4e391c3 Maeva TECHER 2025-05-30
b540a1e Maeva TECHER 2025-02-27
89984c0 Maeva TECHER 2025-02-19
d7fa779 Maeva TECHER 2025-02-14
3746422 Maeva TECHER 2025-02-12
34c299a Maeva TECHER 2025-02-06
aab712a Maeva TECHER 2025-02-04
8df3d7c Maeva TECHER 2025-01-13
fe6dae9 Maeva TECHER 2024-11-19
message("Processing Venn diagram for all significant DEGs...")
display_venn_with_datatable(venn_data_all, "Venn Diagram of All Thorax DEGs - Locusts", allspecies_df, filtered_final_orthotable, output_prefix = "Venn_ALL_THORAX_Locusts")
Overlapping Orthogroups: 
 [1] "OG0000679" "OG0004144" "OG0000104" "OG0000015" "OG0014033" "OG0012058"
 [7] "OG0000357" "OG0000090" "OG0003126" "OG0004306" "OG0000001" "OG0009321"
[13] "OG0015157" "OG0012103" "OG0004309" "OG0000346" "OG0000014" "OG0000796"
[19] "OG0000151" "OG0011174" "OG0013138" "OG0000003" "OG0000391" "OG0007893"
[25] "OG0001029" "OG0010128" "OG0001327" "OG0011184" "OG0000615" "OG0000922"
[31] "OG0000095" "OG0012329" "OG0010429" "OG0010105" "OG0012400" "OG0012312"
[37] "OG0000012" "OG0005229" "OG0000532" "OG0000027" "OG0000017" "OG0009757"
[43] "OG0002737" "OG0002903" "OG0011877" "OG0003752" "OG0007957" "OG0000000"
[49] "OG0009700" "OG0011824" "OG0000043" "OG0011853" "OG0000347" "OG0005741"
[55] "OG0003890" "OG0000334" "OG0000419" "OG0002738" "OG0000065" "OG0000130"
[61] "OG0000327" "OG0000093"
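
The three printed overlaps are not additive: 28 orthogroups are shared among the upregulated sets and 16 among the downregulated sets, yet 62 are shared in the "all" comparison. A plausible reading is that an orthogroup counts as shared in "all" as soon as every species has at least one significant DEG in it, whatever the direction. The sketch below illustrates this with toy sets; the orthogroup IDs are hypothetical and not taken from the analysis.

```r
# Toy per-species orthogroup sets (hypothetical IDs, for illustration only)
up <- list(
  gregaria   = c("OG1", "OG2", "OG3"),
  piceifrons = c("OG1", "OG2"),
  cancellata = c("OG1", "OG4")
)
down <- list(
  gregaria   = c("OG5"),
  piceifrons = c("OG3", "OG5"),
  cancellata = c("OG2", "OG3", "OG5")
)

# "all" for one species is the union of its up and down orthogroups,
# which is what filtering on abs(log2FoldChange) >= 1 amounts to
all_deg <- Map(union, up, down)

# Same reduction as in display_venn_with_datatable()
shared_up   <- Reduce(intersect, up)
shared_down <- Reduce(intersect, down)
shared_all  <- Reduce(intersect, all_deg)

# Orthogroups shared in "all" but in neither directional overlap:
# every species has a DEG there, but not in a consistent direction
mixed <- setdiff(shared_all, union(shared_up, shared_down))
print(mixed)
```

Here `OG2` and `OG3` are shared only in the "all" comparison because the species disagree on direction, so the directional Venn diagrams are the ones to read when consistent regulation matters.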

# Define the species for Group 1
locusts <- c("gregaria", "piceifrons", "cancellata")

# Path relative to the knit directory (project root), as suggested by the workflowr checks
ortho_dir <- "data/orthofinder/Polyneoptera/Results_I2_iqtree"
input_file <- file.path(ortho_dir, "Orthogroups_genesproteinbiotype_13species_annotated_May2025.csv")
filtered_final_orthotable <- read.csv(input_file, header = TRUE, stringsAsFactors = FALSE)

load_deg_data <- function(locusts, allspecies_df, filtered_final_orthotable) {
  degs_by_type <- list(
    SingleCopy = list(up = list(), down = list(), all = list()),
    MultiCopy = list(up = list(), down = list(), all = list())
  )
  
  for (species in locusts) {
    # Thorax DESeq2 results for this species
    thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq/", paste0(species, "/Thorax/DESeq2_sigresults_sva_Thorax_", species,".csv"))
    if (!file.exists(thorax_file)) {
      message(paste("File not found for species:", species))
      next
    }
    
    thorax_data <- read.csv(thorax_file, stringsAsFactors = FALSE)
    
    # Attach gene type, species and orthogroup annotation (genes without a match are dropped)
    thorax_data_merged <- thorax_data %>%
      inner_join(allspecies_df[, c("GeneID", "GeneType", "Species")], by = "GeneID") %>%
      inner_join(filtered_final_orthotable[, c("GeneID", "Orthogroup", "Orthogroup_Type")], by = "GeneID")
    
    thorax_data_merged$Orthogroup[is.na(thorax_data_merged$Orthogroup)] <- "Unknown"
    
    # Collect orthogroups separately for single-copy and multi-copy orthogroups
    for (type in c("SingleCopy", "MultiCopy")) {
      tmp <- thorax_data_merged %>% filter(Orthogroup_Type == type)
      
      degs_by_type[[type]]$up[[species]] <- tmp %>%
        filter(padj < 0.05 & log2FoldChange >= 1) %>%
        pull(Orthogroup) %>%
        unique()
      
      degs_by_type[[type]]$down[[species]] <- tmp %>%
        filter(padj < 0.05 & log2FoldChange <= -1) %>%
        pull(Orthogroup) %>%
        unique()
      
      degs_by_type[[type]]$all[[species]] <- tmp %>%
        filter(padj < 0.05 & abs(log2FoldChange) >= 1) %>%
        pull(Orthogroup) %>%
        unique()
    }
  }
  
  return(degs_by_type)
}

venn_data_locusts <- load_deg_data(locusts, allspecies_df, filtered_final_orthotable)

# Accessing specific types
venn_data_all_MC <- venn_data_locusts$MultiCopy$all
venn_data_all_SC <- venn_data_locusts$SingleCopy$all

# Plot
message("🧬 Thorax All SingleCopy Orthogroups")
display_venn_with_datatable(venn_data_all_SC, "SingleCopy All DEGs", allspecies_df, filtered_final_orthotable)
Overlapping Orthogroups: 
 [1] "OG0005741" "OG0013138" "OG0011824" "OG0011853" "OG0011877" "OG0012058"
 [7] "OG0014033" "OG0012103" "OG0015157" "OG0010128" "OG0010429" "OG0012329"
[13] "OG0012400" "OG0012312" "OG0005229"

message("🧬 Thorax All MultiCopy Orthogroups")
display_venn_with_datatable(venn_data_all_MC, "MultiCopy All DEGs", allspecies_df, filtered_final_orthotable)
Overlapping Orthogroups: 
 [1] "OG0002903" "OG0000347" "OG0003126" "OG0000796" "OG0001327" "OG0000000"
 [7] "OG0009700" "OG0002737" "OG0000043" "OG0007957" "OG0000532" "OG0000027"
[13] "OG0000017" "OG0009757" "OG0003752" "OG0003890" "OG0000093" "OG0000419"
[19] "OG0000334" "OG0000001" "OG0002738" "OG0000065" "OG0000391" "OG0000130"
[25] "OG0000327" "OG0000357" "OG0004144" "OG0000679" "OG0000104" "OG0000015"
[31] "OG0000090" "OG0000014" "OG0004306" "OG0009321" "OG0004309" "OG0000346"
[37] "OG0000151" "OG0011174" "OG0000003" "OG0007893" "OG0001029" "OG0000922"
[43] "OG0000615" "OG0011184" "OG0000095" "OG0010105" "OG0000012"
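
The 15 single-copy and 47 multi-copy orthogroups printed above add up to the 62 found in the unsplit "all" overlap, which is what one expects if every orthogroup carries exactly one `Orthogroup_Type`. The sketch below reproduces that partition on a toy table. The column names follow the analysis, but the values and the helper `shared_across_species()` are hypothetical.

```r
# Hypothetical DEG table; only the column names mirror the analysis
deg <- data.frame(
  Species         = c("gregaria", "gregaria", "gregaria",
                      "piceifrons", "piceifrons",
                      "cancellata", "cancellata", "cancellata"),
  Orthogroup      = c("OG_A", "OG_B", "OG_C",
                      "OG_A", "OG_B",
                      "OG_A", "OG_B", "OG_C"),
  Orthogroup_Type = c("SingleCopy", "MultiCopy", "MultiCopy",
                      "SingleCopy", "MultiCopy",
                      "SingleCopy", "MultiCopy", "MultiCopy"),
  stringsAsFactors = FALSE
)

# Orthogroups present in every species that occurs in the table
shared_across_species <- function(df) {
  sets <- lapply(split(df$Orthogroup, df$Species), unique)
  Reduce(intersect, sets)
}

# Overlap computed within each orthogroup type, then on the whole table
shared_by_type <- lapply(split(deg, deg$Orthogroup_Type), shared_across_species)
shared_total   <- shared_across_species(deg)

print(lengths(shared_by_type))
print(length(shared_total))
```

One caveat of this helper: it intersects only the species present in the subset, so a species with no DEG of a given type would silently drop out, whereas the analysis loops over a fixed species list.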

# Define the species for Group 1
locusts <- c("gregaria", "piceifrons", "cancellata")
# Path relative to the knit directory (project root), as suggested by the workflowr checks
ortho_dir <- "data/orthofinder/Polyneoptera/Results_I2_iqtree"
input_file <- file.path(ortho_dir, "Orthogroups_genesproteinbiotype_13species_annotated_May2025.csv")
filtered_final_orthotable <- read.csv(input_file, header = TRUE, stringsAsFactors = FALSE)
# Initialize an empty list to store heatmap data for each species
heatmap_list <- list()

# Loop through each species to process their data
for (species in locusts) {
  # Build the path to the DESeq2 results for thorax
  thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq/", paste0(species, "/Thorax/DESeq2_sigresults_sva_Thorax_", species,".csv"))
  
  # Load the DESeq2 results
  thorax_data <- read.csv(thorax_file, stringsAsFactors = FALSE)
  
  # Check if data is empty and handle accordingly
  if (nrow(thorax_data) == 0) {
    message(paste("No data for species:", species))
    next  # Skip to the next species if there's no data
  }
  
  # Rename the "gene_id" column in filtered_final_orthotable for consistency
  colnames(filtered_final_orthotable)[colnames(filtered_final_orthotable) == "gene_id"] <- "GeneID"
  
  # Merge with filtered_final_orthotable to include Orthogroup
  merged_data <- merge(thorax_data, filtered_final_orthotable, by = "GeneID", all.x = TRUE)
  
  # Check if merge was successful
  if (nrow(merged_data) == 0) {
    message(paste("No matching data for species:", species))
    next  # Skip if no matching data after merging
  }

  # Filter for significant DEGs and select top 500 upregulated and downregulated genes for each tissue
  thorax_up <- merged_data %>%
    filter(padj < 0.05 & log2FoldChange > 1) %>%
    arrange(desc(log2FoldChange)) %>%
    slice(1:500)
  
  thorax_down <- merged_data %>%
    filter(padj < 0.05 & log2FoldChange < -1) %>%
    arrange(log2FoldChange) %>%
    slice(1:500)
  
  # Combine data and prepare for heatmap, adding the species column
  heatmap_data <- bind_rows(
    thorax_up %>% mutate(Tissue = "Thorax", Regulation = "Upregulated", Species = species),
    thorax_down %>% mutate(Tissue = "Thorax", Regulation = "Downregulated", Species = species)
  ) %>%
    select(Orthogroup, log2FoldChange, Tissue, Regulation, Species)
  
  # Append the heatmap data to the list
  heatmap_list[[species]] <- heatmap_data
}

# Combine all species data into a single dataframe for heatmap matrix preparation
final_heatmap_data <- bind_rows(heatmap_list)

# Check if final_heatmap_data is empty before proceeding
if (nrow(final_heatmap_data) == 0) {
    stop("No valid data available for heatmap generation.")
}

# Filter out rows with missing Orthogroup values
final_heatmap_data <- final_heatmap_data %>%
    filter(!is.na(Orthogroup))

# Check if there are any missing values in log2FoldChange (optional, just in case)
final_heatmap_data <- final_heatmap_data %>%
    filter(!is.na(log2FoldChange))

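# Create heatmap matrix keyed by Orthogroup instead of GeneID: log2FoldChange is summed over
# all selected DEGs of an orthogroup per species; orthogroups with no DEG in a species get 0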
heatmap_matrix <- final_heatmap_data %>%
    group_by(Orthogroup, Species) %>%
    summarize(
        Thorax_Combined = sum(log2FoldChange[Tissue == "Thorax"], na.rm = TRUE),
        .groups = 'drop'
    ) %>%
    pivot_wider(names_from = Species, 
                values_from = Thorax_Combined, 
                values_fill = list(Thorax_Combined = 0)) %>%
    column_to_rownames("Orthogroup") %>%
    as.matrix()

# Check if heatmap_matrix is empty
if (nrow(heatmap_matrix) == 0) {
    stop("No valid data available for heatmap matrix.")
}

# Define color palettes
# Define a custom color gradient where 0 is black
custom_color_palette1 <- colorRampPalette(c("cyan", "cyan3", "black", "orange3", "orange"))(100)

# Define a custom color gradient where 0 is white
custom_color_palette2 <- colorRampPalette(c("blue3", "blue", "white", "red", "red3"))(100)

# Define color breaks symmetric around 0 so the palette midpoint (white or black) sits at 0
max_abs_lfc <- max(abs(heatmap_matrix), na.rm = TRUE)  # Get max absolute summed log2FoldChange
color_breaks <- seq(-max_abs_lfc, max_abs_lfc, length.out = 101)  # One more break than colors (100), as pheatmap expects

# Create heatmap with clustering
pheatmap(
  heatmap_matrix,
  color = custom_color_palette2,
  breaks = color_breaks,
  cluster_rows = TRUE,  # Cluster genes
  cluster_cols = FALSE,  # Do not cluster species
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Orthologs Expression in Thorax Tissue - STRATEGY 2"
)
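
The matrix passed to `pheatmap()` sums `log2FoldChange` over every selected DEG of an orthogroup within a species and fills absent combinations with 0. The following is a minimal base R sketch of that aggregation on toy values (hypothetical numbers, not results from this analysis), using `tapply()` instead of the `group_by()`/`pivot_wider()` pipeline.

```r
# Toy long-format table: one row per DEG (values are made up)
toy <- data.frame(
  Orthogroup     = c("OG_A", "OG_A", "OG_A", "OG_B", "OG_C"),
  Species        = c("gregaria", "gregaria", "piceifrons", "gregaria", "cancellata"),
  log2FoldChange = c(2, -1.5, 3, -2, 1.25),
  stringsAsFactors = FALSE
)

# Sum log2FoldChange within each Orthogroup x Species cell
m <- tapply(toy$log2FoldChange, list(toy$Orthogroup, toy$Species), sum)

# Cells with no DEG come back as NA; the analysis fills them with 0
m[is.na(m)] <- 0

print(m)
```

Two properties of this design are worth keeping in mind when reading the heatmap: paralogs with opposite signs partly cancel (`OG_A` in `gregaria` sums to 0.5 here), and a cell equal to 0 can mean either "no DEG" or "DEGs that cancel out".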

Version Author Date
4e391c3 Maeva TECHER 2025-05-30
b540a1e Maeva TECHER 2025-02-27
d7fa779 Maeva TECHER 2025-02-14
34c299a Maeva TECHER 2025-02-06
aab712a Maeva TECHER 2025-02-04
faf2db3 Maeva TECHER 2025-01-13
fe6dae9 Maeva TECHER 2024-11-19
# Same heatmap with the black-centered palette (rows clustered, species columns not clustered)
pheatmap(
  heatmap_matrix,
  color = custom_color_palette1,
  breaks = color_breaks,
  cluster_rows = TRUE,  
  cluster_cols = FALSE,  
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Orthologs Expression in Thorax Tissue - STRATEGY 2"
)
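
The pheatmap documentation describes `breaks` as a sequence one element longer than the color vector. The sketch below shows how a symmetric break vector lines up with a 100-color diverging palette so that the palette midpoint falls at 0. The value of `max_abs` is a hypothetical stand-in for the largest absolute entry of the heatmap matrix.

```r
palette_n <- 100
pal <- colorRampPalette(c("blue3", "blue", "white", "red", "red3"))(palette_n)

max_abs <- 7.3  # hypothetical largest absolute value in the matrix

# palette_n colors need palette_n + 1 break points
brks <- seq(-max_abs, max_abs, length.out = palette_n + 1)

# Bin a few values the way a break-based color scale would
vals <- c(-7.3, -0.01, 0.01, 7.3)
bins <- cut(vals, breaks = brks, include.lowest = TRUE, labels = FALSE)

print(bins)
```

With 101 breaks, small negative values land in bin 50 and small positive values in bin 51, so the two central colors of the palette sit on either side of 0.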


piceifrons-americana-cubense

Head tissues

# Define the species for PACclade
PACclade <- c("piceifrons", "americana", "cubense")
# Path relative to the knit directory (project root), as suggested by the workflowr checks
ortho_dir <- "data/orthofinder/Polyneoptera/Results_I2_iqtree"
input_file <- file.path(ortho_dir, "Orthogroups_genesproteinbiotype_13species_annotated_May2025.csv")
filtered_final_orthotable <- read.csv(input_file, header = TRUE, stringsAsFactors = FALSE)

# Function to load DEGs for a given group of species
load_deg_data <- function(PACclade, allspecies_df, filtered_final_orthotable) {
    degs_up <- list()
    degs_down <- list()
    degs_all <- list()
    
    # Rename the "gene_id" column in filtered_final_orthotable for consistency
    colnames(filtered_final_orthotable)[colnames(filtered_final_orthotable) == "gene_id"] <- "GeneID"
    
    for (species in PACclade) {
        head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq/", paste0(species, "/Head/DESeq2_sigresults_sva_Head_", species,".csv"))
        
        # Check if the file exists
        if (!file.exists(head_file)) {
            message(paste("File not found for species:", species))
            next  # Skip this iteration if the file is missing
        }
        
        # Read the data
        head_data <- read.csv(head_file, stringsAsFactors = FALSE)
        
        # Rename the "X" column to "GeneID"
        #colnames(head_data)[colnames(head_data) == "X"] <- "GeneID"
        
        # Merge DEG data with GeneType and Orthogroup information
        head_data_merged <- merge(head_data, allspecies_df[, c("GeneID", "GeneType", "Species")], by = "GeneID")
        head_data_merged <- merge(head_data_merged, filtered_final_orthotable[, c("GeneID", "Orthogroup")], by = "GeneID")
        
        # Handle missing Orthogroups
        head_data_merged$Orthogroup[is.na(head_data_merged$Orthogroup)] <- "Unknown"
        
        # Filter for significant DEGs (both upregulated and downregulated)
        head_up <- head_data_merged %>%
            filter(padj < 0.05 & log2FoldChange >= 1) %>%
            select(Orthogroup) %>%
            distinct()
        
        head_down <- head_data_merged %>%
            filter(padj < 0.05 & log2FoldChange <= -1) %>%
            select(Orthogroup) %>%
            distinct()
        
        all_deg <- head_data_merged %>%
            filter(padj < 0.05 & abs(log2FoldChange) >= 1) %>%
            select(Orthogroup) %>%
            distinct()
        
        # Store the DEGs in the list
        degs_up[[species]] <- head_up$Orthogroup
        degs_down[[species]] <- head_down$Orthogroup
        degs_all[[species]] <- all_deg$Orthogroup
    }
    
    return(list(up = degs_up, down = degs_down, all = degs_all))
}

# Function to display Venn diagram and corresponding datatable based on Orthogroups
display_venn_with_datatable <- function(venn_data, title, allspecies_df, filtered_final_orthotable, output_prefix = NULL) {
    
    # Calculate overlapping Orthogroups
    overlap_orthogroups <- Reduce(intersect, venn_data)
    
    # Print overlap info
    cat("Overlapping Orthogroups: \n")
    print(overlap_orthogroups)
    
    # If no overlaps exist, display a message and an empty plot
    if (length(overlap_orthogroups) == 0) {
        message("⚠️ No overlapping Orthogroups found. Displaying an empty Venn diagram.")
        
        # Create an empty Venn diagram placeholder
        plot.new()
        text(0.5, 0.5, "No overlapping Orthogroups found", cex = 1.5, col = "red")
        
        return(NULL)  # Exit the function gracefully
    }
    
    # Create a data frame for the overlapping Orthogroups
    overlap_df <- data.frame(Orthogroup = overlap_orthogroups)
    
    # Merge to get species and other information from filtered_final_orthotable
    meta_brock_df <- merge(overlap_df, filtered_final_orthotable, by = "Orthogroup", all.x = TRUE)
    
    # Ensure merged data exists
    if (nrow(meta_brock_df) == 0) {
        message("⚠️ Merge failed: No matching rows after merging Orthogroups.")
        return(NULL)
    }
    
    # === ✨ Save the results to CSV
    if (!is.null(output_prefix)) {
        write_csv(
            meta_brock_df,
            file.path(workDir, "overlap/Bulk_RNAseq", paste0(output_prefix, "_overlap_table.csv"))
        )
    }
    
    # Generate the Venn diagram
    venn_plot <- venn.diagram(
        x = venn_data, 
        category.names = c("piceifrons", "americana", "cubense"), 
        filename = NULL, 
        output = TRUE,
        fill = c("red", "green", "yellow"),
        alpha = 0.5,
        cex = 3,
        fontface = "bold",
        cat.cex = 0,
        main = title,
        main.cex = 1.2
    )
    
    # Clear the current plotting area before drawing the Venn diagram
    grid.newpage()
    
    # Display the Venn diagram
    grid.draw(venn_plot)
    
    # Manually create a custom legend
    legend_labels <- c("piceifrons", "americana", "cubense")
    legend_colors <- c("red", "green", "yellow")
    
    # Positioning the legend lower on the right side of the plot
    legend_x <- unit(0.85, "npc")  # Adjust x position
    legend_y <- unit(0.2, "npc")   # Lower the legend vertically
    
    # Draw the legend
    for (i in seq_along(legend_labels)) {
        grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  width = unit(0.02, "npc"), height = unit(0.02, "npc"), 
                  gp = gpar(fill = legend_colors[i], col = NA))
        grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"), 
                  y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  just = "left", gp = gpar(cex = 0.8))
    }
    
    # Display the merged overlapping Orthogroups table with datatable
    datatable(meta_brock_df, options = list(
        pageLength = 10,
        scrollX = TRUE,
        autoWidth = TRUE,
        searchHighlight = TRUE
    ),
    rownames = FALSE,
    escape = FALSE
    ) %>%
        formatStyle(
            'Species', target = 'cell',
            fontStyle = 'italic'
        ) %>%
        formatStyle(
            columns = names(meta_brock_df), 
            target = 'row',
            color = styleEqual(c("red", "blue", "black"), c("red", "blue", "black")),
            fontWeight = styleEqual(c("bold", "normal"), c("bold", "normal")),
            backgroundColor = styleEqual(c("red", "blue", "black"), c("white", "white", "white"))
        )
}

# Load head DEGs for the PAC clade
venn_data_pacclade <- load_deg_data(PACclade, allspecies_df, filtered_final_orthotable)

# Prepare the data for the Venn diagrams for PACclade
venn_data_up <- list(
  piceifrons = venn_data_pacclade$up[["piceifrons"]],
  americana = venn_data_pacclade$up[["americana"]],
  cubense = venn_data_pacclade$up[["cubense"]]
)

venn_data_down <- list(
  piceifrons = venn_data_pacclade$down[["piceifrons"]],
  americana = venn_data_pacclade$down[["americana"]],
  cubense = venn_data_pacclade$down[["cubense"]]
)

venn_data_all <- list(
  piceifrons = venn_data_pacclade$all[["piceifrons"]],
  americana = venn_data_pacclade$all[["americana"]],
  cubense = venn_data_pacclade$all[["cubense"]]
)

message("Processing Venn diagram for head upregulated DEGs...")
display_venn_with_datatable(venn_data_up, "Venn Diagram of Head Upregulated DEGs - PAC", allspecies_df, filtered_final_orthotable, output_prefix = "Venn_Up_HEAD_PAC")
Overlapping Orthogroups: 
[1] "OG0011789" "OG0011038"

Version Author Date
4e391c3 Maeva TECHER 2025-05-30
b540a1e Maeva TECHER 2025-02-27
89984c0 Maeva TECHER 2025-02-19
3746422 Maeva TECHER 2025-02-12
34c299a Maeva TECHER 2025-02-06
aab712a Maeva TECHER 2025-02-04
faf2db3 Maeva TECHER 2025-01-13
fe6dae9 Maeva TECHER 2024-11-19
message("Processing Venn diagram for head downregulated DEGs...")
display_venn_with_datatable(venn_data_down, "Venn Diagram of Head Downregulated DEGs - PAC", allspecies_df, filtered_final_orthotable, output_prefix = "Venn_Down_HEAD_PAC")
Overlapping Orthogroups: 
character(0)
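
`character(0)` is what `Reduce(intersect, ...)` returns when no orthogroup is present in all three sets. In that case `display_venn_with_datatable()` takes its early-return branch, which is why a placeholder plot and `NULL` appear instead of a Venn diagram. The sketch below, on hypothetical toy sets, shows that pairwise overlaps can still exist when the three-way overlap is empty.

```r
# Toy sets with no orthogroup common to all three species (hypothetical IDs)
sets <- list(
  piceifrons = c("OG_A", "OG_B"),
  americana  = c("OG_B", "OG_C"),
  cubense    = c("OG_C", "OG_D")
)

# Three-way overlap, as computed in the analysis
shared <- Reduce(intersect, sets)

# Pairwise overlaps for every pair of species
pairs <- combn(names(sets), 2, simplify = FALSE)
pairwise <- lapply(pairs, function(p) intersect(sets[[p[1]]], sets[[p[2]]]))
names(pairwise) <- vapply(pairs, paste, character(1), collapse = "_")

print(shared)
print(pairwise)
```

If the pairwise regions are of interest, one option would be to draw the Venn diagram before the emptiness check and skip only the table; that would be a change to the function, not something the current code does.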

NULL
message("Processing Venn diagram for all significant DEGs...")
display_venn_with_datatable(venn_data_all, "Venn Diagram of All Head DEGs - PAC", allspecies_df, filtered_final_orthotable, output_prefix = "Venn_ALL_HEAD_PAC")
Overlapping Orthogroups: 
[1] "OG0000076" "OG0000005" "OG0011789" "OG0011038"

Version Author Date
4e391c3 Maeva TECHER 2025-05-30
5fe5034 Maeva TECHER 2025-02-27
34c299a Maeva TECHER 2025-02-06
# Define the species for PACclade
PACclade <- c("piceifrons", "americana", "cubense")
# Path relative to the knit directory (project root), as suggested by the workflowr checks
ortho_dir <- "data/orthofinder/Polyneoptera/Results_I2_iqtree"
input_file <- file.path(ortho_dir, "Orthogroups_genesproteinbiotype_13species_annotated_May2025.csv")
filtered_final_orthotable <- read.csv(input_file, header = TRUE, stringsAsFactors = FALSE)
# Initialize an empty list to store heatmap data for each species
heatmap_list <- list()

# Loop through each species to process their data
for (species in PACclade) {
  # Load DESeq2 results for head
  head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq/", paste0(species, "/Head/DESeq2_sigresults_sva_Head_", species,".csv"))
  
  # Load the DESeq2 results
  head_data <- read.csv(head_file, stringsAsFactors = FALSE)
  
  # Check if data is empty and handle accordingly
  if (nrow(head_data) == 0) {
    message(paste("No data for species:", species))
    next  # Skip to the next species if there's no data
  }
  
  # Rename the "gene_id" column in filtered_final_orthotable for consistency
  #colnames(filtered_final_orthotable)[colnames(filtered_final_orthotable) == "gene_id"] <- "GeneID"
  
  # Merge with filtered_final_orthotable to include Orthogroup
  merged_data <- merge(head_data, filtered_final_orthotable, by = "GeneID", all.x = TRUE)
  
  # Check if merge was successful
  if (nrow(merged_data) == 0) {
    message(paste("No matching data for species:", species))
    next  # Skip if no matching data after merging
  }

  # Filter for significant DEGs and select top 500 upregulated and downregulated genes for each tissue
  head_up <- merged_data %>%
    filter(padj < 0.05 & log2FoldChange > 1) %>%
    arrange(desc(log2FoldChange)) %>%
    slice(1:500)
  
  head_down <- merged_data %>%
    filter(padj < 0.05 & log2FoldChange < -1) %>%
    arrange(log2FoldChange) %>%
    slice(1:500)
  
  # Combine data and prepare for heatmap, adding the species column
  heatmap_data <- bind_rows(
    head_up %>% mutate(Tissue = "Head", Regulation = "Upregulated", Species = species),
    head_down %>% mutate(Tissue = "Head", Regulation = "Downregulated", Species = species)
  ) %>%
    select(Orthogroup, log2FoldChange, Tissue, Regulation, Species)
  
  # Append the heatmap data to the list
  heatmap_list[[species]] <- heatmap_data
}

# Combine all species data into a single dataframe for heatmap matrix preparation
final_heatmap_data <- bind_rows(heatmap_list)

# Check if final_heatmap_data is empty before proceeding
if (nrow(final_heatmap_data) == 0) {
    stop("No valid data available for heatmap generation.")
}

# Filter out rows with missing Orthogroup values
final_heatmap_data <- final_heatmap_data %>%
    filter(!is.na(Orthogroup))

# Check if there are any missing values in log2FoldChange (optional, just in case)
final_heatmap_data <- final_heatmap_data %>%
    filter(!is.na(log2FoldChange))

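# Create heatmap matrix keyed by Orthogroup instead of GeneID: log2FoldChange is summed over
# all selected DEGs of an orthogroup per species; orthogroups with no DEG in a species get 0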
heatmap_matrix <- final_heatmap_data %>%
    group_by(Orthogroup, Species) %>%
    summarize(
        Head_Combined = sum(log2FoldChange[Tissue == "Head"], na.rm = TRUE),
        .groups = 'drop'
    ) %>%
    pivot_wider(names_from = Species, 
                values_from = Head_Combined, 
                values_fill = list(Head_Combined = 0)) %>%
    column_to_rownames("Orthogroup") %>%
    as.matrix()

# Check if heatmap_matrix is empty
if (nrow(heatmap_matrix) == 0) {
    stop("No valid data available for heatmap matrix.")
}

# Define color palettes
# Define a custom color gradient where 0 is black
custom_color_palette1 <- colorRampPalette(c("cyan", "cyan3", "black", "orange3", "orange"))(100)

# Define a custom color gradient where 0 is white
custom_color_palette2 <- colorRampPalette(c("blue3", "blue", "white", "red", "red3"))(100)

# Define color breaks symmetric around 0 so the palette midpoint (white or black) sits at 0
max_abs_lfc <- max(abs(heatmap_matrix), na.rm = TRUE)  # Get max absolute summed log2FoldChange
color_breaks <- seq(-max_abs_lfc, max_abs_lfc, length.out = 101)  # One more break than colors (100), as pheatmap expects

# Create heatmap with clustering
pheatmap(
  heatmap_matrix,
  color = custom_color_palette2,
  breaks = color_breaks,
  cluster_rows = TRUE,  # Cluster genes
  cluster_cols = FALSE,  # Do not cluster species
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Orthologs Expression in Head Tissue - STRATEGY 2"
)
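
Before summing, the heatmap loop keeps at most the 500 strongest upregulated and 500 strongest downregulated genes per species. It uses strict thresholds (`log2FoldChange > 1` and `< -1`), whereas the Venn diagram code uses `>= 1` and `<= -1`. The sketch below shows the same selection in base R on a toy table with a smaller cutoff; the helper `top_n_degs()` and all values are hypothetical.

```r
# Toy DESeq2-like table (made-up values)
toy <- data.frame(
  GeneID         = paste0("g", 1:8),
  padj           = c(0.001, 0.20, 0.01, 0.03, 0.04, 0.001, 0.02, 0.60),
  log2FoldChange = c(3.2, 5.0, 1.4, -2.5, 0.8, -4.1, 2.0, -6.0),
  stringsAsFactors = FALSE
)

# Keep the n strongest significant genes in one direction
top_n_degs <- function(df, n, direction = c("up", "down")) {
  direction <- match.arg(direction)
  sig <- df[!is.na(df$padj) & df$padj < 0.05, ]
  if (direction == "up") {
    sig <- sig[sig$log2FoldChange > 1, ]
    sig <- sig[order(-sig$log2FoldChange), ]   # largest first
  } else {
    sig <- sig[sig$log2FoldChange < -1, ]
    sig <- sig[order(sig$log2FoldChange), ]    # most negative first
  }
  head(sig, n)  # returns fewer rows if fewer than n genes qualify
}

top_up   <- top_n_degs(toy, n = 2, direction = "up")
top_down <- top_n_degs(toy, n = 2, direction = "down")

print(top_up$GeneID)
print(top_down$GeneID)
```

Genes `g2` and `g8` have the largest fold changes in the toy table but are excluded because their adjusted p-values are above 0.05.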

Version Author Date
4e391c3 Maeva TECHER 2025-05-30
b540a1e Maeva TECHER 2025-02-27
34c299a Maeva TECHER 2025-02-06
aab712a Maeva TECHER 2025-02-04
faf2db3 Maeva TECHER 2025-01-13
fe6dae9 Maeva TECHER 2024-11-19
# Same heatmap with the black-centered palette (rows clustered, species columns not clustered)
pheatmap(
  heatmap_matrix,
  color = custom_color_palette1,
  breaks = color_breaks,
  cluster_rows = TRUE,  
  cluster_cols = FALSE,  
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Orthologs Expression in Head Tissue - STRATEGY 2"
)


Thorax tissues

# Define the species for PACclade
PACclade <- c("piceifrons", "americana", "cubense")
# Path relative to the knit directory (project root), as suggested by the workflowr checks
ortho_dir <- "data/orthofinder/Polyneoptera/Results_I2_iqtree"
input_file <- file.path(ortho_dir, "Orthogroups_genesproteinbiotype_13species_annotated_May2025.csv")
filtered_final_orthotable <- read.csv(input_file, header = TRUE, stringsAsFactors = FALSE)

# Function to load DEGs for a given group of species
load_deg_data <- function(PACclade, allspecies_df, filtered_final_orthotable) {
    degs_up <- list()
    degs_down <- list()
    degs_all <- list()
    
    # Rename the "gene_id" column in filtered_final_orthotable for consistency
    colnames(filtered_final_orthotable)[colnames(filtered_final_orthotable) == "gene_id"] <- "GeneID"
    
    for (species in PACclade) {
        thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq/", paste0(species, "/Thorax/DESeq2_sigresults_sva_Thorax_", species,".csv"))
        
        # Check if the file exists
        if (!file.exists(thorax_file)) {
            message(paste("File not found for species:", species))
            next  # Skip this iteration if the file is missing
        }
        
        # Read the data
        thorax_data <- read.csv(thorax_file, stringsAsFactors = FALSE)
        
        # Rename the "X" column to "GeneID"
        #colnames(thorax_data)[colnames(thorax_data) == "X"] <- "GeneID"
        
        # Merge DEG data with GeneType and Orthogroup information
        thorax_data_merged <- merge(thorax_data, allspecies_df[, c("GeneID", "GeneType", "Species")], by = "GeneID")
        thorax_data_merged <- merge(thorax_data_merged, filtered_final_orthotable[, c("GeneID", "Orthogroup")], by = "GeneID")
        
        # Handle missing Orthogroups
        thorax_data_merged$Orthogroup[is.na(thorax_data_merged$Orthogroup)] <- "Unknown"
        
        # Filter for significant DEGs (both upregulated and downregulated)
        thorax_up <- thorax_data_merged %>%
            filter(padj < 0.05 & log2FoldChange >= 1) %>%
            select(Orthogroup) %>%
            distinct()
        
        thorax_down <- thorax_data_merged %>%
            filter(padj < 0.05 & log2FoldChange <= -1) %>%
            select(Orthogroup) %>%
            distinct()
        
        all_deg <- thorax_data_merged %>%
            filter(padj < 0.05 & abs(log2FoldChange) >= 1) %>%
            select(Orthogroup) %>%
            distinct()
        
        # Store the DEGs in the list
        degs_up[[species]] <- thorax_up$Orthogroup
        degs_down[[species]] <- thorax_down$Orthogroup
        degs_all[[species]] <- all_deg$Orthogroup
    }
    
    return(list(up = degs_up, down = degs_down, all = degs_all))
}

# Function to display Venn diagram and corresponding datatable based on Orthogroups
display_venn_with_datatable <- function(venn_data, title, allspecies_df, filtered_final_orthotable, output_prefix = NULL) {
    
    # Calculate overlapping Orthogroups
    overlap_orthogroups <- Reduce(intersect, venn_data)
    
    # Print overlap info
    cat("Overlapping Orthogroups: \n")
    print(overlap_orthogroups)
    
    # If no overlaps exist, display a message and an empty plot
    if (length(overlap_orthogroups) == 0) {
        message("⚠️ No overlapping Orthogroups found. Displaying an empty Venn diagram.")
        
        # Create an empty Venn diagram placeholder
        plot.new()
        text(0.5, 0.5, "No overlapping Orthogroups found", cex = 1.5, col = "red")
        
        return(NULL)  # Exit the function gracefully
    }
    
    # Create a data frame for the overlapping Orthogroups
    overlap_df <- data.frame(Orthogroup = overlap_orthogroups)
    
    # Merge to get species and other information from filtered_final_orthotable
    meta_brock_df <- merge(overlap_df, filtered_final_orthotable, by = "Orthogroup", all.x = TRUE)
    
    # Ensure merged data exists
    if (nrow(meta_brock_df) == 0) {
        message("⚠️ Merge failed: No matching rows after merging Orthogroups.")
        return(NULL)
    }

    # === ✨ Save the results to CSV
    if (!is.null(output_prefix)) {
        write_csv(
            meta_brock_df,
            file.path(workDir, "overlap/Bulk_RNAseq", paste0(output_prefix, "_overlap_table.csv"))
        )
    }
   
    # Generate the Venn diagram
    venn_plot <- venn.diagram(
        x = venn_data, 
        category.names = c("piceifrons", "americana", "cubense"), 
        filename = NULL, 
        output = TRUE,
        fill = c("red", "green", "yellow"),
        alpha = 0.5,
        cex = 2,
        cat.cex = 0,
        main = title,
        main.cex = 1.2
    )
    
    # Clear the current plotting area before drawing the Venn diagram
    grid.newpage()
    
    # Display the Venn diagram
    grid.draw(venn_plot)
    
    # Manually create a custom legend
    legend_labels <- c("piceifrons", "americana", "cubense")
    legend_colors <- c("red", "green", "yellow")
    
    # Positioning the legend lower on the right side of the plot
    legend_x <- unit(0.85, "npc")  # Adjust x position
    legend_y <- unit(0.2, "npc")   # Lower the legend vertically
    
    # Draw the legend
    for (i in seq_along(legend_labels)) {
        grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  width = unit(0.02, "npc"), height = unit(0.02, "npc"), 
                  gp = gpar(fill = legend_colors[i], col = NA))
        grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"), 
                  y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  just = "left", gp = gpar(cex = 0.8))
    }
    
    # Display the merged overlapping Orthogroups table with datatable
    datatable(meta_brock_df, options = list(
        pageLength = 10,
        scrollX = TRUE,
        autoWidth = TRUE,
        searchHighlight = TRUE
    ),
    rownames = FALSE,
    escape = FALSE
    ) %>%
        formatStyle(
            'Species', target = 'cell',
            fontStyle = 'italic'
        ) %>%
        formatStyle(
            columns = names(meta_brock_df), 
            target = 'row',
            color = styleEqual(c("red", "blue", "black"), c("red", "blue", "black")),
            fontWeight = styleEqual(c("bold", "normal"), c("bold", "normal")),
            backgroundColor = styleEqual(c("red", "blue", "black"), c("white", "white", "white"))
        )
}

# Load thorax DEGs for the PAC clade
venn_data_pacclade <- load_deg_data(PACclade, allspecies_df, filtered_final_orthotable)

# Prepare the data for the Venn diagrams for PACclade
venn_data_up <- list(
  piceifrons = venn_data_pacclade$up[["piceifrons"]],
  americana = venn_data_pacclade$up[["americana"]],
  cubense = venn_data_pacclade$up[["cubense"]]
)

venn_data_down <- list(
  piceifrons = venn_data_pacclade$down[["piceifrons"]],
  americana = venn_data_pacclade$down[["americana"]],
  cubense = venn_data_pacclade$down[["cubense"]]
)

venn_data_all <- list(
  piceifrons = venn_data_pacclade$all[["piceifrons"]],
  americana = venn_data_pacclade$all[["americana"]],
  cubense = venn_data_pacclade$all[["cubense"]]
)

message("Processing Venn diagram for thorax upregulated DEGs...")
display_venn_with_datatable(venn_data_up, "Venn Diagram of Thorax Upregulated DEGs - PAC", allspecies_df, filtered_final_orthotable, output_prefix = "Venn_Up_THORAX_PAC")
Overlapping Orthogroups: 
[1] "OG0000283"

Version Author Date
4e391c3 Maeva TECHER 2025-05-30
b540a1e Maeva TECHER 2025-02-27
89984c0 Maeva TECHER 2025-02-19
3746422 Maeva TECHER 2025-02-12
34c299a Maeva TECHER 2025-02-06
aab712a Maeva TECHER 2025-02-04
faf2db3 Maeva TECHER 2025-01-13
fe6dae9 Maeva TECHER 2024-11-19
message("Processing Venn diagram for thorax downregulated DEGs...")
display_venn_with_datatable(venn_data_down, "Venn Diagram of Thorax Downregulated DEGs - PAC", allspecies_df, filtered_final_orthotable, output_prefix = "Venn_Down_THORAX_PAC")
Overlapping Orthogroups: 
[1] "OG0000014"

message("Processing Venn diagram for all significant DEGs...")
display_venn_with_datatable(venn_data_all, "Venn Diagram of All Thorax DEGs - PAC", allspecies_df, filtered_final_orthotable, output_prefix = "Venn_ALL_THORAX_PAC")
Overlapping Orthogroups: 
[1] "OG0000015" "OG0000233" "OG0000195" "OG0000283" "OG0001502" "OG0000218"
[7] "OG0000249" "OG0000014"
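The "all" overlap above lists eight orthogroups, while the upregulated and downregulated overlaps list one each. This is expected if the "all" set only requires every species to have some DEG in the orthogroup, whatever its direction. The sketch below illustrates the effect with hypothetical species names and orthogroup IDs (none of them come from the analysis), assuming "all" is the per-species union of the up and down sets.

```r
# Hypothetical per-species orthogroup sets (illustrative IDs only)
up   <- list(sp1 = c("OG_A", "OG_B"), sp2 = c("OG_A"),         sp3 = c("OG_A", "OG_C"))
down <- list(sp1 = c("OG_C"),         sp2 = c("OG_B", "OG_C"), sp3 = c("OG_B"))

# Assumption: "all" is the union of up and down within each species
all_deg <- Map(union, up, down)

shared_up   <- Reduce(intersect, up)       # shared with the same direction (up)
shared_down <- Reduce(intersect, down)     # shared with the same direction (down)
shared_all  <- Reduce(intersect, all_deg)  # shared regardless of direction

# Orthogroups shared across species but not with a consistent direction
mixed_direction <- setdiff(shared_all, union(shared_up, shared_down))
print(mixed_direction)
```

Orthogroups that end up in `mixed_direction` are differentially expressed in every species, but not in the same direction everywhere (or through different gene copies).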

# Define the species for PACclade
PACclade <- c("piceifrons", "americana", "cubense")
ortho_dir <- "data/orthofinder/Polyneoptera/Results_I2_iqtree"  # relative to the project root (knit directory)
input_file <- file.path(ortho_dir, "Orthogroups_genesproteinbiotype_13species_annotated_May2025.csv")
filtered_final_orthotable <- read.csv(input_file, header = TRUE, stringsAsFactors = FALSE)
# Initialize an empty list to store heatmap data for each species
heatmap_list <- list()

# Loop through each species to process their data
for (species in PACclade) {
  # Load DESeq2 results for thorax
  thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq/", paste0(species, "/Thorax/DESeq2_sigresults_sva_Thorax_", species,".csv"))
  
  # Load the DESeq2 results
  thorax_data <- read.csv(thorax_file, stringsAsFactors = FALSE)
  
  # Check if data is empty and handle accordingly
  if (nrow(thorax_data) == 0) {
    message(paste("No data for species:", species))
    next  # Skip to the next species if there's no data
  }
  
  # Rename the "gene_id" column in filtered_final_orthotable for consistency
  colnames(filtered_final_orthotable)[colnames(filtered_final_orthotable) == "gene_id"] <- "GeneID"
  
  # Merge with filtered_final_orthotable to include Orthogroup
  merged_data <- merge(thorax_data, filtered_final_orthotable, by = "GeneID", all.x = TRUE)
  
  # Check if merge was successful
  if (nrow(merged_data) == 0) {
    message(paste("No matching data for species:", species))
    next  # Skip if no matching data after merging
  }

  # Filter for significant DEGs and select top 500 upregulated and downregulated genes for each tissue
  thorax_up <- merged_data %>%
    filter(padj < 0.05 & log2FoldChange > 1) %>%
    arrange(desc(log2FoldChange)) %>%
    slice(1:500)
  
  thorax_down <- merged_data %>%
    filter(padj < 0.05 & log2FoldChange < -1) %>%
    arrange(log2FoldChange) %>%
    slice(1:500)
  
  # Combine data and prepare for heatmap, adding the species column
  heatmap_data <- bind_rows(
    thorax_up %>% mutate(Tissue = "Thorax", Regulation = "Upregulated", Species = species),
    thorax_down %>% mutate(Tissue = "Thorax", Regulation = "Downregulated", Species = species)
  ) %>%
    select(Orthogroup, log2FoldChange, Tissue, Regulation, Species)
  
  # Append the heatmap data to the list
  heatmap_list[[species]] <- heatmap_data
}

# Combine all species data into a single dataframe for heatmap matrix preparation
final_heatmap_data <- bind_rows(heatmap_list)

# Check if final_heatmap_data is empty before proceeding
if (nrow(final_heatmap_data) == 0) {
    stop("No valid data available for heatmap generation.")
}

# Filter out rows with missing Orthogroup values
final_heatmap_data <- final_heatmap_data %>%
    filter(!is.na(Orthogroup))

# Check if there are any missing values in log2FoldChange (optional, just in case)
final_heatmap_data <- final_heatmap_data %>%
    filter(!is.na(log2FoldChange))

# Create heatmap matrix using Orthogroup instead of GeneID
heatmap_matrix <- final_heatmap_data %>%
    group_by(Orthogroup, Species) %>%
    summarize(
        Thorax_Combined = sum(log2FoldChange[Tissue == "Thorax"], na.rm = TRUE),
        .groups = 'drop'
    ) %>%
    pivot_wider(names_from = Species, 
                values_from = Thorax_Combined, 
                values_fill = list(Thorax_Combined = 0)) %>%
    column_to_rownames("Orthogroup") %>%
    as.matrix()

# Check if heatmap_matrix is empty
if (nrow(heatmap_matrix) == 0) {
    stop("No valid data available for heatmap matrix.")
}

# Define color palettes
# Define a custom color gradient where 0 is black
custom_color_palette1 <- colorRampPalette(c("cyan", "cyan3", "black", "orange3", "orange"))(100)

# Define a custom color gradient where 0 is white
custom_color_palette2 <- colorRampPalette(c("blue3", "blue", "white", "red", "red3"))(100)

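# Define color breaks symmetric about 0 so that the palette midpoint (white or black) sits at 0.
# pheatmap expects one more break than there are colors.
max_abs_lfc <- max(abs(heatmap_matrix), na.rm = TRUE)  # Get max absolute summed log2FoldChange
color_breaks <- seq(-max_abs_lfc, max_abs_lfc, length.out = length(custom_color_palette2) + 1)  # Symmetric scale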

# Create heatmap with the white-centered palette (rows clustered)
pheatmap(
  heatmap_matrix,
  color = custom_color_palette2,
  breaks = color_breaks,
  cluster_rows = TRUE,  # Cluster orthogroups
  cluster_cols = FALSE,  # Do not cluster species
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Orthologs Expression in Thorax Tissue - STRATEGY 2"
)
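
The heatmap matrix holds one value per orthogroup and species: the sum of the log2 fold changes of all retained DEGs in that orthogroup, with 0 where a species has none. A minimal base-R cross-check of that aggregation on a toy table (hypothetical values; `xtabs` is used here only as a dependency-free stand-in for the `group_by`/`pivot_wider` pipeline):

```r
# Toy long-format table (hypothetical values), one row per DEG
toy <- data.frame(
  Orthogroup     = c("OG1", "OG1", "OG2", "OG2", "OG3"),
  Species        = c("piceifrons", "piceifrons", "piceifrons", "americana", "cubense"),
  log2FoldChange = c(2.0, 1.5, -3.0, 1.2, -1.1),
  stringsAsFactors = FALSE
)

# Sum log2FoldChange per Orthogroup x Species; absent combinations become 0
toy_matrix <- unclass(xtabs(log2FoldChange ~ Orthogroup + Species, data = toy))

print(toy_matrix["OG1", "piceifrons"])  # two gene copies summed into one cell
print(toy_matrix["OG3", "americana"])   # zero-filled: no DEG retained, not "measured as unchanged"
```

Two caveats follow from this layout: a cell can mix several paralogs (including copies regulated in opposite directions), and a 0 cannot be told apart from an orthogroup that was simply not differentially expressed. `xtabs` orders columns alphabetically, whereas `pivot_wider` keeps the order of appearance.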

# Create the same heatmap with the black-centered palette
pheatmap(
  heatmap_matrix,
  color = custom_color_palette1,
  breaks = color_breaks,
  cluster_rows = TRUE,  
  cluster_cols = FALSE,  
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Orthologs Expression in Thorax Tissue - STRATEGY 2"
)
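
pheatmap documents `breaks` as a sequence one element longer than the color vector. The sketch below builds breaks that are symmetric about zero under that convention, so that zero falls exactly on the boundary between the two middle colors. The palette and the maximum value are hypothetical.

```r
palette_n   <- 100
toy_palette <- colorRampPalette(c("blue3", "white", "red3"))(palette_n)

max_abs <- 4.2  # hypothetical largest absolute value in the matrix

# One more break than colors, symmetric about 0
toy_breaks <- seq(-max_abs, max_abs, length.out = palette_n + 1)

print(length(toy_breaks) - length(toy_palette))  # number of extra breaks
print(toy_breaks[palette_n / 2 + 1])             # middle break, at (numerically) zero
```

With an even number of colors no single color is centered on zero; an odd palette length would give a true neutral middle bin if that is preferred.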


Plastic species

Head tissues

# Define the species for plastic_species
plastic_species <- c("gregaria", "piceifrons", "cancellata", "americana")
ortho_dir <- "data/orthofinder/Polyneoptera/Results_I2_iqtree"  # relative to the project root (knit directory)
input_file <- file.path(ortho_dir, "Orthogroups_genesproteinbiotype_13species_annotated_May2025.csv")
filtered_final_orthotable <- read.csv(input_file, header = TRUE, stringsAsFactors = FALSE)


# Function to load DEGs for a given group of species
load_deg_data <- function(plastic_species, allspecies_df, filtered_final_orthotable) {
    degs_up <- list()
    degs_down <- list()
    degs_all <- list()
    
    # Rename the "gene_id" column in filtered_final_orthotable for consistency
    colnames(filtered_final_orthotable)[colnames(filtered_final_orthotable) == "gene_id"] <- "GeneID"
    
    for (species in plastic_species) {
        head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq/", paste0(species, "/Head/DESeq2_sigresults_sva_Head_", species,".csv"))
        
        # Check if the file exists
        if (!file.exists(head_file)) {
            message(paste("File not found for species:", species))
            next  # Skip this iteration if the file is missing
        }
        
        # Read the data
        head_data <- read.csv(head_file, stringsAsFactors = FALSE)
        
        # Rename the "X" column to "GeneID"
        #colnames(head_data)[colnames(head_data) == "X"] <- "GeneID"
        
        # Merge DEG data with GeneType and Orthogroup information
        head_data_merged <- merge(head_data, allspecies_df[, c("GeneID", "GeneType", "Species")], by = "GeneID")
        head_data_merged <- merge(head_data_merged, filtered_final_orthotable[, c("GeneID", "Orthogroup")], by = "GeneID")
        
        # Handle missing Orthogroups
        head_data_merged$Orthogroup[is.na(head_data_merged$Orthogroup)] <- "Unknown"
        
        # Filter for significant DEGs (both upregulated and downregulated)
        head_up <- head_data_merged %>%
            filter(padj < 0.05 & log2FoldChange >= 1) %>%
            select(Orthogroup) %>%
            distinct()
        
        head_down <- head_data_merged %>%
            filter(padj < 0.05 & log2FoldChange <= -1) %>%
            select(Orthogroup) %>%
            distinct()
        
        all_deg <- head_data_merged %>%
            filter(padj < 0.05 & abs(log2FoldChange) >= 1) %>%
            select(Orthogroup) %>%
            distinct()
        
        # Store the DEGs in the list
        degs_up[[species]] <- head_up$Orthogroup
        degs_down[[species]] <- head_down$Orthogroup
        degs_all[[species]] <- all_deg$Orthogroup
    }
    
    return(list(up = degs_up, down = degs_down, all = degs_all))
}

# Function to display a Venn diagram and the corresponding datatable of overlapping Orthogroups
display_venn_with_datatable <- function(venn_data, title, allspecies_df, filtered_final_orthotable) {
    
    # Calculate overlapping Orthogroups
    overlap_orthogroups <- Reduce(intersect, venn_data)
    
    # Print overlap info
    cat("Overlapping Orthogroups: \n")
    print(overlap_orthogroups)
    
    # If no overlaps exist, display a message and an empty plot
    if (length(overlap_orthogroups) == 0) {
        message("⚠️ No overlapping Orthogroups found. Displaying an empty Venn diagram.")
        
        # Create an empty Venn diagram placeholder
        plot.new()
        text(0.5, 0.5, "No overlapping Orthogroups found", cex = 1.5, col = "red")
        
        return(NULL)  # Exit the function gracefully
    }
    
    # Create a data frame for the overlapping Orthogroups
    overlap_df <- data.frame(Orthogroup = overlap_orthogroups)
    
    # Merge to get species and other information from filtered_final_orthotable
    meta_brock_df <- merge(overlap_df, filtered_final_orthotable, by = "Orthogroup", all.x = TRUE)
    
    # Ensure merged data exists
    if (nrow(meta_brock_df) == 0) {
        message("⚠️ Merge failed: No matching rows after merging Orthogroups.")
        return(NULL)
    }
    
    # Generate the Venn diagram
    venn_plot <- venn.diagram(
        x = venn_data, 
        category.names = names(venn_data),  # label the sets in the order they are supplied
        filename = NULL, 
        output = TRUE,
        fill = c("red", "green", "yellow", "orange"),
        alpha = 0.5,
        cex = 2,
        cat.cex = 0,
        main = title,
        main.cex = 1.2
    )
    
    # Clear the current plotting area before drawing the Venn diagram
    grid.newpage()
    
    # Display the Venn diagram
    grid.draw(venn_plot)
    
    # Manually create a custom legend
    legend_labels <- names(venn_data)  # same order as the fill colors
    legend_colors <- c("red", "green", "yellow", "orange")
    
    # Positioning the legend lower on the right side of the plot
    legend_x <- unit(0.85, "npc")  # Adjust x position
    legend_y <- unit(0.2, "npc")   # Lower the legend vertically
    
    # Draw the legend
    #for (i in 1:length(legend_labels)) {
    #    grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"), 
    #              width = unit(0.02, "npc"), height = unit(0.02, "npc"), 
    #              gp = gpar(fill = legend_colors[i], col = NA))
    #    grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"), 
    #              y = legend_y - unit((i - 1) * 0.05, "npc"), 
    #              just = "left", gp = gpar(cex = 0.8))
    #}
    
    # Display the merged overlapping Orthogroups table with datatable
    datatable(meta_brock_df, options = list(
        pageLength = 10,
        scrollX = TRUE,
        autoWidth = TRUE,
        searchHighlight = TRUE
    ),
    rownames = FALSE,
    escape = FALSE
    ) %>%
        formatStyle(
            'Species', target = 'cell',
            fontStyle = 'italic'
        ) %>%
        formatStyle(
            columns = names(meta_brock_df), 
            target = 'row',
            color = styleEqual(c("red", "blue", "black"), c("red", "blue", "black")),
            fontWeight = styleEqual(c("bold", "normal"), c("bold", "normal")),
            backgroundColor = styleEqual(c("red", "blue", "black"), c("white", "white", "white"))
        )
}

# Load the head DEG orthogroup sets (up, down, all) for the plastic species
venn_data_plastic_species <- load_deg_data(plastic_species, allspecies_df, filtered_final_orthotable)

# Prepare the data for the Venn diagrams for plastic_species
venn_data_up <- list(
  gregaria = venn_data_plastic_species$up[["gregaria"]],
  piceifrons = venn_data_plastic_species$up[["piceifrons"]],
  cancellata = venn_data_plastic_species$up[["cancellata"]],
  americana = venn_data_plastic_species$up[["americana"]]
)

venn_data_down <- list(
  gregaria = venn_data_plastic_species$down[["gregaria"]],
  piceifrons = venn_data_plastic_species$down[["piceifrons"]],
  cancellata = venn_data_plastic_species$down[["cancellata"]],
  americana = venn_data_plastic_species$down[["americana"]]
)

venn_data_all <- list(
  gregaria = venn_data_plastic_species$all[["gregaria"]],
  piceifrons = venn_data_plastic_species$all[["piceifrons"]],
  cancellata = venn_data_plastic_species$all[["cancellata"]],
  americana = venn_data_plastic_species$all[["americana"]]
)

# Display the Venn diagram and datatable for head upregulated DEGs (plastic_species)
display_venn_with_datatable(venn_data_up, "Venn Diagram of Head Upregulated DEGs - Plastic Species", allspecies_df, filtered_final_orthotable)
Overlapping Orthogroups: 
[1] "OG0000371" "OG0003126" "OG0004306" "OG0009808" "OG0000015" "OG0000212"
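
Because the category labels are hidden (`cat.cex = 0`), a reader can only identify the sets by color. Assuming the `fill` colors are applied positionally to the elements of the list passed as `x`, any legend has to follow the order of that list. A small sketch with placeholder sets in the same order as the plastic-species lists:

```r
# Placeholder sets, in the same order as venn_data_up for the plastic species
toy_sets <- list(gregaria = "OG1", piceifrons = "OG1", cancellata = "OG1", americana = "OG1")
fill_colors <- c("red", "green", "yellow", "orange")

# Pair each color with the set it is applied to (positional, following the list order)
legend_map <- setNames(fill_colors, names(toy_sets))
print(legend_map)
```

Deriving labels from `names()` of the input list keeps the legend consistent when the species group changes.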

# Display the Venn diagram and datatable for head downregulated DEGs (plastic_species)
display_venn_with_datatable(venn_data_down, "Venn Diagram of Head Downregulated DEGs - Plastic Species", allspecies_df, filtered_final_orthotable)
Overlapping Orthogroups: 
 [1] "OG0000796" "OG0000296" "OG0010128" "OG0000922" "OG0007990" "OG0000073"
 [7] "OG0000334" "OG0000823" "OG0002738" "OG0000093" "OG0000553"

# Display the Venn diagram and datatable for all significant DEGs (plastic_species)
display_venn_with_datatable(venn_data_all, "Venn Diagram of All Head DEGs - Plastic Species", allspecies_df, filtered_final_orthotable)
Overlapping Orthogroups: 
 [1] "OG0000015" "OG0000371" "OG0003126" "OG0004306" "OG0000073" "OG0000796"
 [7] "OG0000222" "OG0000296" "OG0010128" "OG0009808" "OG0000922" "OG0000120"
[13] "OG0000212" "OG0007990" "OG0003478" "OG0000334" "OG0000823" "OG0002738"
[19] "OG0000093" "OG0000553"
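
The Venn sets are built with inclusive thresholds (`padj < 0.05` with `log2FoldChange >= 1` or `<= -1`), whereas the heatmap chunks use strict ones (`> 1`, `< -1`). The difference only matters for genes sitting exactly on the cutoff, as this base-R sketch on a hypothetical table shows:

```r
# Hypothetical DESeq2-like table; g1 and g5 sit exactly on the |log2FC| = 1 cutoff
toy_deg <- data.frame(
  GeneID         = paste0("g", 1:5),
  Orthogroup     = c("OG1", "OG1", "OG2", "OG3", "OG4"),
  padj           = c(0.01, 0.20, 0.03, 0.04, 0.001),
  log2FoldChange = c(1.0, 3.0, -2.5, 0.4, -1.0),
  stringsAsFactors = FALSE
)

sig <- toy_deg$padj < 0.05

up_inclusive   <- unique(toy_deg$Orthogroup[sig & toy_deg$log2FoldChange >= 1])   # Venn rule
up_strict      <- unique(toy_deg$Orthogroup[sig & toy_deg$log2FoldChange > 1])    # heatmap rule
down_inclusive <- unique(toy_deg$Orthogroup[sig & toy_deg$log2FoldChange <= -1])  # Venn rule

print(up_inclusive)
print(up_strict)
print(down_inclusive)
```

Here OG1 counts as upregulated under the inclusive rule only, because its one significant gene has a log2 fold change of exactly 1.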

# Define the species for plastic_species
plastic_species <- c("gregaria", "piceifrons", "cancellata", "americana")
ortho_dir <- "data/orthofinder/Polyneoptera/Results_I2_iqtree"  # relative to the project root (knit directory)
input_file <- file.path(ortho_dir, "Orthogroups_genesproteinbiotype_13species_annotated_May2025.csv")
filtered_final_orthotable <- read.csv(input_file, header = TRUE, stringsAsFactors = FALSE)
# Initialize an empty list to store heatmap data for each species
heatmap_list <- list()

# Loop through each species to process their data
for (species in plastic_species) {
  # Load DESeq2 results for head
  head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq/", paste0(species, "/Head/DESeq2_sigresults_sva_Head_", species,".csv"))
  
  # Load the DESeq2 results
  head_data <- read.csv(head_file, stringsAsFactors = FALSE)
  
  # Check if data is empty and handle accordingly
  if (nrow(head_data) == 0) {
    message(paste("No data for species:", species))
    next  # Skip to the next species if there's no data
  }
  
  # Rename the "gene_id" column in filtered_final_orthotable for consistency
  colnames(filtered_final_orthotable)[colnames(filtered_final_orthotable) == "gene_id"] <- "GeneID"
  
  # Merge with filtered_final_orthotable to include Orthogroup
  merged_data <- merge(head_data, filtered_final_orthotable, by = "GeneID", all.x = TRUE)
  
  # Check if merge was successful
  if (nrow(merged_data) == 0) {
    message(paste("No matching data for species:", species))
    next  # Skip if no matching data after merging
  }

  # Filter for significant DEGs and select top 500 upregulated and downregulated genes for each tissue
  head_up <- merged_data %>%
    filter(padj < 0.05 & log2FoldChange > 1) %>%
    arrange(desc(log2FoldChange)) %>%
    slice(1:500)
  
  head_down <- merged_data %>%
    filter(padj < 0.05 & log2FoldChange < -1) %>%
    arrange(log2FoldChange) %>%
    slice(1:500)
  
  # Combine data and prepare for heatmap, adding the species column
  heatmap_data <- bind_rows(
    head_up %>% mutate(Tissue = "Head", Regulation = "Upregulated", Species = species),
    head_down %>% mutate(Tissue = "Head", Regulation = "Downregulated", Species = species)
  ) %>%
    select(Orthogroup, log2FoldChange, Tissue, Regulation, Species)
  
  # Append the heatmap data to the list
  heatmap_list[[species]] <- heatmap_data
}

# Combine all species data into a single dataframe for heatmap matrix preparation
final_heatmap_data <- bind_rows(heatmap_list)

# Check if final_heatmap_data is empty before proceeding
if (nrow(final_heatmap_data) == 0) {
    stop("No valid data available for heatmap generation.")
}

# Filter out rows with missing Orthogroup values
final_heatmap_data <- final_heatmap_data %>%
    filter(!is.na(Orthogroup))

# Check if there are any missing values in log2FoldChange (optional, just in case)
final_heatmap_data <- final_heatmap_data %>%
    filter(!is.na(log2FoldChange))

# Create heatmap matrix using Orthogroup instead of GeneID
heatmap_matrix <- final_heatmap_data %>%
    group_by(Orthogroup, Species) %>%
    summarize(
        Head_Combined = sum(log2FoldChange[Tissue == "Head"], na.rm = TRUE),
        .groups = 'drop'
    ) %>%
    pivot_wider(names_from = Species, 
                values_from = Head_Combined, 
                values_fill = list(Head_Combined = 0)) %>%
    column_to_rownames("Orthogroup") %>%
    as.matrix()

# Check if heatmap_matrix is empty
if (nrow(heatmap_matrix) == 0) {
    stop("No valid data available for heatmap matrix.")
}

# Define color palettes
# Define a custom color gradient where 0 is black
custom_color_palette1 <- colorRampPalette(c("cyan", "cyan3", "black", "orange3", "orange"))(100)

# Define a custom color gradient where 0 is white
custom_color_palette2 <- colorRampPalette(c("blue3", "blue", "white", "red", "red3"))(100)

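# Define color breaks symmetric about 0 so that the palette midpoint (white or black) sits at 0.
# pheatmap expects one more break than there are colors.
max_abs_lfc <- max(abs(heatmap_matrix), na.rm = TRUE)  # Get max absolute summed log2FoldChange
color_breaks <- seq(-max_abs_lfc, max_abs_lfc, length.out = length(custom_color_palette2) + 1)  # Symmetric scale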

# Create heatmap with the white-centered palette (rows clustered)
pheatmap(
  heatmap_matrix,
  color = custom_color_palette2,
  breaks = color_breaks,
  cluster_rows = TRUE,  # Cluster orthogroups
  cluster_cols = FALSE,  # Do not cluster species
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Orthologs Expression in Head Tissue - STRATEGY 2"
)
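
The heatmap input keeps at most the 500 strongest upregulated and 500 strongest downregulated genes per species. A minimal base-R sketch of the same "sort, then take the top N" step on a hypothetical table, with N reduced to 3 for readability:

```r
# Hypothetical upregulated genes
toy_up <- data.frame(
  GeneID         = paste0("g", 1:6),
  log2FoldChange = c(1.2, 4.0, 2.5, 1.1, 3.3, 1.8),
  stringsAsFactors = FALSE
)

top_n  <- 3  # the analysis uses 500
top_up <- head(toy_up[order(-toy_up$log2FoldChange), ], top_n)
print(top_up$GeneID)

# Asking for more rows than exist simply returns the whole table
print(nrow(head(toy_up, 500)))
```

Since the cap is applied per species, species with many DEGs are truncated while species with few contribute all of theirs, which is worth keeping in mind when comparing columns.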

# Create the same heatmap with the black-centered palette
pheatmap(
  heatmap_matrix,
  color = custom_color_palette1,
  breaks = color_breaks,
  cluster_rows = TRUE,  
  cluster_cols = FALSE,  
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Orthologs Expression in Head Tissue - STRATEGY 2"
)


Thorax tissues

# Define the species for plastic_species
plastic_species <- c("gregaria", "piceifrons", "cancellata", "americana")
ortho_dir <- "data/orthofinder/Polyneoptera/Results_I2_iqtree"  # relative to the project root (knit directory)
input_file <- file.path(ortho_dir, "Orthogroups_genesproteinbiotype_13species_annotated_May2025.csv")
filtered_final_orthotable <- read.csv(input_file, header = TRUE, stringsAsFactors = FALSE)


# Function to load DEGs for a given group of species
load_deg_data <- function(plastic_species, allspecies_df, filtered_final_orthotable) {
    degs_up <- list()
    degs_down <- list()
    degs_all <- list()
    
    # Rename the "gene_id" column in filtered_final_orthotable for consistency
    colnames(filtered_final_orthotable)[colnames(filtered_final_orthotable) == "gene_id"] <- "GeneID"
    
    for (species in plastic_species) {
        thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq/", paste0(species, "/Thorax/DESeq2_sigresults_sva_Thorax_", species,".csv"))
        
        # Check if the file exists
        if (!file.exists(thorax_file)) {
            message(paste("File not found for species:", species))
            next  # Skip this iteration if the file is missing
        }
        
        # Read the data
        thorax_data <- read.csv(thorax_file, stringsAsFactors = FALSE)
        
        # Rename the "X" column to "GeneID"
        #colnames(thorax_data)[colnames(thorax_data) == "X"] <- "GeneID"
        
        # Merge DEG data with GeneType and Orthogroup information
        thorax_data_merged <- merge(thorax_data, allspecies_df[, c("GeneID", "GeneType", "Species")], by = "GeneID")
        thorax_data_merged <- merge(thorax_data_merged, filtered_final_orthotable[, c("GeneID", "Orthogroup")], by = "GeneID")
        
        # Handle missing Orthogroups
        thorax_data_merged$Orthogroup[is.na(thorax_data_merged$Orthogroup)] <- "Unknown"
        
        # Filter for significant DEGs (both upregulated and downregulated)
        thorax_up <- thorax_data_merged %>%
            filter(padj < 0.05 & log2FoldChange >= 1) %>%
            select(Orthogroup) %>%
            distinct()
        
        thorax_down <- thorax_data_merged %>%
            filter(padj < 0.05 & log2FoldChange <= -1) %>%
            select(Orthogroup) %>%
            distinct()
        
        all_deg <- thorax_data_merged %>%
            filter(padj < 0.05 & abs(log2FoldChange) >= 1) %>%
            select(Orthogroup) %>%
            distinct()
        
        # Store the DEGs in the list
        degs_up[[species]] <- thorax_up$Orthogroup
        degs_down[[species]] <- thorax_down$Orthogroup
        degs_all[[species]] <- all_deg$Orthogroup
    }
    
    return(list(up = degs_up, down = degs_down, all = degs_all))
}

# Function to display a Venn diagram and the corresponding datatable of overlapping Orthogroups
display_venn_with_datatable <- function(venn_data, title, allspecies_df, filtered_final_orthotable) {
    
    # Calculate overlapping Orthogroups
    overlap_orthogroups <- Reduce(intersect, venn_data)
    
    # Print overlap info
    cat("Overlapping Orthogroups: \n")
    print(overlap_orthogroups)
    
    # If no overlaps exist, display a message and an empty plot
    if (length(overlap_orthogroups) == 0) {
        message("⚠️ No overlapping Orthogroups found. Displaying an empty Venn diagram.")
        
        # Create an empty Venn diagram placeholder
        plot.new()
        text(0.5, 0.5, "No overlapping Orthogroups found", cex = 1.5, col = "red")
        
        return(NULL)  # Exit the function gracefully
    }
    
    # Create a data frame for the overlapping Orthogroups
    overlap_df <- data.frame(Orthogroup = overlap_orthogroups)
    
    # Merge to get species and other information from filtered_final_orthotable
    meta_brock_df <- merge(overlap_df, filtered_final_orthotable, by = "Orthogroup", all.x = TRUE)
    
    # Ensure merged data exists
    if (nrow(meta_brock_df) == 0) {
        message("⚠️ Merge failed: No matching rows after merging Orthogroups.")
        return(NULL)
    }
      
    # Generate the Venn diagram
    venn_plot <- venn.diagram(
        x = venn_data, 
        category.names = names(venn_data),  # label the sets in the order they are supplied
        filename = NULL, 
        output = TRUE,
        fill = c("red", "green", "yellow", "orange"),
        alpha = 0.5,
        cex = 2,
        cat.cex = 0,
        main = title,
        main.cex = 1.2
    )
    
    # Clear the current plotting area before drawing the Venn diagram
    grid.newpage()
    
    # Display the Venn diagram
    grid.draw(venn_plot)
    
    # Manually create a custom legend
    legend_labels <- names(venn_data)  # same order as the fill colors, so each swatch matches its set
    legend_colors <- c("red", "green", "yellow", "orange")
    
    # Positioning the legend lower on the right side of the plot
    legend_x <- unit(0.85, "npc")  # Adjust x position
    legend_y <- unit(0.2, "npc")   # Lower the legend vertically
    
    # Draw the legend
    for (i in 1:length(legend_labels)) {
        grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  width = unit(0.02, "npc"), height = unit(0.02, "npc"), 
                  gp = gpar(fill = legend_colors[i], col = NA))
        grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"), 
                  y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  just = "left", gp = gpar(cex = 0.8))
    }
    
    # Display the merged overlapping Orthogroups table with datatable
    datatable(meta_brock_df, options = list(
        pageLength = 10,
        scrollX = TRUE,
        autoWidth = TRUE,
        searchHighlight = TRUE
    ),
    rownames = FALSE,
    escape = FALSE
    ) %>%
        formatStyle(
            'Species', target = 'cell',
            fontStyle = 'italic'
        ) %>%
        formatStyle(
            columns = names(meta_brock_df), 
            target = 'row',
            color = styleEqual(c("red", "blue", "black"), c("red", "blue", "black")),
            fontWeight = styleEqual(c("bold", "normal"), c("bold", "normal")),
            backgroundColor = styleEqual(c("red", "blue", "black"), c("white", "white", "white"))
        )
}

# Load the thorax DEG orthogroup sets (up, down, all) for the plastic species
venn_data_plastic_species <- load_deg_data(plastic_species, allspecies_df, filtered_final_orthotable)

# Prepare the data for the Venn diagrams for plastic_species
venn_data_up <- list(
  gregaria = venn_data_plastic_species$up[["gregaria"]],
  piceifrons = venn_data_plastic_species$up[["piceifrons"]],
  cancellata = venn_data_plastic_species$up[["cancellata"]],
  americana = venn_data_plastic_species$up[["americana"]]
)

venn_data_down <- list(
  gregaria = venn_data_plastic_species$down[["gregaria"]],
  piceifrons = venn_data_plastic_species$down[["piceifrons"]],
  cancellata = venn_data_plastic_species$down[["cancellata"]],
  americana = venn_data_plastic_species$down[["americana"]]
)

venn_data_all <- list(
  gregaria = venn_data_plastic_species$all[["gregaria"]],
  piceifrons = venn_data_plastic_species$all[["piceifrons"]],
  cancellata = venn_data_plastic_species$all[["cancellata"]],
  americana = venn_data_plastic_species$all[["americana"]]
)

# Display the Venn diagram and datatable for thorax upregulated DEGs (plastic_species)
display_venn_with_datatable(venn_data_up, "Venn Diagram of Thorax Upregulated DEGs - Plastic Species", allspecies_df, filtered_final_orthotable)
Overlapping Orthogroups: 
 [1] "OG0014033" "OG0003126" "OG0012103" "OG0000346" "OG0000014" "OG0000015"
 [7] "OG0010429" "OG0005229" "OG0009700" "OG0011824" "OG0005741" "OG0003890"
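
The orthogroups upregulated in all four plastic species can be compared between tissues directly from the two printed lists. The vectors below are copied from the head and thorax outputs above; only the variable names are introduced here.

```r
# Orthogroups upregulated in all four plastic species, copied from the printed output
head_up_shared <- c("OG0000371", "OG0003126", "OG0004306", "OG0009808", "OG0000015", "OG0000212")
thorax_up_shared <- c("OG0014033", "OG0003126", "OG0012103", "OG0000346", "OG0000014", "OG0000015",
                      "OG0010429", "OG0005229", "OG0009700", "OG0011824", "OG0005741", "OG0003890")

# Shared between tissues
both_tissues <- intersect(head_up_shared, thorax_up_shared)
print(both_tissues)  # "OG0003126" "OG0000015"

# Tissue-specific
print(setdiff(head_up_shared, thorax_up_shared))
print(setdiff(thorax_up_shared, head_up_shared))
```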

# Display the Venn diagram and datatable for thorax downregulated DEGs (plastic_species)
display_venn_with_datatable(venn_data_down, "Venn Diagram of Thorax Downregulated DEGs - Plastic Species", allspecies_df, filtered_final_orthotable)
Overlapping Orthogroups: 
 [1] "OG0012058" "OG0004309" "OG0000014" "OG0010128" "OG0000922" "OG0012312"
 [7] "OG0002737" "OG0002903" "OG0000043" "OG0002738"

# Display the Venn diagram and datatable for all significant DEGs (plastic_species)
display_venn_with_datatable(venn_data_all, "Venn Diagram of All Thorax DEGs - Plastic Species", allspecies_df, filtered_final_orthotable)
Overlapping Orthogroups: 
 [1] "OG0000015" "OG0014033" "OG0012058" "OG0000357" "OG0003126" "OG0000001"
 [7] "OG0012103" "OG0004309" "OG0000346" "OG0000014" "OG0010128" "OG0000922"
[13] "OG0000095" "OG0010429" "OG0012312" "OG0000012" "OG0005229" "OG0000027"
[19] "OG0009757" "OG0002737" "OG0002903" "OG0000000" "OG0009700" "OG0011824"
[25] "OG0000043" "OG0000347" "OG0005741" "OG0003890" "OG0000334" "OG0002738"
[31] "OG0000065" "OG0000093"

# Define the species for PACclade
plastic_species <- c("gregaria", "piceifrons", "cancellata", "americana")
ortho_dir <- "data/orthofinder/Polyneoptera/Results_I2_iqtree"  # relative to the project root (the knit directory)
input_file <- file.path(ortho_dir, "Orthogroups_genesproteinbiotype_13species_annotated_May2025.csv")
filtered_final_orthotable <- read.csv(input_file, header = TRUE, stringsAsFactors = FALSE)
# Initialize an empty list to store heatmap data for each species
heatmap_list <- list()

# Loop through each species to process their data
for (species in plastic_species) {
  # Build the path to the thorax DESeq2 results
  thorax_file <- file.path(workDir, "DEG_results", "Bulk_RNAseq", species, "Thorax", paste0("DESeq2_sigresults_sva_Thorax_", species, ".csv"))
  
  # Load the DESeq2 results
  thorax_data <- read.csv(thorax_file, stringsAsFactors = FALSE)
  
  # Check if data is empty and handle accordingly
  if (nrow(thorax_data) == 0) {
    message(paste("No data for species:", species))
    next  # Skip to the next species if there's no data
  }
  
  # Rename the "gene_id" column in filtered_final_orthotable for consistency
  #colnames(filtered_final_orthotable)[colnames(filtered_final_orthotable) == "gene_id"] <- "GeneID"
  
  # Merge with filtered_final_orthotable to include Orthogroup
  merged_data <- merge(thorax_data, filtered_final_orthotable, by = "GeneID", all.x = TRUE)
  
  # Check if merge was successful
  if (nrow(merged_data) == 0) {
    message(paste("No matching data for species:", species))
    next  # Skip if no matching data after merging
  }

  # Filter for significant DEGs and keep up to 500 of the most strongly up- and downregulated thorax genes
  thorax_up <- merged_data %>%
    filter(padj < 0.05 & log2FoldChange > 1) %>%
    arrange(desc(log2FoldChange)) %>%
    slice(1:500)
  
  thorax_down <- merged_data %>%
    filter(padj < 0.05 & log2FoldChange < -1) %>%
    arrange(log2FoldChange) %>%
    slice(1:500)
  
  # Combine data and prepare for heatmap, adding the species column
  heatmap_data <- bind_rows(
    thorax_up %>% mutate(Tissue = "Thorax", Regulation = "Upregulated", Species = species),
    thorax_down %>% mutate(Tissue = "Thorax", Regulation = "Downregulated", Species = species)
  ) %>%
    select(Orthogroup, log2FoldChange, Tissue, Regulation, Species)
  
  # Append the heatmap data to the list
  heatmap_list[[species]] <- heatmap_data
}

# Combine all species data into a single dataframe for heatmap matrix preparation
final_heatmap_data <- bind_rows(heatmap_list)

# Check if final_heatmap_data is empty before proceeding
if (nrow(final_heatmap_data) == 0) {
    stop("No valid data available for heatmap generation.")
}

# Filter out rows with missing Orthogroup values
final_heatmap_data <- final_heatmap_data %>%
    filter(!is.na(Orthogroup))

# Drop rows with missing log2FoldChange values (optional, just in case)
final_heatmap_data <- final_heatmap_data %>%
    filter(!is.na(log2FoldChange))

# Create heatmap matrix using Orthogroup instead of GeneID
heatmap_matrix <- final_heatmap_data %>%
    group_by(Orthogroup, Species) %>%
    summarize(
        Thorax_Combined = sum(log2FoldChange[Tissue == "Thorax"], na.rm = TRUE),
        .groups = 'drop'
    ) %>%
    pivot_wider(names_from = Species, 
                values_from = Thorax_Combined, 
                values_fill = list(Thorax_Combined = 0)) %>%
    column_to_rownames("Orthogroup") %>%
    as.matrix()

# Check if heatmap_matrix is empty
if (nrow(heatmap_matrix) == 0) {
    stop("No valid data available for heatmap matrix.")
}

# Define color palettes
# Define a custom color gradient where 0 is black
custom_color_palette1 <- colorRampPalette(c("cyan", "cyan3", "black", "orange3", "orange"))(100)

# Define a custom color gradient where 0 is white
custom_color_palette2 <- colorRampPalette(c("blue3", "blue", "white", "red", "red3"))(100)

# Define symmetric color breaks so that the palette midpoint sits at 0
max_abs_lfc <- max(abs(heatmap_matrix), na.rm = TRUE)  # Get max absolute log2FoldChange
color_breaks <- seq(-max_abs_lfc, max_abs_lfc, length.out = 101)  # Symmetric scale; pheatmap expects one more break than colors (100)

# Create heatmap with clustering
pheatmap(
  heatmap_matrix,
  color = custom_color_palette2,
  breaks = color_breaks,
  cluster_rows = TRUE,  # Cluster genes
  cluster_cols = FALSE,  # Do not cluster species
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Orthologs Expression in Thorax Tissue - STRATEGY 2"
)

Version Author Date
4e391c3 Maeva TECHER 2025-05-30
b540a1e Maeva TECHER 2025-02-27
d7fa779 Maeva TECHER 2025-02-14
34c299a Maeva TECHER 2025-02-06
aab712a Maeva TECHER 2025-02-04
faf2db3 Maeva TECHER 2025-01-13
fe6dae9 Maeva TECHER 2024-11-19
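The orthogroup-by-species matrix built in the chunk above can be sketched on toy data in base R. This is a minimal illustration under stated assumptions, not the analysis code: the orthogroup IDs and fold changes are invented, and `xtabs()` stands in for the `group_by()`/`summarize()`/`pivot_wider()` chain because it also sums values per cell and returns 0 for combinations that never occur.

```r
# Toy long-format table (invented values, for illustration only)
toy_degs <- data.frame(
  Orthogroup = c("OG_A", "OG_A", "OG_B", "OG_C"),
  Species = c("gregaria", "gregaria", "piceifrons", "gregaria"),
  log2FoldChange = c(1.5, 2.0, -3.0, -1.2),
  stringsAsFactors = FALSE
)

# xtabs() sums log2FoldChange within each Orthogroup x Species cell
# and gives 0 for combinations that are absent from the data
toy_tab <- xtabs(log2FoldChange ~ Orthogroup + Species, data = toy_degs)

# Convert the contingency table to a plain numeric matrix
toy_matrix <- matrix(
  as.numeric(toy_tab),
  nrow = nrow(toy_tab),
  dimnames = list(rownames(toy_tab), colnames(toy_tab))
)
print(toy_matrix)
```

Because the two `OG_A` genes from the same species are added together, orthogroups with several DEG members get larger magnitudes than single-copy ones. Averaging per cell is an alternative when that is not intended.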
# Same heatmap (rows clustered, columns not) drawn with the cyan-black-orange palette
pheatmap(
  heatmap_matrix,
  color = custom_color_palette1,
  breaks = color_breaks,
  cluster_rows = TRUE,  
  cluster_cols = FALSE,  
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Orthologs Expression in Thorax Tissue - STRATEGY 2"
)

Version Author Date
4e391c3 Maeva TECHER 2025-05-30
b540a1e Maeva TECHER 2025-02-27
d7fa779 Maeva TECHER 2025-02-14
34c299a Maeva TECHER 2025-02-06
aab712a Maeva TECHER 2025-02-04
faf2db3 Maeva TECHER 2025-01-13
fe6dae9 Maeva TECHER 2024-11-19
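The symmetric color scale used for these heatmaps can be checked without drawing anything. The sketch below is an assumption-laden illustration in base R: the fold-change values are invented, and `cut()` only mimics how a break-based color scale assigns values to bins. It relies on the pheatmap documentation's description of `breaks` as one element longer than the color vector.

```r
# Diverging palette with 100 colors, as in the chunks above
n_colors <- 100
palette_demo <- colorRampPalette(c("blue3", "blue", "white", "red", "red3"))(n_colors)

# Invented fold-change values, for illustration only
lfc_demo <- c(-4.2, -0.5, 0.3, 2.8)

# Symmetric breaks: one more break than colors, so every color gets a bin
max_abs <- max(abs(lfc_demo))
breaks_demo <- seq(-max_abs, max_abs, length.out = n_colors + 1)

# Assign each value to a bin the way a break-based color scale would
bin_index <- cut(lfc_demo, breaks = breaks_demo, include.lowest = TRUE, labels = FALSE)
print(data.frame(value = lfc_demo, bin = bin_index, color = palette_demo[bin_index]))
```

With an even number of colors and symmetric breaks, 0 falls on the boundary between bins 50 and 51, so negative values land in the lower half of the palette and positive values in the upper half.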

Five species

Combined tissues

# Load orthogroup mapping
ortho_dir <- "data/orthofinder/Polyneoptera/Results_I2_iqtree"  # relative to the project root (the knit directory)
input_file <- file.path(ortho_dir, "Orthogroups_genesproteinbiotype_13species_annotated_May2025.csv")
filtered_final_orthotable <- read.csv(input_file, header = TRUE, stringsAsFactors = FALSE)

# Ensure column names are correctly set
if ("gene_id" %in% colnames(filtered_final_orthotable)) {
  colnames(filtered_final_orthotable)[colnames(filtered_final_orthotable) == "gene_id"] <- "GeneID"
}

# Select only relevant columns and ensure uniqueness
filtered_final_orthotable <- filtered_final_orthotable %>%
  select(GeneID, Orthogroup) %>%
  distinct(GeneID, .keep_all = TRUE)  # Ensure one entry per GeneID

# Define species list
allspecies <- c("gregaria", "piceifrons", "cancellata", "americana", "cubense")

# Function to load DEGs for a given set of species and a specific tissue
load_deg_data <- function(species_list, tissue) {
  degs_up <- list()
  degs_down <- list()
  degs_all <- list()
  
  for (species in species_list) {
    # Define the correct file path based on tissue
    deg_file <- file.path(workDir, "DEG_results", "Bulk_RNAseq", species, tissue, paste0("DESeq2_sigresults_sva_", tissue, "_", species, ".csv"))
    # Read DESeq2 results
    deg_data <- read.csv(deg_file, stringsAsFactors = FALSE)
    
    # Ensure 'GeneID' column exists (some DESeq2 outputs use 'X')
    #if (!"GeneID" %in% colnames(deg_data)) {
    #  if ("X" %in% colnames(deg_data)) {
    #    colnames(deg_data)[colnames(deg_data) == "X"] <- "GeneID"
    #  } else {
    #    message(paste("No GeneID column found for", species, "in", tissue, "- Skipping"))
    #    next
    #  }
    #}
    
    # Convert to character for safe merging
    deg_data$GeneID <- as.character(deg_data$GeneID)
    filtered_final_orthotable$GeneID <- as.character(filtered_final_orthotable$GeneID)

    # Merge with orthogroup information
    deg_data <- left_join(deg_data, filtered_final_orthotable, by = "GeneID") %>%
      mutate(Orthogroup = ifelse(is.na(Orthogroup), "Unassigned", Orthogroup))  # Handle missing orthogroups
    
    # Check if data is empty
    if (nrow(deg_data) == 0) {
      message(paste("No data for species:", species, "in tissue:", tissue))
      next
    }
    
    # Filter for significant DEGs (padj < 0.05 and |log2FoldChange| >= 1), split by direction
    upregulated <- deg_data %>%
      filter(padj < 0.05 & log2FoldChange >= 1) %>%
      select(Orthogroup) %>%
      distinct()
    
    downregulated <- deg_data %>%
      filter(padj < 0.05 & log2FoldChange <= -1) %>%
      select(Orthogroup) %>%
      distinct()
    
    all_degs <- deg_data %>%
      filter(padj < 0.05 & abs(log2FoldChange) >= 1) %>%
      select(Orthogroup) %>%
      distinct()
    
    # Store the DEGs in the lists
    degs_up[[species]] <- upregulated$Orthogroup
    degs_down[[species]] <- downregulated$Orthogroup
    degs_all[[species]] <- all_degs$Orthogroup
  }
  
  return(list(up = degs_up, down = degs_down, all = degs_all))
}

# Load DEG data for Head
venn_data_allspecies_head <- load_deg_data(allspecies, "Head")

# Load DEG data for Thorax
venn_data_allspecies_thorax <- load_deg_data(allspecies, "Thorax")

# Function to generate Venn diagrams with Orthogroups
display_venn_with_datatable <- function(venn_data, title) {
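  # Find the orthogroups shared by every species in venn_data
  # (the "Unassigned" placeholder counts as shared if every species has one)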
  overlap_orthogroups <- Reduce(intersect, venn_data)
  
  # Create a dataframe for overlapping orthogroups
  overlap_df <- data.frame(Orthogroup = overlap_orthogroups)
  
  # Generate the Venn diagram
  venn_plot <- venn.diagram(
    x = venn_data, 
    category.names = allspecies,
    filename = NULL, 
    output = TRUE, 
    fill = c("orange", "red", "orchid", "green", "yellow"),
    alpha = 0.5, 
    cex = 2, 
    cat.cex = 0, 
    main = title,
    main.cex = 1.2
  )

  # Clear plotting area and display Venn diagram
  grid.newpage()
  grid.draw(venn_plot)

  # Manually create a custom legend
  legend_labels <- allspecies
  legend_colors <- c("orange", "red", "orchid", "green", "yellow")

  # Position legend
  legend_x <- unit(0.85, "npc")  
  legend_y <- unit(0.2, "npc")

  for (i in seq_along(legend_labels)) {
    grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"), 
              width = unit(0.02, "npc"), height = unit(0.02, "npc"), 
              gp = gpar(fill = legend_colors[i], col = NA))
    grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"), 
              y = legend_y - unit((i - 1) * 0.05, "npc"), 
              just = "left", gp = gpar(cex = 0.8))
  }

  # Display the overlapping Orthogroups as a datatable
  datatable(overlap_df, options = list(
      pageLength = 10,
      scrollX = TRUE,
      autoWidth = TRUE,
      searchHighlight = TRUE
  ), 
  rownames = FALSE)
}

# Display Venn diagrams and tables for HEAD
display_venn_with_datatable(venn_data_allspecies_head$up, "Venn Diagram of Upregulated Orthogroups - Head")

Version Author Date
4e391c3 Maeva TECHER 2025-05-30
b540a1e Maeva TECHER 2025-02-27
display_venn_with_datatable(venn_data_allspecies_head$down, "Venn Diagram of Downregulated Orthogroups - Head")

Version Author Date
4e391c3 Maeva TECHER 2025-05-30
b540a1e Maeva TECHER 2025-02-27
display_venn_with_datatable(venn_data_allspecies_head$all, "Venn Diagram of All Significant Orthogroups - Head")

Version Author Date
4e391c3 Maeva TECHER 2025-05-30
b540a1e Maeva TECHER 2025-02-27
# Display Venn diagrams and tables for THORAX
display_venn_with_datatable(venn_data_allspecies_thorax$up, "Venn Diagram of Upregulated Orthogroups - Thorax")

Version Author Date
4e391c3 Maeva TECHER 2025-05-30
b540a1e Maeva TECHER 2025-02-27
display_venn_with_datatable(venn_data_allspecies_thorax$down, "Venn Diagram of Downregulated Orthogroups - Thorax")

Version Author Date
4e391c3 Maeva TECHER 2025-05-30
b540a1e Maeva TECHER 2025-02-27
display_venn_with_datatable(venn_data_allspecies_thorax$all, "Venn Diagram of All Significant Orthogroups - Thorax")

Version Author Date
4e391c3 Maeva TECHER 2025-05-30
b540a1e Maeva TECHER 2025-02-27
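The overlap tables above come from folding `intersect()` over the per-species lists. A minimal sketch with invented orthogroup sets shows how that fold behaves, and why the `"Unassigned"` placeholder introduced by `load_deg_data()` can show up as a "shared orthogroup". The `setdiff()` step is a hypothetical extra, not part of the analysis as written.

```r
# Invented orthogroup sets, for illustration only
toy_sets <- list(
  gregaria = c("OG_A", "OG_B", "Unassigned"),
  piceifrons = c("OG_A", "OG_C", "Unassigned"),
  cancellata = c("OG_A", "OG_B", "Unassigned")
)

# Fold intersect() across the list: what is shared by every species
shared_all <- Reduce(intersect, toy_sets)

# Hypothetical extra step: drop the placeholder, which labels genes
# without an orthogroup rather than a real shared orthogroup
shared_real <- setdiff(shared_all, "Unassigned")

print(shared_all)
print(shared_real)
```

The same placeholder also inflates the central region of the Venn diagram by one, so removing it before plotting may be worth considering.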
# Load Orthogroup information
ortho_dir <- "data/orthofinder/Polyneoptera/Results_I2_iqtree"  # relative to the project root (the knit directory)
input_file <- file.path(ortho_dir, "Orthogroups_genesproteinbiotype_13species_annotated_May2025.csv")
filtered_final_orthotable <- read.csv(input_file, header = TRUE, stringsAsFactors = FALSE)

# Ensure correct column names
if ("gene_id" %in% colnames(filtered_final_orthotable)) {
  colnames(filtered_final_orthotable)[colnames(filtered_final_orthotable) == "gene_id"] <- "GeneID"
}

# Select relevant columns and ensure uniqueness
filtered_final_orthotable <- filtered_final_orthotable %>%
  select(GeneID, Orthogroup) %>%
  distinct(GeneID, .keep_all = TRUE)  # Keep unique mapping

# Define species order explicitly
species_order <- c("nitens", "cubense", "americana", "piceifrons", "cancellata", "gregaria")

# Initialize an empty list to store heatmap data
heatmap_list <- list()

# Loop through each species to process their data
for (species in species_order) {
  message(paste("Processing species:", species))

  # Define file paths
  head_file <- file.path(workDir, "DEG_results", "Bulk_RNAseq", species, "Head", paste0("DESeq2_sigresults_sva_Head_", species, ".csv"))
  thorax_file <- file.path(workDir, "DEG_results", "Bulk_RNAseq", species, "Thorax", paste0("DESeq2_sigresults_sva_Thorax_", species, ".csv"))

  # Typed empty table used when a file is missing or unreadable,
  # so the padj/log2FoldChange filters below still find their columns
  empty_deg <- data.frame(GeneID = character(), padj = numeric(), log2FoldChange = numeric(), stringsAsFactors = FALSE)

  # Check if files exist before loading
  if (!file.exists(head_file)) {
    message(paste("Missing Head file for:", species, "- Assigning empty dataset"))
    head_data <- empty_deg
  } else {
    head_data <- tryCatch(read.csv(head_file, stringsAsFactors = FALSE), error = function(e) empty_deg)
  }

  if (!file.exists(thorax_file)) {
    message(paste("Missing Thorax file for:", species, "- Assigning empty dataset"))
    thorax_data <- empty_deg
  } else {
    thorax_data <- tryCatch(read.csv(thorax_file, stringsAsFactors = FALSE), error = function(e) empty_deg)
  }

  # Ensure GeneID column exists
  #if (!"GeneID" %in% colnames(head_data) && "X" %in% colnames(head_data)) {
  #  colnames(head_data)[colnames(head_data) == "X"] <- "GeneID"
  #}
  #if (!"GeneID" %in% colnames(thorax_data) && "X" %in% colnames(thorax_data)) {
  #  colnames(thorax_data)[colnames(thorax_data) == "X"] <- "GeneID"
  #}

  # Convert GeneID to character
  head_data$GeneID <- as.character(head_data$GeneID)
  thorax_data$GeneID <- as.character(thorax_data$GeneID)
  filtered_final_orthotable$GeneID <- as.character(filtered_final_orthotable$GeneID)

  # Skip the species only when both tissues are empty
  if (nrow(head_data) == 0 && nrow(thorax_data) == 0) {
    message(paste("No data for species:", species, "- Skipping"))
    next
  }

  # If thorax data is missing, assign zero values
  if (nrow(thorax_data) == 0) {
    message(paste("No Thorax data for:", species, "- Assigning 0 values"))
    thorax_data <- data.frame(GeneID = head_data$GeneID, padj = 1, log2FoldChange = 0, stringsAsFactors = FALSE)
  }

  # Merge with orthogroup information
  head_data <- left_join(head_data, filtered_final_orthotable, by = "GeneID") %>%
    mutate(Orthogroup = ifelse(is.na(Orthogroup), "Unassigned", Orthogroup))

  thorax_data <- left_join(thorax_data, filtered_final_orthotable, by = "GeneID") %>%
    mutate(Orthogroup = ifelse(is.na(Orthogroup), "Unassigned", Orthogroup))

  # Filter for significant DEGs and select top 500 upregulated and downregulated genes per tissue
  head_up <- head_data %>%
    filter(padj < 0.05 & log2FoldChange > 1) %>%
    arrange(desc(log2FoldChange)) %>%
    slice(1:500)

  head_down <- head_data %>%
    filter(padj < 0.05 & log2FoldChange < -1) %>%
    arrange(log2FoldChange) %>%
    slice(1:500)

  thorax_up <- thorax_data %>%
    filter(padj < 0.05 & log2FoldChange > 1) %>%
    arrange(desc(log2FoldChange)) %>%
    slice(1:500)

  thorax_down <- thorax_data %>%
    filter(padj < 0.05 & log2FoldChange < -1) %>%
    arrange(log2FoldChange) %>%
    slice(1:500)

  # Combine data and prepare for heatmap
  heatmap_data <- bind_rows(
    head_up %>% mutate(Tissue = "Head", Regulation = "Upregulated", Species = species),
    head_down %>% mutate(Tissue = "Head", Regulation = "Downregulated", Species = species),
    thorax_up %>% mutate(Tissue = "Thorax", Regulation = "Upregulated", Species = species),
    thorax_down %>% mutate(Tissue = "Thorax", Regulation = "Downregulated", Species = species)
  ) %>%
    select(Orthogroup, log2FoldChange, Tissue, Regulation, Species)

  # Append to heatmap list
  heatmap_list[[species]] <- heatmap_data
}

# Combine all species data
final_heatmap_data <- bind_rows(heatmap_list)

# Ensure species order in the data
final_heatmap_data$Species <- factor(final_heatmap_data$Species, levels = species_order)

# Create heatmap matrix (Head and Thorax combined: mean log2FoldChange per orthogroup and species)
heatmap_matrix <- final_heatmap_data %>%
    group_by(Orthogroup, Species) %>%  # Remove Tissue to ensure unique Orthogroup rows
    summarize(log2FoldChange = mean(log2FoldChange, na.rm = TRUE), .groups = "drop") %>%
    pivot_wider(names_from = Species, values_from = log2FoldChange, values_fill = 0) %>%
    distinct(Orthogroup, .keep_all = TRUE) %>%  # Ensure unique Orthogroup rows
    column_to_rownames("Orthogroup") %>%
    as.matrix()

# Explicitly reorder the columns in heatmap_matrix
heatmap_matrix <- heatmap_matrix[, species_order, drop = FALSE]  # Ensure order is applied

# Define color palettes
custom_cyan_orange_palette <- colorRampPalette(c("cyan", "cyan2", "cyan3", "black", "orange3", "orange2", "orange"))(100)
custom_blue_red_palette <- colorRampPalette(c("blue3", "blue2", "blue1", "white", "red", "red2", "red3"))(100)

# Define symmetric color breaks centered on 0
max_abs_lfc <- max(abs(heatmap_matrix), na.rm = TRUE)
color_breaks <- seq(-max_abs_lfc, max_abs_lfc, length.out = 101)  # one more break than the 100 colors, as pheatmap expects

# Generate heatmaps
pheatmap(
  heatmap_matrix,
  color = custom_blue_red_palette,
  breaks = color_breaks,
  cluster_rows = TRUE,
  cluster_cols = FALSE,
  show_rownames = FALSE,
  show_colnames = TRUE,
  fontsize_row = 6,
  fontsize_col = 10,
  main = "Heatmap of Orthologs Expression in Head and Thorax Tissue - STRATEGY 2"
)

Version Author Date
4e391c3 Maeva TECHER 2025-05-30
b540a1e Maeva TECHER 2025-02-27
pheatmap(
  heatmap_matrix,
  color = custom_cyan_orange_palette,
  breaks = color_breaks,
  cluster_rows = TRUE,
  cluster_cols = FALSE,
  show_rownames = FALSE,
  show_colnames = TRUE,
  fontsize_row = 6,
  fontsize_col = 10,
  main = "Heatmap of Orthologs Expression in Head and Thorax Tissue - STRATEGY 2"
)

Version Author Date
4e391c3 Maeva TECHER 2025-05-30
b540a1e Maeva TECHER 2025-02-27
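The DESeq2 result paths are assembled in several chunks with the same pattern. One way to keep them consistent is a small helper, sketched here as an assumption: `deg_results_path()` is a hypothetical name, `work_dir` stands in for the document's `workDir`, and `"analysis"` in the usage line is a placeholder directory, not the real location.

```r
# Hypothetical helper: build the path to one DESeq2 results file.
# The directory layout mirrors the file.path() calls used in this report.
deg_results_path <- function(work_dir, species, tissue) {
  file.path(
    work_dir, "DEG_results", "Bulk_RNAseq", species, tissue,
    paste0("DESeq2_sigresults_sva_", tissue, "_", species, ".csv")
  )
}

# Placeholder directory, for illustration only
print(deg_results_path("analysis", "gregaria", "Head"))
```

Passing each path component as its own argument lets `file.path()` insert the separators, which avoids doubled slashes from components that already end in `/`.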

Head tissues

# Load orthogroup mapping
ortho_dir <- "data/orthofinder/Polyneoptera/Results_I2_iqtree"  # relative to the project root (the knit directory)
input_file <- file.path(ortho_dir, "Orthogroups_genesproteinbiotype_13species_annotated_May2025.csv")
filtered_final_orthotable <- read.csv(input_file, header = TRUE, stringsAsFactors = FALSE)

# Ensure column names are correctly set
if ("gene_id" %in% colnames(filtered_final_orthotable)) {
  colnames(filtered_final_orthotable)[colnames(filtered_final_orthotable) == "gene_id"] <- "GeneID"
}

# Select only relevant columns and ensure uniqueness
filtered_final_orthotable <- filtered_final_orthotable %>%
  select(GeneID, Orthogroup) %>%
  distinct(GeneID, .keep_all = TRUE)  # Ensure one entry per GeneID

# Define species list
allspecies <- c("gregaria", "piceifrons", "cancellata", "americana", "cubense")

# Function to load DEGs for a given set of species (Head only)
load_deg_data <- function(species_list) {
  degs_up <- list()
  degs_down <- list()
  degs_all <- list()
  
  for (species in species_list) {
    # Define the correct file path for Head
    deg_file <- file.path(workDir, "DEG_results", "Bulk_RNAseq", species, "Head", paste0("DESeq2_sigresults_sva_Head_", species, ".csv"))
    
    # Read DESeq2 results
    deg_data <- read.csv(deg_file, stringsAsFactors = FALSE)
    
    # Ensure 'GeneID' column exists (some DESeq2 outputs use 'X')
    #if (!"GeneID" %in% colnames(deg_data)) {
    #  if ("X" %in% colnames(deg_data)) {
    #    colnames(deg_data)[colnames(deg_data) == "X"] <- "GeneID"
    #  } else {
    #    message(paste("No GeneID column found for", species, "in Head - Skipping"))
    #    next
    #  }
    #}
    
    # Convert to character for safe merging
    deg_data$GeneID <- as.character(deg_data$GeneID)
    filtered_final_orthotable$GeneID <- as.character(filtered_final_orthotable$GeneID)

    # Merge with orthogroup information
    deg_data <- left_join(deg_data, filtered_final_orthotable, by = "GeneID") %>%
      mutate(Orthogroup = ifelse(is.na(Orthogroup), "Unassigned", Orthogroup))  # Handle missing orthogroups
    
    # Check if data is empty
    if (nrow(deg_data) == 0) {
      message(paste("No data for species:", species, "in Head"))
      next
    }
    
    # Filter for significant DEGs (padj < 0.05 and |log2FoldChange| >= 1), split by direction
    upregulated <- deg_data %>%
      filter(padj < 0.05 & log2FoldChange >= 1) %>%
      select(Orthogroup) %>%
      distinct()
    
    downregulated <- deg_data %>%
      filter(padj < 0.05 & log2FoldChange <= -1) %>%
      select(Orthogroup) %>%
      distinct()
    
    all_degs <- deg_data %>%
      filter(padj < 0.05 & abs(log2FoldChange) >= 1) %>%
      select(Orthogroup) %>%
      distinct()
    
    # Store the DEGs in the lists
    degs_up[[species]] <- upregulated$Orthogroup
    degs_down[[species]] <- downregulated$Orthogroup
    degs_all[[species]] <- all_degs$Orthogroup
  }
  
  return(list(up = degs_up, down = degs_down, all = degs_all))
}

# Load DEG data for Head only
venn_data_allspecies_head <- load_deg_data(allspecies)

# Function to generate Venn diagrams with Orthogroups (ONLY HEAD)
display_venn_with_datatable <- function(venn_data, title) {
  # Find the orthogroups shared by every species in venn_data
  overlap_orthogroups <- Reduce(intersect, venn_data)
  
  # Create a dataframe for overlapping orthogroups
  overlap_df <- data.frame(Orthogroup = overlap_orthogroups)
  
  # Generate the Venn diagram
  venn_plot <- venn.diagram(
    x = venn_data, 
    category.names = allspecies,
    filename = NULL, 
    output = TRUE, 
    fill = c("orange", "red", "orchid", "green", "yellow"),
    alpha = 0.5, 
    cex = 2, 
    cat.cex = 0, 
    main = title,
    main.cex = 1.2
  )

  # Clear plotting area and display Venn diagram
  grid.newpage()
  grid.draw(venn_plot)

  # Display overlapping Orthogroups as a datatable
  datatable(overlap_df, options = list(
      pageLength = 10,
      scrollX = TRUE,
      autoWidth = TRUE,
      searchHighlight = TRUE
  ), rownames = FALSE)
}

# Display Venn diagrams and tables for HEAD only
display_venn_with_datatable(venn_data_allspecies_head$up, "Venn Diagram of Upregulated Orthogroups - Head")

Version Author Date
4e391c3 Maeva TECHER 2025-05-30
b540a1e Maeva TECHER 2025-02-27
display_venn_with_datatable(venn_data_allspecies_head$down, "Venn Diagram of Downregulated Orthogroups - Head")

Version Author Date
4e391c3 Maeva TECHER 2025-05-30
b540a1e Maeva TECHER 2025-02-27
display_venn_with_datatable(venn_data_allspecies_head$all, "Venn Diagram of All Significant Orthogroups - Head")

Version Author Date
4e391c3 Maeva TECHER 2025-05-30
b540a1e Maeva TECHER 2025-02-27
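The up/down/all split used for the Venn diagrams can be written as a single classification step. The sketch below is an illustration with invented values; `classify_deg()` is a hypothetical helper, and its defaults copy the thresholds in `load_deg_data()` (padj < 0.05, |log2FoldChange| >= 1). Note that the heatmap chunks use strict inequalities (`> 1`, `< -1`), so a gene at exactly ±1 is counted here but not there.

```r
# Hypothetical helper: label each gene by direction of change.
# Rows with missing padj or log2FoldChange are treated as not significant.
classify_deg <- function(padj, lfc, alpha = 0.05, lfc_cutoff = 1) {
  significant <- !is.na(padj) & !is.na(lfc) & padj < alpha
  ifelse(significant & lfc >= lfc_cutoff, "up",
         ifelse(significant & lfc <= -lfc_cutoff, "down", "none"))
}

# Invented values, for illustration only
toy <- data.frame(
  padj = c(0.001, 0.04, 0.2, NA, 0.01),
  log2FoldChange = c(2.3, -1.0, 3.1, 1.5, 0.4)
)
toy$direction <- classify_deg(toy$padj, toy$log2FoldChange)
print(toy)
```

Keeping the cutoffs as function arguments makes it easier to apply one definition of "significant" to both the Venn diagrams and the heatmaps.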
# Load Orthogroup information
ortho_dir <- "data/orthofinder/Polyneoptera/Results_I2_iqtree"  # relative to the project root (the knit directory)
input_file <- file.path(ortho_dir, "Orthogroups_genesproteinbiotype_13species_annotated_May2025.csv")
filtered_final_orthotable <- read.csv(input_file, header = TRUE, stringsAsFactors = FALSE)

# Ensure correct column names
if ("gene_id" %in% colnames(filtered_final_orthotable)) {
  colnames(filtered_final_orthotable)[colnames(filtered_final_orthotable) == "gene_id"] <- "GeneID"
}

# Select relevant columns and ensure uniqueness
filtered_final_orthotable <- filtered_final_orthotable %>%
  select(GeneID, Orthogroup) %>%
  distinct(GeneID, .keep_all = TRUE)  # Keep unique mapping

# Define species order explicitly
species_order <- c("nitens", "cubense", "americana", "piceifrons", "cancellata", "gregaria")

# Initialize an empty list to store heatmap data
heatmap_list <- list()

# Loop through each species to process their Head data
for (species in species_order) {
  message(paste("Processing species:", species))

  # Define file path for Head
  head_file <- file.path(workDir, "DEG_results", "Bulk_RNAseq", species, "Head", paste0("DESeq2_sigresults_sva_Head_", species, ".csv"))

  # Typed empty table used when the file is missing or unreadable,
  # so the padj/log2FoldChange filters below still find their columns
  empty_deg <- data.frame(GeneID = character(), padj = numeric(), log2FoldChange = numeric(), stringsAsFactors = FALSE)

  # Check if file exists before loading
  if (!file.exists(head_file)) {
    message(paste("Missing Head file for:", species, "- Assigning empty dataset"))
    head_data <- empty_deg
  } else {
    head_data <- tryCatch(read.csv(head_file, stringsAsFactors = FALSE), error = function(e) empty_deg)
  }

  # Ensure GeneID column exists
  #if (!"GeneID" %in% colnames(head_data) && "X" %in% colnames(head_data)) {
  #  colnames(head_data)[colnames(head_data) == "X"] <- "GeneID"
  #}

  # Convert GeneID to character
  head_data$GeneID <- as.character(head_data$GeneID)
  filtered_final_orthotable$GeneID <- as.character(filtered_final_orthotable$GeneID)

  # Merge with orthogroup information
  head_data <- left_join(head_data, filtered_final_orthotable, by = "GeneID") %>%
    mutate(Orthogroup = ifelse(is.na(Orthogroup), "Unassigned", Orthogroup))

  # Filter for significant DEGs and select top 500 upregulated and downregulated genes
  head_up <- head_data %>%
    filter(padj < 0.05 & log2FoldChange > 1) %>%
    arrange(desc(log2FoldChange)) %>%
    slice(1:500)

  head_down <- head_data %>%
    filter(padj < 0.05 & log2FoldChange < -1) %>%
    arrange(log2FoldChange) %>%
    slice(1:500)

  # Combine data and prepare for heatmap
  heatmap_data <- bind_rows(
    head_up %>% mutate(Tissue = "Head", Regulation = "Upregulated", Species = species),
    head_down %>% mutate(Tissue = "Head", Regulation = "Downregulated", Species = species)
  ) %>%
    select(Orthogroup, log2FoldChange, Tissue, Regulation, Species)

  # Append to heatmap list
  heatmap_list[[species]] <- heatmap_data
}

# Combine all species data
final_heatmap_data <- bind_rows(heatmap_list)

# Ensure all species are represented, even if they have no significant DEGs
for (species in species_order) {
    if (!species %in% unique(final_heatmap_data$Species)) {
        message(paste("Adding placeholder for missing species:", species))
        final_heatmap_data <- bind_rows(
            final_heatmap_data,
            data.frame(
                Orthogroup = "Unassigned",  # Placeholder Orthogroup
                log2FoldChange = 0,
                Tissue = "Head",
                Regulation = "None",
                Species = species
            )
        )
    }
}

# Ensure species order in the data
final_heatmap_data$Species <- factor(final_heatmap_data$Species, levels = species_order)

# Create heatmap matrix (Head only: mean log2FoldChange per orthogroup and species)
heatmap_matrix <- final_heatmap_data %>%
  group_by(Orthogroup, Species) %>% 
  summarize(log2FoldChange = mean(log2FoldChange, na.rm = TRUE), .groups = "drop") %>%
  pivot_wider(names_from = Species, values_from = log2FoldChange, values_fill = 0) %>%
  column_to_rownames("Orthogroup") %>%
  as.matrix()

# Explicitly reorder the columns in heatmap_matrix
heatmap_matrix <- heatmap_matrix[, species_order, drop = FALSE]  # Ensure order is applied

# Define color palettes
custom_cyan_orange_palette <- colorRampPalette(c("cyan", "cyan2", "cyan3", "black", "orange3", "orange2", "orange"))(100)
custom_blue_red_palette <- colorRampPalette(c("blue3", "blue2", "blue1", "white", "red", "red2", "red3"))(100)

# Define symmetric color breaks centered on 0
max_abs_lfc <- max(abs(heatmap_matrix), na.rm = TRUE)
color_breaks <- seq(-max_abs_lfc, max_abs_lfc, length.out = 101)  # one more break than the 100 colors, as pheatmap expects

# Generate heatmaps (Only Head)
pheatmap(
  heatmap_matrix,
  color = custom_blue_red_palette,
  breaks = color_breaks,
  cluster_rows = TRUE,
  cluster_cols = FALSE,
  show_rownames = FALSE,
  show_colnames = TRUE,
  fontsize_row = 6,
  fontsize_col = 10,
  main = "Heatmap of Orthologs Expression in Head Tissue - STRATEGY 2"
)

Version Author Date
4e391c3 Maeva TECHER 2025-05-30
b540a1e Maeva TECHER 2025-02-27
pheatmap(
  heatmap_matrix,
  color = custom_cyan_orange_palette,
  breaks = color_breaks,
  cluster_rows = TRUE,
  cluster_cols = FALSE,
  show_rownames = FALSE,
  show_colnames = TRUE,
  fontsize_row = 6,
  fontsize_col = 10,
  main = "Heatmap of Orthologs Expression in Head Tissue - STRATEGY 2"
)

Version Author Date
4e391c3 Maeva TECHER 2025-05-30
b540a1e Maeva TECHER 2025-02-27
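The "top 500" selection that feeds these heatmaps amounts to filter, sort, then truncate. A base R sketch on invented data is given below; `top_upregulated()` is a hypothetical helper that mirrors the `filter()`/`arrange(desc())`/`slice(1:500)` steps for the upregulated side only.

```r
# Hypothetical helper: keep the n rows with the largest positive fold change
top_upregulated <- function(deg, n = 500, alpha = 0.05, lfc_cutoff = 1) {
  keep <- !is.na(deg$padj) & deg$padj < alpha &
    !is.na(deg$log2FoldChange) & deg$log2FoldChange > lfc_cutoff
  hits <- deg[keep, , drop = FALSE]
  hits <- hits[order(hits$log2FoldChange, decreasing = TRUE), , drop = FALSE]
  head(hits, n)  # returns all rows when fewer than n pass the filter
}

# Invented values, for illustration only
toy_deg <- data.frame(
  GeneID = paste0("gene", 1:5),
  padj = c(0.01, 0.01, 0.5, 0.02, 0.03),
  log2FoldChange = c(1.2, 4.0, 6.0, 2.5, -3.0),
  stringsAsFactors = FALSE
)
print(top_upregulated(toy_deg, n = 2))
```

When a species has fewer than `n` significant genes, every one of them is kept, so species with few DEGs contribute fewer rows to the heatmap rather than causing an error.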

Thorax tissues

# Load orthogroup mapping
ortho_dir <- "data/orthofinder/Polyneoptera/Results_I2_iqtree"  # relative to the project root (the knit directory)
input_file <- file.path(ortho_dir, "Orthogroups_genesproteinbiotype_13species_annotated_May2025.csv")
filtered_final_orthotable <- read.csv(input_file, header = TRUE, stringsAsFactors = FALSE)

# Ensure column names are correctly set
if ("gene_id" %in% colnames(filtered_final_orthotable)) {
  colnames(filtered_final_orthotable)[colnames(filtered_final_orthotable) == "gene_id"] <- "GeneID"
}

# Select only relevant columns and ensure uniqueness
filtered_final_orthotable <- filtered_final_orthotable %>%
  select(GeneID, Orthogroup) %>%
  distinct(GeneID, .keep_all = TRUE)  # Ensure one entry per GeneID

# Define species list
allspecies <- c("gregaria", "piceifrons", "cancellata", "americana", "cubense")

# Function to load DEGs for a given set of species (Thorax only)
load_deg_data <- function(species_list) {
  degs_up <- list()
  degs_down <- list()
  degs_all <- list()
  
  for (species in species_list) {
    # Define the correct file path for thorax
    deg_file <- file.path(workDir, "DEG_results", "Bulk_RNAseq", species, "Thorax", paste0("DESeq2_sigresults_sva_Thorax_", species, ".csv"))

    # Read DESeq2 results
    deg_data <- read.csv(deg_file, stringsAsFactors = FALSE)
    
    # Ensure 'GeneID' column exists (some DESeq2 outputs use 'X')
    #if (!"GeneID" %in% colnames(deg_data)) {
    #  if ("X" %in% colnames(deg_data)) {
    #    colnames(deg_data)[colnames(deg_data) == "X"] <- "GeneID"
    #  } else {
    #    message(paste("No GeneID column found for", species, "in Thorax - Skipping"))
    #    next
    #  }
    #}
    
    # Convert to character for safe merging
    deg_data$GeneID <- as.character(deg_data$GeneID)
    filtered_final_orthotable$GeneID <- as.character(filtered_final_orthotable$GeneID)

    # Merge with orthogroup information
    deg_data <- left_join(deg_data, filtered_final_orthotable, by = "GeneID") %>%
      mutate(Orthogroup = ifelse(is.na(Orthogroup), "Unassigned", Orthogroup))  # Handle missing orthogroups
    
    # Check if data is empty
    if (nrow(deg_data) == 0) {
      message(paste("No data for species:", species, "in Thorax"))
      next
    }
    
    # Filter for significant DEGs (padj < 0.05 and |log2FoldChange| >= 1), split by direction
    upregulated <- deg_data %>%
      filter(padj < 0.05 & log2FoldChange >= 1) %>%
      select(Orthogroup) %>%
      distinct()
    
    downregulated <- deg_data %>%
      filter(padj < 0.05 & log2FoldChange <= -1) %>%
      select(Orthogroup) %>%
      distinct()
    
    all_degs <- deg_data %>%
      filter(padj < 0.05 & abs(log2FoldChange) >= 1) %>%
      select(Orthogroup) %>%
      distinct()
    
    # Store the DEGs in the lists
    degs_up[[species]] <- upregulated$Orthogroup
    degs_down[[species]] <- downregulated$Orthogroup
    degs_all[[species]] <- all_degs$Orthogroup
  }
  
  return(list(up = degs_up, down = degs_down, all = degs_all))
}

# Load DEG data for Thorax only
venn_data_allspecies_thorax <- load_deg_data(allspecies)

# Function to generate Venn diagrams with Orthogroups (ONLY thorax)
display_venn_with_datatable <- function(venn_data, title) {
  # Find the orthogroups shared by every species in venn_data
  overlap_orthogroups <- Reduce(intersect, venn_data)
  
  # Create a dataframe for overlapping orthogroups
  overlap_df <- data.frame(Orthogroup = overlap_orthogroups)
  
  # Generate the Venn diagram
  venn_plot <- venn.diagram(
    x = venn_data, 
    category.names = allspecies,
    filename = NULL, 
    output = TRUE, 
    fill = c("orange", "red", "orchid", "green", "yellow"),
    alpha = 0.5, 
    cex = 2, 
    cat.cex = 0, 
    main = title,
    main.cex = 1.2
  )

  # Clear plotting area and display Venn diagram
  grid.newpage()
  grid.draw(venn_plot)

  # Display overlapping Orthogroups as a datatable
  datatable(overlap_df, options = list(
      pageLength = 10,
      scrollX = TRUE,
      autoWidth = TRUE,
      searchHighlight = TRUE
  ), rownames = FALSE)
}

# Display Venn diagrams and tables for thorax only
display_venn_with_datatable(venn_data_allspecies_thorax$up, "Venn Diagram of Upregulated Orthogroups - Thorax")

Version Author Date
4e391c3 Maeva TECHER 2025-05-30
b540a1e Maeva TECHER 2025-02-27
display_venn_with_datatable(venn_data_allspecies_thorax$down, "Venn Diagram of Downregulated Orthogroups - Thorax")

display_venn_with_datatable(venn_data_allspecies_thorax$all, "Venn Diagram of All Significant Orthogroups - Thorax")

# Load Orthogroup information
ortho_dir <- "data/orthofinder/Polyneoptera/Results_I2_iqtree"
input_file <- file.path(ortho_dir, "Orthogroups_genesproteinbiotype_13species_annotated_May2025.csv")
filtered_final_orthotable <- read.csv(input_file, header = TRUE, stringsAsFactors = FALSE)

# Ensure correct column names
if ("gene_id" %in% colnames(filtered_final_orthotable)) {
  colnames(filtered_final_orthotable)[colnames(filtered_final_orthotable) == "gene_id"] <- "GeneID"
}

# Select relevant columns and ensure uniqueness
filtered_final_orthotable <- filtered_final_orthotable %>%
  select(GeneID, Orthogroup) %>%
  distinct(GeneID, .keep_all = TRUE)  # Keep unique mapping

# Define species order explicitly
species_order <- c("nitens", "cubense", "americana", "piceifrons", "cancellata", "gregaria")

# Initialize an empty list to store heatmap data
heatmap_list <- list()

# Loop through each species to process their Thorax data
for (species in species_order) {
  message(paste("Processing species:", species))

  # Define file path for Thorax
  thorax_file <- file.path(workDir, "DEG_results", "Bulk_RNAseq", species, "Thorax", paste0("DESeq2_sigresults_sva_Thorax_", species, ".csv"))

  # Check if file exists before loading
  if (!file.exists(thorax_file)) {
    message(paste("Missing Thorax file for:", species, "- Assigning empty dataset"))
    thorax_data <- data.frame(GeneID = character(), padj = numeric(), log2FoldChange = numeric(), stringsAsFactors = FALSE)
  } else {
    thorax_data <- tryCatch(read.csv(thorax_file, stringsAsFactors = FALSE), error = function(e) data.frame())
  }

  # Ensure GeneID column exists
  if (!"GeneID" %in% colnames(thorax_data) && "X" %in% colnames(thorax_data)) {
    colnames(thorax_data)[colnames(thorax_data) == "X"] <- "GeneID"
  }

  # Convert GeneID to character
  thorax_data$GeneID <- as.character(thorax_data$GeneID)
  filtered_final_orthotable$GeneID <- as.character(filtered_final_orthotable$GeneID)

  # Merge with orthogroup information
  thorax_data <- left_join(thorax_data, filtered_final_orthotable, by = "GeneID") %>%
    mutate(Orthogroup = ifelse(is.na(Orthogroup), "Unassigned", Orthogroup))

  # If no significant DEGs are found, ensure the structure is correct
  if (nrow(thorax_data) == 0) {
    message(paste("No significant Thorax DEGs for:", species, "- Assigning placeholder values"))
    thorax_data <- data.frame(
      Orthogroup = character(),
      log2FoldChange = numeric(),
      Tissue = character(),
      Regulation = character(),
      Species = character()
    )
  } else {
    # Keep significant DEGs, then the 500 strongest up- and downregulated genes by log2FoldChange
    thorax_up <- thorax_data %>%
      filter(padj < 0.05 & log2FoldChange > 1) %>%
      arrange(desc(log2FoldChange)) %>%
      slice_head(n = 500)

    thorax_down <- thorax_data %>%
      filter(padj < 0.05 & log2FoldChange < -1) %>%
      arrange(log2FoldChange) %>%
      slice_head(n = 500)

    # Combine data and prepare for heatmap
    thorax_data <- bind_rows(
      thorax_up %>% mutate(Tissue = "Thorax", Regulation = "Upregulated", Species = species),
      thorax_down %>% mutate(Tissue = "Thorax", Regulation = "Downregulated", Species = species)
    ) %>%
      select(Orthogroup, log2FoldChange, Tissue, Regulation, Species)
  }

  # Append to heatmap list, ensuring species is represented
  heatmap_list[[species]] <- thorax_data
}

# Combine all species data
final_heatmap_data <- bind_rows(heatmap_list)

# Ensure all species are represented, even if they have no significant DEGs
for (species in species_order) {
    if (!species %in% unique(final_heatmap_data$Species)) {
        message(paste("Adding placeholder for missing species:", species))
        final_heatmap_data <- bind_rows(
            final_heatmap_data,
            data.frame(
                Orthogroup = "Unassigned",  # Placeholder Orthogroup
                log2FoldChange = 0,
                Tissue = "Thorax",
                Regulation = "None",
                Species = species
            )
        )
    }
}

# Ensure species order in the data
final_heatmap_data$Species <- factor(final_heatmap_data$Species, levels = species_order)

# Create heatmap matrix (Thorax only)
heatmap_matrix <- final_heatmap_data %>%
  group_by(Orthogroup, Species) %>% 
  summarize(log2FoldChange = mean(log2FoldChange, na.rm = TRUE), .groups = "drop") %>%
  pivot_wider(names_from = Species, values_from = log2FoldChange, values_fill = 0) %>%
  column_to_rownames("Orthogroup") %>%
  as.matrix()

# Explicitly reorder the columns in heatmap_matrix
heatmap_matrix <- heatmap_matrix[, species_order, drop = FALSE]  # Ensure order is applied

# Define color palettes
custom_cyan_orange_palette <- colorRampPalette(c("cyan", "cyan2", "cyan3", "black", "orange3", "orange2", "orange"))(100)
custom_blue_red_palette <- colorRampPalette(c("blue3", "blue2", "blue1", "white", "red", "red2", "red3"))(100)

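# Define color breaks (pheatmap expects one more break than colors, so zero maps to the palette midpoint)
max_abs_lfc <- max(abs(heatmap_matrix), na.rm = TRUE)
color_breaks <- seq(-max_abs_lfc, max_abs_lfc, length.out = length(custom_blue_red_palette) + 1)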

# Generate heatmaps (Only thorax)
pheatmap(
  heatmap_matrix,
  color = custom_blue_red_palette,
  breaks = color_breaks,
  cluster_rows = TRUE,
  cluster_cols = FALSE,
  show_rownames = FALSE,
  show_colnames = TRUE,
  fontsize_row = 6,
  fontsize_col = 10,
  main = "Heatmap of Orthologs Expression in Thorax Tissue - STRATEGY 2"
)

pheatmap(
  heatmap_matrix,
  color = custom_cyan_orange_palette,
  breaks = color_breaks,
  cluster_rows = TRUE,
  cluster_cols = FALSE,
  show_rownames = FALSE,
  show_colnames = TRUE,
  fontsize_row = 6,
  fontsize_col = 10,
  main = "Heatmap of Orthologs Expression in Thorax Tissue - STRATEGY 2"
)
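
The heatmaps rely on a diverging palette centred on zero. As a minimal sketch on made-up values (the `toy_*` objects are hypothetical and not part of the analysis), symmetric breaks can be built so that there is one more break than colours, which is the layout the pheatmap documentation describes for `breaks`:

```r
# Toy log2 fold-change values standing in for the real matrix (hypothetical data)
toy_matrix <- matrix(c(-3, -1, 0, 0.5, 2, 4), nrow = 2)

# Diverging palette with a neutral midpoint
toy_palette <- colorRampPalette(c("blue3", "white", "red3"))(100)

# Symmetric breaks around zero, one element longer than the palette
toy_max_abs <- max(abs(toy_matrix), na.rm = TRUE)
toy_breaks <- seq(-toy_max_abs, toy_max_abs, length.out = length(toy_palette) + 1)

# Number of breaks minus number of colours
length(toy_breaks) - length(toy_palette)

# The middle break sits at zero, so colours 50 and 51 straddle it
toy_breaks[length(toy_palette) / 2 + 1]
```

With an even number of colours, the two central colours sit on either side of zero, so negative and positive fold changes get mirrored shades.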


All species

Combined tissues

# Define the species list
allspecies <- c("nitens", "cubense", "americana", "piceifrons", "cancellata", "gregaria")

ortho_dir <- "data/orthofinder/Polyneoptera/Results_I2_iqtree"
input_file <- file.path(ortho_dir, "Orthogroups_genesproteinbiotype_13species_annotated_May2025.csv")

# Load the orthogroup mapping file
if (!file.exists(input_file)) {
  stop("Orthogroup mapping file not found: ", input_file)
}
filtered_final_orthotable <- read.csv(input_file, header = TRUE, stringsAsFactors = FALSE)

# Function to load DESeq2 data and map GeneIDs to Orthogroups
load_deseq2_upset_data <- function(tissue) {
  species_deg_list <- list()  # Store significant Orthogroups per species

  for (species in allspecies) {
    # Construct the correct file path
    deg_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", species, tissue, 
                          paste0("DESeq2_sigresults_sva_", tissue, "_", species, ".csv"))
    
    # Skip if file does not exist
    if (!file.exists(deg_file)) {
      message(paste("File missing for species:", species, "in", tissue))
      next
    }
    
    # Load the DESeq2 results file
    deseq_data <- read.csv(deg_file, stringsAsFactors = FALSE)

    # Check for GeneID column
    if (!"GeneID" %in% colnames(deseq_data)) {
      if ("X" %in% colnames(deseq_data)) {
        colnames(deseq_data)[colnames(deseq_data) == "X"] <- "GeneID"
      } else {
        stop(paste("Error: No 'GeneID' column found in", deg_file))
      }
    }

    # Merge DESeq2 results with the orthogroup mapping
    deseq_data_merged <- merge(deseq_data, filtered_final_orthotable[, c("GeneID", "Orthogroup")], by = "GeneID", all.x = TRUE)

    # Handle missing Orthogroups
    deseq_data_merged$Orthogroup[is.na(deseq_data_merged$Orthogroup)] <- "Unknown"

    # Filter for significant DEGs based on Orthogroups
    significant_orthogroups <- deseq_data_merged %>%
      filter(padj < 0.05 & abs(log2FoldChange) >= 1) %>%
      pull(Orthogroup) %>%
      unique()  # Remove duplicates

    # Store the Orthogroup list for the species
    species_deg_list[[species]] <- significant_orthogroups
  }

  # Create a binary matrix for UpSet plot
  all_orthogroups <- unique(unlist(species_deg_list))  # Collect all unique Orthogroups
  upset_data <- data.frame(Orthogroup = all_orthogroups)

  for (species in allspecies) {
    upset_data[[species]] <- as.integer(all_orthogroups %in% species_deg_list[[species]])
  }

  return(upset_data)
}


# Load DEG data based on Orthogroups for Head and Thorax
upset_data_head <- load_deseq2_upset_data("Head")
upset_data_thorax <- load_deseq2_upset_data("Thorax")

convert_upset_to_venn <- function(upset_data) {
    species_sets <- list()
    
    for (species in allspecies) {
        species_sets[[species]] <- upset_data$Orthogroup[upset_data[[species]] == 1]  # Get Orthogroups present in the species
    }
    
    return(species_sets)
}


venn_data_head <- convert_upset_to_venn(upset_data_head)
venn_data_thorax <- convert_upset_to_venn(upset_data_thorax)

# Function to visualize Venn diagram using ggVennDiagram
display_ggvenn_plot <- function(venn_data, title) {
  gg_venn <- ggVennDiagram(venn_data, label_alpha = 0, edge_lty = "dashed") +
    scale_fill_gradient(low = "lightblue", high = "darkblue") +
    labs(title = title) +
    theme_minimal(base_size = 14)
  
  return(gg_venn)
}

# **Generate Venn diagrams with ORTHOGROUPS**
ggvenn_head_all <- display_ggvenn_plot(venn_data_head, "Venn Diagram of All Significant Orthogroups (Head) - All Species")
ggvenn_thorax_all <- display_ggvenn_plot(venn_data_thorax, "Venn Diagram of All Significant Orthogroups (Thorax) - All Species") 


display_upset_plot <- function(upset_data, title) {
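    # UpSetR and ComplexUpset both export upset(); call the ComplexUpset version explicitly
    upset_plot <- ComplexUpset::upset(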
        upset_data,
        allspecies,
        sort_sets = FALSE,
        base_annotations = list(
            'Intersection size' = intersection_size(counts = FALSE) + 
                ylab('# Orthogroups in intersection') + 
                scale_y_continuous(expand = expansion(mult = c(0, 0.05)))
        ),
        matrix = (
            intersection_matrix(
                geom = geom_point(
                    shape = 'circle',
                    size = 4
                ),
                segment = geom_segment(
                    linetype = 'solid',
                    size = 1
                ),
                outline_color = list(
                    active = 'black',
                    inactive = 'grey80'
                )
            )
        ),
        queries = list(
            upset_query(
                intersect = c('gregaria', 'cancellata'),
                color = 'orange',
                fill = 'orange',
                only_components = c('intersections_matrix', 'Intersection size')
            ),
            upset_query(
                intersect = c('gregaria', 'piceifrons'),
                color = 'orange',
                fill = 'orange',
                only_components = c('intersections_matrix', 'Intersection size')
            ),
            upset_query(
                intersect = c('cancellata', 'piceifrons'),
                color = 'orange',
                fill = 'orange',
                only_components = c('intersections_matrix', 'Intersection size')
            ),
            upset_query(
                intersect = c('gregaria', 'piceifrons', 'cancellata'),
                color = 'darkred',
                fill = 'darkred',
                only_components = c('intersections_matrix', 'Intersection size')
            ),
            upset_query(
                intersect = c('gregaria', 'piceifrons', 'cancellata', 'americana'),
                color = 'purple',
                fill = 'purple',
                only_components = c('intersections_matrix', 'Intersection size')
            ),
            upset_query(set = 'gregaria', fill = 'darkred'),
            upset_query(set = 'piceifrons', fill = 'darkred'),
            upset_query(set = 'cancellata', fill = 'darkred'),
            upset_query(set = 'americana', fill = 'black'),
            upset_query(set = 'cubense', fill = 'black'),
            upset_query(set = 'nitens', fill = 'black')
        ),
        set_sizes = upset_set_size(
            geom = geom_bar(width = 0.8),
            position = 'right'
        ) + 
        ylab('# Orthogroups per species') + 
        theme(
            axis.line.x = element_line(colour = 'black'),
            axis.ticks.x = element_line()
        ),
        stripes = upset_stripes(
            geom = geom_segment(size = 12),
            colors = c('grey95', 'white')
        )
    ) +
    theme_minimal() +
    theme(
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        axis.line = element_line(colour = 'black'),
        text = element_text(size = 14),
        axis.text.x = element_text(face = "italic"),
        plot.title = element_text(hjust = 0.5, face = "bold", size = 16)
    ) +
    ggtitle(title)

    return(upset_plot)
}

# **Generate UpSet plots**
upset_head <- display_upset_plot(upset_data_head, "Intersection from Head")
ggvenn_head_all; print(upset_head)

Version Author Date
4e391c3 Maeva TECHER 2025-05-30
9451c02 Maeva TECHER 2025-03-03
b540a1e Maeva TECHER 2025-02-27

upset_thorax <- display_upset_plot(upset_data_thorax, "Intersection from Thorax")
ggvenn_thorax_all; print(upset_thorax)

# Function to extract Orthogroups for a specific intersection
extract_orthogroups_from_intersection <- function(upset_data, selected_species) {
    # Ensure the input species exist in the dataset
    selected_species <- intersect(selected_species, colnames(upset_data))
    
    # Select rows where all selected species have '1' (present in the intersection)
    intersecting_orthogroups <- upset_data[rowSums(upset_data[selected_species]) == length(selected_species), ]
    
    # Return only the Orthogroups as a DataFrame
    return(data.frame(Orthogroup = intersecting_orthogroups$Orthogroup))
}

shared_orthogroups <- extract_orthogroups_from_intersection(upset_data_head, c("gregaria", "cancellata", "piceifrons"))

kable(shared_orthogroups, col.names = c("Head: shared orthogroups among all locusts")) %>%
    kable_styling(bootstrap_options = c("striped", "hover"))
Head: shared orthogroups among all locusts
Unknown
OG0000391
OG0000922
OG0004296
OG0002737
OG0000296
OG0000015
OG0000553
OG0000120
OG0000073
OG0003126
OG0000796
OG0000212
OG0007990
OG0003478
OG0000823
OG0000334
OG0000093
OG0002738
OG0000371
OG0004306
OG0000222
OG0010128
OG0009808
OG0000788
OG0002726
OG0000196
OG0009113
OG0000027
OG0011877
OG0000104
OG0000105
OG0009321
OG0015157
OG0012979
OG0002734
OG0007957
OG0000151
OG0000552
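
The table above comes from `extract_orthogroups_from_intersection()`, which keeps every orthogroup flagged in all of the selected species, whether or not it is also flagged elsewhere (an inclusive intersection). UpSet bars typically count orthogroups flagged in exactly the highlighted species and in no others (an exclusive intersection, which appears to be the ComplexUpset default), so the table can be longer than the matching bar. A minimal sketch on a made-up presence/absence table (all `toy_*` names and `OG_*` labels are hypothetical) shows the difference:

```r
# Hypothetical presence/absence table: 1 = orthogroup is differentially expressed
toy_upset <- data.frame(
  Orthogroup = c("OG_A", "OG_B", "OG_C", "OG_D"),
  gregaria   = c(1, 1, 1, 0),
  cancellata = c(1, 1, 0, 1),
  piceifrons = c(1, 0, 1, 1),
  americana  = c(1, 0, 0, 0)
)

toy_selected <- c("gregaria", "cancellata")
toy_others   <- setdiff(names(toy_upset), c("Orthogroup", toy_selected))

# Inclusive: flagged in every selected species, whatever happens elsewhere
in_all_selected <- rowSums(toy_upset[toy_selected]) == length(toy_selected)
toy_inclusive <- toy_upset$Orthogroup[in_all_selected]

# Exclusive: flagged in every selected species and in none of the others
absent_elsewhere <- rowSums(toy_upset[toy_others]) == 0
toy_exclusive <- toy_upset$Orthogroup[in_all_selected & absent_elsewhere]

toy_inclusive  # "OG_A" "OG_B"
toy_exclusive  # "OG_B"
```

Which definition is appropriate depends on the question: the inclusive list answers "what do these species share", while the exclusive list answers "what is specific to this group".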
# Locusts plus americana (four-species intersection)
shared_orthogroups <- extract_orthogroups_from_intersection(upset_data_head, c("gregaria", "cancellata", "piceifrons", "americana"))

kable(shared_orthogroups, col.names = c("Head: shared orthogroups among all locusts + americana")) %>%
    kable_styling(bootstrap_options = c("striped", "hover"))
shared_orthogroups <- extract_orthogroups_from_intersection(upset_data_thorax, c("gregaria", "cancellata", "piceifrons"))

kable(shared_orthogroups, col.names = c("Thorax: shared orthogroups among all locusts")) %>%
    kable_styling(bootstrap_options = c("striped", "hover"))
Thorax: shared orthogroups among all locusts
Unknown
OG0000015
OG0012400
OG0000532
OG0000130
OG0000014
OG0000095
OG0000922
OG0003126
OG0010429
OG0000012
OG0012312
OG0005229
OG0000347
OG0009757
OG0000027
OG0000043
OG0000000
OG0002737
OG0011824
OG0009700
OG0002903
OG0000334
OG0003890
OG0000065
OG0002738
OG0000093
OG0014033
OG0012058
OG0000357
OG0012103
OG0000346
OG0004309
OG0000001
OG0005741
OG0010128
OG0000796
OG0011184
OG0000615
OG0012329
OG0010105
OG0000090
OG0000003
OG0000017
OG0003752
OG0007957
OG0011853
OG0011877
OG0000327
OG0000419
OG0000104
OG0004144
OG0000679
OG0009321
OG0013138
OG0015157
OG0004306
OG0001327
OG0000151
OG0011174
OG0001029
OG0000391
OG0007893
# Locusts plus americana (four-species intersection)
shared_orthogroups <- extract_orthogroups_from_intersection(upset_data_thorax, c("gregaria", "cancellata", "piceifrons", "americana"))

kable(shared_orthogroups, col.names = c("Thorax: shared orthogroups among all locusts + americana")) %>%
    kable_styling(bootstrap_options = c("striped", "hover"))
# **Shared Orthogroups among Gregaria and Piceifrons (Head)**
shared_orthogroups_head_piceifrons_gregaria <- extract_orthogroups_from_intersection(upset_data_head, c("gregaria", "piceifrons"))

kable(shared_orthogroups_head_piceifrons_gregaria, col.names = c("Head: shared orthogroups between gregaria & piceifrons")) %>%
    kable_styling(bootstrap_options = c("striped", "hover"))
Head: shared orthogroups between gregaria & piceifrons
Unknown
OG0000157
OG0000391
OG0000922
OG0001997
OG0004296
OG0000076
OG0002737
OG0000296
OG0000730
OG0000001
OG0002116
OG0000015
OG0000553
OG0000120
OG0000073
OG0003126
OG0000796
OG0000212
OG0007990
OG0000532
OG0003478
OG0011734
OG0009711
OG0011824
OG0000823
OG0000334
OG0008369
OG0000093
OG0002738
OG0000371
OG0004306
OG0000222
OG0001028
OG0000255
OG0001106
OG0010128
OG0009808
OG0007936
OG0011185
OG0012474
OG0000788
OG0000451
OG0002726
OG0000196
OG0000220
OG0000090
OG0009113
OG0000027
OG0011877
OG0011844
OG0011108
OG0007353
OG0013224
OG0005649
OG0000228
OG0000244
OG0001581
OG0000161
OG0000104
OG0003902
OG0008906
OG0000004
OG0000105
OG0009321
OG0015157
OG0012979
OG0009400
OG0002734
OG0007957
OG0000151
OG0000552
OG0012177
OG0000763
OG0013284
OG0010366
# **Shared Orthogroups among Piceifrons and Cancellata (Head)**
shared_orthogroups_head_piceifrons_cancellata <- extract_orthogroups_from_intersection(upset_data_head, c("piceifrons", "cancellata"))

kable(shared_orthogroups_head_piceifrons_cancellata, col.names = c("Head: shared orthogroups between piceifrons & cancellata")) %>%
    kable_styling(bootstrap_options = c("striped", "hover"))
Head: shared orthogroups between piceifrons & cancellata
Unknown
OG0000391
OG0000922
OG0004296
OG0003752
OG0000000
OG0002737
OG0000296
OG0013381
OG0005229
OG0001418
OG0011963
OG0002743
OG0011150
OG0012104
OG0000019
OG0000005
OG0011789
OG0000015
OG0000553
OG0000120
OG0000073
OG0003126
OG0000796
OG0000212
OG0012394
OG0007990
OG0002714
OG0005615
OG0000013
OG0003478
OG0011825
OG0000864
OG0000192
OG0000823
OG0002903
OG0000327
OG0000334
OG0008392
OG0000093
OG0001991
OG0002738
OG0004123
OG0000371
OG0014033
OG0000915
OG0012112
OG0006411
OG0009923
OG0006809
OG0004306
OG0000072
OG0015037
OG0008780
OG0002840
OG0000222
OG0010128
OG0009808
OG0014099
OG0001974
OG0012476
OG0000788
OG0002726
OG0009280
OG0014140
OG0000196
OG0000026
OG0012497
OG0008620
OG0009113
OG0005153
OG0000562
OG0000027
OG0002736
OG0011877
OG0014998
OG0000006
OG0004476
OG0000104
OG0011121
OG0000105
OG0009560
OG0009321
OG0015157
OG0012979
OG0002734
OG0007957
OG0000151
OG0000552
OG0010706
OG0011160
OG0000268
OG0012470
OG0013283
# **Shared Orthogroups among Cancellata and Gregaria (Head)**
shared_orthogroups_head_cancellata_gregaria <- extract_orthogroups_from_intersection(upset_data_head, c("cancellata", "gregaria"))

kable(shared_orthogroups_head_cancellata_gregaria, col.names = c("Head: shared orthogroups between cancellata & gregaria")) %>%
    kable_styling(bootstrap_options = c("striped", "hover"))
Head: shared orthogroups between cancellata & gregaria
Unknown
OG0000391
OG0000922
OG0016608
OG0000062
OG0004296
OG0003743
OG0000795
OG0000283
OG0002737
OG0000296
OG0006026
OG0014061
OG0009397
OG0000053
OG0000014
OG0007929
OG0000015
OG0000048
OG0000553
OG0009205
OG0002848
OG0009193
OG0000120
OG0000073
OG0003126
OG0000796
OG0004168
OG0000461
OG0012374
OG0000596
OG0000212
OG0000087
OG0007990
OG0010105
OG0000012
OG0001003
OG0008939
OG0009555
OG0000347
OG0006856
OG0000871
OG0003478
OG0011808
OG0003147
OG0009700
OG0008795
OG0000823
OG0011944
OG0000334
OG0013234
OG0011943
OG0001012
OG0009717
OG0002618
OG0011945
OG0003890
OG0000065
OG0000093
OG0002740
OG0002738
OG0011926
OG0011949
OG0010020
OG0000371
OG0007008
OG0012103
OG0013249
OG0004306
OG0000222
OG0001231
OG0010128
OG0009808
OG0000788
OG0002726
OG0000196
OG0009113
OG0000027
OG0011877
OG0000104
OG0000105
OG0009321
OG0015157
OG0012979
OG0002734
OG0007957
OG0000151
OG0000552
OG0005405
OG0001199
OG0001029
OG0008058
OG0009204
OG0004566
OG0009673
OG0009206
OG0002882
OG0001031
OG0001596
OG0013924
OG0011221
OG0000003
OG0000119
OG0010109
OG0010589
OG0012892
OG0010254
OG0010252
OG0000185
OG0011473
OG0010859
OG0000394
OG0006787
OG0000169
OG0007332
OG0000035
OG0010375
OG0014014
OG0013838
OG0000419
OG0011410
OG0001018
OG0010601
OG0011697
OG0010104
OG0011736
OG0003011
OG0008762
OG0012954
OG0005373
OG0010504
OG0000565
OG0012122
OG0000232
# **Shared Orthogroups among Gregaria and Piceifrons (Thorax)**
shared_orthogroups_thorax_piceifrons_gregaria <- extract_orthogroups_from_intersection(upset_data_thorax, c("gregaria", "piceifrons"))

kable(shared_orthogroups_thorax_piceifrons_gregaria, col.names = c("Thorax: shared orthogroups between gregaria & piceifrons")) %>%
    kable_styling(bootstrap_options = c("striped", "hover"))
Thorax: shared orthogroups between gregaria & piceifrons
Unknown
OG0000015
OG0000114
OG0012400
OG0013163
OG0000532
OG0011701
OG0000130
OG0000014
OG0000095
OG0003261
OG0011187
OG0000922
OG0010050
OG0003126
OG0000076
OG0010018
OG0010429
OG0007990
OG0000012
OG0012312
OG0005229
OG0008415
OG0009942
OG0000347
OG0003860
OG0009757
OG0000027
OG0009972
OG0000043
OG0000004
OG0000000
OG0009711
OG0012891
OG0002737
OG0011824
OG0009700
OG0000067
OG0000399
OG0012848
OG0002903
OG0000334
OG0007924
OG0001012
OG0012445
OG0003890
OG0000065
OG0002738
OG0000093
OG0014033
OG0012058
OG0000357
OG0012103
OG0000346
OG0004309
OG0000001
OG0009925
OG0015031
OG0012201
OG0005741
OG0010128
OG0012007
OG0000796
OG0011184
OG0012260
OG0012283
OG0004527
OG0009915
OG0000315
OG0000615
OG0012474
OG0001031
OG0013333
OG0000212
OG0010657
OG0013062
OG0000801
OG0001343
OG0012329
OG0000035
OG0010105
OG0009175
OG0000090
OG0004334
OG0000003
OG0012497
OG0012492
OG0001479
OG0008198
OG0008948
OG0004273
OG0011074
OG0010812
OG0006572
OG0000017
OG0008688
OG0003752
OG0012429
OG0007957
OG0000044
OG0004340
OG0007923
OG0009569
OG0008741
OG0011853
OG0011718
OG0011877
OG0011844
OG0001581
OG0000553
OG0014019
OG0007925
OG0000327
OG0013838
OG0003395
OG0000475
OG0010561
OG0011944
OG0000419
OG0000244
OG0013213
OG0010601
OG0011991
OG0009107
OG0000161
OG0000104
OG0003902
OG0004144
OG0011131
OG0011913
OG0000679
OG0000105
OG0012954
OG0009321
OG0013138
OG0015157
OG0002734
OG0006526
OG0012107
OG0004306
OG0001180
OG0000591
OG0012129
OG0001327
OG0000151
OG0011174
OG0000137
OG0004663
OG0009139
OG0001028
OG0001029
OG0001231
OG0000391
OG0001501
OG0007893
OG0012177
OG0000763
OG0015208
OG0013288
OG0001106
# **Shared Orthogroups among Piceifrons and Cancellata (Thorax)**
shared_orthogroups_thorax_piceifrons_cancellata <- extract_orthogroups_from_intersection(upset_data_thorax, c("piceifrons", "cancellata"))

kable(shared_orthogroups_thorax_piceifrons_cancellata, col.names = c("Thorax: shared orthogroups between piceifrons & cancellata")) %>%
    kable_styling(bootstrap_options = c("striped", "hover"))
Thorax: shared orthogroups between piceifrons & cancellata
Unknown
OG0011638
OG0000233
OG0000015
OG0000019
OG0012400
OG0000283
OG0000532
OG0000249
OG0005048
OG0000130
OG0000014
OG0000053
OG0000095
OG0013316
OG0012287
OG0001341
OG0012251
OG0009191
OG0000073
OG0000922
OG0003126
OG0000066
OG0010429
OG0000012
OG0012312
OG0005229
OG0008437
OG0000347
OG0010546
OG0000013
OG0008681
OG0009757
OG0000027
OG0008946
OG0000043
OG0011849
OG0000000
OG0000348
OG0002737
OG0011789
OG0011824
OG0009700
OG0008103
OG0002903
OG0011963
OG0000334
OG0003890
OG0000065
OG0008750
OG0002738
OG0000093
OG0014033
OG0012058
OG0000357
OG0012103
OG0000346
OG0004309
OG0000001
OG0002024
OG0001411
OG0000050
OG0005741
OG0000356
OG0010128
OG0000796
OG0000552
OG0000567
OG0001974
OG0011184
OG0010639
OG0013305
OG0000615
OG0001305
OG0004981
OG0003920
OG0000596
OG0012329
OG0010105
OG0003066
OG0000090
OG0000606
OG0000026
OG0007959
OG0000003
OG0000337
OG0008498
OG0005225
OG0008740
OG0000017
OG0000956
OG0003752
OG0009115
OG0000303
OG0007957
OG0002116
OG0011825
OG0011832
OG0002671
OG0013160
OG0011853
OG0011860
OG0011877
OG0000327
OG0008473
OG0004598
OG0010201
OG0000419
OG0000761
OG0011098
OG0011130
OG0000104
OG0009167
OG0004144
OG0000371
OG0011839
OG0000679
OG0011997
OG0000279
OG0008537
OG0009321
OG0003529
OG0013138
OG0015157
OG0011136
OG0000006
OG0001529
OG0004306
OG0001327
OG0000151
OG0005779
OG0011174
OG0001029
OG0000391
OG0011160
OG0007893
OG0012470
# **Shared Orthogroups among Cancellata and Gregaria (Thorax)**
shared_orthogroups_thorax_cancellata_gregaria <- extract_orthogroups_from_intersection(upset_data_thorax, c("cancellata", "gregaria"))

kable(shared_orthogroups_thorax_cancellata_gregaria, col.names = c("Thorax: shared orthogroups between cancellata & gregaria")) %>%
    kable_styling(bootstrap_options = c("striped", "hover"))
Thorax: shared orthogroups between cancellata & gregaria
OG0000667
Unknown
OG0015164
OG0000015
OG0003743
OG0012400
OG0000080
OG0000532
OG0000130
OG0000014
OG0009272
OG0000095
OG0009216
OG0012279
OG0000301
OG0009625
OG0008999
OG0009193
OG0000922
OG0003126
OG0010077
OG0000461
OG0010429
OG0012374
OG0000119
OG0000012
OG0012312
OG0005647
OG0005615
OG0006448
OG0005038
OG0005229
OG0006530
OG0009556
OG0009555
OG0008938
OG0000347
OG0006856
OG0009757
OG0006607
OG0011746
OG0000005
OG0000027
OG0000324
OG0008009
OG0002736
OG0000043
OG0000000
OG0002737
OG0011824
OG0009700
OG0000823
OG0002903
OG0000334
OG0001973
OG0011943
OG0006633
OG0008392
OG0011274
OG0003890
OG0000065
OG0004297
OG0002738
OG0000093
OG0003031
OG0014033
OG0012058
OG0000357
OG0010800
OG0012103
OG0000346
OG0004309
OG0000001
OG0004339
OG0004616
OG0000618
OG0001823
OG0005741
OG0010128
OG0010985
OG0000796
OG0011184
OG0000615
OG0012329
OG0010105
OG0000090
OG0000003
OG0000017
OG0003752
OG0007957
OG0011853
OG0011877
OG0000327
OG0000419
OG0000104
OG0004144
OG0000679
OG0009321
OG0013138
OG0015157
OG0004306
OG0001327
OG0000151
OG0011174
OG0001029
OG0000391
OG0007893
OG0011127
OG0000296
OG0003317
OG0000060
OG0007920
OG0000463
OG0009386
OG0008253
OG0008501
OG0010215
OG0009659
OG0013036
OG0009205
OG0013045
OG0000497
OG0004566
OG0002882
OG0000196
OG0001596
OG0000787
OG0013924
OG0011233
OG0011221
OG0012494
OG0016608
OG0011134
OG0000387
OG0011191
OG0000340
OG0000557
OG0011053
OG0013159
OG0002177
OG0009643
OG0003147
OG0010254
OG0010859
OG0011739
OG0006789
OG0007921
OG0000087
OG0001048
OG0000213
OG0003883
OG0010410
OG0011926
OG0013234
OG0008939
OG0001003
OG0008154
OG0008629
OG0005022
OG0007100
OG0000267
OG0010504
OG0000565
OG0013252
OG0012122
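
Each table above contains the entry `Unknown`. This is the placeholder given to genes without an orthogroup assignment, so its presence in an intersection does not indicate a shared orthogroup. One possible way to handle it, sketched here on made-up sets (the `toy_*` objects and `OG_*` labels are hypothetical), is to drop the placeholder from each species' set before intersecting:

```r
# Placeholder label used for genes without an orthogroup
toy_placeholder <- "Unknown"

# Hypothetical per-species sets of significant orthogroups
toy_sets <- list(
  gregaria   = c("Unknown", "OG_A", "OG_B"),
  cancellata = c("Unknown", "OG_A", "OG_C"),
  piceifrons = c("Unknown", "OG_A", "OG_B")
)

# Intersecting as-is keeps the placeholder as if it were a shared orthogroup
toy_shared_raw <- Reduce(intersect, toy_sets)

# Remove the placeholder from every set first, then intersect
toy_sets_clean <- lapply(toy_sets, setdiff, y = toy_placeholder)
toy_shared_clean <- Reduce(intersect, toy_sets_clean)

toy_shared_raw    # "Unknown" "OG_A"
toy_shared_clean  # "OG_A"
```

The same filtering would also lower the set sizes and intersection counts in the Venn and UpSet plots by the placeholder entry, so it is worth applying consistently if it is applied at all.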

sessionInfo()
R version 4.4.2 (2024-10-31)
Platform: aarch64-apple-darwin20
Running under: macOS Sequoia 15.5

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Asia/Tokyo
tzcode source: internal

attached base packages:
[1] grid      stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] ComplexUpset_1.3.6  UpSetR_1.4.0        data.table_1.17.6  
 [4] lubridate_1.9.4     forcats_1.0.0       stringr_1.5.1      
 [7] purrr_1.0.4         tidyverse_2.0.0     readr_2.1.5        
[10] DT_0.33             gridExtra_2.3       VennDiagram_1.7.3  
[13] futile.logger_1.4.3 tibble_3.3.0        kableExtra_1.4.0   
[16] viridis_0.6.5       viridisLite_0.4.2   RColorBrewer_1.1-3 
[19] tidyr_1.3.1         pheatmap_1.0.13     ggVennDiagram_1.5.4
[22] htmlwidgets_1.6.4   plotly_4.11.0       ggplot2_3.5.2      
[25] dplyr_1.1.4         knitr_1.50         

loaded via a namespace (and not attached):
 [1] gtable_0.3.6         xfun_0.52            bslib_0.9.0         
 [4] tzdb_0.5.0           crosstalk_1.2.1      vctrs_0.6.5         
 [7] tools_4.4.2          generics_0.1.4       parallel_4.4.2      
[10] pkgconfig_2.0.3      lifecycle_1.0.4      compiler_4.4.2      
[13] farver_2.1.2         git2r_0.36.2         textshaping_1.0.1   
[16] httpuv_1.6.16        htmltools_0.5.8.1    sass_0.4.10         
[19] yaml_2.3.10          lazyeval_0.2.2       crayon_1.5.3        
[22] later_1.4.2          pillar_1.10.2        jquerylib_0.1.4     
[25] whisker_0.4.1        cachem_1.1.0         tidyselect_1.2.1    
[28] digest_0.6.37        stringi_1.8.7        labeling_0.4.3      
[31] rprojroot_2.0.4      fastmap_1.2.0        colorspace_2.1-1    
[34] cli_3.6.5            magrittr_2.0.3       patchwork_1.3.1     
[37] dichromat_2.0-0.1    withr_3.0.2          scales_1.4.0        
[40] promises_1.3.3       bit64_4.6.0-1        timechange_0.3.0    
[43] rmarkdown_2.29       lambda.r_1.2.4       httr_1.4.7          
[46] bit_4.6.0            workflowr_1.7.1      ragg_1.4.0          
[49] hms_1.1.3            evaluate_1.0.4       rlang_1.1.6         
[52] futile.options_1.0.1 Rcpp_1.0.14          glue_1.8.0          
[55] formatR_1.14         xml2_1.3.8           vroom_1.6.5         
[58] svglite_2.2.1        rstudioapi_0.17.1    jsonlite_2.0.0      
[61] plyr_1.8.9           R6_2.6.1             systemfonts_1.2.3   
[64] fs_1.6.6