Last updated: 2025-02-11

Checks: 5 passed, 2 warnings

Knit directory: locust-comparative-genomics/

This reproducible R Markdown analysis was created with workflowr (version 1.7.1). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.

R Markdown file: uncommitted changes

The R Markdown file has unstaged changes. To know which version of the R Markdown file created these results, you’ll want to first commit it to the Git repo. If you’re still working on the analysis, you can ignore this warning. When you’re finished, you can run wflow_publish to commit the R Markdown file and build the HTML.
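As a rough sketch of that last step, the call below previews a publish with workflowr's `wflow_publish()`. The file name `analysis/my-analysis.Rmd`, the commit message, and the helper `publish_rmd()` are hypothetical placeholders, not names taken from this project; substitute the R Markdown file that produced this page.

```r
# Hypothetical path: replace with the R Markdown file behind this report.
rmd_file <- file.path("analysis", "my-analysis.Rmd")

# Hypothetical helper: commit and build one R Markdown file with workflowr.
# dry_run = TRUE only previews what would be committed and built.
publish_rmd <- function(rmd, commit_msg, dry_run = TRUE) {
  # Do nothing when workflowr is not installed or the file does not exist.
  if (!requireNamespace("workflowr", quietly = TRUE) || !file.exists(rmd)) {
    message("Nothing published: workflowr or ", rmd, " not found.")
    return(invisible(NULL))
  }
  workflowr::wflow_publish(files = rmd, message = commit_msg, dry_run = dry_run)
}

publish_rmd(rmd_file, "Update analysis")
```

Once the preview looks right, calling the helper with `dry_run = FALSE` would perform the commit and rebuild the HTML.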

Environment: empty

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility, it’s best to always run the code in an empty environment.

Seed: set.seed(20221025)

The command set.seed(20221025) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
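A minimal illustration of why the seed matters, using only base R: resetting the same seed before a random draw yields the same values. The draw itself (`sample(100, 5)`) is an arbitrary example and is not taken from the analysis.

```r
# Reset the seed before each draw so both draws start from the same state.
set.seed(20221025)
first_draw <- sample(100, 5)

set.seed(20221025)
second_draw <- sample(100, 5)

# Both draws are the same because the generator state was identical.
identical(first_draw, second_draw)  # TRUE
```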

Session information: recorded

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.
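For reference, the same information can be captured by hand with base R's `sessionInfo()`. The sketch below writes it to a temporary file; the file name `session-info.txt` is an arbitrary choice for illustration.

```r
# Collect the R version, operating system, and attached package versions.
session <- sessionInfo()

# The R version string is one of the recorded fields.
session$R.version$version.string

# Save the printed report to a text file (temporary location, for illustration).
out_file <- file.path(tempdir(), "session-info.txt")
writeLines(capture.output(print(session)), out_file)
```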

Cache: none

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

File paths: absolute

Using absolute paths to the files within your workflowr project makes it difficult for you and others to run your code on a different machine. Change the absolute path(s) below to the suggested relative path(s) to make your code more reproducible.

absolute	relative
/Users/maevatecher/Documents/GitHub/locust-comparative-genomics/data	data
/Users/maevatecher/Documents/GitHub/locust-comparative-genomics/data/orthofinder/Schistocerca	data/orthofinder/Schistocerca
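The conversion in the table above can be sketched in base R: strip the project root from each absolute path, or build the relative path directly with `file.path()`. The variable names are illustrative, and the sketch assumes the code is run from the project root (the knit directory listed above) so that relative paths resolve correctly.

```r
# Project root as it appears in the absolute paths reported above.
project_root <- "/Users/maevatecher/Documents/GitHub/locust-comparative-genomics"

absolute_paths <- c(
  file.path(project_root, "data"),
  file.path(project_root, "data/orthofinder/Schistocerca")
)

# Drop the root and the slash that follows it to obtain relative paths.
relative_paths <- substring(absolute_paths, nchar(project_root) + 2)
relative_paths

# In analysis code, build the relative path directly instead of hard-coding the root.
ortho_dir <- file.path("data", "orthofinder", "Schistocerca")
```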

Repository version: 34c299a

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.

The results in this page were generated with repository version 34c299a. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .DS_Store
    Ignored:    analysis/.DS_Store
    Ignored:    analysis/.Rhistory
    Ignored:    analysis/figure/
    Ignored:    data/.DS_Store
    Ignored:    data/WGCNA_input/.DS_Store
    Ignored:    data/WGCNA_output/.DS_Store
    Ignored:    data/behavioral_data/.DS_Store
    Ignored:    data/behavioral_data/Raw_data/.DS_Store
    Ignored:    data/list/.DS_Store
    Ignored:    data/list/GO_Annotations/.DS_Store
    Ignored:    data/orthofinder/.DS_Store
    Ignored:    data/orthofinder/Polyneoptera/.DS_Store
    Ignored:    data/orthofinder/Polyneoptera/Results_I2/.DS_Store
    Ignored:    data/orthofinder/Polyneoptera/Results_I2/Orthogroups/.DS_Store
    Ignored:    data/orthofinder/Polyneoptera/Results_I5/.DS_Store
    Ignored:    data/orthofinder/Polyneoptera/Results_I5/Orthogroups/.DS_Store
    Ignored:    data/orthofinder/Schistocerca/.DS_Store
    Ignored:    data/orthofinder/Schistocerca/Results_I2/.DS_Store
    Ignored:    data/orthofinder/Schistocerca/Results_I2/Orthogroups/.DS_Store
    Ignored:    data/orthofinder/Schistocerca/Results_I5/.DS_Store
    Ignored:    data/orthofinder/Schistocerca/Results_I5/Orthogroups/.DS_Store
    Ignored:    data/overlap/.DS_Store
    Ignored:    data/overlap/Bulk_RNAseq/
    Ignored:    data/readcounts/.DS_Store

Untracked files:
    Untracked:  RNAi/
    Untracked:  VennDiagram.2025-02-11_16-28-58.631802.log
    Untracked:  VennDiagram.2025-02-11_16-28-59.434349.log
    Untracked:  VennDiagram.2025-02-11_16-29-00.146234.log
    Untracked:  VennDiagram.2025-02-11_16-29-00.755715.log
    Untracked:  VennDiagram.2025-02-11_16-29-01.439233.log
    Untracked:  VennDiagram.2025-02-11_16-29-02.031421.log
    Untracked:  VennDiagram.2025-02-11_16-29-03.067761.log
    Untracked:  VennDiagram.2025-02-11_16-29-03.247269.log
    Untracked:  VennDiagram.2025-02-11_16-29-03.343314.log
    Untracked:  VennDiagram.2025-02-11_16-29-04.412786.log
    Untracked:  VennDiagram.2025-02-11_16-29-04.558659.log
    Untracked:  VennDiagram.2025-02-11_16-29-04.763436.log
    Untracked:  VennDiagram.2025-02-11_16-29-06.044627.log
    Untracked:  VennDiagram.2025-02-11_16-29-06.091697.log
    Untracked:  VennDiagram.2025-02-11_16-29-06.167874.log
    Untracked:  VennDiagram.2025-02-11_16-29-07.042968.log
    Untracked:  VennDiagram.2025-02-11_16-29-07.087622.log
    Untracked:  VennDiagram.2025-02-11_16-29-07.241275.log
    Untracked:  VennDiagram.2025-02-11_16-29-08.443598.log
    Untracked:  VennDiagram.2025-02-11_16-29-08.622287.log
    Untracked:  VennDiagram.2025-02-11_16-29-08.732962.log
    Untracked:  VennDiagram.2025-02-11_16-29-09.846772.log
    Untracked:  VennDiagram.2025-02-11_16-29-10.049437.log
    Untracked:  VennDiagram.2025-02-11_16-29-10.235192.log
    Untracked:  VennDiagram.2025-02-11_16-29-11.696463.log
    Untracked:  VennDiagram.2025-02-11_16-29-11.783422.log
    Untracked:  VennDiagram.2025-02-11_16-29-11.956005.log
    Untracked:  VennDiagram.2025-02-11_16-29-12.428473.log
    Untracked:  VennDiagram.2025-02-11_16-29-12.60033.log
    Untracked:  VennDiagram.2025-02-11_16-29-12.769138.log
    Untracked:  VennDiagram.2025-02-11_16-29-20.169018.log
    Untracked:  VennDiagram.2025-02-11_16-29-20.806329.log
    Untracked:  VennDiagram.2025-02-11_16-29-21.487985.log
    Untracked:  VennDiagram.2025-02-11_16-29-22.240217.log
    Untracked:  VennDiagram.2025-02-11_16-29-22.878057.log
    Untracked:  VennDiagram.2025-02-11_16-29-23.558018.log
    Untracked:  VennDiagram.2025-02-11_16-29-27.032137.log
    Untracked:  VennDiagram.2025-02-11_16-30-30.327967.log
    Untracked:  VennDiagram.2025-02-11_16-31-37.071483.log
    Untracked:  VennDiagram.2025-02-11_16-32-27.707777.log
    Untracked:  VennDiagram.2025-02-11_16-32-28.409819.log
    Untracked:  VennDiagram.2025-02-11_16-32-29.136093.log
    Untracked:  VennDiagram.2025-02-11_16-32-29.743477.log
    Untracked:  VennDiagram.2025-02-11_16-32-30.444722.log
    Untracked:  VennDiagram.2025-02-11_16-32-31.051898.log
    Untracked:  VennDiagram.2025-02-11_16-32-32.099499.log
    Untracked:  VennDiagram.2025-02-11_16-32-32.27424.log
    Untracked:  VennDiagram.2025-02-11_16-32-32.371268.log
    Untracked:  VennDiagram.2025-02-11_16-32-33.451101.log
    Untracked:  VennDiagram.2025-02-11_16-32-33.582738.log
    Untracked:  VennDiagram.2025-02-11_16-32-33.795264.log
    Untracked:  VennDiagram.2025-02-11_16-32-35.041248.log
    Untracked:  VennDiagram.2025-02-11_16-32-35.087599.log
    Untracked:  VennDiagram.2025-02-11_16-32-35.170429.log
    Untracked:  VennDiagram.2025-02-11_16-32-36.079622.log
    Untracked:  VennDiagram.2025-02-11_16-32-36.125552.log
    Untracked:  VennDiagram.2025-02-11_16-32-36.285186.log
    Untracked:  VennDiagram.2025-02-11_16-32-37.47559.log
    Untracked:  VennDiagram.2025-02-11_16-32-37.657153.log
    Untracked:  VennDiagram.2025-02-11_16-32-37.778685.log
    Untracked:  VennDiagram.2025-02-11_16-32-38.882006.log
    Untracked:  VennDiagram.2025-02-11_16-32-39.068153.log
    Untracked:  VennDiagram.2025-02-11_16-32-39.237383.log
    Untracked:  VennDiagram.2025-02-11_16-32-40.575343.log
    Untracked:  VennDiagram.2025-02-11_16-32-40.661657.log
    Untracked:  VennDiagram.2025-02-11_16-32-40.826102.log
    Untracked:  VennDiagram.2025-02-11_16-32-41.272571.log
    Untracked:  VennDiagram.2025-02-11_16-32-41.439814.log
    Untracked:  VennDiagram.2025-02-11_16-32-41.605129.log
    Untracked:  VennDiagram.2025-02-11_16-32-48.832096.log
    Untracked:  VennDiagram.2025-02-11_16-32-49.507615.log
    Untracked:  VennDiagram.2025-02-11_16-32-50.128269.log
    Untracked:  VennDiagram.2025-02-11_16-32-50.806022.log
    Untracked:  VennDiagram.2025-02-11_16-32-51.616252.log
    Untracked:  VennDiagram.2025-02-11_16-32-52.268723.log
    Untracked:  VennDiagram.2025-02-11_16-32-55.576606.log
    Untracked:  VennDiagram.2025-02-11_16-34-47.60746.log
    Untracked:  VennDiagram.2025-02-11_16-34-48.288101.log
    Untracked:  VennDiagram.2025-02-11_16-34-49.151053.log
    Untracked:  VennDiagram.2025-02-11_16-34-49.769407.log
    Untracked:  VennDiagram.2025-02-11_16-34-50.371225.log
    Untracked:  VennDiagram.2025-02-11_16-34-51.022607.log
    Untracked:  VennDiagram.2025-02-11_16-34-51.672735.log
    Untracked:  VennDiagram.2025-02-11_16-34-51.827487.log
    Untracked:  VennDiagram.2025-02-11_16-34-51.980743.log
    Untracked:  VennDiagram.2025-02-11_16-34-53.203995.log
    Untracked:  VennDiagram.2025-02-11_16-34-53.336765.log
    Untracked:  VennDiagram.2025-02-11_16-34-53.422336.log
    Untracked:  VennDiagram.2025-02-11_16-34-54.465802.log
    Untracked:  VennDiagram.2025-02-11_16-34-54.571537.log
    Untracked:  VennDiagram.2025-02-11_16-34-54.709321.log
    Untracked:  VennDiagram.2025-02-11_16-34-55.825915.log
    Untracked:  VennDiagram.2025-02-11_16-34-55.875009.log
    Untracked:  VennDiagram.2025-02-11_16-34-56.016672.log
    Untracked:  VennDiagram.2025-02-11_16-34-56.996992.log
    Untracked:  VennDiagram.2025-02-11_16-34-57.103292.log
    Untracked:  VennDiagram.2025-02-11_16-34-57.272784.log
    Untracked:  VennDiagram.2025-02-11_16-34-58.69873.log
    Untracked:  VennDiagram.2025-02-11_16-34-58.885516.log
    Untracked:  VennDiagram.2025-02-11_16-34-58.999791.log
    Untracked:  VennDiagram.2025-02-11_16-35-00.17981.log
    Untracked:  VennDiagram.2025-02-11_16-35-00.362745.log
    Untracked:  VennDiagram.2025-02-11_16-35-00.460432.log
    Untracked:  VennDiagram.2025-02-11_16-35-00.8397.log
    Untracked:  VennDiagram.2025-02-11_16-35-01.00865.log
    Untracked:  VennDiagram.2025-02-11_16-35-01.166973.log
    Untracked:  VennDiagram.2025-02-11_16-35-08.195904.log
    Untracked:  VennDiagram.2025-02-11_16-35-08.818023.log
    Untracked:  VennDiagram.2025-02-11_16-35-09.438261.log
    Untracked:  VennDiagram.2025-02-11_16-35-10.059038.log
    Untracked:  VennDiagram.2025-02-11_16-35-10.751324.log
    Untracked:  VennDiagram.2025-02-11_16-35-11.439151.log
    Untracked:  VennDiagram.2025-02-11_16-35-14.857981.log
    Untracked:  VennDiagram.2025-02-11_16-35-14.918923.log
    Untracked:  VennDiagram.2025-02-11_16-35-15.040173.log
    Untracked:  VennDiagram.2025-02-11_16-35-20.631505.log
    Untracked:  VennDiagram.2025-02-11_16-35-20.692026.log
    Untracked:  VennDiagram.2025-02-11_16-35-20.820498.log
    Untracked:  VennDiagram.2025-02-11_16-35-26.599452.log
    Untracked:  VennDiagram.2025-02-11_16-35-26.656089.log
    Untracked:  VennDiagram.2025-02-11_16-35-26.726063.log
    Untracked:  VennDiagram.2025-02-11_16-35-32.472167.log
    Untracked:  analysis/3_compiling_tables.Rmd
    Untracked:  analysis/4_RNAi_degs.Rmd
    Untracked:  data/DEG_results/
    Untracked:  data/WGCNA_input/Bulk_RNAseq/
    Untracked:  data/list/Bulk_RNAseq/
    Untracked:  data/list/RNAi/
    Untracked:  data/orthofinder/Polyneoptera/Results_I5/Orthogroups/Orthogroups.tsv
    Untracked:  data/orthofinder/Polyneoptera/Results_I5/Orthogroups/Orthogroups.txt
    Untracked:  data/orthofinder/Polyneoptera/Results_I5/Orthogroups/Orthogroups_reprocessed.tsv
    Untracked:  data/orthofinder/Polyneoptera/Results_I5/Orthogroups/Orthogroups_reprocessed.txt
    Untracked:  data/orthofinder/Schistocerca/Results_I2/Orthogroups/Orthogroups.tsv
    Untracked:  data/orthofinder/Schistocerca/Results_I2/Orthogroups/Orthogroups.txt
    Untracked:  data/orthofinder/Schistocerca/Results_I2/Orthogroups/Orthogroups_reprocessed.tsv
    Untracked:  data/orthofinder/Schistocerca/Results_I2/Orthogroups/Orthogroups_reprocessed.txt
    Untracked:  data/overlap/summaryPolyneoptera_DEGs_Orthogroups_Feb2025.csv
    Untracked:  data/overlap/summaryPolyneoptera_DEGs_Orthogroups_togregaria_Feb2025.csv
    Untracked:  data/overlap/summarySchistocerca_DEGs_Orthogroups_Feb2025.csv
    Untracked:  data/overlap/summarySchistocerca_DEGs_Orthogroups_togregaria_Feb2025.csv
    Untracked:  data/readcounts/Bulk_RNAseq/
    Untracked:  data/readcounts/RNAi/
    Untracked:  eggnog_jan24_bellini/

Unstaged changes:
    Modified:   analysis/2_orthologs-prediction.Rmd
    Modified:   analysis/3_deseq2-results.Rmd
    Modified:   analysis/3_go-enrichment.Rmd
    Modified:   analysis/3_overlap-venn.Rmd
    Modified:   analysis/_site.yml
    Deleted:    data/DEG-results/DESeq2_results_Head_americana.csv
    Deleted:    data/DEG-results/DESeq2_results_Head_cancellata.csv
    Deleted:    data/DEG-results/DESeq2_results_Head_cubense.csv
    Deleted:    data/DEG-results/DESeq2_results_Head_gregaria.csv
    Deleted:    data/DEG-results/DESeq2_results_Head_nitens.csv
    Deleted:    data/DEG-results/DESeq2_results_Head_piceifrons.csv
    Deleted:    data/DEG-results/DESeq2_results_Head_togregaria_americana.csv
    Deleted:    data/DEG-results/DESeq2_results_Head_togregaria_cancellata.csv
    Deleted:    data/DEG-results/DESeq2_results_Head_togregaria_cubense.csv
    Deleted:    data/DEG-results/DESeq2_results_Head_togregaria_gregaria.csv
    Deleted:    data/DEG-results/DESeq2_results_Head_togregaria_nitens.csv
    Deleted:    data/DEG-results/DESeq2_results_Head_togregaria_piceifrons.csv
    Deleted:    data/DEG-results/DESeq2_results_Thorax_americana.csv
    Deleted:    data/DEG-results/DESeq2_results_Thorax_cancellata.csv
    Deleted:    data/DEG-results/DESeq2_results_Thorax_cubense.csv
    Deleted:    data/DEG-results/DESeq2_results_Thorax_gregaria.csv
    Deleted:    data/DEG-results/DESeq2_results_Thorax_nitens.csv
    Deleted:    data/DEG-results/DESeq2_results_Thorax_piceifrons.csv
    Deleted:    data/DEG-results/DESeq2_results_Thorax_togregaria_americana.csv
    Deleted:    data/DEG-results/DESeq2_results_Thorax_togregaria_cancellata.csv
    Deleted:    data/DEG-results/DESeq2_results_Thorax_togregaria_cubense.csv
    Deleted:    data/DEG-results/DESeq2_results_Thorax_togregaria_gregaria.csv
    Deleted:    data/DEG-results/DESeq2_results_Thorax_togregaria_nitens.csv
    Deleted:    data/DEG-results/DESeq2_results_Thorax_togregaria_piceifrons.csv
    Deleted:    data/DEG-results/DESeq2_sigresults_Head_togregaria_americana.csv
    Deleted:    data/DEG-results/DESeq2_sigresults_Head_togregaria_cancellata.csv
    Deleted:    data/DEG-results/DESeq2_sigresults_Head_togregaria_cubense.csv
    Deleted:    data/DEG-results/DESeq2_sigresults_Head_togregaria_gregaria.csv
    Deleted:    data/DEG-results/DESeq2_sigresults_Head_togregaria_nitens.csv
    Deleted:    data/DEG-results/DESeq2_sigresults_Head_togregaria_piceifrons.csv
    Deleted:    data/DEG-results/DESeq2_sigresults_Thorax_americana.csv
    Deleted:    data/DEG-results/DESeq2_sigresults_Thorax_cancellata.csv
    Deleted:    data/DEG-results/DESeq2_sigresults_Thorax_cubense.csv
    Deleted:    data/DEG-results/DESeq2_sigresults_Thorax_gregaria.csv
    Deleted:    data/DEG-results/DESeq2_sigresults_Thorax_nitens.csv
    Deleted:    data/DEG-results/DESeq2_sigresults_Thorax_piceifrons.csv
    Deleted:    data/DEG-results/DESeq2_sigresults_Thorax_togregaria_americana.csv
    Deleted:    data/DEG-results/DESeq2_sigresults_Thorax_togregaria_cancellata.csv
    Deleted:    data/DEG-results/DESeq2_sigresults_Thorax_togregaria_cubense.csv
    Deleted:    data/DEG-results/DESeq2_sigresults_Thorax_togregaria_gregaria.csv
    Deleted:    data/DEG-results/DESeq2_sigresults_Thorax_togregaria_nitens.csv
    Deleted:    data/DEG-results/DESeq2_sigresults_Thorax_togregaria_piceifrons.csv
    Deleted:    data/DEG-results/DESeq2_sigresults_head_americana.csv
    Deleted:    data/DEG-results/DESeq2_sigresults_head_cancellata.csv
    Deleted:    data/DEG-results/DESeq2_sigresults_head_cubense.csv
    Deleted:    data/DEG-results/DESeq2_sigresults_head_gregaria.csv
    Deleted:    data/DEG-results/DESeq2_sigresults_head_nitens.csv
    Deleted:    data/DEG-results/DESeq2_sigresults_head_piceifrons.csv
    Deleted:    data/DEG-results/GO10_enrichment_Head_americana_custom.csv
    Deleted:    data/DEG-results/GO10_enrichment_Head_cancellata_custom.csv
    Deleted:    data/DEG-results/GO10_enrichment_Head_cubense_custom.csv
    Deleted:    data/DEG-results/GO10_enrichment_Head_gregaria_custom.csv
    Deleted:    data/DEG-results/GO10_enrichment_Head_nitens_custom.csv
    Deleted:    data/DEG-results/GO10_enrichment_Head_piceifrons_custom.csv
    Deleted:    data/DEG-results/GO10_enrichment_Thorax_americana_custom.csv
    Deleted:    data/DEG-results/GO10_enrichment_Thorax_cancellata_custom.csv
    Deleted:    data/DEG-results/GO10_enrichment_Thorax_cubense_custom.csv
    Deleted:    data/DEG-results/GO10_enrichment_Thorax_gregaria_custom.csv
    Deleted:    data/DEG-results/GO10_enrichment_Thorax_nitens_custom.csv
    Deleted:    data/DEG-results/GO10_enrichment_Thorax_piceifrons_custom.csv
    Deleted:    data/DEG-results/GO30_enrichment_Head_americana_custom.csv
    Deleted:    data/DEG-results/GO30_enrichment_Head_cancellata_custom.csv
    Deleted:    data/DEG-results/GO30_enrichment_Head_cubense_custom.csv
    Deleted:    data/DEG-results/GO30_enrichment_Head_gregaria_custom.csv
    Deleted:    data/DEG-results/GO30_enrichment_Head_nitens_custom.csv
    Deleted:    data/DEG-results/GO30_enrichment_Head_piceifrons_custom.csv
    Deleted:    data/DEG-results/GO30_enrichment_Thorax_americana_custom.csv
    Deleted:    data/DEG-results/GO30_enrichment_Thorax_cancellata_custom.csv
    Deleted:    data/DEG-results/GO30_enrichment_Thorax_cubense_custom.csv
    Deleted:    data/DEG-results/GO30_enrichment_Thorax_gregaria_custom.csv
    Deleted:    data/DEG-results/GO30_enrichment_Thorax_nitens_custom.csv
    Deleted:    data/DEG-results/GO30_enrichment_Thorax_piceifrons_custom.csv
    Deleted:    data/DEG-results/GO_enrichment_head_americana_custom.csv
    Deleted:    data/DEG-results/GO_enrichment_head_cancellata_custom.csv
    Deleted:    data/DEG-results/GO_enrichment_head_cubense_custom.csv
    Deleted:    data/DEG-results/GO_enrichment_head_gregaria_custom.csv
    Deleted:    data/DEG-results/GO_enrichment_head_gregaria_custom_top10.csv
    Deleted:    data/DEG-results/GO_enrichment_head_nitens_custom.csv
    Deleted:    data/DEG-results/GO_enrichment_head_piceifrons_custom.csv
    Deleted:    data/DEG-results/overlapping_genes_head_thorax_americana.csv
    Deleted:    data/DEG-results/overlapping_genes_head_thorax_cancellata.csv
    Deleted:    data/DEG-results/overlapping_genes_head_thorax_cubense.csv
    Deleted:    data/DEG-results/overlapping_genes_head_thorax_gregaria.csv
    Deleted:    data/DEG-results/overlapping_genes_head_thorax_nitens.csv
    Deleted:    data/DEG-results/overlapping_genes_head_thorax_piceifrons.csv
    Deleted:    data/DEG-results/scatter_plot_overlapping_genes_americana.png
    Deleted:    data/DEG-results/scatter_plot_overlapping_genes_cancellata.png
    Deleted:    data/DEG-results/scatter_plot_overlapping_genes_cubense.png
    Deleted:    data/DEG-results/scatter_plot_overlapping_genes_gregaria.png
    Deleted:    data/DEG-results/scatter_plot_overlapping_genes_nitens.png
    Deleted:    data/DEG-results/scatter_plot_overlapping_genes_piceifrons.png
    Deleted:    data/list/Head_americana_WGCNA.txt
    Deleted:    data/list/Head_americana_nooutliers.txt
    Deleted:    data/list/Head_cancellata_WGCNA.txt
    Deleted:    data/list/Head_cancellata_nooutliers.txt
    Deleted:    data/list/Head_cubense_WGCNA.txt
    Deleted:    data/list/Head_cubense_nooutliers.txt
    Deleted:    data/list/Head_gregaria.txt
    Deleted:    data/list/Head_gregaria_WGCNA.txt
    Deleted:    data/list/Head_nitens_WGCNA.txt
    Deleted:    data/list/Head_nitens_nooutliers.txt
    Deleted:    data/list/Head_piceifrons_WGCNA.txt
    Deleted:    data/list/Head_piceifrons_nooutliers.txt
    Deleted:    data/list/Thorax_americana_WGCNA.txt
    Deleted:    data/list/Thorax_americana_nooutliers.txt
    Deleted:    data/list/Thorax_cancellata_WGCNA.txt
    Deleted:    data/list/Thorax_cancellata_nooutliers.txt
    Deleted:    data/list/Thorax_cubense_WGCNA.txt
    Deleted:    data/list/Thorax_cubense_nooutliers.txt
    Deleted:    data/list/Thorax_gregaria_WGCNA.txt
    Deleted:    data/list/Thorax_gregaria_nooutliers.txt
    Deleted:    data/list/Thorax_nitens_WGCNA.txt
    Deleted:    data/list/Thorax_nitens_nooutliers.txt
    Deleted:    data/list/Thorax_piceifrons.txt
    Deleted:    data/list/Thorax_piceifrons_WGCNA.txt
    Modified:   data/orthofinder/Polyneoptera/Results_I2/Orthogroups_13species_Jan2025.txt
    Modified:   data/orthofinder/Polyneoptera/Results_I2/Orthogroups_genesprotein_Schisto_Jan2025.txt
    Modified:   data/orthofinder/Polyneoptera/Results_I2/Orthogroups_genesproteinbiotype_13species_Jan2025.csv
    Modified:   data/orthofinder/Polyneoptera/Results_I2/Plots_Polyneoptera/VerticalStackedBar_A. simplex.pdf
    Modified:   data/orthofinder/Polyneoptera/Results_I2/Plots_Polyneoptera/VerticalStackedBar_B. rossius.pdf
    Modified:   data/orthofinder/Polyneoptera/Results_I2/Plots_Polyneoptera/VerticalStackedBar_C. secundus.pdf
    Modified:   data/orthofinder/Polyneoptera/Results_I2/Plots_Polyneoptera/VerticalStackedBar_D. australis.pdf
    Modified:   data/orthofinder/Polyneoptera/Results_I2/Plots_Polyneoptera/VerticalStackedBar_G. bimaculatus.pdf
    Modified:   data/orthofinder/Polyneoptera/Results_I2/Plots_Polyneoptera/VerticalStackedBar_G. longicornis.pdf
    Modified:   data/orthofinder/Polyneoptera/Results_I2/Plots_Polyneoptera/VerticalStackedBar_P. americana.pdf
    Modified:   data/orthofinder/Polyneoptera/Results_I2/Plots_Polyneoptera/VerticalStackedBar_americana.pdf
    Modified:   data/orthofinder/Polyneoptera/Results_I2/Plots_Polyneoptera/VerticalStackedBar_cancellata.pdf
    Modified:   data/orthofinder/Polyneoptera/Results_I2/Plots_Polyneoptera/VerticalStackedBar_cubense.pdf
    Modified:   data/orthofinder/Polyneoptera/Results_I2/Plots_Polyneoptera/VerticalStackedBar_gregaria.pdf
    Modified:   data/orthofinder/Polyneoptera/Results_I2/Plots_Polyneoptera/VerticalStackedBar_nitens.pdf
    Modified:   data/orthofinder/Polyneoptera/Results_I2/Plots_Polyneoptera/VerticalStackedBar_piceifrons.pdf
    Modified:   data/orthofinder/Polyneoptera/Results_I2/SingleCopyOrthogroups_genesprotein_13species_Jan2025.txt
    Modified:   data/orthofinder/Schistocerca/Results_I2/Orthogroups_Schistocerca_Jan2025.txt
    Modified:   data/orthofinder/Schistocerca/Results_I2/Orthogroups_genesproteinbiotype_Schistocerca_Jan2025.csv
    Modified:   data/orthofinder/Schistocerca/Results_I2/Plots_Schistocerca/VerticalStackedBar_americana.pdf
    Modified:   data/orthofinder/Schistocerca/Results_I2/Plots_Schistocerca/VerticalStackedBar_cancellata.pdf
    Modified:   data/orthofinder/Schistocerca/Results_I2/Plots_Schistocerca/VerticalStackedBar_cubense.pdf
    Modified:   data/orthofinder/Schistocerca/Results_I2/Plots_Schistocerca/VerticalStackedBar_gregaria.pdf
    Modified:   data/orthofinder/Schistocerca/Results_I2/Plots_Schistocerca/VerticalStackedBar_nitens.pdf
    Modified:   data/orthofinder/Schistocerca/Results_I2/Plots_Schistocerca/VerticalStackedBar_piceifrons.pdf
    Modified:   data/orthofinder/Schistocerca/Results_I2/SingleCopyOrthogroups_genesprotein_6species_Jan2025.txt
    Deleted:    data/overlap/summary_DEGs_Orthogroups_togregaria.csv
    Deleted:    data/readcounts/03-americana-DESeq2-togregaria/SAMER_G_Crd_SRR11815241_featurecounts.txt
    Deleted:    data/readcounts/03-americana-DESeq2-togregaria/SAMER_G_Crd_SRR11815242_featurecounts.txt
    Deleted:    data/readcounts/03-americana-DESeq2-togregaria/SAMER_G_Crd_SRR11815243_featurecounts.txt
    Deleted:    data/readcounts/03-americana-DESeq2-togregaria/SAMER_G_Crd_SRR11815244_featurecounts.txt
    Deleted:    data/readcounts/03-americana-DESeq2-togregaria/SAMER_G_Crd_SRR11815245_featurecounts.txt
    Deleted:    data/readcounts/03-americana-DESeq2-togregaria/SAMER_G_Crd_SRR11815246_featurecounts.txt
    Deleted:    data/readcounts/03-americana-DESeq2-togregaria/SAMER_G_Crd_SRR11815247_featurecounts.txt
    Deleted:    data/readcounts/03-americana-DESeq2-togregaria/SAMER_G_Crd_SRR11815248_featurecounts.txt
    Deleted:    data/readcounts/03-americana-DESeq2-togregaria/SAMER_G_Crd_SRR11815249_featurecounts.txt
    Deleted:    data/readcounts/03-americana-DESeq2-togregaria/SAMER_G_Crd_SRR11815250_featurecounts.txt
    Deleted:    data/readcounts/03-americana-DESeq2-togregaria/SAMER_S_Iso_SRR11815252_featurecounts.txt
    Deleted:    data/readcounts/03-americana-DESeq2-togregaria/SAMER_S_Iso_SRR11815253_featurecounts.txt
    Deleted:    data/readcounts/03-americana-DESeq2-togregaria/SAMER_S_Iso_SRR11815254_featurecounts.txt
    Deleted:    data/readcounts/03-americana-DESeq2-togregaria/SAMER_S_Iso_SRR11815255_featurecounts.txt
    Deleted:    data/readcounts/03-americana-DESeq2-togregaria/SAMER_S_Iso_SRR11815256_featurecounts.txt
    Deleted:    data/readcounts/03-americana-DESeq2-togregaria/SAMER_S_Iso_SRR11815257_featurecounts.txt
    Deleted:    data/readcounts/03-americana-DESeq2-togregaria/SAMER_S_Iso_SRR11815258_featurecounts.txt
    Deleted:    data/readcounts/03-americana-DESeq2-togregaria/SAMER_S_Iso_SRR11815259_featurecounts.txt
    Deleted:    data/readcounts/03-americana-DESeq2-togregaria/SAMER_S_Iso_SRR11815260_featurecounts.txt
    Deleted:    data/readcounts/03-americana-DESeq2-togregaria/SAMER_S_Iso_SRR11815263_featurecounts.txt
    Deleted:    data/readcounts/03-americana-DESeq2/SAMER_G_Crd_SRR11815241_counts.txt
    Deleted:    data/readcounts/03-americana-DESeq2/SAMER_G_Crd_SRR11815241_featurecounts.txt
    Deleted:    data/readcounts/03-americana-DESeq2/SAMER_G_Crd_SRR11815242_counts.txt
    Deleted:    data/readcounts/03-americana-DESeq2/SAMER_G_Crd_SRR11815242_featurecounts.txt
    Deleted:    data/readcounts/03-americana-DESeq2/SAMER_G_Crd_SRR11815243_counts.txt
    Deleted:    data/readcounts/03-americana-DESeq2/SAMER_G_Crd_SRR11815243_featurecounts.txt
    Deleted:    data/readcounts/03-americana-DESeq2/SAMER_G_Crd_SRR11815244_counts.txt
    Deleted:    data/readcounts/03-americana-DESeq2/SAMER_G_Crd_SRR11815244_featurecounts.txt
    Deleted:    data/readcounts/03-americana-DESeq2/SAMER_G_Crd_SRR11815245_counts.txt
    Deleted:    data/readcounts/03-americana-DESeq2/SAMER_G_Crd_SRR11815245_featurecounts.txt
    Deleted:    data/readcounts/03-americana-DESeq2/SAMER_G_Crd_SRR11815246_counts.txt
    Deleted:    data/readcounts/03-americana-DESeq2/SAMER_G_Crd_SRR11815246_featurecounts.txt
    Deleted:    data/readcounts/03-americana-DESeq2/SAMER_G_Crd_SRR11815247_counts.txt
    Deleted:    data/readcounts/03-americana-DESeq2/SAMER_G_Crd_SRR11815247_featurecounts.txt
    Deleted:    data/readcounts/03-americana-DESeq2/SAMER_G_Crd_SRR11815248_counts.txt
    Deleted:    data/readcounts/03-americana-DESeq2/SAMER_G_Crd_SRR11815248_featurecounts.txt
    Deleted:    data/readcounts/03-americana-DESeq2/SAMER_G_Crd_SRR11815249_counts.txt
    Deleted:    data/readcounts/03-americana-DESeq2/SAMER_G_Crd_SRR11815249_featurecounts.txt
    Deleted:    data/readcounts/03-americana-DESeq2/SAMER_G_Crd_SRR11815250_counts.txt
    Deleted:    data/readcounts/03-americana-DESeq2/SAMER_G_Crd_SRR11815250_featurecounts.txt
    Deleted:    data/readcounts/03-americana-DESeq2/SAMER_S_Iso_SRR11815252_counts.txt
    Deleted:    data/readcounts/03-americana-DESeq2/SAMER_S_Iso_SRR11815252_featurecounts.txt
    Deleted:    data/readcounts/03-americana-DESeq2/SAMER_S_Iso_SRR11815253_counts.txt
    Deleted:    data/readcounts/03-americana-DESeq2/SAMER_S_Iso_SRR11815253_featurecounts.txt
    Deleted:    data/readcounts/03-americana-DESeq2/SAMER_S_Iso_SRR11815254_counts.txt
    Deleted:    data/readcounts/03-americana-DESeq2/SAMER_S_Iso_SRR11815254_featurecounts.txt
    Deleted:    data/readcounts/03-americana-DESeq2/SAMER_S_Iso_SRR11815255_counts.txt
    Deleted:    data/readcounts/03-americana-DESeq2/SAMER_S_Iso_SRR11815255_featurecounts.txt
    Deleted:    data/readcounts/03-americana-DESeq2/SAMER_S_Iso_SRR11815256_counts.txt
    Deleted:    data/readcounts/03-americana-DESeq2/SAMER_S_Iso_SRR11815256_featurecounts.txt
    Deleted:    data/readcounts/03-americana-DESeq2/SAMER_S_Iso_SRR11815257_counts.txt
    Deleted:    data/readcounts/03-americana-DESeq2/SAMER_S_Iso_SRR11815257_featurecounts.txt
    Deleted:    data/readcounts/03-americana-DESeq2/SAMER_S_Iso_SRR11815258_counts.txt
    Deleted:    data/readcounts/03-americana-DESeq2/SAMER_S_Iso_SRR11815258_featurecounts.txt
    Deleted:    data/readcounts/03-americana-DESeq2/SAMER_S_Iso_SRR11815259_counts.txt
    Deleted:    data/readcounts/03-americana-DESeq2/SAMER_S_Iso_SRR11815259_featurecounts.txt
    Deleted:    data/readcounts/03-americana-DESeq2/SAMER_S_Iso_SRR11815260_counts.txt
    Deleted:    data/readcounts/03-americana-DESeq2/SAMER_S_Iso_SRR11815260_featurecounts.txt
    Deleted:    data/readcounts/03-americana-DESeq2/SAMER_S_Iso_SRR11815263_counts.txt
    Deleted:    data/readcounts/03-americana-DESeq2/SAMER_S_Iso_SRR11815263_featurecounts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2-togregaria/SCANC_G_Crd_SRR17648042_featurecounts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2-togregaria/SCANC_G_Crd_SRR17648043_featurecounts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2-togregaria/SCANC_G_Crd_SRR17648048_featurecounts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2-togregaria/SCANC_G_Crd_SRR17648049_featurecounts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2-togregaria/SCANC_G_Crd_SRR17648056_featurecounts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2-togregaria/SCANC_G_Crd_SRR17648057_featurecounts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2-togregaria/SCANC_G_Crd_SRR17648058_featurecounts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2-togregaria/SCANC_G_Crd_SRR17648059_featurecounts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2-togregaria/SCANC_G_Crd_SRR17648060_featurecounts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2-togregaria/SCANC_G_Crd_SRR17648061_featurecounts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2-togregaria/SCANC_S_Iso_SRR17648044_featurecounts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2-togregaria/SCANC_S_Iso_SRR17648045_featurecounts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2-togregaria/SCANC_S_Iso_SRR17648046_featurecounts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2-togregaria/SCANC_S_Iso_SRR17648047_featurecounts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2-togregaria/SCANC_S_Iso_SRR17648050_featurecounts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2-togregaria/SCANC_S_Iso_SRR17648051_featurecounts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2-togregaria/SCANC_S_Iso_SRR17648052_featurecounts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2-togregaria/SCANC_S_Iso_SRR17648053_featurecounts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2-togregaria/SCANC_S_Iso_SRR17648054_featurecounts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2-togregaria/SCANC_S_Iso_SRR17648055_featurecounts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2/SCANC_G_Crd_SRR17648042_counts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2/SCANC_G_Crd_SRR17648042_featurecounts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2/SCANC_G_Crd_SRR17648043_counts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2/SCANC_G_Crd_SRR17648043_featurecounts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2/SCANC_G_Crd_SRR17648048_counts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2/SCANC_G_Crd_SRR17648048_featurecounts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2/SCANC_G_Crd_SRR17648049_counts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2/SCANC_G_Crd_SRR17648049_featurecounts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2/SCANC_G_Crd_SRR17648056_counts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2/SCANC_G_Crd_SRR17648056_featurecounts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2/SCANC_G_Crd_SRR17648057_counts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2/SCANC_G_Crd_SRR17648057_featurecounts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2/SCANC_G_Crd_SRR17648058_counts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2/SCANC_G_Crd_SRR17648058_featurecounts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2/SCANC_G_Crd_SRR17648059_counts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2/SCANC_G_Crd_SRR17648059_featurecounts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2/SCANC_G_Crd_SRR17648060_counts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2/SCANC_G_Crd_SRR17648060_featurecounts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2/SCANC_G_Crd_SRR17648061_counts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2/SCANC_G_Crd_SRR17648061_featurecounts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2/SCANC_S_Iso_SRR17648044_counts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2/SCANC_S_Iso_SRR17648044_featurecounts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2/SCANC_S_Iso_SRR17648045_counts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2/SCANC_S_Iso_SRR17648045_featurecounts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2/SCANC_S_Iso_SRR17648046_counts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2/SCANC_S_Iso_SRR17648046_featurecounts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2/SCANC_S_Iso_SRR17648047_counts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2/SCANC_S_Iso_SRR17648047_featurecounts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2/SCANC_S_Iso_SRR17648050_counts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2/SCANC_S_Iso_SRR17648050_featurecounts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2/SCANC_S_Iso_SRR17648051_counts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2/SCANC_S_Iso_SRR17648051_featurecounts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2/SCANC_S_Iso_SRR17648052_counts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2/SCANC_S_Iso_SRR17648052_featurecounts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2/SCANC_S_Iso_SRR17648053_counts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2/SCANC_S_Iso_SRR17648053_featurecounts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2/SCANC_S_Iso_SRR17648054_counts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2/SCANC_S_Iso_SRR17648054_featurecounts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2/SCANC_S_Iso_SRR17648055_counts.txt
    Deleted:    data/readcounts/03-cancellata-DESeq2/SCANC_S_Iso_SRR17648055_featurecounts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2-togregaria/SSCUB_G_Crd_SRR11815219_featurecounts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2-togregaria/SSCUB_G_Crd_SRR11815220_featurecounts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2-togregaria/SSCUB_G_Crd_SRR11815221_featurecounts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2-togregaria/SSCUB_G_Crd_SRR11815222_featurecounts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2-togregaria/SSCUB_G_Crd_SRR11815223_featurecounts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2-togregaria/SSCUB_G_Crd_SRR11815224_featurecounts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2-togregaria/SSCUB_G_Crd_SRR11815225_featurecounts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2-togregaria/SSCUB_G_Crd_SRR11815226_featurecounts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2-togregaria/SSCUB_G_Crd_SRR11815227_featurecounts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2-togregaria/SSCUB_G_Crd_SRR11815228_featurecounts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2-togregaria/SSCUB_S_Iso_SRR11815230_featurecounts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2-togregaria/SSCUB_S_Iso_SRR11815231_featurecounts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2-togregaria/SSCUB_S_Iso_SRR11815232_featurecounts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2-togregaria/SSCUB_S_Iso_SRR11815233_featurecounts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2-togregaria/SSCUB_S_Iso_SRR11815234_featurecounts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2-togregaria/SSCUB_S_Iso_SRR11815235_featurecounts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2-togregaria/SSCUB_S_Iso_SRR11815236_featurecounts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2-togregaria/SSCUB_S_Iso_SRR11815237_featurecounts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2-togregaria/SSCUB_S_Iso_SRR11815238_featurecounts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2-togregaria/SSCUB_S_Iso_SRR11815239_featurecounts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2/SSCUB_G_Crd_SRR11815219_counts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2/SSCUB_G_Crd_SRR11815219_featurecounts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2/SSCUB_G_Crd_SRR11815220_counts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2/SSCUB_G_Crd_SRR11815220_featurecounts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2/SSCUB_G_Crd_SRR11815221_counts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2/SSCUB_G_Crd_SRR11815221_featurecounts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2/SSCUB_G_Crd_SRR11815222_counts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2/SSCUB_G_Crd_SRR11815222_featurecounts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2/SSCUB_G_Crd_SRR11815223_counts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2/SSCUB_G_Crd_SRR11815223_featurecounts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2/SSCUB_G_Crd_SRR11815224_counts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2/SSCUB_G_Crd_SRR11815224_featurecounts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2/SSCUB_G_Crd_SRR11815225_counts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2/SSCUB_G_Crd_SRR11815225_featurecounts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2/SSCUB_G_Crd_SRR11815226_counts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2/SSCUB_G_Crd_SRR11815226_featurecounts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2/SSCUB_G_Crd_SRR11815227_counts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2/SSCUB_G_Crd_SRR11815227_featurecounts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2/SSCUB_G_Crd_SRR11815228_counts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2/SSCUB_G_Crd_SRR11815228_featurecounts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2/SSCUB_S_Iso_SRR11815230_counts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2/SSCUB_S_Iso_SRR11815230_featurecounts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2/SSCUB_S_Iso_SRR11815231_counts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2/SSCUB_S_Iso_SRR11815231_featurecounts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2/SSCUB_S_Iso_SRR11815232_counts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2/SSCUB_S_Iso_SRR11815232_featurecounts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2/SSCUB_S_Iso_SRR11815233_counts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2/SSCUB_S_Iso_SRR11815233_featurecounts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2/SSCUB_S_Iso_SRR11815234_counts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2/SSCUB_S_Iso_SRR11815234_featurecounts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2/SSCUB_S_Iso_SRR11815235_counts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2/SSCUB_S_Iso_SRR11815235_featurecounts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2/SSCUB_S_Iso_SRR11815236_counts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2/SSCUB_S_Iso_SRR11815236_featurecounts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2/SSCUB_S_Iso_SRR11815237_counts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2/SSCUB_S_Iso_SRR11815237_featurecounts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2/SSCUB_S_Iso_SRR11815238_counts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2/SSCUB_S_Iso_SRR11815238_featurecounts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2/SSCUB_S_Iso_SRR11815239_counts.txt
    Deleted:    data/readcounts/03-cubense-DESeq2/SSCUB_S_Iso_SRR11815239_featurecounts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2-togregaria/SGRE-HEAD-CRD-1_MERGE_featurecounts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2-togregaria/SGRE-HEAD-CRD-2_MERGE_featurecounts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2-togregaria/SGRE-HEAD-CRD-3_MERGE_featurecounts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2-togregaria/SGRE-HEAD-CRD-4_MERGE_featurecounts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2-togregaria/SGRE-HEAD-CRD-5_MERGE_featurecounts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2-togregaria/SGRE-HEAD-CRD-6_MERGE_featurecounts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2-togregaria/SGRE-HEAD-ISO-1_MERGE_featurecounts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2-togregaria/SGRE-HEAD-ISO-2_MERGE_featurecounts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2-togregaria/SGRE-HEAD-ISO-3_MERGE_featurecounts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2-togregaria/SGRE-HEAD-ISO-4_MERGE_featurecounts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2-togregaria/SGRE-HEAD-ISO-5_MERGE_featurecounts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2-togregaria/SGRE-HEAD-ISO-6_MERGE_featurecounts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2-togregaria/SGRE-THOX-CRD-1_MERGE_featurecounts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2-togregaria/SGRE-THOX-CRD-2_MERGE_featurecounts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2-togregaria/SGRE-THOX-CRD-3_MERGE_featurecounts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2-togregaria/SGRE-THOX-CRD-4_MERGE_featurecounts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2-togregaria/SGRE-THOX-CRD-5_MERGE_featurecounts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2-togregaria/SGRE-THOX-CRD-6_MERGE_featurecounts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2-togregaria/SGRE-THOX-ISO-1_MERGE_featurecounts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2-togregaria/SGRE-THOX-ISO-2_MERGE_featurecounts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2-togregaria/SGRE-THOX-ISO-3_MERGE_featurecounts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2-togregaria/SGRE-THOX-ISO-4_MERGE_featurecounts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2-togregaria/SGRE-THOX-ISO-5_MERGE_featurecounts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2-togregaria/SGRE-THOX-ISO-6_MERGE_featurecounts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2/SGRE-HEAD-CRD-1_MERGE_counts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2/SGRE-HEAD-CRD-1_MERGE_featurecounts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2/SGRE-HEAD-CRD-2_MERGE_counts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2/SGRE-HEAD-CRD-2_MERGE_featurecounts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2/SGRE-HEAD-CRD-3_MERGE_counts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2/SGRE-HEAD-CRD-3_MERGE_featurecounts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2/SGRE-HEAD-CRD-4_MERGE_counts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2/SGRE-HEAD-CRD-4_MERGE_featurecounts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2/SGRE-HEAD-CRD-5_MERGE_counts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2/SGRE-HEAD-CRD-5_MERGE_featurecounts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2/SGRE-HEAD-CRD-6_MERGE_counts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2/SGRE-HEAD-CRD-6_MERGE_featurecounts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2/SGRE-HEAD-ISO-1_MERGE_counts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2/SGRE-HEAD-ISO-1_MERGE_featurecounts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2/SGRE-HEAD-ISO-2_MERGE_counts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2/SGRE-HEAD-ISO-2_MERGE_featurecounts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2/SGRE-HEAD-ISO-3_MERGE_counts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2/SGRE-HEAD-ISO-3_MERGE_featurecounts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2/SGRE-HEAD-ISO-4_MERGE_counts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2/SGRE-HEAD-ISO-4_MERGE_featurecounts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2/SGRE-HEAD-ISO-5_MERGE_counts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2/SGRE-HEAD-ISO-5_MERGE_featurecounts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2/SGRE-HEAD-ISO-6_MERGE_counts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2/SGRE-HEAD-ISO-6_MERGE_featurecounts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2/SGRE-THOX-CRD-1_MERGE_counts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2/SGRE-THOX-CRD-1_MERGE_featurecounts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2/SGRE-THOX-CRD-2_MERGE_counts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2/SGRE-THOX-CRD-2_MERGE_featurecounts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2/SGRE-THOX-CRD-3_MERGE_counts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2/SGRE-THOX-CRD-3_MERGE_featurecounts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2/SGRE-THOX-CRD-4_MERGE_counts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2/SGRE-THOX-CRD-4_MERGE_featurecounts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2/SGRE-THOX-CRD-5_MERGE_counts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2/SGRE-THOX-CRD-5_MERGE_featurecounts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2/SGRE-THOX-CRD-6_MERGE_counts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2/SGRE-THOX-CRD-6_MERGE_featurecounts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2/SGRE-THOX-ISO-1_MERGE_counts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2/SGRE-THOX-ISO-1_MERGE_featurecounts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2/SGRE-THOX-ISO-2_MERGE_counts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2/SGRE-THOX-ISO-2_MERGE_featurecounts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2/SGRE-THOX-ISO-3_MERGE_counts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2/SGRE-THOX-ISO-3_MERGE_featurecounts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2/SGRE-THOX-ISO-4_MERGE_counts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2/SGRE-THOX-ISO-4_MERGE_featurecounts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2/SGRE-THOX-ISO-5_MERGE_counts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2/SGRE-THOX-ISO-5_MERGE_featurecounts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2/SGRE-THOX-ISO-6_MERGE_counts.txt
    Deleted:    data/readcounts/03-gregaria-DESeq2/SGRE-THOX-ISO-6_MERGE_featurecounts.txt
    Deleted:    data/readcounts/03-gregaria-vivian/GREG-G-CCT-1-AGY_counts.txt
    Deleted:    data/readcounts/03-gregaria-vivian/GREG-G-CCT-2-AGY_counts.txt
    Deleted:    data/readcounts/03-gregaria-vivian/GREG-G-CCT-3-AGY_counts.txt
    Deleted:    data/readcounts/03-gregaria-vivian/GREG-G-CCT-5-AGY_counts.txt
    Deleted:    data/readcounts/03-gregaria-vivian/GREG-G-CCT-6-AGY_counts.txt
    Deleted:    data/readcounts/03-gregaria-vivian/GREG-G-CCT-8-AGY_counts.txt
    Deleted:    data/readcounts/03-gregaria-vivian/GREG-G-I72-2-AGY_counts.txt
    Deleted:    data/readcounts/03-gregaria-vivian/GREG-G-I72-4-AGY_counts.txt
    Deleted:    data/readcounts/03-gregaria-vivian/GREG-G-I72-5-AGY_counts.txt
    Deleted:    data/readcounts/03-gregaria-vivian/GREG-G-I72-6-AGY_counts.txt
    Deleted:    data/readcounts/03-gregaria-vivian/GREG-G-I72-8-AGY_counts.txt
    Deleted:    data/readcounts/03-gregaria-vivian/GREG-G-I72-9-AGY_counts.txt
    Deleted:    data/readcounts/03-gregaria-vivian/GREG-S-C72-2-AGY_counts.txt
    Deleted:    data/readcounts/03-gregaria-vivian/GREG-S-C72-3-AGY_counts.txt
    Deleted:    data/readcounts/03-gregaria-vivian/GREG-S-C72-4-AGY_counts.txt
    Deleted:    data/readcounts/03-gregaria-vivian/GREG-S-C72-5-AGY_counts.txt
    Deleted:    data/readcounts/03-gregaria-vivian/GREG-S-C72-6-AGY_counts.txt
    Deleted:    data/readcounts/03-gregaria-vivian/GREG-S-C72-7-AGY_counts.txt
    Deleted:    data/readcounts/03-gregaria-vivian/GREG-S-ICT-1-AGY_counts.txt
    Deleted:    data/readcounts/03-gregaria-vivian/GREG-S-ICT-10-AGY_counts.txt
    Deleted:    data/readcounts/03-gregaria-vivian/GREG-S-ICT-2-AGY_counts.txt
    Deleted:    data/readcounts/03-gregaria-vivian/GREG-S-ICT-5-AGY_counts.txt
    Deleted:    data/readcounts/03-gregaria-vivian/GREG-S-ICT-6-AGY_counts.txt
    Deleted:    data/readcounts/03-gregaria-vivian/GREG-S-ICT-8-AGY_counts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2-togregaria/SNITE_G_Crd_SRR11815197_featurecounts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2-togregaria/SNITE_G_Crd_SRR11815198_featurecounts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2-togregaria/SNITE_G_Crd_SRR11815199_featurecounts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2-togregaria/SNITE_G_Crd_SRR11815200_featurecounts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2-togregaria/SNITE_G_Crd_SRR11815201_featurecounts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2-togregaria/SNITE_G_Crd_SRR11815202_featurecounts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2-togregaria/SNITE_G_Crd_SRR11815203_featurecounts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2-togregaria/SNITE_G_Crd_SRR11815204_featurecounts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2-togregaria/SNITE_G_Crd_SRR11815205_featurecounts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2-togregaria/SNITE_G_Crd_SRR11815206_featurecounts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2-togregaria/SNITE_S_Iso_SRR11815208_featurecounts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2-togregaria/SNITE_S_Iso_SRR11815209_featurecounts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2-togregaria/SNITE_S_Iso_SRR11815210_featurecounts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2-togregaria/SNITE_S_Iso_SRR11815211_featurecounts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2-togregaria/SNITE_S_Iso_SRR11815212_featurecounts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2-togregaria/SNITE_S_Iso_SRR11815213_featurecounts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2-togregaria/SNITE_S_Iso_SRR11815214_featurecounts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2-togregaria/SNITE_S_Iso_SRR11815215_featurecounts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2-togregaria/SNITE_S_Iso_SRR11815216_featurecounts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2-togregaria/SNITE_S_Iso_SRR11815217_featurecounts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2/SNITE_G_Crd_SRR11815197_counts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2/SNITE_G_Crd_SRR11815197_featurecounts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2/SNITE_G_Crd_SRR11815198_counts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2/SNITE_G_Crd_SRR11815198_featurecounts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2/SNITE_G_Crd_SRR11815199_counts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2/SNITE_G_Crd_SRR11815199_featurecounts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2/SNITE_G_Crd_SRR11815200_counts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2/SNITE_G_Crd_SRR11815200_featurecounts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2/SNITE_G_Crd_SRR11815201_counts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2/SNITE_G_Crd_SRR11815201_featurecounts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2/SNITE_G_Crd_SRR11815202_counts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2/SNITE_G_Crd_SRR11815202_featurecounts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2/SNITE_G_Crd_SRR11815203_counts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2/SNITE_G_Crd_SRR11815203_featurecounts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2/SNITE_G_Crd_SRR11815204_counts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2/SNITE_G_Crd_SRR11815204_featurecounts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2/SNITE_G_Crd_SRR11815205_counts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2/SNITE_G_Crd_SRR11815205_featurecounts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2/SNITE_G_Crd_SRR11815206_counts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2/SNITE_G_Crd_SRR11815206_featurecounts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2/SNITE_S_Iso_SRR11815208_counts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2/SNITE_S_Iso_SRR11815208_featurecounts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2/SNITE_S_Iso_SRR11815209_counts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2/SNITE_S_Iso_SRR11815209_featurecounts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2/SNITE_S_Iso_SRR11815210_counts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2/SNITE_S_Iso_SRR11815210_featurecounts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2/SNITE_S_Iso_SRR11815211_counts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2/SNITE_S_Iso_SRR11815211_featurecounts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2/SNITE_S_Iso_SRR11815212_counts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2/SNITE_S_Iso_SRR11815212_featurecounts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2/SNITE_S_Iso_SRR11815213_counts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2/SNITE_S_Iso_SRR11815213_featurecounts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2/SNITE_S_Iso_SRR11815214_counts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2/SNITE_S_Iso_SRR11815214_featurecounts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2/SNITE_S_Iso_SRR11815215_counts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2/SNITE_S_Iso_SRR11815215_featurecounts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2/SNITE_S_Iso_SRR11815216_counts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2/SNITE_S_Iso_SRR11815216_featurecounts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2/SNITE_S_Iso_SRR11815217_counts.txt
    Deleted:    data/readcounts/03-nitens-DESeq2/SNITE_S_Iso_SRR11815217_featurecounts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2-togregaria/SPICE_G_Crd_SRR11815262_featurecounts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2-togregaria/SPICE_G_Crd_SRR11815264_featurecounts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2-togregaria/SPICE_G_Crd_SRR11815265_featurecounts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2-togregaria/SPICE_G_Crd_SRR11815266_featurecounts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2-togregaria/SPICE_G_Crd_SRR11815267_featurecounts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2-togregaria/SPICE_G_Crd_SRR11815268_featurecounts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2-togregaria/SPICE_G_Crd_SRR11815269_featurecounts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2-togregaria/SPICE_G_Crd_SRR11815270_featurecounts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2-togregaria/SPICE_G_Crd_SRR11815271_featurecounts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2-togregaria/SPICE_G_Crd_SRR11815272_featurecounts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2-togregaria/SPICE_S_Iso_SRR11815195_featurecounts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2-togregaria/SPICE_S_Iso_SRR11815196_featurecounts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2-togregaria/SPICE_S_Iso_SRR11815207_featurecounts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2-togregaria/SPICE_S_Iso_SRR11815218_featurecounts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2-togregaria/SPICE_S_Iso_SRR11815229_featurecounts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2-togregaria/SPICE_S_Iso_SRR11815240_featurecounts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2-togregaria/SPICE_S_Iso_SRR11815251_featurecounts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2-togregaria/SPICE_S_Iso_SRR11815261_featurecounts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2-togregaria/SPICE_S_Iso_SRR11815273_featurecounts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2-togregaria/SPICE_S_Iso_SRR11815274_featurecounts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2/SPICE_G_Crd_SRR11815262_counts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2/SPICE_G_Crd_SRR11815262_featurecounts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2/SPICE_G_Crd_SRR11815264_counts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2/SPICE_G_Crd_SRR11815264_featurecounts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2/SPICE_G_Crd_SRR11815265_counts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2/SPICE_G_Crd_SRR11815265_featurecounts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2/SPICE_G_Crd_SRR11815266_counts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2/SPICE_G_Crd_SRR11815266_featurecounts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2/SPICE_G_Crd_SRR11815267_counts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2/SPICE_G_Crd_SRR11815267_featurecounts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2/SPICE_G_Crd_SRR11815268_counts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2/SPICE_G_Crd_SRR11815268_featurecounts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2/SPICE_G_Crd_SRR11815269_counts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2/SPICE_G_Crd_SRR11815269_featurecounts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2/SPICE_G_Crd_SRR11815270_counts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2/SPICE_G_Crd_SRR11815270_featurecounts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2/SPICE_G_Crd_SRR11815271_counts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2/SPICE_G_Crd_SRR11815271_featurecounts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2/SPICE_G_Crd_SRR11815272_counts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2/SPICE_G_Crd_SRR11815272_featurecounts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2/SPICE_S_Iso_SRR11815195_counts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2/SPICE_S_Iso_SRR11815195_featurecounts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2/SPICE_S_Iso_SRR11815196_counts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2/SPICE_S_Iso_SRR11815196_featurecounts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2/SPICE_S_Iso_SRR11815207_counts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2/SPICE_S_Iso_SRR11815207_featurecounts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2/SPICE_S_Iso_SRR11815218_counts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2/SPICE_S_Iso_SRR11815218_featurecounts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2/SPICE_S_Iso_SRR11815229_counts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2/SPICE_S_Iso_SRR11815229_featurecounts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2/SPICE_S_Iso_SRR11815240_counts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2/SPICE_S_Iso_SRR11815240_featurecounts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2/SPICE_S_Iso_SRR11815251_counts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2/SPICE_S_Iso_SRR11815251_featurecounts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2/SPICE_S_Iso_SRR11815261_counts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2/SPICE_S_Iso_SRR11815261_featurecounts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2/SPICE_S_Iso_SRR11815273_counts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2/SPICE_S_Iso_SRR11815273_featurecounts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2/SPICE_S_Iso_SRR11815274_counts.txt
    Deleted:    data/readcounts/03-piceifrons-DESeq2/SPICE_S_Iso_SRR11815274_featurecounts.txt

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.

These are the previous versions of the repository in which changes were made to the R Markdown (analysis/3_overlap-venn.Rmd) and HTML (docs/3_overlap-venn.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.

File	Version	Author	Date	Message
Rmd	34c299a	Maeva TECHER	2025-02-06	Overlap confirmed
html	34c299a	Maeva TECHER	2025-02-06	Overlap confirmed
Rmd	db8b525	Maeva TECHER	2025-02-06	update overlap
Rmd	aab712a	Maeva TECHER	2025-02-04	change overlap
html	aab712a	Maeva TECHER	2025-02-04	change overlap
Rmd	faf2db3	Maeva TECHER	2025-01-13	update markdown
Rmd	fe6dae9	Maeva TECHER	2024-11-19	changes ESA
html	fe6dae9	Maeva TECHER	2024-11-19	changes ESA
Rmd	3fa8e62	Maeva TECHER	2024-11-09	updated analysis
html	3fa8e62	Maeva TECHER	2024-11-09	updated analysis
Rmd	edb70fe	Maeva TECHER	2024-11-08	overlap and deg results created
html	edb70fe	Maeva TECHER	2024-11-08	overlap and deg results created
html	ba35b82	Maeva A. TECHER	2024-06-20	Build site.
html	45d0b6b	Maeva A. TECHER	2024-05-16	Build site.
Rmd	5dff93d	Maeva A. TECHER	2024-05-16	wflow_publish("analysis/3_overlap-venn.Rmd")

Load libraries

We start by loading all the required R packages.

# Install any missing packages first (from CRAN or Bioconductor)
library(knitr)
library(dplyr) 
library(ggplot2)
library(plotly)
library(htmlwidgets)  # For saving interactive plots
library(ggVennDiagram)
library(pheatmap)
library(tidyr)
library(RColorBrewer)
library(viridis)
library(kableExtra)
library(tibble)
library(VennDiagram)
library(gridExtra)
library(grid)
library(DT)
library(readr)
library(tidyverse)
library(data.table)

# Path for all species
workDir <- "/Users/maevatecher/Documents/GitHub/locust-comparative-genomics/data"
ortho_dir <- "/Users/maevatecher/Documents/GitHub/locust-comparative-genomics/data/orthofinder/Schistocerca"
allspecies_path <- file.path(workDir, "list", "13polyneoptera_geneid_ncbi.csv")  # file.path() adds the separators itself
allspecies_df <- read.table(allspecies_path, sep = ",", header = TRUE, quote = "", fill = TRUE, stringsAsFactors = FALSE)
species_list <- c("gregaria", "piceifrons", "cancellata", "americana", "cubense", "nitens")
species_order <- c( "nitens", "cubense", "americana",  "piceifrons", "cancellata", "gregaria")
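
The reproducibility checks at the top of this report flag the two absolute paths used here. As a minimal sketch, the relative equivalents suggested there could be built as follows, assuming the code is run from the project root (the knit directory locust-comparative-genomics/). The `_rel` variable names are hypothetical and only serve this illustration.

```r
# Relative paths suggested by the workflowr "File paths" check.
# Assumption: the working directory is the project root when knitting.
workDir_rel   <- "data"
ortho_dir_rel <- file.path(workDir_rel, "orthofinder", "Schistocerca")

# file.path() inserts the separators, so components need no leading "/"
allspecies_path_rel <- file.path(workDir_rel, "list", "13polyneoptera_geneid_ncbi.csv")

# Report early if an expected input is missing (resolved relative to getwd())
expected_inputs <- c(ortho_dir_rel, allspecies_path_rel)
missing_inputs  <- expected_inputs[!file.exists(expected_inputs)]
if (length(missing_inputs) > 0) {
  message("Missing inputs: ", paste(missing_inputs, collapse = ", "))
}
```

Checking the inputs up front makes a wrong working directory visible immediately, before any file is read.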

Here, our objective is to compare the abundance, composition, and overlap of the DEGs found in the head and thorax tissues of each species between isolated and crowded last-instar females. The differentially expressed genes detected by DESeq2 varied across species and tissues, but we need some perspective: do locusts up-regulate and down-regulate the same genes? In the later GO enrichment section, we will investigate the functions of these genes, as each species seems to show a different gene expression profile in response to density changes.
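
The overlap question can be sketched on toy data. The example below is an illustration, not the analysis itself: the gene IDs and fold changes are made up, the `gene` column name is hypothetical, and it assumes that gene identifiers are shared across species so that sets can be compared directly. Only `log2FoldChange` is taken from the DESeq2 results used in this report.

```r
# Toy significant-DEG tables for two species (gene IDs and values are made up)
toy_deg <- list(
  gregaria   = data.frame(gene = c("g1", "g2", "g3", "g4"),
                          log2FoldChange = c(2.1, -1.5, 0.8, -0.3)),
  piceifrons = data.frame(gene = c("g1", "g3", "g4", "g5"),
                          log2FoldChange = c(1.2, -0.9, -2.0, 1.7))
)

# Gene IDs per species, split by direction of change
up_sets   <- lapply(toy_deg, function(d) d$gene[d$log2FoldChange > 0])
down_sets <- lapply(toy_deg, function(d) d$gene[d$log2FoldChange < 0])

# Genes regulated in the same direction in every species
shared_up   <- Reduce(intersect, up_sets)
shared_down <- Reduce(intersect, down_sets)

# Genes significant in both species but with opposite signs
discordant <- union(
  intersect(up_sets$gregaria, down_sets$piceifrons),
  intersect(down_sets$gregaria, up_sets$piceifrons)
)

print(list(shared_up = shared_up, shared_down = shared_down, discordant = discordant))
```

Keeping up- and down-regulated sets separate matters here: a gene that is significant in two species but changes in opposite directions would otherwise be counted as shared.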

STRATEGY 1: One genome S. gregaria

1. DEGs comparison among species

We summarized the number of genes differentially expressed between density conditions for each species and each tissue.

# Initialize empty lists to store results
summary_list_head <- list()
summary_list_thorax <- list()

# Loop through each species to process their data
for (species in species_list) {
    # Read the DESeq2 results
    head_sigresults_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_sigresults_Head_togregaria_", species, ".csv"))
    thorax_sigresults_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_sigresults_Thorax_togregaria_", species, ".csv"))

    head_sigresults <- fread(head_sigresults_file)  # fread is faster and uses less memory
    thorax_sigresults <- fread(thorax_sigresults_file)

    # Count upregulated and downregulated genes for head
    head_upregulated <- sum(head_sigresults$log2FoldChange > 0)
    head_downregulated <- sum(head_sigresults$log2FoldChange < 0)
    head_upregulated_strict <- sum(head_sigresults$log2FoldChange > 1)
    head_downregulated_strict <- sum(head_sigresults$log2FoldChange < -1)

    # Count upregulated and downregulated genes for thorax
    thorax_upregulated <- sum(thorax_sigresults$log2FoldChange > 0)
    thorax_downregulated <- sum(thorax_sigresults$log2FoldChange < 0)
    thorax_upregulated_strict <- sum(thorax_sigresults$log2FoldChange > 1)
    thorax_downregulated_strict <- sum(thorax_sigresults$log2FoldChange < -1)

    # Store results in the list
    summary_list_head[[species]] <- data.frame(
        Species = species,
        Head_Upregulated = head_upregulated,
        Head_Downregulated = head_downregulated,
        Head_Upregulated_Strict = head_upregulated_strict,
        Head_Downregulated_Strict = head_downregulated_strict
    )

    summary_list_thorax[[species]] <- data.frame(
        Species = species,
        Thorax_Upregulated = thorax_upregulated,
        Thorax_Downregulated = thorax_downregulated,
        Thorax_Upregulated_Strict = thorax_upregulated_strict,
        Thorax_Downregulated_Strict = thorax_downregulated_strict
    )
}

# Combine lists into final data frames
summary_table_head <- bind_rows(summary_list_head)
summary_table_thorax <- bind_rows(summary_list_thorax)

# Print the summary table in a markdown-friendly format
knitr::kable(summary_table_head, format = "markdown", caption = "Summary of differentially expressed genes in head per species")

Summary of differentially expressed genes in head per species
Species	Head_Upregulated	Head_Downregulated	Head_Upregulated_Strict	Head_Downregulated_Strict
gregaria	2709	2988	814	662
piceifrons	375	375	194	191
cancellata	689	756	301	386
americana	703	487	311	256
cubense	30	31	30	31
nitens	189	259	104	207
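
The four counts per species in the table above come from simple thresholds on log2FoldChange. As a minimal sketch, assuming a data frame that already contains only significant genes and has a numeric log2FoldChange column, the counting could be wrapped in a small helper. The function name count_degs and the toy values are hypothetical and only illustrate the logic; they are not part of the original analysis.

```r
# Hypothetical helper (illustration only): count DEGs by direction,
# with and without an absolute log2 fold change cutoff.
count_degs <- function(sig, lfc_cutoff = 1) {
  lfc <- sig$log2FoldChange
  data.frame(
    Upregulated          = sum(lfc > 0, na.rm = TRUE),
    Downregulated        = sum(lfc < 0, na.rm = TRUE),
    Upregulated_Strict   = sum(lfc > lfc_cutoff, na.rm = TRUE),
    Downregulated_Strict = sum(lfc < -lfc_cutoff, na.rm = TRUE)
  )
}

# Toy input: five made-up genes, not real results
toy_sig <- data.frame(
  GeneID = paste0("gene", 1:5),
  log2FoldChange = c(2.3, 0.4, -0.7, -1.8, 1.1)
)

toy_counts <- count_degs(toy_sig)
print(toy_counts)
```

Keeping the cutoff as an argument would make it easy to check how sensitive the species ranking is to the choice of an absolute log2 fold change of 1.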

# Convert the summary table to a long format for easier plotting
summary_long_head <- summary_table_head %>%
  pivot_longer(cols = c(Head_Upregulated_Strict, Head_Downregulated_Strict),
               names_to = "Tissue", values_to = "Count")

# Adjust the values for downregulated genes to be negative
summary_long_head <- summary_long_head %>%
  mutate(Count = ifelse(Tissue == "Head_Downregulated_Strict", -Count, Count))

summary_long_head$Species <- factor(summary_long_head$Species, levels = species_order)

# Plot barplot for head
ggplot(summary_long_head, aes(x = Species, y = Count, fill = Tissue)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(title = "Upregulated and Downregulated Genes in Head (absolute lfc >1)",
       x = "Species", y = "Number of Genes") +
  scale_fill_manual(values = c("Head_Upregulated_Strict" = "red2", "Head_Downregulated_Strict" = "blue")) +
  scale_y_continuous(labels = function(x) ifelse(x < 0, -x, x), limits = c(-1200, 1200)) +
  theme_minimal(base_size = 12) +
  theme(legend.position = "top", 
        plot.title = element_text(hjust = 0.5, size = 14, face = "bold"), 
        axis.text.x = element_text(size = 12, angle = 45, hjust = 1)) +
  coord_flip()

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19
3fa8e62	Maeva TECHER	2024-11-09
edb70fe	Maeva TECHER	2024-11-08

# Print the summary table for thorax
knitr::kable(summary_table_thorax, format = "markdown", caption = "Summary of differentially expressed genes in thorax per species")

Summary of differentially expressed genes in thorax per species
Species	Thorax_Upregulated	Thorax_Downregulated	Thorax_Upregulated_Strict	Thorax_Downregulated_Strict
gregaria	2751	2691	622	1174
piceifrons	1517	1200	549	221
cancellata	686	648	289	303
americana	398	699	149	339
cubense	104	218	64	154
nitens	0	0	0	0

# Convert the summary table to a long format for thorax
summary_long_thorax <- summary_table_thorax %>%
  pivot_longer(cols = c(Thorax_Upregulated_Strict, Thorax_Downregulated_Strict),
               names_to = "Tissue", values_to = "Count")

# Adjust the values for downregulated genes to be negative
summary_long_thorax <- summary_long_thorax %>%
  mutate(Count = ifelse(Tissue == "Thorax_Downregulated_Strict", -Count, Count))

summary_long_thorax$Species <- factor(summary_long_thorax$Species, levels = species_order)

# Plot barplot for thorax
ggplot(summary_long_thorax, aes(x = Species, y = Count, fill = Tissue)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(title = "Upregulated and Downregulated Genes in Thorax (absolute lfc >1)",
       x = "Species", y = "Number of Genes") +
  scale_fill_manual(values = c("Thorax_Upregulated_Strict" = "red2", "Thorax_Downregulated_Strict" = "blue")) +
  scale_y_continuous(labels = function(x) ifelse(x < 0, -x, x), limits = c(-1200, 1200)) +
  theme_minimal(base_size = 12) +
  theme(legend.position = "top", 
        plot.title = element_text(hjust = 0.5, size = 14, face = "bold"), 
        axis.text.x = element_text(size = 12, angle = 45, hjust = 1)) +
  coord_flip()

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19
3fa8e62	Maeva TECHER	2024-11-09
edb70fe	Maeva TECHER	2024-11-08

# Define custom colors for each GeneType
custom_colors <- c(
  "transcribed_pseudogene" = "#F4F1BB",  # Example color for transcribed_pseudogene
  "protein-coding" = "#9B57D3",         # Example color for protein-coding
  "lncRNA" = "#A5300F",                 # Example color for lncRNA
  "tRNA" = "#74D055FF",                   # Example color for tRNA
  "misc_RNA" = "#3B6978",               # Example color for misc_RNA
  "ncRNA" = "#29AF7FFF",                  # Example color for ncRNA
  "pseudogene" = "#81B29A",             # Example color for pseudogene
  "rRNA" = "#5982DB",                   # Example color for rRNA
  "snoRNA" = "#DCE318FF",                 # Example color for snoRNA
  "snRNA" = "#665EB8"                   # Example color for snRNA
)

# Use scale_fill_manual to map the custom colors to the GeneTypes
custom_color_scale <- scale_fill_manual(
  values = custom_colors
)
# Create an empty list to store the data for all species
all_species_data <- list()

# Loop through each species to process their data
for (species in species_list) {
  # Read the DESeq2 results for head and thorax
  head_sigresults_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_sigresults_Head_togregaria_", species, ".csv"))
  thorax_sigresults_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_sigresults_Thorax_togregaria_", species, ".csv"))
  
  head_sigresults <- read.csv(head_sigresults_file, stringsAsFactors = FALSE)
  thorax_sigresults <- read.csv(thorax_sigresults_file, stringsAsFactors = FALSE)
  
  # Add GeneType and Species columns (from `allspecies_df`)
  head_sigresults_merged <- merge(head_sigresults, allspecies_df[, c("GeneID", "GeneType", "Species")], by = "GeneID")
  thorax_sigresults_merged <- merge(thorax_sigresults, allspecies_df[, c("GeneID", "GeneType", "Species")], by = "GeneID")
  
  # Count for upregulated and downregulated genes in head
  head_upregulated <- head_sigresults_merged %>%
    filter(log2FoldChange > 1) %>%
    mutate(Regulation = "Upregulated", Tissue = "Head", Count = 1)
  
  head_downregulated <- head_sigresults_merged %>%
    filter(log2FoldChange < -1) %>%
    mutate(Regulation = "Downregulated", Tissue = "Head", Count = -1)  # Mutate downregulated genes to negative
  
  # Combine upregulated and downregulated genes for head
  head_combined <- rbind(head_upregulated, head_downregulated)
  
  # Ensure all GeneTypes are represented for this species, even if they have no DEGs.
  # Tissue is filled too, so the added rows survive the later filter(Tissue == "Head").
  head_combined <- head_combined %>%
    complete(GeneType = unique(allspecies_df$GeneType), 
             fill = list(Count = 0, Tissue = "Head"))  # Missing GeneTypes get Count = 0
  
  # Count for upregulated and downregulated genes in thorax
  thorax_upregulated <- thorax_sigresults_merged %>%
    filter(log2FoldChange > 1) %>%
    mutate(Regulation = "Upregulated", Tissue = "Thorax", Count = 1)
  
  thorax_downregulated <- thorax_sigresults_merged %>%
    filter(log2FoldChange < -1) %>%
    mutate(Regulation = "Downregulated", Tissue = "Thorax", Count = -1)  # Mutate downregulated genes to negative
  
  # Combine upregulated and downregulated genes for thorax
  thorax_combined <- rbind(thorax_upregulated, thorax_downregulated)
  
  # Ensure all GeneTypes are represented for this species in thorax, even if they have no DEGs.
  # Tissue is filled too, so the added rows survive the later filter(Tissue == "Thorax").
  thorax_combined <- thorax_combined %>%
    complete(GeneType = unique(allspecies_df$GeneType), 
             fill = list(Count = 0, Tissue = "Thorax"))  # Missing GeneTypes get Count = 0
  
  # Combine data for head and thorax into one
  combined_data <- rbind(head_combined, thorax_combined)
  
  # Add species column to the data
  combined_data$Species <- species
  
  # Append the data to the list for all species
  all_species_data[[species]] <- combined_data
}

# Combine all species data into one data frame
final_data <- bind_rows(all_species_data)

# Reorder species according to the desired order
final_data$Species <- factor(final_data$Species, levels = species_order)

# Split the data by tissue
final_data_head <- final_data %>% filter(Tissue == "Head")
final_data_thorax <- final_data %>% filter(Tissue == "Thorax")

# Create the barplot for all species and only head tissue
ggplot(final_data_head, aes(x = Species, y = Count, fill = GeneType)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(title = "DEGs by Gene Biotype for Head (absolute lfc >1)",
       x = "Species",
       y = "Number of Genes") +
  custom_color_scale +
  scale_y_continuous(labels = function(x) ifelse(x < 0, -x, x), limits = c(-1200, 1200)) +
  theme_minimal(base_size = 12) + 
  theme(legend.position = "top", 
        plot.title = element_text(hjust = 0.5, size = 14, face = "bold"), 
        axis.title.x = element_text(size = 14, face = "bold"), 
        axis.title.y = element_text(size = 14, face = "bold"), 
        axis.text.x = element_text(size = 12, angle = 45, hjust = 1), 
        axis.text.y = element_text(size = 12), 
        panel.grid.major.y = element_line(color = "grey90", linetype = "dashed"),
        panel.grid.minor = element_blank()) +
  coord_flip()  # Flip coordinates to make the plot horizontal

Version	Author	Date
34c299a	Maeva TECHER	2025-02-06
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

# Create the barplot for all species and only thorax tissue
ggplot(final_data_thorax, aes(x = Species, y = Count, fill = GeneType)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(title = "DEGs by Gene Biotype for Thorax (absolute lfc >1)",
       x = "Species",
       y = "Number of Genes") +
  custom_color_scale +
  scale_y_continuous(labels = function(x) ifelse(x < 0, -x, x), limits = c(-1200, 1200)) +
  theme_minimal(base_size = 12) + 
  theme(legend.position = "top", 
        plot.title = element_text(hjust = 0.5, size = 14, face = "bold"), 
        axis.title.x = element_text(size = 14, face = "bold"), 
        axis.title.y = element_text(size = 14, face = "bold"), 
        axis.text.x = element_text(size = 12, angle = 45, hjust = 1), 
        axis.text.y = element_text(size = 12), 
        panel.grid.major.y = element_line(color = "grey90", linetype = "dashed"),
        panel.grid.minor = element_blank()) +
  coord_flip()  # Flip coordinates to make the plot horizontal

Version	Author	Date
34c299a	Maeva TECHER	2025-02-06
aab712a	Maeva TECHER	2025-02-04
8df3d7c	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19
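
The biotype barplots stack one row per gene with a Count of 1 or -1. As an alternative sketch, and not the approach used in this analysis, the same signed totals could be pre-aggregated per GeneType and direction before plotting. The toy table and the all_types vector below are made up for illustration; the real biotypes come from allspecies_df.

```r
library(dplyr)
library(tidyr)

# Toy DEG table (made-up values)
toy_degs <- data.frame(
  GeneID = paste0("gene", 1:6),
  GeneType = c("protein-coding", "protein-coding", "lncRNA",
               "protein-coding", "lncRNA", "tRNA"),
  log2FoldChange = c(1.5, 2.2, 1.2, -1.4, -3.0, 0.3)
)

# Hypothetical full set of biotypes we want to see in the legend
all_types <- c("protein-coding", "lncRNA", "tRNA", "pseudogene")

biotype_counts <- toy_degs %>%
  filter(abs(log2FoldChange) > 1) %>%                      # strict cutoff
  mutate(Regulation = ifelse(log2FoldChange > 0,
                             "Upregulated", "Downregulated")) %>%
  count(GeneType, Regulation, name = "Count") %>%          # one row per observed combination
  complete(GeneType = all_types,
           Regulation = c("Upregulated", "Downregulated"),
           fill = list(Count = 0L)) %>%                    # unobserved combinations get 0
  mutate(Count = ifelse(Regulation == "Downregulated", -Count, Count))

print(biotype_counts)
```

Aggregating first keeps the plotted data small and makes the zero-count biotypes explicit, at the cost of losing the per-gene rows.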

2. Overlap DEGs between tissues

gregaria

species <- "gregaria"  # Specify the species for which to generate plots

# Load DESeq2 results for head and thorax
head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_results_Head_togregaria_", species, ".csv"))
thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_results_Thorax_togregaria_", species, ".csv"))

head_data <- read.csv(head_file, stringsAsFactors = FALSE)
thorax_data <- read.csv(thorax_file, stringsAsFactors = FALSE)

# Check if data is empty and handle accordingly
if (nrow(head_data) == 0 || nrow(thorax_data) == 0) {
    message(paste("No data for species:", species))
} else {
    # Filter for significant DEGs and select upregulated and downregulated genes
    head_up <- head_data %>%
        filter(padj < 0.05 & log2FoldChange > 1) %>%
        select(GeneID = X)

    head_down <- head_data %>%
        filter(padj < 0.05 & log2FoldChange < -1) %>%
        select(GeneID = X)

    thorax_up <- thorax_data %>%
        filter(padj < 0.05 & log2FoldChange > 1) %>%
        select(GeneID = X)

    thorax_down <- thorax_data %>%
        filter(padj < 0.05 & log2FoldChange < -1) %>%
        select(GeneID = X)

    # Prepare data for Venn diagram
    venn_data <- list(
        Head_Upregulated = head_up$GeneID,
        Head_Downregulated = head_down$GeneID,
        Thorax_Upregulated = thorax_up$GeneID,
        Thorax_Downregulated = thorax_down$GeneID
    )

    # Generate the four-way Venn diagram with specified colors and legend outside
    venn_plot <- venn.diagram(
        x = venn_data, 
        category.names = c("Head Upregulated", "Head Downregulated", "Thorax Upregulated", "Thorax Downregulated"), 
        filename = NULL, 
        output = TRUE, 
        fill = c("red", "skyblue", "orange", "blue"),  # Set colors for upregulated and downregulated
        alpha = 0.5, 
        cex = 1.2,  # Text size for numbers
        cat.cex = 0,  # Text size for category labels
        cat.pos = c(0, 0, 0, 0),  # Position to center labels
        cat.dist = c(0.1, 0.1, 0.1, 0.1),  # Distance between category labels and circles
        main = paste("Venn Diagram of DEGs for S.", species),
        main.cex = 1.2,  # Size of the main title
        cat.col = c("red", "skyblue", "orange", "blue")  # Color the category labels
    )

    # Clear the current plotting area before drawing the next Venn diagram
    grid.newpage()

    # Display the Venn diagram
    grid.draw(venn_plot)

    # Manually create a custom legend
    legend_labels <- c("Head Up", "Head Down", "Thorax Up", "Thorax Down")
    legend_colors <- c("red", "skyblue", "orange", "blue")

    # Positioning the legend lower on the right side of the plot
    legend_x <- unit(0.85, "npc")  # Adjust x position
    legend_y <- unit(0.2, "npc")    # Lower the legend vertically

    # Draw the legend
    for (i in seq_along(legend_labels)) {
        grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  width = unit(0.02, "npc"), height = unit(0.02, "npc"), 
                  gp = gpar(fill = legend_colors[i], col = NA))
        grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"), 
                  y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  just = "left", gp = gpar(cex = 0.8))
    }
    
    # Scatter plot for overlapping genes
    # Filter significant DEGs for both head and thorax
    head_sig_genes <- head_data %>%
        filter(padj < 0.05 & abs(log2FoldChange) > 1) %>%
        select(GeneID = X, log2FoldChange, padj) 

    thorax_sig_genes <- thorax_data %>%
        filter(padj < 0.05 & abs(log2FoldChange) > 1) %>%
        select(GeneID = X, log2FoldChange, padj)

    # Find overlapping genes based on GeneID
    overlapping_genes <- inner_join(head_sig_genes, thorax_sig_genes, by = "GeneID", suffix = c("_head", "_thorax"))

    # Save the overlapping genes to a CSV file
    output_file <- file.path(workDir, "overlap/Bulk_RNAseq", paste0("overlapping_genes_head_thorax_", species, ".csv"))
    write.csv(overlapping_genes, output_file, row.names = FALSE)

    # Plot overlapping genes with scatter plot
    p <- ggplot(overlapping_genes, aes(x = log2FoldChange_head, y = log2FoldChange_thorax)) +
        geom_point(aes(color = case_when(
            log2FoldChange_head > 0 & log2FoldChange_thorax > 0 ~ "Upregulated in Both",
            log2FoldChange_head < 0 & log2FoldChange_thorax < 0 ~ "Downregulated in Both",
            log2FoldChange_head > 0 & log2FoldChange_thorax < 0 ~ "Up in Head, Down in Thorax",
            log2FoldChange_head < 0 & log2FoldChange_thorax > 0 ~ "Down in Head, Up in Thorax"
        )), size = 3, alpha = 0.7) +
        geom_abline(slope = 1, intercept = 0, linetype = "dashed", color = "gray") +
        labs(
            x = "Log2 Fold Change (Head)", 
            y = "Log2 Fold Change (Thorax)", 
            color = "Regulation Pattern", 
            title = "Comparison of Log2 Fold Changes in Overlapping Genes", 
            subtitle = paste("Head vs. Thorax in", species)
        ) +
        theme_minimal() + 
        theme(
            plot.title = element_text(size = 16, face = "bold"), 
            plot.subtitle = element_text(size = 12, face = "italic"), 
            legend.position = "top"
        ) +
        scale_color_manual(values = c(
            "Upregulated in Both" = "red", 
            "Downregulated in Both" = "blue", 
            "Up in Head, Down in Thorax" = "purple", 
            "Down in Head, Up in Thorax" = "green"
        ))

    # Save the scatter plot
    ggsave(filename = file.path(workDir, "overlap/Bulk_RNAseq", paste0("scatter_plot_overlapping_genes_", species, ".png")), plot = p)

    # Display the scatter plot
    print(p)
}

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19
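
The same filtering and overlap logic is repeated for each species. As a minimal sketch of that logic, assuming a results table with GeneID, log2FoldChange and padj columns (the real files are read with the gene identifier in column X), the four gene sets and their overlaps could be computed with base R. The helper name get_deg_sets and the toy tables are hypothetical.

```r
# Hypothetical helper (illustration only): split a DESeq2-style results table
# into up- and down-regulated gene ID sets.
get_deg_sets <- function(res, padj_cutoff = 0.05, lfc_cutoff = 1) {
  lfc <- res$log2FoldChange
  # Genes with NA padj or NA fold change are treated as not significant
  sig <- !is.na(res$padj) & !is.na(lfc) & res$padj < padj_cutoff
  list(
    up   = res$GeneID[sig & lfc >  lfc_cutoff],
    down = res$GeneID[sig & lfc < -lfc_cutoff]
  )
}

# Toy results for two tissues (made-up values)
head_toy <- data.frame(
  GeneID = paste0("g", 1:5),
  log2FoldChange = c(2.0, -1.5, 1.2, 0.5, 3.0),
  padj = c(0.01, 0.02, 0.20, 0.001, NA)
)
thorax_toy <- data.frame(
  GeneID = paste0("g", 1:5),
  log2FoldChange = c(1.8, 1.3, -2.0, -1.1, 2.5),
  padj = c(0.03, 0.04, 0.01, 0.01, 0.01)
)

head_sets <- get_deg_sets(head_toy)
thorax_sets <- get_deg_sets(thorax_toy)

shared_up   <- intersect(head_sets$up, thorax_sets$up)      # up in both tissues
shared_down <- intersect(head_sets$down, thorax_sets$down)  # down in both tissues
opposite    <- intersect(head_sets$down, thorax_sets$up)    # down in head, up in thorax

print(list(shared_up = shared_up, shared_down = shared_down, opposite = opposite))
```

A helper of this kind would let the per-species sections share one code path, so a change to the padj or fold change cutoff only needs to be made once.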

piceifrons

species <- "piceifrons"  # Specify the species for which to generate plots

# Load DESeq2 results for head and thorax
head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_results_Head_togregaria_", species, ".csv"))
thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_results_Thorax_togregaria_", species, ".csv"))

head_data <- read.csv(head_file, stringsAsFactors = FALSE)
thorax_data <- read.csv(thorax_file, stringsAsFactors = FALSE)

# Check if data is empty and handle accordingly
if (nrow(head_data) == 0 || nrow(thorax_data) == 0) {
    message(paste("No data for species:", species))
} else {
    # Filter for significant DEGs and select upregulated and downregulated genes
    head_up <- head_data %>%
        filter(padj < 0.05 & log2FoldChange > 1) %>%
        select(GeneID = X)

    head_down <- head_data %>%
        filter(padj < 0.05 & log2FoldChange < -1) %>%
        select(GeneID = X)

    thorax_up <- thorax_data %>%
        filter(padj < 0.05 & log2FoldChange > 1) %>%
        select(GeneID = X)

    thorax_down <- thorax_data %>%
        filter(padj < 0.05 & log2FoldChange < -1) %>%
        select(GeneID = X)

    # Prepare data for Venn diagram
    venn_data <- list(
        Head_Upregulated = head_up$GeneID,
        Head_Downregulated = head_down$GeneID,
        Thorax_Upregulated = thorax_up$GeneID,
        Thorax_Downregulated = thorax_down$GeneID
    )

    # Generate the four-way Venn diagram with specified colors and legend outside
    venn_plot <- venn.diagram(
        x = venn_data, 
        category.names = c("Head Upregulated", "Head Downregulated", "Thorax Upregulated", "Thorax Downregulated"), 
        filename = NULL, 
        output = TRUE, 
        fill = c("red", "skyblue", "orange", "blue"),  # Set colors for upregulated and downregulated
        alpha = 0.5, 
        cex = 1.2,  # Text size for numbers
        cat.cex = 0,  # Text size for category labels
        cat.pos = c(0, 0, 0, 0),  # Position to center labels
        cat.dist = c(0.1, 0.1, 0.1, 0.1),  # Distance between category labels and circles
        main = paste("Venn Diagram of DEGs for S.", species),
        main.cex = 1.2,  # Size of the main title
        cat.col = c("red", "skyblue", "orange", "blue")  # Color the category labels
    )

    # Clear the current plotting area before drawing the next Venn diagram
    grid.newpage()

    # Display the Venn diagram
    grid.draw(venn_plot)

    # Manually create a custom legend
    legend_labels <- c("Head Up", "Head Down", "Thorax Up", "Thorax Down")
    legend_colors <- c("red", "skyblue", "orange", "blue")

    # Positioning the legend lower on the right side of the plot
    legend_x <- unit(0.85, "npc")  # Adjust x position
    legend_y <- unit(0.2, "npc")    # Lower the legend vertically

    # Draw the legend
    for (i in seq_along(legend_labels)) {
        grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  width = unit(0.02, "npc"), height = unit(0.02, "npc"), 
                  gp = gpar(fill = legend_colors[i], col = NA))
        grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"), 
                  y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  just = "left", gp = gpar(cex = 0.8))
    }

    # Scatter plot for overlapping genes
    # Filter significant DEGs for both head and thorax
    head_sig_genes <- head_data %>%
        filter(padj < 0.05 & abs(log2FoldChange) > 1) %>%
        select(GeneID = X, log2FoldChange, padj) 

    thorax_sig_genes <- thorax_data %>%
        filter(padj < 0.05 & abs(log2FoldChange) > 1) %>%
        select(GeneID = X, log2FoldChange, padj)

    # Find overlapping genes based on GeneID
    overlapping_genes <- inner_join(head_sig_genes, thorax_sig_genes, by = "GeneID", suffix = c("_head", "_thorax"))

    # Save the overlapping genes to a CSV file
    output_file <- file.path(workDir, "overlap/Bulk_RNAseq", paste0("overlapping_genes_head_thorax_", species, ".csv"))
    write.csv(overlapping_genes, output_file, row.names = FALSE)

    # Plot overlapping genes with scatter plot
    p <- ggplot(overlapping_genes, aes(x = log2FoldChange_head, y = log2FoldChange_thorax)) +
        geom_point(aes(color = case_when(
            log2FoldChange_head > 0 & log2FoldChange_thorax > 0 ~ "Upregulated in Both",
            log2FoldChange_head < 0 & log2FoldChange_thorax < 0 ~ "Downregulated in Both",
            log2FoldChange_head > 0 & log2FoldChange_thorax < 0 ~ "Up in Head, Down in Thorax",
            log2FoldChange_head < 0 & log2FoldChange_thorax > 0 ~ "Down in Head, Up in Thorax"
        )), size = 3, alpha = 0.7) +
        geom_abline(slope = 1, intercept = 0, linetype = "dashed", color = "gray") +
        labs(
            x = "Log2 Fold Change (Head)", 
            y = "Log2 Fold Change (Thorax)", 
            color = "Regulation Pattern", 
            title = "Comparison of Log2 Fold Changes in Overlapping Genes", 
            subtitle = paste("Head vs. Thorax in", species)
        ) +
        theme_minimal() + 
        theme(
            plot.title = element_text(size = 16, face = "bold"), 
            plot.subtitle = element_text(size = 12, face = "italic"), 
            legend.position = "top"
        ) +
        scale_color_manual(values = c(
            "Upregulated in Both" = "red", 
            "Downregulated in Both" = "blue", 
            "Up in Head, Down in Thorax" = "purple", 
            "Down in Head, Up in Thorax" = "green"
        ))

    # Save the scatter plot
    ggsave(filename = file.path(workDir, "overlap/Bulk_RNAseq", paste0("scatter_plot_overlapping_genes_", species, ".png")), plot = p)

    # Display the scatter plot
    print(p)
}

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

cancellata

species <- "cancellata"  # Specify the species for which to generate plots

# Load DESeq2 results for head and thorax
head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_results_Head_togregaria_", species, ".csv"))
thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_results_Thorax_togregaria_", species, ".csv"))

head_data <- read.csv(head_file, stringsAsFactors = FALSE)
thorax_data <- read.csv(thorax_file, stringsAsFactors = FALSE)

# Check if data is empty and handle accordingly
if (nrow(head_data) == 0 || nrow(thorax_data) == 0) {
    message(paste("No data for species:", species))
} else {
    # Filter for significant DEGs and select upregulated and downregulated genes
    head_up <- head_data %>%
        filter(padj < 0.05 & log2FoldChange > 1) %>%
        select(GeneID = X)

    head_down <- head_data %>%
        filter(padj < 0.05 & log2FoldChange < -1) %>%
        select(GeneID = X)

    thorax_up <- thorax_data %>%
        filter(padj < 0.05 & log2FoldChange > 1) %>%
        select(GeneID = X)

    thorax_down <- thorax_data %>%
        filter(padj < 0.05 & log2FoldChange < -1) %>%
        select(GeneID = X)

    # Prepare data for Venn diagram
    venn_data <- list(
        Head_Upregulated = head_up$GeneID,
        Head_Downregulated = head_down$GeneID,
        Thorax_Upregulated = thorax_up$GeneID,
        Thorax_Downregulated = thorax_down$GeneID
    )

    # Generate the four-way Venn diagram with specified colors and legend outside
    venn_plot <- venn.diagram(
        x = venn_data, 
        category.names = c("Head Upregulated", "Head Downregulated", "Thorax Upregulated", "Thorax Downregulated"), 
        filename = NULL, 
        output = TRUE, 
        fill = c("red", "skyblue", "orange", "blue"),  # Set colors for upregulated and downregulated
        alpha = 0.5, 
        cex = 1.2,  # Text size for numbers
        cat.cex = 0,  # Text size for category labels
        cat.pos = c(0, 0, 0, 0),  # Position to center labels
        cat.dist = c(0.1, 0.1, 0.1, 0.1),  # Distance between category labels and circles
        main = paste("Venn Diagram of DEGs for S.", species),
        main.cex = 1.2,  # Size of the main title
        cat.col = c("red", "skyblue", "orange", "blue")  # Color the category labels
    )

    # Clear the current plotting area before drawing the next Venn diagram
    grid.newpage()

    # Display the Venn diagram
    grid.draw(venn_plot)

    # Manually create a custom legend
    legend_labels <- c("Head Up", "Head Down", "Thorax Up", "Thorax Down")
    legend_colors <- c("red", "skyblue", "orange", "blue")

    # Positioning the legend lower on the right side of the plot
    legend_x <- unit(0.85, "npc")  # Adjust x position
    legend_y <- unit(0.2, "npc")    # Lower the legend vertically

    # Draw the legend
    for (i in seq_along(legend_labels)) {
        grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  width = unit(0.02, "npc"), height = unit(0.02, "npc"), 
                  gp = gpar(fill = legend_colors[i], col = NA))
        grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"), 
                  y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  just = "left", gp = gpar(cex = 0.8))
    }

    # Scatter plot for overlapping genes
    # Filter significant DEGs for both head and thorax
    head_sig_genes <- head_data %>%
        filter(padj < 0.05 & abs(log2FoldChange) > 1) %>%
        select(GeneID = X, log2FoldChange, padj) 

    thorax_sig_genes <- thorax_data %>%
        filter(padj < 0.05 & abs(log2FoldChange) > 1) %>%
        select(GeneID = X, log2FoldChange, padj)

    # Find overlapping genes based on GeneID
    overlapping_genes <- inner_join(head_sig_genes, thorax_sig_genes, by = "GeneID", suffix = c("_head", "_thorax"))

    # Save the overlapping genes to a CSV file
    output_file <- file.path(workDir, "overlap/Bulk_RNAseq", paste0("overlapping_genes_head_thorax_", species, ".csv"))
    write.csv(overlapping_genes, output_file, row.names = FALSE)

    # Plot overlapping genes with scatter plot
    p <- ggplot(overlapping_genes, aes(x = log2FoldChange_head, y = log2FoldChange_thorax)) +
        geom_point(aes(color = case_when(
            log2FoldChange_head > 0 & log2FoldChange_thorax > 0 ~ "Upregulated in Both",
            log2FoldChange_head < 0 & log2FoldChange_thorax < 0 ~ "Downregulated in Both",
            log2FoldChange_head > 0 & log2FoldChange_thorax < 0 ~ "Up in Head, Down in Thorax",
            log2FoldChange_head < 0 & log2FoldChange_thorax > 0 ~ "Down in Head, Up in Thorax"
        )), size = 3, alpha = 0.7) +
        geom_abline(slope = 1, intercept = 0, linetype = "dashed", color = "gray") +
        labs(
            x = "Log2 Fold Change (Head)", 
            y = "Log2 Fold Change (Thorax)", 
            color = "Regulation Pattern", 
            title = "Comparison of Log2 Fold Changes in Overlapping Genes", 
            subtitle = paste("Head vs. Thorax in", species)
        ) +
        theme_minimal() + 
        theme(
            plot.title = element_text(size = 16, face = "bold"), 
            plot.subtitle = element_text(size = 12, face = "italic"), 
            legend.position = "top"
        ) +
        scale_color_manual(values = c(
            "Upregulated in Both" = "red", 
            "Downregulated in Both" = "blue", 
            "Up in Head, Down in Thorax" = "purple", 
            "Down in Head, Up in Thorax" = "green"
        ))

    # Save the scatter plot
    ggsave(filename = file.path(workDir, "overlap/Bulk_RNAseq", paste0("scatter_plot_overlapping_genes_", species, ".png")), plot = p)

    # Display the scatter plot
    print(p)
}

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

americana

species <- "americana"  # Specify the species for which to generate plots

# Load DESeq2 results for head and thorax
head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_results_Head_togregaria_", species, ".csv"))
thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_results_Thorax_togregaria_", species, ".csv"))

head_data <- read.csv(head_file, stringsAsFactors = FALSE)
thorax_data <- read.csv(thorax_file, stringsAsFactors = FALSE)

# Check if data is empty and handle accordingly
if (nrow(head_data) == 0 || nrow(thorax_data) == 0) {
    message(paste("No data for species:", species))
} else {
    # Filter for significant DEGs and select upregulated and downregulated genes
    head_up <- head_data %>%
        filter(padj < 0.05 & log2FoldChange > 1) %>%
        select(GeneID = X)

    head_down <- head_data %>%
        filter(padj < 0.05 & log2FoldChange < -1) %>%
        select(GeneID = X)

    thorax_up <- thorax_data %>%
        filter(padj < 0.05 & log2FoldChange > 1) %>%
        select(GeneID = X)

    thorax_down <- thorax_data %>%
        filter(padj < 0.05 & log2FoldChange < -1) %>%
        select(GeneID = X)

    # Prepare data for Venn diagram
    venn_data <- list(
        Head_Upregulated = head_up$GeneID,
        Head_Downregulated = head_down$GeneID,
        Thorax_Upregulated = thorax_up$GeneID,
        Thorax_Downregulated = thorax_down$GeneID
    )

    # Generate the four-way Venn diagram with specified colors and legend outside
    venn_plot <- venn.diagram(
        x = venn_data, 
        category.names = c("Head Upregulated", "Head Downregulated", "Thorax Upregulated", "Thorax Downregulated"), 
        filename = NULL, 
        output = TRUE, 
        fill = c("red", "skyblue", "orange", "blue"),  # Set colors for upregulated and downregulated
        alpha = 0.5, 
        cex = 1.2,  # Text size for numbers
        cat.cex = 0,  # Zero text size hides the built-in category labels (a custom legend is drawn below)
        cat.pos = c(0, 0, 0, 0),  # Position to center labels
        cat.dist = c(0.1, 0.1, 0.1, 0.1),  # Distance between category labels and circles
        main = paste("Venn Diagram of DEGs for S.", species),
        main.cex = 1.2,  # Size of the main title
        cat.col = c("red", "skyblue", "orange", "blue")  # Color the category labels
    )

    # Clear the current plotting area before drawing the next Venn diagram
    grid.newpage()

    # Display the Venn diagram
    grid.draw(venn_plot)

    # Manually create a custom legend
    legend_labels <- c("Head Up", "Head Down", "Thorax Up", "Thorax Down")
    legend_colors <- c("red", "skyblue", "orange", "blue")

    # Positioning the legend lower on the right side of the plot
    legend_x <- unit(0.85, "npc")  # Adjust x position
    legend_y <- unit(0.2, "npc")    # Lower the legend vertically

    # Draw the legend
    for (i in seq_along(legend_labels)) {
        grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  width = unit(0.02, "npc"), height = unit(0.02, "npc"), 
                  gp = gpar(fill = legend_colors[i], col = NA))
        grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"), 
                  y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  just = "left", gp = gpar(cex = 0.8))
    }

    # Scatter plot for overlapping genes
    # Filter significant DEGs for both head and thorax
    head_sig_genes <- head_data %>%
        filter(padj < 0.05 & abs(log2FoldChange) > 1) %>%
        select(GeneID = X, log2FoldChange, padj) 

    thorax_sig_genes <- thorax_data %>%
        filter(padj < 0.05 & abs(log2FoldChange) > 1) %>%
        select(GeneID = X, log2FoldChange, padj)

    # Find overlapping genes based on GeneID
    overlapping_genes <- inner_join(head_sig_genes, thorax_sig_genes, by = "GeneID", suffix = c("_head", "_thorax"))

    # Save the overlapping genes to a CSV file
    output_file <- file.path(workDir, "overlap/Bulk_RNAseq", paste0("overlapping_genes_head_thorax_", species, ".csv"))
    write.csv(overlapping_genes, output_file, row.names = FALSE)

    # Plot overlapping genes with scatter plot
    p <- ggplot(overlapping_genes, aes(x = log2FoldChange_head, y = log2FoldChange_thorax)) +
        geom_point(aes(color = case_when(
            log2FoldChange_head > 0 & log2FoldChange_thorax > 0 ~ "Upregulated in Both",
            log2FoldChange_head < 0 & log2FoldChange_thorax < 0 ~ "Downregulated in Both",
            log2FoldChange_head > 0 & log2FoldChange_thorax < 0 ~ "Up in Head, Down in Thorax",
            log2FoldChange_head < 0 & log2FoldChange_thorax > 0 ~ "Down in Head, Up in Thorax"
        )), size = 3, alpha = 0.7) +
        geom_abline(slope = 1, intercept = 0, linetype = "dashed", color = "gray") +
        labs(
            x = "Log2 Fold Change (Head)", 
            y = "Log2 Fold Change (Thorax)", 
            color = "Regulation Pattern", 
            title = "Comparison of Log2 Fold Changes in Overlapping Genes", 
            subtitle = paste("Head vs. Thorax in", species)
        ) +
        theme_minimal() + 
        theme(
            plot.title = element_text(size = 16, face = "bold"), 
            plot.subtitle = element_text(size = 12, face = "italic"), 
            legend.position = "top"
        ) +
        scale_color_manual(values = c(
            "Upregulated in Both" = "red", 
            "Downregulated in Both" = "blue", 
            "Up in Head, Down in Thorax" = "purple", 
            "Down in Head, Up in Thorax" = "green"
        ))

    # Save the scatter plot
    ggsave(filename = file.path(workDir, "overlap/Bulk_RNAseq", paste0("scatter_plot_overlapping_genes_", species, ".png")), plot = p)

    # Display the scatter plot
    print(p)
}

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
8df3d7c	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19
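The up/down split used in each species chunk can be checked on toy data. Below is a minimal base-R sketch, assuming the same cutoffs (padj < 0.05, |log2FoldChange| > 1) and the `X` gene-ID column that `read.csv()` produces for an unnamed first column. The helper `split_degs()` and the toy values are hypothetical and not part of the original analysis. DESeq2 can report NA for padj, so the sketch uses `which()` to drop those rows, which is what `dplyr::filter()` does implicitly.

```r
# Hypothetical helper (not in the original analysis): split a DESeq2-style
# results table into upregulated and downregulated gene IDs.
split_degs <- function(res, id_col = "X", padj_cutoff = 0.05, lfc_cutoff = 1) {
  # which() discards rows where the comparison is NA (e.g. padj = NA)
  up   <- which(res$padj < padj_cutoff & res$log2FoldChange >  lfc_cutoff)
  down <- which(res$padj < padj_cutoff & res$log2FoldChange < -lfc_cutoff)
  list(up = res[[id_col]][up], down = res[[id_col]][down])
}

# Toy table standing in for a DESeq2 results CSV (values are made up)
toy <- data.frame(
  X = c("g1", "g2", "g3", "g4", "g5"),
  log2FoldChange = c(2.5, -1.8, 0.4, 3.0, -2.2),
  padj = c(0.01, 0.001, 0.02, NA, 0.2),
  stringsAsFactors = FALSE
)

toy_degs <- split_degs(toy)
print(toy_degs)
```

Here g3 fails the fold-change cutoff, g4 is dropped because its padj is NA, and g5 fails the padj cutoff.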

cubense

species <- "cubense"  # Specify the species for which to generate plots

# Load DESeq2 results for head and thorax
head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_results_Head_togregaria_", species, ".csv"))
thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_results_Thorax_togregaria_", species, ".csv"))

head_data <- read.csv(head_file, stringsAsFactors = FALSE)
thorax_data <- read.csv(thorax_file, stringsAsFactors = FALSE)

# Check if data is empty and handle accordingly
if (nrow(head_data) == 0 || nrow(thorax_data) == 0) {
    message(paste("No data for species:", species))
} else {
    # Filter for significant DEGs and select upregulated and downregulated genes
    head_up <- head_data %>%
        filter(padj < 0.05 & log2FoldChange > 1) %>%
        select(GeneID = X)

    head_down <- head_data %>%
        filter(padj < 0.05 & log2FoldChange < -1) %>%
        select(GeneID = X)

    thorax_up <- thorax_data %>%
        filter(padj < 0.05 & log2FoldChange > 1) %>%
        select(GeneID = X)

    thorax_down <- thorax_data %>%
        filter(padj < 0.05 & log2FoldChange < -1) %>%
        select(GeneID = X)

    # Prepare data for Venn diagram
    venn_data <- list(
        Head_Upregulated = head_up$GeneID,
        Head_Downregulated = head_down$GeneID,
        Thorax_Upregulated = thorax_up$GeneID,
        Thorax_Downregulated = thorax_down$GeneID
    )

    # Generate the four-way Venn diagram with specified colors and legend outside
    venn_plot <- venn.diagram(
        x = venn_data, 
        category.names = c("Head Upregulated", "Head Downregulated", "Thorax Upregulated", "Thorax Downregulated"), 
        filename = NULL, 
        output = TRUE, 
        fill = c("red", "skyblue", "orange", "blue"),  # Set colors for upregulated and downregulated
        alpha = 0.5, 
        cex = 1.2,  # Text size for numbers
        cat.cex = 0,  # Zero text size hides the built-in category labels (a custom legend is drawn below)
        cat.pos = c(0, 0, 0, 0),  # Position to center labels
        cat.dist = c(0.1, 0.1, 0.1, 0.1),  # Distance between category labels and circles
        main = paste("Venn Diagram of DEGs for S.", species),
        main.cex = 1.2,  # Size of the main title
        cat.col = c("red", "skyblue", "orange", "blue")  # Color the category labels
    )

    # Clear the current plotting area before drawing the next Venn diagram
    grid.newpage()

    # Display the Venn diagram
    grid.draw(venn_plot)

    # Manually create a custom legend
    legend_labels <- c("Head Up", "Head Down", "Thorax Up", "Thorax Down")
    legend_colors <- c("red", "skyblue", "orange", "blue")

    # Positioning the legend lower on the right side of the plot
    legend_x <- unit(0.85, "npc")  # Adjust x position
    legend_y <- unit(0.2, "npc")    # Lower the legend vertically

    # Draw the legend
    for (i in seq_along(legend_labels)) {
        grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  width = unit(0.02, "npc"), height = unit(0.02, "npc"), 
                  gp = gpar(fill = legend_colors[i], col = NA))
        grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"), 
                  y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  just = "left", gp = gpar(cex = 0.8))
    }

    # Scatter plot for overlapping genes
    # Filter significant DEGs for both head and thorax
    head_sig_genes <- head_data %>%
        filter(padj < 0.05 & abs(log2FoldChange) > 1) %>%
        select(GeneID = X, log2FoldChange, padj) 

    thorax_sig_genes <- thorax_data %>%
        filter(padj < 0.05 & abs(log2FoldChange) > 1) %>%
        select(GeneID = X, log2FoldChange, padj)

    # Find overlapping genes based on GeneID
    overlapping_genes <- inner_join(head_sig_genes, thorax_sig_genes, by = "GeneID", suffix = c("_head", "_thorax"))

    # Save the overlapping genes to a CSV file
    output_file <- file.path(workDir, "overlap/Bulk_RNAseq", paste0("overlapping_genes_head_thorax_", species, ".csv"))
    write.csv(overlapping_genes, output_file, row.names = FALSE)

    # Plot overlapping genes with scatter plot
    p <- ggplot(overlapping_genes, aes(x = log2FoldChange_head, y = log2FoldChange_thorax)) +
        geom_point(aes(color = case_when(
            log2FoldChange_head > 0 & log2FoldChange_thorax > 0 ~ "Upregulated in Both",
            log2FoldChange_head < 0 & log2FoldChange_thorax < 0 ~ "Downregulated in Both",
            log2FoldChange_head > 0 & log2FoldChange_thorax < 0 ~ "Up in Head, Down in Thorax",
            log2FoldChange_head < 0 & log2FoldChange_thorax > 0 ~ "Down in Head, Up in Thorax"
        )), size = 3, alpha = 0.7) +
        geom_abline(slope = 1, intercept = 0, linetype = "dashed", color = "gray") +
        labs(
            x = "Log2 Fold Change (Head)", 
            y = "Log2 Fold Change (Thorax)", 
            color = "Regulation Pattern", 
            title = "Comparison of Log2 Fold Changes in Overlapping Genes", 
            subtitle = paste("Head vs. Thorax in", species)
        ) +
        theme_minimal() + 
        theme(
            plot.title = element_text(size = 16, face = "bold"), 
            plot.subtitle = element_text(size = 12, face = "italic"), 
            legend.position = "top"
        ) +
        scale_color_manual(values = c(
            "Upregulated in Both" = "red", 
            "Downregulated in Both" = "blue", 
            "Up in Head, Down in Thorax" = "purple", 
            "Down in Head, Up in Thorax" = "green"
        ))

    # Save the scatter plot
    ggsave(filename = file.path(workDir, "overlap/Bulk_RNAseq", paste0("scatter_plot_overlapping_genes_", species, ".png")), plot = p)

    # Display the scatter plot
    print(p)
}

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

nitens

species <- "nitens"  # Specify the species for which to generate plots

# Load DESeq2 results for head and thorax
head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_results_Head_togregaria_", species, ".csv"))
thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_results_Thorax_togregaria_", species, ".csv"))

head_data <- read.csv(head_file, stringsAsFactors = FALSE)
thorax_data <- read.csv(thorax_file, stringsAsFactors = FALSE)

# Check if data is empty and handle accordingly
if (nrow(head_data) == 0 || nrow(thorax_data) == 0) {
    message(paste("No data for species:", species))
} else {
    # Filter for significant DEGs and select upregulated and downregulated genes
    head_up <- head_data %>%
        filter(padj < 0.05 & log2FoldChange > 1) %>%
        select(GeneID = X)

    head_down <- head_data %>%
        filter(padj < 0.05 & log2FoldChange < -1) %>%
        select(GeneID = X)

    thorax_up <- thorax_data %>%
        filter(padj < 0.05 & log2FoldChange > 1) %>%
        select(GeneID = X)

    thorax_down <- thorax_data %>%
        filter(padj < 0.05 & log2FoldChange < -1) %>%
        select(GeneID = X)

    # Prepare data for Venn diagram
    venn_data <- list(
        Head_Upregulated = head_up$GeneID,
        Head_Downregulated = head_down$GeneID,
        Thorax_Upregulated = thorax_up$GeneID,
        Thorax_Downregulated = thorax_down$GeneID
    )

    # Generate the four-way Venn diagram with specified colors and legend outside
    venn_plot <- venn.diagram(
        x = venn_data, 
        category.names = c("Head Upregulated", "Head Downregulated", "Thorax Upregulated", "Thorax Downregulated"), 
        filename = NULL, 
        output = TRUE, 
        fill = c("red", "skyblue", "orange", "blue"),  # Set colors for upregulated and downregulated
        alpha = 0.5, 
        cex = 1.2,  # Text size for numbers
        cat.cex = 0,  # Zero text size hides the built-in category labels (a custom legend is drawn below)
        cat.pos = c(0, 0, 0, 0),  # Position to center labels
        cat.dist = c(0.1, 0.1, 0.1, 0.1),  # Distance between category labels and circles
        main = paste("Venn Diagram of DEGs for S.", species),
        main.cex = 1.2,  # Size of the main title
        cat.col = c("red", "skyblue", "orange", "blue")  # Color the category labels
    )

    # Clear the current plotting area before drawing the next Venn diagram
    grid.newpage()

    # Display the Venn diagram
    grid.draw(venn_plot)

    # Manually create a custom legend
    legend_labels <- c("Head Up", "Head Down", "Thorax Up", "Thorax Down")
    legend_colors <- c("red", "skyblue", "orange", "blue")

    # Positioning the legend lower on the right side of the plot
    legend_x <- unit(0.85, "npc")  # Adjust x position
    legend_y <- unit(0.2, "npc")    # Lower the legend vertically

    # Draw the legend
    for (i in seq_along(legend_labels)) {
        grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  width = unit(0.02, "npc"), height = unit(0.02, "npc"), 
                  gp = gpar(fill = legend_colors[i], col = NA))
        grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"), 
                  y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  just = "left", gp = gpar(cex = 0.8))
    }

    # Scatter plot for overlapping genes
    # Filter significant DEGs for both head and thorax
    head_sig_genes <- head_data %>%
        filter(padj < 0.05 & abs(log2FoldChange) > 1) %>%
        select(GeneID = X, log2FoldChange, padj) 

    thorax_sig_genes <- thorax_data %>%
        filter(padj < 0.05 & abs(log2FoldChange) > 1) %>%
        select(GeneID = X, log2FoldChange, padj)

    # Find overlapping genes based on GeneID
    overlapping_genes <- inner_join(head_sig_genes, thorax_sig_genes, by = "GeneID", suffix = c("_head", "_thorax"))

    # Save the overlapping genes to a CSV file
    output_file <- file.path(workDir, "overlap/Bulk_RNAseq", paste0("overlapping_genes_head_thorax_", species, ".csv"))
    write.csv(overlapping_genes, output_file, row.names = FALSE)

    # Plot overlapping genes with scatter plot
    p <- ggplot(overlapping_genes, aes(x = log2FoldChange_head, y = log2FoldChange_thorax)) +
        geom_point(aes(color = case_when(
            log2FoldChange_head > 0 & log2FoldChange_thorax > 0 ~ "Upregulated in Both",
            log2FoldChange_head < 0 & log2FoldChange_thorax < 0 ~ "Downregulated in Both",
            log2FoldChange_head > 0 & log2FoldChange_thorax < 0 ~ "Up in Head, Down in Thorax",
            log2FoldChange_head < 0 & log2FoldChange_thorax > 0 ~ "Down in Head, Up in Thorax"
        )), size = 3, alpha = 0.7) +
        geom_abline(slope = 1, intercept = 0, linetype = "dashed", color = "gray") +
        labs(
            x = "Log2 Fold Change (Head)", 
            y = "Log2 Fold Change (Thorax)", 
            color = "Regulation Pattern", 
            title = "Comparison of Log2 Fold Changes in Overlapping Genes", 
            subtitle = paste("Head vs. Thorax in", species)
        ) +
        theme_minimal() + 
        theme(
            plot.title = element_text(size = 16, face = "bold"), 
            plot.subtitle = element_text(size = 12, face = "italic"), 
            legend.position = "top"
        ) +
        scale_color_manual(values = c(
            "Upregulated in Both" = "red", 
            "Downregulated in Both" = "blue", 
            "Up in Head, Down in Thorax" = "purple", 
            "Down in Head, Up in Thorax" = "green"
        ))

    # Save the scatter plot
    ggsave(filename = file.path(workDir, "overlap/Bulk_RNAseq", paste0("scatter_plot_overlapping_genes_", species, ".png")), plot = p)

    # Display the scatter plot
    print(p)
}

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
8df3d7c	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19
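The scatter plots colour each overlapping gene by the sign of its fold change in head and thorax. The following base-R sketch reproduces that four-way classification on made-up values and also reports the fraction of genes regulated in the same direction in both tissues. The function `classify_pattern()` and the toy vectors are hypothetical illustrations, not the author's code.

```r
# Hypothetical helper: label each gene by the sign of its log2 fold change
# in head and thorax, using the same four categories as the scatter plot.
classify_pattern <- function(lfc_head, lfc_thorax) {
  pattern <- rep(NA_character_, length(lfc_head))
  pattern[which(lfc_head > 0 & lfc_thorax > 0)] <- "Upregulated in Both"
  pattern[which(lfc_head < 0 & lfc_thorax < 0)] <- "Downregulated in Both"
  pattern[which(lfc_head > 0 & lfc_thorax < 0)] <- "Up in Head, Down in Thorax"
  pattern[which(lfc_head < 0 & lfc_thorax > 0)] <- "Down in Head, Up in Thorax"
  pattern
}

# Toy fold changes for four overlapping genes (values are made up)
toy_head   <- c(2.1, -1.5,  1.2, -3.0)
toy_thorax <- c(1.4, -2.2, -1.1,  1.9)

toy_pattern <- classify_pattern(toy_head, toy_thorax)

# Fraction of genes that change in the same direction in both tissues
concordant <- mean(toy_pattern %in% c("Upregulated in Both", "Downregulated in Both"))

print(table(toy_pattern))
print(concordant)
```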

3. Overlap DEGs among species

Locusts

Head tissues

# Define the species for Group 1
locusts <- c("gregaria", "piceifrons", "cancellata")

# Initialize an empty list to store DEG data
venn_data_locusts_up <- list()
venn_data_locusts_down <- list()
venn_data_locusts_all <- list()

# Function to load DEGs for a given group of species for head
load_deg_data <- function(species_list) {
  degs_up <- list()
  degs_down <- list()
  degs_all <- list()
  
  for (species in species_list) {  # Iterate over the function argument, not the global `locusts`
    head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_results_Head_togregaria_", species, ".csv"))
    
    head_data <- read.csv(head_file, stringsAsFactors = FALSE)
    
    # Check if data is empty and handle accordingly
    if (nrow(head_data) == 0) {
      message(paste("No data for species:", species))
      next  # Skip to the next species if there's no data
    }
    
    # Filter for significant DEGs (both upregulated and downregulated)
    head_up <- head_data %>%
      filter(padj < 0.05 & log2FoldChange >= 1) %>%
      select(GeneID = X)
    
    head_down <- head_data %>%
      filter(padj < 0.05 & log2FoldChange <= -1) %>%
      select(GeneID = X)
    
    all_deg <- head_data %>%
      filter(padj < 0.05 & abs(log2FoldChange) >= 1) %>%
      select(GeneID = X)

    # Store the DEGs in the list
    degs_up[[species]] <- head_up$GeneID
    degs_down[[species]] <- head_down$GeneID
    degs_all[[species]] <- all_deg$GeneID
  }
  
  return(list(up = degs_up, down = degs_down, all = degs_all))
}

# Load DEG data for Group 1 for head
venn_data_locusts <- load_deg_data(locusts)

# Prepare the data for the Venn diagrams
venn_data_up <- list(
  gregaria = venn_data_locusts$up[["gregaria"]],
  piceifrons = venn_data_locusts$up[["piceifrons"]],
  cancellata = venn_data_locusts$up[["cancellata"]]
)

venn_data_down <- list(
  gregaria = venn_data_locusts$down[["gregaria"]],
  piceifrons = venn_data_locusts$down[["piceifrons"]],
  cancellata = venn_data_locusts$down[["cancellata"]]
)

venn_data_all <- list(
  gregaria = venn_data_locusts$all[["gregaria"]],
  piceifrons = venn_data_locusts$all[["piceifrons"]],
  cancellata = venn_data_locusts$all[["cancellata"]]
)

# Function to display Venn diagram and corresponding datatable
display_venn_with_datatable <- function(venn_data, title, allspecies_df) {
  # Calculate the overlapping genes
  overlap_genes <- Reduce(intersect, venn_data)
  
  # Create a data frame for the overlapping genes
  overlap_df <- data.frame(GeneID = overlap_genes)

  # Merge to get species information
  meta_brock_df <- merge(overlap_df, allspecies_df, by = "GeneID", all.x = TRUE)

  # Generate the Venn diagram
  venn_plot <- venn.diagram(
    x = venn_data, 
    category.names = c("gregaria", "piceifrons", "cancellata"), 
    filename = NULL, 
    output = TRUE, 
    fill = c("orange", "red", "orchid"),  # Set colors for the groups
    alpha = 0.5, 
    cex = 1.2, 
    cat.cex = 0, 
    main = title,
    main.cex = 1.2
  )

  # Clear the current plotting area before drawing the Venn diagram
  grid.newpage()
  
  # Display the Venn diagram
  grid.draw(venn_plot)

  # Manually create a custom legend
  legend_labels <- c("gregaria", "piceifrons", "cancellata")
  legend_colors <- c("orange", "red", "orchid")

  # Positioning the legend lower on the right side of the plot
  legend_x <- unit(0.85, "npc")  # Adjust x position
  legend_y <- unit(0.2, "npc")   # Lower the legend vertically

  # Draw the legend
  for (i in seq_along(legend_labels)) {
    grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"),
              width = unit(0.02, "npc"), height = unit(0.02, "npc"),
              gp = gpar(fill = legend_colors[i], col = NA))
    grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"),
              y = legend_y - unit((i - 1) * 0.05, "npc"),
              just = "left", gp = gpar(cex = 0.8))
  }

  # Display the merged overlapping genes table with datatable
  datatable(meta_brock_df, options = list(
      pageLength = 10,
      scrollX = TRUE,
      autoWidth = TRUE,
      searchHighlight = TRUE
  ),
  rownames = FALSE,
  escape = FALSE
  ) %>%
  formatStyle(
      'Species', target = 'cell',
      fontStyle = 'italic'
  ) %>%
  formatStyle(
      columns = names(meta_brock_df), 
      target = 'row',
      color = styleEqual(c("red", "blue", "black"), c("red", "blue", "black")),
      fontWeight = styleEqual(c("bold", "normal"), c("bold", "normal")),
      backgroundColor = styleEqual(c("red", "blue", "black"), c("white", "white", "white"))
  )
}

# Display the Venn diagram and datatable for head upregulated DEGs
display_venn_with_datatable(venn_data_up, "Venn Diagram of Head Upregulated DEGs - Locusts", allspecies_df)

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

# Display the Venn diagram and datatable for head downregulated DEGs
display_venn_with_datatable(venn_data_down, "Venn Diagram of Head Downregulated DEGs - Locusts", allspecies_df)

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

# Display the Venn diagram and datatable for all significant DEGs
display_venn_with_datatable(venn_data_all, "Venn Diagram of All Significant DEGs - Locusts", allspecies_df)

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19
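The table shown next to each Venn diagram lists the genes returned by `Reduce(intersect, venn_data)`, that is, the central region shared by all three species. Below is a small base-R sketch of that computation on toy gene sets, together with the species-specific regions. The helper `unique_to()` and the gene IDs are hypothetical.

```r
# Toy gene sets standing in for the per-species DEG lists (IDs are made up)
toy_sets <- list(
  gregaria   = c("a", "b", "c", "d"),
  piceifrons = c("b", "c", "d", "e"),
  cancellata = c("c", "d", "f")
)

# Genes shared by every species: the central region of the Venn diagram
core_genes <- Reduce(intersect, toy_sets)

# Hypothetical helper: genes found in one species and in no other
unique_to <- function(sets, name) {
  others <- unlist(sets[names(sets) != name], use.names = FALSE)
  setdiff(sets[[name]], others)
}

species_specific <- lapply(
  setNames(names(toy_sets), names(toy_sets)),
  function(sp) unique_to(toy_sets, sp)
)

print(core_genes)
print(species_specific)
```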

# Define the species for Group 1
locusts <- c("gregaria", "piceifrons", "cancellata")

# Initialize an empty list to store heatmap data for each species
heatmap_list <- list()

# Loop through each species to process their data
for (species in locusts) {
  # Load DESeq2 results for head
  head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_sigresults_Head_togregaria_", species, ".csv"))
  
  # Load the data using fread() for memory efficiency
  head_data <- fread(head_file, data.table = FALSE)
  
  # Check if data is empty and handle accordingly
  if (nrow(head_data) == 0) {
    message(paste("No data for species:", species))
    next  # Skip to the next species if there's no data
  }
  
  # Filter significant DEGs first (reduces memory use in sorting)
  head_data_filtered <- head_data %>%
    filter(padj < 0.05, abs(log2FoldChange) > 1)  # Keep only strong up/downregulated DEGs
  
  # Select top 500 upregulated and top 500 downregulated genes
  head_up <- head_data_filtered %>%
    filter(log2FoldChange > 1) %>%
    arrange(desc(log2FoldChange)) %>%
    slice_head(n = 500)   # Keep up to 500 genes with the largest positive fold change
  
  head_down <- head_data_filtered %>%
    filter(log2FoldChange < -1) %>%
    arrange(log2FoldChange) %>%
    slice_head(n = 500)   # Keep up to 500 genes with the most negative fold change
  
  # Combine data for heatmap, adding the species column
  heatmap_data <- bind_rows(
    head_up %>% mutate(Tissue = "Head", Regulation = "Upregulated", Species = species),
    head_down %>% mutate(Tissue = "Head", Regulation = "Downregulated", Species = species)
  ) %>%
    select(GeneID, log2FoldChange, Tissue, Regulation, Species)
  
  # Append the heatmap data to the list
  heatmap_list[[species]] <- heatmap_data
}

# Combine all species data into a single dataframe for heatmap matrix preparation
final_heatmap_data <- bind_rows(heatmap_list)

# Check if final_heatmap_data is empty before proceeding
if (nrow(final_heatmap_data) == 0) {
  stop("No valid data available for heatmap generation.")
}

# **Fix duplicate GeneIDs: Aggregate log2FoldChange by taking the mean**
final_heatmap_data <- final_heatmap_data %>%
  group_by(GeneID, Species) %>%
  summarise(log2FoldChange = mean(log2FoldChange, na.rm = TRUE), .groups = "drop")

# **Create heatmap matrix without duplicates**
heatmap_matrix <- final_heatmap_data %>%
  pivot_wider(names_from = Species, values_from = log2FoldChange, values_fill = 0) %>%
  column_to_rownames("GeneID") %>%
  as.matrix()

# Define color palettes
# Define a custom color gradient where 0 is black
custom_color_palette1 <- colorRampPalette(c("cyan", "cyan3", "black", "orange3", "orange"))(100)

# Define a custom color gradient where 0 is white
custom_color_palette2 <- colorRampPalette(c("blue3", "blue", "white", "red", "red3"))(100)

# Define symmetric color breaks so that the palette midpoint (black or white) sits at 0
max_abs_lfc <- max(abs(heatmap_matrix), na.rm = TRUE)  # Get max absolute log2FoldChange
color_breaks <- seq(-max_abs_lfc, max_abs_lfc, length.out = 101)  # Symmetric scale; pheatmap expects one more break than colors (100 colors)

# Create heatmap with the white-centered palette, clustering genes only
pheatmap(
  heatmap_matrix,
  color = custom_color_palette2,
  breaks = color_breaks,
  cluster_rows = TRUE,  # Cluster genes
  cluster_cols = FALSE,  # Do not cluster species
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Gene Expression in Head Tissue - STRATEGY 1"
)

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19
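The heatmap matrix is built by pivoting the long table to one column per species and filling missing gene and species combinations with 0. A zero in the heatmap therefore means "not among the selected DEGs for that species", not a measured fold change of zero. Here is a base-R sketch of the same reshaping on toy data; the object names and values are hypothetical.

```r
# Toy long-format table: one row per gene and species (values are made up)
toy_long <- data.frame(
  GeneID = c("g1", "g1", "g2", "g3"),
  Species = c("gregaria", "piceifrons", "gregaria", "cancellata"),
  log2FoldChange = c(2.0, 1.5, -3.0, 4.0),
  stringsAsFactors = FALSE
)

genes <- sort(unique(toy_long$GeneID))
species_order <- c("gregaria", "piceifrons", "cancellata")

# Start from an all-zero matrix, which plays the role of values_fill = 0
toy_matrix <- matrix(
  0,
  nrow = length(genes), ncol = length(species_order),
  dimnames = list(genes, species_order)
)

# Fill in observed values by (row name, column name) pairs
toy_matrix[cbind(toy_long$GeneID, toy_long$Species)] <- toy_long$log2FoldChange

print(toy_matrix)
```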

# Create the same heatmap with the black-centered palette (species columns still unclustered)
pheatmap(
  heatmap_matrix,
  color = custom_color_palette1,
  breaks = color_breaks,
  cluster_rows = TRUE,  
  cluster_cols = FALSE,  
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Gene Expression in Head Tissue - STRATEGY 1"
)

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19
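For a diverging palette to be centred on zero, the breaks need to be symmetric around 0, and pheatmap documents `breaks` as being one element longer than the colour vector. The sketch below uses base R only, with a made-up matrix, to show how such breaks can be built for a 100-colour palette. The object names are hypothetical.

```r
# Diverging palette with white in the middle (same colours as palette 2 above)
n_colors <- 100
toy_palette <- colorRampPalette(c("blue3", "blue", "white", "red", "red3"))(n_colors)

# Toy log2 fold-change matrix (values are made up)
toy_lfc <- matrix(c(-3.2, 0, 1.5, 2.8), nrow = 2)

# Symmetric limits so that 0 falls at the middle of the scale
max_abs <- max(abs(toy_lfc), na.rm = TRUE)

# n_colors + 1 break points define n_colors intervals, one per colour
toy_breaks <- seq(-max_abs, max_abs, length.out = n_colors + 1)

print(range(toy_breaks))
print(length(toy_breaks) - length(toy_palette))
```

With an even number of colours, 0 falls on the break between the two central colours, so small negative and small positive values get the two near-white shades on either side.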

Thorax tissues

# Define the species for Group 1
locusts <- c("gregaria", "piceifrons", "cancellata")

# Initialize an empty list to store DEG data
venn_data_locusts_up <- list()
venn_data_locusts_down <- list()
venn_data_locusts_all <- list()

# Function to load DEGs for a given group of species for thorax
load_deg_data <- function(species_list) {
  degs_up <- list()
  degs_down <- list()
  degs_all <- list()
  
  for (species in species_list) {  # Iterate over the function argument, not the global `locusts`
    thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_results_Thorax_togregaria_", species, ".csv"))
    
    thorax_data <- read.csv(thorax_file, stringsAsFactors = FALSE)
    
    # Check if data is empty and handle accordingly
    if (nrow(thorax_data) == 0) {
      message(paste("No data for species:", species))
      next  # Skip to the next species if there's no data
    }
    
    # Filter for significant DEGs (both upregulated and downregulated)
    thorax_up <- thorax_data %>%
      filter(padj < 0.05 & log2FoldChange >= 1) %>%
      select(GeneID = X)
    
    thorax_down <- thorax_data %>%
      filter(padj < 0.05 & log2FoldChange <= -1) %>%
      select(GeneID = X)
    
    all_deg <- thorax_data %>%
      filter(padj < 0.05 & abs(log2FoldChange) >= 1) %>%
      select(GeneID = X)

    # Store the DEGs in the list
    degs_up[[species]] <- thorax_up$GeneID
    degs_down[[species]] <- thorax_down$GeneID
    degs_all[[species]] <- all_deg$GeneID
  }
  
  return(list(up = degs_up, down = degs_down, all = degs_all))
}

# Load DEG data for Group 1 for thorax
venn_data_locusts <- load_deg_data(locusts)

# Prepare the data for the Venn diagrams
venn_data_up <- list(
  gregaria = venn_data_locusts$up[["gregaria"]],
  piceifrons = venn_data_locusts$up[["piceifrons"]],
  cancellata = venn_data_locusts$up[["cancellata"]]
)

venn_data_down <- list(
  gregaria = venn_data_locusts$down[["gregaria"]],
  piceifrons = venn_data_locusts$down[["piceifrons"]],
  cancellata = venn_data_locusts$down[["cancellata"]]
)

venn_data_all <- list(
  gregaria = venn_data_locusts$all[["gregaria"]],
  piceifrons = venn_data_locusts$all[["piceifrons"]],
  cancellata = venn_data_locusts$all[["cancellata"]]
)

# Function to display Venn diagram and corresponding datatable
display_venn_with_datatable <- function(venn_data, title, allspecies_df) {
  # Calculate the overlapping genes
  overlap_genes <- Reduce(intersect, venn_data)
  
  # Create a data frame for the overlapping genes
  overlap_df <- data.frame(GeneID = overlap_genes)

  # Merge to get species information
  meta_brock_df <- merge(overlap_df, allspecies_df, by = "GeneID", all.x = TRUE)

  # Generate the Venn diagram
  venn_plot <- venn.diagram(
    x = venn_data, 
    category.names = c("gregaria", "piceifrons", "cancellata"), 
    filename = NULL, 
    output = TRUE, 
    fill = c("orange", "red", "orchid"),  # Set colors for the groups
    alpha = 0.5, 
    cex = 1.2, 
    cat.cex = 0, 
    main = title,
    main.cex = 1.2
  )

  # Clear the current plotting area before drawing the Venn diagram
  grid.newpage()
  
  # Display the Venn diagram
  grid.draw(venn_plot)

  # Manually create a custom legend
  legend_labels <- c("gregaria", "piceifrons", "cancellata")
  legend_colors <- c("orange", "red", "orchid")

  # Positioning the legend lower on the right side of the plot
  legend_x <- unit(0.85, "npc")  # Adjust x position
  legend_y <- unit(0.2, "npc")   # Lower the legend vertically

  # Draw the legend
  for (i in seq_along(legend_labels)) {
    grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"),
              width = unit(0.02, "npc"), height = unit(0.02, "npc"),
              gp = gpar(fill = legend_colors[i], col = NA))
    grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"),
              y = legend_y - unit((i - 1) * 0.05, "npc"),
              just = "left", gp = gpar(cex = 0.8))
  }

  # Display the merged overlapping genes table with datatable
  datatable(meta_brock_df, options = list(
      pageLength = 10,
      scrollX = TRUE,
      autoWidth = TRUE,
      searchHighlight = TRUE
  ),
  rownames = FALSE,
  escape = FALSE
  ) %>%
  formatStyle(
      'Species', target = 'cell',
      fontStyle = 'italic'
  ) %>%
  formatStyle(
      columns = names(meta_brock_df), 
      target = 'row',
      color = styleEqual(c("red", "blue", "black"), c("red", "blue", "black")),
      fontWeight = styleEqual(c("bold", "normal"), c("bold", "normal")),
      backgroundColor = styleEqual(c("red", "blue", "black"), c("white", "white", "white"))
  )
}

# Display the Venn diagram and datatable for thorax upregulated DEGs
display_venn_with_datatable(venn_data_up, "Venn Diagram of Thorax Upregulated DEGs - Locusts", allspecies_df)

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

# Display the Venn diagram and datatable for thorax downregulated DEGs
display_venn_with_datatable(venn_data_down, "Venn Diagram of Thorax Downregulated DEGs - Locusts", allspecies_df)

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

# Display the Venn diagram and datatable for all significant DEGs
display_venn_with_datatable(venn_data_all, "Venn Diagram of All Significant DEGs - Locusts", allspecies_df)

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
8df3d7c	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19
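The head and thorax loaders differ only in the tissue name embedded in the file path, so one possible refactor is a single loader that takes the tissue as an argument. The sketch below is a hypothetical base-R version, not the author's function. It assumes the file naming pattern used above and writes a toy CSV to a temporary directory so that it can run on its own. `write.csv()` stores row names in an unnamed first column, which `read.csv()` reads back as `X`.

```r
# Hypothetical tissue-aware loader: returns all significant DEG IDs per species
load_degs <- function(species_list, tissue, results_dir,
                      padj_cutoff = 0.05, lfc_cutoff = 1) {
  degs <- list()
  for (species in species_list) {
    path <- file.path(
      results_dir,
      paste0("DESeq2_results_", tissue, "_togregaria_", species, ".csv")
    )
    res <- read.csv(path, stringsAsFactors = FALSE)
    # which() drops rows whose padj or fold change is NA
    keep <- which(res$padj < padj_cutoff & abs(res$log2FoldChange) >= lfc_cutoff)
    degs[[species]] <- res$X[keep]
  }
  degs
}

# Toy results file in a temporary directory (values are made up)
demo_dir <- file.path(tempdir(), "deg_demo")
dir.create(demo_dir, showWarnings = FALSE)
toy_res <- data.frame(
  log2FoldChange = c(2.0, -0.5, -1.5),
  padj = c(0.01, 0.01, 0.04),
  row.names = c("g1", "g2", "g3")
)
write.csv(toy_res, file.path(demo_dir, "DESeq2_results_Thorax_togregaria_gregaria.csv"))

thorax_degs <- load_degs("gregaria", "Thorax", demo_dir)
print(thorax_degs)
```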

# Initialize an empty list to store heatmap data for each species
heatmap_list <- list()

# Loop through each species to process their data
for (species in locusts) {
  # Load DESeq2 results for thorax
  thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_sigresults_Thorax_togregaria_", species, ".csv"))
  
  # Load the data using fread() for memory efficiency
  thorax_data <- fread(thorax_file, data.table = FALSE)
  
  # Check if data is empty and handle accordingly
  if (nrow(thorax_data) == 0) {
    message(paste("No data for species:", species))
    next  # Skip to the next species if there's no data
  }
  
  # Filter significant DEGs first (reduces memory use in sorting)
  thorax_data_filtered <- thorax_data %>%
    filter(padj < 0.05, abs(log2FoldChange) > 1)  # Keep only strong up/downregulated DEGs
  
  # Select top 500 upregulated and top 500 downregulated genes
  thorax_up <- thorax_data_filtered %>%
    filter(log2FoldChange > 1) %>%
    arrange(desc(log2FoldChange)) %>%
    slice_head(n = 500)   # More memory-efficient than slice(1:500)
  
  thorax_down <- thorax_data_filtered %>%
    filter(log2FoldChange < -1) %>%
    arrange(log2FoldChange) %>%
    slice_head(n = 500)   # More memory-efficient than slice(1:500)
  
  # Combine data for heatmap, adding the species column
  heatmap_data <- bind_rows(
    thorax_up %>% mutate(Tissue = "Thorax", Regulation = "Upregulated", Species = species),
    thorax_down %>% mutate(Tissue = "Thorax", Regulation = "Downregulated", Species = species)
  ) %>%
    select(GeneID, log2FoldChange, Tissue, Regulation, Species)
  
  # Append the heatmap data to the list
  heatmap_list[[species]] <- heatmap_data
}

# Combine all species data into a single dataframe for heatmap matrix preparation
final_heatmap_data <- bind_rows(heatmap_list)

# Check if final_heatmap_data is empty before proceeding
if (nrow(final_heatmap_data) == 0) {
  stop("No valid data available for heatmap generation.")
}

# Fix duplicate GeneIDs: aggregate log2FoldChange by taking the mean
final_heatmap_data <- final_heatmap_data %>%
  group_by(GeneID, Species) %>%
  summarise(log2FoldChange = mean(log2FoldChange, na.rm = TRUE), .groups = "drop")

# Create heatmap matrix without duplicates
heatmap_matrix <- final_heatmap_data %>%
  pivot_wider(names_from = Species, values_from = log2FoldChange, values_fill = 0) %>%
  column_to_rownames("GeneID") %>%
  as.matrix()

# Define color palettes
# Define a custom color gradient where 0 is black
custom_color_palette1 <- colorRampPalette(c("cyan", "cyan3", "black", "orange3", "orange"))(100)

# Define a custom color gradient where 0 is white
custom_color_palette2 <- colorRampPalette(c("blue3", "blue", "white", "red", "red3"))(100)

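# Define symmetric color breaks centered on 0.
# pheatmap expects one more break than colors, hence 101 breaks for 100 colors.
max_abs_lfc <- max(abs(heatmap_matrix), na.rm = TRUE)  # Get max absolute log2FoldChange
color_breaks <- seq(-max_abs_lfc, max_abs_lfc, length.out = 101)  # Symmetric scale

# Create heatmap with the blue-white-red palette (genes clustered, species order fixed)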
pheatmap(
  heatmap_matrix,
  color = custom_color_palette2,
  breaks = color_breaks,
  cluster_rows = TRUE,  # Cluster genes
  cluster_cols = FALSE,  # Do not cluster species
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Gene Expression in Thorax Tissue - STRATEGY 1"
)

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

# Create the same heatmap with the cyan-black-orange palette
pheatmap(
  heatmap_matrix,
  color = custom_color_palette1,
  breaks = color_breaks,
  cluster_rows = TRUE,  
  cluster_cols = FALSE,  
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Gene Expression in Thorax Tissue - STRATEGY 1"
)

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
8df3d7c	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19
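
The symmetric color scale used for these heatmaps can be checked in isolation. The sketch below is illustrative only: `make_symmetric_scale()` and `toy_lfc` are hypothetical names that do not appear in the analysis. It relies on the documented pheatmap convention that `breaks` is one element longer than `color`, so that each color gets its own bin.

```r
# Hypothetical helper: build a diverging palette and matching symmetric breaks.
make_symmetric_scale <- function(values, colours, n_colours = 100) {
  max_abs <- max(abs(values), na.rm = TRUE)
  list(
    palette = colorRampPalette(colours)(n_colours),
    # n_colours + 1 break points delimit exactly n_colours bins
    breaks = seq(-max_abs, max_abs, length.out = n_colours + 1)
  )
}

# Toy log2 fold changes (made up for illustration)
toy_lfc <- c(-3.2, -1.5, 0, 2.1, 4.0)
lfc_scale <- make_symmetric_scale(
  toy_lfc,
  c("blue3", "blue", "white", "red", "red3")
)

length(lfc_scale$palette)  # number of colors
length(lfc_scale$breaks)   # number of break points
```

With an even number of colors, zero falls on the boundary between the two middle bins, so the palette midpoint lines up with a log2 fold change of 0.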

piceifrons-americana-cubense

Head tissues

PACclade <- c("piceifrons", "americana", "cubense")

# Initialize an empty list to store DEG data
venn_data_PACclade_up <- list()
venn_data_PACclade_down <- list()
venn_data_PACclade_all <- list()

# Function to load DEGs for a given group of species for head
load_deg_data <- function(species_list) {
  degs_up <- list()
  degs_down <- list()
  degs_all <- list()
  
  for (species in species_list) {
    head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_results_Head_togregaria_", species, ".csv"))
    
    head_data <- read.csv(head_file, stringsAsFactors = FALSE)
    
    # Check if data is empty and handle accordingly
    if (nrow(head_data) == 0) {
      message(paste("No data for species:", species))
      next  # Skip to the next species if there's no data
    }
    
    # Filter for significant DEGs (both upregulated and downregulated)
    head_up <- head_data %>%
      filter(padj < 0.05 & log2FoldChange >= 1) %>%
      select(GeneID = X)
    
    head_down <- head_data %>%
      filter(padj < 0.05 & log2FoldChange <= -1) %>%
      select(GeneID = X)
    
    all_deg <- head_data %>%
      filter(padj < 0.05 & abs(log2FoldChange) >= 1) %>%
      select(GeneID = X)

    # Store the DEGs in the list
    degs_up[[species]] <- head_up$GeneID
    degs_down[[species]] <- head_down$GeneID
    degs_all[[species]] <- all_deg$GeneID
  }
  
  return(list(up = degs_up, down = degs_down, all = degs_all))
}

# Load head DEG data for the PACclade
venn_data_PACclade <- load_deg_data(PACclade)

# Prepare the data for the Venn diagrams
venn_data_up <- list(
  piceifrons = venn_data_PACclade$up[["piceifrons"]],
  americana = venn_data_PACclade$up[["americana"]],
  cubense = venn_data_PACclade$up[["cubense"]]
)

venn_data_down <- list(
  piceifrons = venn_data_PACclade$down[["piceifrons"]],
  americana = venn_data_PACclade$down[["americana"]],
  cubense = venn_data_PACclade$down[["cubense"]]
)

venn_data_all <- list(
  piceifrons = venn_data_PACclade$all[["piceifrons"]],
  americana = venn_data_PACclade$all[["americana"]],
  cubense = venn_data_PACclade$all[["cubense"]]
)

# Function to display Venn diagram and corresponding datatable
display_venn_with_datatable <- function(venn_data, title, allspecies_df) {
  # Calculate the overlapping genes
  overlap_genes <- Reduce(intersect, venn_data)
  
  # Create a data frame for the overlapping genes
  overlap_df <- data.frame(GeneID = overlap_genes)

  # Merge to get species information
  meta_brock_df <- merge(overlap_df, allspecies_df, by = "GeneID", all.x = TRUE)

  # Generate the Venn diagram
  venn_plot <- venn.diagram(
    x = venn_data, 
    category.names = c("piceifrons", "americana", "cubense"), 
    filename = NULL, 
    output = TRUE, 
    fill = c("red", "green", "yellow"),  # Set colors for the groups
    alpha = 0.5, 
    cex = 1.2, 
    cat.cex = 0, 
    main = title,
    main.cex = 1.2
  )

  # Clear the current plotting area before drawing the Venn diagram
  grid.newpage()
  
  # Display the Venn diagram
  grid.draw(venn_plot)

  # Manually create a custom legend
  legend_labels <- c("piceifrons", "americana", "cubense")
  legend_colors <- c("red", "green", "yellow")

  # Position the legend in the lower right of the plot
  legend_x <- unit(0.85, "npc")  # Horizontal position
  legend_y <- unit(0.2, "npc")   # Vertical position of the first entry

  # Draw one colored square and label per species
  for (i in seq_along(legend_labels)) {
    grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"),
              width = unit(0.02, "npc"), height = unit(0.02, "npc"),
              gp = gpar(fill = legend_colors[i], col = NA))
    grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"),
              y = legend_y - unit((i - 1) * 0.05, "npc"),
              just = "left", gp = gpar(cex = 0.8))
  }

  # Display the merged overlapping genes table with datatable
  datatable(meta_brock_df, options = list(
      pageLength = 10,
      scrollX = TRUE,
      autoWidth = TRUE,
      searchHighlight = TRUE
  ),
  rownames = FALSE,
  escape = FALSE
  ) %>%
  formatStyle(
      'Species', target = 'cell',
      fontStyle = 'italic'
  ) %>%
  formatStyle(
      columns = names(meta_brock_df), 
      target = 'row',
      color = styleEqual(c("red", "blue", "black"), c("red", "blue", "black")),
      fontWeight = styleEqual(c("bold", "normal"), c("bold", "normal")),
      backgroundColor = styleEqual(c("red", "blue", "black"), c("white", "white", "white"))
  )
}

# Display the Venn diagram and datatable for head upregulated DEGs
display_venn_with_datatable(venn_data_up, "Venn Diagram of Head Upregulated DEGs - PACclade", allspecies_df)

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

# Display the Venn diagram and datatable for head downregulated DEGs
display_venn_with_datatable(venn_data_down, "Venn Diagram of Head Downregulated DEGs - PACclade", allspecies_df)

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

# Display the Venn diagram and datatable for all significant DEGs
display_venn_with_datatable(venn_data_all, "Venn Diagram of All Significant DEGs - PACclade", allspecies_df)

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19
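
The table under each Venn diagram lists the genes returned by `Reduce(intersect, venn_data)`, that is, the genes present in every set. A minimal sketch with made-up gene IDs (the `toy_sets` list and its contents are hypothetical) shows the full intersection next to the pairwise overlap sizes, which can help cross-check the counts drawn in the diagram:

```r
# Toy DEG sets, one character vector of gene IDs per species (made up)
toy_sets <- list(
  piceifrons = c("g1", "g2", "g3", "g4"),
  americana  = c("g2", "g3", "g5"),
  cubense    = c("g3", "g4", "g6")
)

# Genes shared by all three sets (the central region of the Venn diagram)
shared_all <- Reduce(intersect, toy_sets)

# Overlap size for every pair of species
pair_names <- combn(names(toy_sets), 2, paste, collapse = "-")
pair_sizes <- combn(names(toy_sets), 2, function(p) {
  length(intersect(toy_sets[[p[1]]], toy_sets[[p[2]]]))
})
names(pair_sizes) <- pair_names

print(shared_all)
print(pair_sizes)
```

Each pairwise size includes the genes shared by all three species, so it corresponds to the sum of two regions in the diagram rather than a single one.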

# Define the species for the PACclade
PACclade <- c("piceifrons", "americana", "cubense")

# Initialize an empty list to store heatmap data for each species
heatmap_list <- list()

# Loop through each species to process their data
for (species in PACclade) {
  # Load DESeq2 results for head
  head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_sigresults_Head_togregaria_", species, ".csv"))
  
  # Load the data using fread() for memory efficiency
  head_data <- fread(head_file, data.table = FALSE)
  
  # Check if data is empty and handle accordingly
  if (nrow(head_data) == 0) {
    message(paste("No data for species:", species))
    next  # Skip to the next species if there's no data
  }
  
  # Filter significant DEGs first (reduces memory use in sorting)
  head_data_filtered <- head_data %>%
    filter(padj < 0.05, abs(log2FoldChange) > 1)  # Keep only strong up/downregulated DEGs
  
  # Select top 500 upregulated and top 500 downregulated genes
  head_up <- head_data_filtered %>%
    filter(log2FoldChange > 1) %>%
    arrange(desc(log2FoldChange)) %>%
    slice_head(n = 500)   # More memory-efficient than slice(1:500)
  
  head_down <- head_data_filtered %>%
    filter(log2FoldChange < -1) %>%
    arrange(log2FoldChange) %>%
    slice_head(n = 500)   # More memory-efficient than slice(1:500)
  
  # Combine data for heatmap, adding the species column
  heatmap_data <- bind_rows(
    head_up %>% mutate(Tissue = "Head", Regulation = "Upregulated", Species = species),
    head_down %>% mutate(Tissue = "Head", Regulation = "Downregulated", Species = species)
  ) %>%
    select(GeneID, log2FoldChange, Tissue, Regulation, Species)
  
  # Append the heatmap data to the list
  heatmap_list[[species]] <- heatmap_data
}

# Combine all species data into a single dataframe for heatmap matrix preparation
final_heatmap_data <- bind_rows(heatmap_list)

# Check if final_heatmap_data is empty before proceeding
if (nrow(final_heatmap_data) == 0) {
  stop("No valid data available for heatmap generation.")
}

# Fix duplicate GeneIDs: aggregate log2FoldChange by taking the mean
final_heatmap_data <- final_heatmap_data %>%
  group_by(GeneID, Species) %>%
  summarise(log2FoldChange = mean(log2FoldChange, na.rm = TRUE), .groups = "drop")

# Create heatmap matrix without duplicates
heatmap_matrix <- final_heatmap_data %>%
  pivot_wider(names_from = Species, values_from = log2FoldChange, values_fill = 0) %>%
  column_to_rownames("GeneID") %>%
  as.matrix()

# Define color palettes
# Define a custom color gradient where 0 is black
custom_color_palette1 <- colorRampPalette(c("cyan", "cyan3", "black", "orange3", "orange"))(100)

# Define a custom color gradient where 0 is white
custom_color_palette2 <- colorRampPalette(c("blue3", "blue", "white", "red", "red3"))(100)

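# Define symmetric color breaks centered on 0.
# pheatmap expects one more break than colors, hence 101 breaks for 100 colors.
max_abs_lfc <- max(abs(heatmap_matrix), na.rm = TRUE)  # Get max absolute log2FoldChange
color_breaks <- seq(-max_abs_lfc, max_abs_lfc, length.out = 101)  # Symmetric scale

# Create heatmap with the blue-white-red palette (genes clustered, species order fixed)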
pheatmap(
  heatmap_matrix,
  color = custom_color_palette2,
  breaks = color_breaks,
  cluster_rows = TRUE,  # Cluster genes
  cluster_cols = FALSE,  # Do not cluster species
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Gene Expression in Head Tissue - STRATEGY 1"
)

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

# Create the same heatmap with the cyan-black-orange palette
pheatmap(
  heatmap_matrix,
  color = custom_color_palette1,
  breaks = color_breaks,
  cluster_rows = TRUE,  
  cluster_cols = FALSE,  
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Gene Expression in Head Tissue - STRATEGY 1"
)

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19
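
One consequence of `values_fill = 0` is worth keeping in mind when reading these heatmaps: a zero cell means the gene was not among the selected DEGs for that species, not that its log2 fold change was measured as zero. The base R sketch below, using a hypothetical `toy_long` table, builds the same kind of gene-by-species matrix with `NA` first, so the number of filled-in cells can be counted before they are replaced by 0.

```r
# Toy long-format table (made up): one row per gene and species
toy_long <- data.frame(
  GeneID = c("g1", "g1", "g2", "g3"),
  Species = c("piceifrons", "americana", "piceifrons", "cubense"),
  log2FoldChange = c(2.0, 1.5, -3.0, 1.2),
  stringsAsFactors = FALSE
)

genes <- sort(unique(toy_long$GeneID))
species_names <- unique(toy_long$Species)

# Start from an all-NA matrix so absent gene/species pairs stay visible
toy_matrix <- matrix(
  NA_real_,
  nrow = length(genes), ncol = length(species_names),
  dimnames = list(genes, species_names)
)
toy_matrix[cbind(toy_long$GeneID, toy_long$Species)] <- toy_long$log2FoldChange

# Count the cells that a zero fill would invent
n_filled <- sum(is.na(toy_matrix))

# Equivalent of values_fill = 0
filled_matrix <- toy_matrix
filled_matrix[is.na(filled_matrix)] <- 0

print(n_filled)
print(filled_matrix)
```

If a large share of the matrix is filled this way, the row clustering is driven mostly by presence or absence in each species' DEG list.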

Thorax tissues

# Define the species for PACclade
PACclade <- c("piceifrons", "americana", "cubense")

# Initialize an empty list to store DEG data
venn_data_PACclade_up <- list()
venn_data_PACclade_down <- list()
venn_data_PACclade_all <- list()

# Function to load DEGs for a given group of species for thorax
load_deg_data <- function(species_list) {
  degs_up <- list()
  degs_down <- list()
  degs_all <- list()
  
  for (species in species_list) {
    thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_results_Thorax_togregaria_", species, ".csv"))
    
    thorax_data <- read.csv(thorax_file, stringsAsFactors = FALSE)
    
    # Check if data is empty and handle accordingly
    if (nrow(thorax_data) == 0) {
      message(paste("No data for species:", species))
      next  # Skip to the next species if there's no data
    }
    
    # Filter for significant DEGs (both upregulated and downregulated)
    thorax_up <- thorax_data %>%
      filter(padj < 0.05 & log2FoldChange >= 1) %>%
      select(GeneID = X)
    
    thorax_down <- thorax_data %>%
      filter(padj < 0.05 & log2FoldChange <= -1) %>%
      select(GeneID = X)
    
    all_deg <- thorax_data %>%
      filter(padj < 0.05 & abs(log2FoldChange) >= 1) %>%
      select(GeneID = X)

    # Store the DEGs in the list
    degs_up[[species]] <- thorax_up$GeneID
    degs_down[[species]] <- thorax_down$GeneID
    degs_all[[species]] <- all_deg$GeneID
  }
  
  return(list(up = degs_up, down = degs_down, all = degs_all))
}

# Load thorax DEG data for the PACclade
venn_data_PACclade <- load_deg_data(PACclade)

# Prepare the data for the Venn diagrams
venn_data_up <- list(
  piceifrons = venn_data_PACclade$up[["piceifrons"]],
  americana = venn_data_PACclade$up[["americana"]],
  cubense = venn_data_PACclade$up[["cubense"]]
)

venn_data_down <- list(
  piceifrons = venn_data_PACclade$down[["piceifrons"]],
  americana = venn_data_PACclade$down[["americana"]],
  cubense = venn_data_PACclade$down[["cubense"]]
)

venn_data_all <- list(
  piceifrons = venn_data_PACclade$all[["piceifrons"]],
  americana = venn_data_PACclade$all[["americana"]],
  cubense = venn_data_PACclade$all[["cubense"]]
)

# Function to display Venn diagram and corresponding datatable
display_venn_with_datatable <- function(venn_data, title, allspecies_df) {
  # Calculate the overlapping genes
  overlap_genes <- Reduce(intersect, venn_data)
  
  # Create a data frame for the overlapping genes
  overlap_df <- data.frame(GeneID = overlap_genes)

  # Merge to get species information
  meta_brock_df <- merge(overlap_df, allspecies_df, by = "GeneID", all.x = TRUE)

  # Generate the Venn diagram
  venn_plot <- venn.diagram(
    x = venn_data, 
    category.names = c("piceifrons", "americana", "cubense"), 
    filename = NULL, 
    output = TRUE, 
    fill = c("red", "green", "yellow"),   # Set colors for the groups
    alpha = 0.5, 
    cex = 1.2, 
    cat.cex = 0, 
    main = title,
    main.cex = 1.2
  )

  # Clear the current plotting area before drawing the Venn diagram
  grid.newpage()
  
  # Display the Venn diagram
  grid.draw(venn_plot)

  # Manually create a custom legend
  legend_labels <- c("piceifrons", "americana", "cubense")
  legend_colors <- c("red", "green", "yellow")

  # Position the legend in the lower right of the plot
  legend_x <- unit(0.85, "npc")  # Horizontal position
  legend_y <- unit(0.2, "npc")   # Vertical position of the first entry

  # Draw one colored square and label per species
  for (i in seq_along(legend_labels)) {
    grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"),
              width = unit(0.02, "npc"), height = unit(0.02, "npc"),
              gp = gpar(fill = legend_colors[i], col = NA))
    grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"),
              y = legend_y - unit((i - 1) * 0.05, "npc"),
              just = "left", gp = gpar(cex = 0.8))
  }

  # Display the merged overlapping genes table with datatable
  datatable(meta_brock_df, options = list(
      pageLength = 10,
      scrollX = TRUE,
      autoWidth = TRUE,
      searchHighlight = TRUE
  ),
  rownames = FALSE,
  escape = FALSE
  ) %>%
  formatStyle(
      'Species', target = 'cell',
      fontStyle = 'italic'
  ) %>%
  formatStyle(
      columns = names(meta_brock_df), 
      target = 'row',
      color = styleEqual(c("red", "blue", "black"), c("red", "blue", "black")),
      fontWeight = styleEqual(c("bold", "normal"), c("bold", "normal")),
      backgroundColor = styleEqual(c("red", "blue", "black"), c("white", "white", "white"))
  )
}

# Display the Venn diagram and datatable for thorax upregulated DEGs
display_venn_with_datatable(venn_data_up, "Venn Diagram of Thorax Upregulated DEGs - PACclade", allspecies_df)

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

# Display the Venn diagram and datatable for thorax downregulated DEGs
display_venn_with_datatable(venn_data_down, "Venn Diagram of Thorax Downregulated DEGs - PACclade", allspecies_df)

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

# Display the Venn diagram and datatable for all significant DEGs
display_venn_with_datatable(venn_data_all, "Venn Diagram of All Significant DEGs - PACclade", allspecies_df)

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19
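
`load_deg_data()` is redefined for every clade and tissue, with only the file prefix changing. One possible consolidation, sketched here in base R, takes the tissue and results directory as arguments. The function name `load_degs_by_direction()`, its cutoff parameters, and the toy CSV written to a temporary directory are assumptions made for illustration; the file naming pattern follows the one used above, and `X` is the name `read.csv()` gives to an unnamed first column.

```r
# Hypothetical helper: load up, down and all DEGs for any tissue and species group
load_degs_by_direction <- function(species_list, tissue, results_dir,
                                   padj_cutoff = 0.05, lfc_cutoff = 1) {
  out <- list(up = list(), down = list(), all = list())
  for (species in species_list) {
    csv_path <- file.path(
      results_dir,
      paste0("DESeq2_results_", tissue, "_togregaria_", species, ".csv")
    )
    res <- read.csv(csv_path, stringsAsFactors = FALSE)
    # Rows with NA padj or NA fold change are treated as not significant
    sig <- !is.na(res$padj) & res$padj < padj_cutoff & !is.na(res$log2FoldChange)
    out$up[[species]]   <- res$X[sig & res$log2FoldChange >=  lfc_cutoff]
    out$down[[species]] <- res$X[sig & res$log2FoldChange <= -lfc_cutoff]
    out$all[[species]]  <- res$X[sig & abs(res$log2FoldChange) >= lfc_cutoff]
  }
  out
}

# Toy results file (made up) written to a temporary directory
toy_dir <- file.path(tempdir(), "toy_deg_results")
dir.create(toy_dir, showWarnings = FALSE)
toy_res <- data.frame(
  padj = c(0.01, 0.20, 0.03, NA),
  log2FoldChange = c(2.5, 3.0, -1.8, 4.0),
  row.names = c("geneA", "geneB", "geneC", "geneD")
)
write.csv(toy_res, file.path(toy_dir, "DESeq2_results_Head_togregaria_toyspecies.csv"))

toy_degs <- load_degs_by_direction("toyspecies", "Head", toy_dir)
print(toy_degs)
```

With the real data, the call would look like `load_degs_by_direction(PACclade, "Thorax", file.path(workDir, "DEG_results/Bulk_RNAseq"))`.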

PACclade <- c("piceifrons", "americana", "cubense")

# Initialize an empty list to store heatmap data for each species
heatmap_list <- list()

# Loop through each species to process their data
for (species in PACclade) {
  # Load DESeq2 results for thorax
  thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_sigresults_Thorax_togregaria_", species, ".csv"))
  
  # Load the data using fread() for memory efficiency
  thorax_data <- fread(thorax_file, data.table = FALSE)
  
  # Check if data is empty and handle accordingly
  if (nrow(thorax_data) == 0) {
    message(paste("No data for species:", species))
    next  # Skip to the next species if there's no data
  }
  
  # Filter significant DEGs first (reduces memory use in sorting)
  thorax_data_filtered <- thorax_data %>%
    filter(padj < 0.05, abs(log2FoldChange) > 1)  # Keep only strong up/downregulated DEGs
  
  # Select top 500 upregulated and top 500 downregulated genes
  thorax_up <- thorax_data_filtered %>%
    filter(log2FoldChange > 1) %>%
    arrange(desc(log2FoldChange)) %>%
    slice_head(n = 500)   # More memory-efficient than slice(1:500)
  
  thorax_down <- thorax_data_filtered %>%
    filter(log2FoldChange < -1) %>%
    arrange(log2FoldChange) %>%
    slice_head(n = 500)   # More memory-efficient than slice(1:500)
  
  # Combine data for heatmap, adding the species column
  heatmap_data <- bind_rows(
    thorax_up %>% mutate(Tissue = "Thorax", Regulation = "Upregulated", Species = species),
    thorax_down %>% mutate(Tissue = "Thorax", Regulation = "Downregulated", Species = species)
  ) %>%
    select(GeneID, log2FoldChange, Tissue, Regulation, Species)
  
  # Append the heatmap data to the list
  heatmap_list[[species]] <- heatmap_data
}

# Combine all species data into a single dataframe for heatmap matrix preparation
final_heatmap_data <- bind_rows(heatmap_list)

# Check if final_heatmap_data is empty before proceeding
if (nrow(final_heatmap_data) == 0) {
  stop("No valid data available for heatmap generation.")
}

# Fix duplicate GeneIDs: aggregate log2FoldChange by taking the mean
final_heatmap_data <- final_heatmap_data %>%
  group_by(GeneID, Species) %>%
  summarise(log2FoldChange = mean(log2FoldChange, na.rm = TRUE), .groups = "drop")

# Create heatmap matrix without duplicates
heatmap_matrix <- final_heatmap_data %>%
  pivot_wider(names_from = Species, values_from = log2FoldChange, values_fill = 0) %>%
  column_to_rownames("GeneID") %>%
  as.matrix()

# Define color palettes
# Define a custom color gradient where 0 is black
custom_color_palette1 <- colorRampPalette(c("cyan", "cyan3", "black", "orange3", "orange"))(100)

# Define a custom color gradient where 0 is white
custom_color_palette2 <- colorRampPalette(c("blue3", "blue", "white", "red", "red3"))(100)

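# Define symmetric color breaks centered on 0.
# pheatmap expects one more break than colors, hence 101 breaks for 100 colors.
max_abs_lfc <- max(abs(heatmap_matrix), na.rm = TRUE)  # Get max absolute log2FoldChange
color_breaks <- seq(-max_abs_lfc, max_abs_lfc, length.out = 101)  # Symmetric scale

# Create heatmap with the blue-white-red palette (genes clustered, species order fixed)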
pheatmap(
  heatmap_matrix,
  color = custom_color_palette2,
  breaks = color_breaks,
  cluster_rows = TRUE,  # Cluster genes
  cluster_cols = FALSE,  # Do not cluster species
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Gene Expression in Thorax Tissue - STRATEGY 1"
)

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

# Create the same heatmap with the cyan-black-orange palette
pheatmap(
  heatmap_matrix,
  color = custom_color_palette1,
  breaks = color_breaks,
  cluster_rows = TRUE,  
  cluster_cols = FALSE,  
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Gene Expression in Thorax Tissue - STRATEGY 1"
)

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19
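
The per-species selection used for the heatmaps (filter on `padj` and `log2FoldChange`, then keep up to 500 genes per direction) can also be written as a small helper. This is a base R sketch rather than the dplyr code used above; `top_degs_by_direction()` and the `toy_res` table are hypothetical.

```r
# Hypothetical helper: keep the n strongest up- and downregulated DEGs
top_degs_by_direction <- function(res, n = 500, padj_cutoff = 0.05, lfc_cutoff = 1) {
  keep <- !is.na(res$padj) & res$padj < padj_cutoff & !is.na(res$log2FoldChange)
  sig <- res[keep, , drop = FALSE]

  up <- sig[sig$log2FoldChange > lfc_cutoff, , drop = FALSE]
  down <- sig[sig$log2FoldChange < -lfc_cutoff, , drop = FALSE]

  # Strongest effects first; head() copes with fewer than n rows
  up <- head(up[order(-up$log2FoldChange), , drop = FALSE], n)
  down <- head(down[order(down$log2FoldChange), , drop = FALSE], n)

  up$Regulation <- rep("Upregulated", nrow(up))
  down$Regulation <- rep("Downregulated", nrow(down))
  rbind(up, down)
}

# Toy DESeq2-like results (made up)
toy_res <- data.frame(
  GeneID = paste0("g", 1:6),
  padj = c(0.001, 0.01, 0.04, 0.5, 0.02, 0.03),
  log2FoldChange = c(5, 2, -4, 6, -1.5, 0.5),
  stringsAsFactors = FALSE
)

top1 <- top_degs_by_direction(toy_res, n = 1)
print(top1)
```

In the toy table, `g4` has the largest fold change but is dropped by the `padj` filter, and `g6` is significant but below the fold-change cutoff.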

Plastic species

Head tissues

# Define the plastic species group
plastic_species <- c("gregaria", "piceifrons", "cancellata", "americana")

# Initialize an empty list to store DEG data
venn_data_plastic_species_up <- list()
venn_data_plastic_species_down <- list()
venn_data_plastic_species_all <- list()

# Function to load DEGs for a given group of species for head
load_deg_data <- function(species_list) {
  degs_up <- list()
  degs_down <- list()
  degs_all <- list()
  
  for (species in species_list) {
    head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_results_Head_togregaria_", species, ".csv"))
    
    head_data <- read.csv(head_file, stringsAsFactors = FALSE)
    
    # Check if data is empty and handle accordingly
    if (nrow(head_data) == 0) {
      message(paste("No data for species:", species))
      next  # Skip to the next species if there's no data
    }
    
    # Filter for significant DEGs (both upregulated and downregulated)
    head_up <- head_data %>%
      filter(padj < 0.05 & log2FoldChange >= 1) %>%
      select(GeneID = X)
    
    head_down <- head_data %>%
      filter(padj < 0.05 & log2FoldChange <= -1) %>%
      select(GeneID = X)
    
    all_deg <- head_data %>%
      filter(padj < 0.05 & abs(log2FoldChange) >= 1) %>%
      select(GeneID = X)

    # Store the DEGs in the list
    degs_up[[species]] <- head_up$GeneID
    degs_down[[species]] <- head_down$GeneID
    degs_all[[species]] <- all_deg$GeneID
  }
  
  return(list(up = degs_up, down = degs_down, all = degs_all))
}

# Load head DEG data for the plastic species
venn_data_plastic_species <- load_deg_data(plastic_species)

# Prepare the data for the Venn diagrams
venn_data_up <- list(
  gregaria = venn_data_plastic_species$up[["gregaria"]],
  piceifrons = venn_data_plastic_species$up[["piceifrons"]],
  cancellata = venn_data_plastic_species$up[["cancellata"]],
  americana = venn_data_plastic_species$up[["americana"]]
)

venn_data_down <- list(
  gregaria = venn_data_plastic_species$down[["gregaria"]],
  piceifrons = venn_data_plastic_species$down[["piceifrons"]],
  cancellata = venn_data_plastic_species$down[["cancellata"]],
  americana = venn_data_plastic_species$down[["americana"]]
)

venn_data_all <- list(
  gregaria = venn_data_plastic_species$all[["gregaria"]],
  piceifrons = venn_data_plastic_species$all[["piceifrons"]],
  cancellata = venn_data_plastic_species$all[["cancellata"]],
  americana = venn_data_plastic_species$all[["americana"]]
)

# Function to display Venn diagram and corresponding datatable
display_venn_with_datatable <- function(venn_data, title, allspecies_df) {
  # Calculate the overlapping genes
  overlap_genes <- Reduce(intersect, venn_data)
  
  # Create a data frame for the overlapping genes
  overlap_df <- data.frame(GeneID = overlap_genes)

  # Merge to get species information
  meta_brock_df <- merge(overlap_df, allspecies_df, by = "GeneID", all.x = TRUE)

  # Generate the Venn diagram
  venn_plot <- venn.diagram(
    x = venn_data, 
    category.names = c("gregaria", "piceifrons", "cancellata","americana"),
    filename = NULL, 
    output = TRUE, 
    fill = c("orange", "red", "orchid", "green"),  # Set colors for the groups
    alpha = 0.5, 
    cex = 1.2, 
    cat.cex = 0, 
    main = title,
    main.cex = 1.2
  )

  # Clear the current plotting area before drawing the Venn diagram
  grid.newpage()
  
  # Display the Venn diagram
  grid.draw(venn_plot)

  # Manually create a custom legend
  legend_labels <- c("gregaria", "piceifrons", "cancellata", "americana")
  legend_colors <- c("orange", "red", "orchid", "green")

  # Position the legend in the lower right of the plot
  legend_x <- unit(0.85, "npc")  # Horizontal position
  legend_y <- unit(0.2, "npc")   # Vertical position of the first entry

  # Draw one colored square and label per species
  for (i in seq_along(legend_labels)) {
    grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"),
              width = unit(0.02, "npc"), height = unit(0.02, "npc"),
              gp = gpar(fill = legend_colors[i], col = NA))
    grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"),
              y = legend_y - unit((i - 1) * 0.05, "npc"),
              just = "left", gp = gpar(cex = 0.8))
  }

  # Display the merged overlapping genes table with datatable
  datatable(meta_brock_df, options = list(
      pageLength = 10,
      scrollX = TRUE,
      autoWidth = TRUE,
      searchHighlight = TRUE
  ),
  rownames = FALSE,
  escape = FALSE
  ) %>%
  formatStyle(
      'Species', target = 'cell',
      fontStyle = 'italic'
  ) %>%
  formatStyle(
      columns = names(meta_brock_df), 
      target = 'row',
      color = styleEqual(c("red", "blue", "black"), c("red", "blue", "black")),
      fontWeight = styleEqual(c("bold", "normal"), c("bold", "normal")),
      backgroundColor = styleEqual(c("red", "blue", "black"), c("white", "white", "white"))
  )
}

# Display the Venn diagram and datatable for head upregulated DEGs
display_venn_with_datatable(venn_data_up, "Venn Diagram of Head Upregulated DEGs - plastic_species", allspecies_df)

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
8df3d7c	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

# Display the Venn diagram and datatable for head downregulated DEGs
display_venn_with_datatable(venn_data_down, "Venn Diagram of Head Downregulated DEGs - plastic_species", allspecies_df)

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
8df3d7c	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

# Display the Venn diagram and datatable for all significant DEGs
display_venn_with_datatable(venn_data_all, "Venn Diagram of All Significant DEGs - plastic_species", allspecies_df)

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

# Define the plastic species group
plastic_species <- c("gregaria", "piceifrons", "cancellata", "americana")

# Initialize an empty list to store heatmap data for each species
heatmap_list <- list()

# Loop through each species to process their data
for (species in plastic_species) {
  # Load DESeq2 results for head
  head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_sigresults_Head_togregaria_", species, ".csv"))
  
  # Load the data using fread() for memory efficiency
  head_data <- fread(head_file, data.table = FALSE)
  
  # Check if data is empty and handle accordingly
  if (nrow(head_data) == 0) {
    message(paste("No data for species:", species))
    next  # Skip to the next species if there's no data
  }
  
  # Filter significant DEGs first (reduces memory use in sorting)
  head_data_filtered <- head_data %>%
    filter(padj < 0.05, abs(log2FoldChange) > 1)  # Keep only strong up/downregulated DEGs
  
  # Select top 500 upregulated and top 500 downregulated genes
  head_up <- head_data_filtered %>%
    filter(log2FoldChange > 1) %>%
    arrange(desc(log2FoldChange)) %>%
    slice_head(n = 500)   # More memory-efficient than slice(1:500)
  
  head_down <- head_data_filtered %>%
    filter(log2FoldChange < -1) %>%
    arrange(log2FoldChange) %>%
    slice_head(n = 500)   # More memory-efficient than slice(1:500)
  
  # Combine data for heatmap, adding the species column
  heatmap_data <- bind_rows(
    head_up %>% mutate(Tissue = "Head", Regulation = "Upregulated", Species = species),
    head_down %>% mutate(Tissue = "Head", Regulation = "Downregulated", Species = species)
  ) %>%
    select(GeneID, log2FoldChange, Tissue, Regulation, Species)
  
  # Append the heatmap data to the list
  heatmap_list[[species]] <- heatmap_data
}

# Combine all species data into a single dataframe for heatmap matrix preparation
final_heatmap_data <- bind_rows(heatmap_list)

# Check if final_heatmap_data is empty before proceeding
if (nrow(final_heatmap_data) == 0) {
  stop("No valid data available for heatmap generation.")
}

# Fix duplicate GeneIDs: aggregate log2FoldChange by taking the mean
final_heatmap_data <- final_heatmap_data %>%
  group_by(GeneID, Species) %>%
  summarise(log2FoldChange = mean(log2FoldChange, na.rm = TRUE), .groups = "drop")

# Create heatmap matrix without duplicates
heatmap_matrix <- final_heatmap_data %>%
  pivot_wider(names_from = Species, values_from = log2FoldChange, values_fill = 0) %>%
  column_to_rownames("GeneID") %>%
  as.matrix()

# Define color palettes
# Define a custom color gradient where 0 is black
custom_color_palette1 <- colorRampPalette(c("cyan", "cyan3", "black", "orange3", "orange"))(100)

# Define a custom color gradient where 0 is white
custom_color_palette2 <- colorRampPalette(c("blue3", "blue", "white", "red", "red3"))(100)

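# Define symmetric color breaks centered on 0.
# pheatmap expects one more break than colors, hence 101 breaks for 100 colors.
max_abs_lfc <- max(abs(heatmap_matrix), na.rm = TRUE)  # Get max absolute log2FoldChange
color_breaks <- seq(-max_abs_lfc, max_abs_lfc, length.out = 101)  # Symmetric scale

# Create heatmap with the blue-white-red palette (genes clustered, species order fixed)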
pheatmap(
  heatmap_matrix,
  color = custom_color_palette2,
  breaks = color_breaks,
  cluster_rows = TRUE,  # Cluster genes
  cluster_cols = FALSE,  # Do not cluster species
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Gene Expression in Head Tissue - STRATEGY 1"
)

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

# Create the same heatmap with the cyan-black-orange palette
pheatmap(
  heatmap_matrix,
  color = custom_color_palette1,
  breaks = color_breaks,
  cluster_rows = TRUE,  
  cluster_cols = FALSE,  
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Gene Expression in Head Tissue - STRATEGY 1"
)

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
8df3d7c	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19
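The symmetric color scale used for these heatmaps can be checked in isolation. The following is a minimal sketch on a toy matrix, not the analysis data: `toy_lfc`, `n_colors`, `palette_bwr`, and `sym_breaks` are hypothetical names introduced here for illustration. It builds one more break than there are colors, which is the shape the pheatmap documentation asks for, and confirms that the middle break falls at zero.

```r
# Toy log2FoldChange matrix (made-up values, for illustration only)
toy_lfc <- matrix(
  c(-3.2, 0, 1.5, 2.8, -0.7, 0),
  nrow = 3,
  dimnames = list(c("geneA", "geneB", "geneC"), c("sp1", "sp2"))
)

# Diverging palette with an even number of colors
n_colors <- 100
palette_bwr <- colorRampPalette(c("blue3", "white", "red3"))(n_colors)

# Symmetric limits around zero, with one more break than colors
max_abs <- max(abs(toy_lfc), na.rm = TRUE)
sym_breaks <- seq(-max_abs, max_abs, length.out = n_colors + 1)

# The middle break sits at zero (up to floating-point error)
mid_break <- sym_breaks[n_colors / 2 + 1]
```

With an even number of colors, zero lands on the boundary between the two middle colors. If a single neutral color should straddle zero, an odd number of colors (for example 101 colors with 102 breaks) is one option.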

Thorax tissues

plastic_species <- c("gregaria", "piceifrons", "cancellata","americana")

# Initialize an empty list to store DEG data
venn_data_plastic_species_up <- list()
venn_data_plastic_species_down <- list()
venn_data_plastic_species_all <- list()

# Function to load DEGs for a given group of species for thorax
load_deg_data <- function(species_list) {
  degs_up <- list()
  degs_down <- list()
  degs_all <- list()
  
  for (species in species_list) {  # Use the function argument, not the global vector
    thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_results_Thorax_togregaria_", species, ".csv"))
    
    thorax_data <- read.csv(thorax_file, stringsAsFactors = FALSE)
    
    # Check if data is empty and handle accordingly
    if (nrow(thorax_data) == 0) {
      message(paste("No data for species:", species))
      next  # Skip to the next species if there's no data
    }
    
    # Filter for significant DEGs (both upregulated and downregulated)
    thorax_up <- thorax_data %>%
      filter(padj < 0.05 & log2FoldChange >= 1) %>%
      select(GeneID = X)
    
    thorax_down <- thorax_data %>%
      filter(padj < 0.05 & log2FoldChange <= -1) %>%
      select(GeneID = X)
    
    all_deg <- thorax_data %>%
      filter(padj < 0.05 & abs(log2FoldChange) >= 1) %>%
      select(GeneID = X)

    # Store the DEGs in the list
    degs_up[[species]] <- thorax_up$GeneID
    degs_down[[species]] <- thorax_down$GeneID
    degs_all[[species]] <- all_deg$GeneID
  }
  
  return(list(up = degs_up, down = degs_down, all = degs_all))
}

# Load thorax DEG data for the plastic species
venn_data_plastic_species <- load_deg_data(plastic_species)

# Prepare the data for the Venn diagrams
venn_data_up <- list(
  gregaria = venn_data_plastic_species$up[["gregaria"]],
  piceifrons = venn_data_plastic_species$up[["piceifrons"]],
  cancellata = venn_data_plastic_species$up[["cancellata"]],
  americana = venn_data_plastic_species$up[["americana"]]
)

venn_data_down <- list(
  gregaria = venn_data_plastic_species$down[["gregaria"]],
  piceifrons = venn_data_plastic_species$down[["piceifrons"]],
  cancellata = venn_data_plastic_species$down[["cancellata"]],
  americana = venn_data_plastic_species$down[["americana"]]
)

venn_data_all <- list(
  gregaria = venn_data_plastic_species$all[["gregaria"]],
  piceifrons = venn_data_plastic_species$all[["piceifrons"]],
  cancellata = venn_data_plastic_species$all[["cancellata"]],
  americana = venn_data_plastic_species$all[["americana"]]
)

# Function to display Venn diagram and corresponding datatable
display_venn_with_datatable <- function(venn_data, title, allspecies_df) {
  # Calculate the overlapping genes
  overlap_genes <- Reduce(intersect, venn_data)
  
  # Create a data frame for the overlapping genes
  overlap_df <- data.frame(GeneID = overlap_genes)

  # Merge to get species information
  meta_brock_df <- merge(overlap_df, allspecies_df, by = "GeneID", all.x = TRUE)

  # Generate the Venn diagram
  venn_plot <- venn.diagram(
    x = venn_data, 
    category.names = c("gregaria", "piceifrons", "cancellata","americana"),
    filename = NULL, 
    output = TRUE, 
    fill = c("orange", "red", "orchid", "green"),  # Set colors for the groups
    alpha = 0.5, 
    cex = 1.2, 
    cat.cex = 0, 
    main = title,
    main.cex = 1.2
  )

  # Clear the current plotting area before drawing the Venn diagram
  grid.newpage()
  
  # Display the Venn diagram
  grid.draw(venn_plot)

  # Manually create a custom legend
  legend_labels <- c("gregaria", "piceifrons", "cancellata", "americana")
  legend_colors <- c("orange", "red", "orchid", "green")

  # Position the legend in the lower right of the plot
  legend_x <- unit(0.85, "npc")  # Adjust x position
  legend_y <- unit(0.2, "npc")   # Lower the legend vertically

  # Draw the legend
  for (i in seq_along(legend_labels)) {
    grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"),
              width = unit(0.02, "npc"), height = unit(0.02, "npc"),
              gp = gpar(fill = legend_colors[i], col = NA))
    grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"),
              y = legend_y - unit((i - 1) * 0.05, "npc"),
              just = "left", gp = gpar(cex = 0.8))
  }

  # Display the merged overlapping genes table with datatable
  datatable(meta_brock_df, options = list(
      pageLength = 10,
      scrollX = TRUE,
      autoWidth = TRUE,
      searchHighlight = TRUE
  ),
  rownames = FALSE,
  escape = FALSE
  ) %>%
  formatStyle(
      'Species', target = 'cell',
      fontStyle = 'italic'
  ) %>%
  formatStyle(
      columns = names(meta_brock_df), 
      target = 'row',
      color = styleEqual(c("red", "blue", "black"), c("red", "blue", "black")),
      fontWeight = styleEqual(c("bold", "normal"), c("bold", "normal")),
      backgroundColor = styleEqual(c("red", "blue", "black"), c("white", "white", "white"))
  )
}

# Display the Venn diagram and datatable for thorax upregulated DEGs
display_venn_with_datatable(venn_data_up, "Venn Diagram of Thorax Upregulated DEGs - plastic_species", allspecies_df)

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

# Display the Venn diagram and datatable for thorax downregulated DEGs
display_venn_with_datatable(venn_data_down, "Venn Diagram of Thorax Downregulated DEGs - plastic_species", allspecies_df)

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

# Display the Venn diagram and datatable for all significant thorax DEGs
display_venn_with_datatable(venn_data_all, "Venn Diagram of All Significant Thorax DEGs - plastic_species", allspecies_df)

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19
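The loader above is redefined for each tissue and species group. As an illustration only, the same thresholds (padj < 0.05, |log2FoldChange| >= 1) could be wrapped in one function that takes the tissue and directory as arguments. This is a sketch in base R, not the code used for this report: `load_deg_sets`, `deg_dir`, `padj_cut`, and `lfc_cut` are hypothetical names, and the demo writes a made-up CSV to a temporary directory so that it runs on its own. It assumes, as in the chunks above, that gene IDs are stored as row names and are therefore read back in column `X`.

```r
# Hypothetical helper: one loader for any tissue and species set
load_deg_sets <- function(species_list, tissue, deg_dir,
                          padj_cut = 0.05, lfc_cut = 1) {
  sets <- list(up = list(), down = list(), all = list())
  for (species in species_list) {
    deg_file <- file.path(
      deg_dir,
      paste0("DESeq2_results_", tissue, "_togregaria_", species, ".csv")
    )
    if (!file.exists(deg_file)) {
      message("Missing file for species: ", species)
      next
    }
    deg <- read.csv(deg_file, stringsAsFactors = FALSE)
    # Drop rows with NA padj or log2FoldChange before thresholding
    keep <- !is.na(deg$padj) & !is.na(deg$log2FoldChange) & deg$padj < padj_cut
    sig <- deg[keep, , drop = FALSE]
    sets$up[[species]]   <- sig$X[sig$log2FoldChange >= lfc_cut]
    sets$down[[species]] <- sig$X[sig$log2FoldChange <= -lfc_cut]
    sets$all[[species]]  <- sig$X[abs(sig$log2FoldChange) >= lfc_cut]
  }
  sets
}

# Self-contained demo with a temporary directory and made-up genes
demo_dir <- tempfile("deg_demo")
dir.create(demo_dir)
write.csv(
  data.frame(
    row.names = c("g1", "g2", "g3"),
    log2FoldChange = c(2.1, -1.4, 0.3),
    padj = c(0.01, 0.02, 0.5)
  ),
  file.path(demo_dir, "DESeq2_results_Thorax_togregaria_demo.csv")
)
demo_sets <- load_deg_sets("demo", tissue = "Thorax", deg_dir = demo_dir)
```

Passing the tissue as an argument would avoid overwriting `load_deg_data` in later chunks, at the cost of changing every call site.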

plastic_species <- c("gregaria", "piceifrons", "cancellata","americana")

# Initialize an empty list to store heatmap data for each species
heatmap_list <- list()

# Loop through each species to process their data
for (species in plastic_species) {
  # Load DESeq2 results for thorax
  thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_sigresults_Thorax_togregaria_", species, ".csv"))
  
  # Load the data using fread() for memory efficiency
  thorax_data <- fread(thorax_file, data.table = FALSE)
  
  # Check if data is empty and handle accordingly
  if (nrow(thorax_data) == 0) {
    message(paste("No data for species:", species))
    next  # Skip to the next species if there's no data
  }
  
  # Filter significant DEGs first (reduces memory use in sorting)
  thorax_data_filtered <- thorax_data %>%
    filter(padj < 0.05, abs(log2FoldChange) > 1)  # Keep only strong up/downregulated DEGs
  
  # Select top 500 upregulated and top 500 downregulated genes
  thorax_up <- thorax_data_filtered %>%
    filter(log2FoldChange > 1) %>%
    arrange(desc(log2FoldChange)) %>%
    slice_head(n = 500)   # Keep at most the 500 most strongly upregulated genes
  
  thorax_down <- thorax_data_filtered %>%
    filter(log2FoldChange < -1) %>%
    arrange(log2FoldChange) %>%
    slice_head(n = 500)   # Keep at most the 500 most strongly downregulated genes
  
  # Combine data for heatmap, adding the species column
  heatmap_data <- bind_rows(
    thorax_up %>% mutate(Tissue = "Thorax", Regulation = "Upregulated", Species = species),
    thorax_down %>% mutate(Tissue = "Thorax", Regulation = "Downregulated", Species = species)
  ) %>%
    select(GeneID, log2FoldChange, Tissue, Regulation, Species)
  
  # Append the heatmap data to the list
  heatmap_list[[species]] <- heatmap_data
}

# Combine all species data into a single dataframe for heatmap matrix preparation
final_heatmap_data <- bind_rows(heatmap_list)

# Check if final_heatmap_data is empty before proceeding
if (nrow(final_heatmap_data) == 0) {
  stop("No valid data available for heatmap generation.")
}

# Resolve duplicate GeneIDs: average log2FoldChange per gene and species
final_heatmap_data <- final_heatmap_data %>%
  group_by(GeneID, Species) %>%
  summarise(log2FoldChange = mean(log2FoldChange, na.rm = TRUE), .groups = "drop")

# Create heatmap matrix without duplicates
heatmap_matrix <- final_heatmap_data %>%
  pivot_wider(names_from = Species, values_from = log2FoldChange, values_fill = 0) %>%
  column_to_rownames("GeneID") %>%
  as.matrix()

# Define color palettes
# Define a custom color gradient where 0 is black
custom_color_palette1 <- colorRampPalette(c("cyan", "cyan3", "black", "orange3", "orange"))(100)

# Define a custom color gradient where 0 is white
custom_color_palette2 <- colorRampPalette(c("blue3", "blue", "white", "red", "red3"))(100)

# Define symmetric color breaks so the palette midpoint (black or white) sits at 0
max_abs_lfc <- max(abs(heatmap_matrix), na.rm = TRUE)  # Get max absolute log2FoldChange
color_breaks <- seq(-max_abs_lfc, max_abs_lfc, length.out = 101)  # Symmetric scale; pheatmap expects one more break than colors

# Create heatmap with the blue-white-red palette; rows (genes) are clustered
pheatmap(
  heatmap_matrix,
  color = custom_color_palette2,
  breaks = color_breaks,
  cluster_rows = TRUE,  # Cluster genes
  cluster_cols = FALSE,  # Do not cluster species
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Gene Expression in Thorax Tissue - STRATEGY 1"
)

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

# Create the same heatmap with the cyan-black-orange palette
pheatmap(
  heatmap_matrix,
  color = custom_color_palette1,
  breaks = color_breaks,
  cluster_rows = TRUE,  
  cluster_cols = FALSE,  
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Gene Expression in Thorax Tissue - STRATEGY 1"
)

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19
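The per-species selection of the strongest up- and downregulated genes can be illustrated on a small table. The sketch below uses base R and made-up values; `toy_deg` and `top_n_by_direction` are hypothetical names, and `n = 1` stands in for the 500 used above so that the result is easy to inspect.

```r
# Made-up DESeq2-like table (illustration only)
toy_deg <- data.frame(
  GeneID = paste0("g", 1:6),
  log2FoldChange = c(3.1, -2.4, 1.2, -4.0, 0.4, 2.2),
  padj = c(0.001, 0.01, 0.04, 0.03, 0.2, 0.6),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep the n strongest genes in each direction
top_n_by_direction <- function(deg, n, padj_cut = 0.05, lfc_cut = 1) {
  ok <- !is.na(deg$padj) & !is.na(deg$log2FoldChange) & deg$padj < padj_cut
  sig <- deg[ok, , drop = FALSE]
  up <- sig[sig$log2FoldChange > lfc_cut, , drop = FALSE]
  down <- sig[sig$log2FoldChange < -lfc_cut, , drop = FALSE]
  # Largest positive changes first, most negative changes first
  up <- head(up[order(-up$log2FoldChange), , drop = FALSE], n)
  down <- head(down[order(down$log2FoldChange), , drop = FALSE], n)
  rbind(up, down)
}

top_demo <- top_n_by_direction(toy_deg, n = 1)
```

Here `g6` has a large fold change but fails the padj cutoff, so it is excluded before ranking. Only `g1` (up) and `g4` (down) remain.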

All species

Combined tissues

# Define the full species set (plastic species plus cubense)
allspecies <- c("gregaria", "piceifrons", "cancellata","americana", "cubense")

# Initialize an empty list to store DEG data
venn_data_allspecies_up <- list()
venn_data_allspecies_down <- list()
venn_data_allspecies_all <- list()

# Function to load DEGs for a given group of species for head
load_deg_data <- function(species_list) {
  degs_up <- list()
  degs_down <- list()
  degs_all <- list()
  
  for (species in species_list) {  # Use the function argument, not the global vector
    head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_results_Head_togregaria_", species, ".csv"))
    
    head_data <- read.csv(head_file, stringsAsFactors = FALSE)
    
    # Check if data is empty and handle accordingly
    if (nrow(head_data) == 0) {
      message(paste("No data for species:", species))
      next  # Skip to the next species if there's no data
    }
    
    # Filter for significant DEGs (both upregulated and downregulated)
    head_up <- head_data %>%
      filter(padj < 0.05 & log2FoldChange >= 1) %>%
      select(GeneID = X)
    
    head_down <- head_data %>%
      filter(padj < 0.05 & log2FoldChange <= -1) %>%
      select(GeneID = X)
    
    all_deg <- head_data %>%
      filter(padj < 0.05 & abs(log2FoldChange) >= 1) %>%
      select(GeneID = X)

    # Store the DEGs in the list
    degs_up[[species]] <- head_up$GeneID
    degs_down[[species]] <- head_down$GeneID
    degs_all[[species]] <- all_deg$GeneID
  }
  
  return(list(up = degs_up, down = degs_down, all = degs_all))
}

# Load head DEG data for all species
venn_data_allspecies <- load_deg_data(allspecies)

# Prepare the data for the Venn diagrams
venn_data_up <- list(
  gregaria = venn_data_allspecies$up[["gregaria"]],
  piceifrons = venn_data_allspecies$up[["piceifrons"]],
  cancellata = venn_data_allspecies$up[["cancellata"]],
  americana = venn_data_allspecies$up[["americana"]],
  cubense = venn_data_allspecies$up[["cubense"]]
)

venn_data_down <- list(
  gregaria = venn_data_allspecies$down[["gregaria"]],
  piceifrons = venn_data_allspecies$down[["piceifrons"]],
  cancellata = venn_data_allspecies$down[["cancellata"]],
  americana = venn_data_allspecies$down[["americana"]],
  cubense = venn_data_allspecies$down[["cubense"]]
)

venn_data_all <- list(
  gregaria = venn_data_allspecies$all[["gregaria"]],
  piceifrons = venn_data_allspecies$all[["piceifrons"]],
  cancellata = venn_data_allspecies$all[["cancellata"]],
  americana = venn_data_allspecies$all[["americana"]],
  cubense = venn_data_allspecies$all[["cubense"]]
)

# Function to display Venn diagram and corresponding datatable
display_venn_with_datatable <- function(venn_data, title, allspecies_df) {
  # Calculate the overlapping genes
  overlap_genes <- Reduce(intersect, venn_data)
  
  # Create a data frame for the overlapping genes
  overlap_df <- data.frame(GeneID = overlap_genes)

  # Merge to get species information
  meta_brock_df <- merge(overlap_df, allspecies_df, by = "GeneID", all.x = TRUE)

  # Generate the Venn diagram
  venn_plot <- venn.diagram(
    x = venn_data, 
    category.names = c("gregaria", "piceifrons", "cancellata","americana", "cubense"),
    filename = NULL, 
    output = TRUE, 
    fill = c("orange", "red", "orchid", "green", "yellow"),  # Set colors for the groups
    alpha = 0.5, 
    cex = 1.2, 
    cat.cex = 0, 
    main = title,
    main.cex = 1.2
  )

  # Clear the current plotting area before drawing the Venn diagram
  grid.newpage()
  
  # Display the Venn diagram
  grid.draw(venn_plot)

  # Manually create a custom legend
  legend_labels <- c("gregaria", "piceifrons", "cancellata", "americana", "cubense")
  legend_colors <- c("orange", "red", "orchid", "green", "yellow")

  # Position the legend in the lower right of the plot
  legend_x <- unit(0.85, "npc")  # Adjust x position
  legend_y <- unit(0.2, "npc")   # Lower the legend vertically

  # Draw the legend
  for (i in seq_along(legend_labels)) {
    grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"),
              width = unit(0.02, "npc"), height = unit(0.02, "npc"),
              gp = gpar(fill = legend_colors[i], col = NA))
    grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"),
              y = legend_y - unit((i - 1) * 0.05, "npc"),
              just = "left", gp = gpar(cex = 0.8))
  }

  # Display the merged overlapping genes table with datatable
  datatable(meta_brock_df, options = list(
      pageLength = 10,
      scrollX = TRUE,
      autoWidth = TRUE,
      searchHighlight = TRUE
  ),
  rownames = FALSE,
  escape = FALSE
  ) %>%
  formatStyle(
      'Species', target = 'cell',
      fontStyle = 'italic'
  ) %>%
  formatStyle(
      columns = names(meta_brock_df), 
      target = 'row',
      color = styleEqual(c("red", "blue", "black"), c("red", "blue", "black")),
      fontWeight = styleEqual(c("bold", "normal"), c("bold", "normal")),
      backgroundColor = styleEqual(c("red", "blue", "black"), c("white", "white", "white"))
  )
}

# Display the Venn diagram and datatable for head upregulated DEGs
display_venn_with_datatable(venn_data_up, "Venn Diagram of Head Upregulated DEGs - all species", allspecies_df)

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04

# Display the Venn diagram and datatable for head downregulated DEGs
display_venn_with_datatable(venn_data_down, "Venn Diagram of Head Downregulated DEGs - all species", allspecies_df)

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04

# Display the Venn diagram and datatable for all significant head DEGs
display_venn_with_datatable(venn_data_all, "Venn Diagram of All Significant Head DEGs - all species", allspecies_df)

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
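The table shown under each Venn diagram lists the genes returned by `Reduce(intersect, venn_data)`, that is, genes present in every species set. A minimal sketch with made-up gene IDs shows how that reduction behaves; `toy_sets`, `shared_genes`, and `shared_with_empty` are hypothetical names.

```r
# Made-up DEG sets for three species (illustration only)
toy_sets <- list(
  sp1 = c("g1", "g2", "g3"),
  sp2 = c("g2", "g3", "g4"),
  sp3 = c("g3", "g2", "g5")
)

# Genes present in every set; order follows the first set
shared_genes <- Reduce(intersect, toy_sets)

# Size of each individual set, useful next to the overlap count
set_sizes <- lengths(toy_sets)

# A single species with no DEGs empties the overall intersection
toy_sets$sp4 <- character(0)
shared_with_empty <- Reduce(intersect, toy_sets)
```

The last step is worth keeping in mind when reading the five-species diagrams: one species with few or no DEGs is enough to leave the shared-gene table empty.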

# Thorax
# Initialize an empty list to store DEG data
venn_data_allspecies_up <- list()
venn_data_allspecies_down <- list()
venn_data_allspecies_all <- list()

# Function to load DEGs for a given group of species for thorax
load_deg_data <- function(species_list) {
  degs_up <- list()
  degs_down <- list()
  degs_all <- list()
  
  for (species in species_list) {  # Use the function argument, not the global vector
    thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_results_Thorax_togregaria_", species, ".csv"))
    
    thorax_data <- read.csv(thorax_file, stringsAsFactors = FALSE)
    
    # Check if data is empty and handle accordingly
    if (nrow(thorax_data) == 0) {
      message(paste("No data for species:", species))
      next  # Skip to the next species if there's no data
    }
    
    # Filter for significant DEGs (both upregulated and downregulated)
    thorax_up <- thorax_data %>%
      filter(padj < 0.05 & log2FoldChange >= 1) %>%
      select(GeneID = X)
    
    thorax_down <- thorax_data %>%
      filter(padj < 0.05 & log2FoldChange <= -1) %>%
      select(GeneID = X)
    
    all_deg <- thorax_data %>%
      filter(padj < 0.05 & abs(log2FoldChange) >= 1) %>%
      select(GeneID = X)

    # Store the DEGs in the list
    degs_up[[species]] <- thorax_up$GeneID
    degs_down[[species]] <- thorax_down$GeneID
    degs_all[[species]] <- all_deg$GeneID
  }
  
  return(list(up = degs_up, down = degs_down, all = degs_all))
}

# Load thorax DEG data for all species
venn_data_allspecies <- load_deg_data(allspecies)

# Prepare the data for the Venn diagrams
venn_data_up <- list(
  gregaria = venn_data_allspecies$up[["gregaria"]],
  piceifrons = venn_data_allspecies$up[["piceifrons"]],
  cancellata = venn_data_allspecies$up[["cancellata"]],
  americana = venn_data_allspecies$up[["americana"]],
  cubense = venn_data_allspecies$up[["cubense"]]
)

venn_data_down <- list(
  gregaria = venn_data_allspecies$down[["gregaria"]],
  piceifrons = venn_data_allspecies$down[["piceifrons"]],
  cancellata = venn_data_allspecies$down[["cancellata"]],
  americana = venn_data_allspecies$down[["americana"]],
  cubense = venn_data_allspecies$down[["cubense"]]
)

venn_data_all <- list(
  gregaria = venn_data_allspecies$all[["gregaria"]],
  piceifrons = venn_data_allspecies$all[["piceifrons"]],
  cancellata = venn_data_allspecies$all[["cancellata"]],
  americana = venn_data_allspecies$all[["americana"]],
  cubense = venn_data_allspecies$all[["cubense"]]
)

# Function to display Venn diagram and corresponding datatable
display_venn_with_datatable <- function(venn_data, title, allspecies_df) {
  # Calculate the overlapping genes
  overlap_genes <- Reduce(intersect, venn_data)
  
  # Create a data frame for the overlapping genes
  overlap_df <- data.frame(GeneID = overlap_genes)

  # Merge to get species information
  meta_brock_df <- merge(overlap_df, allspecies_df, by = "GeneID", all.x = TRUE)

  # Generate the Venn diagram
  venn_plot <- venn.diagram(
    x = venn_data, 
    category.names = c("gregaria", "piceifrons", "cancellata","americana", "cubense"),
    filename = NULL, 
    output = TRUE, 
    fill = c("orange", "red", "orchid", "green", "yellow"),  # Set colors for the groups
    alpha = 0.5, 
    cex = 1.2, 
    cat.cex = 0, 
    main = title,
    main.cex = 1.2
  )

  # Clear the current plotting area before drawing the Venn diagram
  grid.newpage()
  
  # Display the Venn diagram
  grid.draw(venn_plot)

  # Manually create a custom legend
  legend_labels <- c("gregaria", "piceifrons", "cancellata", "americana", "cubense")
  legend_colors <- c("orange", "red", "orchid", "green", "yellow")

  # Position the legend in the lower right of the plot
  legend_x <- unit(0.85, "npc")  # Adjust x position
  legend_y <- unit(0.2, "npc")   # Lower the legend vertically

  # Draw the legend
  for (i in seq_along(legend_labels)) {
    grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"),
              width = unit(0.02, "npc"), height = unit(0.02, "npc"),
              gp = gpar(fill = legend_colors[i], col = NA))
    grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"),
              y = legend_y - unit((i - 1) * 0.05, "npc"),
              just = "left", gp = gpar(cex = 0.8))
  }

  # Display the merged overlapping genes table with datatable
  datatable(meta_brock_df, options = list(
      pageLength = 10,
      scrollX = TRUE,
      autoWidth = TRUE,
      searchHighlight = TRUE
  ),
  rownames = FALSE,
  escape = FALSE
  ) %>%
  formatStyle(
      'Species', target = 'cell',
      fontStyle = 'italic'
  ) %>%
  formatStyle(
      columns = names(meta_brock_df), 
      target = 'row',
      color = styleEqual(c("red", "blue", "black"), c("red", "blue", "black")),
      fontWeight = styleEqual(c("bold", "normal"), c("bold", "normal")),
      backgroundColor = styleEqual(c("red", "blue", "black"), c("white", "white", "white"))
  )
}

# Display the Venn diagram and datatable for thorax upregulated DEGs
display_venn_with_datatable(venn_data_up, "Venn Diagram of Thorax Upregulated DEGs - all species", allspecies_df)

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04

# Display the Venn diagram and datatable for thorax downregulated DEGs
display_venn_with_datatable(venn_data_down, "Venn Diagram of Thorax Downregulated DEGs - all species", allspecies_df)

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04

# Display the Venn diagram and datatable for all significant thorax DEGs
display_venn_with_datatable(venn_data_all, "Venn Diagram of All Significant Thorax DEGs - all species", allspecies_df)

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04

# Initialize an empty list to store heatmap data for each species
heatmap_list <- list()

# Loop through each species to process their data
for (species in species_list) {
  # Load DESeq2 results for head and thorax
  head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_sigresults_Head_togregaria_", species, ".csv"))
  thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_sigresults_Thorax_togregaria_", species, ".csv"))
  
  # Load the data
  head_data <- read.csv(head_file, stringsAsFactors = FALSE)
  thorax_data <- read.csv(thorax_file, stringsAsFactors = FALSE)
  
  # Check if data is empty and handle accordingly
  if (nrow(head_data) == 0 || nrow(thorax_data) == 0) {
    message(paste("No data for species:", species))
    next  # Skip to the next species if there's no data
  }
  
  # Filter for significant DEGs and select the top 500 upregulated and downregulated genes for each tissue
  head_up <- head_data %>%
    filter(padj < 0.05 & log2FoldChange > 1) %>%
    arrange(desc(log2FoldChange)) %>%
    slice(1:500)
  
  head_down <- head_data %>%
    filter(padj < 0.05 & log2FoldChange < -1) %>%
    arrange(log2FoldChange) %>%
    slice(1:500)
  
  thorax_up <- thorax_data %>%
    filter(padj < 0.05 & log2FoldChange > 1) %>%
    arrange(desc(log2FoldChange)) %>%
    slice(1:500)
  
  thorax_down <- thorax_data %>%
    filter(padj < 0.05 & log2FoldChange < -1) %>%
    arrange(log2FoldChange) %>%
    slice(1:500)
  
  # Combine data and prepare for heatmap, adding the species column
  heatmap_data <- bind_rows(
    head_up %>% mutate(Tissue = "Head", Regulation = "Upregulated", Species = species),
    head_down %>% mutate(Tissue = "Head", Regulation = "Downregulated", Species = species),
    thorax_up %>% mutate(Tissue = "Thorax", Regulation = "Upregulated", Species = species),
    thorax_down %>% mutate(Tissue = "Thorax", Regulation = "Downregulated", Species = species)
  ) %>%
    select(GeneID, log2FoldChange, Tissue, Regulation, Species)
  
  
  # Append the heatmap data to the list
  heatmap_list[[species]] <- heatmap_data
}

# Combine all species data into a single dataframe for heatmap matrix preparation
final_heatmap_data <- bind_rows(heatmap_list)

# Check if final_heatmap_data is empty before proceeding
if (nrow(final_heatmap_data) == 0) {
  stop("No valid data available for heatmap generation.")
}

# Create heatmap matrix
# Aggregate log2FoldChange values correctly
heatmap_matrix <- final_heatmap_data %>%
  group_by(GeneID, Species, Tissue) %>%
  summarize(log2FoldChange = mean(log2FoldChange, na.rm = TRUE), .groups = "drop") %>%

  # Convert to wide format (no need for redundant grouping)
  pivot_wider(names_from = c(Species, Tissue), values_from = log2FoldChange, values_fill = 0) %>%

  # Ensure unique row names
  column_to_rownames("GeneID") %>%
  as.matrix()


custom_cyan_orange_palette <- colorRampPalette(c("cyan", "cyan2", "cyan3", "black", "orange3", "orange2", "orange"))(100)
custom_blue_red_palette <- colorRampPalette(c("blue3", "blue2", "blue1", "white", "red", "red2", "red3"))(100)

# Define symmetric color breaks so the palette midpoint (black or white) sits at 0
max_abs_lfc <- max(abs(heatmap_matrix), na.rm = TRUE)  # Get max absolute log2FoldChange
color_breaks <- seq(-max_abs_lfc, max_abs_lfc, length.out = 101)  # Symmetric scale; pheatmap expects one more break than colors

# Create first heatmap with blue-red gradient
pheatmap(
  heatmap_matrix,
  color = custom_blue_red_palette,  
  breaks = color_breaks,  
  cluster_rows = TRUE,  
  cluster_cols = FALSE,  
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Gene Expression in Head and Thorax Tissue - STRATEGY 1"
)

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19
3fa8e62	Maeva TECHER	2024-11-09
edb70fe	Maeva TECHER	2024-11-08

# Create second heatmap with cyan-black-orange gradient
pheatmap(
  heatmap_matrix,
  color = custom_cyan_orange_palette,  
  breaks = color_breaks,  
  cluster_rows = TRUE,  
  cluster_cols = FALSE,  
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Gene Expression in Head and Thorax Tissue - STRATEGY 1"
)

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19
3fa8e62	Maeva TECHER	2024-11-09
edb70fe	Maeva TECHER	2024-11-08
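The combined-tissue matrix has one column per species and tissue pair, with duplicates averaged and missing combinations filled with 0. The sketch below reproduces that shape on made-up data with a base R analogue (`tapply`) rather than the `pivot_wider()` call used above; `toy_long`, `col_key`, and `toy_matrix` are hypothetical names.

```r
# Made-up long-format DEG table (illustration only)
toy_long <- data.frame(
  GeneID = c("g1", "g1", "g2", "g2", "g3"),
  Species = c("sp1", "sp1", "sp1", "sp2", "sp2"),
  Tissue = c("Head", "Head", "Thorax", "Head", "Thorax"),
  log2FoldChange = c(2, 4, -1.5, 1.2, -3),
  stringsAsFactors = FALSE
)

# One column per Species_Tissue pair, mirroring names_from = c(Species, Tissue)
col_key <- paste(toy_long$Species, toy_long$Tissue, sep = "_")

# Average duplicate GeneID/column entries; absent combinations come back as NA
toy_matrix <- tapply(
  toy_long$log2FoldChange,
  list(toy_long$GeneID, col_key),
  mean
)

# Fill absent combinations with 0, as values_fill = 0 does
toy_matrix[is.na(toy_matrix)] <- 0
```

A zero in this matrix can mean either "no change" or "not among the selected DEGs for that species and tissue". The heatmap colors do not distinguish the two cases.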

Head tissues

# Initialize an empty list to store heatmap data for each species
heatmap_list <- list()

# Loop through each species to process their data
for (species in species_list) {
  # Load DESeq2 results for head
  head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_sigresults_Head_togregaria_", species, ".csv"))
  
  # Load the data
  head_data <- read.csv(head_file, stringsAsFactors = FALSE)
  
  # Check if data is empty and handle accordingly
  if (nrow(head_data) == 0) {
    message(paste("No data for species:", species))
    next  # Skip to the next species if there's no data
  }
  
  # Filter for significant DEGs and select the top 500 upregulated and downregulated genes
  head_up <- head_data %>%
    filter(padj < 0.05 & log2FoldChange > 1) %>%
    arrange(desc(log2FoldChange)) %>%
    slice(1:500)
  
  head_down <- head_data %>%
    filter(padj < 0.05 & log2FoldChange < -1) %>%
    arrange(log2FoldChange) %>%
    slice(1:500)
  
  # Combine data and prepare for heatmap, adding the species column
  heatmap_data <- bind_rows(
    head_up %>% mutate(Tissue = "Head", Regulation = "Upregulated", Species = species),
    head_down %>% mutate(Tissue = "Head", Regulation = "Downregulated", Species = species)
  ) %>%
    select(GeneID, log2FoldChange, Tissue, Regulation, Species)
  
  
  # Append the heatmap data to the list
  heatmap_list[[species]] <- heatmap_data
}

# Combine all species data
final_heatmap_data <- bind_rows(heatmap_list)

# Ensure all species are represented, even if they have no significant DEGs
for (species in species_order) {
    if (!species %in% unique(final_heatmap_data$Species)) {
        message(paste("Adding placeholder for missing species:", species))
        final_heatmap_data <- bind_rows(
            final_heatmap_data,
            data.frame(
                GeneID = "Unassigned",  # Placeholder GeneID
                log2FoldChange = 0,
                Tissue = "Head",
                Regulation = "None",
                Species = species
            )
        )
    }
}

# Ensure species order in the data
final_heatmap_data$Species <- factor(final_heatmap_data$Species, levels = species_order)

# Create heatmap matrix (Head only)
heatmap_matrix <- final_heatmap_data %>%
  group_by(GeneID, Species) %>% 
  summarize(log2FoldChange = mean(log2FoldChange, na.rm = TRUE), .groups = "drop") %>%
  pivot_wider(names_from = Species, values_from = log2FoldChange, values_fill = 0) %>%
  column_to_rownames("GeneID") %>%
  as.matrix()

# Explicitly reorder the columns in heatmap_matrix
heatmap_matrix <- heatmap_matrix[, species_order, drop = FALSE]

# Define color palettes
custom_cyan_orange_palette <- colorRampPalette(c("cyan", "cyan2", "cyan3", "black", "orange3", "orange2", "orange"))(100)
custom_blue_red_palette <- colorRampPalette(c("blue3", "blue2", "blue1", "white", "red", "red2", "red3"))(100)

# Define symmetric color breaks so the palette midpoint (black or white) sits at 0
max_abs_lfc <- max(abs(heatmap_matrix), na.rm = TRUE)  # Get max absolute log2FoldChange
color_breaks <- seq(-max_abs_lfc, max_abs_lfc, length.out = 101)  # Symmetric scale; pheatmap expects one more break than colors

# Create first heatmap with blue-red gradient
pheatmap(
  heatmap_matrix,
  color = custom_blue_red_palette,  
  breaks = color_breaks,  
  cluster_rows = TRUE,  
  cluster_cols = FALSE,  
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Gene Expression in Head Tissue - STRATEGY 1"
)

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

# Create second heatmap with cyan-black-orange gradient
pheatmap(
  heatmap_matrix,
  color = custom_cyan_orange_palette,  
  breaks = color_breaks,  
  cluster_rows = TRUE,  
  cluster_cols = FALSE,  
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Gene Expression in Head Tissue - STRATEGY 1"
)

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19
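The placeholder approach above keeps a species in the heatmap by adding an "Unassigned" gene with a fold change of 0, which also adds an all-zero row to the clustered matrix. As an alternative sketch only, and not the method used in this report, missing species could be added as zero-filled columns after the matrix is built. The names `species_order_demo`, `toy_matrix`, and `filler` are hypothetical.

```r
# Hypothetical column order and a toy matrix lacking one species
species_order_demo <- c("spA", "spB", "spC")
toy_matrix <- matrix(
  c(1.5, -2, 0, 3),
  nrow = 2,
  dimnames = list(c("g1", "g2"), c("spA", "spC"))
)

# Add a zero-filled column for each species absent from the matrix
missing_species <- setdiff(species_order_demo, colnames(toy_matrix))
if (length(missing_species) > 0) {
  filler <- matrix(
    0,
    nrow = nrow(toy_matrix),
    ncol = length(missing_species),
    dimnames = list(rownames(toy_matrix), missing_species)
  )
  toy_matrix <- cbind(toy_matrix, filler)
}

# Reorder columns to the requested species order
toy_matrix <- toy_matrix[, species_order_demo, drop = FALSE]
```

This keeps the row set limited to real genes, so the row clustering is not influenced by a placeholder entry.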

Thorax tissues

# Define species order explicitly to ensure consistency
species_order <- c("nitens", "cubense", "americana", "piceifrons", "cancellata", "gregaria")

# Initialize an empty list to store heatmap data
heatmap_list <- list()

# Loop through each species to process their Thorax data
for (species in species_order) {
  message(paste("Processing species:", species))

  # Define file path for Thorax
  thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_sigresults_Thorax_togregaria_", species, ".csv"))

  # Check if file exists before loading
  if (!file.exists(thorax_file)) {
    message(paste("Missing Thorax file for:", species, "- Assigning empty dataset"))
    thorax_data <- data.frame(GeneID = character(), padj = numeric(), log2FoldChange = numeric(), stringsAsFactors = FALSE)
  } else {
    thorax_data <- tryCatch(read.csv(thorax_file, stringsAsFactors = FALSE), error = function(e) data.frame())
  }

  # Ensure GeneID column exists
  if (!"GeneID" %in% colnames(thorax_data) && "X" %in% colnames(thorax_data)) {
    colnames(thorax_data)[colnames(thorax_data) == "X"] <- "GeneID"
  }

  # Convert GeneID to character
  thorax_data$GeneID <- as.character(thorax_data$GeneID)

  # If no significant DEGs are found, ensure the structure is correct
  if (nrow(thorax_data) == 0) {
    message(paste("No significant Thorax DEGs for:", species, "- Assigning placeholder values"))
    thorax_data <- data.frame(
      GeneID = "Unassigned",
      log2FoldChange = 0,
      Tissue = "Thorax",
      Regulation = "None",
      Species = species
    )
  } else {
    # Filter for significant DEGs and select top 500 upregulated and downregulated genes
    thorax_up <- thorax_data %>%
      filter(padj < 0.05 & log2FoldChange > 1) %>%
      arrange(desc(log2FoldChange)) %>%
      slice(1:500)

    thorax_down <- thorax_data %>%
      filter(padj < 0.05 & log2FoldChange < -1) %>%
      arrange(log2FoldChange) %>%
      slice(1:500)

    # Combine data and prepare for heatmap
    thorax_data <- bind_rows(
      thorax_up %>% mutate(Tissue = "Thorax", Regulation = "Upregulated", Species = species),
      thorax_down %>% mutate(Tissue = "Thorax", Regulation = "Downregulated", Species = species)
    ) %>%
      select(GeneID, log2FoldChange, Tissue, Regulation, Species)
  }

  # Append to heatmap list, ensuring species is represented
  heatmap_list[[species]] <- thorax_data
}

# Combine all species data
final_heatmap_data <- bind_rows(heatmap_list)

# Ensure all species are represented, even if they have no significant DEGs
for (species in species_order) {
    if (!species %in% unique(final_heatmap_data$Species)) {
        message(paste("Adding placeholder for missing species:", species))
        final_heatmap_data <- bind_rows(
            final_heatmap_data,
            data.frame(
                GeneID = "Unassigned",  # Placeholder GeneID
                log2FoldChange = 0,
                Tissue = "Thorax",
                Regulation = "None",
                Species = species
            )
        )
    }
}

# Ensure species order in the data
final_heatmap_data$Species <- factor(final_heatmap_data$Species, levels = species_order)

# Create heatmap matrix (Thorax only)
heatmap_matrix <- final_heatmap_data %>%
  group_by(GeneID, Species) %>% 
  summarize(log2FoldChange = mean(log2FoldChange, na.rm = TRUE), .groups = "drop") %>%
  pivot_wider(names_from = Species, values_from = log2FoldChange, values_fill = 0) %>%
  column_to_rownames("GeneID") %>%
  as.matrix()

# Explicitly reorder the columns in heatmap_matrix
heatmap_matrix <- heatmap_matrix[, species_order, drop = FALSE]

# Define color palettes
custom_cyan_orange_palette <- colorRampPalette(c("cyan", "cyan2", "cyan3", "black", "orange3", "orange2", "orange"))(100)
custom_blue_red_palette <- colorRampPalette(c("blue3", "blue2", "blue1", "white", "red", "red2", "red3"))(100)

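# Define color breaks, symmetric around zero
# (pheatmap expects one more break than there are colors: 100 colors -> 101 breaks)
max_abs_lfc <- max(abs(heatmap_matrix), na.rm = TRUE)
color_breaks <- seq(-max_abs_lfc, max_abs_lfc, length.out = 101)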

# Generate heatmaps (Only thorax)
pheatmap(
  heatmap_matrix,
  color = custom_blue_red_palette,
  breaks = color_breaks,
  cluster_rows = TRUE,
  cluster_cols = FALSE,
  show_rownames = FALSE,
  show_colnames = TRUE,
  fontsize_row = 6,
  fontsize_col = 10,
  main = "Heatmap of GeneID Expression in Thorax Tissue - STRATEGY 1"
)

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19
3fa8e62	Maeva TECHER	2024-11-09
edb70fe	Maeva TECHER	2024-11-08

pheatmap(
  heatmap_matrix,
  color = custom_cyan_orange_palette,
  breaks = color_breaks,
  cluster_rows = TRUE,
  cluster_cols = FALSE,
  show_rownames = FALSE,
  show_colnames = TRUE,
  fontsize_row = 6,
  fontsize_col = 10,
  main = "Heatmap of GeneID Expression in Thorax Tissue - STRATEGY 1"
)

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19
3fa8e62	Maeva TECHER	2024-11-09
edb70fe	Maeva TECHER	2024-11-08
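
The symmetric colour scale used for the heatmaps above can be illustrated in isolation. The following is a minimal sketch in base R, assuming a toy matrix with made-up log2 fold changes (`lfc`, `n_colors` and `bin` are illustrative names, not objects from the analysis). It shows why the breaks are centred on zero and why one more break than colours is needed to define the bins.

```r
# Toy matrix of log2 fold changes (made-up values, not from the analysis)
lfc <- matrix(c(-3.2, 0, 1.5, 2.1, -0.4, 0), nrow = 3,
              dimnames = list(c("geneA", "geneB", "geneC"), c("sp1", "sp2")))

n_colors <- 100
palette <- colorRampPalette(c("blue3", "white", "red3"))(n_colors)

# Symmetric limits so that the middle of the palette corresponds to log2FC = 0
max_abs <- max(abs(lfc), na.rm = TRUE)

# n_colors + 1 break points delimit exactly n_colors bins
breaks <- seq(-max_abs, max_abs, length.out = n_colors + 1)

# Assign every value to a bin index between 1 and n_colors
bin <- cut(as.vector(lfc), breaks = breaks, include.lowest = TRUE, labels = FALSE)
```

With this construction a gene with no change sits at the boundary between the blue and red halves of the palette, whatever the largest fold change in the matrix is.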

STRATEGY 2: Own RefSeq genome

The difference from STRATEGY 1 is that, to establish the correspondence of genes across species for comparison, we have to use orthologs (see section Orthofinder).

We load the orthogroup table produced by our previous conversion.

input_file <- file.path(ortho_dir, "Results_I2/Orthogroups_genesproteinbiotype_Schistocerca_Jan2025.csv")
filtered_final_orthotable <- read.csv(input_file, header = TRUE, stringsAsFactors = FALSE)
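
To make the role of orthologs concrete, here is a minimal sketch in base R of how species-specific DEG lists could be linked through orthogroups. The column names (`Orthogroup`, `GeneID`, `Species`), the gene IDs and the fold changes are all assumptions made for illustration; the actual layout of `filtered_final_orthotable` may differ.

```r
# Hypothetical long-format ortholog table: one row per gene with its orthogroup.
# Column names and IDs are assumptions for illustration only.
ortho_toy <- data.frame(
  Orthogroup = c("OG1", "OG1", "OG2", "OG2", "OG3"),
  GeneID     = c("greg_001", "pice_101", "greg_002", "pice_102", "greg_003"),
  Species    = c("gregaria", "piceifrons", "gregaria", "piceifrons", "gregaria"),
  stringsAsFactors = FALSE
)

# Toy DEG list pooled over two species (made-up IDs and fold changes)
deg_toy <- data.frame(
  GeneID         = c("greg_001", "greg_003", "pice_101", "pice_102"),
  log2FoldChange = c(2.3, -1.8, 1.9, -2.5),
  stringsAsFactors = FALSE
)

# Attach the orthogroup (and species) to every DEG
deg_og <- merge(deg_toy, ortho_toy, by = "GeneID")

# Orthogroups that contain at least one DEG in every species
og_by_species <- split(deg_og$Orthogroup, deg_og$Species)
shared_ogs <- Reduce(intersect, og_by_species)
```

In this toy case only `OG1` has a DEG in both species, so it is the only orthogroup that would be compared across them.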

1. DEGs comparison among species

We summarized the number of genes differentially expressed between density conditions for each species and each tissue.

# Initialize empty lists to store results
summary_list_head <- list()
summary_list_thorax <- list()

# Loop through each species to process their data
for (species in species_list) {
    # Read the DESeq2 results
    head_sigresults_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_sigresults_Head_", species, ".csv"))
    thorax_sigresults_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_sigresults_Thorax_", species, ".csv"))

    head_sigresults <- fread(head_sigresults_file)  # fread is faster and uses less memory
    thorax_sigresults <- fread(thorax_sigresults_file)

    # Count upregulated and downregulated genes for head
    head_upregulated <- sum(head_sigresults$log2FoldChange > 0)
    head_downregulated <- sum(head_sigresults$log2FoldChange < 0)
    head_upregulated_strict <- sum(head_sigresults$log2FoldChange > 1)
    head_downregulated_strict <- sum(head_sigresults$log2FoldChange < -1)

    # Count upregulated and downregulated genes for thorax
    thorax_upregulated <- sum(thorax_sigresults$log2FoldChange > 0)
    thorax_downregulated <- sum(thorax_sigresults$log2FoldChange < 0)
    thorax_upregulated_strict <- sum(thorax_sigresults$log2FoldChange > 1)
    thorax_downregulated_strict <- sum(thorax_sigresults$log2FoldChange < -1)

    # Store results in the list
    summary_list_head[[species]] <- data.frame(
        Species = species,
        Head_Upregulated = head_upregulated,
        Head_Downregulated = head_downregulated,
        Head_Upregulated_Strict = head_upregulated_strict,
        Head_Downregulated_Strict = head_downregulated_strict
    )

    summary_list_thorax[[species]] <- data.frame(
        Species = species,
        Thorax_Upregulated = thorax_upregulated,
        Thorax_Downregulated = thorax_downregulated,
        Thorax_Upregulated_Strict = thorax_upregulated_strict,
        Thorax_Downregulated_Strict = thorax_downregulated_strict
    )
}

# Combine lists into final data frames
summary_table_head <- bind_rows(summary_list_head)
summary_table_thorax <- bind_rows(summary_list_thorax)

# Print the summary table in a markdown-friendly format
knitr::kable(summary_table_head, format = "markdown", caption = "Summary of differentially expressed genes in head per species")

Summary of differentially expressed genes in head per species
Species	Head_Upregulated	Head_Downregulated	Head_Upregulated_Strict	Head_Downregulated_Strict
gregaria	2709	2988	814	662
piceifrons	538	518	301	263
cancellata	751	877	378	476
americana	802	619	357	339
cubense	49	56	49	55
nitens	233	314	122	245

# Convert the summary table to a long format for easier plotting
summary_long_head <- summary_table_head %>%
  pivot_longer(cols = c(Head_Upregulated_Strict, Head_Downregulated_Strict),
               names_to = "Tissue", values_to = "Count")

# Adjust the values for downregulated genes to be negative
summary_long_head <- summary_long_head %>%
  mutate(Count = ifelse(Tissue == "Head_Downregulated_Strict", -Count, Count))

summary_long_head$Species <- factor(summary_long_head$Species, levels = species_order)

# Plot barplot for head
ggplot(summary_long_head, aes(x = Species, y = Count, fill = Tissue)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(title = "Upregulated and Downregulated Genes in Head (absolute lfc >1)",
       x = "Species", y = "Number of Genes") +
  scale_fill_manual(values = c("Head_Upregulated_Strict" = "red2", "Head_Downregulated_Strict" = "blue")) +
  scale_y_continuous(labels = function(x) ifelse(x < 0, -x, x), limits = c(-1200, 1200)) +
  theme_minimal(base_size = 12) +
  theme(legend.position = "top", 
        plot.title = element_text(hjust = 0.5, size = 14, face = "bold"), 
        axis.text.x = element_text(size = 12, angle = 45, hjust = 1)) +
  coord_flip()

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

# Print the summary table for thorax
knitr::kable(summary_table_thorax, format = "markdown", caption = "Summary of differentially expressed genes in thorax per species")

Summary of differentially expressed genes in thorax per species
Species	Thorax_Upregulated	Thorax_Downregulated	Thorax_Upregulated_Strict	Thorax_Downregulated_Strict
gregaria	2751	2691	622	1174
piceifrons	1641	1361	652	332
cancellata	734	738	324	376
americana	460	798	181	427
cubense	127	251	78	185
nitens	0	0	0	0

# Convert the summary table to a long format for thorax
summary_long_thorax <- summary_table_thorax %>%
  pivot_longer(cols = c(Thorax_Upregulated_Strict, Thorax_Downregulated_Strict),
               names_to = "Tissue", values_to = "Count")

# Adjust the values for downregulated genes to be negative
summary_long_thorax <- summary_long_thorax %>%
  mutate(Count = ifelse(Tissue == "Thorax_Downregulated_Strict", -Count, Count))

summary_long_thorax$Species <- factor(summary_long_thorax$Species, levels = species_order)

# Plot barplot for thorax
ggplot(summary_long_thorax, aes(x = Species, y = Count, fill = Tissue)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(title = "Upregulated and Downregulated Genes in Thorax (absolute lfc >1)",
       x = "Species", y = "Number of Genes") +
  scale_fill_manual(values = c("Thorax_Upregulated_Strict" = "red2", "Thorax_Downregulated_Strict" = "blue")) +
  scale_y_continuous(labels = function(x) ifelse(x < 0, -x, x), limits = c(-1200, 1200)) +
  theme_minimal(base_size = 12) +
  theme(legend.position = "top", 
        plot.title = element_text(hjust = 0.5, size = 14, face = "bold"), 
        axis.text.x = element_text(size = 12, angle = 45, hjust = 1)) +
  coord_flip()

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

# Define custom colors for each GeneType
custom_colors <- c(
  "transcribed_pseudogene" = "#F4F1BB",  # Example color for transcribed_pseudogene
  "protein-coding" = "#9B57D3",         # Example color for protein-coding
  "lncRNA" = "#A5300F",                 # Example color for lncRNA
  "tRNA" = "#74D055FF",                   # Example color for tRNA
  "misc_RNA" = "#3B6978",               # Example color for misc_RNA
  "ncRNA" = "#29AF7FFF",                  # Example color for ncRNA
  "pseudogene" = "#81B29A",             # Example color for pseudogene
  "rRNA" = "#5982DB",                   # Example color for rRNA
  "snoRNA" = "#DCE318FF",                 # Example color for snoRNA
  "snRNA" = "#665EB8"                   # Example color for snRNA
)

# Use scale_fill_manual to map the custom colors to the GeneTypes
custom_color_scale <- scale_fill_manual(
  values = custom_colors
)

# Create an empty list to store the data for all species
all_species_data <- list()

# Loop through each species to process their data
for (species in species_list) {
  # Read the DESeq2 results for head and thorax
  head_sigresults_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_sigresults_Head_", species, ".csv"))
  thorax_sigresults_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_sigresults_Thorax_", species, ".csv"))
  
  head_sigresults <- read.csv(head_sigresults_file, stringsAsFactors = FALSE)
  thorax_sigresults <- read.csv(thorax_sigresults_file, stringsAsFactors = FALSE)
  
  # Add GeneType and Species columns (from `allspecies_df`)
  head_sigresults_merged <- merge(head_sigresults, allspecies_df[, c("GeneID", "GeneType", "Species")], by = "GeneID")
  thorax_sigresults_merged <- merge(thorax_sigresults, allspecies_df[, c("GeneID", "GeneType", "Species")], by = "GeneID")
  
  # Count for upregulated and downregulated genes in head
  head_upregulated <- head_sigresults_merged %>%
    filter(log2FoldChange > 1) %>%
    mutate(Regulation = "Upregulated", Tissue = "Head", Count = 1)
  
  head_downregulated <- head_sigresults_merged %>%
    filter(log2FoldChange < -1) %>%
    mutate(Regulation = "Downregulated", Tissue = "Head", Count = -1)  # Mutate downregulated genes to negative
  
  # Combine upregulated and downregulated genes for head
  head_combined <- rbind(head_upregulated, head_downregulated)
  
  # Ensure all GeneTypes are represented for this species, even if they have no DEGs
  head_combined <- head_combined %>%
    complete(GeneType = unique(allspecies_df$GeneType), 
             fill = list(Count = 0, Tissue = "Head"))  # Zero-count rows keep their Tissue label so they survive the tissue filter below
  
  # Count for upregulated and downregulated genes in thorax
  thorax_upregulated <- thorax_sigresults_merged %>%
    filter(log2FoldChange > 1) %>%
    mutate(Regulation = "Upregulated", Tissue = "Thorax", Count = 1)
  
  thorax_downregulated <- thorax_sigresults_merged %>%
    filter(log2FoldChange < -1) %>%
    mutate(Regulation = "Downregulated", Tissue = "Thorax", Count = -1)  # Mutate downregulated genes to negative
  
  # Combine upregulated and downregulated genes for thorax
  thorax_combined <- rbind(thorax_upregulated, thorax_downregulated)
  
  # Ensure all GeneTypes are represented for this species in thorax, even if they have no DEGs
  thorax_combined <- thorax_combined %>%
    complete(GeneType = unique(allspecies_df$GeneType), 
             fill = list(Count = 0, Tissue = "Thorax"))  # Zero-count rows keep their Tissue label so they survive the tissue filter below
  
  # Combine data for head and thorax into one
  combined_data <- rbind(head_combined, thorax_combined)
  
  # Add species column to the data
  combined_data$Species <- species
  
  # Append the data to the list for all species
  all_species_data[[species]] <- combined_data
}

# Combine all species data into one data frame
final_data <- bind_rows(all_species_data)

# Reorder species according to the desired order
final_data$Species <- factor(final_data$Species, levels = species_order)

# Split the data by tissue
final_data_head <- final_data %>% filter(Tissue == "Head")
final_data_thorax <- final_data %>% filter(Tissue == "Thorax")

# Create the barplot for all species and only head tissue
ggplot(final_data_head, aes(x = Species, y = Count, fill = GeneType)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(title = "DEGs by Gene Biotype for Head (absolute lfc >1)",
       x = "Species",
       y = "Number of Genes") +
  custom_color_scale +
  scale_y_continuous(labels = function(x) ifelse(x < 0, -x, x), limits = c(-1200, 1200)) +
  theme_minimal(base_size = 12) + 
  theme(legend.position = "top", 
        plot.title = element_text(hjust = 0.5, size = 14, face = "bold"), 
        axis.title.x = element_text(size = 14, face = "bold"), 
        axis.title.y = element_text(size = 14, face = "bold"), 
        axis.text.x = element_text(size = 12, angle = 45, hjust = 1), 
        axis.text.y = element_text(size = 12), 
        panel.grid.major.y = element_line(color = "grey90", linetype = "dashed"),
        panel.grid.minor = element_blank()) +
  coord_flip()  # Flip coordinates to make the plot horizontal

Version	Author	Date
34c299a	Maeva TECHER	2025-02-06
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

# Create the barplot for all species and only thorax tissue
ggplot(final_data_thorax, aes(x = Species, y = Count, fill = GeneType)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(title = "DEGs by Gene Biotype for Thorax (absolute lfc >1)",
       x = "Species",
       y = "Number of Genes") +
  custom_color_scale +
  scale_y_continuous(labels = function(x) ifelse(x < 0, -x, x), limits = c(-1200, 1200)) +
  theme_minimal(base_size = 12) + 
  theme(legend.position = "top", 
        plot.title = element_text(hjust = 0.5, size = 14, face = "bold"), 
        axis.title.x = element_text(size = 14, face = "bold"), 
        axis.title.y = element_text(size = 14, face = "bold"), 
        axis.text.x = element_text(size = 12, angle = 45, hjust = 1), 
        axis.text.y = element_text(size = 12), 
        panel.grid.major.y = element_line(color = "grey90", linetype = "dashed"),
        panel.grid.minor = element_blank()) +
  coord_flip()  # Flip coordinates to make the plot horizontal

Version	Author	Date
34c299a	Maeva TECHER	2025-02-06
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19
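
The per-biotype counting behind the two barplots above can be sketched on toy data. This is a minimal base R illustration, not the code used for the figures: it swaps `tidyr::complete()` for a factor with fixed levels plus `table()`, and the biotype subset and fold changes are made up.

```r
# Biotypes that should always appear, even with zero DEGs (illustrative subset)
all_types <- c("protein-coding", "lncRNA", "tRNA", "pseudogene")

# Toy DEG table for one species and one tissue (made-up values)
deg <- data.frame(
  GeneType       = c("protein-coding", "protein-coding", "lncRNA", "protein-coding"),
  log2FoldChange = c(2.4, -1.7, 1.2, -3.1),
  stringsAsFactors = FALSE
)

# A factor with the full set of levels makes table() report zeros
# for biotypes that have no DEG
deg$GeneType <- factor(deg$GeneType, levels = all_types)
up   <- table(deg$GeneType[deg$log2FoldChange >  1])
down <- table(deg$GeneType[deg$log2FoldChange < -1])

# Downregulated counts are negated so they are drawn below zero
counts <- data.frame(
  GeneType   = rep(all_types, 2),
  Regulation = rep(c("Upregulated", "Downregulated"), each = length(all_types)),
  Count      = c(as.integer(up), -as.integer(down)),
  stringsAsFactors = FALSE
)
```

Every biotype gets one row per direction, so a species without any DEG of a given biotype still contributes a zero-height bar segment and keeps the legend complete.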

2. Overlap DEGs between tissues

gregaria

species <- "gregaria"  # Specify the species for which to generate plots

# Load DESeq2 results for head and thorax
head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_results_Head_", species, ".csv"))
thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_results_Thorax_", species, ".csv"))

head_data <- read.csv(head_file, stringsAsFactors = FALSE)
thorax_data <- read.csv(thorax_file, stringsAsFactors = FALSE)

# Check if data is empty and handle accordingly
if (nrow(head_data) == 0 || nrow(thorax_data) == 0) {
    message(paste("No data for species:", species))
} else {
    # Filter for significant DEGs and select upregulated and downregulated genes
    head_up <- head_data %>%
        filter(padj < 0.05 & log2FoldChange > 1) %>%
        select(GeneID = X)

    head_down <- head_data %>%
        filter(padj < 0.05 & log2FoldChange < -1) %>%
        select(GeneID = X)

    thorax_up <- thorax_data %>%
        filter(padj < 0.05 & log2FoldChange > 1) %>%
        select(GeneID = X)

    thorax_down <- thorax_data %>%
        filter(padj < 0.05 & log2FoldChange < -1) %>%
        select(GeneID = X)

    # Prepare data for Venn diagram
    venn_data <- list(
        Head_Upregulated = head_up$GeneID,
        Head_Downregulated = head_down$GeneID,
        Thorax_Upregulated = thorax_up$GeneID,
        Thorax_Downregulated = thorax_down$GeneID
    )

    # Generate the four-way Venn diagram with specified colors and legend outside
    venn_plot <- venn.diagram(
        x = venn_data, 
        category.names = c("Head Upregulated", "Head Downregulated", "Thorax Upregulated", "Thorax Downregulated"), 
        filename = NULL, 
        output = TRUE, 
        fill = c("red", "skyblue", "orange", "blue"),  # Set colors for upregulated and downregulated
        alpha = 0.5, 
        cex = 1.2,  # Text size for numbers
        cat.cex = 0,  # Hide the category labels (a custom legend is drawn below)
        cat.pos = c(0, 0, 0, 0),  # Position to center labels
        cat.dist = c(0.1, 0.1, 0.1, 0.1),  # Distance between category labels and circles
        main = paste("Venn Diagram of DEGs for S.", species),
        main.cex = 1.2,  # Size of the main title
        cat.col = c("red", "skyblue", "orange", "blue")  # Color the category labels
    )

    # Clear the current plotting area before drawing the next Venn diagram
    grid.newpage()

    # Display the Venn diagram
    grid.draw(venn_plot)

    # Manually create a custom legend
    legend_labels <- c("Head Up", "Head Down", "Thorax Up", "Thorax Down")
    legend_colors <- c("red", "skyblue", "orange", "blue")

    # Positioning the legend lower on the right side of the plot
    legend_x <- unit(0.85, "npc")  # Adjust x position
    legend_y <- unit(0.2, "npc")    # Lower the legend vertically

    # Draw the legend
    for (i in seq_along(legend_labels)) {
        grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  width = unit(0.02, "npc"), height = unit(0.02, "npc"), 
                  gp = gpar(fill = legend_colors[i], col = NA))
        grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"), 
                  y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  just = "left", gp = gpar(cex = 0.8))
    }
    
    # Scatter plot for overlapping genes
    # Filter significant DEGs for both head and thorax
    head_sig_genes <- head_data %>%
        filter(padj < 0.05 & abs(log2FoldChange) > 1) %>%
        select(GeneID = X, log2FoldChange, padj) 

    thorax_sig_genes <- thorax_data %>%
        filter(padj < 0.05 & abs(log2FoldChange) > 1) %>%
        select(GeneID = X, log2FoldChange, padj)

    # Find overlapping genes based on GeneID
    overlapping_genes <- inner_join(head_sig_genes, thorax_sig_genes, by = "GeneID", suffix = c("_head", "_thorax"))

    # Save the overlapping genes to a CSV file
    output_file <- file.path(workDir, "overlap/Bulk_RNAseq", paste0("overlapping_genes_head_thorax_", species, ".csv"))
    write.csv(overlapping_genes, output_file, row.names = FALSE)

    # Plot overlapping genes with scatter plot
    p <- ggplot(overlapping_genes, aes(x = log2FoldChange_head, y = log2FoldChange_thorax)) +
        geom_point(aes(color = case_when(
            log2FoldChange_head > 0 & log2FoldChange_thorax > 0 ~ "Upregulated in Both",
            log2FoldChange_head < 0 & log2FoldChange_thorax < 0 ~ "Downregulated in Both",
            log2FoldChange_head > 0 & log2FoldChange_thorax < 0 ~ "Up in Head, Down in Thorax",
            log2FoldChange_head < 0 & log2FoldChange_thorax > 0 ~ "Down in Head, Up in Thorax"
        )), size = 3, alpha = 0.7) +
        geom_abline(slope = 1, intercept = 0, linetype = "dashed", color = "gray") +
        labs(
            x = "Log2 Fold Change (Head)", 
            y = "Log2 Fold Change (Thorax)", 
            color = "Regulation Pattern", 
            title = "Comparison of Log2 Fold Changes in Overlapping Genes", 
            subtitle = paste("Head vs. Thorax in", species)
        ) +
        theme_minimal() + 
        theme(
            plot.title = element_text(size = 16, face = "bold"), 
            plot.subtitle = element_text(size = 12, face = "italic"), 
            legend.position = "top"
        ) +
        scale_color_manual(values = c(
            "Upregulated in Both" = "red", 
            "Downregulated in Both" = "blue", 
            "Up in Head, Down in Thorax" = "purple", 
            "Down in Head, Up in Thorax" = "green"
        ))

    # Save the scatter plot
    ggsave(filename = file.path(workDir, "overlap/Bulk_RNAseq", paste0("scatter_plot_overlapping_genes_", species, ".png")), plot = p)

    # Display the scatter plot
    print(p)
}

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
8df3d7c	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19
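
The overlap logic used in the chunk above (filter each tissue, join on gene ID, classify by the sign of the two fold changes) can be checked on a small example. This is a minimal sketch in base R with made-up genes and values; `keep_deg()` is a hypothetical helper introduced only for this illustration.

```r
# Toy DESeq2-like results for two tissues (made-up values)
head_toy <- data.frame(
  GeneID         = c("g1", "g2", "g3", "g4"),
  log2FoldChange = c(2.0, -1.5, 1.2, 3.1),
  padj           = c(0.01, 0.03, 0.20, 0.001),
  stringsAsFactors = FALSE
)
thorax_toy <- data.frame(
  GeneID         = c("g1", "g2", "g4", "g5"),
  log2FoldChange = c(1.4, 2.2, -1.1, -2.0),
  padj           = c(0.02, 0.01, 0.04, 0.01),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep genes with padj < 0.05 and |log2FC| > 1.
# Rows with NA padj are dropped explicitly.
keep_deg <- function(df) {
  df[!is.na(df$padj) & df$padj < 0.05 & abs(df$log2FoldChange) > 1, ]
}

# Genes significant in both tissues
overlap <- merge(keep_deg(head_toy), keep_deg(thorax_toy),
                 by = "GeneID", suffixes = c("_head", "_thorax"))

# Classify each shared gene by the direction of change in each tissue
h <- overlap$log2FoldChange_head
t <- overlap$log2FoldChange_thorax
overlap$Pattern <- ifelse(h > 0 & t > 0, "Upregulated in Both",
                   ifelse(h < 0 & t < 0, "Downregulated in Both",
                   ifelse(h > 0, "Up in Head, Down in Thorax",
                                 "Down in Head, Up in Thorax")))
```

Here `g3` is excluded because its adjusted p-value is above 0.05, and `g5` because it is only present in thorax, leaving three shared genes with three different patterns.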

piceifrons

species <- "piceifrons"  # Specify the species for which to generate plots

# Load DESeq2 results for head and thorax
head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_results_Head_", species, ".csv"))
thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_results_Thorax_", species, ".csv"))

head_data <- read.csv(head_file, stringsAsFactors = FALSE)
thorax_data <- read.csv(thorax_file, stringsAsFactors = FALSE)

# Check if data is empty and handle accordingly
if (nrow(head_data) == 0 || nrow(thorax_data) == 0) {
    message(paste("No data for species:", species))
} else {
    # Filter for significant DEGs and select upregulated and downregulated genes
    head_up <- head_data %>%
        filter(padj < 0.05 & log2FoldChange > 1) %>%
        select(GeneID = X)

    head_down <- head_data %>%
        filter(padj < 0.05 & log2FoldChange < -1) %>%
        select(GeneID = X)

    thorax_up <- thorax_data %>%
        filter(padj < 0.05 & log2FoldChange > 1) %>%
        select(GeneID = X)

    thorax_down <- thorax_data %>%
        filter(padj < 0.05 & log2FoldChange < -1) %>%
        select(GeneID = X)

    # Prepare data for Venn diagram
    venn_data <- list(
        Head_Upregulated = head_up$GeneID,
        Head_Downregulated = head_down$GeneID,
        Thorax_Upregulated = thorax_up$GeneID,
        Thorax_Downregulated = thorax_down$GeneID
    )

    # Generate the four-way Venn diagram with specified colors and legend outside
    venn_plot <- venn.diagram(
        x = venn_data, 
        category.names = c("Head Upregulated", "Head Downregulated", "Thorax Upregulated", "Thorax Downregulated"), 
        filename = NULL, 
        output = TRUE, 
        fill = c("red", "skyblue", "orange", "blue"),  # Set colors for upregulated and downregulated
        alpha = 0.5, 
        cex = 1.2,  # Text size for numbers
        cat.cex = 0,  # Hide the category labels (a custom legend is drawn below)
        cat.pos = c(0, 0, 0, 0),  # Position to center labels
        cat.dist = c(0.1, 0.1, 0.1, 0.1),  # Distance between category labels and circles
        main = paste("Venn Diagram of DEGs for S.", species),
        main.cex = 1.2,  # Size of the main title
        cat.col = c("red", "skyblue", "orange", "blue")  # Color the category labels
    )

    # Clear the current plotting area before drawing the next Venn diagram
    grid.newpage()

    # Display the Venn diagram
    grid.draw(venn_plot)

    # Manually create a custom legend
    legend_labels <- c("Head Up", "Head Down", "Thorax Up", "Thorax Down")
    legend_colors <- c("red", "skyblue", "orange", "blue")

    # Positioning the legend lower on the right side of the plot
    legend_x <- unit(0.85, "npc")  # Adjust x position
    legend_y <- unit(0.2, "npc")    # Lower the legend vertically

    # Draw the legend
    for (i in seq_along(legend_labels)) {
        grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  width = unit(0.02, "npc"), height = unit(0.02, "npc"), 
                  gp = gpar(fill = legend_colors[i], col = NA))
        grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"), 
                  y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  just = "left", gp = gpar(cex = 0.8))
    }

    # Scatter plot for overlapping genes
    # Filter significant DEGs for both head and thorax
    head_sig_genes <- head_data %>%
        filter(padj < 0.05 & abs(log2FoldChange) > 1) %>%
        select(GeneID = X, log2FoldChange, padj) 

    thorax_sig_genes <- thorax_data %>%
        filter(padj < 0.05 & abs(log2FoldChange) > 1) %>%
        select(GeneID = X, log2FoldChange, padj)

    # Find overlapping genes based on GeneID
    overlapping_genes <- inner_join(head_sig_genes, thorax_sig_genes, by = "GeneID", suffix = c("_head", "_thorax"))

    # Save the overlapping genes to a CSV file
    output_file <- file.path(workDir, "overlap/Bulk_RNAseq", paste0("overlapping_genes_head_thorax_", species, ".csv"))
    write.csv(overlapping_genes, output_file, row.names = FALSE)

    # Plot overlapping genes with scatter plot
    p <- ggplot(overlapping_genes, aes(x = log2FoldChange_head, y = log2FoldChange_thorax)) +
        geom_point(aes(color = case_when(
            log2FoldChange_head > 0 & log2FoldChange_thorax > 0 ~ "Upregulated in Both",
            log2FoldChange_head < 0 & log2FoldChange_thorax < 0 ~ "Downregulated in Both",
            log2FoldChange_head > 0 & log2FoldChange_thorax < 0 ~ "Up in Head, Down in Thorax",
            log2FoldChange_head < 0 & log2FoldChange_thorax > 0 ~ "Down in Head, Up in Thorax"
        )), size = 3, alpha = 0.7) +
        geom_abline(slope = 1, intercept = 0, linetype = "dashed", color = "gray") +
        labs(
            x = "Log2 Fold Change (Head)", 
            y = "Log2 Fold Change (Thorax)", 
            color = "Regulation Pattern", 
            title = "Comparison of Log2 Fold Changes in Overlapping Genes", 
            subtitle = paste("Head vs. Thorax in", species)
        ) +
        theme_minimal() + 
        theme(
            plot.title = element_text(size = 16, face = "bold"), 
            plot.subtitle = element_text(size = 12, face = "italic"), 
            legend.position = "top"
        ) +
        scale_color_manual(values = c(
            "Upregulated in Both" = "red", 
            "Downregulated in Both" = "blue", 
            "Up in Head, Down in Thorax" = "purple", 
            "Down in Head, Up in Thorax" = "green"
        ))

    # Save the scatter plot
    ggsave(filename = file.path(workDir, "overlap/Bulk_RNAseq", paste0("scatter_plot_overlapping_genes_", species, ".png")), plot = p)

    # Display the scatter plot
    print(p)
}

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

cancellata

species <- "cancellata"  # Specify the species for which to generate plots

# Load DESeq2 results for head and thorax
head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_results_Head_", species, ".csv"))
thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_results_Thorax_", species, ".csv"))

head_data <- read.csv(head_file, stringsAsFactors = FALSE)
thorax_data <- read.csv(thorax_file, stringsAsFactors = FALSE)

# Check if data is empty and handle accordingly
if (nrow(head_data) == 0 || nrow(thorax_data) == 0) {
    message(paste("No data for species:", species))
} else {
    # Filter for significant DEGs and select upregulated and downregulated genes
    head_up <- head_data %>%
        filter(padj < 0.05 & log2FoldChange > 1) %>%
        select(GeneID = X)

    head_down <- head_data %>%
        filter(padj < 0.05 & log2FoldChange < -1) %>%
        select(GeneID = X)

    thorax_up <- thorax_data %>%
        filter(padj < 0.05 & log2FoldChange > 1) %>%
        select(GeneID = X)

    thorax_down <- thorax_data %>%
        filter(padj < 0.05 & log2FoldChange < -1) %>%
        select(GeneID = X)

    # Prepare data for Venn diagram
    venn_data <- list(
        Head_Upregulated = head_up$GeneID,
        Head_Downregulated = head_down$GeneID,
        Thorax_Upregulated = thorax_up$GeneID,
        Thorax_Downregulated = thorax_down$GeneID
    )

    # Generate the four-way Venn diagram with specified colors and legend outside
    venn_plot <- venn.diagram(
        x = venn_data, 
        category.names = c("Head Upregulated", "Head Downregulated", "Thorax Upregulated", "Thorax Downregulated"), 
        filename = NULL, 
        output = TRUE, 
        fill = c("red", "skyblue", "orange", "blue"),  # Set colors for upregulated and downregulated
        alpha = 0.5, 
        cex = 1.2,  # Text size for numbers
        cat.cex = 0,  # Hide the category labels (a custom legend is drawn below)
        cat.pos = c(0, 0, 0, 0),  # Position to center labels
        cat.dist = c(0.1, 0.1, 0.1, 0.1),  # Distance between category labels and circles
        main = paste("Venn Diagram of DEGs for S.", species),
        main.cex = 1.2,  # Size of the main title
        cat.col = c("red", "skyblue", "orange", "blue")  # Color the category labels
    )

    # Clear the current plotting area before drawing the next Venn diagram
    grid.newpage()

    # Display the Venn diagram
    grid.draw(venn_plot)

    # Manually create a custom legend
    legend_labels <- c("Head Up", "Head Down", "Thorax Up", "Thorax Down")
    legend_colors <- c("red", "skyblue", "orange", "blue")

    # Positioning the legend lower on the right side of the plot
    legend_x <- unit(0.85, "npc")  # Adjust x position
    legend_y <- unit(0.2, "npc")    # Lower the legend vertically

    # Draw the legend
    for (i in seq_along(legend_labels)) {
        grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  width = unit(0.02, "npc"), height = unit(0.02, "npc"), 
                  gp = gpar(fill = legend_colors[i], col = NA))
        grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"), 
                  y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  just = "left", gp = gpar(cex = 0.8))
    }

    # Scatter plot for overlapping genes
    # Filter significant DEGs for both head and thorax
    head_sig_genes <- head_data %>%
        filter(padj < 0.05 & abs(log2FoldChange) > 1) %>%
        select(GeneID = X, log2FoldChange, padj) 

    thorax_sig_genes <- thorax_data %>%
        filter(padj < 0.05 & abs(log2FoldChange) > 1) %>%
        select(GeneID = X, log2FoldChange, padj)

    # Find overlapping genes based on GeneID
    overlapping_genes <- inner_join(head_sig_genes, thorax_sig_genes, by = "GeneID", suffix = c("_head", "_thorax"))

    # Save the overlapping genes to a CSV file
    output_file <- file.path(workDir, "overlap/Bulk_RNAseq", paste0("overlapping_genes_head_thorax_", species, ".csv"))
    write.csv(overlapping_genes, output_file, row.names = FALSE)

    # Plot overlapping genes with scatter plot
    p <- ggplot(overlapping_genes, aes(x = log2FoldChange_head, y = log2FoldChange_thorax)) +
        geom_point(aes(color = case_when(
            log2FoldChange_head > 0 & log2FoldChange_thorax > 0 ~ "Upregulated in Both",
            log2FoldChange_head < 0 & log2FoldChange_thorax < 0 ~ "Downregulated in Both",
            log2FoldChange_head > 0 & log2FoldChange_thorax < 0 ~ "Up in Head, Down in Thorax",
            log2FoldChange_head < 0 & log2FoldChange_thorax > 0 ~ "Down in Head, Up in Thorax"
        )), size = 3, alpha = 0.7) +
        geom_abline(slope = 1, intercept = 0, linetype = "dashed", color = "gray") +
        labs(
            x = "Log2 Fold Change (Head)", 
            y = "Log2 Fold Change (Thorax)", 
            color = "Regulation Pattern", 
            title = "Comparison of Log2 Fold Changes in Overlapping Genes", 
            subtitle = paste("Head vs. Thorax in", species)
        ) +
        theme_minimal() + 
        theme(
            plot.title = element_text(size = 16, face = "bold"), 
            plot.subtitle = element_text(size = 12, face = "italic"), 
            legend.position = "top"
        ) +
        scale_color_manual(values = c(
            "Upregulated in Both" = "red", 
            "Downregulated in Both" = "blue", 
            "Up in Head, Down in Thorax" = "purple", 
            "Down in Head, Up in Thorax" = "green"
        ))

    # Save the scatter plot
    ggsave(filename = file.path(workDir, "overlap/Bulk_RNAseq", paste0("scatter_plot_overlapping_genes_", species, ".png")), plot = p)

    # Display the scatter plot
    print(p)
}

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19
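
As a minimal sketch of what the join step in the chunk above produces, the toy example below merges two small DEG tables on GeneID and labels each shared gene by its direction in head and thorax. The gene IDs and fold changes are invented for illustration, and base R's `merge()` with `suffixes` stands in for dplyr's `inner_join()` so that the snippet runs without extra packages.

```r
# Hypothetical toy tables; gene IDs and fold changes are invented for illustration
toy_head <- data.frame(
  GeneID = c("g1", "g2", "g3"),
  log2FoldChange = c(2.1, -1.5, 3.0),
  stringsAsFactors = FALSE
)
toy_thorax <- data.frame(
  GeneID = c("g2", "g3", "g4"),
  log2FoldChange = c(1.8, 2.2, -2.0),
  stringsAsFactors = FALSE
)

# Inner join on GeneID; the shared column name receives the tissue suffixes
toy_overlap <- merge(toy_head, toy_thorax, by = "GeneID",
                     suffixes = c("_head", "_thorax"))

# Label each shared gene by the sign of its fold change in each tissue
toy_overlap$pattern <- with(toy_overlap, ifelse(
  log2FoldChange_head > 0 & log2FoldChange_thorax > 0, "Upregulated in Both",
  ifelse(log2FoldChange_head < 0 & log2FoldChange_thorax < 0, "Downregulated in Both",
  ifelse(log2FoldChange_head > 0, "Up in Head, Down in Thorax",
         "Down in Head, Up in Thorax"))))

print(toy_overlap)
```

Only g2 and g3 occur in both toy tables, so only those two rows survive the join and reach the scatter plot.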

americana

species <- "americana"  # Specify the species for which to generate plots

# Load DESeq2 results for head and thorax
head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_results_Head_", species, ".csv"))
thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_results_Thorax_", species, ".csv"))

head_data <- read.csv(head_file, stringsAsFactors = FALSE)
thorax_data <- read.csv(thorax_file, stringsAsFactors = FALSE)

# Check if data is empty and handle accordingly
if (nrow(head_data) == 0 || nrow(thorax_data) == 0) {
    message(paste("No data for species:", species))
} else {
    # Filter for significant DEGs and select upregulated and downregulated genes
    head_up <- head_data %>%
        filter(padj < 0.05 & log2FoldChange > 1) %>%
        select(GeneID = X)

    head_down <- head_data %>%
        filter(padj < 0.05 & log2FoldChange < -1) %>%
        select(GeneID = X)

    thorax_up <- thorax_data %>%
        filter(padj < 0.05 & log2FoldChange > 1) %>%
        select(GeneID = X)

    thorax_down <- thorax_data %>%
        filter(padj < 0.05 & log2FoldChange < -1) %>%
        select(GeneID = X)

    # Prepare data for Venn diagram
    venn_data <- list(
        Head_Upregulated = head_up$GeneID,
        Head_Downregulated = head_down$GeneID,
        Thorax_Upregulated = thorax_up$GeneID,
        Thorax_Downregulated = thorax_down$GeneID
    )

    # Generate the four-way Venn diagram with specified colors and legend outside
    venn_plot <- venn.diagram(
        x = venn_data, 
        category.names = c("Head Upregulated", "Head Downregulated", "Thorax Upregulated", "Thorax Downregulated"), 
        filename = NULL, 
        output = TRUE, 
        fill = c("red", "skyblue", "orange", "blue"),  # Set colors for upregulated and downregulated
        alpha = 0.5, 
        cex = 1.2,  # Text size for numbers
        cat.cex = 0,  # Hide the category labels; a custom legend is drawn below instead
        cat.pos = c(0, 0, 0, 0),  # Position to center labels
        cat.dist = c(0.1, 0.1, 0.1, 0.1),  # Distance between category labels and circles
        main = paste("Venn Diagram of DEGs for S.", species),
        main.cex = 1.2,  # Size of the main title
        cat.col = c("red", "skyblue", "orange", "blue")  # Color the category labels
    )

    # Clear the current plotting area before drawing the next Venn diagram
    grid.newpage()

    # Display the Venn diagram
    grid.draw(venn_plot)

    # Manually create a custom legend
    legend_labels <- c("Head Up", "Head Down", "Thorax Up", "Thorax Down")
    legend_colors <- c("red", "skyblue", "orange", "blue")

    # Positioning the legend lower on the right side of the plot
    legend_x <- unit(0.85, "npc")  # Adjust x position
    legend_y <- unit(0.2, "npc")    # Lower the legend vertically

    # Draw the legend
    for (i in seq_along(legend_labels)) {
        grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  width = unit(0.02, "npc"), height = unit(0.02, "npc"), 
                  gp = gpar(fill = legend_colors[i], col = NA))
        grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"), 
                  y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  just = "left", gp = gpar(cex = 0.8))
    }

    # Scatter plot for overlapping genes
    # Filter significant DEGs for both head and thorax
    head_sig_genes <- head_data %>%
        filter(padj < 0.05 & abs(log2FoldChange) > 1) %>%
        select(GeneID = X, log2FoldChange, padj) 

    thorax_sig_genes <- thorax_data %>%
        filter(padj < 0.05 & abs(log2FoldChange) > 1) %>%
        select(GeneID = X, log2FoldChange, padj)

    # Find overlapping genes based on GeneID
    overlapping_genes <- inner_join(head_sig_genes, thorax_sig_genes, by = "GeneID", suffix = c("_head", "_thorax"))

    # Save the overlapping genes to a CSV file
    output_file <- file.path(workDir, "overlap/Bulk_RNAseq", paste0("overlapping_genes_head_thorax_", species, ".csv"))
    write.csv(overlapping_genes, output_file, row.names = FALSE)

    # Plot overlapping genes with scatter plot
    p <- ggplot(overlapping_genes, aes(x = log2FoldChange_head, y = log2FoldChange_thorax)) +
        geom_point(aes(color = case_when(
            log2FoldChange_head > 0 & log2FoldChange_thorax > 0 ~ "Upregulated in Both",
            log2FoldChange_head < 0 & log2FoldChange_thorax < 0 ~ "Downregulated in Both",
            log2FoldChange_head > 0 & log2FoldChange_thorax < 0 ~ "Up in Head, Down in Thorax",
            log2FoldChange_head < 0 & log2FoldChange_thorax > 0 ~ "Down in Head, Up in Thorax"
        )), size = 3, alpha = 0.7) +
        geom_abline(slope = 1, intercept = 0, linetype = "dashed", color = "gray") +
        labs(
            x = "Log2 Fold Change (Head)", 
            y = "Log2 Fold Change (Thorax)", 
            color = "Regulation Pattern", 
            title = "Comparison of Log2 Fold Changes in Overlapping Genes", 
            subtitle = paste("Head vs. Thorax in", species)
        ) +
        theme_minimal() + 
        theme(
            plot.title = element_text(size = 16, face = "bold"), 
            plot.subtitle = element_text(size = 12, face = "italic"), 
            legend.position = "top"
        ) +
        scale_color_manual(values = c(
            "Upregulated in Both" = "red", 
            "Downregulated in Both" = "blue", 
            "Up in Head, Down in Thorax" = "purple", 
            "Down in Head, Up in Thorax" = "green"
        ))

    # Save the scatter plot
    ggsave(filename = file.path(workDir, "overlap/Bulk_RNAseq", paste0("scatter_plot_overlapping_genes_", species, ".png")), plot = p)

    # Display the scatter plot
    print(p)
}

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

cubense

species <- "cubense"  # Specify the species for which to generate plots

# Load DESeq2 results for head and thorax
head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_results_Head_", species, ".csv"))
thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_results_Thorax_", species, ".csv"))

head_data <- read.csv(head_file, stringsAsFactors = FALSE)
thorax_data <- read.csv(thorax_file, stringsAsFactors = FALSE)

# Check if data is empty and handle accordingly
if (nrow(head_data) == 0 || nrow(thorax_data) == 0) {
    message(paste("No data for species:", species))
} else {
    # Filter for significant DEGs and select upregulated and downregulated genes
    head_up <- head_data %>%
        filter(padj < 0.05 & log2FoldChange > 1) %>%
        select(GeneID = X)

    head_down <- head_data %>%
        filter(padj < 0.05 & log2FoldChange < -1) %>%
        select(GeneID = X)

    thorax_up <- thorax_data %>%
        filter(padj < 0.05 & log2FoldChange > 1) %>%
        select(GeneID = X)

    thorax_down <- thorax_data %>%
        filter(padj < 0.05 & log2FoldChange < -1) %>%
        select(GeneID = X)

    # Prepare data for Venn diagram
    venn_data <- list(
        Head_Upregulated = head_up$GeneID,
        Head_Downregulated = head_down$GeneID,
        Thorax_Upregulated = thorax_up$GeneID,
        Thorax_Downregulated = thorax_down$GeneID
    )

    # Generate the four-way Venn diagram with specified colors and legend outside
    venn_plot <- venn.diagram(
        x = venn_data, 
        category.names = c("Head Upregulated", "Head Downregulated", "Thorax Upregulated", "Thorax Downregulated"), 
        filename = NULL, 
        output = TRUE, 
        fill = c("red", "skyblue", "orange", "blue"),  # Set colors for upregulated and downregulated
        alpha = 0.5, 
        cex = 1.2,  # Text size for numbers
        cat.cex = 0,  # Hide the category labels; a custom legend is drawn below instead
        cat.pos = c(0, 0, 0, 0),  # Position to center labels
        cat.dist = c(0.1, 0.1, 0.1, 0.1),  # Distance between category labels and circles
        main = paste("Venn Diagram of DEGs for S.", species),
        main.cex = 1.2,  # Size of the main title
        cat.col = c("red", "skyblue", "orange", "blue")  # Color the category labels
    )

    # Clear the current plotting area before drawing the next Venn diagram
    grid.newpage()

    # Display the Venn diagram
    grid.draw(venn_plot)

    # Manually create a custom legend
    legend_labels <- c("Head Up", "Head Down", "Thorax Up", "Thorax Down")
    legend_colors <- c("red", "skyblue", "orange", "blue")

    # Positioning the legend lower on the right side of the plot
    legend_x <- unit(0.85, "npc")  # Adjust x position
    legend_y <- unit(0.2, "npc")    # Lower the legend vertically

    # Draw the legend
    for (i in seq_along(legend_labels)) {
        grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  width = unit(0.02, "npc"), height = unit(0.02, "npc"), 
                  gp = gpar(fill = legend_colors[i], col = NA))
        grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"), 
                  y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  just = "left", gp = gpar(cex = 0.8))
    }

    # Scatter plot for overlapping genes
    # Filter significant DEGs for both head and thorax
    head_sig_genes <- head_data %>%
        filter(padj < 0.05 & abs(log2FoldChange) > 1) %>%
        select(GeneID = X, log2FoldChange, padj) 

    thorax_sig_genes <- thorax_data %>%
        filter(padj < 0.05 & abs(log2FoldChange) > 1) %>%
        select(GeneID = X, log2FoldChange, padj)

    # Find overlapping genes based on GeneID
    overlapping_genes <- inner_join(head_sig_genes, thorax_sig_genes, by = "GeneID", suffix = c("_head", "_thorax"))

    # Save the overlapping genes to a CSV file
    output_file <- file.path(workDir, "overlap/Bulk_RNAseq", paste0("overlapping_genes_head_thorax_", species, ".csv"))
    write.csv(overlapping_genes, output_file, row.names = FALSE)

    # Plot overlapping genes with scatter plot
    p <- ggplot(overlapping_genes, aes(x = log2FoldChange_head, y = log2FoldChange_thorax)) +
        geom_point(aes(color = case_when(
            log2FoldChange_head > 0 & log2FoldChange_thorax > 0 ~ "Upregulated in Both",
            log2FoldChange_head < 0 & log2FoldChange_thorax < 0 ~ "Downregulated in Both",
            log2FoldChange_head > 0 & log2FoldChange_thorax < 0 ~ "Up in Head, Down in Thorax",
            log2FoldChange_head < 0 & log2FoldChange_thorax > 0 ~ "Down in Head, Up in Thorax"
        )), size = 3, alpha = 0.7) +
        geom_abline(slope = 1, intercept = 0, linetype = "dashed", color = "gray") +
        labs(
            x = "Log2 Fold Change (Head)", 
            y = "Log2 Fold Change (Thorax)", 
            color = "Regulation Pattern", 
            title = "Comparison of Log2 Fold Changes in Overlapping Genes", 
            subtitle = paste("Head vs. Thorax in", species)
        ) +
        theme_minimal() + 
        theme(
            plot.title = element_text(size = 16, face = "bold"), 
            plot.subtitle = element_text(size = 12, face = "italic"), 
            legend.position = "top"
        ) +
        scale_color_manual(values = c(
            "Upregulated in Both" = "red", 
            "Downregulated in Both" = "blue", 
            "Up in Head, Down in Thorax" = "purple", 
            "Down in Head, Up in Thorax" = "green"
        ))

    # Save the scatter plot
    ggsave(filename = file.path(workDir, "overlap/Bulk_RNAseq", paste0("scatter_plot_overlapping_genes_", species, ".png")), plot = p)

    # Display the scatter plot
    print(p)
}

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

nitens

species <- "nitens"  # Specify the species for which to generate plots

# Load DESeq2 results for head and thorax
head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_results_Head_", species, ".csv"))
thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_results_Thorax_", species, ".csv"))

head_data <- read.csv(head_file, stringsAsFactors = FALSE)
thorax_data <- read.csv(thorax_file, stringsAsFactors = FALSE)

# Check if data is empty and handle accordingly
if (nrow(head_data) == 0 || nrow(thorax_data) == 0) {
    message(paste("No data for species:", species))
} else {
    # Filter for significant DEGs and select upregulated and downregulated genes
    head_up <- head_data %>%
        filter(padj < 0.05 & log2FoldChange > 1) %>%
        select(GeneID = X)

    head_down <- head_data %>%
        filter(padj < 0.05 & log2FoldChange < -1) %>%
        select(GeneID = X)

    thorax_up <- thorax_data %>%
        filter(padj < 0.05 & log2FoldChange > 1) %>%
        select(GeneID = X)

    thorax_down <- thorax_data %>%
        filter(padj < 0.05 & log2FoldChange < -1) %>%
        select(GeneID = X)

    # Prepare data for Venn diagram
    venn_data <- list(
        Head_Upregulated = head_up$GeneID,
        Head_Downregulated = head_down$GeneID,
        Thorax_Upregulated = thorax_up$GeneID,
        Thorax_Downregulated = thorax_down$GeneID
    )

    # Generate the four-way Venn diagram with specified colors and legend outside
    venn_plot <- venn.diagram(
        x = venn_data, 
        category.names = c("Head Upregulated", "Head Downregulated", "Thorax Upregulated", "Thorax Downregulated"), 
        filename = NULL, 
        output = TRUE, 
        fill = c("red", "skyblue", "orange", "blue"),  # Set colors for upregulated and downregulated
        alpha = 0.5, 
        cex = 1.2,  # Text size for numbers
        cat.cex = 0,  # Hide the category labels; a custom legend is drawn below instead
        cat.pos = c(0, 0, 0, 0),  # Position to center labels
        cat.dist = c(0.1, 0.1, 0.1, 0.1),  # Distance between category labels and circles
        main = paste("Venn Diagram of DEGs for S.", species),
        main.cex = 1.2,  # Size of the main title
        cat.col = c("red", "skyblue", "orange", "blue")  # Color the category labels
    )

    # Clear the current plotting area before drawing the next Venn diagram
    grid.newpage()

    # Display the Venn diagram
    grid.draw(venn_plot)

    # Manually create a custom legend
    legend_labels <- c("Head Up", "Head Down", "Thorax Up", "Thorax Down")
    legend_colors <- c("red", "skyblue", "orange", "blue")

    # Positioning the legend lower on the right side of the plot
    legend_x <- unit(0.85, "npc")  # Adjust x position
    legend_y <- unit(0.2, "npc")    # Lower the legend vertically

    # Draw the legend
    for (i in seq_along(legend_labels)) {
        grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  width = unit(0.02, "npc"), height = unit(0.02, "npc"), 
                  gp = gpar(fill = legend_colors[i], col = NA))
        grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"), 
                  y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  just = "left", gp = gpar(cex = 0.8))
    }

    # Scatter plot for overlapping genes
    # Filter significant DEGs for both head and thorax
    head_sig_genes <- head_data %>%
        filter(padj < 0.05 & abs(log2FoldChange) > 1) %>%
        select(GeneID = X, log2FoldChange, padj) 

    thorax_sig_genes <- thorax_data %>%
        filter(padj < 0.05 & abs(log2FoldChange) > 1) %>%
        select(GeneID = X, log2FoldChange, padj)

    # Find overlapping genes based on GeneID
    overlapping_genes <- inner_join(head_sig_genes, thorax_sig_genes, by = "GeneID", suffix = c("_head", "_thorax"))

    # Save the overlapping genes to a CSV file
    output_file <- file.path(workDir, "overlap/Bulk_RNAseq", paste0("overlapping_genes_head_thorax_", species, ".csv"))
    write.csv(overlapping_genes, output_file, row.names = FALSE)

    # Plot overlapping genes with scatter plot
    p <- ggplot(overlapping_genes, aes(x = log2FoldChange_head, y = log2FoldChange_thorax)) +
        geom_point(aes(color = case_when(
            log2FoldChange_head > 0 & log2FoldChange_thorax > 0 ~ "Upregulated in Both",
            log2FoldChange_head < 0 & log2FoldChange_thorax < 0 ~ "Downregulated in Both",
            log2FoldChange_head > 0 & log2FoldChange_thorax < 0 ~ "Up in Head, Down in Thorax",
            log2FoldChange_head < 0 & log2FoldChange_thorax > 0 ~ "Down in Head, Up in Thorax"
        )), size = 3, alpha = 0.7) +
        geom_abline(slope = 1, intercept = 0, linetype = "dashed", color = "gray") +
        labs(
            x = "Log2 Fold Change (Head)", 
            y = "Log2 Fold Change (Thorax)", 
            color = "Regulation Pattern", 
            title = "Comparison of Log2 Fold Changes in Overlapping Genes", 
            subtitle = paste("Head vs. Thorax in", species)
        ) +
        theme_minimal() + 
        theme(
            plot.title = element_text(size = 16, face = "bold"), 
            plot.subtitle = element_text(size = 12, face = "italic"), 
            legend.position = "top"
        ) +
        scale_color_manual(values = c(
            "Upregulated in Both" = "red", 
            "Downregulated in Both" = "blue", 
            "Up in Head, Down in Thorax" = "purple", 
            "Down in Head, Up in Thorax" = "green"
        ))

    # Save the scatter plot
    ggsave(filename = file.path(workDir, "overlap/Bulk_RNAseq", paste0("scatter_plot_overlapping_genes_", species, ".png")), plot = p)

    # Display the scatter plot
    print(p)
}

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

3. Overlap DEGs among species

Locusts

Head tissues

# Define the species for Group 1
locusts <- c("gregaria", "piceifrons", "cancellata")
input_file <- file.path(ortho_dir, "Results_I2/Orthogroups_genesproteinbiotype_Schistocerca_Jan2025.csv")
filtered_final_orthotable <- read.csv(input_file, header = TRUE, stringsAsFactors = FALSE)

# Function to load DEGs for a given group of species
load_deg_data <- function(locusts, allspecies_df, filtered_final_orthotable) {
    degs_up <- list()
    degs_down <- list()
    degs_all <- list()
    
    # Rename the "gene_id" column in filtered_final_orthotable for consistency
    colnames(filtered_final_orthotable)[colnames(filtered_final_orthotable) == "gene_id"] <- "GeneID"
    
    for (species in locusts) {
        head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_results_Head_", species, ".csv"))
        
        # Check if the file exists
        if (!file.exists(head_file)) {
            message(paste("File not found for species:", species))
            next  # Skip this iteration if the file is missing
        }
        
        # Read the data
        head_data <- read.csv(head_file, stringsAsFactors = FALSE)
        
        # Rename the "X" column to "GeneID"
        colnames(head_data)[colnames(head_data) == "X"] <- "GeneID"
        
        # Merge DEG data with GeneType and Orthogroup information
        head_data_merged <- merge(head_data, allspecies_df[, c("GeneID", "GeneType", "Species")], by = "GeneID")
        head_data_merged <- merge(head_data_merged, filtered_final_orthotable[, c("GeneID", "Orthogroup")], by = "GeneID")
        
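        # Label genes without an assigned Orthogroup as "Unknown"; note that "Unknown"
        # is then counted like a real Orthogroup in the overlaps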
        head_data_merged$Orthogroup[is.na(head_data_merged$Orthogroup)] <- "Unknown"
        
        # Filter for significant DEGs (both upregulated and downregulated)
        head_up <- head_data_merged %>%
            filter(padj < 0.05 & log2FoldChange >= 1) %>%
            select(Orthogroup) %>%
            distinct()
        
        head_down <- head_data_merged %>%
            filter(padj < 0.05 & log2FoldChange <= -1) %>%
            select(Orthogroup) %>%
            distinct()
        
        all_deg <- head_data_merged %>%
            filter(padj < 0.05 & abs(log2FoldChange) >= 1) %>%
            select(Orthogroup) %>%
            distinct()
        
        # Store the DEGs in the list
        degs_up[[species]] <- head_up$Orthogroup
        degs_down[[species]] <- head_down$Orthogroup
        degs_all[[species]] <- all_deg$Orthogroup
    }
    
    return(list(up = degs_up, down = degs_down, all = degs_all))
}

# Function to display a Venn diagram and the corresponding datatable based on Orthogroups
display_venn_with_datatable <- function(venn_data, title, allspecies_df, filtered_final_orthotable) {
    
    # Calculate overlapping Orthogroups
    overlap_orthogroups <- Reduce(intersect, venn_data)
    
    # Print overlap info
    cat("Overlapping Orthogroups: \n")
    print(overlap_orthogroups)
    
    # If no overlaps exist, display a message and an empty plot
    if (length(overlap_orthogroups) == 0) {
        message("⚠️ No overlapping Orthogroups found. Displaying an empty Venn diagram.")
        
        # Create an empty Venn diagram placeholder
        plot.new()
        text(0.5, 0.5, "No overlapping Orthogroups found", cex = 1.5, col = "red")
        
        return(NULL)  # Exit the function gracefully
    }
    
    # Create a data frame for the overlapping Orthogroups
    overlap_df <- data.frame(Orthogroup = overlap_orthogroups)
    
    # Merge to get species and other information from filtered_final_orthotable
    meta_brock_df <- merge(overlap_df, filtered_final_orthotable, by = "Orthogroup", all.x = TRUE)
    
    # Ensure merged data exists
    if (nrow(meta_brock_df) == 0) {
        message("⚠️ Merge failed: No matching rows after merging Orthogroups.")
        return(NULL)
    }
    
    # Generate the Venn diagram
    venn_plot <- venn.diagram(
        x = venn_data, 
        category.names = c("gregaria", "piceifrons", "cancellata"), 
        filename = NULL, 
        output = TRUE,
        fill = c("orange", "red", "orchid"),
        alpha = 0.5,
        cex = 1.2,
        cat.cex = 0,
        main = title,
        main.cex = 1.2
    )
    
    # Clear the current plotting area before drawing the Venn diagram
    grid.newpage()
    
    # Display the Venn diagram
    grid.draw(venn_plot)
    
    # Display the datatable for overlapping Orthogroups
    datatable(meta_brock_df, options = list(
        pageLength = 10,
        scrollX = TRUE,
        autoWidth = TRUE,
        searchHighlight = TRUE
    ),
    rownames = FALSE,
    escape = FALSE
    ) %>%
        formatStyle(
            'Species', target = 'cell',
            fontStyle = 'italic'
        )
}

# Load DEGs for locusts
venn_data_locusts <- load_deg_data(locusts, allspecies_df, filtered_final_orthotable)

# Prepare the data for Venn diagrams
venn_data_up <- list(
  gregaria = venn_data_locusts$up[["gregaria"]],
  piceifrons = venn_data_locusts$up[["piceifrons"]],
  cancellata = venn_data_locusts$up[["cancellata"]]
)

venn_data_down <- list(
  gregaria = venn_data_locusts$down[["gregaria"]],
  piceifrons = venn_data_locusts$down[["piceifrons"]],
  cancellata = venn_data_locusts$down[["cancellata"]]
)

venn_data_all <- list(
  gregaria = venn_data_locusts$all[["gregaria"]],
  piceifrons = venn_data_locusts$all[["piceifrons"]],
  cancellata = venn_data_locusts$all[["cancellata"]]
)

# Display the Venn diagrams with fallback for missing overlaps
message("Processing Venn diagram for head upregulated DEGs...")
display_venn_with_datatable(venn_data_up, "Venn Diagram of Head Upregulated DEGs - Locusts", allspecies_df, filtered_final_orthotable)

Overlapping Orthogroups: 
 [1] "Unknown"   "OG0007485" "OG0004381" "OG0000630" "OG0008668" "OG0000522"
 [7] "OG0000307" "OG0000354" "OG0009529" "OG0000197" "OG0000447" "OG0010928"
[13] "OG0005151" "OG0000045" "OG0001019" "OG0000272"

Version	Author	Date
34c299a	Maeva TECHER	2025-02-06
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

message("Processing Venn diagram for head downregulated DEGs...")
display_venn_with_datatable(venn_data_down, "Venn Diagram of Head Downregulated DEGs - Locusts", allspecies_df, filtered_final_orthotable)

Overlapping Orthogroups: 
 [1] "Unknown"   "OG0008550" "OG0012948" "OG0008546" "OG0014256" "OG0004570"
 [7] "OG0009787" "OG0013175" "OG0010889" "OG0011171" "OG0000270" "OG0004972"
[13] "OG0002151" "OG0003935" "OG0000505" "OG0000273" "OG0000149"

Version	Author	Date
34c299a	Maeva TECHER	2025-02-06
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

message("Processing Venn diagram for all significant DEGs...")
display_venn_with_datatable(venn_data_all, "Venn Diagram of All Head DEGs - Locusts", allspecies_df, filtered_final_orthotable)

Overlapping Orthogroups: 
 [1] "Unknown"   "OG0007485" "OG0004381" "OG0000630" "OG0008550" "OG0012948"
 [7] "OG0008668" "OG0008546" "OG0000522" "OG0008322" "OG0000307" "OG0014256"
[13] "OG0000354" "OG0004570" "OG0009529" "OG0000197" "OG0009787" "OG0013175"
[19] "OG0010889" "OG0000447" "OG0010928" "OG0011171" "OG0005490" "OG0000270"
[25] "OG0005151" "OG0000045" "OG0004972" "OG0001019" "OG0002151" "OG0003935"
[31] "OG0000272" "OG0000505" "OG0000273" "OG0000149"

Version	Author	Date
34c299a	Maeva TECHER	2025-02-06
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19
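
The overlap lists printed above come from `Reduce(intersect, venn_data)`, and each of them contains the placeholder "Unknown" because genes without an Orthogroup are pooled under that label in every species. The sketch below reproduces this on invented Orthogroup IDs and shows one possible way to drop the placeholder before interpreting the shared set. The object names and IDs are hypothetical and not taken from the analysis.

```r
# Hypothetical Orthogroup sets per species; the IDs are invented for illustration
toy_sets <- list(
  gregaria   = c("OG_A", "OG_B", "Unknown"),
  piceifrons = c("OG_B", "OG_C", "Unknown"),
  cancellata = c("OG_B", "OG_D", "Unknown")
)

# Successive pairwise intersection across all species, as in the function above
shared_all <- Reduce(intersect, toy_sets)

# Optional step (an assumption, not part of the original code):
# remove the placeholder so that only real Orthogroups remain
shared_known <- setdiff(shared_all, "Unknown")

print(shared_all)
print(shared_known)
```

Whether "Unknown" should be excluded depends on the question being asked; here it is only shown that the placeholder always survives the intersection when it is present in every set.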

# Define the species for Group 1
locusts <- c("gregaria", "piceifrons", "cancellata")
input_file <- file.path(ortho_dir, "Results_I2/Orthogroups_genesproteinbiotype_Schistocerca_Jan2025.csv")
filtered_final_orthotable <- read.csv(input_file, header = TRUE, stringsAsFactors = FALSE)
# Initialize an empty list to store heatmap data for each species
heatmap_list <- list()

# Loop through each species to process their data
for (species in locusts) {
  # Load DESeq2 results for head
  head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_sigresults_Head_", species, ".csv"))
  
  # Load the DESeq2 results
  head_data <- read.csv(head_file, stringsAsFactors = FALSE)
  
  # Check if data is empty and handle accordingly
  if (nrow(head_data) == 0) {
    message(paste("No data for species:", species))
    next  # Skip to the next species if there's no data
  }
  
  # Rename the "gene_id" column in filtered_final_orthotable for consistency
  colnames(filtered_final_orthotable)[colnames(filtered_final_orthotable) == "gene_id"] <- "GeneID"
  
  # Merge with filtered_final_orthotable to include Orthogroup
  merged_data <- merge(head_data, filtered_final_orthotable, by = "GeneID", all.x = TRUE)
  
  # Check if merge was successful
  if (nrow(merged_data) == 0) {
    message(paste("No matching data for species:", species))
    next  # Skip if no matching data after merging
  }

  # Filter for significant DEGs and keep at most the 500 most strongly upregulated and downregulated head genes
  head_up <- merged_data %>%
    filter(padj < 0.05 & log2FoldChange > 1) %>%
    arrange(desc(log2FoldChange)) %>%
    slice(1:500)
  
  head_down <- merged_data %>%
    filter(padj < 0.05 & log2FoldChange < -1) %>%
    arrange(log2FoldChange) %>%
    slice(1:500)
  
  # Combine data and prepare for heatmap, adding the species column
  heatmap_data <- bind_rows(
    head_up %>% mutate(Tissue = "Head", Regulation = "Upregulated", Species = species),
    head_down %>% mutate(Tissue = "Head", Regulation = "Downregulated", Species = species)
  ) %>%
    select(Orthogroup, log2FoldChange, Tissue, Regulation, Species)
  
  # Append the heatmap data to the list
  heatmap_list[[species]] <- heatmap_data
}

# Combine all species data into a single dataframe for heatmap matrix preparation
final_heatmap_data <- bind_rows(heatmap_list)

# Check if final_heatmap_data is empty before proceeding
if (nrow(final_heatmap_data) == 0) {
    stop("No valid data available for heatmap generation.")
}

# Filter out rows with missing Orthogroup values
final_heatmap_data <- final_heatmap_data %>%
    filter(!is.na(Orthogroup))

# Drop rows with missing log2FoldChange values (precautionary)
final_heatmap_data <- final_heatmap_data %>%
    filter(!is.na(log2FoldChange))

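# Create an Orthogroup x Species heatmap matrix; log2FoldChange values of genes that share
# an Orthogroup are summed, and missing Orthogroup/species combinations are filled with 0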
heatmap_matrix <- final_heatmap_data %>%
    group_by(Orthogroup, Species) %>%
    summarize(
        Head_Combined = sum(log2FoldChange[Tissue == "Head"], na.rm = TRUE),
        .groups = 'drop'
    ) %>%
    pivot_wider(names_from = Species, 
                values_from = Head_Combined, 
                values_fill = list(Head_Combined = 0)) %>%
    column_to_rownames("Orthogroup") %>%
    as.matrix()

# Check if heatmap_matrix is empty
if (nrow(heatmap_matrix) == 0) {
    stop("No valid data available for heatmap matrix.")
}

# Define color palettes
# Define a custom color gradient where 0 is black
custom_color_palette1 <- colorRampPalette(c("cyan", "cyan3", "black", "orange3", "orange"))(100)

# Define a custom color gradient where 0 is white
custom_color_palette2 <- colorRampPalette(c("blue3", "blue", "white", "red", "red3"))(100)

# Define symmetric color breaks so that the palette midpoint (white or black) sits at 0
max_abs_lfc <- max(abs(heatmap_matrix), na.rm = TRUE)  # Get max absolute (summed) log2FoldChange
color_breaks <- seq(-max_abs_lfc, max_abs_lfc, length.out = 101)  # One more break than colors, as pheatmap expects

# Create heatmap with clustering
pheatmap(
  heatmap_matrix,
  color = custom_color_palette2,
  breaks = color_breaks,
  cluster_rows = TRUE,  # Cluster orthogroups
  cluster_cols = FALSE,  # Do not cluster species
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Orthologs Expression in Head Tissue - STRATEGY 2"
)

Version	Author	Date
34c299a	Maeva TECHER	2025-02-06
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19
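
The heatmap matrix above is built by summing log2FoldChange over all genes that share an Orthogroup within a species and filling absent combinations with 0. The toy example below illustrates that aggregation on invented values. Base R's `tapply()` is used here as a stand-in for the `group_by()`, `summarize()` and `pivot_wider()` pipeline, and all names and numbers are hypothetical.

```r
# Hypothetical DEG table; Orthogroup IDs and fold changes are invented
toy_degs <- data.frame(
  Orthogroup = c("OG_A", "OG_A", "OG_B", "OG_B"),
  Species = c("gregaria", "gregaria", "gregaria", "piceifrons"),
  log2FoldChange = c(2.0, 1.5, -3.0, -2.0),
  stringsAsFactors = FALSE
)

# Sum fold changes per Orthogroup and species; empty cells come back as NA
toy_matrix <- tapply(toy_degs$log2FoldChange,
                     list(toy_degs$Orthogroup, toy_degs$Species), sum)

# Fill Orthogroup/species combinations without any DEG with 0
toy_matrix[is.na(toy_matrix)] <- 0

print(toy_matrix)
```

In this toy case the two gregaria genes in OG_A add up to 3.5, which is larger than either gene's own fold change. Orthogroups with several DEG paralogs can therefore dominate the color scale under a sum.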

# Create the same heatmap with the black-centered palette (columns still unclustered)
pheatmap(
  heatmap_matrix,
  color = custom_color_palette1,
  breaks = color_breaks,
  cluster_rows = TRUE,  
  cluster_cols = FALSE,  
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Orthologs Expression in Head Tissue - STRATEGY 2"
)

Thorax tissues

# Define the species for Group 1
locusts <- c("gregaria", "piceifrons", "cancellata")
input_file <- file.path(ortho_dir, "Results_I2/Orthogroups_genesproteinbiotype_Schistocerca_Jan2025.csv")
filtered_final_orthotable <- read.csv(input_file, header = TRUE, stringsAsFactors = FALSE)

# Function to load DEGs for a given group of species
load_deg_data <- function(locusts, allspecies_df, filtered_final_orthotable) {
    degs_up <- list()
    degs_down <- list()
    degs_all <- list()
    
    # Rename the "gene_id" column in filtered_final_orthotable for consistency
    colnames(filtered_final_orthotable)[colnames(filtered_final_orthotable) == "gene_id"] <- "GeneID"
    
    for (species in locusts) {
        thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_results_Thorax_", species, ".csv"))
        
        # Check if the file exists
        if (!file.exists(thorax_file)) {
            message(paste("File not found for species:", species))
            next  # Skip this iteration if the file is missing
        }
        
        # Read the data
        thorax_data <- read.csv(thorax_file, stringsAsFactors = FALSE)
        
        # Rename the "X" column to "GeneID"
        colnames(thorax_data)[colnames(thorax_data) == "X"] <- "GeneID"
        
        # Merge DEG data with GeneType and Orthogroup information
        thorax_data_merged <- merge(thorax_data, allspecies_df[, c("GeneID", "GeneType", "Species")], by = "GeneID")
        thorax_data_merged <- merge(thorax_data_merged, filtered_final_orthotable[, c("GeneID", "Orthogroup")], by = "GeneID")
        
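        # Handle missing Orthogroups
        # Note: every unassigned gene receives the same "Unknown" label, so "Unknown" can
        # appear in the overlap lists without implying that those genes are orthologous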
        thorax_data_merged$Orthogroup[is.na(thorax_data_merged$Orthogroup)] <- "Unknown"
        
        # Filter for significant DEGs (both upregulated and downregulated)
        thorax_up <- thorax_data_merged %>%
            filter(padj < 0.05 & log2FoldChange >= 1) %>%
            select(Orthogroup) %>%
            distinct()
        
        thorax_down <- thorax_data_merged %>%
            filter(padj < 0.05 & log2FoldChange <= -1) %>%
            select(Orthogroup) %>%
            distinct()
        
        all_deg <- thorax_data_merged %>%
            filter(padj < 0.05 & abs(log2FoldChange) >= 1) %>%
            select(Orthogroup) %>%
            distinct()
        
        # Store the DEGs in the list
        degs_up[[species]] <- thorax_up$Orthogroup
        degs_down[[species]] <- thorax_down$Orthogroup
        degs_all[[species]] <- all_deg$Orthogroup
    }
    
    return(list(up = degs_up, down = degs_down, all = degs_all))
}
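How the two `merge()` calls treat genes without an orthogroup depends on the join type. The following sketch uses invented toy tables (`deg`, `ortho`), not the project files. It shows that the default inner join drops genes absent from the orthogroup table, while `all.x = TRUE` keeps them with `NA`, which is then relabelled "Unknown".

```r
# Toy tables; gene IDs, fold changes, and orthogroups are made up.
deg <- data.frame(GeneID = c("g1", "g2", "g3"),
                  log2FoldChange = c(2.1, -1.4, 0.3),
                  stringsAsFactors = FALSE)
ortho <- data.frame(GeneID = c("g1", "g2"),
                    Orthogroup = c("OG0000001", NA),
                    stringsAsFactors = FALSE)

inner <- merge(deg, ortho, by = "GeneID")                # g3 is dropped
left  <- merge(deg, ortho, by = "GeneID", all.x = TRUE)  # g3 is kept with NA

# Relabel missing orthogroups, as the function above does.
inner$Orthogroup[is.na(inner$Orthogroup)] <- "Unknown"
left$Orthogroup[is.na(left$Orthogroup)] <- "Unknown"

print(inner)
print(left)
```

With the inner join, an "Unknown" label can only come from rows that are present in the orthogroup table but carry a missing `Orthogroup` value.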

# Function to display Venn diagram and corresponding datatable based on Orthogroups
display_venn_with_datatable <- function(venn_data, title, allspecies_df, filtered_final_orthotable) {
    
    # Calculate overlapping Orthogroups
    overlap_orthogroups <- Reduce(intersect, venn_data)
    
    # Print overlap info
    cat("Overlapping Orthogroups: \n")
    print(overlap_orthogroups)
    
    # If no overlaps exist, display a message and an empty plot
    if (length(overlap_orthogroups) == 0) {
        message("⚠️ No overlapping Orthogroups found. Displaying an empty Venn diagram.")
        
        # Create an empty Venn diagram placeholder
        plot.new()
        text(0.5, 0.5, "No overlapping Orthogroups found", cex = 1.5, col = "red")
        
        return(NULL)  # Exit the function gracefully
    }
    
    # Create a data frame for the overlapping Orthogroups
    overlap_df <- data.frame(Orthogroup = overlap_orthogroups)
    
    # Merge to get species and other information from filtered_final_orthotable
    meta_brock_df <- merge(overlap_df, filtered_final_orthotable, by = "Orthogroup", all.x = TRUE)
    
    # Ensure merged data exists
    if (nrow(meta_brock_df) == 0) {
        message("⚠️ Merge failed: No matching rows after merging Orthogroups.")
        return(NULL)
    }
    
    # Generate the Venn diagram
    venn_plot <- venn.diagram(
        x = venn_data, 
        category.names = c("gregaria", "piceifrons", "cancellata"), 
        filename = NULL, 
        output = TRUE,
        fill = c("orange", "red", "orchid"),
        alpha = 0.5,
        cex = 1.2,
        cat.cex = 0,
        main = title,
        main.cex = 1.2
    )
    
    # Clear the current plotting area before drawing the Venn diagram
    grid.newpage()
    
    # Display the Venn diagram
    grid.draw(venn_plot)
    
    # Display the datatable for overlapping Orthogroups
    datatable(meta_brock_df, options = list(
        pageLength = 10,
        scrollX = TRUE,
        autoWidth = TRUE,
        searchHighlight = TRUE
    ),
    rownames = FALSE,
    escape = FALSE
    ) %>%
        formatStyle(
            'Species', target = 'cell',
            fontStyle = 'italic'
        )
}
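The overlap is computed with `Reduce(intersect, ...)`, which keeps only the labels present in every species set. A minimal sketch with invented sets follows. `shared_assigned` is a hypothetical extra step, not part of the original function, that drops the "Unknown" placeholder before interpretation.

```r
# Toy orthogroup sets per species; the IDs are invented.
toy_sets <- list(
  gregaria   = c("OG0000001", "OG0000002", "Unknown"),
  piceifrons = c("OG0000002", "OG0000003", "Unknown"),
  cancellata = c("OG0000002", "Unknown")
)

# Labels shared by all three species.
shared <- Reduce(intersect, toy_sets)

# Hypothetical clean-up: "Unknown" is a placeholder, not an orthogroup.
shared_assigned <- setdiff(shared, "Unknown")

print(shared)
print(shared_assigned)
```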

# Load DEGs for locusts
venn_data_locusts <- load_deg_data(locusts, allspecies_df, filtered_final_orthotable)

# Prepare the data for Venn diagrams
venn_data_up <- list(
  gregaria = venn_data_locusts$up[["gregaria"]],
  piceifrons = venn_data_locusts$up[["piceifrons"]],
  cancellata = venn_data_locusts$up[["cancellata"]]
)

venn_data_down <- list(
  gregaria = venn_data_locusts$down[["gregaria"]],
  piceifrons = venn_data_locusts$down[["piceifrons"]],
  cancellata = venn_data_locusts$down[["cancellata"]]
)

venn_data_all <- list(
  gregaria = venn_data_locusts$all[["gregaria"]],
  piceifrons = venn_data_locusts$all[["piceifrons"]],
  cancellata = venn_data_locusts$all[["cancellata"]]
)

# Display the Venn diagrams with fallback for missing overlaps
message("Processing Venn diagram for thorax upregulated DEGs...")
display_venn_with_datatable(venn_data_up, "Venn Diagram of Thorax Upregulated DEGs - Locusts", allspecies_df, filtered_final_orthotable)

Overlapping Orthogroups: 
 [1] "OG0012909" "OG0007864" "Unknown"   "OG0012855" "OG0004381" "OG0000630"
 [7] "OG0000022" "OG0008668" "OG0008773" "OG0000354" "OG0013891" "OG0002449"
[13] "OG0000196" "OG0004741" "OG0009529" "OG0009902" "OG0000197" "OG0010559"
[19] "OG0010743" "OG0011005" "OG0012141" "OG0011869" "OG0005151" "OG0006295"
[25] "OG0006293" "OG0005991" "OG0003684" "OG0003702" "OG0003704" "OG0003705"
[31] "OG0006530" "OG0007121" "OG0006498" "OG0000112" "OG0006936" "OG0007438"

message("Processing Venn diagram for thorax downregulated DEGs...")
display_venn_with_datatable(venn_data_down, "Venn Diagram of Thorax Downregulated DEGs - Locusts", allspecies_df, filtered_final_orthotable)

Overlapping Orthogroups: 
 [1] "Unknown"   "OG0007611" "OG0014256" "OG0008629" "OG0008761" "OG0002570"
 [7] "OG0009787" "OG0010410" "OG0011346" "OG0000270" "OG0004972" "OG0000008"
[13] "OG0002151" "OG0000505"

Version	Author	Date
34c299a	Maeva TECHER	2025-02-06
aab712a	Maeva TECHER	2025-02-04
8df3d7c	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

message("Processing Venn diagram for all significant DEGs...")
display_venn_with_datatable(venn_data_all, "Venn Diagram of All Thorax DEGs - Locusts", allspecies_df, filtered_final_orthotable)

Overlapping Orthogroups: 
 [1] "OG0012909" "Unknown"   "OG0007611" "OG0007864" "OG0012855" "OG0004381"
 [7] "OG0000630" "OG0000022" "OG0008668" "OG0012943" "OG0014256" "OG0008773"
[13] "OG0000354" "OG0013891" "OG0008629" "OG0002449" "OG0008761" "OG0000196"
[19] "OG0002570" "OG0004741" "OG0000027" "OG0009529" "OG0009902" "OG0000197"
[25] "OG0009787" "OG0002897" "OG0010559" "OG0010410" "OG0000446" "OG0010863"
[31] "OG0010743" "OG0011005" "OG0011162" "OG0011346" "OG0012141" "OG0011869"
[37] "OG0001366" "OG0000396" "OG0000270" "OG0005151" "OG0004972" "OG0014897"
[43] "OG0006295" "OG0006293" "OG0000008" "OG0005991" "OG0005943" "OG0012394"
[49] "OG0003684" "OG0003702" "OG0003704" "OG0003705" "OG0006530" "OG0002151"
[55] "OG0000218" "OG0007121" "OG0006498" "OG0000112" "OG0000505" "OG0000504"
[61] "OG0000037" "OG0006936" "OG0007438"

# Define the species for Group 1
locusts <- c("gregaria", "piceifrons", "cancellata")
input_file <- file.path(ortho_dir, "Results_I2/Orthogroups_genesproteinbiotype_Schistocerca_Jan2025.csv")
filtered_final_orthotable <- read.csv(input_file, header = TRUE, stringsAsFactors = FALSE)
# Initialize an empty list to store heatmap data for each species
heatmap_list <- list()

# Loop through each species to process their data
for (species in locusts) {
  # Build the path to the significant DESeq2 results for thorax
  thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_sigresults_Thorax_", species, ".csv"))
  
  # Load the DESeq2 results
  thorax_data <- read.csv(thorax_file, stringsAsFactors = FALSE)
  
  # Check if data is empty and handle accordingly
  if (nrow(thorax_data) == 0) {
    message(paste("No data for species:", species))
    next  # Skip to the next species if there's no data
  }
  
  # Rename the "gene_id" column in filtered_final_orthotable for consistency
  colnames(filtered_final_orthotable)[colnames(filtered_final_orthotable) == "gene_id"] <- "GeneID"
  
  # Merge with filtered_final_orthotable to include Orthogroup
  merged_data <- merge(thorax_data, filtered_final_orthotable, by = "GeneID", all.x = TRUE)
  
  # Check if merge was successful
  if (nrow(merged_data) == 0) {
    message(paste("No matching data for species:", species))
    next  # Skip if no matching data after merging
  }

  # Filter for significant DEGs and keep at most the top 500 upregulated and top 500 downregulated genes
  thorax_up <- merged_data %>%
    filter(padj < 0.05 & log2FoldChange > 1) %>%
    arrange(desc(log2FoldChange)) %>%
    slice(1:500)
  
  thorax_down <- merged_data %>%
    filter(padj < 0.05 & log2FoldChange < -1) %>%
    arrange(log2FoldChange) %>%
    slice(1:500)
  
  # Combine data and prepare for heatmap, adding the species column
  heatmap_data <- bind_rows(
    thorax_up %>% mutate(Tissue = "Thorax", Regulation = "Upregulated", Species = species),
    thorax_down %>% mutate(Tissue = "Thorax", Regulation = "Downregulated", Species = species)
  ) %>%
    select(Orthogroup, log2FoldChange, Tissue, Regulation, Species)
  
  # Append the heatmap data to the list
  heatmap_list[[species]] <- heatmap_data
}
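The top-500 selection in the loop above can be written as a small helper. This is a base R sketch under assumptions: `top_n_by_lfc` is a hypothetical name, the toy table is invented, and `NA` adjusted p-values are excluded explicitly because base subsetting, unlike `dplyr::filter()`, does not drop them automatically.

```r
# Hypothetical helper: keep the n strongest significant genes in one direction.
top_n_by_lfc <- function(df, n, direction = c("up", "down")) {
  direction <- match.arg(direction)
  sig <- !is.na(df$padj) & !is.na(df$log2FoldChange) & df$padj < 0.05
  if (direction == "up") {
    hits <- df[sig & df$log2FoldChange > 1, , drop = FALSE]
    hits <- hits[order(-hits$log2FoldChange), , drop = FALSE]
  } else {
    hits <- df[sig & df$log2FoldChange < -1, , drop = FALSE]
    hits <- hits[order(hits$log2FoldChange), , drop = FALSE]
  }
  head(hits, n)  # returns fewer rows when fewer than n genes pass the filter
}

# Toy DESeq2-like table; all values are invented.
toy <- data.frame(GeneID = paste0("g", 1:6),
                  log2FoldChange = c(3.2, 1.8, -2.5, 0.4, -1.1, 5.0),
                  padj = c(0.001, 0.04, 0.01, 0.2, 0.03, NA),
                  stringsAsFactors = FALSE)

top_up <- top_n_by_lfc(toy, 2, "up")
top_down <- top_n_by_lfc(toy, 2, "down")
print(top_up$GeneID)
print(top_down$GeneID)
```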

# Combine all species data into a single dataframe for heatmap matrix preparation
final_heatmap_data <- bind_rows(heatmap_list)

# Check if final_heatmap_data is empty before proceeding
if (nrow(final_heatmap_data) == 0) {
    stop("No valid data available for heatmap generation.")
}

# Filter out rows with missing Orthogroup values
final_heatmap_data <- final_heatmap_data %>%
    filter(!is.na(Orthogroup))

# Check if there are any missing values in log2FoldChange (optional, just in case)
final_heatmap_data <- final_heatmap_data %>%
    filter(!is.na(log2FoldChange))

# Create heatmap matrix using Orthogroup instead of GeneID
heatmap_matrix <- final_heatmap_data %>%
    group_by(Orthogroup, Species) %>%
    summarize(
        Thorax_Combined = sum(log2FoldChange[Tissue == "Thorax"], na.rm = TRUE),
        .groups = 'drop'
    ) %>%
    pivot_wider(names_from = Species, 
                values_from = Thorax_Combined, 
                values_fill = list(Thorax_Combined = 0)) %>%
    column_to_rownames("Orthogroup") %>%
    as.matrix()
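The reshaping step sums log2 fold changes per orthogroup and species, then fills absent combinations with 0. A base R sketch of the same idea, using an invented long table and `tapply()` instead of the tidyverse pipeline, looks like this:

```r
# Toy long-format table; orthogroups, species, and values are invented.
long <- data.frame(
  Orthogroup = c("OG1", "OG1", "OG2", "OG1", "OG3"),
  Species = c("gregaria", "gregaria", "gregaria", "piceifrons", "cancellata"),
  log2FoldChange = c(2.0, 1.5, -3.0, 1.2, -1.1),
  stringsAsFactors = FALSE
)

# Sum fold changes of all genes in the same orthogroup and species.
wide <- tapply(long$log2FoldChange,
               list(long$Orthogroup, long$Species), sum)

# Combinations with no selected gene are NA; set them to 0 as in the pipeline.
wide[is.na(wide)] <- 0
print(wide)
```

A 0 in this matrix therefore means that no gene of that orthogroup was among the selected DEGs for that species, not that its measured fold change was zero.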

# Check if heatmap_matrix is empty
if (nrow(heatmap_matrix) == 0) {
    stop("No valid data available for heatmap matrix.")
}

# Define color palettes
# Define a custom color gradient where 0 is black
custom_color_palette1 <- colorRampPalette(c("cyan", "cyan3", "black", "orange3", "orange"))(100)

# Define a custom color gradient where 0 is white
custom_color_palette2 <- colorRampPalette(c("blue3", "blue", "white", "red", "red3"))(100)

# Define color breaks so that the palette midpoint (white or black) sits at 0
max_abs_lfc <- max(abs(heatmap_matrix), na.rm = TRUE)  # Get max absolute log2FoldChange
color_breaks <- seq(-max_abs_lfc, max_abs_lfc, length.out = 101)  # Symmetric scale; pheatmap expects one more break than colors

# Create heatmap with row clustering (white-centered palette)
pheatmap(
  heatmap_matrix,
  color = custom_color_palette2,
  breaks = color_breaks,
  cluster_rows = TRUE,  # Cluster genes
  cluster_cols = FALSE,  # Do not cluster species
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Orthologs Expression in Thorax Tissue - STRATEGY 2"
)

# Create the same heatmap with the black-centered palette (columns are not clustered in either plot)
pheatmap(
  heatmap_matrix,
  color = custom_color_palette1,
  breaks = color_breaks,
  cluster_rows = TRUE,  
  cluster_cols = FALSE,  
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Orthologs Expression in Thorax Tissue - STRATEGY 2"
)

piceifrons-americana-cubense

Head tissues

# Define the species for PACclade
PACclade <- c("piceifrons", "americana", "cubense")
input_file <- file.path(ortho_dir, "Results_I2/Orthogroups_genesproteinbiotype_Schistocerca_Jan2025.csv")
filtered_final_orthotable <- read.csv(input_file, header = TRUE, stringsAsFactors = FALSE)

# Function to load DEGs for a given group of species
load_deg_data <- function(PACclade, allspecies_df, filtered_final_orthotable) {
    degs_up <- list()
    degs_down <- list()
    degs_all <- list()
    
    # Rename the "gene_id" column in filtered_final_orthotable for consistency
    colnames(filtered_final_orthotable)[colnames(filtered_final_orthotable) == "gene_id"] <- "GeneID"
    
    for (species in PACclade) {
        head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_results_Head_", species, ".csv"))
        
        # Check if the file exists
        if (!file.exists(head_file)) {
            message(paste("File not found for species:", species))
            next  # Skip this iteration if the file is missing
        }
        
        # Read the data
        head_data <- read.csv(head_file, stringsAsFactors = FALSE)
        
        # Rename the "X" column to "GeneID"
        colnames(head_data)[colnames(head_data) == "X"] <- "GeneID"
        
        # Merge DEG data with GeneType and Orthogroup information
        head_data_merged <- merge(head_data, allspecies_df[, c("GeneID", "GeneType", "Species")], by = "GeneID")
        head_data_merged <- merge(head_data_merged, filtered_final_orthotable[, c("GeneID", "Orthogroup")], by = "GeneID")
        
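        # Handle missing Orthogroups
        # Note: every unassigned gene receives the same "Unknown" label, so "Unknown" can
        # appear in the overlap lists without implying that those genes are orthologous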
        head_data_merged$Orthogroup[is.na(head_data_merged$Orthogroup)] <- "Unknown"
        
        # Filter for significant DEGs (both upregulated and downregulated)
        head_up <- head_data_merged %>%
            filter(padj < 0.05 & log2FoldChange >= 1) %>%
            select(Orthogroup) %>%
            distinct()
        
        head_down <- head_data_merged %>%
            filter(padj < 0.05 & log2FoldChange <= -1) %>%
            select(Orthogroup) %>%
            distinct()
        
        all_deg <- head_data_merged %>%
            filter(padj < 0.05 & abs(log2FoldChange) >= 1) %>%
            select(Orthogroup) %>%
            distinct()
        
        # Store the DEGs in the list
        degs_up[[species]] <- head_up$Orthogroup
        degs_down[[species]] <- head_down$Orthogroup
        degs_all[[species]] <- all_deg$Orthogroup
    }
    
    return(list(up = degs_up, down = degs_down, all = degs_all))
}
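The three filters in `load_deg_data` apply the same cut-offs (`padj < 0.05`, `|log2FoldChange| >= 1`). The sketch below restates them as one hypothetical helper, `classify_deg`, run on invented values. It is meant to make the boundary behaviour explicit, not to replace the function above.

```r
# Hypothetical helper: label each gene as "up", "down", or "ns" (not significant).
classify_deg <- function(lfc, padj, lfc_cut = 1, padj_cut = 0.05) {
  status <- rep("ns", length(lfc))
  sig <- !is.na(padj) & !is.na(lfc) & padj < padj_cut
  status[sig & lfc >= lfc_cut] <- "up"     # inclusive, as in load_deg_data
  status[sig & lfc <= -lfc_cut] <- "down"
  status
}

# Invented fold changes and adjusted p-values.
status <- classify_deg(lfc  = c(1, -1, 2.5, 0.9, -3),
                       padj = c(0.01, 0.01, 0.2, 0.001, NA))
print(status)
```

Here the thresholds are inclusive (`>= 1`, `<= -1`), whereas the heatmap loops in this report use strict inequalities (`> 1`, `< -1`), so a gene with a fold change of exactly 1 is counted in the Venn sets but not in the heatmaps.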

# Function to display Venn diagram and corresponding datatable based on Orthogroups
display_venn_with_datatable <- function(venn_data, title, allspecies_df, filtered_final_orthotable) {
    
    # Calculate overlapping Orthogroups
    overlap_orthogroups <- Reduce(intersect, venn_data)
    
    # Print overlap info
    cat("Overlapping Orthogroups: \n")
    print(overlap_orthogroups)
    
    # If no overlaps exist, display a message and an empty plot
    if (length(overlap_orthogroups) == 0) {
        message("⚠️ No overlapping Orthogroups found. Displaying an empty Venn diagram.")
        
        # Create an empty Venn diagram placeholder
        plot.new()
        text(0.5, 0.5, "No overlapping Orthogroups found", cex = 1.5, col = "red")
        
        return(NULL)  # Exit the function gracefully
    }
    
    # Create a data frame for the overlapping Orthogroups
    overlap_df <- data.frame(Orthogroup = overlap_orthogroups)
    
    # Merge to get species and other information from filtered_final_orthotable
    meta_brock_df <- merge(overlap_df, filtered_final_orthotable, by = "Orthogroup", all.x = TRUE)
    
    # Ensure merged data exists
    if (nrow(meta_brock_df) == 0) {
        message("⚠️ Merge failed: No matching rows after merging Orthogroups.")
        return(NULL)
    }
    
    # Generate the Venn diagram
    venn_plot <- venn.diagram(
        x = venn_data, 
        category.names = c("piceifrons", "americana", "cubense"), 
        filename = NULL, 
        output = TRUE,
        fill = c("red", "green", "yellow"),
        alpha = 0.5,
        cex = 1.2,
        cat.cex = 0,
        main = title,
        main.cex = 1.2
    )
    
    # Clear the current plotting area before drawing the Venn diagram
    grid.newpage()
    
    # Display the Venn diagram
    grid.draw(venn_plot)
    
    # Manually create a custom legend
    legend_labels <- c("piceifrons", "americana", "cubense")
    legend_colors <- c("red", "green", "yellow")
    
    # Positioning the legend lower on the right side of the plot
    legend_x <- unit(0.85, "npc")  # Adjust x position
    legend_y <- unit(0.2, "npc")   # Lower the legend vertically
    
    # Draw the legend
    for (i in seq_along(legend_labels)) {
        grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  width = unit(0.02, "npc"), height = unit(0.02, "npc"), 
                  gp = gpar(fill = legend_colors[i], col = NA))
        grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"), 
                  y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  just = "left", gp = gpar(cex = 0.8))
    }
    
    # Display the merged overlapping Orthogroups table with datatable
    datatable(meta_brock_df, options = list(
        pageLength = 10,
        scrollX = TRUE,
        autoWidth = TRUE,
        searchHighlight = TRUE
    ),
    rownames = FALSE,
    escape = FALSE
    ) %>%
        formatStyle(
            'Species', target = 'cell',
            fontStyle = 'italic'
        ) %>%
        formatStyle(
            columns = names(meta_brock_df), 
            target = 'row',
            color = styleEqual(c("red", "blue", "black"), c("red", "blue", "black")),
            fontWeight = styleEqual(c("bold", "normal"), c("bold", "normal")),
            backgroundColor = styleEqual(c("red", "blue", "black"), c("white", "white", "white"))
        )
}
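The manual legend places one coloured square and one label per species, each 0.05 npc lower than the previous one. Since grid functions accept vectors, the same layout can be drawn without a loop. The sketch below is an assumption-laden illustration: it draws to a null PDF device (`pdf(NULL)`) so that it runs non-interactively, and `y_pos` is a hypothetical name.

```r
library(grid)

legend_labels <- c("piceifrons", "americana", "cubense")
legend_colors <- c("red", "green", "yellow")

# One y position per entry, stepping down by 0.05 npc from 0.2.
y_pos <- 0.2 - (seq_along(legend_labels) - 1) * 0.05

pdf(NULL)  # off-screen device; no file is written
grid.newpage()
grid.rect(x = unit(0.85, "npc"), y = unit(y_pos, "npc"),
          width = unit(0.02, "npc"), height = unit(0.02, "npc"),
          gp = gpar(fill = legend_colors, col = NA))
grid.text(label = legend_labels, x = unit(0.90, "npc"),
          y = unit(y_pos, "npc"), just = "left", gp = gpar(cex = 0.8))
invisible(dev.off())
```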

# Load DEGs for the PAC clade
venn_data_pacclade <- load_deg_data(PACclade, allspecies_df, filtered_final_orthotable)

# Prepare the data for the Venn diagrams for PACclade
venn_data_up <- list(
  piceifrons = venn_data_pacclade$up[["piceifrons"]],
  americana = venn_data_pacclade$up[["americana"]],
  cubense = venn_data_pacclade$up[["cubense"]]
)

venn_data_down <- list(
  piceifrons = venn_data_pacclade$down[["piceifrons"]],
  americana = venn_data_pacclade$down[["americana"]],
  cubense = venn_data_pacclade$down[["cubense"]]
)

venn_data_all <- list(
  piceifrons = venn_data_pacclade$all[["piceifrons"]],
  americana = venn_data_pacclade$all[["americana"]],
  cubense = venn_data_pacclade$all[["cubense"]]
)

# Display the Venn diagram and datatable for head upregulated DEGs (PACclade)
display_venn_with_datatable(venn_data_up, "Venn Diagram of Head Upregulated DEGs - PAC", allspecies_df, filtered_final_orthotable)

Overlapping Orthogroups: 
[1] "Unknown"   "OG0006291" "OG0012596"

# Display the Venn diagram and datatable for head downregulated DEGs (PACclade)
display_venn_with_datatable(venn_data_down, "Venn Diagram of Head Downregulated DEGs - PAC", allspecies_df, filtered_final_orthotable)

Overlapping Orthogroups: 
[1] "Unknown"

# Display the Venn diagram and datatable for all significant DEGs (PACclade)
display_venn_with_datatable(venn_data_all, "Venn Diagram of All Head DEGs - PAC", allspecies_df, filtered_final_orthotable)

Overlapping Orthogroups: 
[1] "Unknown"   "OG0006291" "OG0012596" "OG0008500"

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

# Define the species for the PAC clade
PACclade <- c("piceifrons", "americana", "cubense")
input_file <- file.path(ortho_dir, "Results_I2/Orthogroups_genesproteinbiotype_Schistocerca_Jan2025.csv")
filtered_final_orthotable <- read.csv(input_file, header = TRUE, stringsAsFactors = FALSE)
# Initialize an empty list to store heatmap data for each species
heatmap_list <- list()

# Loop through each species to process their data
for (species in PACclade) {
  # Load DESeq2 results for head
  head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_sigresults_Head_", species, ".csv"))
  
  # Load the DESeq2 results
  head_data <- read.csv(head_file, stringsAsFactors = FALSE)
  
  # Check if data is empty and handle accordingly
  if (nrow(head_data) == 0) {
    message(paste("No data for species:", species))
    next  # Skip to the next species if there's no data
  }
  
  # Rename the "gene_id" column in filtered_final_orthotable for consistency
  colnames(filtered_final_orthotable)[colnames(filtered_final_orthotable) == "gene_id"] <- "GeneID"
  
  # Merge with filtered_final_orthotable to include Orthogroup
  merged_data <- merge(head_data, filtered_final_orthotable, by = "GeneID", all.x = TRUE)
  
  # Check if merge was successful
  if (nrow(merged_data) == 0) {
    message(paste("No matching data for species:", species))
    next  # Skip if no matching data after merging
  }

  # Filter for significant DEGs and keep at most the top 500 upregulated and top 500 downregulated genes
  head_up <- merged_data %>%
    filter(padj < 0.05 & log2FoldChange > 1) %>%
    arrange(desc(log2FoldChange)) %>%
    slice(1:500)
  
  head_down <- merged_data %>%
    filter(padj < 0.05 & log2FoldChange < -1) %>%
    arrange(log2FoldChange) %>%
    slice(1:500)
  
  # Combine data and prepare for heatmap, adding the species column
  heatmap_data <- bind_rows(
    head_up %>% mutate(Tissue = "Head", Regulation = "Upregulated", Species = species),
    head_down %>% mutate(Tissue = "Head", Regulation = "Downregulated", Species = species)
  ) %>%
    select(Orthogroup, log2FoldChange, Tissue, Regulation, Species)
  
  # Append the heatmap data to the list
  heatmap_list[[species]] <- heatmap_data
}
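Unlike `load_deg_data`, the loop above calls `read.csv()` without first checking that the file exists, so a missing file stops the run with an error. One possible guard is sketched below. `read_deg_table` is a hypothetical helper, and the temporary file only stands in for a real DESeq2 results file.

```r
# Hypothetical helper: return NULL instead of failing on a missing or empty file.
read_deg_table <- function(path) {
  if (!file.exists(path)) {
    message("File not found: ", path)
    return(NULL)
  }
  tab <- read.csv(path, stringsAsFactors = FALSE)
  if (nrow(tab) == 0) {
    message("No rows in: ", path)
    return(NULL)
  }
  tab
}

# Demonstration with a temporary file holding one invented row.
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(GeneID = "g1", log2FoldChange = 2, padj = 0.01),
          tmp, row.names = FALSE)

present <- read_deg_table(tmp)
absent <- read_deg_table(file.path(tempdir(), "does_not_exist.csv"))
unlink(tmp)
```

Inside the loop, a `NULL` return value could then trigger `next`, mirroring the skip logic already used for empty tables.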

# Combine all species data into a single dataframe for heatmap matrix preparation
final_heatmap_data <- bind_rows(heatmap_list)

# Check if final_heatmap_data is empty before proceeding
if (nrow(final_heatmap_data) == 0) {
    stop("No valid data available for heatmap generation.")
}

# Filter out rows with missing Orthogroup values
final_heatmap_data <- final_heatmap_data %>%
    filter(!is.na(Orthogroup))

# Check if there are any missing values in log2FoldChange (optional, just in case)
final_heatmap_data <- final_heatmap_data %>%
    filter(!is.na(log2FoldChange))

# Create heatmap matrix using Orthogroup instead of GeneID
heatmap_matrix <- final_heatmap_data %>%
    group_by(Orthogroup, Species) %>%
    summarize(
        Head_Combined = sum(log2FoldChange[Tissue == "Head"], na.rm = TRUE),
        .groups = 'drop'
    ) %>%
    pivot_wider(names_from = Species, 
                values_from = Head_Combined, 
                values_fill = list(Head_Combined = 0)) %>%
    column_to_rownames("Orthogroup") %>%
    as.matrix()

# Check if heatmap_matrix is empty
if (nrow(heatmap_matrix) == 0) {
    stop("No valid data available for heatmap matrix.")
}

# Define color palettes
# Define a custom color gradient where 0 is black
custom_color_palette1 <- colorRampPalette(c("cyan", "cyan3", "black", "orange3", "orange"))(100)

# Define a custom color gradient where 0 is white
custom_color_palette2 <- colorRampPalette(c("blue3", "blue", "white", "red", "red3"))(100)

# Define color breaks so that the palette midpoint (white or black) sits at 0
max_abs_lfc <- max(abs(heatmap_matrix), na.rm = TRUE)  # Get max absolute log2FoldChange
color_breaks <- seq(-max_abs_lfc, max_abs_lfc, length.out = 101)  # Symmetric scale; pheatmap expects one more break than colors

# Create heatmap with row clustering (white-centered palette)
pheatmap(
  heatmap_matrix,
  color = custom_color_palette2,
  breaks = color_breaks,
  cluster_rows = TRUE,  # Cluster genes
  cluster_cols = FALSE,  # Do not cluster species
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Orthologs Expression in Head Tissue - STRATEGY 2"
)

# Create the same heatmap with the black-centered palette (columns are not clustered in either plot)
pheatmap(
  heatmap_matrix,
  color = custom_color_palette1,
  breaks = color_breaks,
  cluster_rows = TRUE,  
  cluster_cols = FALSE,  
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Orthologs Expression in Head Tissue - STRATEGY 2"
)

Thorax tissues

# Define the species for PACclade
PACclade <- c("piceifrons", "americana", "cubense")
input_file <- file.path(ortho_dir, "Results_I2/Orthogroups_genesproteinbiotype_Schistocerca_Jan2025.csv")
filtered_final_orthotable <- read.csv(input_file, header = TRUE, stringsAsFactors = FALSE)

# Function to load DEGs for a given group of species
load_deg_data <- function(PACclade, allspecies_df, filtered_final_orthotable) {
    degs_up <- list()
    degs_down <- list()
    degs_all <- list()
    
    # Rename the "gene_id" column in filtered_final_orthotable for consistency
    colnames(filtered_final_orthotable)[colnames(filtered_final_orthotable) == "gene_id"] <- "GeneID"
    
    for (species in PACclade) {
        thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_results_Thorax_", species, ".csv"))
        
        # Check if the file exists
        if (!file.exists(thorax_file)) {
            message(paste("File not found for species:", species))
            next  # Skip this iteration if the file is missing
        }
        
        # Read the data
        thorax_data <- read.csv(thorax_file, stringsAsFactors = FALSE)
        
        # Rename the "X" column to "GeneID"
        colnames(thorax_data)[colnames(thorax_data) == "X"] <- "GeneID"
        
        # Merge DEG data with GeneType and Orthogroup information
        thorax_data_merged <- merge(thorax_data, allspecies_df[, c("GeneID", "GeneType", "Species")], by = "GeneID")
        thorax_data_merged <- merge(thorax_data_merged, filtered_final_orthotable[, c("GeneID", "Orthogroup")], by = "GeneID")
        
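        # Handle missing Orthogroups
        # Note: every unassigned gene receives the same "Unknown" label, so "Unknown" can
        # appear in the overlap lists without implying that those genes are orthologous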
        thorax_data_merged$Orthogroup[is.na(thorax_data_merged$Orthogroup)] <- "Unknown"
        
        # Filter for significant DEGs (both upregulated and downregulated)
        thorax_up <- thorax_data_merged %>%
            filter(padj < 0.05 & log2FoldChange >= 1) %>%
            select(Orthogroup) %>%
            distinct()
        
        thorax_down <- thorax_data_merged %>%
            filter(padj < 0.05 & log2FoldChange <= -1) %>%
            select(Orthogroup) %>%
            distinct()
        
        all_deg <- thorax_data_merged %>%
            filter(padj < 0.05 & abs(log2FoldChange) >= 1) %>%
            select(Orthogroup) %>%
            distinct()
        
        # Store the DEGs in the list
        degs_up[[species]] <- thorax_up$Orthogroup
        degs_down[[species]] <- thorax_down$Orthogroup
        degs_all[[species]] <- all_deg$Orthogroup
    }
    
    return(list(up = degs_up, down = degs_down, all = degs_all))
}
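The head and thorax versions of `load_deg_data` differ only in the tissue name embedded in the file path. A hypothetical path builder, `deg_result_path`, shows how the tissue could become a parameter. The directory layout is copied from the code above, while `"workDir"` is a placeholder string rather than the real project directory.

```r
# Hypothetical helper: build the DESeq2 results path for one tissue and species.
deg_result_path <- function(work_dir, tissue, species,
                            prefix = "DESeq2_results_") {
  file.path(work_dir, "DEG_results/Bulk_RNAseq",
            paste0(prefix, tissue, "_", species, ".csv"))
}

pac_species <- c("piceifrons", "americana", "cubense")

# "workDir" is a placeholder for the project working directory.
thorax_paths <- vapply(pac_species,
                       function(sp) deg_result_path("workDir", "Thorax", sp),
                       character(1))
print(thorax_paths)
```

Passing `prefix = "DESeq2_sigresults_"` would give the file names used by the heatmap loops.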

# Function to display Venn diagram and corresponding datatable based on Orthogroups
display_venn_with_datatable <- function(venn_data, title, allspecies_df, filtered_final_orthotable) {
    
    # Calculate overlapping Orthogroups
    overlap_orthogroups <- Reduce(intersect, venn_data)
    
    # Print overlap info
    cat("Overlapping Orthogroups: \n")
    print(overlap_orthogroups)
    
    # If no overlaps exist, display a message and an empty plot
    if (length(overlap_orthogroups) == 0) {
        message("⚠️ No overlapping Orthogroups found. Displaying an empty Venn diagram.")
        
        # Create an empty Venn diagram placeholder
        plot.new()
        text(0.5, 0.5, "No overlapping Orthogroups found", cex = 1.5, col = "red")
        
        return(NULL)  # Exit the function gracefully
    }
    
    # Create a data frame for the overlapping Orthogroups
    overlap_df <- data.frame(Orthogroup = overlap_orthogroups)
    
    # Merge to get species and other information from filtered_final_orthotable
    meta_brock_df <- merge(overlap_df, filtered_final_orthotable, by = "Orthogroup", all.x = TRUE)
    
    # Ensure merged data exists
    if (nrow(meta_brock_df) == 0) {
        message("⚠️ Merge failed: No matching rows after merging Orthogroups.")
        return(NULL)
    }
   
    # Generate the Venn diagram
    venn_plot <- venn.diagram(
        x = venn_data, 
        category.names = c("piceifrons", "americana", "cubense"), 
        filename = NULL, 
        output = TRUE,
        fill = c("red", "green", "yellow"),
        alpha = 0.5,
        cex = 1.2,
        cat.cex = 0,
        main = title,
        main.cex = 1.2
    )
    
    # Clear the current plotting area before drawing the Venn diagram
    grid.newpage()
    
    # Display the Venn diagram
    grid.draw(venn_plot)
    
    # Manually create a custom legend
    legend_labels <- c("piceifrons", "americana", "cubense")
    legend_colors <- c("red", "green", "yellow")
    
    # Positioning the legend lower on the right side of the plot
    legend_x <- unit(0.85, "npc")  # Adjust x position
    legend_y <- unit(0.2, "npc")   # Lower the legend vertically
    
    # Draw the legend
    for (i in seq_along(legend_labels)) {
        grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  width = unit(0.02, "npc"), height = unit(0.02, "npc"), 
                  gp = gpar(fill = legend_colors[i], col = NA))
        grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"), 
                  y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  just = "left", gp = gpar(cex = 0.8))
    }
    
    # Display the merged overlapping Orthogroups table with datatable
    datatable(meta_brock_df, options = list(
        pageLength = 10,
        scrollX = TRUE,
        autoWidth = TRUE,
        searchHighlight = TRUE
    ),
    rownames = FALSE,
    escape = FALSE
    ) %>%
        formatStyle(
            'Species', target = 'cell',
            fontStyle = 'italic'
        ) %>%
        formatStyle(
            columns = names(meta_brock_df), 
            target = 'row',
            color = styleEqual(c("red", "blue", "black"), c("red", "blue", "black")),
            fontWeight = styleEqual(c("bold", "normal"), c("bold", "normal")),
            backgroundColor = styleEqual(c("red", "blue", "black"), c("white", "white", "white"))
        )
}

# Load the upregulated, downregulated, and combined DEG Orthogroup sets for the PAC clade
venn_data_pacclade <- load_deg_data(PACclade, allspecies_df, filtered_final_orthotable)

# Prepare the data for the Venn diagrams for PACclade
venn_data_up <- list(
  piceifrons = venn_data_pacclade$up[["piceifrons"]],
  americana = venn_data_pacclade$up[["americana"]],
  cubense = venn_data_pacclade$up[["cubense"]]
)

venn_data_down <- list(
  piceifrons = venn_data_pacclade$down[["piceifrons"]],
  americana = venn_data_pacclade$down[["americana"]],
  cubense = venn_data_pacclade$down[["cubense"]]
)

venn_data_all <- list(
  piceifrons = venn_data_pacclade$all[["piceifrons"]],
  americana = venn_data_pacclade$all[["americana"]],
  cubense = venn_data_pacclade$all[["cubense"]]
)

# Display the Venn diagram and datatable for thorax upregulated DEGs (PACclade)
display_venn_with_datatable(venn_data_up, "Venn Diagram of Thorax Upregulated DEGs - PAC", allspecies_df, filtered_final_orthotable)

Overlapping Orthogroups: 
[1] "Unknown"   "OG0000111"

Version	Author	Date
34c299a	Maeva TECHER	2025-02-06
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

# Display the Venn diagram and datatable for thorax downregulated DEGs (PACclade)
display_venn_with_datatable(venn_data_down, "Venn Diagram of Thorax Downregulated DEGs - PAC", allspecies_df, filtered_final_orthotable)

Overlapping Orthogroups: 
[1] "Unknown"

Version	Author	Date
34c299a	Maeva TECHER	2025-02-06
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

# Display the Venn diagram and datatable for all significant DEGs (PACclade)
display_venn_with_datatable(venn_data_all, "Venn Diagram of All Thorax DEGs - PAC", allspecies_df, filtered_final_orthotable)

Overlapping Orthogroups: 
[1] "Unknown"   "OG0000315" "OG0000142" "OG0001691" "OG0000111" "OG0000467"
[7] "OG0008500"

Version	Author	Date
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

# Define the species for PACclade
PACclade <- c("piceifrons", "americana", "cubense")
input_file <- file.path(ortho_dir, "Results_I2/Orthogroups_genesproteinbiotype_Schistocerca_Jan2025.csv")
filtered_final_orthotable <- read.csv(input_file, header = TRUE, stringsAsFactors = FALSE)
# Initialize an empty list to store heatmap data for each species
heatmap_list <- list()

# Loop through each species to process their data
for (species in PACclade) {
  # Build the path to the significant DESeq2 results for thorax
  thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_sigresults_Thorax_", species, ".csv"))
  
  # Load the DESeq2 results
  thorax_data <- read.csv(thorax_file, stringsAsFactors = FALSE)
  
  # Check if data is empty and handle accordingly
  if (nrow(thorax_data) == 0) {
    message(paste("No data for species:", species))
    next  # Skip to the next species if there's no data
  }
  
  # Rename the "gene_id" column in filtered_final_orthotable for consistency
  colnames(filtered_final_orthotable)[colnames(filtered_final_orthotable) == "gene_id"] <- "GeneID"
  
  # Merge with filtered_final_orthotable to include Orthogroup
  merged_data <- merge(thorax_data, filtered_final_orthotable, by = "GeneID", all.x = TRUE)
  
  # Check if merge was successful
  if (nrow(merged_data) == 0) {
    message(paste("No matching data for species:", species))
    next  # Skip if no matching data after merging
  }

  # Filter for significant DEGs and keep the top 500 upregulated and top 500 downregulated genes
  thorax_up <- merged_data %>%
    filter(padj < 0.05 & log2FoldChange > 1) %>%
    arrange(desc(log2FoldChange)) %>%
    slice(1:500)
  
  thorax_down <- merged_data %>%
    filter(padj < 0.05 & log2FoldChange < -1) %>%
    arrange(log2FoldChange) %>%
    slice(1:500)
  
  # Combine data and prepare for heatmap, adding the species column
  heatmap_data <- bind_rows(
    thorax_up %>% mutate(Tissue = "Thorax", Regulation = "Upregulated", Species = species),
    thorax_down %>% mutate(Tissue = "Thorax", Regulation = "Downregulated", Species = species)
  ) %>%
    select(Orthogroup, log2FoldChange, Tissue, Regulation, Species)
  
  # Append the heatmap data to the list
  heatmap_list[[species]] <- heatmap_data
}

# Combine all species data into a single dataframe for heatmap matrix preparation
final_heatmap_data <- bind_rows(heatmap_list)

# Check if final_heatmap_data is empty before proceeding
if (nrow(final_heatmap_data) == 0) {
    stop("No valid data available for heatmap generation.")
}

# Filter out rows with missing Orthogroup values
final_heatmap_data <- final_heatmap_data %>%
    filter(!is.na(Orthogroup))

# Drop rows with missing log2FoldChange values (optional safeguard)
final_heatmap_data <- final_heatmap_data %>%
    filter(!is.na(log2FoldChange))

# Create heatmap matrix using Orthogroup instead of GeneID
heatmap_matrix <- final_heatmap_data %>%
    group_by(Orthogroup, Species) %>%
    summarize(
        Thorax_Combined = sum(log2FoldChange[Tissue == "Thorax"], na.rm = TRUE),
        .groups = 'drop'
    ) %>%
    pivot_wider(names_from = Species, 
                values_from = Thorax_Combined, 
                values_fill = list(Thorax_Combined = 0)) %>%
    column_to_rownames("Orthogroup") %>%
    as.matrix()

# Check if heatmap_matrix is empty
if (nrow(heatmap_matrix) == 0) {
    stop("No valid data available for heatmap matrix.")
}

# Define color palettes
# Define a custom color gradient where 0 is black
custom_color_palette1 <- colorRampPalette(c("cyan", "cyan3", "black", "orange3", "orange"))(100)

# Define a custom color gradient where 0 is white
custom_color_palette2 <- colorRampPalette(c("blue3", "blue", "white", "red", "red3"))(100)

# Define color breaks so that the palette midpoint (white or black) sits at 0
max_abs_lfc <- max(abs(heatmap_matrix), na.rm = TRUE)  # Largest absolute value in the matrix
color_breaks <- seq(-max_abs_lfc, max_abs_lfc, length.out = 101)  # Symmetric scale; one more break than the 100 colors

# Create heatmap with the white-centered palette (rows clustered, columns fixed)
pheatmap(
  heatmap_matrix,
  color = custom_color_palette2,
  breaks = color_breaks,
  cluster_rows = TRUE,  # Cluster Orthogroups
  cluster_cols = FALSE,  # Do not cluster species
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Orthologs Expression in Thorax Tissue - STRATEGY 2"
)

Version	Author	Date
34c299a	Maeva TECHER	2025-02-06
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

# Create the same heatmap with the black-centered palette
pheatmap(
  heatmap_matrix,
  color = custom_color_palette1,
  breaks = color_breaks,
  cluster_rows = TRUE,  
  cluster_cols = FALSE,  
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Orthologs Expression in Thorax Tissue - STRATEGY 2"
)

Version	Author	Date
34c299a	Maeva TECHER	2025-02-06
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19
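A note on the color scale of the heatmaps above. The pheatmap documentation describes `breaks` as a sequence that is one element longer than the color vector. The sketch below builds a symmetric set of breaks around 0 for a 100-color palette; `toy_matrix`, `palette_white`, and `n_colors` are hypothetical names and the values are invented, so this is an illustration of the arithmetic rather than the analysis itself.

```r
n_colors <- 100
palette_white <- colorRampPalette(c("blue3", "blue", "white", "red", "red3"))(n_colors)

# Toy matrix standing in for heatmap_matrix (values are invented)
toy_matrix <- matrix(c(-3.2, 0, 1.5, 4.8, -0.7, 0), nrow = 3)

# Largest absolute value sets both ends of the scale
max_abs <- max(abs(toy_matrix), na.rm = TRUE)

# n_colors + 1 break points bound exactly n_colors color bins
toy_breaks <- seq(-max_abs, max_abs, length.out = n_colors + 1)

# The middle break is 0 (up to floating-point error), so it separates
# the two central colors of the palette
mid_break <- toy_breaks[n_colors / 2 + 1]

cat("colors:", length(palette_white), "breaks:", length(toy_breaks), "\n")
```

With an even number of colors, 0 falls on the boundary between the two central bins, so values just below and just above 0 get the two near-white (or near-black) shades.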

Plastic species

Head tissues

# Define the species for plastic_species
plastic_species <- c("gregaria", "piceifrons", "cancellata", "americana")
input_file <- file.path(ortho_dir, "Results_I2/Orthogroups_genesproteinbiotype_Schistocerca_Jan2025.csv")
filtered_final_orthotable <- read.csv(input_file, header = TRUE, stringsAsFactors = FALSE)


# Function to load DEGs for a given group of species
load_deg_data <- function(plastic_species, allspecies_df, filtered_final_orthotable) {
    degs_up <- list()
    degs_down <- list()
    degs_all <- list()
    
    # Rename the "gene_id" column in filtered_final_orthotable for consistency
    colnames(filtered_final_orthotable)[colnames(filtered_final_orthotable) == "gene_id"] <- "GeneID"
    
    for (species in plastic_species) {
        head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_results_Head_", species, ".csv"))
        
        # Check if the file exists
        if (!file.exists(head_file)) {
            message(paste("File not found for species:", species))
            next  # Skip this iteration if the file is missing
        }
        
        # Read the data
        head_data <- read.csv(head_file, stringsAsFactors = FALSE)
        
        # Rename the "X" column to "GeneID"
        colnames(head_data)[colnames(head_data) == "X"] <- "GeneID"
        
        # Merge DEG data with GeneType and Orthogroup information
        head_data_merged <- merge(head_data, allspecies_df[, c("GeneID", "GeneType", "Species")], by = "GeneID")
        head_data_merged <- merge(head_data_merged, filtered_final_orthotable[, c("GeneID", "Orthogroup")], by = "GeneID")
        
        # Handle missing Orthogroups
        head_data_merged$Orthogroup[is.na(head_data_merged$Orthogroup)] <- "Unknown"
        
        # Filter for significant DEGs (both upregulated and downregulated)
        head_up <- head_data_merged %>%
            filter(padj < 0.05 & log2FoldChange >= 1) %>%
            select(Orthogroup) %>%
            distinct()
        
        head_down <- head_data_merged %>%
            filter(padj < 0.05 & log2FoldChange <= -1) %>%
            select(Orthogroup) %>%
            distinct()
        
        all_deg <- head_data_merged %>%
            filter(padj < 0.05 & abs(log2FoldChange) >= 1) %>%
            select(Orthogroup) %>%
            distinct()
        
        # Store the DEGs in the list
        degs_up[[species]] <- head_up$Orthogroup
        degs_down[[species]] <- head_down$Orthogroup
        degs_all[[species]] <- all_deg$Orthogroup
    }
    
    return(list(up = degs_up, down = degs_down, all = degs_all))
}

# Function to display Venn diagram and corresponding datatable based on Orthogroups
display_venn_with_datatable <- function(venn_data, title, allspecies_df, filtered_final_orthotable) {
    
    # Calculate overlapping Orthogroups
    overlap_orthogroups <- Reduce(intersect, venn_data)
    
    # Print overlap info
    cat("Overlapping Orthogroups: \n")
    print(overlap_orthogroups)
    
    # If no overlaps exist, display a message and an empty plot
    if (length(overlap_orthogroups) == 0) {
        message("⚠️ No overlapping Orthogroups found. Displaying an empty Venn diagram.")
        
        # Create an empty Venn diagram placeholder
        plot.new()
        text(0.5, 0.5, "No overlapping Orthogroups found", cex = 1.5, col = "red")
        
        return(NULL)  # Exit the function gracefully
    }
    
    # Create a data frame for the overlapping Orthogroups
    overlap_df <- data.frame(Orthogroup = overlap_orthogroups)
    
    # Merge to get species and other information from filtered_final_orthotable
    meta_brock_df <- merge(overlap_df, filtered_final_orthotable, by = "Orthogroup", all.x = TRUE)
    
    # Ensure merged data exists
    if (nrow(meta_brock_df) == 0) {
        message("⚠️ Merge failed: No matching rows after merging Orthogroups.")
        return(NULL)
    }
    
    # Generate the Venn diagram
    venn_plot <- venn.diagram(
        x = venn_data, 
        category.names = names(venn_data),  # Labels follow the sets actually passed in
        filename = NULL, 
        output = TRUE,
        fill = c("red", "green", "yellow", "orange"),
        alpha = 0.5,
        cex = 1.2,
        cat.cex = 0,
        main = title,
        main.cex = 1.2
    )
    
    # Clear the current plotting area before drawing the Venn diagram
    grid.newpage()
    
    # Display the Venn diagram
    grid.draw(venn_plot)
    
    # Manually create a custom legend
    legend_labels <- names(venn_data)  # Same order as the fill colors in the Venn diagram
    legend_colors <- c("red", "green", "yellow", "orange")
    
    # Positioning the legend lower on the right side of the plot
    legend_x <- unit(0.85, "npc")  # Adjust x position
    legend_y <- unit(0.2, "npc")   # Lower the legend vertically
    
    # Draw the legend
    for (i in seq_along(legend_labels)) {
        grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  width = unit(0.02, "npc"), height = unit(0.02, "npc"), 
                  gp = gpar(fill = legend_colors[i], col = NA))
        grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"), 
                  y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  just = "left", gp = gpar(cex = 0.8))
    }
    
    # Display the merged overlapping Orthogroups table with datatable
    datatable(meta_brock_df, options = list(
        pageLength = 10,
        scrollX = TRUE,
        autoWidth = TRUE,
        searchHighlight = TRUE
    ),
    rownames = FALSE,
    escape = FALSE
    ) %>%
        formatStyle(
            'Species', target = 'cell',
            fontStyle = 'italic'
        ) %>%
        formatStyle(
            columns = names(meta_brock_df), 
            target = 'row',
            color = styleEqual(c("red", "blue", "black"), c("red", "blue", "black")),
            fontWeight = styleEqual(c("bold", "normal"), c("bold", "normal")),
            backgroundColor = styleEqual(c("red", "blue", "black"), c("white", "white", "white"))
        )
}

# Load the upregulated, downregulated, and combined DEG Orthogroup sets for the plastic species
venn_data_plastic_species <- load_deg_data(plastic_species, allspecies_df, filtered_final_orthotable)

# Prepare the data for the Venn diagrams for plastic_species
venn_data_up <- list(
  gregaria = venn_data_plastic_species$up[["gregaria"]],
  piceifrons = venn_data_plastic_species$up[["piceifrons"]],
  cancellata = venn_data_plastic_species$up[["cancellata"]],
  americana = venn_data_plastic_species$up[["americana"]]
)

venn_data_down <- list(
  gregaria = venn_data_plastic_species$down[["gregaria"]],
  piceifrons = venn_data_plastic_species$down[["piceifrons"]],
  cancellata = venn_data_plastic_species$down[["cancellata"]],
  americana = venn_data_plastic_species$down[["americana"]]
)

venn_data_all <- list(
  gregaria = venn_data_plastic_species$all[["gregaria"]],
  piceifrons = venn_data_plastic_species$all[["piceifrons"]],
  cancellata = venn_data_plastic_species$all[["cancellata"]],
  americana = venn_data_plastic_species$all[["americana"]]
)

# Display the Venn diagram and datatable for head upregulated DEGs (plastic_species)
display_venn_with_datatable(venn_data_up, "Venn Diagram of Head Upregulated DEGs - Plastic Species", allspecies_df, filtered_final_orthotable)

Overlapping Orthogroups: 
[1] "Unknown"   "OG0007485" "OG0004381" "OG0000630" "OG0000447"

Version	Author	Date
34c299a	Maeva TECHER	2025-02-06
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

# Display the Venn diagram and datatable for head downregulated DEGs (plastic_species)
display_venn_with_datatable(venn_data_down, "Venn Diagram of Head Downregulated DEGs - Plastic Species", allspecies_df, filtered_final_orthotable)

Overlapping Orthogroups: 
 [1] "Unknown"   "OG0008550" "OG0008546" "OG0004570" "OG0009787" "OG0011171"
 [7] "OG0004972" "OG0003935" "OG0000505" "OG0000273" "OG0000149"

Version	Author	Date
34c299a	Maeva TECHER	2025-02-06
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

# Display the Venn diagram and datatable for all significant DEGs (plastic_species)
display_venn_with_datatable(venn_data_all, "Venn Diagram of All Head DEGs - Plastic Species", allspecies_df, filtered_final_orthotable)

Overlapping Orthogroups: 
 [1] "Unknown"   "OG0007485" "OG0004381" "OG0000630" "OG0008550" "OG0008546"
 [7] "OG0004570" "OG0009787" "OG0000447" "OG0011171" "OG0005490" "OG0004972"
[13] "OG0003935" "OG0000505" "OG0000273" "OG0000149"

Version	Author	Date
34c299a	Maeva TECHER	2025-02-06
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19
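"Unknown" heads each overlap list above. In `load_deg_data()` it is the placeholder written into `Orthogroup` wherever the value is missing, so it can end up in every species' set and survive `Reduce(intersect, ...)` without indicating shared orthology. The sketch below uses invented IDs and hypothetical object names (`toy_sets`, `toy_known`) to show the effect and one possible way to exclude the placeholder before intersecting.

```r
# Toy Orthogroup sets; the IDs are invented for illustration only
toy_sets <- list(
  sp1 = c("Unknown", "OG_A", "OG_B"),
  sp2 = c("Unknown", "OG_A", "OG_C"),
  sp3 = c("OG_A", "Unknown", "OG_D")
)

# Same call as in display_venn_with_datatable(): intersect folded over the list
shared_raw <- Reduce(intersect, toy_sets)

# Drop the placeholder from every set first, so only real Orthogroups can overlap
toy_known <- lapply(toy_sets, setdiff, y = "Unknown")
shared_known <- Reduce(intersect, toy_known)

print(shared_raw)    # includes the placeholder
print(shared_known)  # real shared Orthogroups only
```

Whether to drop the placeholder depends on the question being asked; counting it inflates every Venn region it touches by one.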

# Define the species for plastic_species
plastic_species <- c("gregaria", "piceifrons", "cancellata", "americana")
input_file <- file.path(ortho_dir, "Results_I2/Orthogroups_genesproteinbiotype_Schistocerca_Jan2025.csv")
filtered_final_orthotable <- read.csv(input_file, header = TRUE, stringsAsFactors = FALSE)
# Initialize an empty list to store heatmap data for each species
heatmap_list <- list()

# Loop through each species to process their data
for (species in plastic_species) {
  # Load DESeq2 results for head
  head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_sigresults_Head_", species, ".csv"))
  
  # Load the DESeq2 results
  head_data <- read.csv(head_file, stringsAsFactors = FALSE)
  
  # Check if data is empty and handle accordingly
  if (nrow(head_data) == 0) {
    message(paste("No data for species:", species))
    next  # Skip to the next species if there's no data
  }
  
  # Rename the "gene_id" column in filtered_final_orthotable for consistency
  colnames(filtered_final_orthotable)[colnames(filtered_final_orthotable) == "gene_id"] <- "GeneID"
  
  # Merge with filtered_final_orthotable to include Orthogroup
  merged_data <- merge(head_data, filtered_final_orthotable, by = "GeneID", all.x = TRUE)
  
  # Check if merge was successful
  if (nrow(merged_data) == 0) {
    message(paste("No matching data for species:", species))
    next  # Skip if no matching data after merging
  }

  # Filter for significant DEGs and keep the top 500 upregulated and top 500 downregulated genes
  head_up <- merged_data %>%
    filter(padj < 0.05 & log2FoldChange > 1) %>%
    arrange(desc(log2FoldChange)) %>%
    slice(1:500)
  
  head_down <- merged_data %>%
    filter(padj < 0.05 & log2FoldChange < -1) %>%
    arrange(log2FoldChange) %>%
    slice(1:500)
  
  # Combine data and prepare for heatmap, adding the species column
  heatmap_data <- bind_rows(
    head_up %>% mutate(Tissue = "Head", Regulation = "Upregulated", Species = species),
    head_down %>% mutate(Tissue = "Head", Regulation = "Downregulated", Species = species)
  ) %>%
    select(Orthogroup, log2FoldChange, Tissue, Regulation, Species)
  
  # Append the heatmap data to the list
  heatmap_list[[species]] <- heatmap_data
}

# Combine all species data into a single dataframe for heatmap matrix preparation
final_heatmap_data <- bind_rows(heatmap_list)

# Check if final_heatmap_data is empty before proceeding
if (nrow(final_heatmap_data) == 0) {
    stop("No valid data available for heatmap generation.")
}

# Filter out rows with missing Orthogroup values
final_heatmap_data <- final_heatmap_data %>%
    filter(!is.na(Orthogroup))

# Drop rows with missing log2FoldChange values (optional safeguard)
final_heatmap_data <- final_heatmap_data %>%
    filter(!is.na(log2FoldChange))

# Create heatmap matrix using Orthogroup instead of GeneID
heatmap_matrix <- final_heatmap_data %>%
    group_by(Orthogroup, Species) %>%
    summarize(
        Head_Combined = sum(log2FoldChange[Tissue == "Head"], na.rm = TRUE),
        .groups = 'drop'
    ) %>%
    pivot_wider(names_from = Species, 
                values_from = Head_Combined, 
                values_fill = list(Head_Combined = 0)) %>%
    column_to_rownames("Orthogroup") %>%
    as.matrix()

# Check if heatmap_matrix is empty
if (nrow(heatmap_matrix) == 0) {
    stop("No valid data available for heatmap matrix.")
}

# Define color palettes
# Define a custom color gradient where 0 is black
custom_color_palette1 <- colorRampPalette(c("cyan", "cyan3", "black", "orange3", "orange"))(100)

# Define a custom color gradient where 0 is white
custom_color_palette2 <- colorRampPalette(c("blue3", "blue", "white", "red", "red3"))(100)

# Define color breaks so that the palette midpoint (white or black) sits at 0
max_abs_lfc <- max(abs(heatmap_matrix), na.rm = TRUE)  # Largest absolute value in the matrix
color_breaks <- seq(-max_abs_lfc, max_abs_lfc, length.out = 101)  # Symmetric scale; one more break than the 100 colors

# Create heatmap with the white-centered palette (rows clustered, columns fixed)
pheatmap(
  heatmap_matrix,
  color = custom_color_palette2,
  breaks = color_breaks,
  cluster_rows = TRUE,  # Cluster Orthogroups
  cluster_cols = FALSE,  # Do not cluster species
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Orthologs Expression in Head Tissue - STRATEGY 2"
)

Version	Author	Date
34c299a	Maeva TECHER	2025-02-06
aab712a	Maeva TECHER	2025-02-04
8df3d7c	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

# Create the same heatmap with the black-centered palette
pheatmap(
  heatmap_matrix,
  color = custom_color_palette1,
  breaks = color_breaks,
  cluster_rows = TRUE,  
  cluster_cols = FALSE,  
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Orthologs Expression in Head Tissue - STRATEGY 2"
)

Version	Author	Date
34c299a	Maeva TECHER	2025-02-06
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19
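The heatmap matrix above sums `log2FoldChange` over all genes of an Orthogroup within a species and fills absent combinations with 0. The base-R sketch below reproduces that shape on invented data with `xtabs()`, which also sums per cell and returns 0 for empty cells. It is a stand-in for the `group_by()`/`summarize()`/`pivot_wider()` chain rather than the code used in the analysis, and `toy_degs` and `toy_matrix` are hypothetical names.

```r
# Toy DEG table; Orthogroup IDs, species labels, and fold changes are invented
toy_degs <- data.frame(
  Orthogroup     = c("OG_A", "OG_A", "OG_B", "OG_B", "OG_B", "OG_C"),
  Species        = c("sp1",  "sp1",  "sp1",  "sp2",  "sp2",  "sp2"),
  log2FoldChange = c(2.0,    1.5,    -3.0,   2.5,    -2.5,   -2.0)
)

# Sum log2FoldChange per Orthogroup x Species; empty cells become 0
toy_tab <- xtabs(log2FoldChange ~ Orthogroup + Species, data = toy_degs)

# Convert the contingency table to a plain numeric matrix with dimnames
toy_matrix <- matrix(toy_tab, nrow = nrow(toy_tab), dimnames = dimnames(toy_tab))

print(toy_matrix)
```

In this toy case OG_B in sp2 is 0 because an upregulated and a downregulated gene cancel, while OG_A in sp2 is 0 because no gene was present. The two situations look identical in the matrix, which is worth keeping in mind when reading the heatmaps.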

Thorax tissues

# Define the species for plastic_species
plastic_species <- c("gregaria", "piceifrons", "cancellata", "americana")
input_file <- file.path(ortho_dir, "Results_I2/Orthogroups_genesproteinbiotype_Schistocerca_Jan2025.csv")
filtered_final_orthotable <- read.csv(input_file, header = TRUE, stringsAsFactors = FALSE)


# Function to load DEGs for a given group of species
load_deg_data <- function(plastic_species, allspecies_df, filtered_final_orthotable) {
    degs_up <- list()
    degs_down <- list()
    degs_all <- list()
    
    # Rename the "gene_id" column in filtered_final_orthotable for consistency
    colnames(filtered_final_orthotable)[colnames(filtered_final_orthotable) == "gene_id"] <- "GeneID"
    
    for (species in plastic_species) {
        thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_results_Thorax_", species, ".csv"))
        
        # Check if the file exists
        if (!file.exists(thorax_file)) {
            message(paste("File not found for species:", species))
            next  # Skip this iteration if the file is missing
        }
        
        # Read the data
        thorax_data <- read.csv(thorax_file, stringsAsFactors = FALSE)
        
        # Rename the "X" column to "GeneID"
        colnames(thorax_data)[colnames(thorax_data) == "X"] <- "GeneID"
        
        # Merge DEG data with GeneType and Orthogroup information
        thorax_data_merged <- merge(thorax_data, allspecies_df[, c("GeneID", "GeneType", "Species")], by = "GeneID")
        thorax_data_merged <- merge(thorax_data_merged, filtered_final_orthotable[, c("GeneID", "Orthogroup")], by = "GeneID")
        
        # Handle missing Orthogroups
        thorax_data_merged$Orthogroup[is.na(thorax_data_merged$Orthogroup)] <- "Unknown"
        
        # Filter for significant DEGs (both upregulated and downregulated)
        thorax_up <- thorax_data_merged %>%
            filter(padj < 0.05 & log2FoldChange >= 1) %>%
            select(Orthogroup) %>%
            distinct()
        
        thorax_down <- thorax_data_merged %>%
            filter(padj < 0.05 & log2FoldChange <= -1) %>%
            select(Orthogroup) %>%
            distinct()
        
        all_deg <- thorax_data_merged %>%
            filter(padj < 0.05 & abs(log2FoldChange) >= 1) %>%
            select(Orthogroup) %>%
            distinct()
        
        # Store the DEGs in the list
        degs_up[[species]] <- thorax_up$Orthogroup
        degs_down[[species]] <- thorax_down$Orthogroup
        degs_all[[species]] <- all_deg$Orthogroup
    }
    
    return(list(up = degs_up, down = degs_down, all = degs_all))
}

# Function to display Venn diagram and corresponding datatable based on Orthogroups
display_venn_with_datatable <- function(venn_data, title, allspecies_df, filtered_final_orthotable) {
    
    # Calculate overlapping Orthogroups
    overlap_orthogroups <- Reduce(intersect, venn_data)
    
    # Print overlap info
    cat("Overlapping Orthogroups: \n")
    print(overlap_orthogroups)
    
    # If no overlaps exist, display a message and an empty plot
    if (length(overlap_orthogroups) == 0) {
        message("⚠️ No overlapping Orthogroups found. Displaying an empty Venn diagram.")
        
        # Create an empty Venn diagram placeholder
        plot.new()
        text(0.5, 0.5, "No overlapping Orthogroups found", cex = 1.5, col = "red")
        
        return(NULL)  # Exit the function gracefully
    }
    
    # Create a data frame for the overlapping Orthogroups
    overlap_df <- data.frame(Orthogroup = overlap_orthogroups)
    
    # Merge to get species and other information from filtered_final_orthotable
    meta_brock_df <- merge(overlap_df, filtered_final_orthotable, by = "Orthogroup", all.x = TRUE)
    
    # Ensure merged data exists
    if (nrow(meta_brock_df) == 0) {
        message("⚠️ Merge failed: No matching rows after merging Orthogroups.")
        return(NULL)
    }
      
    # Generate the Venn diagram
    venn_plot <- venn.diagram(
        x = venn_data, 
        category.names = names(venn_data),  # Labels follow the sets actually passed in
        filename = NULL, 
        output = TRUE,
        fill = c("red", "green", "yellow", "orange"),
        alpha = 0.5,
        cex = 1.2,
        cat.cex = 0,
        main = title,
        main.cex = 1.2
    )
    
    # Clear the current plotting area before drawing the Venn diagram
    grid.newpage()
    
    # Display the Venn diagram
    grid.draw(venn_plot)
    
    # Manually create a custom legend
    legend_labels <- names(venn_data)  # Same order as the fill colors in the Venn diagram
    legend_colors <- c("red", "green", "yellow", "orange")
    
    # Positioning the legend lower on the right side of the plot
    legend_x <- unit(0.85, "npc")  # Adjust x position
    legend_y <- unit(0.2, "npc")   # Lower the legend vertically
    
    # Draw the legend
    for (i in seq_along(legend_labels)) {
        grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  width = unit(0.02, "npc"), height = unit(0.02, "npc"), 
                  gp = gpar(fill = legend_colors[i], col = NA))
        grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"), 
                  y = legend_y - unit((i - 1) * 0.05, "npc"), 
                  just = "left", gp = gpar(cex = 0.8))
    }
    
    # Display the merged overlapping Orthogroups table with datatable
    datatable(meta_brock_df, options = list(
        pageLength = 10,
        scrollX = TRUE,
        autoWidth = TRUE,
        searchHighlight = TRUE
    ),
    rownames = FALSE,
    escape = FALSE
    ) %>%
        formatStyle(
            'Species', target = 'cell',
            fontStyle = 'italic'
        ) %>%
        formatStyle(
            columns = names(meta_brock_df), 
            target = 'row',
            color = styleEqual(c("red", "blue", "black"), c("red", "blue", "black")),
            fontWeight = styleEqual(c("bold", "normal"), c("bold", "normal")),
            backgroundColor = styleEqual(c("red", "blue", "black"), c("white", "white", "white"))
        )
}

# Load the upregulated, downregulated, and combined DEG Orthogroup sets for the plastic species
venn_data_plastic_species <- load_deg_data(plastic_species, allspecies_df, filtered_final_orthotable)

# Prepare the data for the Venn diagrams for plastic_species
venn_data_up <- list(
  gregaria = venn_data_plastic_species$up[["gregaria"]],
  piceifrons = venn_data_plastic_species$up[["piceifrons"]],
  cancellata = venn_data_plastic_species$up[["cancellata"]],
  americana = venn_data_plastic_species$up[["americana"]]
)

venn_data_down <- list(
  gregaria = venn_data_plastic_species$down[["gregaria"]],
  piceifrons = venn_data_plastic_species$down[["piceifrons"]],
  cancellata = venn_data_plastic_species$down[["cancellata"]],
  americana = venn_data_plastic_species$down[["americana"]]
)

venn_data_all <- list(
  gregaria = venn_data_plastic_species$all[["gregaria"]],
  piceifrons = venn_data_plastic_species$all[["piceifrons"]],
  cancellata = venn_data_plastic_species$all[["cancellata"]],
  americana = venn_data_plastic_species$all[["americana"]]
)

# Display the Venn diagram and datatable for thorax upregulated DEGs (plastic_species)
display_venn_with_datatable(venn_data_up, "Venn Diagram of Thorax Upregulated DEGs - Plastic Species", allspecies_df, filtered_final_orthotable)

Overlapping Orthogroups: 
 [1] "Unknown"   "OG0012855" "OG0004381" "OG0008773" "OG0013891" "OG0002449"
 [7] "OG0000196" "OG0010743" "OG0011869" "OG0006295" "OG0006293" "OG0005991"
[13] "OG0003684" "OG0003702" "OG0003704" "OG0003705" "OG0006530" "OG0006498"
[19] "OG0006936" "OG0007438"

Version	Author	Date
34c299a	Maeva TECHER	2025-02-06
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

# Display the Venn diagram and datatable for thorax downregulated DEGs (plastic_species)
display_venn_with_datatable(venn_data_down, "Venn Diagram of Thorax Downregulated DEGs - Plastic Species", allspecies_df, filtered_final_orthotable)

Overlapping Orthogroups: 
[1] "Unknown"   "OG0008629" "OG0009787" "OG0010410" "OG0011346" "OG0000270"
[7] "OG0000008" "OG0000505"

Version	Author	Date
34c299a	Maeva TECHER	2025-02-06
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

# Display the Venn diagram and datatable for all significant DEGs (plastic_species)
display_venn_with_datatable(venn_data_all, "Venn Diagram of All Thorax DEGs - Plastic Species", allspecies_df, filtered_final_orthotable)

Overlapping Orthogroups: 
 [1] "Unknown"   "OG0012855" "OG0004381" "OG0012943" "OG0008773" "OG0013891"
 [7] "OG0008629" "OG0002449" "OG0000196" "OG0009787" "OG0010410" "OG0010743"
[13] "OG0011346" "OG0011869" "OG0000396" "OG0000270" "OG0006295" "OG0006293"
[19] "OG0000008" "OG0005991" "OG0003684" "OG0003702" "OG0003704" "OG0003705"
[25] "OG0006530" "OG0006498" "OG0000505" "OG0000504" "OG0000037" "OG0006936"
[31] "OG0007438"

Version	Author	Date
34c299a	Maeva TECHER	2025-02-06
aab712a	Maeva TECHER	2025-02-04
faf2db3	Maeva TECHER	2025-01-13
fe6dae9	Maeva TECHER	2024-11-19

# Define the species for plastic_species
plastic_species <- c("gregaria", "piceifrons", "cancellata", "americana")
input_file <- file.path(ortho_dir, "Results_I2/Orthogroups_genesproteinbiotype_Schistocerca_Jan2025.csv")
filtered_final_orthotable <- read.csv(input_file, header = TRUE, stringsAsFactors = FALSE)
# Initialize an empty list to store heatmap data for each species
heatmap_list <- list()

# Loop through each species to process their data
for (species in plastic_species) {
  # Build the path to the significant DESeq2 results for thorax
  thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_sigresults_Thorax_", species, ".csv"))
  
  # Load the DESeq2 results
  thorax_data <- read.csv(thorax_file, stringsAsFactors = FALSE)
  
  # Check if data is empty and handle accordingly
  if (nrow(thorax_data) == 0) {
    message(paste("No data for species:", species))
    next  # Skip to the next species if there's no data
  }
  
  # Rename the "gene_id" column in filtered_final_orthotable for consistency
  colnames(filtered_final_orthotable)[colnames(filtered_final_orthotable) == "gene_id"] <- "GeneID"
  
  # Merge with filtered_final_orthotable to include Orthogroup
  merged_data <- merge(thorax_data, filtered_final_orthotable, by = "GeneID", all.x = TRUE)
  
  # Check if merge was successful
  if (nrow(merged_data) == 0) {
    message(paste("No matching data for species:", species))
    next  # Skip if no matching data after merging
  }

  # Filter for significant DEGs and select the top 500 upregulated and downregulated thorax genes
  thorax_up <- merged_data %>%
    filter(padj < 0.05 & log2FoldChange > 1) %>%
    arrange(desc(log2FoldChange)) %>%
    slice(1:500)
  
  thorax_down <- merged_data %>%
    filter(padj < 0.05 & log2FoldChange < -1) %>%
    arrange(log2FoldChange) %>%
    slice(1:500)
  
  # Combine data and prepare for heatmap, adding the species column
  heatmap_data <- bind_rows(
    thorax_up %>% mutate(Tissue = "Thorax", Regulation = "Upregulated", Species = species),
    thorax_down %>% mutate(Tissue = "Thorax", Regulation = "Downregulated", Species = species)
  ) %>%
    select(Orthogroup, log2FoldChange, Tissue, Regulation, Species)
  
  # Append the heatmap data to the list
  heatmap_list[[species]] <- heatmap_data
}
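The top-N selection inside the loop above can be illustrated on a toy table. This is a minimal base-R sketch, not the pipeline's code: the helper `top_degs()`, its arguments, and the toy gene IDs and values are hypothetical and only mirror the `padj < 0.05` and `|log2FoldChange| > 1` thresholds used in the loop.

```r
# Toy DESeq2-like table; gene IDs and values are made up for illustration
toy <- data.frame(
  GeneID = paste0("g", 1:6),
  padj = c(0.01, 0.20, 0.03, 0.001, 0.04, 0.02),
  log2FoldChange = c(2.5, 3.0, -1.8, 1.2, -4.0, 0.5),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep the n strongest significant genes in one direction
top_degs <- function(df, direction = c("up", "down"), n = 500,
                     padj_cut = 0.05, lfc_cut = 1) {
  direction <- match.arg(direction)
  # Drop rows with missing statistics, then apply the significance cutoff
  keep <- !is.na(df$padj) & !is.na(df$log2FoldChange) & df$padj < padj_cut
  sig <- df[keep, , drop = FALSE]
  if (direction == "up") {
    sig <- sig[sig$log2FoldChange > lfc_cut, , drop = FALSE]
    sig <- sig[order(-sig$log2FoldChange), , drop = FALSE]  # largest first
  } else {
    sig <- sig[sig$log2FoldChange < -lfc_cut, , drop = FALSE]
    sig <- sig[order(sig$log2FoldChange), , drop = FALSE]   # most negative first
  }
  head(sig, n)  # returns fewer rows when fewer than n genes pass the filters
}

toy_up <- top_degs(toy, "up", n = 2)
toy_down <- top_degs(toy, "down", n = 2)
```

Like `slice(1:500)`, `head(sig, n)` simply returns all available rows when fewer than `n` genes pass the filters, so species with few DEGs contribute fewer rows to the heatmap.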

# Combine all species data into a single dataframe for heatmap matrix preparation
final_heatmap_data <- bind_rows(heatmap_list)

# Check if final_heatmap_data is empty before proceeding
if (nrow(final_heatmap_data) == 0) {
    stop("No valid data available for heatmap generation.")
}

# Filter out rows with missing Orthogroup values
final_heatmap_data <- final_heatmap_data %>%
    filter(!is.na(Orthogroup))

# Drop rows with missing log2FoldChange values (optional safeguard)
final_heatmap_data <- final_heatmap_data %>%
    filter(!is.na(log2FoldChange))

# Create heatmap matrix using Orthogroup instead of GeneID
heatmap_matrix <- final_heatmap_data %>%
    group_by(Orthogroup, Species) %>%
    summarize(
        Thorax_Combined = sum(log2FoldChange[Tissue == "Thorax"], na.rm = TRUE),
        .groups = 'drop'
    ) %>%
    pivot_wider(names_from = Species, 
                values_from = Thorax_Combined, 
                values_fill = list(Thorax_Combined = 0)) %>%
    column_to_rownames("Orthogroup") %>%
    as.matrix()

# Check if heatmap_matrix is empty
if (nrow(heatmap_matrix) == 0) {
    stop("No valid data available for heatmap matrix.")
}

# Define color palettes
# Define a custom color gradient where 0 is black
custom_color_palette1 <- colorRampPalette(c("cyan", "cyan3", "black", "orange3", "orange"))(100)

# Define a custom color gradient where 0 is white
custom_color_palette2 <- colorRampPalette(c("blue3", "blue", "white", "red", "red3"))(100)

# Define color breaks so that the midpoint color (white or black) sits at 0
max_abs_lfc <- max(abs(heatmap_matrix), na.rm = TRUE)  # Get max absolute log2FoldChange
color_breaks <- seq(-max_abs_lfc, max_abs_lfc, length.out = 101)  # Symmetric scale; pheatmap expects one more break than colors

# Create heatmap with clustering
pheatmap(
  heatmap_matrix,
  color = custom_color_palette2,
  breaks = color_breaks,
  cluster_rows = TRUE,  # Cluster orthogroups
  cluster_cols = FALSE,  # Do not cluster species
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Orthologs Expression in Thorax Tissue - STRATEGY 2"
)
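The symmetric color scale used for these heatmaps can be checked in isolation. The sketch below uses only base R and a made-up matrix (`toy_matrix` is hypothetical, standing in for `heatmap_matrix`). It relies on the pheatmap documentation's requirement that `breaks` be one element longer than the color vector, so 100 colors need 101 break points.

```r
# Hypothetical values standing in for heatmap_matrix
toy_matrix <- matrix(c(-3, 0, 1.5, 4, -0.5, 2), nrow = 3,
                     dimnames = list(paste0("OG", 1:3), c("sp1", "sp2")))

n_colors <- 100
toy_palette <- colorRampPalette(c("blue3", "blue", "white", "red", "red3"))(n_colors)

# n_colors + 1 break points define exactly n_colors bins, symmetric around 0
max_abs <- max(abs(toy_matrix), na.rm = TRUE)
toy_breaks <- seq(-max_abs, max_abs, length.out = n_colors + 1)

# Bin index of each value, computed the way a binned color scale would assign it
bin_index <- cut(as.vector(toy_matrix), breaks = toy_breaks,
                 include.lowest = TRUE, labels = FALSE)
```

With this layout the middle break point falls at 0, so negative and positive values are split evenly between the two halves of the palette.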

# Create heatmap without clustering columns
pheatmap(
  heatmap_matrix,
  color = custom_color_palette1,
  breaks = color_breaks,
  cluster_rows = TRUE,  
  cluster_cols = FALSE,  
  show_rownames = FALSE,  
  show_colnames = TRUE,   
  fontsize_row = 6,      
  fontsize_col = 10,     
  main = "Heatmap of Orthologs Expression in Thorax Tissue - STRATEGY 2"
)

All species

Combined tissues

# Load orthogroup mapping
input_file <- file.path(ortho_dir, "Results_I2/Orthogroups_genesproteinbiotype_Schistocerca_Jan2025.csv")
filtered_final_orthotable <- read.csv(input_file, header = TRUE, stringsAsFactors = FALSE)

# Ensure column names are correctly set
if ("gene_id" %in% colnames(filtered_final_orthotable)) {
  colnames(filtered_final_orthotable)[colnames(filtered_final_orthotable) == "gene_id"] <- "GeneID"
}

# Select only relevant columns and ensure uniqueness
filtered_final_orthotable <- filtered_final_orthotable %>%
  select(GeneID, Orthogroup) %>%
  distinct(GeneID, .keep_all = TRUE)  # Ensure one entry per GeneID

# Define species list
allspecies <- c("gregaria", "piceifrons", "cancellata", "americana", "cubense")

# Function to load DEGs for a given set of species and a specific tissue
load_deg_data <- function(species_list, tissue) {
  degs_up <- list()
  degs_down <- list()
  degs_all <- list()
  
  for (species in species_list) {
    # Define the correct file path based on tissue
    deg_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_results_", tissue, "_", species, ".csv"))
    
    # Read DESeq2 results
    deg_data <- read.csv(deg_file, stringsAsFactors = FALSE)
    
    # Ensure 'GeneID' column exists (some DESeq2 outputs use 'X')
    if (!"GeneID" %in% colnames(deg_data)) {
      if ("X" %in% colnames(deg_data)) {
        colnames(deg_data)[colnames(deg_data) == "X"] <- "GeneID"
      } else {
        message(paste("No GeneID column found for", species, "in", tissue, "- Skipping"))
        next
      }
    }
    
    # Convert to character for safe merging
    deg_data$GeneID <- as.character(deg_data$GeneID)
    filtered_final_orthotable$GeneID <- as.character(filtered_final_orthotable$GeneID)

    # Merge with orthogroup information
    deg_data <- left_join(deg_data, filtered_final_orthotable, by = "GeneID") %>%
      mutate(Orthogroup = ifelse(is.na(Orthogroup), "Unassigned", Orthogroup))  # Handle missing orthogroups
    
    # Check if data is empty
    if (nrow(deg_data) == 0) {
      message(paste("No data for species:", species, "in tissue:", tissue))
      next
    }
    
    # Filter for significant DEGs based on `log2FoldChange`
    upregulated <- deg_data %>%
      filter(padj < 0.05 & log2FoldChange >= 1) %>%
      select(Orthogroup) %>%
      distinct()
    
    downregulated <- deg_data %>%
      filter(padj < 0.05 & log2FoldChange <= -1) %>%
      select(Orthogroup) %>%
      distinct()
    
    all_degs <- deg_data %>%
      filter(padj < 0.05 & abs(log2FoldChange) >= 1) %>%
      select(Orthogroup) %>%
      distinct()
    
    # Store the DEGs in the lists
    degs_up[[species]] <- upregulated$Orthogroup
    degs_down[[species]] <- downregulated$Orthogroup
    degs_all[[species]] <- all_degs$Orthogroup
  }
  
  return(list(up = degs_up, down = degs_down, all = degs_all))
}
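The set construction in `load_deg_data()` can be sketched on a toy table to show one consequence of working at the orthogroup level: an orthogroup with several gene copies can appear in both the up and down sets. The helper `deg_sets()` and the toy values are hypothetical, and the sketch uses base R rather than dplyr.

```r
# Toy per-gene results already annotated with orthogroups (made-up values)
toy_deg <- data.frame(
  Orthogroup = c("OG1", "OG1", "OG2", "OG3", "Unassigned", "OG4"),
  padj = c(0.01, 0.04, 0.03, 0.50, 0.001, NA),
  log2FoldChange = c(1.0, -2.0, 3.1, 5.0, -1.5, 2.0),
  stringsAsFactors = FALSE
)

# Hypothetical helper mirroring the >= 1 / <= -1 thresholds used for the Venn sets
deg_sets <- function(df, padj_cut = 0.05, lfc_cut = 1) {
  sig <- !is.na(df$padj) & df$padj < padj_cut   # NA padj is treated as not significant
  up <- sig & df$log2FoldChange >= lfc_cut
  down <- sig & df$log2FoldChange <= -lfc_cut
  list(
    up = unique(df$Orthogroup[up]),
    down = unique(df$Orthogroup[down]),
    all = unique(df$Orthogroup[up | down])
  )
}

toy_sets <- deg_sets(toy_deg)
both_directions <- intersect(toy_sets$up, toy_sets$down)
```

Here `OG1` has one upregulated and one downregulated member, so it lands in both sets. Whether that is desirable depends on the biological question, so it may be worth checking before interpreting the overlaps.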

# Load DEG data for Head
venn_data_allspecies_head <- load_deg_data(allspecies, "Head")

# Load DEG data for Thorax
venn_data_allspecies_thorax <- load_deg_data(allspecies, "Thorax")

# Function to generate Venn diagrams with Orthogroups
display_venn_with_datatable <- function(venn_data, title) {
  # Calculate orthogroups shared by all species
  overlap_orthogroups <- Reduce(intersect, venn_data)
  
  # Create a dataframe for overlapping orthogroups
  overlap_df <- data.frame(Orthogroup = overlap_orthogroups)
  
  # Generate the Venn diagram
  venn_plot <- venn.diagram(
    x = venn_data, 
    category.names = allspecies,
    filename = NULL, 
    output = TRUE, 
    fill = c("orange", "red", "orchid", "green", "yellow"),
    alpha = 0.5, 
    cex = 1.2, 
    cat.cex = 0, 
    main = title,
    main.cex = 1.2
  )

  # Clear plotting area and display Venn diagram
  grid.newpage()
  grid.draw(venn_plot)

  # Manually create a custom legend
  legend_labels <- allspecies
  legend_colors <- c("orange", "red", "orchid", "green", "yellow")

  # Position legend
  legend_x <- unit(0.85, "npc")  
  legend_y <- unit(0.2, "npc")

  for (i in 1:length(legend_labels)) {
    grid.rect(x = legend_x, y = legend_y - unit((i - 1) * 0.05, "npc"), 
              width = unit(0.02, "npc"), height = unit(0.02, "npc"), 
              gp = gpar(fill = legend_colors[i], col = NA))
    grid.text(label = legend_labels[i], x = legend_x + unit(0.05, "npc"), 
              y = legend_y - unit((i - 1) * 0.05, "npc"), 
              just = "left", gp = gpar(cex = 0.8))
  }

  # Display the overlapping Orthogroups as a datatable
  datatable(overlap_df, options = list(
      pageLength = 10,
      scrollX = TRUE,
      autoWidth = TRUE,
      searchHighlight = TRUE
  ), 
  rownames = FALSE)
}
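Because genes without an orthogroup are relabeled with a placeholder before the sets are built, `Reduce(intersect, ...)` treats that placeholder as an ordinary shared member. The sketch below uses made-up set contents and hypothetical names (`toy_venn`, `OG_A`, and so on) to show the effect and one possible way to drop the placeholder first.

```r
# Made-up orthogroup sets for three species; "Unassigned" is the placeholder label
toy_venn <- list(
  gregaria   = c("OG_A", "OG_B", "Unassigned"),
  piceifrons = c("OG_A", "Unassigned", "OG_C"),
  cancellata = c("Unassigned", "OG_A")
)

# Intersection across all species, as computed in display_venn_with_datatable()
shared_all <- Reduce(intersect, toy_venn)

# Option 1: drop the placeholder from the result
shared_assigned <- setdiff(shared_all, "Unassigned")

# Option 2: drop it from every set before plotting, so the Venn counts exclude it too
toy_venn_clean <- lapply(toy_venn, setdiff, y = "Unassigned")
shared_clean <- Reduce(intersect, toy_venn_clean)
```

Option 2 also changes the region counts drawn in the diagram, since every species then loses one member. Which behavior is wanted is an analysis decision rather than something this sketch can settle.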

# Display Venn diagrams and tables for HEAD
display_venn_with_datatable(venn_data_allspecies_head$up, "Venn Diagram of Upregulated Orthogroups - Head")

display_venn_with_datatable(venn_data_allspecies_head$down, "Venn Diagram of Downregulated Orthogroups - Head")

display_venn_with_datatable(venn_data_allspecies_head$all, "Venn Diagram of All Significant Orthogroups - Head")

# Display Venn diagrams and tables for THORAX
display_venn_with_datatable(venn_data_allspecies_thorax$up, "Venn Diagram of Upregulated Orthogroups - Thorax")

display_venn_with_datatable(venn_data_allspecies_thorax$down, "Venn Diagram of Downregulated Orthogroups - Thorax")

display_venn_with_datatable(venn_data_allspecies_thorax$all, "Venn Diagram of All Significant Orthogroups - Thorax")

# Load Orthogroup information
input_file <- file.path(ortho_dir, "Results_I2/Orthogroups_genesproteinbiotype_Schistocerca_Jan2025.csv")
filtered_final_orthotable <- read.csv(input_file, header = TRUE, stringsAsFactors = FALSE)

# Ensure correct column names
if ("gene_id" %in% colnames(filtered_final_orthotable)) {
  colnames(filtered_final_orthotable)[colnames(filtered_final_orthotable) == "gene_id"] <- "GeneID"
}

# Select relevant columns and ensure uniqueness
filtered_final_orthotable <- filtered_final_orthotable %>%
  select(GeneID, Orthogroup) %>%
  distinct(GeneID, .keep_all = TRUE)  # Keep unique mapping

# Define species order explicitly
species_order <- c("nitens", "cubense", "americana", "piceifrons", "cancellata", "gregaria")

# Initialize an empty list to store heatmap data
heatmap_list <- list()

# Loop through each species to process their data
for (species in species_order) {
  message(paste("Processing species:", species))

  # Define file paths
  head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_sigresults_Head_", species, ".csv"))
  thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_sigresults_Thorax_", species, ".csv"))

  # Check if files exist before loading
  if (!file.exists(head_file)) {
    message(paste("Missing Head file for:", species, "- Assigning empty dataset"))
    head_data <- data.frame(GeneID = character(), padj = numeric(), log2FoldChange = numeric(), stringsAsFactors = FALSE)
  } else {
    head_data <- tryCatch(read.csv(head_file, stringsAsFactors = FALSE), error = function(e) data.frame())
  }

  if (!file.exists(thorax_file)) {
    message(paste("Missing Thorax file for:", species, "- Assigning empty dataset"))
    thorax_data <- data.frame(GeneID = character(), padj = numeric(), log2FoldChange = numeric(), stringsAsFactors = FALSE)
  } else {
    thorax_data <- tryCatch(read.csv(thorax_file, stringsAsFactors = FALSE), error = function(e) data.frame())
  }

  # Ensure GeneID column exists
  if (!"GeneID" %in% colnames(head_data) && "X" %in% colnames(head_data)) {
    colnames(head_data)[colnames(head_data) == "X"] <- "GeneID"
  }
  if (!"GeneID" %in% colnames(thorax_data) && "X" %in% colnames(thorax_data)) {
    colnames(thorax_data)[colnames(thorax_data) == "X"] <- "GeneID"
  }

  # Convert GeneID to character
  head_data$GeneID <- as.character(head_data$GeneID)
  thorax_data$GeneID <- as.character(thorax_data$GeneID)
  filtered_final_orthotable$GeneID <- as.character(filtered_final_orthotable$GeneID)

  # Ensure species is not skipped if one dataset is empty
  if (nrow(head_data) == 0 && nrow(thorax_data) == 0) {
    message(paste("No data for species:", species, "- Skipping"))
    next
  }

  # If thorax data is missing, assign zero values
  if (nrow(thorax_data) == 0) {
    message(paste("No Thorax data for:", species, "- Assigning 0 values"))
    thorax_data <- data.frame(GeneID = head_data$GeneID, padj = 1, log2FoldChange = 0, stringsAsFactors = FALSE)
  }

  # Merge with orthogroup information
  head_data <- left_join(head_data, filtered_final_orthotable, by = "GeneID") %>%
    mutate(Orthogroup = ifelse(is.na(Orthogroup), "Unassigned", Orthogroup))

  thorax_data <- left_join(thorax_data, filtered_final_orthotable, by = "GeneID") %>%
    mutate(Orthogroup = ifelse(is.na(Orthogroup), "Unassigned", Orthogroup))

  # Filter for significant DEGs and select top 500 upregulated and downregulated genes per tissue
  head_up <- head_data %>%
    filter(padj < 0.05 & log2FoldChange > 1) %>%
    arrange(desc(log2FoldChange)) %>%
    slice(1:500)

  head_down <- head_data %>%
    filter(padj < 0.05 & log2FoldChange < -1) %>%
    arrange(log2FoldChange) %>%
    slice(1:500)

  thorax_up <- thorax_data %>%
    filter(padj < 0.05 & log2FoldChange > 1) %>%
    arrange(desc(log2FoldChange)) %>%
    slice(1:500)

  thorax_down <- thorax_data %>%
    filter(padj < 0.05 & log2FoldChange < -1) %>%
    arrange(log2FoldChange) %>%
    slice(1:500)

  # Combine data and prepare for heatmap
  heatmap_data <- bind_rows(
    head_up %>% mutate(Tissue = "Head", Regulation = "Upregulated", Species = species),
    head_down %>% mutate(Tissue = "Head", Regulation = "Downregulated", Species = species),
    thorax_up %>% mutate(Tissue = "Thorax", Regulation = "Upregulated", Species = species),
    thorax_down %>% mutate(Tissue = "Thorax", Regulation = "Downregulated", Species = species)
  ) %>%
    select(Orthogroup, log2FoldChange, Tissue, Regulation, Species)

  # Append to heatmap list
  heatmap_list[[species]] <- heatmap_data
}

# Combine all species data
final_heatmap_data <- bind_rows(heatmap_list)

# Ensure species order in the data
final_heatmap_data$Species <- factor(final_heatmap_data$Species, levels = species_order)

# Create heatmap matrix (Head and Thorax combined)
heatmap_matrix <- final_heatmap_data %>%
    group_by(Orthogroup, Species) %>%  # Remove Tissue to ensure unique Orthogroup rows
    summarize(log2FoldChange = mean(log2FoldChange, na.rm = TRUE), .groups = "drop") %>%
    pivot_wider(names_from = Species, values_from = log2FoldChange, values_fill = 0) %>%
    distinct(Orthogroup, .keep_all = TRUE) %>%  # Ensure unique Orthogroup rows
    column_to_rownames("Orthogroup") %>%
    as.matrix()

# Explicitly reorder the columns in heatmap_matrix
heatmap_matrix <- heatmap_matrix[, species_order, drop = FALSE]  # Ensure order is applied

# Define color palettes
custom_cyan_orange_palette <- colorRampPalette(c("cyan", "cyan2", "cyan3", "black", "orange3", "orange2", "orange"))(100)
custom_blue_red_palette <- colorRampPalette(c("blue3", "blue2", "blue1", "white", "red", "red2", "red3"))(100)

# Define color breaks
max_abs_lfc <- max(abs(heatmap_matrix), na.rm = TRUE)
color_breaks <- seq(-max_abs_lfc, max_abs_lfc, length.out = 101)  # One more break than colors

# Generate heatmaps
pheatmap(
  heatmap_matrix,
  color = custom_blue_red_palette,
  breaks = color_breaks,
  cluster_rows = TRUE,
  cluster_cols = FALSE,
  show_rownames = FALSE,
  show_colnames = TRUE,
  fontsize_row = 6,
  fontsize_col = 10,
  main = "Heatmap of Orthologs Expression in Head and Thorax Tissue - STRATEGY 2"
)
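The orthogroup-by-species matrix behind this heatmap averages every retained gene of an orthogroup within a species (across both tissues here) and fills absent combinations with 0. A minimal base-R sketch of that aggregation follows. The toy values and names (`toy_long`, `sp1` to `sp3`) are hypothetical, and `tapply()` is used in place of the `group_by()` / `summarize()` / `pivot_wider()` chain.

```r
# Made-up long-format table: one row per retained gene
toy_long <- data.frame(
  Orthogroup = c("OG_A", "OG_A", "OG_A", "OG_B", "OG_C"),
  Species = c("sp1", "sp1", "sp2", "sp2", "sp1"),
  log2FoldChange = c(2, 4, -1, 3, -2),
  stringsAsFactors = FALSE
)

# Factor levels fix the column order and keep species without any rows (sp3)
toy_long$Species <- factor(toy_long$Species, levels = c("sp1", "sp2", "sp3"))

# Mean log2FoldChange per orthogroup and species; empty cells come back as NA
toy_wide <- tapply(toy_long$log2FoldChange,
                   list(toy_long$Orthogroup, toy_long$Species),
                   mean)

# Fill absent combinations with 0, as values_fill = 0 does in pivot_wider()
toy_wide[is.na(toy_wide)] <- 0
```

Note that a 0 in the resulting matrix does not distinguish "no significant change" from "no ortholog retained for this species", which may matter when reading the clustered rows.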

pheatmap(
  heatmap_matrix,
  color = custom_cyan_orange_palette,
  breaks = color_breaks,
  cluster_rows = TRUE,
  cluster_cols = FALSE,
  show_rownames = FALSE,
  show_colnames = TRUE,
  fontsize_row = 6,
  fontsize_col = 10,
  main = "Heatmap of Orthologs Expression in Head and Thorax Tissue - STRATEGY 2"
)

Head tissues

# Load orthogroup mapping
input_file <- file.path(ortho_dir, "Results_I2/Orthogroups_genesproteinbiotype_Schistocerca_Jan2025.csv")
filtered_final_orthotable <- read.csv(input_file, header = TRUE, stringsAsFactors = FALSE)

# Ensure column names are correctly set
if ("gene_id" %in% colnames(filtered_final_orthotable)) {
  colnames(filtered_final_orthotable)[colnames(filtered_final_orthotable) == "gene_id"] <- "GeneID"
}

# Select only relevant columns and ensure uniqueness
filtered_final_orthotable <- filtered_final_orthotable %>%
  select(GeneID, Orthogroup) %>%
  distinct(GeneID, .keep_all = TRUE)  # Ensure one entry per GeneID

# Define species list
allspecies <- c("gregaria", "piceifrons", "cancellata", "americana", "cubense")

# Function to load Head DEGs for a given set of species
load_deg_data <- function(species_list) {
  degs_up <- list()
  degs_down <- list()
  degs_all <- list()
  
  for (species in species_list) {
    # Define the correct file path for Head
    deg_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_results_Head_", species, ".csv"))
    
    # Read DESeq2 results
    deg_data <- read.csv(deg_file, stringsAsFactors = FALSE)
    
    # Ensure 'GeneID' column exists (some DESeq2 outputs use 'X')
    if (!"GeneID" %in% colnames(deg_data)) {
      if ("X" %in% colnames(deg_data)) {
        colnames(deg_data)[colnames(deg_data) == "X"] <- "GeneID"
      } else {
        message(paste("No GeneID column found for", species, "in Head - Skipping"))
        next
      }
    }
    
    # Convert to character for safe merging
    deg_data$GeneID <- as.character(deg_data$GeneID)
    filtered_final_orthotable$GeneID <- as.character(filtered_final_orthotable$GeneID)

    # Merge with orthogroup information
    deg_data <- left_join(deg_data, filtered_final_orthotable, by = "GeneID") %>%
      mutate(Orthogroup = ifelse(is.na(Orthogroup), "Unassigned", Orthogroup))  # Handle missing orthogroups
    
    # Check if data is empty
    if (nrow(deg_data) == 0) {
      message(paste("No data for species:", species, "in Head"))
      next
    }
    
    # Filter for significant DEGs based on `log2FoldChange`
    upregulated <- deg_data %>%
      filter(padj < 0.05 & log2FoldChange >= 1) %>%
      select(Orthogroup) %>%
      distinct()
    
    downregulated <- deg_data %>%
      filter(padj < 0.05 & log2FoldChange <= -1) %>%
      select(Orthogroup) %>%
      distinct()
    
    all_degs <- deg_data %>%
      filter(padj < 0.05 & abs(log2FoldChange) >= 1) %>%
      select(Orthogroup) %>%
      distinct()
    
    # Store the DEGs in the lists
    degs_up[[species]] <- upregulated$Orthogroup
    degs_down[[species]] <- downregulated$Orthogroup
    degs_all[[species]] <- all_degs$Orthogroup
  }
  
  return(list(up = degs_up, down = degs_down, all = degs_all))
}
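The gene-to-orthogroup annotation step inside this function (deduplicate the mapping, coerce IDs to character, join, relabel missing matches) can be sketched with base R. The toy IDs and the names `toy_map` and `toy_deg` are hypothetical, and `match()` stands in for `left_join()`.

```r
# Made-up mapping table with one duplicated GeneID
toy_map <- data.frame(
  GeneID = c("101", "102", "102", "103"),
  Orthogroup = c("OG_A", "OG_B", "OG_C", "OG_A"),
  stringsAsFactors = FALSE
)

# Keep the first mapping per GeneID, like distinct(GeneID, .keep_all = TRUE)
toy_map <- toy_map[!duplicated(toy_map$GeneID), , drop = FALSE]

# Made-up DEG table whose IDs were read as numbers
toy_deg <- data.frame(GeneID = c(101, 104, 102),
                      log2FoldChange = c(1.5, -2, 0.3))

# Coerce to character so both key columns have the same type before matching
toy_deg$GeneID <- as.character(toy_deg$GeneID)

# Look up each gene's orthogroup; genes absent from the mapping get NA
toy_deg$Orthogroup <- toy_map$Orthogroup[match(toy_deg$GeneID, toy_map$GeneID)]

# Relabel unmatched genes with the placeholder used in this analysis
toy_deg$Orthogroup[is.na(toy_deg$Orthogroup)] <- "Unassigned"
```

Keeping only the first mapping per gene silently discards any additional orthogroup assignments, so it can be worth counting duplicated `GeneID` values in the real table before relying on this step.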

# Load DEG data for Head only
venn_data_allspecies_head <- load_deg_data(allspecies)

# Function to generate Venn diagrams with Orthogroups (ONLY HEAD)
display_venn_with_datatable <- function(venn_data, title) {
  # Calculate orthogroups shared by all species
  overlap_orthogroups <- Reduce(intersect, venn_data)
  
  # Create a dataframe for overlapping orthogroups
  overlap_df <- data.frame(Orthogroup = overlap_orthogroups)
  
  # Generate the Venn diagram
  venn_plot <- venn.diagram(
    x = venn_data, 
    category.names = allspecies,
    filename = NULL, 
    output = TRUE, 
    fill = c("orange", "red", "orchid", "green", "yellow"),
    alpha = 0.5, 
    cex = 1.2, 
    cat.cex = 0, 
    main = title,
    main.cex = 1.2
  )

  # Clear plotting area and display Venn diagram
  grid.newpage()
  grid.draw(venn_plot)

  # Display overlapping Orthogroups as a datatable
  datatable(overlap_df, options = list(
      pageLength = 10,
      scrollX = TRUE,
      autoWidth = TRUE,
      searchHighlight = TRUE
  ), rownames = FALSE)
}

# Display Venn diagrams and tables for HEAD only
display_venn_with_datatable(venn_data_allspecies_head$up, "Venn Diagram of Upregulated Orthogroups - Head")

display_venn_with_datatable(venn_data_allspecies_head$down, "Venn Diagram of Downregulated Orthogroups - Head")

display_venn_with_datatable(venn_data_allspecies_head$all, "Venn Diagram of All Significant Orthogroups - Head")

# Load Orthogroup information
input_file <- file.path(ortho_dir, "Results_I2/Orthogroups_genesproteinbiotype_Schistocerca_Jan2025.csv")
filtered_final_orthotable <- read.csv(input_file, header = TRUE, stringsAsFactors = FALSE)

# Ensure correct column names
if ("gene_id" %in% colnames(filtered_final_orthotable)) {
  colnames(filtered_final_orthotable)[colnames(filtered_final_orthotable) == "gene_id"] <- "GeneID"
}

# Select relevant columns and ensure uniqueness
filtered_final_orthotable <- filtered_final_orthotable %>%
  select(GeneID, Orthogroup) %>%
  distinct(GeneID, .keep_all = TRUE)  # Keep unique mapping

# Define species order explicitly
species_order <- c("nitens", "cubense", "americana", "piceifrons", "cancellata", "gregaria")

# Initialize an empty list to store heatmap data
heatmap_list <- list()

# Loop through each species to process their Head data
for (species in species_order) {
  message(paste("Processing species:", species))

  # Define file path for Head
  head_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_sigresults_Head_", species, ".csv"))

  # Check if file exists before loading
  if (!file.exists(head_file)) {
    message(paste("Missing Head file for:", species, "- Assigning empty dataset"))
    head_data <- data.frame(GeneID = character(), padj = numeric(), log2FoldChange = numeric(), stringsAsFactors = FALSE)
  } else {
    head_data <- tryCatch(read.csv(head_file, stringsAsFactors = FALSE), error = function(e) data.frame())
  }

  # Ensure GeneID column exists
  if (!"GeneID" %in% colnames(head_data) && "X" %in% colnames(head_data)) {
    colnames(head_data)[colnames(head_data) == "X"] <- "GeneID"
  }

  # Convert GeneID to character
  head_data$GeneID <- as.character(head_data$GeneID)
  filtered_final_orthotable$GeneID <- as.character(filtered_final_orthotable$GeneID)

  # Merge with orthogroup information
  head_data <- left_join(head_data, filtered_final_orthotable, by = "GeneID") %>%
    mutate(Orthogroup = ifelse(is.na(Orthogroup), "Unassigned", Orthogroup))

  # Filter for significant DEGs and select top 500 upregulated and downregulated genes
  head_up <- head_data %>%
    filter(padj < 0.05 & log2FoldChange > 1) %>%
    arrange(desc(log2FoldChange)) %>%
    slice(1:500)

  head_down <- head_data %>%
    filter(padj < 0.05 & log2FoldChange < -1) %>%
    arrange(log2FoldChange) %>%
    slice(1:500)

  # Combine data and prepare for heatmap
  heatmap_data <- bind_rows(
    head_up %>% mutate(Tissue = "Head", Regulation = "Upregulated", Species = species),
    head_down %>% mutate(Tissue = "Head", Regulation = "Downregulated", Species = species)
  ) %>%
    select(Orthogroup, log2FoldChange, Tissue, Regulation, Species)

  # Append to heatmap list
  heatmap_list[[species]] <- heatmap_data
}

# Combine all species data
final_heatmap_data <- bind_rows(heatmap_list)

# Ensure all species are represented, even if they have no significant DEGs
for (species in species_order) {
    if (!species %in% unique(final_heatmap_data$Species)) {
        message(paste("Adding placeholder for missing species:", species))
        final_heatmap_data <- bind_rows(
            final_heatmap_data,
            data.frame(
                Orthogroup = "Unassigned",  # Placeholder Orthogroup
                log2FoldChange = 0,
                Tissue = "Head",
                Regulation = "None",
                Species = species
            )
        )
    }
}

# Ensure species order in the data
final_heatmap_data$Species <- factor(final_heatmap_data$Species, levels = species_order)

# Create heatmap matrix (Head only)
heatmap_matrix <- final_heatmap_data %>%
  group_by(Orthogroup, Species) %>% 
  summarize(log2FoldChange = mean(log2FoldChange, na.rm = TRUE), .groups = "drop") %>%
  pivot_wider(names_from = Species, values_from = log2FoldChange, values_fill = 0) %>%
  column_to_rownames("Orthogroup") %>%
  as.matrix()

# Explicitly reorder the columns in heatmap_matrix
heatmap_matrix <- heatmap_matrix[, species_order, drop = FALSE]  # Ensure order is applied

# Define color palettes
custom_cyan_orange_palette <- colorRampPalette(c("cyan", "cyan2", "cyan3", "black", "orange3", "orange2", "orange"))(100)
custom_blue_red_palette <- colorRampPalette(c("blue3", "blue2", "blue1", "white", "red", "red2", "red3"))(100)

# Define color breaks
max_abs_lfc <- max(abs(heatmap_matrix), na.rm = TRUE)
color_breaks <- seq(-max_abs_lfc, max_abs_lfc, length.out = 101)  # One more break than colors

# Generate heatmaps (Only Head)
pheatmap(
  heatmap_matrix,
  color = custom_blue_red_palette,
  breaks = color_breaks,
  cluster_rows = TRUE,
  cluster_cols = FALSE,
  show_rownames = FALSE,
  show_colnames = TRUE,
  fontsize_row = 6,
  fontsize_col = 10,
  main = "Heatmap of Orthologs Expression in Head Tissue - STRATEGY 2"
)
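Reordering columns with `heatmap_matrix[, species_order]` fails with a subscript error if any species has no column, which is why a placeholder row is added for species without significant DEGs. An alternative, sketched here under the assumption that zero-filled columns are acceptable, is to pad the matrix itself. The helper `pad_species()` and the toy names are hypothetical.

```r
# Made-up matrix that lacks a column for "sp3"
toy_mat <- matrix(c(1, -2, 0.5, 3), nrow = 2,
                  dimnames = list(c("OG_A", "OG_B"), c("sp1", "sp2")))
toy_order <- c("sp3", "sp1", "sp2")

# Hypothetical helper: add zero columns for absent species, then apply the order
pad_species <- function(mat, species) {
  missing <- setdiff(species, colnames(mat))
  if (length(missing) > 0) {
    filler <- matrix(0, nrow = nrow(mat), ncol = length(missing),
                     dimnames = list(rownames(mat), missing))
    mat <- cbind(mat, filler)
  }
  mat[, species, drop = FALSE]
}

toy_padded <- pad_species(toy_mat, toy_order)
```

Unlike the placeholder-row approach, this does not add an extra "Unassigned" row to the matrix. The species still appears as an all-zero column in the heatmap.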

pheatmap(
  heatmap_matrix,
  color = custom_cyan_orange_palette,
  breaks = color_breaks,
  cluster_rows = TRUE,
  cluster_cols = FALSE,
  show_rownames = FALSE,
  show_colnames = TRUE,
  fontsize_row = 6,
  fontsize_col = 10,
  main = "Heatmap of Orthologs Expression in Head Tissue - STRATEGY 2"
)

Thorax tissues

# Load orthogroup mapping
input_file <- file.path(ortho_dir, "Results_I2/Orthogroups_genesproteinbiotype_Schistocerca_Jan2025.csv")
filtered_final_orthotable <- read.csv(input_file, header = TRUE, stringsAsFactors = FALSE)

# Ensure column names are correctly set
if ("gene_id" %in% colnames(filtered_final_orthotable)) {
  colnames(filtered_final_orthotable)[colnames(filtered_final_orthotable) == "gene_id"] <- "GeneID"
}

# Select only relevant columns and ensure uniqueness
filtered_final_orthotable <- filtered_final_orthotable %>%
  select(GeneID, Orthogroup) %>%
  distinct(GeneID, .keep_all = TRUE)  # Ensure one entry per GeneID

# Define species list
allspecies <- c("gregaria", "piceifrons", "cancellata", "americana", "cubense")

# Function to load Thorax DEGs for a given set of species
load_deg_data <- function(species_list) {
  degs_up <- list()
  degs_down <- list()
  degs_all <- list()
  
  for (species in species_list) {
    # Define the correct file path for thorax
    deg_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_results_Thorax_", species, ".csv"))
    
    # Read DESeq2 results
    deg_data <- read.csv(deg_file, stringsAsFactors = FALSE)
    
    # Ensure 'GeneID' column exists (some DESeq2 outputs use 'X')
    if (!"GeneID" %in% colnames(deg_data)) {
      if ("X" %in% colnames(deg_data)) {
        colnames(deg_data)[colnames(deg_data) == "X"] <- "GeneID"
      } else {
        message(paste("No GeneID column found for", species, "in Thorax - Skipping"))
        next
      }
    }
    
    # Convert to character for safe merging
    deg_data$GeneID <- as.character(deg_data$GeneID)
    filtered_final_orthotable$GeneID <- as.character(filtered_final_orthotable$GeneID)

    # Merge with orthogroup information
    deg_data <- left_join(deg_data, filtered_final_orthotable, by = "GeneID") %>%
      mutate(Orthogroup = ifelse(is.na(Orthogroup), "Unassigned", Orthogroup))  # Handle missing orthogroups
    
    # Check if data is empty
    if (nrow(deg_data) == 0) {
      message(paste("No data for species:", species, "in Thorax"))
      next
    }
    
    # Filter for significant DEGs based on `log2FoldChange`
    upregulated <- deg_data %>%
      filter(padj < 0.05 & log2FoldChange >= 1) %>%
      select(Orthogroup) %>%
      distinct()
    
    downregulated <- deg_data %>%
      filter(padj < 0.05 & log2FoldChange <= -1) %>%
      select(Orthogroup) %>%
      distinct()
    
    all_degs <- deg_data %>%
      filter(padj < 0.05 & abs(log2FoldChange) >= 1) %>%
      select(Orthogroup) %>%
      distinct()
    
    # Store the DEGs in the lists
    degs_up[[species]] <- upregulated$Orthogroup
    degs_down[[species]] <- downregulated$Orthogroup
    degs_all[[species]] <- all_degs$Orthogroup
  }
  
  return(list(up = degs_up, down = degs_down, all = degs_all))
}

# Load DEG data for Thorax only
venn_data_allspecies_thorax <- load_deg_data(allspecies)

# Function to generate Venn diagrams with Orthogroups (ONLY thorax)
display_venn_with_datatable <- function(venn_data, title) {
  # Calculate orthogroups shared by all species
  overlap_orthogroups <- Reduce(intersect, venn_data)
  
  # Create a dataframe for overlapping orthogroups
  overlap_df <- data.frame(Orthogroup = overlap_orthogroups)
  
  # Generate the Venn diagram
  venn_plot <- venn.diagram(
    x = venn_data, 
    category.names = allspecies,
    filename = NULL, 
    output = TRUE, 
    fill = c("orange", "red", "orchid", "green", "yellow"),
    alpha = 0.5, 
    cex = 1.2, 
    cat.cex = 0, 
    main = title,
    main.cex = 1.2
  )

  # Clear plotting area and display Venn diagram
  grid.newpage()
  grid.draw(venn_plot)

  # Display overlapping Orthogroups as a datatable
  datatable(overlap_df, options = list(
      pageLength = 10,
      scrollX = TRUE,
      autoWidth = TRUE,
      searchHighlight = TRUE
  ), rownames = FALSE)
}

# Display Venn diagrams and tables for thorax only
display_venn_with_datatable(venn_data_allspecies_thorax$up, "Venn Diagram of Upregulated Orthogroups - Thorax")

display_venn_with_datatable(venn_data_allspecies_thorax$down, "Venn Diagram of Downregulated Orthogroups - Thorax")

display_venn_with_datatable(venn_data_allspecies_thorax$all, "Venn Diagram of All Significant Orthogroups - Thorax")

# Load Orthogroup information
input_file <- file.path(ortho_dir, "Results_I2/Orthogroups_genesproteinbiotype_Schistocerca_Jan2025.csv")
filtered_final_orthotable <- read.csv(input_file, header = TRUE, stringsAsFactors = FALSE)

# Ensure correct column names
if ("gene_id" %in% colnames(filtered_final_orthotable)) {
  colnames(filtered_final_orthotable)[colnames(filtered_final_orthotable) == "gene_id"] <- "GeneID"
}

# Select relevant columns and ensure uniqueness
filtered_final_orthotable <- filtered_final_orthotable %>%
  select(GeneID, Orthogroup) %>%
  distinct(GeneID, .keep_all = TRUE)  # Keep unique mapping

# Define species order explicitly
species_order <- c("nitens", "cubense", "americana", "piceifrons", "cancellata", "gregaria")

# Initialize an empty list to store heatmap data
heatmap_list <- list()

# Loop through each species to process their Thorax data
for (species in species_order) {
  message(paste("Processing species:", species))

  # Define file path for Thorax
  thorax_file <- file.path(workDir, "DEG_results/Bulk_RNAseq", paste0("DESeq2_sigresults_Thorax_", species, ".csv"))

  # Check if file exists before loading
  if (!file.exists(thorax_file)) {
    message(paste("Missing Thorax file for:", species, "- Assigning empty dataset"))
    thorax_data <- data.frame(GeneID = character(), padj = numeric(), log2FoldChange = numeric(), stringsAsFactors = FALSE)
  } else {
    thorax_data <- tryCatch(read.csv(thorax_file, stringsAsFactors = FALSE), error = function(e) data.frame())
  }

  # Ensure GeneID column exists
  if (!"GeneID" %in% colnames(thorax_data) && "X" %in% colnames(thorax_data)) {
    colnames(thorax_data)[colnames(thorax_data) == "X"] <- "GeneID"
  }

  # Convert GeneID to character
  thorax_data$GeneID <- as.character(thorax_data$GeneID)
  filtered_final_orthotable$GeneID <- as.character(filtered_final_orthotable$GeneID)

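  # Merge with orthogroup information.
  # Genes without an orthogroup are pooled under the single label "Unassigned",
  # so they end up averaged into one heatmap row downstream.
  thorax_data <- left_join(thorax_data, filtered_final_orthotable, by = "GeneID") %>%
    mutate(Orthogroup = ifelse(is.na(Orthogroup), "Unassigned", Orthogroup))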

  # If no significant DEGs are found, ensure the structure is correct
  if (nrow(thorax_data) == 0) {
    message(paste("No significant Thorax DEGs for:", species, "- Assigning placeholder values"))
    thorax_data <- data.frame(
      Orthogroup = character(),
      log2FoldChange = numeric(),
      Tissue = character(),
      Regulation = character(),
      Species = character()
    )
  } else {
    # Filter for significant DEGs and keep up to 500 upregulated and 500 downregulated genes
    thorax_up <- thorax_data %>%
      filter(padj < 0.05 & log2FoldChange > 1) %>%
      arrange(desc(log2FoldChange)) %>%
      slice_head(n = 500)

    thorax_down <- thorax_data %>%
      filter(padj < 0.05 & log2FoldChange < -1) %>%
      arrange(log2FoldChange) %>%
      slice_head(n = 500)

    # Combine data and prepare for heatmap
    thorax_data <- bind_rows(
      thorax_up %>% mutate(Tissue = "Thorax", Regulation = "Upregulated", Species = species),
      thorax_down %>% mutate(Tissue = "Thorax", Regulation = "Downregulated", Species = species)
    ) %>%
      select(Orthogroup, log2FoldChange, Tissue, Regulation, Species)
  }

  # Store this species' data (may be empty; placeholders are added after the loop)
  heatmap_list[[species]] <- thorax_data
}

# Combine all species data
final_heatmap_data <- bind_rows(heatmap_list)

# Ensure all species are represented, even if they have no significant DEGs
for (species in species_order) {
    if (!species %in% unique(final_heatmap_data$Species)) {
        message(paste("Adding placeholder for missing species:", species))
        final_heatmap_data <- bind_rows(
            final_heatmap_data,
            data.frame(
                Orthogroup = "Unassigned",  # Placeholder Orthogroup
                log2FoldChange = 0,
                Tissue = "Thorax",
                Regulation = "None",
                Species = species
            )
        )
    }
}

# Ensure species order in the data
final_heatmap_data$Species <- factor(final_heatmap_data$Species, levels = species_order)

# Create heatmap matrix (Thorax only)
heatmap_matrix <- final_heatmap_data %>%
  group_by(Orthogroup, Species) %>% 
  summarize(log2FoldChange = mean(log2FoldChange, na.rm = TRUE), .groups = "drop") %>%
  pivot_wider(names_from = Species, values_from = log2FoldChange, values_fill = 0) %>%
  column_to_rownames("Orthogroup") %>%
  as.matrix()

# Explicitly reorder the columns in heatmap_matrix
heatmap_matrix <- heatmap_matrix[, species_order, drop = FALSE]  # Ensure order is applied

# Define color palettes
custom_cyan_orange_palette <- colorRampPalette(c("cyan", "cyan2", "cyan3", "black", "orange3", "orange2", "orange"))(100)
custom_blue_red_palette <- colorRampPalette(c("blue3", "blue2", "blue1", "white", "red", "red2", "red3"))(100)

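# Define color breaks, symmetric around zero
# (pheatmap expects one more break than there are colors)
max_abs_lfc <- max(abs(heatmap_matrix), na.rm = TRUE)
color_breaks <- seq(-max_abs_lfc, max_abs_lfc, length.out = length(custom_blue_red_palette) + 1)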

# Generate heatmaps (Only thorax)
pheatmap(
  heatmap_matrix,
  color = custom_blue_red_palette,
  breaks = color_breaks,
  cluster_rows = TRUE,
  cluster_cols = FALSE,
  show_rownames = FALSE,
  show_colnames = TRUE,
  fontsize_row = 6,
  fontsize_col = 10,
  main = "Heatmap of Orthologs Expression in Thorax Tissue - STRATEGY 2"
)
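
The orthogroup-by-species matrix built above can be hard to picture from the pipeline alone. The following is a minimal sketch of the same aggregation in base R on toy data: the values, the two-species order, and the names toy_degs, toy_species_order and toy_matrix are all hypothetical and are not taken from the analysis.

```r
# Toy input: one row per DEG, already mapped to an orthogroup (hypothetical values)
toy_degs <- data.frame(
  Orthogroup     = c("OG1", "OG1", "OG2", "OG3"),
  Species        = c("gregaria", "gregaria", "gregaria", "nitens"),
  log2FoldChange = c(2, 4, -3, 1.5),
  stringsAsFactors = FALSE
)
toy_species_order <- c("nitens", "gregaria")

# Fix the column order through factor levels
toy_degs$Species <- factor(toy_degs$Species, levels = toy_species_order)

# Mean log2FoldChange per orthogroup and species;
# combinations with no DEG come back as NA
toy_matrix <- tapply(
  toy_degs$log2FoldChange,
  list(toy_degs$Orthogroup, toy_degs$Species),
  mean
)

# Fill absent combinations with 0, as values_fill = 0 does in pivot_wider()
toy_matrix[is.na(toy_matrix)] <- 0
print(toy_matrix)
```

One consequence of the zero fill is that a cell of 0 can mean either "no significant DEG in this orthogroup for this species" or an average that happens to be zero; the heatmap cannot distinguish the two.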

pheatmap(
  heatmap_matrix,
  color = custom_cyan_orange_palette,
  breaks = color_breaks,
  cluster_rows = TRUE,
  cluster_cols = FALSE,
  show_rownames = FALSE,
  show_colnames = TRUE,
  fontsize_row = 6,
  fontsize_col = 10,
  main = "Heatmap of Orthologs Expression in Thorax Tissue - STRATEGY 2"
)
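
Both heatmaps above share one set of color breaks. As a hedged illustration of how symmetric breaks relate to palette length, here is a small sketch; symmetric_breaks is a hypothetical helper, and the toy palette and matrix are made up for the example. It relies on the pheatmap convention that the breaks vector is one element longer than the color vector.

```r
# Hypothetical helper: breaks symmetric around zero for a diverging palette.
# Returns n_colors + 1 values so that each color gets exactly one interval.
symmetric_breaks <- function(mat, n_colors) {
  limit <- max(abs(mat), na.rm = TRUE)
  seq(-limit, limit, length.out = n_colors + 1)
}

# Toy palette and matrix (illustrative values only)
toy_palette <- colorRampPalette(c("blue3", "white", "red3"))(100)
toy_values  <- matrix(c(-2, 0, 1, 4), nrow = 2)
toy_breaks  <- symmetric_breaks(toy_values, length(toy_palette))

length(toy_breaks)  # one more than the number of colors
range(toy_breaks)   # symmetric limits taken from the largest absolute value
```

With an even number of colors, zero falls exactly on the middle break, so negative and positive fold changes map to opposite halves of the palette.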

sessionInfo()

R version 4.4.1 (2024-06-14)
Platform: aarch64-apple-darwin20
Running under: macOS 15.3

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Asia/Tokyo
tzcode source: internal

attached base packages:
[1] grid      stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] data.table_1.16.4   lubridate_1.9.4     forcats_1.0.0      
 [4] stringr_1.5.1       purrr_1.0.4         tidyverse_2.0.0    
 [7] readr_2.1.5         DT_0.33             gridExtra_2.3      
[10] VennDiagram_1.7.3   futile.logger_1.4.3 tibble_3.2.1       
[13] kableExtra_1.4.0    viridis_0.6.5       viridisLite_0.4.2  
[16] RColorBrewer_1.1-3  tidyr_1.3.1         pheatmap_1.0.12    
[19] ggVennDiagram_1.5.3 htmlwidgets_1.6.4   plotly_4.10.4      
[22] ggplot2_3.5.1       dplyr_1.1.4         knitr_1.49         

loaded via a namespace (and not attached):
 [1] gtable_0.3.6         xfun_0.50            bslib_0.9.0         
 [4] tzdb_0.4.0           crosstalk_1.2.1      vctrs_0.6.5         
 [7] tools_4.4.1          generics_0.1.3       pkgconfig_2.0.3     
[10] lifecycle_1.0.4      farver_2.1.2         compiler_4.4.1      
[13] git2r_0.35.0         textshaping_1.0.0    munsell_0.5.1       
[16] httpuv_1.6.15        htmltools_0.5.8.1    sass_0.4.9          
[19] yaml_2.3.10          lazyeval_0.2.2       later_1.4.1         
[22] pillar_1.10.1        jquerylib_0.1.4      whisker_0.4.1       
[25] cachem_1.1.0         tidyselect_1.2.1     digest_0.6.37       
[28] stringi_1.8.4        labeling_0.4.3       rprojroot_2.0.4     
[31] fastmap_1.2.0        colorspace_2.1-1     cli_3.6.3           
[34] magrittr_2.0.3       withr_3.0.2          scales_1.3.0        
[37] promises_1.3.2       timechange_0.3.0     rmarkdown_2.29      
[40] lambda.r_1.2.4       httr_1.4.7           workflowr_1.7.1     
[43] ragg_1.3.3           hms_1.1.3            evaluate_1.0.3      
[46] rlang_1.1.5          futile.options_1.0.1 Rcpp_1.0.14         
[49] glue_1.8.0           formatR_1.14         xml2_1.3.6          
[52] svglite_2.1.3        rstudioapi_0.17.1    jsonlite_1.8.9      
[55] R6_2.5.1             systemfonts_1.2.1    fs_1.6.5

Cross-species comparisons and DEGs overlap

Maeva Techer

2025-02-11
