Last updated: 2023-01-22

Knit directory: dgrp-starve/

This reproducible R Markdown analysis was created with workflowr (version 1.7.0).


The results in this page were generated with repository version 709ca14.

These are the previous versions of the repository in which changes were made to the R Markdown (analysis/recap.Rmd) and HTML (docs/recap.html) files.

File  Version  Author   Date        Message
Rmd   709ca14  nklimko  2023-01-22  wflow_publish("analysis/recap.Rmd")
html  e4cf8cb  nklimko  2023-01-10  Build site.
Rmd   c826636  nklimko  2023-01-10  wflow_publish("analysis/recap.Rmd")
html  bf1900e  nklimko  2023-01-10  Build site.
Rmd   a05fc48  nklimko  2023-01-10  wflow_publish("analysis/recap.Rmd")

2023/01/17

• Upload rds/Rdata file of y_adj, mu, TRM, and validate for further review.

• Perform cross validation using own code: generate TRM with both model and validation set.

• Use fitG and fitB for transcriptomic predictors and intercepts respectively.

• Summary statistics of accuracy: compare predicted phenotype to observed.

• Address qqman not taking input/arguments.

• Next meeting: 2:00 pm on Monday the 23rd.
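
The cross-validation items above could be sketched roughly as follows. This is a minimal base R sketch on simulated data, not the analysis code: the names `W`, `make_trm`, and `blup_predict` are hypothetical, and the variance ratio `lambda` is fixed by assumption, whereas in practice the variance components would be estimated (for example by REML). The TRM is built from all lines, model and validation set together, and then partitioned.

```r
set.seed(1)

# Simulated stand-ins: 100 lines x 50 genes of expression, plus a phenotype.
n_lines <- 100
n_genes <- 50
W <- matrix(rnorm(n_lines * n_genes), n_lines, n_genes)
y <- as.vector(W %*% rnorm(n_genes, sd = 0.3) + rnorm(n_lines))

# Transcriptomic relationship matrix from centered and scaled expression,
# built on all lines (training and validation together).
make_trm <- function(W) {
  Ws <- scale(W)
  tcrossprod(Ws) / ncol(Ws)
}

# BLUP of validation-line effects given training phenotypes.
# lambda = residual variance / transcriptomic variance (assumed known here).
blup_predict <- function(K, y, train, lambda = 1) {
  mu <- mean(y[train])
  K_tt <- K[train, train]
  K_vt <- K[-train, train]
  g_hat <- K_vt %*% solve(K_tt + lambda * diag(length(train)), y[train] - mu)
  mu + as.vector(g_hat)
}

K <- make_trm(W)
train <- sample(n_lines, 80)          # indices of the model (training) set
y_hat <- blup_predict(K, y, train)    # predictions for the validation set

# Accuracy summary: compare predicted with observed phenotype in the validation set.
accuracy <- c(
  corr = cor(y_hat, y[-train]),
  rmse = sqrt(mean((y_hat - y[-train])^2))
)
print(accuracy)
```

Repeating the `sample()` step over several partitions and averaging `accuracy` would give a cross-validated summary.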

2023/01/10

• Adjust phenotypes for Wolbachia infection status and inversions for Linear Regression

• Use validate to perform partitioned prediction on data using Pearson’s correlation coefficient ($corr) from qqman methods

• Plan to review progress each Monday at 11:00 am

• Troubleshoot stat from qqman not taking input/arguments -> determine issue
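
One way to adjust a phenotype for Wolbachia infection status and inversions is to regress it on those covariates and keep the residuals. The sketch below uses simulated data; the column names (`wolbachia`, `inv_a`, `inv_b`) and effect sizes are hypothetical and are not taken from the DGRP files.

```r
set.seed(2)

# Simulated stand-in for line-level data; column names are hypothetical.
n <- 200
pheno <- data.frame(
  wolbachia = factor(sample(c("n", "y"), n, replace = TRUE)),
  inv_a     = factor(sample(c("ST", "INV"), n, replace = TRUE)),
  inv_b     = factor(sample(c("ST", "INV"), n, replace = TRUE))
)
pheno$y <- 50 + 3 * (pheno$wolbachia == "y") - 2 * (pheno$inv_a == "INV") + rnorm(n)

# Regress the phenotype on infection status and inversion genotypes,
# then keep the residuals (plus the intercept) as the adjusted phenotype.
fit <- lm(y ~ wolbachia + inv_a + inv_b, data = pheno)
pheno$y_adj <- resid(fit) + coef(fit)[["(Intercept)"]]

# After adjustment the covariates should explain essentially none of y_adj.
r2_after <- summary(lm(y_adj ~ wolbachia + inv_a + inv_b, data = pheno))$r.squared
r2_after
```

Adding the intercept back keeps `y_adj` on the original scale; using the bare residuals would work equally well for downstream regression.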

2022/12/21

• Fit data to model using starvation resistance vector

• Reorganize site to hyperlinks from main index rather than new pages every week

• Next meeting will be January 10th at CHG

• Merry Christmas!

2022/12/14

• Fix the incorrect x-axis label for the uniform distribution.

• Inflection in the QQ plot indicates inflation of p-values.

• Address this by using a mixed model with fixed and random effects.

• An example of how to do this with simulated data will be sent later.

• Send shift schedule to plan next week’s meeting.
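
Inflation of the kind noted above is often quantified with the genomic inflation factor. That diagnostic was not part of the meeting notes; it is shown here only as a hedged illustration, with simulated p-values and a hypothetical helper name `inflation_factor`.

```r
# Genomic inflation factor: ratio of the observed median chi-square statistic
# (1 df) to its expected median under the null. Values near 1 mean no inflation.
inflation_factor <- function(p) {
  chisq <- qchisq(p, df = 1, lower.tail = FALSE)
  median(chisq) / qchisq(0.5, df = 1)
}

set.seed(3)
p_null     <- runif(10000)   # well-calibrated p-values
p_inflated <- p_null^2       # hypothetical inflated p-values, skewed toward 0

inflation_factor(p_null)      # close to 1
inflation_factor(p_inflated)  # clearly above 1
```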

2022/12/06

• Fix correlations: lm not reading in data properly, p-values wrong

• Add -log10 scale on qqplot, not just Manhattan plot

• qqman package for both qqplot and Manhattan plot

• Next meeting Wednesday, Dec. 14th 2:30-3:00 via Zoom due to exams.
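
For reference, the points of a -log10 QQ plot can be computed in base R as sketched below, using simulated p-values and a hypothetical helper `qq_points`. The qqman package provides `qq()` for a vector of p-values and `manhattan()` for a data frame of results, which would replace the manual plotting here.

```r
# Observed vs expected -log10(p) under a uniform null.
qq_points <- function(p) {
  p <- sort(p[!is.na(p)])
  n <- length(p)
  data.frame(
    expected = -log10(ppoints(n)),  # expected quantiles of Uniform(0, 1)
    observed = -log10(p)
  )
}

set.seed(4)
pts <- qq_points(runif(1000))
plot(pts$expected, pts$observed,
     xlab = "Expected -log10(p)", ylab = "Observed -log10(p)")
abline(0, 1, col = "red")  # points on this line match the uniform expectation
```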

2022/11/29

• Use gene expression data to perform simple regression with starvation resistance.

• Run summary(lm(y~x)) for starvation resistance against gene expression for each gene in both sexes.

• p-value of slope can be accessed from summary()[[4]][8]

• Create a -log10 scale QQ plot of the p-values against a uniform distribution.

• Determine why this statistical approach is inappropriate.
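
The per-gene regression described above might look like the following sketch. The data are simulated and the object names (`expr`, `slope_p`) are hypothetical; the last lines only check that the positional index `[[4]][8]` picks out the slope p-value.

```r
set.seed(5)

# Simulated stand-ins: 40 lines x 5 genes of expression and a phenotype vector.
expr <- matrix(rnorm(40 * 5), nrow = 40,
               dimnames = list(NULL, paste0("gene", 1:5)))
y <- 2 * expr[, "gene1"] + rnorm(40)

# One simple regression per gene; keep the slope p-value.
slope_p <- apply(expr, 2, function(x) {
  fit <- summary(lm(y ~ x))
  fit$coefficients["x", "Pr(>|t|)"]   # same element as fit[[4]][8]
})
slope_p

# The positional form picks the same element:
# [[4]] is the 2 x 4 coefficient matrix and [8] is its row 2, column 4.
fit1 <- summary(lm(y ~ expr[, "gene1"]))
same_value <- identical(fit1[[4]][8], unname(fit1$coefficients[2, 4]))
same_value
```

Indexing by row and column name is longer to type but does not silently break if the model gains another term.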

2022/11/22

• Multiple comparison scatter plots should be sex specific.

• PCA should be performed with genotype data, not gene expression.

o Calculate mean expression and variance for gene expression.

o Scatter plot of overlapping genes between males and females.

o Count the number of genes expressed in males, females, and both.
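
A possible base R sketch of the three sub-items above, on simulated expression matrices with lines in rows and genes in columns. The object names and gene labels are hypothetical, and "expressed" is read here simply as "present in that sex's matrix".

```r
set.seed(6)

# Simulated stand-ins; the gene sets differ between sexes.
expr_f <- matrix(rnorm(20 * 4), 20, dimnames = list(NULL, c("g1", "g2", "g3", "g4")))
expr_m <- matrix(rnorm(20 * 4), 20, dimnames = list(NULL, c("g2", "g3", "g4", "g5")))

# Per-gene mean and variance of expression.
gene_stats <- function(expr) {
  data.frame(gene = colnames(expr),
             mean = colMeans(expr),
             var  = apply(expr, 2, var),
             row.names = NULL)
}
stats_f <- gene_stats(expr_f)
stats_m <- gene_stats(expr_m)

# Counts of genes found in females only, males only, and both.
shared <- intersect(colnames(expr_f), colnames(expr_m))
counts <- c(female_only = length(setdiff(colnames(expr_f), shared)),
            male_only   = length(setdiff(colnames(expr_m), shared)),
            both        = length(shared))
counts

# Statistics for shared genes, paired by gene for a male-vs-female scatter plot.
paired <- merge(stats_f, stats_m, by = "gene", suffixes = c("_f", "_m"))
plot(paired$mean_f, paired$mean_m,
     xlab = "Mean expression (females)", ylab = "Mean expression (males)")
```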

2022/11/15

• Continue multiple comparisons with trendlines, multiple tips:

o cor.test() for summary statistics

o List notation to store ggplots

o Use print() with ggplot, cowplot for grid arrangement

• Try to find an efficient method for column means of gene expression

• Principal Component analysis – begin brainstorming

• Discuss further PhD plans mid-February
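
The `cor.test()` and column-mean items above could be sketched as below, on simulated traits with hypothetical names. Results are kept in a named list, and the same `lapply()` pattern can hold ggplot objects instead of test results.

```r
set.seed(7)

# Simulated stand-in: starvation resistance plus two other traits, one row per line.
traits <- data.frame(starvation = rnorm(50))
traits$trait_a <- 0.8 * traits$starvation + rnorm(50, sd = 0.5)
traits$trait_b <- rnorm(50)

# Run cor.test() of starvation against every other trait; store results in a list.
others <- setdiff(names(traits), "starvation")
tests <- lapply(others, function(nm) cor.test(traits$starvation, traits[[nm]]))
names(tests) <- others

# Pull summary statistics out of each htest object.
summary_stats <- data.frame(
  trait    = others,
  estimate = sapply(tests, function(t) unname(t$estimate)),
  p_value  = sapply(tests, function(t) t$p.value),
  row.names = NULL
)
summary_stats

# colMeans() is the vectorised way to get per-column means;
# it agrees with apply(x, 2, mean) up to floating-point tolerance.
expr <- matrix(rnorm(50 * 1000), nrow = 50)
means_match <- isTRUE(all.equal(colMeans(expr), apply(expr, 2, mean)))
means_match
```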

2022/11/11

• Presented starvation analysis of DGRP lines

o Include “code_folding: hide” in yaml headers for Rmd

o Include variance in results

• Multiple comparison of starvation against all other traits

• Perform analysis of gene expression by line

2022/11/01

• Get creative with analysis – scatterplot trendline, normality, beyond

• Run workflowr from the command line, develop website

• Base repositories in /data/morgante_lab/nklimko instead of home drive

• Next meeting Friday the 11th at 10am by Zoom due to election

2022/10/26

• Set up GitHub SSH keys and config settings on personal laptop and secretariat

• Recap of workflowr and walkthrough of model layout

• Introductory project to analyze starvation resistance trait in male and female lines on computational node in secretariat.

• Postdoctoral candidate meeting and seminar on Wed 10/27 – provide feedback on both candidates

2022/10/18

• Presented SNP Prediction Paper on Plant and Animal breeding

• Start using GitHub and workflowr for data processing in the upcoming weeks

2022/10/04

• Create simple presentation for https://www.genetics.org/content/genetics/193/2/327.full.pdf by 10/11 - build habit of summarizing papers/distilling main points

• Class selection: Adv. Biochem, Seminar, Intro to Quant Gen, and Regression + Least Squares

o Consult on further class selection semester basis

• Talk with Dr. Starr-Moss regarding credit hours for master’s research – three recommended

2022/09/27

• Review of Presentation for https://academic.oup.com/g3journal/article/10/12/4599/6048688

o Transcriptome data higher accuracy than SNP variation

o External data boosts prediction accuracy of transcriptome alone (TBLUP vs. GO-TBLUP)

o Redundancy of genome and transcriptome – additive vs non-additive

• Complete final Dataquest subunit by Friday

• Review https://www.genetics.org/content/genetics/193/2/327.full.pdf

2022/09/13

• Goal to complete R modules on Dataquest by 9/30: Dataquest is paid software

• Read two papers focused on DGRP creation and usage

• Begin reading additional paper for class presentation

• Meeting time shift to 2:30pm for travel considerations