Last updated: 2020-10-15
Checks: 7 passed, 0 failed
Knit directory: Blancetal/analysis/
This reproducible R Markdown analysis was created with workflowr (version 1.6.2). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.
Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.
Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility it’s best to always run the code in an empty environment.
The command set.seed(20200217) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
Great job! Recording the operating system, R version, and package versions is critical for reproducibility.
Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.
Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.
Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.
The results in this page were generated with repository version 0e39940. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.
Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:
Ignored files:
Ignored: .DS_Store
Ignored: .Rhistory
Ignored: .Rproj.user/
Ignored: data/.DS_Store
Ignored: data/quaint-results.rda
Ignored: output/.DS_Store
Untracked files:
Untracked: data/Frey_Cold.txt
Untracked: data/Frey_Cold.txt~
Untracked: data/schaefer_clusters.csv~
Untracked: figures/Supplement_Ve.png
Untracked: figures/phist.png
Untracked: output/Genenames_0.05.txt
Unstaged changes:
Modified: analysis/Environmental_variance.Rmd
Modified: analysis/Identifying_quaint.Rmd
Modified: analysis/Selection_on_Expression_of_Env_Rsponse_Genes.Rmd
Modified: output/all_sigenes_annotate.csv
Deleted: output/names_0.05_all.txt
Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.
These are the previous versions of the repository in which changes were made to the R Markdown (analysis/Selection_on_expression_of_coexpreession_clusters.Rmd) and HTML (docs/Selection_on_expression_of_coexpreession_clusters.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.
| File | Version | Author | Date | Message |
|---|---|---|---|---|
| Rmd | 0e39940 | jgblanc | 2020-10-15 | Added GO terms |
| html | 9cc4fa0 | jgblanc | 2020-09-30 | Build site. |
| Rmd | 3294a81 | jgblanc | 2020-09-30 | new clusters |
| html | 9ede1d4 | jgblanc | 2020-04-24 | Build site. |
| Rmd | 69c5a29 | jgblanc | 2020-04-24 | Finished clusters |
| html | e26e754 | jgblanc | 2020-04-24 | Build site. |
| Rmd | 3a6666a | jgblanc | 2020-04-24 | Finished clusters |
| Rmd | 8298d4d | GitHub | 2020-04-16 | Merge branch ‘master’ into master |
| Rmd | 6b00f47 | em | 2020-03-27 | stuff |
| Rmd | a63fa86 | em | 2020-03-26 | stuff |
| html | 7591f0d | jgblanc | 2020-03-16 | Build site. |
| Rmd | df4fe40 | jgblanc | 2020-03-16 | added more pvals |
| Rmd | 2cb89cd | jgblanc | 2020-03-13 | adding coexpression stuff |
Here is the code to test for selection on the expression of co-expression clusters from Walley et al.
For this analysis we want to test for selection within specific co-expression modules. We used co-expression modules from Walley et al. (2016), who used weighted gene co-expression network analysis (WGCNA) to group together genes that were similarly expressed in at least 4 tissues in one maize inbred line. Their analysis resulted in 66 co-expression networks. Below we load their co-expression networks and select all the clusters that contain at least 100 genes.
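library(dplyr)      # %>%, group_by(), mutate(), filter(), arrange()
library(data.table) # fread()
library(knitr)      # kable()
# calcQpc() is assumed to be loaded already (e.g. from the quaint package)
modules <- read.delim("../data/Modules.txt", na.strings = c("", "NA"), stringsAsFactors = FALSE)
num_genes <- apply(modules, 2, function(x) length(which(!is.na(x)))) # genes per module
num_genes_100 <- which(num_genes >= 100) # modules with at least 100 genes
modules <- modules[,num_genes_100]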
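As a quick illustration of the filtering step above, here is a minimal sketch on a made-up table. The `toy_modules` data frame, its gene IDs, and the threshold of 2 are hypothetical and chosen only to keep the example small; the layout (one column per module, padded with `NA`) is assumed to match how the module file is read above.

```r
# Toy stand-in for the module table: one column per module, padded with NA.
toy_modules <- data.frame(
  modA = c("g1", "g2", "g3"),
  modB = c("g4", NA, NA),
  modC = c("g5", "g6", NA),
  stringsAsFactors = FALSE
)

# Count the non-missing gene IDs in each column
# (same counts as the apply() call above, written with colSums()).
toy_sizes <- colSums(!is.na(toy_modules))

# Keep modules at or above the size threshold; drop = FALSE keeps a
# data frame even if only one module passes.
min_genes <- 2
toy_kept <- toy_modules[, toy_sizes >= min_genes, drop = FALSE]

print(toy_sizes)
print(names(toy_kept))
```

Using `drop = FALSE` is a defensive choice for the case where a single module survives the filter; with 51 modules retained it makes no difference here.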
We now have 51 modules with at least 100 genes each. For each module, we take the genes that are present in our RNA-seq dataset, compute their median expression in each line, and then test for selection on these per-line median values. The code below is similar to the code used to identify selected genes; the only difference is that the test is run on the median expression of a module in each line rather than on the expression of a single gene. As before, we test for selection on the first 5 PCs and use the later PCs (PC 6 onward in the code below) to estimate \(V_a\). The function returns the p-values from the \(Q_{pc}\) test for all tissues.
cluster_analysis_func <- function(modules) {
  alltissues = c('GRoot', "Kern", "LMAD26", "LMAN26", "L3Tip", "GShoot", "L3Base")
  alltissuemodules = lapply(alltissues, function(mytissue) {
    ## get the names of genes we have expression data for in this tissue
    exp <- read.table(paste("../data/Raw_expression/", mytissue, ".txt", sep = "")) # read in expression levels
    geneNames = names(exp)
    ## get the median expression level for each module in this tissue
    moduleExpression = apply(modules, 2, function(x) {
      olap = x[x %in% geneNames] # genes in the module that we have expression data for
      moduleExp = dplyr::select(exp, all_of(olap)) # pull out the expression data for these genes
      medianExp = apply(moduleExp, 1, median) # median across genes, one value per line
      return(medianExp)
    })
    #### identify selection in each of these modules
    ## read in the tissue-specific kinship matrix
    myF <- read.table(paste('../data/Kinship_matrices/F_', mytissue, '.txt', sep = ""))
    ## get eigenvalues and eigenvectors
    myE <- eigen(myF)
    E_vectors <- myE$vectors
    E_values <- myE$values
    ## testing for selection on the first 5 PCs
    myM = 1:5
    ## using the remaining PCs (6 onward) to estimate Va
    myL = 6:dim(myF)[1]
    ### test for selection on all modules
    moduleSelection <- apply(moduleExpression, 2, function(x) {
      meanCenteredX = x[-length(x)] - mean(x) # center on the mean of all lines, then drop the last line
      myQpc = calcQpc(myZ = meanCenteredX, myU = E_vectors, myLambdas = E_values, myL = myL, myM = myM)
      return(myQpc$pvals)
    })
    ## make a data frame with the info to return
    mydf = data.frame(t(moduleSelection))
    names(mydf) = c('PC1', 'PC2', 'PC3', 'PC4', 'PC5')
    mydf$module = row.names(mydf)
    row.names(mydf) <- NULL
    mydflong = tidyr::gather(mydf, 'PC', 'pval', PC1:PC5) # one row per module x PC
    mydflong$tissue = mytissue
    return(mydflong)
  })
  return(alltissuemodules)
}
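The test itself happens inside `calcQpc()`, whose source is not shown on this page. As a rough guide to the kind of computation a test of this shape performs, the sketch below projects a trait onto the eigenvectors of a kinship matrix, scales each projection by the square root of its eigenvalue, estimates the neutral variance from the later PCs, and compares each early PC against it. Everything here is an assumption made for illustration: the name `qpc_sketch`, the simulated kinship matrix, and the use of an F distribution with 1 and `length(L)` degrees of freedom. It is not the implementation that produced the results on this page.

```r
# Hypothetical helper, NOT the calcQpc() used above.
# z       : trait values, assumed already mean-centered
# U       : eigenvectors of the kinship matrix (columns)
# lambdas : matching eigenvalues
# M, L    : PCs to test, and PCs used to estimate the neutral variance
qpc_sketch <- function(z, U, lambdas, M = 1:5, L = 6:length(lambdas)) {
  # Project the trait onto every PC and scale by sqrt(eigenvalue) so that all
  # projections share the same expected variance under drift alone.
  proj <- as.vector(crossprod(U, z)) / sqrt(lambdas)
  va_hat <- mean(proj[L]^2)   # variance estimate from the later PCs
  q <- proj[M]^2 / va_hat     # one ratio per tested PC
  pf(q, df1 = 1, df2 = length(L), lower.tail = FALSE)
}

# Simulated inputs, for illustration only.
set.seed(1)
n <- 40
G <- matrix(rnorm(n * 200), nrow = n)
K <- tcrossprod(G) / ncol(G)   # positive-definite stand-in for a kinship matrix
e <- eigen(K)

# A trait drawn with covariance proportional to K (drift only, no selection).
z <- as.vector(e$vectors %*% (sqrt(e$values) * rnorm(n)))
pvals <- qpc_sketch(z, e$vectors, e$values)
print(pvals)
```

A large ratio on one of the first PCs means the trait varies more along that axis of relatedness than the later PCs would predict, which is the signal the analysis above is looking for.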
Now that we have the p-values, let’s calculate the FDR and look for significant results.
alltissuemodules = cluster_analysis_func(modules)
#combine all tissues into one data frame
bigdf <- do.call('rbind', alltissuemodules)
bigdf$fdr <- p.adjust(bigdf$pval, method='fdr') ##calculate an FDR
summary(bigdf$fdr) ##nothing is significant
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.4299 0.9996 0.9996 0.9990 0.9996 0.9996
kable(bigdf[which.min(bigdf[,3]),])
| | module | PC | pval | tissue | fdr |
|---|---|---|---|---|---|
| 732 | RNA_Root_Meristem | PC5 | 0.0002409 | LMAD26 | 0.4299294 |
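As a sanity check on the `fdr` column, note that `method = 'fdr'` in `p.adjust()` is the Benjamini-Hochberg procedure, under which the adjusted value of the smallest p-value can be no larger than that p-value times the number of tests. The sketch below assumes every one of the 51 modules yields a p-value for each of the 5 PCs in each of the 7 tissues (1785 tests); the toy vector at the end is made up.

```r
# Number of tests, assuming 51 modules x 5 PCs x 7 tissues.
n_tests <- 51 * 5 * 7
p_min <- 0.0002409               # smallest p-value as printed above (rounded)
upper_bound <- p_min * n_tests   # BH-adjusted value of the smallest p is at most this
print(n_tests)
print(upper_bound)

# The same property on a toy vector: one small p-value among 99 large ones.
set.seed(2)
toy_p <- c(0.0005, runif(99, 0.2, 1))
toy_fdr <- p.adjust(toy_p, method = "fdr")
print(toy_fdr[1])
```

The product is close to the 0.4299 reported in the table, with the small gap attributable to the rounding of the printed p-value.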
There is no significant selection on co-expression clusters at the FDR level. The lowest FDR (0.43) is for the “RNA Root Meristem” cluster in adult leaf tissue on PC 5. We can also look at the clusters with the lowest Bonferroni-corrected p-values, where the correction is applied separately within each PC and tissue.
t <- bigdf %>% group_by(PC, tissue) %>% mutate(bonferroni = p.adjust(pval, method='bonferroni')) %>% arrange(bonferroni)
kable(head(t))
| module | PC | pval | tissue | fdr | bonferroni |
|---|---|---|---|---|---|
| RNA_Root_Meristem | PC5 | 0.0002409 | LMAD26 | 0.4299294 | 0.0122837 |
| RNA_FemaleSpikIt_._VegMeristem | PC1 | 0.0012924 | Kern | 0.7985099 | 0.0659101 |
| Protein_Germinating_Kernals | PC1 | 0.0017164 | Kern | 0.7985099 | 0.0875356 |
| RNA_Root_Cortex_2 | PC1 | 0.0017894 | Kern | 0.7985099 | 0.0912583 |
| Protein_EndoCrown.PericarpAl | PC1 | 0.0039731 | Kern | 0.9996316 | 0.2026281 |
| RNA_Root_Cortex | PC1 | 0.0053132 | Kern | 0.9996316 | 0.2709719 |
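Because the correction is computed after `group_by(PC, tissue)`, each p-value is multiplied only by the number of modules tested within its own PC and tissue, and capped at 1. The sketch below shows this on made-up values, using base R's `ave()` in place of the dplyr pipeline; the `toy` data frame is hypothetical.

```r
# Toy results: 3 modules tested on 2 PCs in a single tissue.
toy <- data.frame(
  module = rep(c("m1", "m2", "m3"), times = 2),
  PC     = rep(c("PC1", "PC2"), each = 3),
  tissue = "Kern",
  pval   = c(0.01, 0.20, 0.50, 0.03, 0.40, 0.90)
)

# Bonferroni within each PC x tissue group: every p-value is multiplied by
# the number of tests in its group (3 here) and capped at 1.
toy$bonferroni <- ave(toy$pval, toy$PC, toy$tissue,
                      FUN = function(p) p.adjust(p, method = "bonferroni"))
toy <- toy[order(toy$bonferroni), ]
print(toy)
```

In the table above the same arithmetic gives roughly 0.0002409 x 51 for the top row. This correction accounts only for the modules tested within one PC and one tissue, so it is less stringent than correcting across all PCs and tissues at once.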
ZmRoot Clusters - get clusters with >= 100 genes
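clusters <- fread("../data/schaefer_clusters.csv")
clusters <- clusters[,1:4]
# keep only ZmRoot clusters with at least 100 genes
dat <- clusters %>% group_by(ZmRoot) %>% mutate(num_genes = n()) %>% filter(num_genes >= 100)
names <- unique(dat$ZmRoot)
gene_list <- list()
# size of cluster 1, used as the padded column length (assumes no retained cluster is larger)
max <- nrow(dat[dat$ZmRoot == 1,])
for(i in 1:length(names)) {
  df <- clusters %>% filter(ZmRoot == names[i])
  genes <- c(df$V1, rep(NA, max-nrow(df))) # pad with NA so all columns have equal length
  gene_list[[i]] <- genes
}
# one column per cluster, in the order of `names`
modules <- do.call(cbind, gene_list)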
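Two details of the loop above are worth keeping in mind. The padded length is taken from cluster 1, which works only if no retained cluster is larger, and the matrix built by `cbind()` has no column names, so the `module` column produced downstream appears to hold column positions, which may not coincide with the cluster labels unless the clusters happen to appear in that order. A minimal sketch of an alternative that carries the labels along is shown below; the `toy_clusters` data, its column names, and the threshold of 2 are hypothetical.

```r
# Toy cluster assignments (hypothetical gene IDs and cluster labels).
toy_clusters <- data.frame(
  gene    = paste0("g", 1:7),
  cluster = c(0, 0, 0, 0, 1, 1, 2)
)
min_genes <- 2   # small threshold for the toy data

# Gene IDs per cluster, then drop the clusters that are too small.
members <- split(toy_clusters$gene, toy_clusters$cluster)
members <- members[lengths(members) >= min_genes]

# Pad every cluster to the length of the largest one and bind as columns.
# sapply() keeps the cluster labels as column names.
longest <- max(lengths(members))
toy_modules <- sapply(members, function(g) c(g, rep(NA, longest - length(g))))
print(toy_modules)
```

With named columns, the `module` value reported for each test would be the cluster label itself rather than its position in the matrix.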
Run analysis and FDR correction
alltissuemodules = cluster_analysis_func(modules)
#combine all tissues into one data frame
bigdf <- do.call('rbind', alltissuemodules)
bigdf$fdr <- p.adjust(bigdf$pval, method='fdr') ##calculate an FDR
summary(bigdf$fdr) ##nothing is significant
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.4377 0.9951 0.9951 0.9889 0.9951 0.9951
kable(bigdf[which.min(bigdf[,3]),])
| | module | PC | pval | tissue | fdr |
|---|---|---|---|---|---|
| 158 | 4 | PC5 | 0.0011368 | LMAD26 | 0.4376829 |
Look at lowest p-value clusters
t <- bigdf %>% group_by(PC, tissue) %>% mutate(bonferroni = p.adjust(pval, method='bonferroni')) %>% arrange(bonferroni)
kable(head(t))
| module | PC | pval | tissue | fdr | bonferroni |
|---|---|---|---|---|---|
| 4 | PC5 | 0.0011368 | LMAD26 | 0.4376829 | 0.0125052 |
| 3 | PC5 | 0.0023537 | LMAD26 | 0.4530808 | 0.0258903 |
| 9 | PC1 | 0.0053389 | Kern | 0.5741055 | 0.0587274 |
| 8 | PC2 | 0.0063947 | LMAD26 | 0.5741055 | 0.0703414 |
| 9 | PC3 | 0.0074559 | LMAN26 | 0.5741055 | 0.0820151 |
| 11 | PC1 | 0.0186208 | Kern | 0.9950821 | 0.2048286 |
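Let’s look at the GO enrichment terms for the clusters with the lowest Bonferroni-corrected p-values. Modules 4 and 3 have Bonferroni p < 0.05; modules 9 and 8 fall just above that threshold (0.059 and 0.070).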
info <- fread("~/Downloads/GO_clusters.csv")
zmroot <- info %>% filter(Network == "ZmRoot")
# Module 4
zmroot %>% filter(`MCL Cluster` == "MCL4")
Network MCL Cluster GO Term Num Genes common Num GO genes Num MCL Genes
1: ZmRoot MCL4 GO:0006558 4 18 745
2: ZmRoot MCL4 GO:1902221 4 18 745
3: ZmRoot MCL4 GO:0006559 3 10 745
4: ZmRoot MCL4 GO:0009074 3 10 745
5: ZmRoot MCL4 GO:1902222 3 10 745
6: ZmRoot MCL4 GO:0016841 3 11 745
pval
1: 0.00181
2: 0.00181
3: 0.00282
4: 0.00282
5: 0.00282
6: 0.00379
GO Term Name
1: L-phenylalanine metabolic process
2: erythrose 4-phosphate/phosphoenolpyruvate family amino acid metabolic process
3: L-phenylalanine catabolic process
4: aromatic amino acid family catabolic process
5: erythrose 4-phosphate/phosphoenolpyruvate family amino acid catabolic process
6: ammonia-lyase activity
GO Term Desription
1: ""The chemical reactions and pathways involving L-phenylalanine, the L-enantiomer of 2-amino-3-phenylpropanoic acid, i.e. (2S)-2-amino-3-phenylpropanoic acid."" [CHEBI:17295, GOC:go_curators, GOC:jsg, GOC:mah]
2: ""The chemical reactions and pathways involving erythrose 4-phosphate/phosphoenolpyruvate family amino acid."" [GOC:pr, GOC:TermGenie]
3: ""The chemical reactions and pathways resulting in the breakdown of phenylalanine, 2-amino-3-phenylpropanoic acid."" [GOC:go_curators]
4: ""The chemical reactions and pathways resulting in the breakdown of aromatic amino acid family, amino acids with aromatic ring (phenylalanine, tyrosine, tryptophan)."" [GOC:go_curators]
5: ""The chemical reactions and pathways resulting in the breakdown of erythrose 4-phosphate/phosphoenolpyruvate family amino acid."" [GOC:pr, GOC:TermGenie]
6: ""Catalysis of the release of ammonia by the cleavage of a carbon-nitrogen bond or the reverse reaction with ammonia as a substrate."" [EC:4.3.-.-, GOC:krc]
# Module 3
zmroot %>% filter(`MCL Cluster` == "MCL3")
Network MCL Cluster GO Term Num Genes common Num GO genes Num MCL Genes
1: ZmRoot MCL3 GO:0006260 24 56 996
2: ZmRoot MCL3 GO:0000786 37 151 996
3: ZmRoot MCL3 GO:0032993 37 151 996
4: ZmRoot MCL3 GO:0044815 37 151 996
5: ZmRoot MCL3 GO:0006334 37 154 996
6: ZmRoot MCL3 GO:0034728 37 154 996
7: ZmRoot MCL3 GO:0065004 37 154 996
8: ZmRoot MCL3 GO:0071824 37 154 996
9: ZmRoot MCL3 GO:0044427 42 210 996
10: ZmRoot MCL3 GO:0003777 25 71 996
11: ZmRoot MCL3 GO:0005875 25 71 996
12: ZmRoot MCL3 GO:0006325 39 194 996
13: ZmRoot MCL3 GO:0007018 26 85 996
14: ZmRoot MCL3 GO:0007017 30 120 996
15: ZmRoot MCL3 GO:0044430 32 145 996
16: ZmRoot MCL3 GO:0034622 41 245 996
17: ZmRoot MCL3 GO:0006928 26 99 996
18: ZmRoot MCL3 GO:0003774 26 101 996
19: ZmRoot MCL3 GO:0006259 41 282 996
20: ZmRoot MCL3 GO:0016986 12 20 996
21: ZmRoot MCL3 GO:0006270 11 22 996
22: ZmRoot MCL3 GO:0034061 9 25 996
23: ZmRoot MCL3 GO:0003887 8 23 996
24: ZmRoot MCL3 GO:0006352 14 83 996
25: ZmRoot MCL3 GO:0000280 5 9 996
26: ZmRoot MCL3 GO:0007067 5 9 996
27: ZmRoot MCL3 GO:1903047 5 9 996
28: ZmRoot MCL3 GO:0009360 4 7 996
29: ZmRoot MCL3 GO:0042575 4 7 996
30: ZmRoot MCL3 GO:0048285 5 13 996
31: ZmRoot MCL3 GO:0032774 14 108 996
32: ZmRoot MCL3 GO:0071554 14 108 996
33: ZmRoot MCL3 GO:0022402 6 21 996
34: ZmRoot MCL3 GO:0016779 16 138 996
35: ZmRoot MCL3 GO:0008107 4 14 996
36: ZmRoot MCL3 GO:0008417 4 14 996
37: ZmRoot MCL3 GO:0031127 4 14 996
38: ZmRoot MCL3 GO:0042546 4 15 996
39: ZmRoot MCL3 GO:0004990 3 8 996
Network MCL Cluster GO Term Num Genes common Num GO genes Num MCL Genes
pval GO Term Name
1: 3.54e-19 DNA replication
2: 4.40e-19 nucleosome
3: 4.40e-19 protein-DNA complex
4: 4.40e-19 DNA packaging complex
5: 9.00e-19 nucleosome assembly
6: 9.00e-19 nucleosome organization
7: 9.00e-19 protein-DNA complex assembly
8: 9.00e-19 protein-DNA complex subunit organization
9: 5.92e-18 chromosomal part
10: 1.86e-17 microtubule motor activity
11: 1.86e-17 microtubule associated complex
12: 7.56e-17 chromatin organization
13: 2.24e-16 microtubule-based movement
14: 5.45e-16 microtubule-based process
15: 2.78e-15 cytoskeletal part
16: 9.82e-15 cellular macromolecular complex assembly
17: 1.34e-14 movement of cell or subcellular component
18: 2.26e-14 motor activity
19: 1.22e-12 DNA metabolic process
20: 1.67e-12 obsolete transcription initiation factor activity
21: 2.10e-10 DNA replication initiation
22: 3.18e-07 DNA polymerase activity
23: 1.98e-06 DNA-directed DNA polymerase activity
24: 5.60e-06 DNA-templated transcription, initiation
25: 1.18e-05 nuclear division
26: 1.18e-05 mitotic nuclear division
27: 1.18e-05 mitotic cell cycle process
28: 8.44e-05 DNA polymerase III complex
29: 8.44e-05 DNA polymerase complex
30: 1.05e-04 organelle fission
31: 1.16e-04 RNA biosynthetic process
32: 1.16e-04 cell wall organization or biogenesis
33: 1.39e-04 cell cycle process
34: 1.50e-04 nucleotidyltransferase activity
35: 1.92e-03 galactoside 2-alpha-L-fucosyltransferase activity
36: 1.92e-03 fucosyltransferase activity
37: 1.92e-03 alpha-(1,2)-fucosyltransferase activity
38: 2.54e-03 cell wall biogenesis
39: 3.17e-03 oxytocin receptor activity
pval GO Term Name
GO Term Desription
1: ""The cellular metabolic process in which a cell duplicates one or more molecules of DNA. DNA replication begins when specific sequences, known as origins of replication, are recognized and bound by initiation proteins, and ends when the original DNA molecule has been completely duplicated and the copies topologically separated. The unit of replication usually corresponds to the genome of the cell, an organelle, or a virus. The template for replication can either be an existing DNA molecule or RNA."" [GOC:mah]See also the biological process terms 'DNA-dependent DNA replication ; GO:0006261' and 'RNA-dependent DNA replication ; GO:0006278'.
2: ""A complex comprised of DNA wound around a multisubunit core and associated proteins, which forms the primary packing unit of DNA into higher order structures."" [GOC:elh]
3: ""A macromolecular complex containing both protein and DNA molecules."" [GOC:mah]Note that this term is intended to classify complexes that have DNA as one of the members of the complex, that is, the complex does not exist if DNA is not present. Protein complexes that interact with DNA e.g. transcription factor complexes should not be classified here.
4: ""A protein complex that plays a role in the process of DNA packaging."" [GOC:jl]
5: ""The aggregation, arrangement and bonding together of a nucleosome, the beadlike structural units of eukaryotic chromatin composed of histones and DNA."" [GOC:mah]
6: ""A process that is carried out at the cellular level which results in the assembly, arrangement of constituent parts, or disassembly of one or more nucleosomes."" [GOC:mah]
7: ""The aggregation, arrangement and bonding together of proteins and DNA molecules to form a protein-DNA complex."" [GOC:jl]
8: ""Any process in which macromolecules aggregate, disaggregate, or are modified, resulting in the formation, disassembly, or alteration of a protein-DNA complex."" [GOC:mah]
9: ""Any constituent part of a chromosome, a structure composed of a very long molecule of DNA and associated proteins (e.g. histones) that carries hereditary information."" [GOC:jl]Note that this term is in the subset of terms that should not be used for direct gene product annotation. Instead, select a child term or, if no appropriate child term exists, please request a new term. Direct annotations to this term may be amended during annotation QC.
10: ""Catalysis of movement along a microtubule, coupled to the hydrolysis of a nucleoside triphosphate (usually ATP)."" [GOC:mah, ISBN:0815316194]Consider also annotating to the molecular function term 'microtubule binding ; GO:0008017'.
11: ""Any multimeric complex connected to a microtubule."" [GOC:jl]
12: ""Any process that results in the specification, formation or maintenance of the physical structure of eukaryotic chromatin."" [GOC:mah]
13: ""A microtubule-based process that is mediated by motor proteins and results in the movement of organelles, other microtubules, or other particles along microtubules."" [GOC:cjm, ISBN:0815316194]
14: ""Any cellular process that depends upon or alters the microtubule cytoskeleton, that part of the cytoskeleton comprising microtubules and their associated proteins."" [GOC:mah]
15: ""Any constituent part of the cytoskeleton, a cellular scaffolding or skeleton that maintains cell shape, enables some cell motion (using structures such as flagella and cilia), and plays important roles in both intra-cellular transport (e.g. the movement of vesicles and organelles) and cellular division. Includes constituent parts of intermediate filaments, microfilaments, microtubules, and the microtrabecular lattice."" [GOC:jl]Note that this term is in the subset of terms that should not be used for direct gene product annotation. Instead, select a child term or, if no appropriate child term exists, please request a new term. Direct annotations to this term may be amended during annotation QC.
16: ""The aggregation, arrangement and bonding together of a set of macromolecules to form a complex, carried out at the cellular level."" [GOC:mah]
17: ""The directed, self-propelled movement of a cell or subcellular component without the involvement of an external agent such as a transporter or a pore."" [GOC:dgh, GOC:dph, GOC:jl, GOC:mlg]Note that in GO cellular components include whole cells (cell is_a cellular component).
18: ""Catalysis of the generation of force resulting either in movement along a microfilament or microtubule, or in torque resulting in membrane scission, coupled to the hydrolysis of a nucleoside triphosphate."" [GOC:mah, GOC:vw, ISBN:0815316194, PMID:11242086]
19: ""Any cellular metabolic process involving deoxyribonucleic acid. This is one of the two main types of nucleic acid, consisting of a long, unbranched macromolecule formed from one, or more commonly, two, strands of linked deoxyribonucleotides."" [ISBN:0198506732]
20: ""OBSOLETE. Plays a role in regulating transcription initiation."" [GOC:curators]This term was obsoleted because it is essentially identical to a Process term (specifically the Biological Process term which has been selected as a term to consider for reannotation), i.e. it is defined only in terms of the process it acts in and it does NOT convey any information about the molecular nature of the function or whether the function is based on binding DNA, on interacting with other proteins, or some other mechanism. To transfer all annotations without review, the BP term indicated is considered to be equivalent and thus the only appropriate destination for all annotations. To reannotate to a MF term, you will probably need to revisit the original literature or other primary data because this ""MF"" term was not defined in terms of mechanism of action and there are multiple possibilities in the revised MF structure. In reannotation, please also consider descendent terms of the suggested MF terms as a more specific term may be more appropriate than the MF terms indicated. Please be aware that you may wish to request a new term if the mechanism of action of this gene product is not yet represented or if you are annotating for an RNAP different than one for which there is a specific suggested term. Also note that if there is no information about how the gene product acts, it may be appropriate to annotate to the root term for molecular_function.
21: ""The process in which DNA-dependent DNA replication is started; this involves the separation of a stretch of the DNA double helix, the recruitment of DNA polymerases and the initiation of polymerase action."" [ISBN:071673706X, ISBN:0815316194]
22: ""Catalysis of the reaction: deoxynucleoside triphosphate + DNA(n) = diphosphate + DNA(n+1); the synthesis of DNA from deoxyribonucleotide triphosphates in the presence of a nucleic acid template and a 3'hydroxyl group."" [EC:2.7.7.7, GOC:mah]
23: ""Catalysis of the reaction: deoxynucleoside triphosphate + DNA(n) = diphosphate + DNA(n+1); the synthesis of DNA from deoxyribonucleotide triphosphates in the presence of a DNA template and a 3'hydroxyl group."" [EC:2.7.7.7, GOC:vw, ISBN:0198547684]
24: ""Any process involved in the assembly of the RNA polymerase preinitiation complex (PIC) at the core promoter region of a DNA template, resulting in the subsequent synthesis of RNA from that promoter. The initiation phase includes PIC assembly and the formation of the first few bonds in the RNA chain, including abortive initiation, which occurs when the first few nucleotides are repeatedly synthesized and then released. The initiation phase ends just before and does not include promoter clearance, or release, which is the transition between the initiation and elongation phases of transcription."" [GOC:jid, GOC:txnOH, PMID:18280161]Note that promoter clearance is represented as a separate step, not part_of either initiation or elongation.
25: ""The division of a cell nucleus into two nuclei, with DNA and other nuclear contents distributed between the daughter nuclei."" [GOC:mah]
26: ""A cell cycle process comprising the steps by which the nucleus of a eukaryotic cell divides; the process involves condensation of chromosomal DNA into a highly compacted form. Canonically, mitosis produces two daughter nuclei whose chromosome complement is identical to that of the mother cell."" [GOC:dph, GOC:ma, GOC:mah, ISBN:0198547684]
27: ""A process that is part of the mitotic cell cycle."" [GO_REF:0000060, GOC:mtg_cell_cycle, GOC:TermGenie]
28: ""The DNA polymerase III holoenzyme is a complex that contains 10 different types of subunits. These subunits are organized into 3 functionally essential sub-assemblies: the pol III core, the beta sliding clamp processivity factor and the clamp-loading complex. The pol III core carries out the polymerase and the 3'-5' exonuclease proofreading activities. The polymerase is tethered to the template via the sliding clamp processivity factor. The clamp-loading complex assembles the beta processivity factor onto the primer template and plays a central role in the organization and communication at the replication fork."" [PMID:11525729, PMID:12940977, UniProt:P06710]
29: ""A protein complex that possesses DNA polymerase activity and is involved in template directed synthesis of DNA."" [GOC:jl, PMID:12045093]
30: ""The creation of two or more organelles by division of one organelle."" [GOC:jid]
31: ""The chemical reactions and pathways resulting in the formation of RNA, ribonucleic acid, one of the two main type of nucleic acid, consisting of a long, unbranched macromolecule formed from ribonucleotides joined in 3',5'-phosphodiester linkage. Includes polymerization of ribonucleotide monomers. Refers not only to transcription but also to e.g. viral RNA replication."" [GOC:mah, GOC:txnOH]Note that, in some cases, viral RNA replication and viral transcription from RNA actually refer to the same process, but may be called differently depending on the focus of a specific research study.
32: ""A process that results in the biosynthesis of constituent macromolecules, assembly, arrangement of constituent parts, or disassembly of a cell wall."" [GOC:mah]
33: ""The cellular process that ensures successive accurate and complete genome replication and chromosome segregation."" [GOC:isa_complete, GOC:mtg_cell_cycle]
34: ""Catalysis of the transfer of a nucleotidyl group to a reactant."" [ISBN:0198506732]
35: ""Catalysis of the reaction: GDP-L-fucose + beta-D-galactosyl-R = GDP + alpha-L-fucosyl-(1,2)-beta-D-galactosyl-R."" [EC:2.4.1.69]
36: ""Catalysis of the transfer of a fucosyl group to an acceptor molecule, typically another carbohydrate or a lipid."" [GOC:ai]
37: ""Catalysis of the transfer of an L-fucosyl group from GDP-beta-L-fucose to an acceptor molecule to form an alpha-(1->2) linkage."" [GOC:mah]
38: ""A cellular process that results in the biosynthesis of constituent macromolecules, assembly, and arrangement of constituent parts of a cell wall. Includes biosynthesis of constituent macromolecules, such as proteins and polysaccharides, and those macromolecular modifications that are involved in synthesis or assembly of the cellular component. A cell wall is the rigid or semi-rigid envelope lying outside the cell membrane of plant, fungal and most prokaryotic cells, maintaining their shape and protecting them from osmotic lysis."" [GOC:jl, GOC:mah, GOC:mtg_sensu, ISBN:0198506732]
39: ""Combining with oxytocin to initiate a change in cell activity."" [GOC:ai]
GO Term Desription
# Module 9
zmroot %>% filter(`MCL Cluster` == "MCL9")
Network MCL Cluster GO Term Num Genes common Num GO genes Num MCL Genes
1: ZmRoot MCL9 GO:0016706 3 9 204
2: ZmRoot MCL9 GO:0000003 5 64 204
3: ZmRoot MCL9 GO:0019953 5 64 204
4: ZmRoot MCL9 GO:0044703 5 64 204
5: ZmRoot MCL9 GO:0022414 5 66 204
6: ZmRoot MCL9 GO:0051213 3 28 204
7: ZmRoot MCL9 GO:0004470 2 11 204
pval
1: 4.53e-05
2: 1.90e-04
3: 1.90e-04
4: 1.90e-04
5: 2.20e-04
6: 1.57e-03
7: 3.57e-03
GO Term Name
1: oxidoreductase activity, acting on paired donors, with incorporation or reduction of molecular oxygen, 2-oxoglutarate as one donor, and incorporation of one atom each of oxygen into both donors
2: reproduction
3: sexual reproduction
4: multi-organism reproductive process
5: reproductive process
6: dioxygenase activity
7: malic enzyme activity
GO Term Desription
1: ""Catalysis of the reaction: A + 2-oxoglutarate + O2 = B + succinate + CO2. This is an oxidation-reduction (redox) reaction in which hydrogen or electrons are transferred from 2-oxoglutarate and one other donor, and one atom of oxygen is incorporated into each donor."" [GOC:mah]
2: ""The production of new individuals that contain some portion of genetic material inherited from one or more parent organisms."" [GOC:go_curators, GOC:isa_complete, GOC:jl, ISBN:0198506732]
3: ""A reproduction process that creates a new organism by combining the genetic material of two organisms. It occurs both in eukaryotes and prokaryotes: in multicellular eukaryotic organisms, an individual is created anew; in prokaryotes, the initial cell has additional or transformed genetic material. In a process called genetic recombination, genetic material (DNA) originating from two different individuals (parents) join up so that homologous sequences are aligned with each other, and this is followed by exchange of genetic information. After the new recombinant chromosome is formed, it is passed on to progeny."" [GOC:jl, http://en.wikipedia.org/wiki/Sexual_reproduction, ISBN:0387520546]Sexual reproduction may be seen as the regular alternation, in the life cycle of haplontic, diplontic and diplohaplontic organisms, of meiosis and fertilization which provides for the production offspring. In diplontic organisms there is a life cycle in which the products of meiosis behave directly as gametes, fusing to form a zygote from which the diploid, or sexually reproductive polyploid, adult organism will develop. In diplohaplontic organisms a haploid phase (gametophyte) exists in the life cycle between meiosis and fertilization (e.g. higher plants, many algae and Fungi); the products of meiosis are spores that develop as haploid individuals from which haploid gametes develop to form a diploid zygote; diplohaplontic organisms show an alternation of haploid and diploid generations. In haplontic organisms meiosis occurs in the zygote, giving rise to four haploid cells (e.g. many algae and protozoa), only the zygote is diploid and this may form a resistant spore, tiding organisms over hard times.
4: ""A biological process that directly contributes to the process of producing new individuals, involving another organism."" [GOC:jl]
5: ""A biological process that directly contributes to the process of producing new individuals by one or two organisms. The new individuals inherit some proportion of their genetic material from the parent or parents."" [GOC:dph, GOC:isa_complete]
6: ""Catalysis of an oxidation-reduction (redox) reaction in which both atoms of oxygen from one molecule of O2 are incorporated into the (reduced) product(s) of the reaction. The two atoms of oxygen may be distributed between two different products."" [DOI:10.1016/S0040-4020(03)00944-X, GOC:bf, http://www.onelook.com/]
7: ""Catalysis of the oxidative decarboxylation of malate with the concomitant production of pyruvate."" [ISBN:0198506732]
# Module 8
zmroot %>% filter(`MCL Cluster` == "MCL8")
Network MCL Cluster GO Term Num Genes common Num GO genes Num MCL Genes
1: ZmRoot MCL8 GO:0006869 7 60 206
2: ZmRoot MCL8 GO:0008289 7 102 206
3: ZmRoot MCL8 GO:0019439 4 35 206
4: ZmRoot MCL8 GO:1901361 4 43 206
5: ZmRoot MCL8 GO:0004367 2 6 206
6: ZmRoot MCL8 GO:0046434 2 7 206
7: ZmRoot MCL8 GO:0004966 2 9 206
8: ZmRoot MCL8 GO:0006559 2 10 206
9: ZmRoot MCL8 GO:0009074 2 10 206
10: ZmRoot MCL8 GO:1902222 2 10 206
11: ZmRoot MCL8 GO:0016841 2 11 206
pval
1: 6.86e-07
2: 2.44e-05
3: 2.03e-04
4: 4.54e-04
5: 1.02e-03
6: 1.42e-03
7: 2.41e-03
8: 3.00e-03
9: 3.00e-03
10: 3.00e-03
11: 3.64e-03
GO Term Name
1: lipid transport
2: lipid binding
3: aromatic compound catabolic process
4: organic cyclic compound catabolic process
5: glycerol-3-phosphate dehydrogenase [NAD+] activity
6: organophosphate catabolic process
7: galanin receptor activity
8: L-phenylalanine catabolic process
9: aromatic amino acid family catabolic process
10: erythrose 4-phosphate/phosphoenolpyruvate family amino acid catabolic process
11: ammonia-lyase activity
GO Term Desription
1: ""The directed movement of lipids into, out of or within a cell, or between cells, by means of some agent such as a transporter or pore. Lipids are compounds soluble in an organic solvent but not, or sparingly, in an aqueous solvent."" [ISBN:0198506732]
2: ""Interacting selectively and non-covalently with a lipid."" [GOC:ai]
3: ""The chemical reactions and pathways resulting in the breakdown of aromatic compounds, any substance containing an aromatic carbon ring."" [GOC:ai]
4: ""The chemical reactions and pathways resulting in the breakdown of organic cyclic compound."" [GOC:TermGenie]
5: ""Catalysis of the reaction: sn-glycerol 3-phosphate + NAD+ = glycerone phosphate + NADH + H+."" [EC:1.1.1.8, EC:1.1.1.94]
6: ""The chemical reactions and pathways resulting in the breakdown of organophosphates, any phosphate-containing organic compound."" [GOC:ai]
7: ""Combining with galanin to initiate a change in cell activity."" [GOC:ai]
8: ""The chemical reactions and pathways resulting in the breakdown of phenylalanine, 2-amino-3-phenylpropanoic acid."" [GOC:go_curators]
9: ""The chemical reactions and pathways resulting in the breakdown of aromatic amino acid family, amino acids with aromatic ring (phenylalanine, tyrosine, tryptophan)."" [GOC:go_curators]
10: ""The chemical reactions and pathways resulting in the breakdown of erythrose 4-phosphate/phosphoenolpyruvate family amino acid."" [GOC:pr, GOC:TermGenie]
11: ""Catalysis of the release of ammonia by the cleavage of a carbon-nitrogen bond or the reverse reaction with ammonia as a substrate."" [EC:4.3.-.-, GOC:krc]
ZmSAM Clusters - get clusters with >= 100 genes
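# keep only clusters with at least 100 genes
dat <- clusters %>% group_by(ZmSAM) %>% mutate(num_genes = n()) %>% filter(num_genes >= 100)
names <- unique(dat$ZmSAM)
gene_list <- list()
# pad every gene list to the size of cluster 0 (this assumes cluster 0 is the largest cluster)
max <- nrow(dat[dat$ZmSAM == 0,])
for(i in seq_along(names)) {
  df <- clusters %>% filter(ZmSAM == names[i])
  genes <- c(df$V1, rep(NA, max-nrow(df)))
  gene_list[[i]] <- genes
}
# one column per cluster, NA-padded to equal length
modules <- do.call(cbind, gene_list)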
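The padding step can be checked on a toy example. This is only a sketch: `toy_clusters` and its gene IDs are invented, and base R's `split()` and `lengths()` stand in for the loop, so it is not the code that produced the results on this page.

```r
# Toy stand-in for `clusters`: V1 holds gene IDs, ZmSAM holds cluster labels.
# All values here are hypothetical.
toy_clusters <- data.frame(
  V1    = paste0("gene", 1:6),
  ZmSAM = c(0, 0, 0, 1, 1, 2),
  stringsAsFactors = FALSE
)

# One character vector of gene IDs per cluster.
by_cluster <- split(toy_clusters$V1, toy_clusters$ZmSAM)

# Pad every vector with NA up to the size of the largest cluster,
# so the vectors can be bound into a rectangular matrix.
pad_to <- max(lengths(by_cluster))
padded <- lapply(by_cluster, function(g) c(g, rep(NA, pad_to - length(g))))
toy_modules <- do.call(cbind, padded)

dim(toy_modules)              # 3 rows (largest cluster) by 3 columns (clusters)
colSums(!is.na(toy_modules))  # genes per cluster: 3, 2, 1
```

Taking the pad length from `max(lengths(...))` avoids relying on any particular cluster being the largest.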
Run analysis and FDR correction
alltissuemodules = cluster_analysis_func(modules)
#combine all results into one data frame
bigdf <- do.call('rbind', alltissuemodules)
bigdf$fdr <- p.adjust(bigdf$pval, method='fdr') ##calculate an FDR
summary(bigdf$fdr) ##nothing is significant
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.9999 0.9999 0.9999 0.9999 0.9999 0.9999
kable(bigdf[which.min(bigdf[,3]),])
| | module | PC | pval | tissue | fdr |
|---|---|---|---|---|---|
| 184 | 2 | PC5 | 0.0033344 | LMAD26 | 0.9999109 |
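An FDR summary where every value is the same can look odd, but it follows from how the adjustment works. The sketch below uses made-up p-values (not the ones from this analysis) to show a single small p-value being pulled up to the common value when the remaining p-values sit at or above the uniform expectation.

```r
# Hypothetical p-values: one small value among 400 tests whose sorted
# p-values all sit at or above the uniform expectation i / n.
n <- 400
p <- c(0.0033, (2:n) / n)

# "fdr" is an alias for the Benjamini-Hochberg ("BH") method in p.adjust().
adj <- p.adjust(p, method = "fdr")

# BH scales the i-th smallest p-value by n / i, then takes a running
# minimum from the largest p-value down, capping at 1.
ranked  <- sort(p)
by_hand <- pmin(1, rev(cummin(rev(ranked * n / seq_len(n)))))

range(adj)                     # every adjusted value is (numerically) 1
all.equal(sort(adj), by_hand)  # the manual calculation agrees
```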
Look at lowest p-value clusters
t <- bigdf %>% group_by(PC, tissue) %>% mutate(bonferroni = p.adjust(pval, method='bonferroni')) %>% arrange(bonferroni)
kable(head(t))
| module | PC | pval | tissue | fdr | bonferroni |
|---|---|---|---|---|---|
| 2 | PC5 | 0.0033344 | LMAD26 | 0.9999109 | 0.0433477 |
| 7 | PC3 | 0.0315067 | LMAN26 | 0.9999109 | 0.4095866 |
| 10 | PC1 | 0.0320505 | Kern | 0.9999109 | 0.4166559 |
| 4 | PC5 | 0.0436062 | LMAD26 | 0.9999109 | 0.5668812 |
| 2 | PC1 | 0.0536048 | Kern | 0.9999109 | 0.6968619 |
| 5 | PC5 | 0.0560459 | LMAN26 | 0.9999109 | 0.7285967 |
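In the first row, 0.0433477 / 0.0033344 is about 13, which suggests 13 modules were tested per PC/tissue group. The sketch below reproduces that arithmetic on an invented table; only the first p-value comes from the output above, and base R's `ave()` is used in place of the `group_by()`/`mutate()` pipeline.

```r
# Hypothetical results table: two PC/tissue groups of 13 modules each.
# Only the first p-value is taken from the table above; the rest are made up.
toy <- data.frame(
  module = rep(1:13, times = 2),
  PC     = rep(c("PC5", "PC1"), each = 13),
  tissue = rep(c("LMAD26", "Kern"), each = 13),
  pval   = c(0.0033344, seq(0.10, 0.98, length.out = 12),
             seq(0.05, 0.95, length.out = 13))
)

# Bonferroni within each PC/tissue group: multiply by the group size, cap at 1.
group <- paste(toy$PC, toy$tissue)
toy$bonferroni <- ave(
  toy$pval, group,
  FUN = function(p) p.adjust(p, method = "bonferroni")
)

toy[order(toy$bonferroni)[1:3], ]
toy$bonferroni[1] / toy$pval[1]  # 13 (up to floating-point rounding)
```

Because the correction is applied within each group, it accounts only for the modules tested against one PC in one tissue, not for the full set of tests that the FDR above covers.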
Let’s look at the GO enrichment terms for the most significant cluster; only module 2 passes Bonferroni p < 0.05.
zmsam <- info %>% filter(Network == "ZmSAM")
# Module 2
zmsam %>% filter(`MCL Cluster` == "MCL2")
Network MCL Cluster GO Term Num Genes common Num GO genes Num MCL Genes
1: ZmSAM MCL2 GO:0033180 8 10 942
2: ZmSAM MCL2 GO:0004347 3 5 942
3: ZmSAM MCL2 GO:0030120 6 24 942
4: ZmSAM MCL2 GO:0000502 3 7 942
5: ZmSAM MCL2 GO:0006094 3 7 942
pval GO Term Name
1: 4.62e-09 proton-transporting V-type ATPase, V1 domain
2: 1.72e-03 glucose-6-phosphate isomerase activity
3: 1.93e-03 vesicle coat
4: 5.53e-03 proteasome complex
5: 5.53e-03 gluconeogenesis
GO Term Desription
1: ""A protein complex that forms part of a proton-transporting V-type ATPase and catalyzes ATP hydrolysis. The V1 complex consists of: (1) a globular headpiece with three alternating copies of subunits A and B that form a ring, (2) a central rotational stalk composed of single copies of subunits D and F, and (3) a peripheral stalk made of subunits C, E, G and H. Subunits A and B mediate the hydrolysis of ATP at three reaction sites associated with subunit A."" [GOC:mah, ISBN:0716743663, PMID:16449553]
2: ""Catalysis of the reaction: D-glucose 6-phosphate = D-fructose 6-phosphate."" [EC:5.3.1.9]
3: ""A membrane coat found on a coated vesicle."" [GOC:mah]
4: ""A large multisubunit complex which catalyzes protein degradation, found in eukaryotes, archaea and some bacteria. In eukaryotes, this complex consists of the barrel shaped proteasome core complex and one or two associated proteins or complexes that act in regulating entry into or exit from the core."" [GOC:rb, http://en.wikipedia.org/wiki/Proteasome]
5: ""The formation of glucose from noncarbohydrate precursors, such as pyruvate, amino acids and glycerol."" [MetaCyc:GLUCONEO-PWY]
ZmPAN Clusters - get clusters with >= 100 genes
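# keep only clusters with at least 100 genes
dat <- clusters %>% group_by(ZmPAN) %>% mutate(num_genes = n()) %>% filter(num_genes >= 100)
names <- unique(dat$ZmPAN)
gene_list <- list()
# pad every gene list to the size of cluster 0 (this assumes cluster 0 is the largest cluster)
max <- nrow(dat[dat$ZmPAN == 0,])
for(i in seq_along(names)) {
  df <- clusters %>% filter(ZmPAN == names[i])
  genes <- c(df$V1, rep(NA, max-nrow(df)))
  gene_list[[i]] <- genes
}
# one column per cluster, NA-padded to equal length
modules <- do.call(cbind, gene_list)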
Run analysis and FDR correction
alltissuemodules = cluster_analysis_func(modules)
#combine all results into one data frame
bigdf <- do.call('rbind', alltissuemodules)
bigdf$fdr <- p.adjust(bigdf$pval, method='fdr') ##calculate an FDR
summary(bigdf$fdr) ##nothing is significant
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.9992 0.9992 0.9992 0.9992 0.9992 0.9992
kable(bigdf[which.min(bigdf[,3]),])
| | module | PC | pval | tissue | fdr |
|---|---|---|---|---|---|
| 241 | 3 | PC5 | 0.0045783 | LMAD26 | 0.9991676 |
Look at lowest p-value clusters
t <- bigdf %>% group_by(PC, tissue) %>% mutate(bonferroni = p.adjust(pval, method='bonferroni')) %>% arrange(bonferroni)
kable(head(t))
| module | PC | pval | tissue | fdr | bonferroni |
|---|---|---|---|---|---|
| 3 | PC5 | 0.0045783 | LMAD26 | 0.9991676 | 0.0778318 |
| 2 | PC5 | 0.0090431 | GShoot | 0.9991676 | 0.1537334 |
| 1 | PC5 | 0.0092036 | LMAN26 | 0.9991676 | 0.1564620 |
| 13 | PC1 | 0.0129806 | Kern | 0.9991676 | 0.2206705 |
| 13 | PC5 | 0.0142503 | LMAD26 | 0.9991676 | 0.2422548 |
| 10 | PC2 | 0.0157494 | LMAN26 | 0.9991676 | 0.2677405 |
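A quick check of these rows against a 0.05 cutoff. The sketch copies the Bonferroni values by hand from the table above into a small data frame; `top` and `alpha` are names introduced here for illustration.

```r
# The six rows printed above, copied by hand.
top <- data.frame(
  module     = c(3, 2, 1, 13, 13, 10),
  PC         = c("PC5", "PC5", "PC5", "PC1", "PC5", "PC2"),
  tissue     = c("LMAD26", "GShoot", "LMAN26", "Kern", "LMAD26", "LMAN26"),
  bonferroni = c(0.0778318, 0.1537334, 0.1564620,
                 0.2206705, 0.2422548, 0.2677405)
)

alpha <- 0.05
passing <- top[top$bonferroni < alpha, ]

nrow(passing)                     # 0: none of these rows is below 0.05
top[which.min(top$bonferroni), ]  # module 3, PC5, LMAD26 has the lowest value
```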
No ZmPAN cluster passes Bonferroni p < 0.05 (the lowest is module 3 at 0.078), so let’s look at the GO enrichment terms for that lowest-p cluster.
zmpan <- info %>% filter(Network == "ZmPAN")
# Module 3
zmpan %>% filter(`MCL Cluster` == "MCL3")
Network MCL Cluster GO Term Num Genes common Num GO genes Num MCL Genes
1: ZmPAN MCL3 GO:0006414 20 37 793
2: ZmPAN MCL3 GO:0044391 27 84 793
3: ZmPAN MCL3 GO:0015935 17 49 793
4: ZmPAN MCL3 GO:0022613 9 12 793
5: ZmPAN MCL3 GO:0042254 9 12 793
6: ZmPAN MCL3 GO:0003746 12 25 793
7: ZmPAN MCL3 GO:0004298 12 27 793
8: ZmPAN MCL3 GO:0070003 12 27 793
9: ZmPAN MCL3 GO:0008135 20 90 793
10: ZmPAN MCL3 GO:0005839 12 29 793
11: ZmPAN MCL3 GO:0044085 9 27 793
12: ZmPAN MCL3 GO:0015934 10 36 793
13: ZmPAN MCL3 GO:0006334 20 154 793
14: ZmPAN MCL3 GO:0034728 20 154 793
15: ZmPAN MCL3 GO:0065004 20 154 793
16: ZmPAN MCL3 GO:0071824 20 154 793
17: ZmPAN MCL3 GO:0044429 18 141 793
18: ZmPAN MCL3 GO:0019843 5 10 793
19: ZmPAN MCL3 GO:0006626 4 6 793
20: ZmPAN MCL3 GO:0007007 4 6 793
21: ZmPAN MCL3 GO:0042719 4 6 793
22: ZmPAN MCL3 GO:0045039 4 6 793
23: ZmPAN MCL3 GO:0007005 4 7 793
24: ZmPAN MCL3 GO:0007006 4 7 793
25: ZmPAN MCL3 GO:0044743 4 7 793
26: ZmPAN MCL3 GO:0065002 4 7 793
27: ZmPAN MCL3 GO:0070585 4 7 793
28: ZmPAN MCL3 GO:0071806 4 7 793
29: ZmPAN MCL3 GO:0072655 4 7 793
30: ZmPAN MCL3 GO:0090151 4 7 793
31: ZmPAN MCL3 GO:1990542 4 7 793
32: ZmPAN MCL3 GO:0098798 6 20 793
33: ZmPAN MCL3 GO:0006325 20 194 793
34: ZmPAN MCL3 GO:0000786 17 151 793
35: ZmPAN MCL3 GO:0032993 17 151 793
36: ZmPAN MCL3 GO:0044815 17 151 793
37: ZmPAN MCL3 GO:0034622 23 245 793
38: ZmPAN MCL3 GO:0005853 4 8 793
39: ZmPAN MCL3 GO:0016272 4 10 793
40: ZmPAN MCL3 GO:0006413 8 48 793
41: ZmPAN MCL3 GO:0003743 8 50 793
42: ZmPAN MCL3 GO:0016986 5 20 793
43: ZmPAN MCL3 GO:0017038 4 13 793
44: ZmPAN MCL3 GO:0015929 3 7 793
45: ZmPAN MCL3 GO:0006465 3 8 793
Network MCL Cluster GO Term Num Genes common Num GO genes Num MCL Genes
pval GO Term Name
1: 8.91e-19 translational elongation
2: 1.30e-17 ribosomal subunit
3: 3.34e-12 small ribosomal subunit
4: 5.43e-11 ribonucleoprotein complex biogenesis
5: 5.43e-11 ribosome biogenesis
6: 5.61e-11 translation elongation factor activity
7: 1.74e-10 threonine-type endopeptidase activity
8: 1.74e-10 threonine-type peptidase activity
9: 3.77e-10 translation factor activity, RNA binding
10: 4.82e-10 proteasome core complex
11: 6.69e-07 cellular component biogenesis
12: 1.07e-06 large ribosomal subunit
13: 4.13e-06 nucleosome assembly
14: 4.13e-06 nucleosome organization
15: 4.13e-06 protein-DNA complex assembly
16: 4.13e-06 protein-DNA complex subunit organization
17: 1.57e-05 mitochondrial part
18: 2.27e-05 rRNA binding
19: 3.72e-05 protein targeting to mitochondrion
20: 3.72e-05 inner mitochondrial membrane organization
21: 3.72e-05 mitochondrial intermembrane space protein transporter complex
22: 3.72e-05 protein import into mitochondrial inner membrane
23: 8.41e-05 mitochondrion organization
24: 8.41e-05 mitochondrial membrane organization
25: 8.41e-05 intracellular protein transmembrane import
26: 8.41e-05 intracellular protein transmembrane transport
27: 8.41e-05 protein localization to mitochondrion
28: 8.41e-05 protein transmembrane transport
29: 8.41e-05 establishment of protein localization to mitochondrion
30: 8.41e-05 establishment of protein localization to mitochondrial membrane
31: 8.41e-05 mitochondrial transmembrane transport
32: 1.02e-04 mitochondrial protein complex
33: 1.21e-04 chromatin organization
34: 1.32e-04 nucleosome
35: 1.32e-04 protein-DNA complex
36: 1.32e-04 DNA packaging complex
37: 1.61e-04 cellular macromolecular complex assembly
38: 1.63e-04 eukaryotic translation elongation factor 1 complex
39: 4.58e-04 prefoldin complex
40: 6.18e-04 translational initiation
41: 8.19e-04 translation initiation factor activity
42: 9.95e-04 obsolete transcription initiation factor activity
43: 1.41e-03 protein import
44: 2.04e-03 hexosaminidase activity
45: 3.16e-03 signal peptide processing
pval GO Term Name
GO Term Desription
1: ""The successive addition of amino acid residues to a nascent polypeptide chain during protein biosynthesis."" [GOC:ems]
2: ""Either of the two subunits of a ribosome: the ribosomal large subunit or the ribosomal small subunit."" [GOC:jl]
3: ""The smaller of the two subunits of a ribosome."" [GOC:mah]
4: ""A cellular process that results in the biosynthesis of constituent macromolecules, assembly, and arrangement of constituent parts of a complex containing RNA and proteins. Includes the biosynthesis of the constituent RNA and protein molecules, and those macromolecular modifications that are involved in synthesis or assembly of the ribonucleoprotein complex."" [GOC:isa_complete, GOC:mah]
5: ""A cellular process that results in the biosynthesis of constituent macromolecules, assembly, and arrangement of constituent parts of ribosome subunits; includes transport to the sites of protein synthesis."" [GOC:ma]
6: ""Functions in chain elongation during polypeptide synthesis at the ribosome."" [ISBN:0198506732]
7: ""Catalysis of the hydrolysis of internal peptide bonds in a polypeptide chain by a mechanism in which the hydroxyl group of a threonine residue at the active center acts as a nucleophile."" [GOC:mah, http://merops.sanger.ac.uk/about/glossary.htm#CATTYPE, http://merops.sanger.ac.uk/about/glossary.htm#ENDOPEPTIDASE]
8: ""Catalysis of the hydrolysis of peptide bonds in a polypeptide chain by a mechanism in which the hydroxyl group of a threonine residue at the active center acts as a nucleophile."" [GOC:mah, http://merops.sanger.ac.uk/about/glossary.htm#CATTYPE]
9: ""Functions during translation by interacting selectively and non-covalently with RNA during polypeptide synthesis at the ribosome."" [GOC:ai, GOC:vw]
10: ""A multisubunit barrel shaped endoprotease complex, which is the core of the proteasome complex."" [GOC:rb, PMID:10806206]
11: ""A process that results in the biosynthesis of constituent macromolecules, assembly, and arrangement of constituent parts of a cellular component. Includes biosynthesis of constituent macromolecules, and those macromolecular modifications that are involved in synthesis or assembly of the cellular component."" [GOC:jl, GOC:mah]
12: ""The larger of the two subunits of a ribosome. Two sites on the ribosomal large subunit are involved in translation, namely the aminoacyl site (A site) and peptidyl site (P site)."" [ISBN:0198506732]
13: ""The aggregation, arrangement and bonding together of a nucleosome, the beadlike structural units of eukaryotic chromatin composed of histones and DNA."" [GOC:mah]
14: ""A process that is carried out at the cellular level which results in the assembly, arrangement of constituent parts, or disassembly of one or more nucleosomes."" [GOC:mah]
15: ""The aggregation, arrangement and bonding together of proteins and DNA molecules to form a protein-DNA complex."" [GOC:jl]
16: ""Any process in which macromolecules aggregate, disaggregate, or are modified, resulting in the formation, disassembly, or alteration of a protein-DNA complex."" [GOC:mah]
17: ""Any constituent part of a mitochondrion, a semiautonomous, self replicating organelle that occurs in varying numbers, shapes, and sizes in the cytoplasm of virtually all eukaryotic cells. It is notably the site of tissue respiration."" [GOC:jl]Note that this term is in the subset of terms that should not be used for direct gene product annotation. Instead, select a child term or, if no appropriate child term exists, please request a new term. Direct annotations to this term may be amended during annotation QC.
18: ""Interacting selectively and non-covalently with ribosomal RNA."" [GOC:jl]
19: ""The process of directing proteins towards and into the mitochondrion, usually mediated by mitochondrial proteins that recognize signals contained within the imported protein."" [GOC:mcc, ISBN:0716731363]
20: ""A process that is carried out at the cellular level which results in the assembly, arrangement of constituent parts, or disassembly of the mitochondrial inner membrane."" [GOC:ai, GOC:dph, GOC:jl, GOC:mah]See also the cellular component term 'mitochondrial inner membrane ; GO:0005743'.
21: ""Soluble complex of the mitochondrial intermembrane space composed of various combinations of small Tim proteins; acts as a protein transporter to guide proteins to the Tim22 complex for insertion into the mitochondrial inner membrane."" [PMID:12581629]
22: ""The process comprising the import of proteins into the mitochondrion from outside the organelle and their insertion into the mitochondrial inner membrane. The translocase of the outer membrane complex mediates the passage of these proteins across the outer membrane, after which they are guided by either of two inner membrane translocase complexes into their final destination in the inner membrane."" [GOC:mcc, GOC:vw, PMID:18672008]
23: ""A process that is carried out at the cellular level which results in the assembly, arrangement of constituent parts, or disassembly of a mitochondrion; includes mitochondrial morphogenesis and distribution, and replication of the mitochondrial genome as well as synthesis of new mitochondrial components."" [GOC:dph, GOC:jl, GOC:mah, GOC:sgd_curators, PMID:9786946]
24: ""A process that is carried out at the cellular level which results in the assembly, arrangement of constituent parts, or disassembly of a mitochondrial membrane, either of the lipid bilayer surrounding a mitochondrion."" [GOC:ai, GOC:dph, GOC:jl, GOC:mah]
25: ""The directed movement of proteins into an intracellular organelle, across a membrane."" [GOC:jl]
26: ""The directed movement of proteins in a cell, from one side of a membrane to another by means of some agent such as a transporter or pore."" [GOC:isa_complete]Note that this term is not intended for use in annotating lateral movement within membranes.
27: ""A process in which a protein is transported to, or maintained in, a location within the mitochondrion."" [GOC:ecd]
28: ""The directed movement of a protein across a membrane by means of some agent such as a transporter or pore."" [GOC:mah, GOC:vw]Note that this term is not intended for use in annotating lateral movement within membranes.
29: ""The directed movement of a protein to a specific location in the mitochondrion."" [GOC:mah]
30: ""The directed movement of a protein to a specific location in the mitochondrial membrane."" [GOC:ascb_2009, GOC:dph, GOC:tb]
31: ""The process in which a solute is transported from one side of a membrane to the other into, out of or within a mitochondrion."" [PMID:20533899]
32: ""A protein complex that is part of a mitochondion."" [GOC:dos]Note that this term is in the subset of terms that should not be used for direct gene product annotation. Instead, select a child term or, if no appropriate child term exists, please request a new term. Direct annotations to this term may be amended during annotation QC.
33: ""Any process that results in the specification, formation or maintenance of the physical structure of eukaryotic chromatin."" [GOC:mah]
34: ""A complex comprised of DNA wound around a multisubunit core and associated proteins, which forms the primary packing unit of DNA into higher order structures."" [GOC:elh]
35: ""A macromolecular complex containing both protein and DNA molecules."" [GOC:mah]Note that this term is intended to classify complexes that have DNA as one of the members of the complex, that is, the complex does not exist if DNA is not present. Protein complexes that interact with DNA e.g. transcription factor complexes should not be classified here.
36: ""A protein complex that plays a role in the process of DNA packaging."" [GOC:jl]
37: ""The aggregation, arrangement and bonding together of a set of macromolecules to form a complex, carried out at the cellular level."" [GOC:mah]
38: ""A multisubunit nucleotide exchange complex that binds GTP and aminoacyl-tRNAs, and catalyzes their codon-dependent placement at the A-site of the ribosome. In humans, the complex is composed of four subunits, alpha, beta, delta and gamma."" [GOC:jl, PMID:10216950]
39: ""A multisubunit chaperone that is capable of delivering unfolded proteins to cytosolic chaperonin, which it acts as a cofactor for. In humans, the complex is a heterohexamer of two PFD-alpha and four PFD-beta type subunits. In Saccharomyces cerevisiae, it also acts in the nucleus to regulate the rate of elongation by RNA polymerase II via a direct effect on histone dynamics."" [GOC:jl, PMID:17384227, PMID:24068951, PMID:9630229]
40: ""The process preceding formation of the peptide bond between the first two amino acids of a protein. This includes the formation of a complex of the ribosome, mRNA, and an initiation complex that contains the first aminoacyl-tRNA."" [ISBN:019879276X]
41: ""Functions in the initiation of ribosome-mediated translation of mRNA into a polypeptide."" [ISBN:0198506732]
42: ""OBSOLETE. Plays a role in regulating transcription initiation."" [GOC:curators]This term was obsoleted because it is essentially identical to a Process term (specifically the Biological Process term which has been selected as a term to consider for reannotation), i.e. it is defined only in terms of the process it acts in and it does NOT convey any information about the molecular nature of the function or whether the function is based on binding DNA, on interacting with other proteins, or some other mechanism. To transfer all annotations without review, the BP term indicated is considered to be equivalent and thus the only appropriate destination for all annotations. To reannotate to a MF term, you will probably need to revisit the original literature or other primary data because this ""MF"" term was not defined in terms of mechanism of action and there are multiple possibilities in the revised MF structure. In reannotation, please also consider descendent terms of the suggested MF terms as a more specific term may be more appropriate than the MF terms indicated. Please be aware that you may wish to request a new term if the mechanism of action of this gene product is not yet represented or if you are annotating for an RNAP different than one for which there is a specific suggested term. Also note that if there is no information about how the gene product acts, it may be appropriate to annotate to the root term for molecular_function.
43: ""The targeting and directed movement of proteins into a cell or organelle. Not all import involves an initial targeting event."" [GOC:ai]
44: ""Catalysis of the cleavage of hexosamine or N-acetylhexosamine residues (e.g. N-acetylglucosamine) residues from gangliosides or other glycoside oligosaccharides."" [http://www.onelook.com/, ISBN:0721662544]
45: ""The proteolytic removal of a signal peptide from a protein during or after transport to a specific location in the cell."" [GOC:mah, ISBN:0815316194]
GO Term Desription
sessionInfo()
R version 3.6.2 (2019-12-12)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.6
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] knitr_1.29 data.table_1.12.8 dplyr_1.0.0 quaint_0.0.0.9000
[5] ggpubr_0.3.0 ggplot2_3.3.2 reshape2_1.4.4 workflowr_1.6.2
loaded via a namespace (and not attached):
[1] tidyselect_1.1.0 xfun_0.15 purrr_0.3.4 haven_2.3.1
[5] lattice_0.20-41 carData_3.0-4 colorspace_1.4-1 vctrs_0.3.1
[9] generics_0.0.2 htmltools_0.5.0 yaml_2.2.1 rlang_0.4.6
[13] later_1.1.0.1 pillar_1.4.4 foreign_0.8-72 glue_1.4.1
[17] withr_2.2.0 readxl_1.3.1 lifecycle_0.2.0 plyr_1.8.6
[21] stringr_1.4.0 cellranger_1.1.0 munsell_0.5.0 ggsignif_0.6.0
[25] gtable_0.3.0 zip_2.0.4 evaluate_0.14 rio_0.5.16
[29] forcats_0.5.0 httpuv_1.5.4 curl_4.3 highr_0.8
[33] broom_0.5.6 Rcpp_1.0.4.6 promises_1.1.1 scales_1.1.1
[37] backports_1.1.8 abind_1.4-5 fs_1.4.1 hms_0.5.3
[41] digest_0.6.25 stringi_1.4.6 openxlsx_4.1.5 rstatix_0.6.0
[45] grid_3.6.2 rprojroot_1.3-2 tools_3.6.2 magrittr_1.5
[49] tibble_3.0.1 crayon_1.3.4 whisker_0.4 tidyr_1.1.0
[53] car_3.0-8 pkgconfig_2.0.3 ellipsis_0.3.1 rmarkdown_2.3
[57] R6_2.4.1 nlme_3.1-148 git2r_0.27.1 compiler_3.6.2