Warning in as.POSIXlt.POSIXct(Sys.time()): unknown timezone 'zone/tz/2018c.
1.0/zoneinfo/America/Chicago'
Last updated: 2018-04-22
Code version: b535190
We preprocessed the data to remove technical biases due to PCR amplification and sequencing run. The analysis proceeds in two sequential steps:
Normalization: apply Cumulative Sum Scaling as described in Paulson et al, 2015.
Batch correction: fit a linear model to each OTU at the speciees level, then estimate and remove batch effect due to sequencin run for each OTU.
After aggregating reads at the species level, we are left with 553 features in 197 samples.
\(~\)
library(knitr)
library(kableExtra)
library(dplyr)
library(metagenomeSeq)
library(CountClust)
Read in filtered data.
MRobj = readRDS("../data/nasal_filtered.rds")
MRobj
MRexperiment (storageMode: environment)
assayData: 4423 features, 197 samples
element names: counts
protocolData: none
phenoData
sampleNames: EM0042 EM0047 ... E0168 (197 total)
varLabels: StudyID age ... infnone (94 total)
varMetadata: labelDescription
featureData
featureNames: denovo6 denovo17 ... denovo42992 (4423 total)
fvarLabels: Kingdom Phylum ... Species (7 total)
fvarMetadata: labelDescription
experimentData: use 'experimentData(object)'
Annotation:
\(~\)
We calculate the normalization scaling factors using cumulative sum scaling, see Paulson et al for more information.
MRobj = cumNorm(MRobj)
\(~\)
Batch correction for sequencing run effect is performed for each taxonomy level. Here, we focus on species level analysis and perform batch correcdtion on the raw counts at teh species level. Details for batch correction are described in the manuscript.
obj <- aggTax(MRobj,lvl="Species")
obj@expSummary$expSummary$normFactors <- normFactors(MRobj)
counts <- t(MRcounts(obj,norm=FALSE,log=FALSE))
batchcorrect_counts <- BatchCorrectedCounts(counts,
batch_lab = pData(MRobj)$seqrun,
use_parallel=FALSE)
counts = t(batchcorrect_counts)
assayData(obj)[["counts"]] = counts
obj
Save normalized, batch corrected counts.
saveRDS(obj, file = "../data/nasal_filtered_normed_batchcorrected.rds")
\(~\)
R version 3.4.1 (2017-06-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] CountClust_1.4.1 ggplot2_2.2.1 metagenomeSeq_1.20.1
[4] RColorBrewer_1.1-2 glmnet_2.0-13 foreach_1.4.4
[7] Matrix_1.2-12 limma_3.34.9 Biobase_2.38.0
[10] BiocGenerics_0.24.0 dplyr_0.7.4 kableExtra_0.7.0
[13] knitr_1.20
loaded via a namespace (and not attached):
[1] maptpx_1.9-3 Rcpp_0.12.15 ape_5.0
[4] lattice_0.20-35 gtools_3.5.0 assertthat_0.2.0
[7] rprojroot_1.3-2 digest_0.6.15 slam_0.1-42
[10] R6_2.2.2 plyr_1.8.4 backports_1.1.2
[13] stats4_3.4.1 evaluate_0.10.1 httr_1.3.1
[16] pillar_1.2.1 gplots_3.0.1 rlang_0.2.0
[19] lazyeval_0.2.1 gdata_2.18.0 vegan_2.4-6
[22] rmarkdown_1.9 readr_1.1.1 stringr_1.3.0
[25] munsell_0.4.3 compiler_3.4.1 pkgconfig_2.0.1
[28] SQUAREM_2017.10-1 mgcv_1.8-23 htmltools_0.3.6
[31] nnet_7.3-12 tibble_1.4.2 codetools_0.2-15
[34] matrixStats_0.53.1 permute_0.9-4 viridisLite_0.3.0
[37] MASS_7.3-49 bitops_1.0-6 grid_3.4.1
[40] nlme_3.1-131.1 gtable_0.2.0 git2r_0.21.0
[43] magrittr_1.5 scales_0.5.0 KernSmooth_2.23-15
[46] stringi_1.1.6 reshape2_1.4.3 flexmix_2.3-14
[49] bindrcpp_0.2 xml2_1.2.0 boot_1.3-20
[52] cowplot_0.9.2 iterators_1.0.9 tools_3.4.1
[55] picante_1.6-2 glue_1.2.0 hms_0.4.1
[58] yaml_2.1.18 colorspace_1.3-2 cluster_2.0.6
[61] caTools_1.17.1 rvest_0.3.2 bindr_0.1
[64] modeltools_0.2-21
This R Markdown site was created with workflowr