Commit

Merge pull request #5 from kbroman/main
Add MUGA maps + cross2_to_grcm39()
kbroman authored Oct 13, 2021
2 parents 33cfa10 + 2d02282 commit 3ec082f
Showing 19 changed files with 379 additions and 62 deletions.
2 changes: 2 additions & 0 deletions .Rbuildignore
@@ -3,3 +3,5 @@ Makefile$
\.Rout$
^\.github$
^README\.Rmd$
\.csv$
^\.here$
66 changes: 20 additions & 46 deletions .github/workflows/R-CMD-check.yaml
@@ -1,14 +1,10 @@
# For help debugging build failures open an issue on the RStudio community with the 'github-actions' tag.
# https://community.rstudio.com/new-topic?category=Package%20development&tags=github-actions
# Workflow derived from https://github.com/r-lib/actions/tree/master/examples
# Need help debugging build failures? Start at https://github.com/r-lib/actions#where-to-find-help
on:
push:
branches:
- main
- master
branches: [main, master]
pull_request:
branches:
- main
- master
branches: [main, master]

name: R-CMD-check

@@ -22,59 +18,37 @@ jobs:
fail-fast: false
matrix:
config:
- {os: macOS-latest, r: 'release'}
- {os: windows-latest, r: 'release'}
- {os: macOS-latest, r: 'release'}
- {os: ubuntu-20.04, r: 'release', rspm: "https://packagemanager.rstudio.com/cran/__linux__/focal/latest"}
- {os: ubuntu-20.04, r: 'devel', rspm: "https://packagemanager.rstudio.com/cran/__linux__/focal/latest"}
- {os: ubuntu-latest, r: 'devel', http-user-agent: 'release'}
- {os: ubuntu-latest, r: 'release'}
- {os: ubuntu-latest, r: 'oldrel-1'}

env:
R_REMOTES_NO_ERRORS_FROM_WARNINGS: true
RSPM: ${{ matrix.config.rspm }}
GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
R_KEEP_PKG_SOURCE: yes

steps:
- uses: actions/checkout@v2

- uses: r-lib/actions/setup-pandoc@v1

- uses: r-lib/actions/setup-r@v1
with:
r-version: ${{ matrix.config.r }}
http-user-agent: ${{ matrix.config.http-user-agent }}
use-public-rspm: true

- uses: r-lib/actions/setup-pandoc@v1

- name: Query dependencies
run: |
install.packages('remotes')
saveRDS(remotes::dev_package_deps(dependencies = TRUE), ".github/depends.Rds", version = 2)
writeLines(sprintf("R-%i.%i", getRversion()$major, getRversion()$minor), ".github/R-version")
shell: Rscript {0}

- name: Cache R packages
if: runner.os != 'Windows'
uses: actions/cache@v2
- uses: r-lib/actions/setup-r-dependencies@v1
with:
path: ${{ env.R_LIBS_USER }}
key: ${{ runner.os }}-${{ hashFiles('.github/R-version') }}-1-${{ hashFiles('.github/depends.Rds') }}
restore-keys: ${{ runner.os }}-${{ hashFiles('.github/R-version') }}-1-

- name: Install system dependencies
if: runner.os == 'Linux'
run: |
while read -r cmd
do
eval sudo $cmd
done < <(Rscript -e 'writeLines(remotes::system_requirements("ubuntu", "20.04"))')
extra-packages: rcmdcheck

- name: Install dependencies
run: |
remotes::install_deps(dependencies = TRUE)
remotes::install_cran("rcmdcheck")
shell: Rscript {0}
- uses: r-lib/actions/check-r-package@v1

- name: Check
env:
_R_CHECK_CRAN_INCOMING_REMOTE_: false
run: rcmdcheck::rcmdcheck(args = c("--no-manual", "--as-cran"), error_on = "warning", check_dir = "check")
shell: Rscript {0}
- name: Show testthat output
if: always()
run: find check -name 'testthat.Rout*' -exec cat '{}' \; || true
shell: bash

- name: Upload check results
if: failure()
1 change: 1 addition & 0 deletions .gitignore
@@ -0,0 +1 @@
inst/scripts/*.csv
Empty file added .here
Empty file.
13 changes: 6 additions & 7 deletions DESCRIPTION
@@ -1,22 +1,21 @@
Package: mmconvert
Version: 0.1-5
Date: 2021-09-28
Version: 0.2-3
Date: 2021-10-12
Title: Mouse Map Converter
Description: Function to convert mouse genome positions between the build 39 physical map and the Cox genetic map <doi:10.1534/genetics.109.105486>.
Author: Karl W Broman [aut, cre] (<https://orcid.org/0000-0002-4914-6671>)
Maintainer: Karl W Broman <[email protected]>
Authors@R: person("Karl W", "Broman", role=c("aut", "cre"),
email="[email protected]", comment=c(ORCID = "0000-0002-4914-6671"))
email="[email protected]", comment=c(ORCID = "0000-0002-4914-6671"))
Depends:
R (>= 3.5.0)
Imports:
Rcpp (>= 0.12.12),
utils,
stats
Rcpp (>= 0.12.12)
Suggests:
testthat,
devtools,
roxygen2
roxygen2,
qtl2
License: GPL-3
URL: https://github.com/rqtl/mmconvert
BugReports: https://github.com/rqtl/mmconvert/issues
5 changes: 4 additions & 1 deletion Makefile
@@ -1,6 +1,6 @@
.PHONY: doc test all

all: doc README.md
all: doc README.md data/MUGAmaps.RData

README.md: README.Rmd
R -e "knitr::knit('$<')"
@@ -12,3 +12,6 @@ doc:
# run tests
test:
R -e 'devtools::test()'

data/MUGAmaps.RData: inst/scripts/grab_muga_array_annot.R
R -e "source('$<')"
1 change: 1 addition & 0 deletions NAMESPACE
@@ -1,5 +1,6 @@
# Generated by roxygen2: do not edit by hand

export(cross2_to_grcm39)
export(mmconvert)
importFrom(Rcpp,sourceCpp)
useDynLib(mmconvert, .registration=TRUE)
11 changes: 11 additions & 0 deletions NEWS.md
@@ -1,3 +1,14 @@
## mmconvert 0.2-3 (2021-10-12)

- Added a dataset with the MUGA array annotations for markers on the
autosomes or X chromosome, with mouse build GRCm39 positions and
the revised Cox Map genetic map locations.

- Added function `cross2_to_grcm39()` for converting an R/qtl2 cross2
object to use the new GRCm39 mouse build and the revised Cox genetic
map.


## mmconvert 0.1-5 (2021-09-28)

- New package
26 changes: 26 additions & 0 deletions R/MUGAmaps-data.R
@@ -0,0 +1,26 @@
#' @name MUGAmaps
#' @aliases MUGAmaps
#'
#' @title Array annotation information for the mouse MUGA arrays
#' in mouse genome build 39.
#'
#' @description A list of four data frames with annotation information for the four MUGA arrays,
#' GigaMUGA ("gm"), MegaMUGA ("mm"), MiniMUGA ("mini") and the original MUGA ("muga").
#' Each has columns marker, chromosome, build 39 basepair position, and sex-averaged cM position (in Cox Map v3).
#'
#' @details
#' SNP probes for the MUGA arrays were blasted against mouse genome
#' build GRCm39 and locations interpolated using the revised Cox maps.
#' See <https://github.com/kbroman/MUGAarrays> for the array
#' annotations and <https://github.com/kbroman/CoxMapV3> for the
#' genetic maps. Note that for the genetic map locations, markers were
#' shifted so that 0 cM corresponds to 3 Mbp, using the chromosome-
#' and sex-specific recombination rate.
#'
#' @source <https://github.com/kbroman/MUGAarrays>
#'
#' @keywords datasets
#'
#' @examples
#' data(MUGAmaps)
NULL
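
As a rough illustration of the structure documented above, the sketch below builds a toy stand-in for `MUGAmaps`. The column names `marker`, `chr`, `bp_grcm39`, and `cM_cox` are taken from how `cross2_to_grcm39()` reads the dataset in this commit. The helper `make_toy()`, the marker names, and the positions are invented for illustration, and the real data frames may carry additional columns.

```r
# Toy stand-in for mmconvert::MUGAmaps (hypothetical markers and positions).
make_toy <- function(prefix) {
  data.frame(marker    = paste0(prefix, "_toy", 1:2),
             chr       = c("1", "X"),
             bp_grcm39 = c(3500000, 8200000),
             cM_cox    = c(0.25, 2.1),
             stringsAsFactors = FALSE)
}

# One data frame per array, keyed by the same short names the package uses.
toy_MUGAmaps <- list(gm   = make_toy("gm"),
                     mm   = make_toy("mm"),
                     mini = make_toy("mini"),
                     muga = make_toy("muga"))

# Pick one array and convert basepairs to Mbp, as cross2_to_grcm39() does.
map <- toy_MUGAmaps[["gm"]]
map$Mbp_grcm39 <- map$bp_grcm39 / 1e6
print(map)
```

With the package installed, `data(MUGAmaps)` followed by `head(MUGAmaps$gm)` should show the real annotations in place of the toy values.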
111 changes: 111 additions & 0 deletions R/cross2_to_grcm39.R
@@ -0,0 +1,111 @@
#' Convert a cross2 object to use mouse build GRCm39
#'
#' Convert a cross2 object (with genotypes from one of the MUGA
#' arrays) to use mouse build GRCm39 and the revised Cox map
#' positions, revising marker order and omitting markers that are not
#' found.
#'
#' @param cross Object of class `"cross2"`, as produced by
#' [qtl2::read_cross2()]. Must have markers from just one of the MUGA arrays.
#'
#' @param array Character string indicating which of the MUGA arrays
#' was used ("gm" for GigaMUGA, "mm" for MegaMUGA, "mini" for
#' MiniMUGA, or "muga" for the original MUGA), or "guess" (the
#' default) to pick the array with the most matching marker names.
#'
#' @export
#'
#' @return The input `cross` object with markers subset to those in build GRCm39
#' and with `pmap` and `gmap` replaced with the GRCm39 physical map and
#' revised Cox genetic map, respectively.
#'
#' @seealso [MUGAmaps]
#'
#' @examples
#' \dontrun{
#' file <- paste0("https://raw.githubusercontent.com/rqtl/",
#' "qtl2data/master/DOex/DOex.zip")
#' DOex <- read_cross2(file)
#' DOex_rev <- cross2_to_grcm39(DOex)
#' }

cross2_to_grcm39 <-
function(cross, array=c("guess", "gm", "mm", "mini", "muga"))
{
# check that it's cross2
if(!inherits(cross, "cross2")) stop('Input cross must have class "cross2"')

# markers in the cross object
markers <- unlist(lapply(cross$geno, colnames))

# MUGA maps (internal dataset)
muga_maps <- mmconvert::MUGAmaps

array <- match.arg(array)
if(array == "guess") {
# compare marker names to the four MUGA arrays
# use the one with the most matches

n_match <- vapply(muga_maps, function(a,b) sum(b %in% a$marker), 0, markers)
if(!any(n_match > 0)) {
stop("No markers found in the MUGA arrays")
}

array <- names(n_match)[which.max(n_match)]
}

map <- muga_maps[[array]]

# number of markers found
n_found <- sum(markers %in% map$marker)

# if no markers found, stop with an error
if(n_found == 0) {
stop("No markers found in MUGA array")
}

n_notfound <- length(markers) - n_found

# if 5% or more of markers omitted, give a warning
if(n_notfound / length(markers) >= 0.05) {
warning("Omitting ", n_notfound, " (", round(n_notfound/length(markers)*100), "%) markers")
} else if(n_notfound > 0) {
message("Omitting ", n_notfound, " markers")
}

map <- map[map$marker %in% markers, , drop=FALSE]

gmap <- map_df_to_list(map, "chr", "cM_cox", "marker")
map$Mbp_grcm39 <- map$bp_grcm39/1e6
pmap <- map_df_to_list(map, "chr", "Mbp_grcm39", "marker")

# reorder markers in geno
cross$geno <- cross$geno[names(pmap)]
for(chr in names(pmap)) {
g <- cross$geno[[chr]]
g <- g[, names(pmap[[chr]]), drop=FALSE]
cross$geno[[chr]] <- g
}


# reorder markers in founder_geno
if("founder_geno" %in% names(cross)) {
cross$founder_geno <- cross$founder_geno[names(pmap)]
for(chr in names(pmap)) {
fg <- cross$founder_geno[[chr]]
fg <- fg[, names(pmap[[chr]]), drop=FALSE]
cross$founder_geno[[chr]] <- fg
}
}

# paste in new genetic map
cross$gmap <- gmap

# paste in new physical map
cross$pmap <- pmap

# make sure that is_x_chr gets subset, if necessary
cross$is_x_chr <- cross$is_x_chr[names(pmap)]

cross
}
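
The array-guessing and omission-reporting steps in `cross2_to_grcm39()` can be tried in isolation. The following is a minimal sketch on toy data, not the package's own code path: `guess_array()` and `omission_level()` are hypothetical helper names introduced here, and the marker names are invented.

```r
# Hypothetical helper: pick the array whose annotation table
# matches the largest number of marker names.
guess_array <- function(markers, maps) {
  n_match <- vapply(maps, function(a, b) sum(b %in% a$marker), 0, markers)
  if (!any(n_match > 0)) stop("No markers found in the MUGA arrays")
  names(n_match)[which.max(n_match)]
}

# Hypothetical helper: classify how omitted markers would be reported.
# "warning" when at least 5% are missing, "message" when some are
# missing but fewer than 5%, and "none" when all markers are found.
omission_level <- function(n_notfound, n_total) {
  if (n_notfound / n_total >= 0.05) {
    "warning"
  } else if (n_notfound > 0) {
    "message"
  } else {
    "none"
  }
}

# Toy annotation tables (invented marker names).
toy_maps <- list(gm   = data.frame(marker = c("g1", "g2", "g3")),
                 mm   = data.frame(marker = c("m1", "m2", "g1")),
                 mini = data.frame(marker = "n1"),
                 muga = data.frame(marker = c("u1", "u2")))

markers <- c("g1", "g2", "x9")
array <- guess_array(markers, toy_maps)
n_notfound <- sum(!(markers %in% toy_maps[[array]]$marker))
level <- omission_level(n_notfound, length(markers))
```

Because `which.max()` returns the first maximum, a tie between arrays resolves to whichever comes first in the list, so passing `array` explicitly is the safer choice when marker names overlap heavily between arrays.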
13 changes: 13 additions & 0 deletions R/test_util.R
@@ -0,0 +1,13 @@
# used to skip tests when not on *my* computer in interactive mode
# needed for testing multi-core code, which isn't allowed within R CMD check
isnt_karl <-
function()
{
!(interactive() &&
identical(Sys.getenv("KARL_LOCAL"), "true"))
}

# is a number? (from assertthat)
is_number <- function(x) is.numeric(x) && length(x)==1
is_nonneg_number <- function(x) is_number(x) && x >= 0
is_pos_number <- function(x) is_number(x) && x > 0
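
A short sketch of how the numeric helpers behave on a few inputs. The definitions are copied from the diff so the block runs on its own, and the `checks` list is only an illustration.

```r
# Helpers copied from R/test_util.R in this commit.
is_number <- function(x) is.numeric(x) && length(x)==1
is_nonneg_number <- function(x) is_number(x) && x >= 0
is_pos_number <- function(x) is_number(x) && x > 0

checks <- list(
  scalar      = is_number(3),          # single numeric value
  integer     = is_number(3L),         # integers count as numeric
  vector      = is_number(c(1, 2)),    # length > 1 is rejected
  character   = is_number("a"),        # non-numeric is rejected
  zero_nonneg = is_nonneg_number(0),   # zero is non-negative
  zero_pos    = is_pos_number(0),      # zero is not positive
  na_pos      = is_pos_number(NA_real_) # NA passes is_number(), so this is NA
)
str(checks)
```

The last case is worth keeping in mind: a missing value is still a length-one numeric, so the comparison yields `NA` rather than `FALSE`, and a caller using the result in `if()` would need to guard against it. For `isnt_karl()`, a typical use (an assumption here, since the tests are not shown in this diff) would be something like `testthat::skip_if(isnt_karl(), "this test only run locally")` at the top of a test.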
33 changes: 29 additions & 4 deletions README.Rmd
@@ -23,8 +23,12 @@ Install the mmconvert package from

### Usage

[mmconvert](https://github.com/rqtl/mmconvert) contains a single
function `mmconvert()`. It takes a set of positions as input, plus an
[mmconvert](https://github.com/rqtl/mmconvert) contains two functions:
`mmconvert()` and `cross2_to_grcm39()`.

#### `mmconvert()`

`mmconvert()` takes a set of positions as input, plus an
indication of whether they are basepairs or Mbp (in build 39) or
sex-averaged, female, or male cM (from the [revised Cox genetic
map](https://github.com/kbroman/CoxMapV3)).
@@ -47,8 +51,8 @@ third column or included as row names.

```{r input_df}
input_df <- data.frame(chr=c(14,14,14),
pos=c(6738536, 67215850, 121955310),
marker=c("rs13482072", "rs13482231", "gnf14.117.278"))
pos=c(6738536, 67215850, 121955310),
marker=c("rs13482072", "rs13482231", "gnf14.117.278"))
```

For either of these cases, the output is a data frame with seven
@@ -73,6 +77,27 @@ But note that the bp or Mbp positions must be in mouse genome build
39, and cM positions must be according to the
[Cox Map V3](https://github.com/kbroman/CoxMapV3).

#### `cross2_to_grcm39()`

`cross2_to_grcm39()` takes a cross2 object from
[R/qtl2](https://kbroman.org/qtl2/) with mouse genotype data from one
of the MUGA arrays and converts it to mouse genome build GRCm39, by
possibly subsetting the markers, reordering them according to the
GRCm39 build, and plugging in GRCm39 Mbp positions and the revised Cox
genetic map. See <https://github.com/kbroman/MUGAarrays> for the
MUGA array annotations and <https://github.com/kbroman/CoxMapV3> for
the revised Cox genetic map.

```{r cross2_to_grcm39, eval=FALSE}
file <- paste0("https://raw.githubusercontent.com/rqtl/",
"qtl2data/master/DOex/DOex.zip")
library(qtl2)
DOex <- read_cross2(file)
DOex_rev <- cross2_to_grcm39(DOex)
```

---

#### License