This function quantifies proteoform ions in a mass spectrometric (MS) data file, which comprises several mass spectra recorded at different time points during a chromatographic run. To this end, quantify_ions() employs a two-step integration approach: First, each of the \(i\) ions is quantified in each of the \(n\) mass spectra by integrating intensity in the m/z tolerance window associated with the respective ion. This step yields an \(i \times n\) matrix of abundances (areas under the curve), with one row per ion and one column per scan. Rows in this matrix correspond to extracted ion currents (XICs). Second, each XIC is integrated within the \(r\) given retention time limits. This step yields an \(i \times r\) matrix of abundances, with one row per ion and one column per retention time window.
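The two-step scheme described above can be sketched in base R on synthetic data. This is only an illustration under stated assumptions, not the package's implementation: the helper trapz_window() and all data values are hypothetical, every scan is assumed to share one m/z axis, and both steps use trapezoidal integration for simplicity.

```r
# Hypothetical helper (not part of fragquaxi): trapezoidal area under the
# polygonal chain (x, y), restricted to the window [lower, upper].
trapz_window <- function(x, y, lower, upper) {
  lower <- max(lower, min(x))            # clamp window to the data range
  upper <- min(upper, max(x))
  f <- stats::approxfun(x, y)            # linear interpolation at window edges
  inside <- x > lower & x < upper
  xs <- c(lower, x[inside], upper)
  ys <- c(f(lower), y[inside], f(upper))
  sum(diff(xs) * (head(ys, -1) + tail(ys, -1)) / 2)
}

# Synthetic data: n = 5 scans sharing one m/z axis (an assumption)
mz      <- seq(100, 110, by = 0.5)
rt      <- c(0, 10, 20, 30, 40)
peak    <- pmax(0, 1 - abs(mz - 105))    # triangular peak with unit area
spectra <- lapply(c(0, 1, 2, 1, 0), function(s) s * peak)

ions      <- data.frame(mz_min = c(104, 107), mz_max = c(106, 108))  # i = 2
rt_limits <- data.frame(rt_min = c(0, 10),    rt_max = c(40, 30))    # r = 2

# Step 1: integrate each spectrum within each m/z window -> i x n matrix (XICs)
xics <- matrix(0, nrow = nrow(ions), ncol = length(spectra))
for (k in seq_len(nrow(ions))) {
  for (s in seq_along(spectra)) {
    xics[k, s] <- trapz_window(mz, spectra[[s]], ions$mz_min[k], ions$mz_max[k])
  }
}

# Step 2: integrate each XIC within each retention time window -> i x r matrix
abundances <- matrix(0, nrow = nrow(ions), ncol = nrow(rt_limits))
for (k in seq_len(nrow(ions))) {
  for (w in seq_len(nrow(rt_limits))) {
    abundances[k, w] <- trapz_window(
      rt, xics[k, ], rt_limits$rt_min[w], rt_limits$rt_max[w]
    )
  }
}

dim(xics)        # i x n
dim(abundances)  # i x r
```

In this toy data set, only the first ion overlaps the peak, so the second row of both matrices contains zeros.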

quantify_ions(
  ms_data,
  ions,
  rt_limits = c(-Inf, +Inf),
  filter_ms1 = TRUE,
  ifun_spectrum = c("adaptive", "trapezoidal"),
  ifun_xic = c("trapezoidal", "adaptive")
)

Arguments

ms_data

Mass spectrometric data stored in an mzR object as returned by mzR::openMSfile().

ions

A data frame specifying ions (see details).

rt_limits

A specification of retention time limits (see details).

filter_ms1

If true, only scans from MS level 1 are integrated.

ifun_spectrum

Integration method for spectra (see details).

ifun_xic

Integration method for XICs (see details).

Value

A list of class quaxi with seven elements:

ms_file

Path of the MS data file.

ifuns

Character vector denoting the integration methods used for spectra and XICs, respectively.

ions

Data frame passed to the argument ions describing \(i\) ions, with an additional column ion_id prepended (this column contains a unique identifier for each ion).

rt_limits

Data frame with \(r\) retention time limits (columns rt_min and rt_max) and scan indices enclosed by these limits (columns scan_min and scan_max).

rt

Numeric vector containing \(n\) retention times associated with the \(n\) scans in the MS data file.

xics

An \(i \times n\) matrix containing XIC intensity values, with one row per ion and one column per scan.

abundances

An \(i \times r\) matrix containing abundances, with one row per ion and one column per retention time window.

Details

Ion specification

Ions are provided via a data frame that must contain two columns mz_min and mz_max specifying the lower and upper integration limit for each ion. The data frame may contain additional columns, which are ignored.
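As a minimal sketch, such a data frame could also be assembled by hand. The ion names, m/z values, and the relative tolerance below are hypothetical and chosen only for illustration; the sole requirement taken from the description above is the presence of the columns mz_min and mz_max.

```r
# Hypothetical ions; names and m/z values are illustrative only
ions <- data.frame(
  name = c("ion_a", "ion_b"),
  mz   = c(4110.5, 3999.4)
)

# Assumed symmetric tolerance window of 300 ppm around each m/z value
tol_ppm <- 300
ions$mz_min <- ions$mz * (1 - tol_ppm * 1e-6)
ions$mz_max <- ions$mz * (1 + tol_ppm * 1e-6)

# The columns name and mz are extra columns in the sense described above
ions
```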

Retention time limits

rt_limits accepts one of the following:

  • a numeric vector with two elements specifying a single retention time window

  • a data frame with two columns rt_min and rt_max (other columns are ignored) specifying one retention time window per row

  • a data type that can be converted to a data frame, such as a named list

The values -Inf and +Inf are replaced by the minimum and maximum retention time, respectively.
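The accepted input forms and the replacement of infinite limits could be handled along the following lines. The function normalize_rt_limits() is a hypothetical helper written for this illustration; it is not exported by the package, and the actual internal logic may differ.

```r
# Hypothetical helper (not part of fragquaxi): convert any accepted
# rt_limits specification to a data frame with columns rt_min and rt_max.
normalize_rt_limits <- function(rt_limits, rt) {
  if (is.numeric(rt_limits)) {
    # numeric vector with two elements: a single window
    stopifnot(length(rt_limits) == 2)
    rt_limits <- data.frame(rt_min = rt_limits[1], rt_max = rt_limits[2])
  } else {
    # data frame or convertible type (e.g., named list); drop other columns
    rt_limits <- as.data.frame(rt_limits)[, c("rt_min", "rt_max")]
  }
  # replace infinite limits by the observed retention time range
  rt_limits$rt_min[rt_limits$rt_min == -Inf] <- min(rt)
  rt_limits$rt_max[rt_limits$rt_max == Inf]  <- max(rt)
  rt_limits
}

rt <- c(2.58, 450, 900)  # illustrative retention times

single  <- normalize_rt_limits(c(-Inf, +Inf), rt)
several <- normalize_rt_limits(
  list(rt_min = c(0, 240), rt_max = c(60, 330)),
  rt
)
```

Finite limits are left unchanged here, even if they lie outside the recorded retention time range.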

Integration methods

The integration method may be selected separately for the spectrum integration step (by ifun_spectrum) and the XIC integration step (by ifun_xic).

"adaptive"

Default method for spectrum integration (step 1). Adaptive quadrature via approxfun() and stats::integrate(). This method is faster, but may yield wrong results if the integrand is zero over nearly all its range (which typically occurs if XICs are integrated over the whole chromatogram).

"trapezoidal"

Default method for XIC integration (step 2). Trapezoidal integration, i.e., the exact area under the polygonal chain formed by (m/z or retention time, intensity) tuples. This method is slower, but always gives correct results for XIC integration.
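The difference between the two methods can be explored with a synthetic XIC that is zero almost everywhere. The following sketch uses only base R and the stats package; the data are hypothetical, and the code illustrates the two techniques named above rather than reproducing the package's internal code.

```r
# Synthetic XIC: one narrow triangular peak at 300 s, zero elsewhere
rt  <- seq(0, 900, by = 1)
xic <- pmax(0, 1000 * (1 - abs(rt - 300) / 2))

# Trapezoidal integration: exact area under the polygonal chain
area_trapezoidal <- sum(diff(rt) * (head(xic, -1) + tail(xic, -1)) / 2)

# Adaptive quadrature on the linearly interpolated signal; over the whole
# chromatogram, the quadrature nodes may miss the narrow peak entirely
f <- stats::approxfun(rt, xic)
area_adaptive <- tryCatch(
  stats::integrate(f, min(rt), max(rt))$value,
  error = function(e) NA_real_
)

c(trapezoidal = area_trapezoidal, adaptive = area_adaptive)
```

Comparing the two values on your own data before choosing "adaptive" for ifun_xic is a cheap sanity check.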

Examples

ms_data <- mzR::openMSfile(
  system.file("extdata", "mzml", "mab1.mzML", package = "fragquaxi")
)

proteins <- define_proteins(
  system.file("extdata", "mab_sequence.fasta", package = "fragquaxi"),
  .disulfides = 16
)

modcoms <- define_ptm_compositions(sample_modcoms)

pfm_ions <-
  assemble_proteoforms(proteins, modcoms) %>%
  ionize(36L:40L)

# no retention time limits: use all spectra
quantify_ions(ms_data, pfm_ions)
#> Abundances of 30 ions quantified in 352 mass spectra using 1 retention time window.
#>
#> ── Parameters ──
#>
#> MS data file:
#> /tmp/RtmpMZpe4N/temp_libpath126f21a1228f4/fragquaxi/extdata/mzml/mab1.mzML
#>
#> Ions:
#> # A tibble: 30 x 9
#>   ion_id protein_name modcom_name formula                         mass     z
#>   <chr>         <int> <chr>       <mol>                          <dbl> <int>
#> 1 id_1              1 G0F/G0      C6570 H10124 N1714 O2088 S44 147942.    36
#> 2 id_2              1 G0F/G0      C6570 H10124 N1714 O2088 S44 147942.    37
#> 3 id_3              1 G0F/G0      C6570 H10124 N1714 O2088 S44 147942.    38
#> 4 id_4              1 G0F/G0      C6570 H10124 N1714 O2088 S44 147942.    39
#> 5 id_5              1 G0F/G0      C6570 H10124 N1714 O2088 S44 147942.    40
#> # … with 25 more rows, and 3 more variables: mz <dbl>, mz_min <dbl>,
#> #   mz_max <dbl>
#>
#> Retention time limits:
#> # A tibble: 1 x 3
#>   rt_min rt_max scans
#>    <dbl>  <dbl> <list>
#> 1   2.58   900. <int [352]>
#>
#> ── Results ──
#>
#> # A tibble: 1 x 32
#>   rt_min rt_max    id_1    id_2   id_3   id_4   id_5   id_6   id_7   id_8   id_9
#>    <dbl>  <dbl>   <dbl>   <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>
#> 1   2.58   900. 362597. 194880. 1.81e5 1.26e5 1.16e5 7.62e6 6.27e6 4.47e6 2.83e6
#> # … with 21 more variables: id_10 <dbl>, id_11 <dbl>, id_12 <dbl>, id_13 <dbl>,
#> #   id_14 <dbl>, …

# define a single retention time window
quantify_ions(ms_data, pfm_ions, rt_limits = c(240, 330))
#> Abundances of 30 ions quantified in 352 mass spectra using 1 retention time window.
#>
#> ── Parameters ──
#>
#> MS data file:
#> /tmp/RtmpMZpe4N/temp_libpath126f21a1228f4/fragquaxi/extdata/mzml/mab1.mzML
#>
#> Ions:
#> # A tibble: 30 x 9
#>   ion_id protein_name modcom_name formula                         mass     z
#>   <chr>         <int> <chr>       <mol>                          <dbl> <int>
#> 1 id_1              1 G0F/G0      C6570 H10124 N1714 O2088 S44 147942.    36
#> 2 id_2              1 G0F/G0      C6570 H10124 N1714 O2088 S44 147942.    37
#> 3 id_3              1 G0F/G0      C6570 H10124 N1714 O2088 S44 147942.    38
#> 4 id_4              1 G0F/G0      C6570 H10124 N1714 O2088 S44 147942.    39
#> 5 id_5              1 G0F/G0      C6570 H10124 N1714 O2088 S44 147942.    40
#> # … with 25 more rows, and 3 more variables: mz <dbl>, mz_min <dbl>,
#> #   mz_max <dbl>
#>
#> Retention time limits:
#> # A tibble: 1 x 3
#>   rt_min rt_max scans
#>    <dbl>  <dbl> <list>
#> 1    240    330 <int [39]>
#>
#> ── Results ──
#>
#> # A tibble: 1 x 32
#>   rt_min rt_max    id_1    id_2   id_3   id_4   id_5   id_6   id_7   id_8   id_9
#>    <dbl>  <dbl>   <dbl>   <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>
#> 1    240    330 251662. 154038. 1.49e5 1.10e5 1.01e5 5.45e6 4.60e6 3.31e6 2.17e6
#> # … with 21 more variables: id_10 <dbl>, id_11 <dbl>, id_12 <dbl>, id_13 <dbl>,
#> #   id_14 <dbl>, …

# define several retention time windows;
# instead of a list, you may also pass a data frame to rt_limits
quantify_ions(
  ms_data,
  pfm_ions,
  rt_limits = list(rt_min = c(0, 240), rt_max = c(60, 330))
)
#> Abundances of 30 ions quantified in 352 mass spectra using 2 retention time windows.
#>
#> ── Parameters ──
#>
#> MS data file:
#> /tmp/RtmpMZpe4N/temp_libpath126f21a1228f4/fragquaxi/extdata/mzml/mab1.mzML
#>
#> Ions:
#> # A tibble: 30 x 9
#>   ion_id protein_name modcom_name formula                         mass     z
#>   <chr>         <int> <chr>       <mol>                          <dbl> <int>
#> 1 id_1              1 G0F/G0      C6570 H10124 N1714 O2088 S44 147942.    36
#> 2 id_2              1 G0F/G0      C6570 H10124 N1714 O2088 S44 147942.    37
#> 3 id_3              1 G0F/G0      C6570 H10124 N1714 O2088 S44 147942.    38
#> 4 id_4              1 G0F/G0      C6570 H10124 N1714 O2088 S44 147942.    39
#> 5 id_5              1 G0F/G0      C6570 H10124 N1714 O2088 S44 147942.    40
#> # … with 25 more rows, and 3 more variables: mz <dbl>, mz_min <dbl>,
#> #   mz_max <dbl>
#>
#> Retention time limits:
#> # A tibble: 2 x 3
#>   rt_min rt_max scans
#>    <dbl>  <dbl> <list>
#> 1      0     60 <int [24]>
#> 2    240    330 <int [39]>
#>
#> ── Results ──
#>
#> # A tibble: 2 x 32
#>   rt_min rt_max    id_1    id_2   id_3   id_4   id_5   id_6   id_7   id_8   id_9
#>    <dbl>  <dbl>   <dbl>   <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>
#> 1      0     60      0       0       0      0      0      0      0      0      0
#> 2    240    330 251662. 154038. 1.49e5 1.10e5 1.01e5 5.45e6 4.60e6 3.31e6 2.17e6
#> # … with 21 more variables: id_10 <dbl>, id_11 <dbl>, id_12 <dbl>, id_13 <dbl>,
#> #   id_14 <dbl>, …