Convert a matrix of characters into a matrix of strings in R


I have a large matrix of characters and want to convert it into a matrix of strings without looping over each row individually. I'm wondering whether there is a smart way to do this fast. I tried paste(data[,4:((i*2)+3)],collapse=""), but the problem is that it combines all the rows into one large string, while I need to keep the same number of rows as the original matrix, with each row holding a single column: a string made up of all the characters in that row. In other words, I want to convert the matrix

a= {
d  e  r  p  g  k  i
s  k  p  a  s  l  n
s  k  p  a  s  l  n
s  k  p  a  s  l  n
s  k  p  a  s  l  n
}

into

a= {  derpgki  skpasln  skpasln  skpasln  skpasln } 
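To make the problem concrete, here is a minimal sketch on a made-up 2 x 3 matrix (`m` is a hypothetical stand-in, not the real data). It shows why `paste(..., collapse = "")` on the whole matrix returns a single string: the matrix is treated as one long vector in column-major order, so the row structure is lost.

```r
# Hypothetical toy matrix; rows are "a b c" and "d e f".
m <- matrix(c("a", "d", "b", "e", "c", "f"), nrow = 2)

# Collapsing the whole matrix: one string, read column by column.
collapsed <- paste(m, collapse = "")
collapsed
# [1] "adbecf"

# Collapsing within each row keeps one string per row.
per_row <- apply(m, 1, paste, collapse = "")
per_row
# [1] "abc" "def"
```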

apply is a loop, but it should still be pretty efficient in this case. Its use would be:

apply(x, 1, paste, collapse = "") 

Alternatively, you can try:

do.call(paste0, data.frame(x)) 

which might be faster.
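As a rough illustration of why the do.call approach works: a data frame is a list of columns, so do.call hands each column to paste0 as a separate argument, and paste0 then joins them element by element, which means row by row. The sketch below uses a made-up 2 x 3 matrix (`small` and `cols` are names chosen for the example) and spells out the equivalent explicit call.

```r
# Hypothetical toy matrix; rows are "d e r" and "s k p".
small <- matrix(c("d", "s", "e", "k", "r", "p"), nrow = 2)

# One list element per column of the matrix.
cols <- data.frame(small, stringsAsFactors = FALSE)

# do.call passes the columns to paste0 as separate arguments ...
via_do_call <- do.call(paste0, cols)

# ... which amounts to writing the call out by hand.
by_hand <- paste0(cols[[1]], cols[[2]], cols[[3]])

via_do_call
# [1] "der" "skp"
identical(via_do_call, by_hand)
# [1] TRUE
```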


A reproducible example (not sure why I'm wasting my time here)...

x <- structure(c("d", "s", "s", "s", "s", "e", "k", "k", "k", "k",
                 "r", "p", "p", "p", "p", "p", "a", "a", "a", "a",
                 "g", "s", "s", "s", "s", "k", "l", "l", "l", "l",
                 "i", "n", "n", "n", "n"), .Dim = c(5L, 7L))
x
#      [,1] [,2] [,3] [,4] [,5] [,6] [,7]
# [1,] "d"  "e"  "r"  "p"  "g"  "k"  "i"
# [2,] "s"  "k"  "p"  "a"  "s"  "l"  "n"
# [3,] "s"  "k"  "p"  "a"  "s"  "l"  "n"
# [4,] "s"  "k"  "p"  "a"  "s"  "l"  "n"
# [5,] "s"  "k"  "p"  "a"  "s"  "l"  "n"

Let's compare the options:

library(microbenchmark)

# Row-wise paste via apply
fun1 <- function(inmat) apply(inmat, 1, paste, collapse = "")
# Column-wise paste0 via a data frame
fun2 <- function(inmat) do.call(paste0, data.frame(inmat))

fun1(x)
# [1] "derpgki" "skpasln" "skpasln" "skpasln" "skpasln"
fun2(x)
# [1] "derpgki" "skpasln" "skpasln" "skpasln" "skpasln"

microbenchmark(fun1(x), fun2(x))
# Unit: microseconds
#     expr      min        lq    median        uq      max neval
#  fun1(x)   97.634  104.4805  112.0725  117.7735  268.503   100
#  fun2(x) 1258.000 1282.6275 1301.5555 1316.5015 1576.506   100

And on longer data:

x <- do.call(rbind, replicate(100000, x, simplify = FALSE))
dim(x)
# [1] 500000      7

microbenchmark(fun1(x), fun2(x), times = 10)
# Unit: milliseconds
#     expr       min        lq    median       uq      max neval
#  fun1(x) 4189.8940 4226.9354 4382.0403 4570.032 4596.983    10
#  fun2(x)  825.9816  835.4351  888.5102 1031.509 1056.832    10

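I suspect that on wider data, apply would still be more efficient, though: its cost grows mainly with the number of rows, whereas the data.frame approach has to build one column per matrix column before pasting.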
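That suspicion is easy to check. The sketch below is one way to try it, assuming a made-up wide matrix of 5 rows by 5000 columns of random letters (the dimensions, the seed, and the name `wide` are arbitrary choices for illustration). It uses base R's system.time instead of microbenchmark so it runs without extra packages. No timings are shown because they depend on the machine and R version.

```r
# Same two candidates as in the comparison above.
fun1 <- function(inmat) apply(inmat, 1, paste, collapse = "")
fun2 <- function(inmat) do.call(paste0, data.frame(inmat))

# Hypothetical wide input: few rows, many columns.
set.seed(1)
wide <- matrix(sample(letters, 5 * 5000, replace = TRUE), nrow = 5)

# Both approaches should produce the same strings.
by_row <- fun1(wide)
by_col <- fun2(wide)
identical(by_row, by_col)

# Rough timings; rerun a few times before drawing conclusions.
system.time(fun1(wide))
system.time(fun2(wide))
```

For more stable numbers, the same two calls can be passed to microbenchmark as in the earlier comparisons.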

