how to get a data frame from split_sheet? #10

@benmarwick

I'd like to get the output of jailbreakr::split_sheet() as a data frame. How can I do that?

Here's what I've tried:

mg <- rexcel::rexcel_read("mini-gap.xlsx")
mg_split_sheet <- jailbreakr::split_sheet(mg)
thing <- mg_split_sheet[[1]]$values()
thing
#>      [,1]           [,2]        [,3]   [,4]      [,5]    [,6]       
#> [1,] "country"      "continent" "year" "lifeExp" "pop"   "gdpPercap"
#> [2,] "Algeria"      "Africa"    1952   43.077    9279525 2449.008   
#> [3,] "Angola"       "Africa"    1952   30.015    4232095 3520.61    
#> [4,] "Benin"        "Africa"    1952   38.223    1738315 1062.752   
#> [5,] "Botswana"     "Africa"    1952   47.622    442308  851.2411   
#> [6,] "Burkina Faso" "Africa"    1952   31.975    4469979 543.2552

It prints like a matrix or array, but str() reports a list (is that normal for an array? I don't work with them much):

str(thing)
#> List of 36
#>  $ : chr "country"
#>  $ : chr "Algeria"
#>  $ : chr "Angola"
#>  $ : chr "Benin"
#>  $ : chr "Botswana"
#>  $ : chr "Burkina Faso"
#>  $ : chr "continent"
#>  $ : chr "Africa"
#>  $ : chr "Africa"
#>  $ : chr "Africa"
#>  $ : chr "Africa"
#>  $ : chr "Africa"
#>  $ : chr "year"
#>  $ : num 1952
#>  $ : num 1952
#>  $ : num 1952
#>  $ : num 1952
#>  $ : num 1952
#>  $ : chr "lifeExp"
#>  $ : num 43.1
#>  $ : num 30
#>  $ : num 38.2
#>  $ : num 47.6
#>  $ : num 32
#>  $ : chr "pop"
#>  $ : num 9279525
#>  $ : num 4232095
#>  $ : num 1738315
#>  $ : num 442308
#>  $ : num 4469979
#>  $ : chr "gdpPercap"
#>  $ : num 2449
#>  $ : num 3521
#>  $ : num 1063
#>  $ : num 851
#>  $ : num 543
#>  - attr(*, "dim")= Named int [1:2] 6 6
#>   ..- attr(*, "names")= chr [1:2] "row" "col"
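For reference, this shape (a list carrying a dim attribute, sometimes called a list-matrix) can be reproduced in base R without rexcel. The sketch below is only a small hypothetical stand-in for `thing`, built by hand; it does not call linen or jailbreakr, and the names `cells` and `m` are made up for illustration.

```r
# Hypothetical stand-in for `thing`: a list with a "dim" attribute.
# Cells are filled column by column, as in the str() output above.
cells <- list("country", "Algeria", "continent", "Africa", "year", 1952)
m <- array(cells, dim = c(2, 3))

is.list(m)        # TRUE: the underlying storage is a list
is.matrix(m)      # TRUE: the dim attribute has length 2
typeof(m)         # "list"
class(m[[2, 3]])  # "numeric": each cell keeps its own type
```

So a list and a matrix are not mutually exclusive in R: the dim attribute controls printing and indexing, while each cell remains a separate list element.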

I see in https://github.com/rsheets/linen/blob/master/R/export.R that the values element is created by array(lapply(...), ...), which is why I posted here rather than at the jailbreakr repo.

I tried as.data.frame.array, but that gives a data frame of lists:

thing_asdfa <- as.data.frame.array(thing)

str(thing_asdfa)
#> 'data.frame':    6 obs. of  6 variables:
#>  $ V1:List of 6
#>   ..$ : chr "country"
#>   ..$ : chr "Algeria"
#>   ..$ : chr "Angola"
#>   ..$ : chr "Benin"
#>   ..$ : chr "Botswana"
#>   ..$ : chr "Burkina Faso"
#>  $ V2:List of 6
#>   ..$ : chr "continent"
#>   ..$ : chr "Africa"
#>   ..$ : chr "Africa"
#>   ..$ : chr "Africa"
#>   ..$ : chr "Africa"
#>   ..$ : chr "Africa"
#>  $ V3:List of 6
#>   ..$ : chr "year"
#>   ..$ : num 1952
#>   ..$ : num 1952
#>   ..$ : num 1952
#>   ..$ : num 1952
#>   ..$ : num 1952
#>  $ V4:List of 6
#>   ..$ : chr "lifeExp"
#>   ..$ : num 43.1
#>   ..$ : num 30
#>   ..$ : num 38.2
#>   ..$ : num 47.6
#>   ..$ : num 32
#>  $ V5:List of 6
#>   ..$ : chr "pop"
#>   ..$ : num 9279525
#>   ..$ : num 4232095
#>   ..$ : num 1738315
#>   ..$ : num 442308
#>   ..$ : num 4469979
#>  $ V6:List of 6
#>   ..$ : chr "gdpPercap"
#>   ..$ : num 2449
#>   ..$ : num 3521
#>   ..$ : num 1063
#>   ..$ : num 851
#>   ..$ : num 543

And data frame-style row indexing returns a list of cells rather than a vector:

thing[1,]
#> [[1]]
#> [1] "country"
#> 
#> [[2]]
#> [1] "continent"
#> 
#> [[3]]
#> [1] "year"
#> 
#> [[4]]
#> [1] "lifeExp"
#> 
#> [[5]]
#> [1] "pop"
#> 
#> [[6]]
#> [1] "gdpPercap"
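This is the usual base R behaviour for a list with dimensions: single-bracket indexing returns a list of cells, while double-bracket indexing with a row and column returns one cell's value. A minimal sketch on a hand-built stand-in (the object `m` is hypothetical, not the real `thing`):

```r
# Hypothetical 2 x 2 list-matrix: header row plus one data row.
m <- array(list("country", "Algeria", "year", 1952), dim = c(2, 2))

row1   <- m[1, ]            # a list of length 2, one element per cell
header <- unlist(m[1, ])    # flattened to a character vector
cell   <- m[[2, 2]]         # the value stored in row 2, column 2
col    <- unlist(m[-1, 2])  # column 2 without the header, as a numeric vector
```

Flattening a single column with unlist() keeps it numeric as long as the header cell is excluded first; including the header would coerce everything in that column to character.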

I've also tried reshape2::melt and plyr::adply, but had no luck with those.

What do you recommend for getting this into a basic data frame?
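To make the target concrete, here is a rough base R sketch of the kind of conversion I have in mind. It assumes the first row holds the column names and that no cell is NULL or longer than one value. The helper `list_matrix_to_df` is a made-up name, not a function from linen or jailbreakr, and there may well be a better supported route.

```r
# Hypothetical helper (not part of linen or jailbreakr): convert a list-matrix
# whose first row holds column names into a plain data frame.
list_matrix_to_df <- function(m, header = TRUE) {
  stopifnot(is.list(m), length(dim(m)) == 2)
  body <- if (header) m[-1, , drop = FALSE] else m
  # Collapse each column of cells into one atomic vector. unlist() coerces
  # mixed types within a column to a common type (usually character).
  cols <- lapply(seq_len(ncol(body)), function(j) unlist(body[, j]))
  names(cols) <- if (header) {
    vapply(m[1, ], as.character, character(1))
  } else {
    paste0("V", seq_len(ncol(body)))
  }
  as.data.frame(cols, stringsAsFactors = FALSE)
}

# Small hand-built example standing in for `thing`.
cells <- list("country", "Algeria", "Angola",
              "year", 1952, 1952,
              "lifeExp", 43.077, 30.015)
m <- array(cells, dim = c(3, 3))
df <- list_matrix_to_df(m)
str(df)
```

On this toy input the result has one character column and two numeric columns, which is the shape I would like to get from the real sheet.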

Session info
devtools::session_info()
#> Session info --------------------------------------------------------------
#>  setting  value                       
#>  version  R version 3.3.2 (2016-10-31)
#>  system   x86_64, mingw32             
#>  ui       RTerm                       
#>  language (EN)                        
#>  collate  English_Australia.1252      
#>  tz       America/Los_Angeles         
#>  date     2017-02-20
#> Packages ------------------------------------------------------------------
#>  package    * version    date       source                             
#>  assertthat   0.1        2013-12-06 CRAN (R 3.2.2)                     
#>  backports    1.0.5      2017-01-18 CRAN (R 3.3.2)                     
#>  cellranger   1.1.0.9000 2017-02-20 Github (rsheets/cellranger@024d5ba)
#>  devtools     1.12.0     2016-06-24 CRAN (R 3.3.2)                     
#>  digest       0.6.12     2017-01-27 CRAN (R 3.3.2)                     
#>  evaluate     0.10       2016-10-11 CRAN (R 3.3.1)                     
#>  htmltools    0.3.5      2016-03-21 CRAN (R 3.2.4)                     
#>  jailbreakr   0.0.1      2017-02-20 Github (rsheets/jailbreakr@2fbec5f)
#>  knitr        1.15.1     2016-11-22 CRAN (R 3.3.2)                     
#>  lazyeval     0.2.0.9000 2016-11-07 Github (hadley/lazyeval@c155c3d)   
#>  linen        0.0.4      2017-02-20 Github (rsheets/linen@7618a13)     
#>  magrittr     1.5        2014-11-22 CRAN (R 3.3.1)                     
#>  memoise      1.0.0      2016-01-29 CRAN (R 3.2.5)                     
#>  R6           2.2.0      2016-10-05 CRAN (R 3.3.1)                     
#>  Rcpp         0.12.9     2017-01-14 CRAN (R 3.3.2)                     
#>  rexcel       0.0.1      2017-02-20 Github (rsheets/rexcel@e8dd5d3)    
#>  rmarkdown    1.3.1      2017-02-16 Github (rstudio/rmarkdown@e672d41) 
#>  rprojroot    1.2        2017-01-16 CRAN (R 3.3.2)                     
#>  stringi      1.1.2      2016-10-01 CRAN (R 3.3.1)                     
#>  stringr      1.1.0      2016-08-19 CRAN (R 3.3.1)                     
#>  tibble       1.2        2016-08-26 CRAN (R 3.3.1)                     
#>  withr        1.0.2      2016-06-20 CRAN (R 3.3.0)                     
#>  xml2         1.1.1      2017-02-20 Github (hadley/xml2@c84db5e)       
#>  yaml         2.1.14     2016-11-12 CRAN (R 3.3.2)

P.S. Thanks for this project, and for the reprex package, which makes it easier to post code and output!
