Once we developed a star database in R, we would like to exploit it directly in R to develop multidimensional queries, but that is part of a future project. Currently we need to export the result to other formats to be able to use other OLAP query tools.
This document shows the possibilities offered by the package in this
context: Export an object of the star_database class to
other formats. After presenting the object that we will use as an
example and how to select it, a section is dedicated to each type of
element obtained: dm, list of tibble objects,
relational databases, xlsx and csv files, and
geomultistar::multistar object. Finally, it finish with the
conclusions.
star_database objectThe variable mrs_db, obtained in the vignette titled
Obtaining and transforming flat tables,
vignette("v05-flat-table-op"), contains an object of class
star_database that we will use in the examples.
We can see a representation of the tables it contains using the
draw_tables() function, as shown below.
Strictly speaking, a star database is composed of a fact table and
several associated dimension tables. A constellation is made up of
several star databases that can share dimensions. In the
rolap package they are treated in a unified way under the
star_database class: It is used both to define star
databases and to define constellations.
It is possible that we are interested in exporting only a star
database of the constellation, or a subset of it (also a constellation).
For this reason, in this situation, the first step before performing the
export operation would be to select the star databases that interest us
from the constellation. This operation can be carried out using the
get_star_database() function, where the names of the star
databases are indicated. The names can be obtained using the
get_fact_names() function.
Next, we select one of the star databases (also an object of class
star_database) and display its tables.
dm class objectThe dm package
allows us to work in R with tables that correspond to others from
relational databases, both to import and export them. It also allows
them to be represented graphically (the graphical representations of the
tables presented in the previous section have been made using the
dm package).
We can directly obtain an object of the dm class from
the tables of our star databases using the as_dm_class()
function.
tibble objectsWe can generate a tibble list from the component tables
or from the flat tables obtained from them.
Using the as_tibble_list() function, we get a
tibble list with the dimension and fact tables, as shown
below.
tl <- mrs_db |>
  as_tibble_list()
tl
#> $when
#> # A tibble: 2,076 × 4
#>    when_key year  week  week_ending_date
#>       <int> <chr> <chr> <chr>           
#>  1        1 1962  " 1"  01/06/1962      
#>  2        2 1962  " 2"  01/13/1962      
#>  3        3 1962  " 5"  02/03/1962      
#>  4        4 1962  " 6"  02/10/1962      
#>  5        5 1962  " 8"  02/24/1962      
#>  6        6 1962  " 9"  03/03/1962      
#>  7        7 1962  "12"  03/24/1962      
#>  8        8 1962  "14"  04/07/1962      
#>  9        9 1962  "15"  04/14/1962      
#> 10       10 1962  "18"  05/05/1962      
#> # ℹ 2,066 more rows
#> 
#> $where
#> # A tibble: 122 × 9
#>    where_key region state city        city_state     status    pop   lat   long 
#>        <int> <chr>  <chr> <chr>       <chr>          <chr>     <chr> <chr> <chr>
#>  1         1 1      CT    Bridgeport  Bridgeport CT  non-capi… "  1… 41.2  " -7…
#>  2         2 1      CT    Hartford    Hartford CT    state ca… "  1… 41.8  " -7…
#>  3         3 1      CT    New Haven   New Haven CT   non-capi… "  1… 41.3  " -7…
#>  4         4 1      CT    Waterbury   Waterbury CT   non-capi… "  1… 41.6  " -7…
#>  5         5 1      MA    Cambridge   Cambridge MA   non-capi… "  1… 42.4  " -7…
#>  6         6 1      MA    Fall River  Fall River MA  non-capi… "   … 41.7  " -7…
#>  7         7 1      MA    Lowell      Lowell MA      non-capi… "  1… 42.6  " -7…
#>  8         8 1      MA    Lynn        Lynn MA        non-capi… "   … 42.5  " -7…
#>  9         9 1      MA    New Bedford New Bedford MA non-capi… "   … 41.7  " -7…
#> 10        10 1      MA    Somerville  Somerville MA  non-capi… "   … 42.4  " -7…
#> # ℹ 112 more rows
#> 
#> $who
#> # A tibble: 5 × 2
#>   who_key age        
#>     <int> <chr>      
#> 1       1 1-24 years 
#> 2       2 25-44 years
#> 3       3 45-64 years
#> 4       4 65+ years  
#> 5       5 <1 year    
#> 
#> $mrs_cause
#> # A tibble: 3,677 × 5
#>    when_key where_key pneumonia_and_influenza_deaths all_deaths nrow_agg
#>       <int>     <int>                          <int>      <int>    <int>
#>  1        1        87                              2         69        1
#>  2        2         8                              1         28        1
#>  3        2        10                              2         21        1
#>  4        2        64                             10        220        1
#>  5        2        97                              2         12        1
#>  6        3        15                              2         32        1
#>  7        4         6                              0         31        1
#>  8        4        15                              1         32        1
#>  9        4       119                              2         51        1
#> 10        5        36                              0         39        1
#> # ℹ 3,667 more rows
#> 
#> $mrs_age
#> # A tibble: 18,228 × 5
#>    when_key where_key who_key all_deaths nrow_agg
#>       <int>     <int>   <int>      <int>    <int>
#>  1        1        87       1          1        1
#>  2        1        87       2          5        1
#>  3        1        87       3         17        1
#>  4        1        87       4         41        1
#>  5        1        87       5          5        1
#>  6        2         8       1          0        1
#>  7        2         8       2          2        1
#>  8        2         8       3         11        1
#>  9        2         8       4         14        1
#> 10        2         8       5          1        1
#> # ℹ 18,218 more rowsUsing the as_single_tibble_list() function, we also get
a tibble list but in this case the fact and dimension
tables have been merged to form a flat table as shown below.
tl <- mrs_db |>
  as_single_tibble_list()
tl
#> $mrs_cause
#> # A tibble: 3,677 × 14
#>    year  week  week_ending_date region state city  city_state status pop   lat  
#>    <chr> <chr> <chr>            <chr>  <chr> <chr> <chr>      <chr>  <chr> <chr>
#>  1 1962  " 1"  01/06/1962       7      OK    Tulsa Tulsa OK   non-c… "  3… 36.1 
#>  2 1962  " 2"  01/13/1962       1      MA    Lynn  Lynn MA    non-c… "   … 42.5 
#>  3 1962  " 2"  01/13/1962       1      MA    Some… Somervill… non-c… "   … 42.4 
#>  4 1962  " 2"  01/13/1962       5      DC    Wash… Washingto… non-c… "  5… 38.9 
#>  5 1962  " 2"  01/13/1962       8      CO    Colo… Colorado … non-c… "  3… 38.9 
#>  6 1962  " 5"  02/03/1962       2      NJ    Eliz… Elizabeth… non-c… "  1… 40.7 
#>  7 1962  " 6"  02/10/1962       1      MA    Fall… Fall Rive… non-c… "   … 41.7 
#>  8 1962  " 6"  02/10/1962       2      NJ    Eliz… Elizabeth… non-c… "  1… 40.7 
#>  9 1962  " 6"  02/10/1962       9      WA    Spok… Spokane WA non-c… "  1… 47.7 
#> 10 1962  " 8"  02/24/1962       3      IL    Rock… Rockford … non-c… "  1… 42.3 
#> # ℹ 3,667 more rows
#> # ℹ 4 more variables: long <chr>, pneumonia_and_influenza_deaths <int>,
#> #   all_deaths <int>, nrow_agg <int>
#> 
#> $mrs_age
#> # A tibble: 18,228 × 14
#>    year  week  week_ending_date region state city  city_state status pop   lat  
#>    <chr> <chr> <chr>            <chr>  <chr> <chr> <chr>      <chr>  <chr> <chr>
#>  1 1962  " 1"  01/06/1962       7      OK    Tulsa Tulsa OK   non-c… "  3… 36.1 
#>  2 1962  " 1"  01/06/1962       7      OK    Tulsa Tulsa OK   non-c… "  3… 36.1 
#>  3 1962  " 1"  01/06/1962       7      OK    Tulsa Tulsa OK   non-c… "  3… 36.1 
#>  4 1962  " 1"  01/06/1962       7      OK    Tulsa Tulsa OK   non-c… "  3… 36.1 
#>  5 1962  " 1"  01/06/1962       7      OK    Tulsa Tulsa OK   non-c… "  3… 36.1 
#>  6 1962  " 2"  01/13/1962       1      MA    Lynn  Lynn MA    non-c… "   … 42.5 
#>  7 1962  " 2"  01/13/1962       1      MA    Lynn  Lynn MA    non-c… "   … 42.5 
#>  8 1962  " 2"  01/13/1962       1      MA    Lynn  Lynn MA    non-c… "   … 42.5 
#>  9 1962  " 2"  01/13/1962       1      MA    Lynn  Lynn MA    non-c… "   … 42.5 
#> 10 1962  " 2"  01/13/1962       1      MA    Lynn  Lynn MA    non-c… "   … 42.5 
#> # ℹ 18,218 more rows
#> # ℹ 4 more variables: long <chr>, age <chr>, all_deaths <int>, nrow_agg <int>To export the component tables to a relational database, we can use
the as_rdb() function. We have to pass it as a parameter a
connection to the database, which we manage.
con <- DBI::dbConnect(RSQLite::SQLite())
mrs_db |>
  as_rdb(con)
DBI::dbListTables(con)
#> [1] "mrs_age"   "mrs_cause" "when"      "where"     "who"
DBI::dbDisconnect(con)Additionally, in the the vignette titled Deployment of star
databases with incremental refresh,
vignette("v50-deploy"), it is described how to deploy a
star database in a relational database so that it is automatically
updated through periodic refresh operations.
Using the as_xlsx_file() function, we get an
xslx file where each table is stored in a sheet. We have to
indicate the name of the file. For the example, we select a temporary
file.
f <- mrs_db |>
  as_xlsx_file(file = tempfile())
#> java.home option:
#> JAVA_HOME environment variable: C:\Program Files\Java\jdk-17
#> Warning in fun(libname, pkgname): Java home setting is INVALID, it will be ignored.
#> Please do NOT set it unless you want to override system settings.
f
#> [1] "C:\\Users\\jsamos\\AppData\\Local\\Temp\\RtmpSgcVHx\\file9fa06657998.xlsx"As we can see, it is responsible for assigning the appropriate extension.
Using the as_csv_files() function we can store each
table in a different csv file, in the indicated folder, as shown
below.
geomultistar::multistar objectUsing the as_multistar() function we can get a
geomultistar::multistar object; with this object we can use
the query and integration functions with geographic information offered
by the geomultistar
package.
This document presents the functions to export the tables that make up the star databases to other types of formats.
The objective of these functions is that the rolap
package can be used to transform the data and that other tools can be
easily used to analyse it.