table objects with the frab package
To cite in publications, please use Hankin (2023).
TLDR: In R, adding two objects of class table has a natural interpretation. However, in base R, adding two tables can give plausible but incorrect results. The frab package provides a consistent and efficient way to add table objects, subject to disordR discipline (Hankin 2022). The underlying mathematical structure is the free Abelian group, hence “frab”.
table()
Suppose we have three R tables:
x <- table(c("a","a","b","c","d","d","a"))
y <- table(c("a","a","b","d","d","d","e"))
z <- table(c("a","a","b","d","d","e","f"))
Can we ascribe any meaning to x+y without referring to the arguments sent to table()? Well yes, we should simply sum the counts of the various letters. However:
x
##
## a b c d
## 3 1 1 2
y
##
## a b d e
## 2 1 3 1
x+y
##
## a b c d
## 5 2 4 3
The sum is defined in this case. However, close inspection shows that the result is incorrect. Although the entries for a and b are correct, the third and fourth entries are not as expected: here R idiom simply adds the entries elementwise with no regard to labels. We would expect x+y to respect the fact that we have 5 d entries, even though element d is the fourth entry of x and the third of y. Further:
x
##
## a b c d
## 3 1 1 2
z
##
## a b d e f
## 2 1 2 1 1
x+z
## Error in x + z: non-conformable arrays
Above we see that x and z do not have a well-defined sum, in the sense that x+z returns, quite reasonably, an error.
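The label-aware sum described above can be sketched in base R. This is only an illustration of the intended semantics, not the frab implementation; the helper name add_by_label() is hypothetical and is not part of base R or of any package.

```r
x <- table(c("a","a","b","c","d","d","a"))
y <- table(c("a","a","b","d","d","d","e"))
z <- table(c("a","a","b","d","d","e","f"))

# Hypothetical helper: add two one-dimensional tables by label, not by position
add_by_label <- function(t1, t2) {
  labels <- sort(union(names(t1), names(t2)))   # every label seen in either table
  counts <- function(tab) {
    out <- setNames(integer(length(labels)), labels)  # start from zero counts
    out[names(tab)] <- as.vector(tab)                 # fill in the counts that exist
    out
  }
  counts(t1) + counts(t2)
}

xy <- add_by_label(x, y)   # d now correctly has 2 + 3 = 5 entries
xz <- add_by_label(x, z)   # defined even though x and z have different lengths
print(xy)
print(xz)
```

Padding both tables to a common set of labels before adding is what makes the elementwise sum meaningful; the result here is a plain named integer vector rather than a table.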
A named vector is a vector with a names attribute: each element is associated with a name or label, and the names are not necessarily unique. Names make it easier to refer to specific values within the vector. Named vectors are a convenient and useful feature of the R programming language (R Core Team 2022). However, consider the following two named vectors:
Given that x+y returns a named vector, there are at least two plausible values that it might give, viz: c(a=5,b=3,c=4) or c(a=2,b=3,c=7). In the first case the elements of x and y are added pairwise, and the names attribute is taken from the first of the addends. In the second, the names are considered to be primary and the value of each name in the sum is the sum of the values carrying that name in the addends. Note further that there is no good reason why the first answer could not be c(c=5,b=3,a=4), obtained by using the names attribute of y instead of x.
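Both candidate answers can be reproduced in base R. The vectors below are an assumption: they are one pair chosen to be consistent with the values quoted in the text, and may differ from the vectors the original example used.

```r
# Assumed vectors, consistent with the sums quoted in the text
x <- c(a = 1, b = 2, c = 3)
y <- c(c = 4, b = 1, a = 1)

# Base R adds positionally and takes the names from the first addend
positional <- x + y

# Swapping the addends gives the same values under the names of y
swapped <- y + x

# Treating names as primary: pool the entries and sum within each name
pooled <- c(x, y)
by_name <- tapply(pooled, names(pooled), sum)

print(positional)
print(swapped)
print(by_name)
```

The positional results differ only in which addend donates its names, which is the ambiguity the text points out; the by-name result does not depend on the order of the addends.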
frab package
The frab package provides efficient methods for adding two R tables together in a consistent and meaningful way, using standard R syntax. It uses the names of a named vector as the indexing mechanism. Package idiom is straightforward:
## A frab object with entries
## a b d
## 1 2 7
## A frab object with entries
## a b c
## -1 1 4
## A frab object with entries
## b c d
## 3 4 7
Above, note how y is defined with its entries in non-standard order, but the resulting frab object has its entries ordered alphabetically. In x+y, the entry for a has vanished, as it cancels in the summation. The numeric entries for each letter are summed, accounting for the different names [viz a,b,d and a,b,c respectively]. The result is presented using the frab print method.
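The behaviour described above (sum by name, sort the names, drop entries that cancel) can be emulated in base R. This is a minimal sketch and not the package's own code: add_named() is a hypothetical helper, the values are read off the printed output, and the order in which y is written is an assumption.

```r
# Hypothetical helper: label-aware addition in which zero entries disappear
add_named <- function(u, v) {
  pooled <- c(u, v)
  total <- tapply(pooled, names(pooled), sum)   # sum within each label, sorted by label
  total <- total[total != 0]                    # labels whose entries cancel are dropped
  setNames(as.vector(total), names(total))      # return a plain named vector
}

x <- c(a = 1, b = 2, d = 7)
y <- c(c = 4, b = 1, a = -1)   # deliberately not in alphabetical order

s <- add_named(x, y)
print(s)
```

Dropping zeros matches the printed sum shown earlier, in which a is absent because 1 + (-1) = 0.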
Package idiom includes extraction and replacement methods, all of which should work as expected:
## A frab object with entries
## a c d e f g x
## 3 3 1 2 4 9 5
## A disord object with hash 8e06d464d006d7ce8c6fa1e5101a1e042bddadf6 and elements
## [1] FALSE FALSE FALSE FALSE TRUE TRUE TRUE
## (in some order)
## A disord object with hash 8e06d464d006d7ce8c6fa1e5101a1e042bddadf6 and elements
## [1] FALSE FALSE TRUE TRUE FALSE FALSE FALSE
## (in some order)
## A frab object with entries
## f g x
## 4 9 5
## A frab object with entries
## d e
## 1 2
## A frab object with entries
## a c d e f g x
## 3 3 100 100 4 9 5
Above we see that extraction and replacement methods follow disordR discipline (Hankin 2022). Results are coerced to disord objects if needed. Tables may be added to frab objects:
## A frab object with entries
## a b c d g i
## 3 6 1 5 7 5
##
## a b c d e f g
## 2 2 1 2 2 2 1
## A frab object with entries
## a b c d e f g i
## 5 8 2 7 2 2 8 5
Above we see the + operator is defined between a frab and a table, coercing R tables to frab objects to give consistent results.
The ideas above have a natural generalization to two-dimensional R tables.
## bar
## foo A B C D F
## b 3 0 8 0 2
## d 5 16 0 0 6
## f 1 0 0 4 0
## bar
## foo A C D E F
## a 0 0 0 9 0
## b 0 0 0 0 8
## e 0 0 4 0 0
## f 7 9 8 0 0
## bar
## foo A B C D E F
## a 0 0 0 0 9 0
## b 3 0 8 0 0 10
## d 5 16 0 0 0 6
## e 0 0 0 4 0 0
## f 8 0 9 12 0 0
Above, note that the resulting sum is automatically resized to accommodate both addends, and also that entries with nonzero values in both x and y are correctly summed.
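The resizing can be illustrated in base R with ordinary matrices carrying dimnames. The sketch below re-enters the two printed addends by hand and uses a hypothetical helper, add_2d(); it shows the idea only and makes no claim about how frab stores or adds two-dimensional tables.

```r
# The two addends, re-entered from the printed output above
x <- matrix(c(3, 0, 8, 0, 2,
              5,16, 0, 0, 6,
              1, 0, 0, 4, 0), nrow = 3, byrow = TRUE,
            dimnames = list(foo = c("b","d","f"), bar = c("A","B","C","D","F")))
y <- matrix(c(0, 0, 0, 9, 0,
              0, 0, 0, 0, 8,
              0, 0, 4, 0, 0,
              7, 9, 8, 0, 0), nrow = 4, byrow = TRUE,
            dimnames = list(foo = c("a","b","e","f"), bar = c("A","C","D","E","F")))

# Hypothetical helper: add two labelled matrices over the union of their labels
add_2d <- function(m1, m2) {
  rn <- sort(union(rownames(m1), rownames(m2)))
  cn <- sort(union(colnames(m1), colnames(m2)))
  out <- matrix(0, length(rn), length(cn), dimnames = list(rn, cn))
  names(dimnames(out)) <- names(dimnames(m1))
  # Index by label so each entry lands in the right cell of the enlarged matrix
  out[rownames(m1), colnames(m1)] <- out[rownames(m1), colnames(m1)] + m1
  out[rownames(m2), colnames(m2)] <- out[rownames(m2), colnames(m2)] + m2
  out
}

s <- add_2d(x, y)
print(s)
```

Cells present in both addends, such as row f and column D, accumulate both contributions, while cells present in only one addend keep that value.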
The one- and two-dimensional R tables above have somewhat specialized print methods and the general case with dimension \(\geqslant 3\) uses methods similar to those of the spray package. We can generate a sparsetable object quite easily:
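library("mvtnorm")   # provides rmvnorm()
A <- matrix(0.95,3,3)   # off-diagonal entries 0.95: strongly correlated columns
diag(A) <- 1
x <- round(rmvnorm(300,mean=rep(10,3),sigma=A/7))   # integer draws clustered near 10
x[] <- letters[x]   # recode integers as letters, e.g. 10 becomes "j"
head(x)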
## [,1] [,2] [,3]
## [1,] "i" "i" "i"
## [2,] "j" "j" "j"
## [3,] "j" "j" "k"
## [4,] "j" "j" "j"
## [5,] "j" "j" "i"
## [6,] "j" "j" "j"
## val
## i i i = 22
## i i j = 2
## i j i = 5
## i j j = 4
## j i i = 2
## j i j = 1
## j j i = 3
## j j j = 223
## j j k = 7
## j k j = 3
## j k k = 1
## k j j = 2
## k j k = 4
## k k j = 1
## k k k = 20
We can add sx to other sparsetable objects:
## val
## i k k = 1003
## j j j = 1004
## j j k = 1001
## k k j = 1002
Then the usual semantics for addition operate:
## val
## i i i = 22
## i i j = 2
## i j i = 5
## i j j = 4
## i k k = 1003
## j i i = 2
## j i j = 1
## j j i = 3
## j j j = 1227
## j j k = 1008
## j k j = 3
## j k k = 1
## k j j = 2
## k j k = 4
## k k j = 1003
## k k k = 20
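The same addition semantics can be mimicked in base R by holding each sparse table as a data frame of index columns plus a value, then summing over rows with identical indices. This is a sketch only: the column names X1, X2, X3 and val are hypothetical, only a few rows of the first addend are re-entered from the printed output, and aggregate() stands in for whatever the package does internally.

```r
# A few rows of the first addend, re-entered from the printed output
sx <- data.frame(X1 = c("i","j","j","k"),
                 X2 = c("i","j","j","k"),
                 X3 = c("i","j","k","j"),
                 val = c(22, 223, 7, 1))

# The second addend, as printed
sy <- data.frame(X1 = c("i","j","j","k"),
                 X2 = c("k","j","j","k"),
                 X3 = c("k","j","k","j"),
                 val = c(1003, 1004, 1001, 1002))

# Stack the two tables and sum the values of rows sharing the same index triple
s <- aggregate(val ~ X1 + X2 + X3, data = rbind(sx, sy), FUN = sum)
print(s)
```

Index triples present in both addends, such as j j j, are summed (223 + 1004 = 1227), and triples present in only one addend, such as i k k, carry over unchanged.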
The word “table” means something unrelated in SQL. A short discussion of frab functionality implemented in SQL “table” objects is given in inst/sql.Rmd.
disordR Package.” arXiv. https://doi.org/10.48550/ARXIV.2210.03856.
frab Package.” arXiv. https://doi.org/10.48550/ARXIV.2307.13184.