Melding the Associative and AbstractArray interfaces into a coherent set of indexing
rules, etc.
This package attempts to demonstrate that we can extend some of the (APL-inspired) rules for
indexing AbstractArrays in Julia, to other Associatives.
To acheive this, I have created my own types Assoc, Dic, AbstractArr and Arr to
"mock" these common container interfaces/types in Base. I have also made
AbstractArr a subtype of Assoc; however this detail is not super important.
The important thing is that now:
- We can index a dictionary with a dictionary (to get a dictionary).
- We can index an array with a dictionary (to get a dictionary).
- We can index a dictionary with an array (to get an array).
Some slightly interesting consequences:
- We can have common semantics (and in some cases merge common methods) for
indices(keysinBase),similar,mapand so-on, between arrays and dictionaries. The common semantics should provide a more seamless and intuitive interface for generic programming. - I propose a new
emptyfunction for creating empty dictionaries and vectors.similarwould always preserve the indices of a dictionary (the values may be unintialized). - There is a path forward for multidimensional (Cartesian) dictionaries and structures, that use (a generalization of) the APL indexing rules.
Notes:
- Most of the benefits can be realized in
BasewithoutAbstractArray <: Associative(however this requirement was useful here to help build a conformant interface). - I favoured using
indicesandindextype, to matchgetindexandsetindex!, however this may not be necessary (Basecurrently uses the term "key" for this). - One could also consider indexing of and by
Tuples, such that(a, b, c)[(2,3)] = (b, c),Dict(:a=>1, :b=>2, :c=>3)[(:a,:c)] = (1, 3)and(a,b,c)[[2,1]] = [b,a], etc. (However, might clash with dictionary keys being themselves tuples). Similarly for named tuples.
The overarching philisophy is that an array is just a special kind of associative container
where lookup is fast (using linear indexing or Cartesian indexing). We then search for
commanilities between the current powerful AbstractArray interface and what makes sense
also for Associative.
An Assoc{I, T} (associative container) a is a mapping from a unique set of indices of
type I, to values of type T. Iterating an Assoc returns just the values, not the
indices. A single value can be retrieved via getindex (a[i]) and set via setindex!
(a[i] = v). The set of indices can be retrieved via indices(a) - note that these
generally might not be ordered, and can be compared with the new issetequal predicate. The
values are of type eltype(a) and the indices type is retrieved via a new function,
indextype(a).
There is a pairs function which returns a Pairs view of the Assoc. Iterating this will
return pairs i => v, and moreover, the Pairs type preserves the ability to perform
indexing such that pairs(a)[i] == (i => a[i]).
To make sense of arrays in this context, we note that the fundamental scalar indices of
(multidimensional) arrays are CartesianIndexs, and the set of indices form a
CartesianRange.
AbstractArr is meant to mirror the behavior of AbstractArray, however we have defined
AbstractArr{T, N} <: Assoc{CartesianIndex{N}, T}. The subtyping enforces that we create
a common interface.
We all love that we can index an array like [10, 11, 12][2:3] == [11, 12]. Here we
generalize this to all associative containers a1 and a2 using the following rules to
construct out = a1[a2]:
- The output container shares the indices of
a2. - The values
out[i]correspond toa1[a2[i]].
Similary, for non-sclar setindex! with a1[a2] = v, we use the following rule
- For scalar
v, assign the valuevto the indices indicated by the values ofa2, so thata1[a2[i]] = vfor alli in indices(a2). - For
v::Assoc, we expect to match the indices ofvwith those ofa2, such that we acheivea1[a2[i]] = v[i]for alli in indices(a2).
Note that these rules are fully compatible with the built-in Julia arrays using one-based indexing and the OffsetArrays.jl package.
Many operations like getindex, map, etc require us to build and return new containers.
Here we overload the similar method to have the behaviour such that
similar(a::Assoc, ::Type{T}, inds)creates a newAssocthat has the indicesindsand unitialized values of typeT.similar(a::Assoc) = similar(a, eltype(a), indices(a))- etc for optional new indices or eltype...
Since arrays are associative containers with CartesianRange as its indices, it is easy to
determine the situations where an array is the appropriate output.
Currently similar(::Dict) creates an empty dictionary. Here we can create empty vectors
and dictionaries (that we expect to add elements to) to using the new empty function:
empty(a::Assoc, ::Type{I}, ::Type{V}creates a new container with the appropriate index and element types.empty(a::Assoc, ::Type{T}) == empty(a, indextype(a), T)empty(a::Assoc) == empty(a, indextype(a), eltype(a))
From here it is relatively straightforward to define a map function which works well on
arrays and dictionaries via preseriving the indices of the input(s). Similar to empty and
similar, one might want appropriate methods for zeros and ones and so-on. Functions
such as reduce only depend on the iterator interface.
Speculatively:
- Functions like
filtercould present a lazy view of associatives (and arrays) where some indices are removed (and the remainder preserved, not moved). - One might try to generalize the multidimensional behavior of arrays to arbitrary
associatives, and also have a look at
broadcastin this context.
Performance work is ongoing in this package (for instance, linear vs Cartesian indexing,
using Dict more efficiently, etc).