Collections
Noteworthy characteristics
Collections are meant for storing/denormalizing relatively small amount of data.
- Individual collections are not indexed internally. Which means that even to access a single element of a collection, the whole collection has to be read (and reading one is not paged internally).
- While insertion operations on sets and maps never incur a read-before-write internally, some operations on lists do. Further, some lists operations are not idempotent by nature, making their retry in case of timeout problematic. It is thus advised to prefer sets over lists when possible.
Maps
CREATE KEYSPACE admatic WITH replication = {'class': 'SimpleStrategy', 'replication_factor' : 1};
USE admatic;
- A map of
textkeys, andtextvalues
CREATE TABLE users (
id text PRIMARY KEY,
name text,
favs map<text, text>
);
INSERT INTO users (id, name, favs) VALUES ('jsmith', 'John Smith', { 'fruit' : 'Apple', 'band' : 'Beatles' });
SELECT * FROM users;
id | favs | name
--------+---------------------------------------+------------
jsmith | {'band': 'Beatles', 'fruit': 'Apple'} | John Smith
(1 rows)
- Replace the existing map entirely.
UPDATE users SET favs = { 'fruit' : 'Banana' } WHERE id = 'jsmith';
SELECT * FROM users;
id | favs | name
--------+---------------------+------------
jsmith | {'fruit': 'Banana'} | John Smith
(1 rows)
- Updating or inserting one or more elements
UPDATE users SET favs['author'] = 'Ed Poe' WHERE id = 'jsmith';
SELECT * FROM users;
id | favs | name
--------+-----------------------------------------+------------
jsmith | {'author': 'Ed Poe', 'fruit': 'Banana'} | John Smith
(1 rows)
UPDATE users SET favs = favs + { 'movie' : 'Cassablanca', 'band' : 'ZZ Top' } WHERE id = 'jsmith';
SELECT * FROM users;
id | favs | name
--------+-----------------------------------------------------------------------------------+------------
jsmith | {'author': 'Ed Poe', 'band': 'ZZ Top', 'fruit': 'Banana', 'movie': 'Cassablanca'} | John Smith
(1 rows)
- Removing one or more element (if an element doesn’t exist, removing it is a no-op but no error is thrown)
DELETE favs['author'] FROM users WHERE id = 'jsmith';
SELECT * FROM users;
id | favs | name
--------+---------------------------------------------------------------+------------
jsmith | {'band': 'ZZ Top', 'fruit': 'Banana', 'movie': 'Cassablanca'} | John Smith
(1 rows)
UPDATE users SET favs = favs - { 'movie', 'band'} WHERE id = 'jsmith';
SELECT * FROM users;
id | favs | name
--------+---------------------+------------
jsmith | {'fruit': 'Banana'} | John Smith
(1 rows)
- TTLs are allowed for both
INSERTandUPDATE, but in both cases, the TTL set only apply to the newly inserted/updated elements.
UPDATE users USING TTL 10 SET favs['color'] = 'green' WHERE id = 'jsmith';
SELECT * FROM users;
id | favs | name
--------+---------------------------------------+------------
jsmith | {'color': 'green', 'fruit': 'Banana'} | John Smith
(1 rows)
After 10 seconds
SELECT * FROM users;
id | favs | name
--------+---------------------+------------
jsmith | {'fruit': 'Banana'} | John Smith
(1 rows)
Sets
A set is a (sorted) collection of unique values.
- A set of text values
CREATE TABLE images (
name text PRIMARY KEY,
owner text,
tags set<text>
);
INSERT INTO images (name, owner, tags) VALUES ('cat.jpg', 'jsmith', { 'pet', 'cute' });
SELECT * FROM images;
name | owner | tags
---------+--------+-----------------
cat.jpg | jsmith | {'cute', 'pet'}
(1 rows)
- Replace the existing set entirely
UPDATE images SET tags = { 'kitten', 'cat', 'lol' } WHERE name = 'cat.jpg';
SELECT * FROM images;
name | owner | tags
---------+--------+--------------------------
cat.jpg | jsmith | {'cat', 'kitten', 'lol'}
(1 rows)
- Adding one or multiple elements (as this is a set, inserting an already existing element is a no-op)
UPDATE images SET tags = tags + { 'gray', 'cuddly' } WHERE name = 'cat.jpg';
SELECT * FROM images;
name | owner | tags
---------+--------+--------------------------------------------
cat.jpg | jsmith | {'cat', 'cuddly', 'gray', 'kitten', 'lol'}
(1 rows)
- Removing one or multiple elements (if an element doesn’t exist, removing it is a no-op but no error is thrown)
UPDATE images SET tags = tags - { 'cat' } WHERE name = 'cat.jpg';
SELECT * FROM images;
name | owner | tags
---------+--------+-------------------------------------
cat.jpg | jsmith | {'cuddly', 'gray', 'kitten', 'lol'}
(1 rows)
- Lastly, as for maps, TTLs if used only apply to the newly inserted values.
UPDATE images USING TTL 10 SET tags = tags + { 'cat' } WHERE name = 'cat.jpg';
SELECT * FROM images;
name | owner | tags
---------+--------+--------------------------------------------
cat.jpg | jsmith | {'cat', 'cuddly', 'gray', 'kitten', 'lol'}
(1 rows)
After 10 seconds
SELECT * FROM images;
name | owner | tags
---------+--------+-------------------------------------
cat.jpg | jsmith | {'cuddly', 'gray', 'kitten', 'lol'}
(1 rows)
Lists
Lists have limitations and specific performance considerations that you should take into account before using them. In general, if you can use a set instead of list, always prefer a set.
A list is a (sorted) collection of non-unique values where elements are ordered by their position in the list.
- A list of integers
CREATE TABLE plays (
id text PRIMARY KEY,
game text,
players int,
scores list<int>
);
INSERT INTO plays (id, game, players, scores) VALUES ('123-afde', 'quake', 3, [17, 4, 2]);
SELECT * FROM plays;
id | game | players | scores
----------+-------+---------+------------
123-afde | quake | 3 | [17, 4, 2]
(1 rows)
- Replace the existing list entirely
UPDATE plays SET scores = [ 3, 9, 4] WHERE id = '123-afde';
SELECT * FROM plays;
id | game | players | scores
----------+-------+---------+-----------
123-afde | quake | 3 | [3, 9, 4]
(1 rows)
- Appending and prepending values to a list
UPDATE plays SET players = 5, scores = scores + [ 14, 21 ] WHERE id = '123-afde';
SELECT * FROM plays;
id | game | players | scores
----------+-------+---------+-------------------
123-afde | quake | 5 | [3, 9, 4, 14, 21]
(1 rows)
UPDATE plays SET players = 6, scores = [ 3 ] + scores WHERE id = '123-afde';
SELECT * FROM plays;
id | game | players | scores
----------+-------+---------+----------------------
123-afde | quake | 6 | [3, 3, 9, 4, 14, 21]
(1 rows)
- Setting the value at a particular position in the list. This imply that the list has a pre-existing element for that position or an error will be thrown that the list is too small
UPDATE plays SET scores[1] = 7 WHERE id = '123-afde';
cqlsh:admatic> SELECT * FROM plays;
id | game | players | scores
----------+-------+---------+----------------------
123-afde | quake | 6 | [3, 7, 9, 4, 14, 21]
(1 rows)
- Removing an element by its position in the list. This imply that the list has a pre-existing element for that position or an error will be thrown that the list is too small.
DELETE scores[1] FROM plays WHERE id = '123-afde';
SELECT * FROM plays;
id | game | players | scores
----------+-------+---------+-------------------
123-afde | quake | 6 | [3, 9, 4, 14, 21]
(1 rows)
- Deleting all the occurrences of particular values in the list (if a particular element doesn’t occur at all in the list, it is simply ignored and no error is thrown)
UPDATE plays SET scores = scores - [ 12, 21 ] WHERE id = '123-afde';
SELECT * FROM plays;
id | game | players | scores
----------+-------+---------+---------------
123-afde | quake | 6 | [3, 9, 4, 14]
(1 rows)
- Lastly, as for maps, TTLs when used only apply to the newly inserted values.
UPDATE plays USING TTL 10 SET scores = [ 3 ] + scores WHERE id = '123-afde';
SELECT * FROM plays;
id | game | players | scores
----------+-------+---------+------------------
123-afde | quake | 6 | [3, 3, 9, 4, 14]
(1 rows)
After 10 seconds
SELECT * FROM plays;
id | game | players | scores
----------+-------+---------+---------------
123-afde | quake | 6 | [3, 9, 4, 14]
(1 rows)
Warnings
- The append and prepend operations are not idempotent by nature. So in particular, if one of these operation timeout, then retrying the operation is not safe and it may (or may not) lead to appending/prepending the value twice.
- Setting and removing an element by position and removing occurences of particular values incur an internal read-before-write. They will thus run more slowly and take more ressources than usual updates (with the exclusion of conditional write that have their own cost).