CrossFit Gyms

Components of the Cassandra data model

Compound key

One property of CrossFit gyms is that each gym must have a unique name i.e. no two gyms are allowed to share the same name.

The table below is useful for looking up a gym when we know the name of the gym we’re looking for.

CREATE KEYSPACE IF NOT EXISTS admatic WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': '2', 'dc2': '2'};
use admatic;

CREATE TABLE crossfit_gyms (
   gym_name text,
   city text,
   state_province text,
   country_code text,
   PRIMARY KEY (gym_name)
);

INSERT INTO crossfit_gyms (country_code, state_province, city, gym_name) VALUES ('USA', 'CA', 'San Francisco', 'San Francisco CrossFit');

SELECT * FROM crossfit_gyms WHERE gym_name='San Francisco CrossFit';

 gym_name               | city          | country_code | state_province
------------------------+---------------+--------------+----------------
 San Francisco CrossFit | San Francisco |          USA |             CA

(1 rows)

SELECT * FROM crossfit_gyms WHERE country_code='USA';
InvalidRequest: Error from server: code=2200 [Invalid query] message="Cannot execute this query as it might involve data filtering and thus may have unpredictable performance. If you want to execute this query despite the performance unpredictability, use ALLOW FILTERING"

Now suppose we want to look up gyms by location. If we use the crossfit_gyms table, we’ll need to iterate over the entire result set. Instead, we’ll create a new table that will allow us to query gyms by country.

CREATE TABLE crossfit_gyms_by_location (
   country_code text,
   state_province text,
   city text,
   gym_name text,
   PRIMARY KEY (country_code, state_province, city, gym_name)
);

While useful for searching gyms by country, using this table to identify gyms within a particular state or city requires iterating over all gyms within the country in which the state or city is located.

Clustering key

Clustering keys are responsible for sorting data within a partition. So when we query for all gyms in the United States, the result set will be ordered first by state_province in ascending order, followed by city in ascending order, and finally gym_name in ascending order.

INSERT INTO crossfit_gyms_by_location (country_code, state_province, city, gym_name) VALUES ('USA', 'CA', 'Zan Francisco', 'Zan Francisco CrossFit');
INSERT INTO crossfit_gyms_by_location (country_code, state_province, city, gym_name) VALUES ('USA', 'CA', 'Yan Francisco', 'Yan Francisco CrossFit');
INSERT INTO crossfit_gyms_by_location (country_code, state_province, city, gym_name) VALUES ('USA', 'CA', 'Xan Francisco', 'Xan Francisco CrossFit');
INSERT INTO crossfit_gyms_by_location (country_code, state_province, city, gym_name) VALUES ('USA', 'DA', 'Zan Francisco', 'Zan Francisco CrossFit');
INSERT INTO crossfit_gyms_by_location (country_code, state_province, city, gym_name) VALUES ('USA', 'DA', 'Yan Francisco', 'Yan Francisco CrossFit');
INSERT INTO crossfit_gyms_by_location (country_code, state_province, city, gym_name) VALUES ('USA', 'DA', 'Xan Francisco', 'Xan Francisco CrossFit');
INSERT INTO crossfit_gyms_by_location (country_code, state_province, city, gym_name) VALUES ('USA', 'EA', 'Zan Francisco', 'Zan Francisco CrossFit');
INSERT INTO crossfit_gyms_by_location (country_code, state_province, city, gym_name) VALUES ('USA', 'EA', 'Yan Francisco', 'Yan Francisco CrossFit');
INSERT INTO crossfit_gyms_by_location (country_code, state_province, city, gym_name) VALUES ('USA', 'EA', 'Xan Francisco', 'Xan Francisco CrossFit');

SELECT * FROM crossfit_gyms_by_location WHERE country_code='USA';

 country_code | state_province | city          | gym_name
--------------+----------------+---------------+------------------------
          USA |             CA | Xan Francisco | Xan Francisco CrossFit
          USA |             CA | Yan Francisco | Yan Francisco CrossFit
          USA |             CA | Zan Francisco | Zan Francisco CrossFit
          USA |             DA | Xan Francisco | Xan Francisco CrossFit
          USA |             DA | Yan Francisco | Yan Francisco CrossFit
          USA |             DA | Zan Francisco | Zan Francisco CrossFit
          USA |             EA | Xan Francisco | Xan Francisco CrossFit
          USA |             EA | Yan Francisco | Yan Francisco CrossFit
          USA |             EA | Zan Francisco | Zan Francisco CrossFit

(9 rows)

Order by

To sort in descending order, add a WITH clause to the end of the CREATE TABLE statement.

DROP TABLE crossfit_gyms_by_location ;
CREATE TABLE crossfit_gyms_by_location (
   country_code text,
   state_province text,
   city text,
   gym_name text,
   PRIMARY KEY (country_code, state_province, city, gym_name)
) WITH CLUSTERING ORDER BY (state_province DESC, city ASC, gym_name ASC);

The result set will now contain gyms ordered first by state_province in descending order, followed by city in ascending order, and finally gym_name in ascending order. You must specify the sort order for each of the clustering keys in the ORDER BY statement. The partition key is not part of the ORDER BY statement because its values are hashed and therefore won’t be close to each other in the cluster.

INSERT INTO crossfit_gyms_by_location (country_code, state_province, city, gym_name) VALUES ('USA', 'CA', 'Zan Francisco', 'Zan Francisco CrossFit');
INSERT INTO crossfit_gyms_by_location (country_code, state_province, city, gym_name) VALUES ('USA', 'CA', 'Yan Francisco', 'Yan Francisco CrossFit');
INSERT INTO crossfit_gyms_by_location (country_code, state_province, city, gym_name) VALUES ('USA', 'CA', 'Xan Francisco', 'Xan Francisco CrossFit');
INSERT INTO crossfit_gyms_by_location (country_code, state_province, city, gym_name) VALUES ('USA', 'DA', 'Zan Francisco', 'Zan Francisco CrossFit');
INSERT INTO crossfit_gyms_by_location (country_code, state_province, city, gym_name) VALUES ('USA', 'DA', 'Yan Francisco', 'Yan Francisco CrossFit');
INSERT INTO crossfit_gyms_by_location (country_code, state_province, city, gym_name) VALUES ('USA', 'DA', 'Xan Francisco', 'Xan Francisco CrossFit');
INSERT INTO crossfit_gyms_by_location (country_code, state_province, city, gym_name) VALUES ('USA', 'EA', 'Zan Francisco', 'Zan Francisco CrossFit');
INSERT INTO crossfit_gyms_by_location (country_code, state_province, city, gym_name) VALUES ('USA', 'EA', 'Yan Francisco', 'Yan Francisco CrossFit');
INSERT INTO crossfit_gyms_by_location (country_code, state_province, city, gym_name) VALUES ('USA', 'EA', 'Xan Francisco', 'Xan Francisco CrossFit');

SELECT * FROM crossfit_gyms_by_location WHERE country_code='USA';

 country_code | state_province | city          | gym_name
--------------+----------------+---------------+------------------------
          USA |             EA | Xan Francisco | Xan Francisco CrossFit
          USA |             EA | Yan Francisco | Yan Francisco CrossFit
          USA |             EA | Zan Francisco | Zan Francisco CrossFit
          USA |             DA | Xan Francisco | Xan Francisco CrossFit
          USA |             DA | Yan Francisco | Yan Francisco CrossFit
          USA |             DA | Zan Francisco | Zan Francisco CrossFit
          USA |             CA | Xan Francisco | Xan Francisco CrossFit
          USA |             CA | Yan Francisco | Yan Francisco CrossFit
          USA |             CA | Zan Francisco | Zan Francisco CrossFit

(9 rows)

Composite key

Composite keys are partition keys that consist of multiple columns. To avoid wide rows, we can move to a composite key consisting of additional columns.

CREATE TABLE crossfit_gyms_by_city (
 country_code text,
 state_province text,
 city text,
 gym_name text,
 opening_date timestamp,
 PRIMARY KEY ((country_code, state_province, city), opening_date, gym_name)
) WITH CLUSTERING ORDER BY ( opening_date ASC, gym_name ASC );

Now, each combination of country_code, state_province, and city will have its own hash value and be stored in a separate partition within the cluster.

INSERT INTO crossfit_gyms_by_city (country_code, state_province, city, gym_name, opening_date) VALUES ('USA', 'CA', 'Zan Francisco', 'Zan Francisco CrossFit', dateof(NOW()));
INSERT INTO crossfit_gyms_by_city (country_code, state_province, city, gym_name, opening_date) VALUES ('USA', 'CA', 'Yan Francisco', 'Yan Francisco CrossFit', dateof(NOW()));
INSERT INTO crossfit_gyms_by_city (country_code, state_province, city, gym_name, opening_date) VALUES ('USA', 'CA', 'Xan Francisco', 'Xan Francisco CrossFit', dateof(NOW()));
INSERT INTO crossfit_gyms_by_city (country_code, state_province, city, gym_name, opening_date) VALUES ('USA', 'DA', 'Zan Francisco', 'Zan Francisco CrossFit', dateof(NOW()));
INSERT INTO crossfit_gyms_by_city (country_code, state_province, city, gym_name, opening_date) VALUES ('USA', 'DA', 'Yan Francisco', 'Yan Francisco CrossFit', dateof(NOW()));
INSERT INTO crossfit_gyms_by_city (country_code, state_province, city, gym_name, opening_date) VALUES ('USA', 'DA', 'Xan Francisco', 'Xan Francisco CrossFit', dateof(NOW()));
INSERT INTO crossfit_gyms_by_city (country_code, state_province, city, gym_name, opening_date) VALUES ('USA', 'EA', 'Zan Francisco', 'Zan Francisco CrossFit', dateof(NOW()));
INSERT INTO crossfit_gyms_by_city (country_code, state_province, city, gym_name, opening_date) VALUES ('USA', 'EA', 'Yan Francisco', 'Yan Francisco CrossFit', dateof(NOW()));
INSERT INTO crossfit_gyms_by_city (country_code, state_province, city, gym_name, opening_date) VALUES ('USA', 'EA', 'Xan Francisco', 'Xan Francisco CrossFit', dateof(NOW()));

SELECT token(country_code, state_province, city) FROM crossfit_gyms_by_city;

 system.token(country_code, state_province, city)
--------------------------------------------------
                             -8458359574911311540
                             -8276155982978925814
                             -3658271257738167517
                             -2492327001946213135
                             -2107058130455252847
                              1189007303748993961
                              2557468856498952304
                              6596881853665117539
                              7691624642959914655

(9 rows)

SELECT token(country_code) FROM crossfit_gyms_by_city;
ServerError: java.lang.NullPointerException

SELECT token(country_code) FROM crossfit_gyms_by_city;
ServerError: java.lang.NullPointerException

A note about querying clustered composite keys

When issuing a CQL query, you must include all partition key columns, at a minimum. You can then apply an additional filter by adding each clustering key in the order in which the clustering keys appear.

SELECT * FROM crossfit_gyms_by_city WHERE country_code='USA';
InvalidRequest: Error from server: code=2200 [Invalid query] message="Cannot execute this query as it might involve data filtering and thus may have unpredictable performance. If you want to execute this query despite the performance unpredictability, use ALLOW FILTERING"

SELECT * FROM crossfit_gyms_by_city WHERE country_code='USA' AND state_province='EA' AND city='Xan Francisco';

 country_code | state_province | city          | opening_date                    | gym_name
--------------+----------------+---------------+---------------------------------+------------------------
          USA |             EA | Xan Francisco | 2018-07-05 18:17:38.151000+0000 | Xan Francisco CrossFit

(1 rows)

results matching ""

    No results matching ""